AIML Introduction
AIML (Artificial Intelligence Markup Language) is an XML dialect for creating natural language software agents. It was developed by Dr. Richard S. Wallace and the Alicebot open source community between 1995 and 2000. An AIML file defines rules that match input patterns and determine the responses.
The design goals of AIML are as follows:
- AIML should be easy for the general public to learn and understand.
- AIML should encode the minimal set of concepts needed to support a stimulus-response knowledge system modeled on the original A.L.I.C.E.
- AIML should be compatible with XML.
- Writing AIML processable program files should be easy and convenient.
- AIML objects should have good readability and clarity for people.
- The design of AIML should be formal and simple.
- AIML should not contain dependencies on other languages.
For a detailed primer on AIML, you can turn to Alice Bot’s AIML Primer. You can also learn more about AIML and what it can do at the AIML Wikipedia page. With Python’s AIML package, it is easy to implement artificial intelligence chatbots.
Building a chatbot with AIML
Install the Python aiml library
- Python 2: pip install aiml
- Python 3: pip install python-aiml
Get alice resources.
After the Python aiml package is installed, there will be an alice subdirectory under Lib/site-packages/aiml in the Python installation directory. It contains a simple corpus that ships with the package.
Loading alice under Python
Once you have obtained the alice resource, you can load the alice brain directly using the Python aiml library.
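A minimal sketch of this step follows. It rests on several assumptions that are not stated above: the corpus directory may be `alice` or `botdata/alice` depending on the installed release, the corpus is assumed to contain a `startup.xml` file, and that file is assumed to react to the command `LOAD ALICE`. The helper names `find_alice_dir` and `load_alice` are made up for this example.

```python
import os


def find_alice_dir(package_dir):
    """Return the first existing ALICE corpus directory under package_dir, or None.

    The candidate sub-paths are assumptions: the article mentions "alice",
    and some releases may ship the corpus under "botdata/alice" instead.
    """
    for relative in ("alice", os.path.join("botdata", "alice")):
        candidate = os.path.join(package_dir, relative)
        if os.path.isdir(candidate):
            return candidate
    return None


def load_alice():
    """Create a kernel and load the bundled ALICE corpus into it."""
    import aiml  # third-party package, imported lazily so the helper above works without it

    alice_dir = find_alice_dir(os.path.dirname(aiml.__file__))
    if alice_dir is None:
        raise FileNotFoundError("ALICE corpus not found inside the aiml package")

    kernel = aiml.Kernel()
    previous_dir = os.getcwd()
    os.chdir(alice_dir)  # the startup file refers to the other files by relative path
    try:
        kernel.learn("startup.xml")    # assumed file name
        kernel.respond("LOAD ALICE")   # assumed command that triggers the learning
    finally:
        os.chdir(previous_dir)
    return kernel
```

Calling `load_alice().respond("hello")` should then return a reply from the bundled corpus, provided the assumptions above hold for the installed version.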
The above process is very simple. Next, we will create our own bot from scratch.
Creating a standard startup file
It is standard practice to create a startup file called std-startup.xml as the main entry point for loading AIML files. In this example, we will create a base file that matches one pattern and performs one action in response. We want to match the pattern load aiml b and have the bot load our AIML brain as the response. We also need to create the basic_chat.aiml file that it loads.
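The two files could look roughly like the sketch below, which writes them from Python so the example stays self-contained. Only the file names and the `load aiml b` pattern come from the text above. The two categories in `basic_chat.aiml` and the helper name `write_aiml_files` are illustrative assumptions.

```python
import os
import xml.etree.ElementTree as ET

# Startup file: a single category whose template loads the real brain.
# Patterns are written in upper case; the interpreter normalizes user
# input, so typing "load aiml b" matches LOAD AIML B.
STD_STARTUP = """<aiml version="1.0.1" encoding="UTF-8">
    <category>
        <pattern>LOAD AIML B</pattern>
        <template>
            <learn>basic_chat.aiml</learn>
        </template>
    </category>
</aiml>
"""

# Hypothetical brain file with two illustrative categories.
BASIC_CHAT = """<aiml version="1.0.1" encoding="UTF-8">
    <category>
        <pattern>HELLO</pattern>
        <template>Well, hello!</template>
    </category>
    <category>
        <pattern>WHAT ARE YOU</pattern>
        <template>I am a bot.</template>
    </category>
</aiml>
"""


def write_aiml_files(directory="."):
    """Check that each document is well-formed XML, then write it to disk."""
    files = {"std-startup.xml": STD_STARTUP, "basic_chat.aiml": BASIC_CHAT}
    for name, content in files.items():
        ET.fromstring(content)  # raises ParseError on malformed XML
        path = os.path.join(directory, name)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(content)
    return sorted(files)
```

Each `category` pairs one `pattern` with one `template`. The `learn` tag in the startup file is what pulls in `basic_chat.aiml` when the pattern matches.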
Random Response
You can also add a random response like the one below. It will respond randomly when receiving a message that starts with “One time I”. * is a wildcard that matches anything.
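A category of this kind might be written as in the sketch below. The AIML is held in a Python string so it can be checked for well-formedness. The pattern follows the text above, while the four replies and the helper name `random_choices` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# A category that answers any input starting with "One time I" with one
# of several replies, chosen at random by the interpreter.
RANDOM_CATEGORY = """<category>
    <pattern>ONE TIME I *</pattern>
    <template>
        <random>
            <li>Go on.</li>
            <li>How old are you?</li>
            <li>Be more specific.</li>
            <li>I did not know that.</li>
        </random>
    </template>
</category>
"""


def random_choices(category_xml):
    """Return the candidate replies listed inside the <random> element."""
    root = ET.fromstring(category_xml)
    return [item.text for item in root.findall("./template/random/li")]
```

To use it, the category would be placed inside the `aiml` root element of a brain file such as basic_chat.aiml.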
Use an existing AIML file
Writing your own AIML files is a lot of fun, but it takes a lot of work. I think a bot needs about 10,000 patterns before it starts to feel realistic. Fortunately, the ALICE Foundation offers a large number of free AIML files. Browse the AIML files on the Alice Bot website.
Testing the newly created robot
So far, all the AIML files in XML format are ready. They are an important part of the robot’s brain, but for now they are just information. The robot needs to come to life. An AIML interpreter can be written in any language, but here we use Python.
This is the simplest program we can start with. It creates an aiml kernel object, learns the startup file, and then loads the rest of the AIML files. After that it is ready to chat, and we enter an infinite loop that keeps prompting the user for messages. You will need to enter a pattern that the bot recognizes, which depends on the AIML files you have loaded. We keep the startup file as a separate entity so that we can later add more AIML files to the bot without modifying any of the program source code: it is enough to list the additional files to learn in the startup XML file.
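The program described above could be sketched as follows. `Kernel`, `learn` and `respond` are the calls the aiml package provides. The prompt text, the function names and the decision to stop the loop on end of input are assumptions made for this example.

```python
def chat(kernel, read=input, write=print):
    """Keep prompting for messages and print the kernel's replies.

    The loop only ends on end-of-input or Ctrl+C, so it is effectively
    the infinite loop described in the text.
    """
    while True:
        try:
            message = read("Enter your message >> ")
        except (EOFError, KeyboardInterrupt):
            break
        write(kernel.respond(message))


def main():
    import aiml  # third-party package, imported lazily so chat() can be tested without it

    kernel = aiml.Kernel()
    kernel.learn("std-startup.xml")   # registers the LOAD AIML B category
    kernel.respond("load aiml b")     # triggers the <learn> tag in the startup file
    chat(kernel)
```

Calling `main()` from the directory that holds the AIML files starts the conversation. Passing `read` and `write` as parameters keeps the loop independent of the console, which helps later when the input comes from somewhere else.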
Accelerated Brain Loading
Once you have a lot of AIML files, learning them takes a long time. That’s where brain files come in. After the robot has learned all the AIML files, it can save its brain directly to a file, which significantly speeds up loading in subsequent runs.
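One possible way to do this is sketched below, using the kernel’s `bootstrap` and `saveBrain` methods. The brain file name `bot_brain.brn` and the helper name `load_brain` are assumptions. The startup file and command are the ones used earlier in this article.

```python
import os


def load_brain(kernel, brain_file="bot_brain.brn",
               startup_file="std-startup.xml", command="load aiml b"):
    """Restore a saved brain if one exists; otherwise learn the AIML and save it.

    Returns True when the existing brain file was used.
    """
    if os.path.isfile(brain_file):
        # Fast path: restore the previously saved brain.
        kernel.bootstrap(brainFile=brain_file)
        return True
    # Slow path: learn from the AIML sources, then cache the result.
    kernel.bootstrap(learnFiles=startup_file, commands=command)
    kernel.saveBrain(brain_file)
    return False
```

The first run takes the slow path and writes the brain file. Later runs find the file and skip the AIML parsing entirely.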
Remember that with the brain method described above, anything the bot learns at runtime is not written back to the brain file. You will either need to delete the brain file so that it is rebuilt at the next start, or modify the code so that it saves the brain at some point after reloading.
Add Python commands
If you want to give your bot some special commands that run Python functions, capture the input message and process it before sending it to mybot.respond(). In the example above, we get the user’s input from raw_input (input in Python 3). However, the input can come from anywhere: a TCP socket, for example, or a speech recognition source. Process the message before it goes to AIML. You may want to skip AIML processing entirely for some specific messages.
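A small sketch of such pre-processing is shown below. The command words `quit` and `save`, the file name `bot_brain.brn` and the function name `handle_message` are hypothetical choices for illustration. Only `respond` and `saveBrain` are kernel methods.

```python
def handle_message(kernel, message, brain_file="bot_brain.brn"):
    """Handle special commands in Python; pass everything else on to AIML.

    Returns the reply text, or None when the user asked to quit.
    """
    command = message.strip().lower()
    if command == "quit":
        return None                      # caller should leave its input loop
    if command == "save":
        kernel.saveBrain(brain_file)     # persist what the bot has learned so far
        return "Brain saved."
    return kernel.respond(message)       # normal AIML processing
```

The calling loop would read a message from whatever source it uses, pass it to `handle_message`, and stop when the result is `None`.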
Sessions and Predicates
By specifying a session, AIML can keep separate conversations for different people. For example, if one person tells the bot that their name is Alice and another person tells the bot that their name is Bob, the bot can tell the two apart. To specify the session you are using, pass it as the second argument to respond().
This is helpful for customizing personalized conversations for each client. You will have to generate your own session ID in some form and keep track of it. Note that saving the brain file will not save all the session values.
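The sketch below shows one way to keep per-user state, assuming the kernel methods `respond`, `setPredicate` and `getPredicate`, each of which takes the session ID as its last argument. Generating IDs with `uuid` and the helper names are choices made for this example, not something the article prescribes.

```python
import uuid


def new_session_id():
    """Return a fresh random string to use as a session ID."""
    return uuid.uuid4().hex


def remember_name(kernel, session_id, name):
    """Store the user's name as a predicate scoped to one session."""
    kernel.setPredicate("name", name, session_id)


def recall_name(kernel, session_id):
    """Read the name predicate back for the same session."""
    return kernel.getPredicate("name", session_id)


def reply(kernel, session_id, message):
    """Answer a message in the context of the given session."""
    return kernel.respond(message, session_id)
```

Each client would be given its own ID once, for example when it connects, and every later call for that client would reuse it.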
In AIML, we can use the set tag in a template to store a predicate and the get tag to read it back.
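A pair of categories along these lines is sketched below, again held in a Python string so it can be checked for well-formedness. The patterns and replies are reconstructed from the dog-name conversation in this section. The predicate name `dog` and the helper `predicate_names` are assumptions.

```python
import xml.etree.ElementTree as ET

# The first category stores whatever matched the wildcard in the "dog"
# predicate; the second category reads the stored value back.
DOG_AIML = """<aiml version="1.0.1" encoding="UTF-8">
    <category>
        <pattern>MY DOGS NAME IS *</pattern>
        <template>That is interesting that you have a dog named <set name="dog"><star/></set></template>
    </category>
    <category>
        <pattern>WHAT IS MY DOGS NAME</pattern>
        <template>Your dog's name is <get name="dog"/>.</template>
    </category>
</aiml>
"""


def predicate_names(aiml_xml):
    """Return the predicate names that are written and read in the document."""
    root = ET.fromstring(aiml_xml)
    written = {element.get("name") for element in root.iter("set")}
    read = {element.get("name") for element in root.iter("get")}
    return written, read
```

The `star` tag stands for the text matched by the `*` wildcard, so the `set` tag both stores that text and echoes it in the reply.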
Using the AIML above, you can tell the robot:
My dogs name is Max
And the robot will answer:
That is interesting that you have a dog named Max
Then, if you ask the robot:
What is my dogs name?
The robot will answer:
Your dog’s name is Max.
AIML can be used to implement conversational bots, but for Chinese there are the following problems:
- The Chinese rule base is small. In general, the richer the rule base, the more human-like the robot’s responses. The rule base for English is very rich, covers a wide range of topics, and is publicly available. For Chinese, however, there is essentially no publicly available rule base.
- The AIML interpreter does not support Chinese well. The PyAIML module (parser) for Python can in fact handle Chinese reasonably well, but one problem remains: English words are separated by spaces or punctuation, which gives them a natural word segmentation, whereas Chinese input is not separated by spaces. This causes some inconvenience in practice. For example, to match input both with and without spaces, the rule base has to include both forms.
Solutions:
- Build your own corpus (e.g. derive training data from subtitle files)
- Use a Chinese word segmentation tool (e.g. jieba)
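To illustrate the spacing problem, the sketch below normalizes input before it reaches the bot. It uses plain character-level splitting, which is a simpler stand-in for a real segmenter such as jieba. The function name `space_out_cjk` is hypothetical, and only the basic CJK Unified Ideographs range is covered.

```python
import re

# Basic CJK Unified Ideographs block only; extension blocks are ignored here.
_CJK_CHAR = re.compile(r"([\u4e00-\u9fff])")


def space_out_cjk(text):
    """Put spaces around every CJK character and collapse repeated whitespace.

    Input typed with or without spaces ends up in the same form, so the
    rule base only needs patterns written in that one form.
    """
    spaced = _CJK_CHAR.sub(r" \1 ", text)
    return " ".join(spaced.split())
```

With this normalization, "你好" and "你 好" both become "你 好" before matching, and the patterns in the rule base would be written with the same spacing. With a word-level segmenter, the tokens it returns would be joined with spaces in the same way.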