Expert Systems

What are expert systems?

The obvious extension of storing knowledge inside a computer program is to make use of that knowledge to solve practical problems. Expert Systems (ES) are computer programs that take information from the real world (usually by asking a user various questions) and make a decision or recommendation based on that information.

ES are the part of Artificial Intelligence that has made the most impact on society, and are also one of its most successful areas. The first ES was DENDRAL, which took as its input a mass spectrogram for an unknown chemical compound and worked out the chemical structure of the compound.

ES are often used to diagnose faults in human beings or machinery. They can act as doctors, although, since the number of diseases and other ailments that a human being can get is almost limitless, they have had limited success and are usually restricted to small areas of medicine. For instance, MYCIN was developed in the 1970s to diagnose bacterial blood infections. ES have had more success in diagnosing faults in machinery such as cars and diesel locomotives. This is not only because these things are much less complicated than humans, but also because a mistake by the ES is less dangerous than a wrong diagnosis of a human disease. In other words, because repairing a car is not a life-critical situation, people are more willing to trust the ES and more likely to use it.

In fact, millions of people put their lives in the hands of ES every day without realising it. Commercial airliners have autopilots - computer programs that fly the plane - that take over all the flying apart from the take-off and landings.

Essentially, during flight, all the autopilot does is keep the plane level and flying at a constant speed, and it does this by monitoring all the instruments (the fuel gauge, the artificial horizon, the speed and orientation of the plane etc.) and altering the actuators (the control surfaces such as the position of the wing-flaps). Expert systems are also used to fly the space shuttle and in the monitoring and control of many nuclear power stations.

Rules

An Expert System usually contains a great many rules that tell it how to process the information that you give it. It also contains variables that represent the different concepts, and it uses the rules to set those variables to different values.

This needs to be illustrated with an example. Suppose we want to identify a type of car from pieces of information such as whether it has a powerful engine, whether it has a sunroof etc. We would need to store various pieces of information about the car:

engine_capacity = DONTKNOW
sunroof = DONTKNOW
four_wheel_drive = DONTKNOW
unleaded = DONTKNOW
etc.

DONTKNOW is some value inside the program that indicates that we have no knowledge. These variables start as DONTKNOW, but can be set to YES, NO (perhaps represented by 1 and 0), or numerical values as appropriate.

A rule would test some variables and alter others:

IF engine_capacity > 2600
   THEN powerful_engine = YES

IF four_wheel_drive = NO AND coupe = YES
   THEN Aston_Martin = YES

A rule can also ask a question:

IF unleaded = DONTKNOW
   THEN unleaded = ASK("Does the car use unleaded petrol?")

In this case, the program will ask the user the question about the unleaded petrol if it hasn't already worked out whether the car takes unleaded petrol or not. There is clearly no point in asking the question if the program already knows the answer!
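The variables and rules above can be sketched in Python. This is only a minimal illustration, not how any particular ES is written: the names make_facts, apply_rules and scripted_ask are hypothetical, and the "user" is simulated by a function that always answers YES so that the example runs without keyboard input.

```python
YES, NO, DONTKNOW = "yes", "no", "dontknow"

def make_facts():
    """Every variable starts out as DONTKNOW."""
    return {"engine_capacity": DONTKNOW,
            "powerful_engine": DONTKNOW,
            "unleaded": DONTKNOW}

def apply_rules(facts, ask):
    """Apply each rule once, in order. `ask` puts a question to the user."""
    # IF unleaded = DONTKNOW THEN unleaded = ASK(...)
    if facts["unleaded"] == DONTKNOW:
        facts["unleaded"] = ask("Does the car use unleaded petrol?")
    # IF engine_capacity > 2600 THEN powerful_engine = YES
    capacity = facts["engine_capacity"]
    if capacity != DONTKNOW and capacity > 2600:
        facts["powerful_engine"] = YES
    return facts

asked = []  # record of every question put to the (simulated) user

def scripted_ask(question):
    """Stand-in for a real user: remembers the question and answers YES."""
    asked.append(question)
    return YES

facts = make_facts()
facts["engine_capacity"] = 3000
apply_rules(facts, scripted_ask)
apply_rules(facts, scripted_ask)  # second pass: the answer is already known
print(facts["powerful_engine"], len(asked))
```

Running the rules a second time asks nothing, because unleaded is no longer DONTKNOW.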

A rule consists of a condition and then an action. If the condition (the IF part of the rule) is true, then we say that the rule is "triggered". It is possible that several rules are triggered at any stage. However, generally, expert systems make sure that only one rule carries out its action at any stage. We say that, although many rules are triggered, only one rule "fires". This is to ensure that the program doesn't get in a muddled state, with variables being changed unpredictably.

How should the ES decide which rule fires? There is no one way of doing it, but expert systems have adopted strategies such as the following:

  1. The first rule that is triggered is the one that fires. For instance, in the following list, several rules have triggered, but only rule 4, the first of them, actually fires:

    Rule 1
    Rule 2
    Rule 3
    Rule 4 - THIS RULE FIRES!!
    Rule 5
    Rule 6
    Rule 7
    Rule 8
    Rule 9
    Rule 10
    Rule 11

  2. Alternatively, the rules may be divided into sections in some way, and only the first triggered rule in a section could fire, according to some plan. For instance, in the following situation, perhaps rule 7 would fire:

    Rule 1
    Rule 2
    Rule 3
    Rule 4                    
    Rule 5
    Rule 6
    Rule 7 - THIS RULE FIRES!!
    Rule 8
    Rule 9
    Rule 10
    Rule 11

  3. Rules are often arranged so that the ones with the most complex IF clause fire first. For instance, the rule "IF animal has sharp teeth OR animal's eyes point forwards THEN animal is carnivore" has more parts to its IF clause than "IF animal eats meat THEN animal is carnivore", and so would fire first.
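Strategies 1 and 3 can be sketched in a few lines of Python (strategy 2 depends on how the rules are divided into sections, so it is left out). The Rule class, its n_clauses field and the function names are assumptions made for this illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class Rule:
    name: str
    condition: Callable[[Dict[str, bool]], bool]  # the IF part
    n_clauses: int                                # number of parts in the IF part

def triggered(rules: List[Rule], facts: Dict[str, bool]) -> List[Rule]:
    """All rules whose condition is true for the current facts."""
    return [rule for rule in rules if rule.condition(facts)]

def first_triggered(rules: List[Rule], facts: Dict[str, bool]) -> Optional[Rule]:
    """Strategy 1: the first triggered rule in the list fires."""
    hits = triggered(rules, facts)
    return hits[0] if hits else None

def most_complex(rules: List[Rule], facts: Dict[str, bool]) -> Optional[Rule]:
    """Strategy 3: the triggered rule with the most IF parts fires."""
    hits = triggered(rules, facts)
    return max(hits, key=lambda rule: rule.n_clauses) if hits else None

rules = [
    Rule("eats_meat", lambda f: f.get("eats_meat", False), 1),
    Rule("teeth_or_eyes",
         lambda f: f.get("sharp_teeth", False) or f.get("eyes_forward", False), 2),
]
facts = {"eats_meat": True, "sharp_teeth": True}

print(first_triggered(rules, facts).name)
print(most_complex(rules, facts).name)
```

Both rules are triggered by these facts, but the two strategies pick different rules to fire.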

One important point is that an ES should not ask questions that are not necessary. For instance, if the user has told the ES that the animal has feathers, then the ES knows that the animal is a bird. It does not need to ask whether the animal lays eggs, as all birds lay eggs.

Representing Uncertainty

When an ES asks us a question, we may not be certain of the answer. If we are asked "Is the animal a carnivore?" and it isn't feeding time, then we have to answer "I don't know". We can also have degrees of certainty. We may be fairly certain that the animal lays eggs, or that it can fly, but not absolutely certain.

This raises two points. Firstly, we have to rewrite questions so that the user can express uncertainty. Instead of a question saying "Can the animal fly?" it would say "How certain are you that the animal can fly?" In this case, the user would enter a positive number for "yes" and a negative number for "no". Typically, the answer would range from -10 (indicating "definitely no"), through 0 (indicating "I don't know") to +10 (indicating "definitely yes").
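As a small sketch, an answer on the -10 to +10 scale could be checked and turned into a certainty between -1 and +1 like this. The function name and the choice of a -1 to +1 range are assumptions for the example, not part of any particular ES.

```python
def certainty_from_answer(answer):
    """Map a user's answer on the -10..+10 scale to a certainty in -1.0..+1.0."""
    if not -10 <= answer <= 10:
        raise ValueError("answer must lie between -10 and +10")
    return answer / 10

for answer in (-10, 0, 7, 10):
    print(answer, certainty_from_answer(answer))
```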

The harder problem is how to propagate these uncertainties. For instance, consider a rule for identifying bats (mammals that can fly). If we are 90% certain that the animal is a mammal and 35% certain that it can fly, how certain should we be that the animal is a bat?

There are various ways of representing and propagating uncertainties. I have listed some of them below:

Ad-hoc methods

Probability theory (Bayesian reasoning)

Dempster-Shafer calculus

MYCIN Certainty Factors

Fuzzy Logic

"Ad-hoc" methods are those that are simply developed for a particular situation because they seem to work. They don't necessarily have a mathematical basis. The scientific community is divided as to which method is the best way of representing uncertainty, and each method has its own advocates.
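To see how much the choice of method matters, here are two simple ways of combining the certainties from the bat example. They are illustrations only: taking the minimum is the usual way of handling AND in fuzzy logic, while multiplying treats the certainties as probabilities and assumes that the two facts are independent. The function names are made up for this sketch.

```python
def combine_min(a, b):
    """AND as the weaker of the two certainties (the usual fuzzy-logic choice)."""
    return min(a, b)

def combine_product(a, b):
    """AND as a product of probabilities, assuming the two facts are independent."""
    return a * b

mammal, can_fly = 0.90, 0.35

print(round(combine_min(mammal, can_fly), 3))      # 0.35
print(round(combine_product(mammal, can_fly), 3))  # 0.315
```

The two answers differ, which is one reason why there is no agreement on a single best method.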

Data Mining

I have a friend who is a vet specialising in the treatment of dogs. One day he treated the 9 dogs listed in the table below and for each dog he filled out a form indicating the symptoms that the dog displayed and whether he prescribed antibiotics for the dog.

                     Dylan  Max  Lucy  Pippa  Toby  Benji  Jake  Sandy  Rex
Cough                Y      N    N     Y      N     Y      N     N      N
Hyperactive          N      Y    Y     N      N     Y      Y     N      N
Had inoculations     N      N    Y     Y      Y     N      Y     Y      N
Foreign body in paw  Y      N    N     Y      N     Y      N     N      Y
Wound                Y      N    N     N      Y     N      N     Y      Y
Constipated          N      Y    Y     N      N     Y      N     N      N
Needed antibiotics?  Y      N    N     Y      Y     Y      N     Y      Y

How could we turn this table of data into a set of rules that would allow an ES to determine which dogs needed antibiotics and which ones didn't? Well, some of the factors listed above do require antibiotics and some do not, but the program doesn't know in advance which are which. Let's choose one of the symptoms at random and use it to split the data into two parts:

Hyperactive?

    Yes: Max (none), Lucy (none), Benji (antibiotics), Jake (none)
    No:  Dylan (antibiotics), Pippa (antibiotics), Toby (antibiotics), Sandy (antibiotics), Rex (antibiotics)

In this diagram, each dog is marked according to whether it required antibiotics or not. The decision hyperactive/not hyperactive slices the data items into two groups.

In fact, this was quite a good choice. Every single dog that was not hyperactive needed antibiotics. This could be a coincidence, of course. The computer program has no way of knowing whether being hyperactive influences the need for antibiotics or not. However, if the data is typical (and it's a big if!) then we can form a rule:

IF hyperactive IS FALSE
   THEN antibiotics IS TRUE

Of course, there was one dog, Benji, that required antibiotics in spite of being hyperactive. This means that we will need at least one other rule to capture the picture entirely. Let's try another symptom:

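Had inoculations?

    Yes: Lucy (none), Pippa (antibiotics), Toby (antibiotics), Jake (none), Sandy (antibiotics)
    No:  Dylan (antibiotics), Max (none), Benji (antibiotics), Rex (antibiotics)

This time the choice is not so good. Both sides of the diagram contain dogs that needed antibiotics and dogs that did not, so we can't use whether the dog has had inoculations or not as a question to decide about antibiotics.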
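The two splits can be reproduced with a short Python sketch. The table is stored as one string of Y/N answers per row, in the same dog order as the table; the names DOGS, TABLE, split and is_pure are invented for this example.

```python
DOGS = ["Dylan", "Max", "Lucy", "Pippa", "Toby", "Benji", "Jake", "Sandy", "Rex"]
TABLE = {
    "cough":        "YNNYNYNNN",
    "hyperactive":  "NYYNNYYNN",
    "inoculated":   "NNYYYNYYN",
    "foreign_body": "YNNYNYNNY",
    "wound":        "YNNNYNNYY",
    "constipated":  "NYYNNYNNN",
    "antibiotics":  "YNNYYYNYY",
}

def split(symptom):
    """Return (dogs with the symptom, dogs without it).

    Each dog is a (name, needed_antibiotics) pair."""
    with_symptom, without_symptom = [], []
    for i, dog in enumerate(DOGS):
        entry = (dog, TABLE["antibiotics"][i] == "Y")
        if TABLE[symptom][i] == "Y":
            with_symptom.append(entry)
        else:
            without_symptom.append(entry)
    return with_symptom, without_symptom

def is_pure(group):
    """True if every dog in the group has the same antibiotics outcome."""
    return len({needed for _, needed in group}) <= 1

for symptom in ("hyperactive", "inoculated"):
    yes_group, no_group = split(symptom)
    print(symptom, is_pure(yes_group), is_pure(no_group))
```

A pure group is one from which a rule can be read off directly, such as the not-hyperactive group.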

A program to perform data mining would try all the symptoms, and then combinations of symptoms. There is a methodical way of determining how good a symptom or combination of symptoms is in splitting the data.
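One widely used measure of how good a split is (it may not be the one meant here) is "information gain", which is used by decision-tree learners such as ID3. It compares how mixed the antibiotics answers are before the split with how mixed they are afterwards. The sketch below scores single symptoms only, and the names TABLE, entropy and information_gain are invented for the example.

```python
from math import log2

# One string of Y/N answers per row of the table, in the same dog order.
TABLE = {
    "cough":        "YNNYNYNNN",
    "hyperactive":  "NYYNNYYNN",
    "inoculated":   "NNYYYNYYN",
    "foreign_body": "YNNYNYNNY",
    "wound":        "YNNNYNNYY",
    "constipated":  "NYYNNYNNN",
    "antibiotics":  "YNNYYYNYY",
}

def entropy(labels):
    """How mixed a group of Y/N labels is: 0 for a pure group, 1 for a 50/50 mix."""
    if not labels:
        return 0.0
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / len(labels)
        result -= p * log2(p)
    return result

def information_gain(symptom):
    """Reduction in entropy of the antibiotics answer after splitting on symptom."""
    target = TABLE["antibiotics"]
    groups = {"Y": [], "N": []}
    for has_symptom, needed in zip(TABLE[symptom], target):
        groups[has_symptom].append(needed)
    remainder = sum(len(g) / len(target) * entropy(g) for g in groups.values())
    return entropy(target) - remainder

symptoms = [s for s in TABLE if s != "antibiotics"]
ranking = sorted(symptoms, key=information_gain, reverse=True)
print(ranking[0], ranking[-1])
```

On this table the measure ranks hyperactive as the best single symptom and inoculations as the worst, which agrees with the two diagrams above.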


