The example that my colleague was using was that of deriving connections in web pages. A computer program would trawl thousands of web sites looking for keywords such as "Microsoft" and "system analyst", and would try to find connections between them. For instance, if 90% of the web pages that mentioned the keyword "cirrhosis" also mentioned the word "alcohol", then the system would conclude that there was a connection of some sort between alcohol and cirrhosis.
The problem was that his symbolic rule-generation system wasn't working very well. That was when I stopped him in a corridor and suggested Hebbian Links.
| Now, it won't have escaped your attention that Hebbian Links are a trained neural architecture. This means that we present it with an input pattern together with an output pattern, as shown in the diagram on the right. However, in the case of data mining from web pages, we don't have data in the form of input and output patterns, merely patterns of which words are present and which ones aren't. | ![]() |
| So we solve this problem by using the Hebbian Links as an auto associator. This means that the same pattern is presented both as the input and as the output to the Hebbian Links and they learn (hopefully) to associate the pattern with itself, as shown in the diagram on the right. | ![]() |
You will notice in that second diagram that there is no connection between each input node and the corresponding output node (i.e. there is no connection between input 1 and output 1, between input 2 and output 2 etc.) It should be pretty obvious why this is the case: Whenever the same pattern is present on both input and output, there will always be a correlation between each input bit and the corresponding output bit, and this would lead to a very strong connection between them if the network weren't restricted. We are looking for more subtle connections, so I have disabled any connections between corresponding input and output bits.
Two tests were carried out. A two-layer Hebbian net with 10 inputs and outputs was initialised as outlined above and then trained using LTP with post-not-pre LTD. Another similar Hebbian net was initialised and trained using LTP with pre-not-post LTD. The training involved presenting all 1000 records in turn and repeating the sequence 10 times. After training, the strengths of the weights were examined and are recorded in Tables 1 and 2 below. (All software was written in Turbo Pascal and run on a standard personal computer).
Table 1. Weights for the Hebbian net trained using LTP with post-not-pre LTD. Red type indicates weights that correspond to pre-programmed rules.
|
Output node |
|||||||||||
|
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
||
|
Input node |
1 |
0 | -3.75 | -4.444 | 12.779 | -1.174 | -1.964 | 5.267 | -0.415 | -0.047 | -2.605 |
|
2 |
-0.954 | 0 | 6.174 | 15.566 | 2.974 | 2.056 | 7.564 | 8.021 | 0.865 | 4.611 | |
|
3 |
-4.382 | 3.54 | 0 | 12.372 | 3.873 | -10.997 | -4.154 | 4.757 | 5.643 | -2.859 | |
|
4 |
24.414 | 24.361 | 24.286 | 0 | 17.623 | 10.926 | 14.85 | 8.528 | 5.493 | 18.893 | |
|
5 |
-0.972 | 0.441 | 4.216 | 5.88 | 0 | -1.706 | 2.454 | 2.52 | -4.53 | 4.558 | |
|
6 |
2.482 | 4.106 | -6.259 | 3.844 | 2.756 | 0 | 11.886 | 7.302 | -0.015 | 10.053 | |
|
7 |
10.466 | 9.958 | 1.229 | 8.626 | 7.563 | 12.428 | 0 | 4.028 | 6.964 | 8.058 | |
|
8 |
-2.568 | 3.471 | 2.873 | -5.182 | 0.331 | 0.447 | -3.152 | 0 | -1.525 | -2.49 | |
|
9 |
-0.423 | -1.927 | 5.156 | -6.536 | -4.977 | -5.057 | 1.469 | 0.269 | 0 | 11.378 | |
|
10 |
5.313 | 10.387 | 5.516 | 15.601 | 13.037 | 13.916 | 11.553 | 8.03 | 19.916 | 0 | |
Table 2. Weights for the Hebbian net trained using LTP with pre-not-post LTD.
|
Output node |
|||||||||||
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
||
|
Input node |
1 |
0 | -1.25 | -4.694 | 24.279 | -1.174 | 2.536 | 10.267 | -2.665 | -0.547 | 5.645 |
|
2 |
-3.454 | 0 | 3.424 | 24.566 | 0.474 | 4.056 | 10.064 | 3.271 | -2.135 | 10.361 | |
|
3 |
-4.132 | 6.29 | 0 | 24.122 | 4.123 | -6.247 | 1.096 | 2.757 | 5.393 | 5.641 | |
|
4 |
12.914 | 15.361 | 12.536 | 0 | 6.123 | 3.926 | 8.35 | -5.222 | -6.507 | 15.643 | |
|
5 |
-0.972 | 2.941 | 3.966 | 17.38 | 0 | 2.794 | 7.454 | 0.27 | -5.03 | 12.808 | |
|
6 |
-2.018 | 2.106 | -11.009 | 10.844 | -1.744 | 0 | 12.386 | 0.552 | -5.015 | 13.803 | |
|
7 |
5.466 | 7.458 | -4.021 | 15.126 | 2.563 | 11.928 | 0 | -3.222 | 1.464 | 11.308 | |
|
8 |
-0.318 | 8.221 | 4.873 | 8.568 | 2.581 | 7.197 | 4.098 | 0 | 0.225 | 8.01 | |
|
9 |
0.077 | 1.073 | 5.406 | 5.464 | -4.477 | -0.057 | 6.969 | -1.481 | 0 | 20.128 | |
10 |
-2.937 | 4.637 | -2.984 | 18.851 | 4.787 | 10.166 | 8.303 | -2.47 | 11.166 | 0 | |
The entries in the tables in red type correspond to the weights that should be high in order to encode the rules specified. Since there is no inherent direction in the connections, (the algorithm does not specify which of the patterns is input and which is output), the weight matrix should be approximately symmetrical about the major diagonal (with variations due to the initial random setting of the weights).
One problem that can be perceived with this experiment is that different rules that affect a particular data bit cannot be represented as separate items. Consider the following rules Bit 1, bit 3, bit 4 ® bit 10 and Bit 2, bit 5 ® bit 10. These would be represented by the network as high weight values from bits 1, 2, 3, 4 and 5 to bit 10, which would be indistinguishable from a rule Bit 1, bit 2, bit 3, bit 4, bit 5 ® bit 10.
This experiment has shown that the notion of a connectionist architecture being able to perform data mining is not an impossible one. Clearly, although rules can be represented using a simple two-layer Hebbian network, it suffers from the following limitations (among others):
Go back