Main Menu

Hebbian Links - the simplest possible neural net!

In 1949, the psychologist Donald Oldings Hebb (Blimey, I bet he was teased at school!) wrote a book called "Organisation of Behavior". In this book, he attempted to explain how neurons might learn inside the human brain. The theory that he proposed has since become known as Hebbian Learning.

Basically, the theory goes like this: Two neurons in the brain may well have a connection between them. The neurons can be activate (i.e. firing on all cylinders) or inactive (asleep). If both the neurons are active at the same time, then the strength of the connection between the two should increase. If the neurons are not both active at the same time (i.e. one or both of them are inactive), then the strength of the connection does not increase. This idea of a connection between neurons strengthening is referred to as Long Term Potentiation (or LTP).

This rather simplistic theory has been expanded on in later years, by Hebb himself and by other researchers. Another rule has been added to the theory to produce something called Anti-Hebbian learning. This proposes that if just one of the neurons with the connection between them is active, and the other one is asleep, then the strength of the connection will actually weaken. If both the neurons are inactive, then the connection is not affected. This idea is called Long Term Depression (or LTD).

The neuron that provides the signal through the connection is referred to in the literature as the pre-synaptic neuron, the one that accepts the signal is the post-synaptic neuron, as shown in this diagram:

Here is the theory in an easy-to-digest table:

State of first neuron in connection
State of second neuron in connection
What happens to the link between them?
Inactive
Inactive
Nothing
Inactive
Active
It weakens
Active
Inactive
It weakens
Active
Active
It strengthens

In fact, Hebb went beyond the idea of connections altering. He proposed that individual neurons or groups of neurons would spontaneously hook themselves up to form reverberating networks: neuron A would activate neuron B, which activates neuron C which then reactivates neuron A again. A circuit like this would get round the problem of the fact that neurons fatigue (they grow tired the longer they are active). This was Hebb's idea of Cell Assemblies, and has an entire section dedicated all to itself.

Using Hebb's idea of connections strengthening with use and weakening with non-use, we can model a simple Hebbian connection (also called a Hebbian link) between two neurons. What we are modelling is not the neurons themselves, but the strength of the connection between them, so an individual connection boils down to one floating point number - what could be simpler?

Of course, you can't get much of a performance from just one connection, so we might as well model a whole bunch of them in one go. The sort of situation where you would have such a Hebbian net might be like the following:

Imagine that your input was a grid of 5 squares by 5 squares. On to this grid were placed one of two images, representing a (crudely drawn) happy face, and a sad face:

We want a Hebbian net to distinguish between these two faces, so we need one output, which will be 1 for the happy face, and 0 for the sad face. There should be a Hebbian link between all 25 input elements and the single output. In a later example, we will see how to adapt this when there is more than one output. This single output is shown in the diagram below:

The entire Hebbian net is modeled simply by 25 numbers, each of which starts off with some random weight value, say between 0 and 1:

var strengths : array [1..5,1..5] of real;

procedure initialise_strengths;
var i,j : integer;
begin for i:=1 to 5 do
       for j:=1 to 5 do
        strengths[i,j]:=random(1000)/1000
end;

The function random(1000) returns a random number in the range 0 to 999, and random(1000)/1000 turns this random number into a decimal in the range 0 to 0.999. Watch yourselves if you're doing this in C++ or a similar language - both random(1000) and 1000 are integers, so the compiler will assume you want integer division - giving you the answer 0. You C++ merchants had better change the calculation to random(1000)/1000.0

Having set up the connections, they must be trained. Put an input pattern on the input, the desired output pattern that it is supposed to produce on the output and then adjust the strengths of the connections appropriately. This leads to a slight problem - the exact nature of the training rule. If we just take Hebb's idea of connections strengthening, without worrying about them weakening again, then we use the generalised Hebb rule:

Δw = η.ai.aj

In this case, Δw is the change in the weight connecting input element i to output element j. ai is the activation of input element i and aj is the activation of output element j. η is constant term that prevents the changes in the weight from being too extreme.

For example, suppose that a weight from an input neuron to an output neuron is currently 0.75. Suddenly an input and output pattern are presented, and the input element has activation of 0.5 and the output element has an activation of 0.2. If the constant η is 0.01, then the weight is changed (increased) by the value of 0.5 x 0.2 x 0.01 = 0.001. This means that the weight increases from 0.75 to 0.751. Here is the code for updating all the weights for the happy/sad face example. I assume that the input pixels are stored in an array ipt[] and the single output element (0 or 1) is given by opt:

const ETA = 0.01;

var ipt : array [1..5,1..5] of real;

procedure update_strengths;
var i,j : integer;
begin for i:=1 to 5 do
       for j:=1 to 5 do
        strengths[i,j]:=strengths[i,j] + ETA * ipt[i,j] * opt;
end;

Of course, providing that the inputs and outputs are always 0 or greater, the weights can never decrease, they can only ever increase. This is where the generalised Hebbian rule can be extended. If the formula above is changed to the following:

Δw = η.(2ai - 1).aj

then the weight will be decreased whenever the activation of input i is less than 0.5. This is because (2ai - 1) will produce a negative number. However, the value of aj will still always be positive. This situation is called Post-not-pre LTD as it means that the strength will be reduced whenever the input neuron (the "pre-synaptic neuron") is asleep but the output neuron (the "post-synaptic neuron") is awake.

The formula could also be written the other way round:

Δw = η.ai.(2aj - 1)

In this case, the weight is reduced whenever the input neuron is active but the output neuron is inactive, i.e. it is Pre-not-post LTD. You might be tempted to go one stage further and incorporate both types of LTD in one formula:

Δw = η.(2ai - 1).(2aj - 1)

but a moment's thought reveals this to be silly. In this case, if both the input and the output neurons were off (i.e. had activation 0 or some low activation - certainly lower than 0.5), then the two negative numbers produced would multiply to give a positive number, and the weight would actually increase! Just stick to one or the other, whichever one gives you the best results.

Here is the adapted code which updates the strengths. I have included a variable which specifies the "training type", i.e. whether the strengths are to be updated using LTP only (the generalised Hebb rule), Post-not-pre LTD or Pre-not-post LTD:

const ETA = 0.01;
      LTP = 1;            {The 3 different training types}
      POST_NOT_PRE = 2;
      PRE_NOT_POST = 3;

var ipt : array [1..5,1..5] of real;
    train_type : integer;

procedure update_strengths;
var i,j : integer;
begin for i:=1 to 5 do
       for j:=1 to 5 do
        case train_type of
         LTP : strengths[i,j]:=strengths[i,j] + ETA * ipt[i,j] * opt;
         POST_NOT_PRE : strengths[i,j]:=strengths[i,j]
                              + ETA * (2*ipt[i,j]-1) * opt;
         PRE_NOT_POST : strengths[i,j]:=strengths[i,j]
                              + ETA * ipt[i,j] * (2*opt-1)
        end
end;

One question is, "Should the weights be limited to the range 0 to 1?" There is nothing in Hebb's rule that says weights can't increase indefinitely, and nothing in the LTD rules to say that they can't become negative. If you did want to constrain them, you could add lines of code that make sure they never go outside the range 0 to 1:

procedure update_strengths;
var i,j : integer;
begin for i:=1 to 5 do
       for j:=1 to 5 do
        case train_type of
         LTP : strengths[i,j]:=strengths[i,j] + ETA * ipt[i,j] * opt;
         POST_NOT_PRE : strengths[i,j]:=strengths[i,j]
                              + ETA * (2*ipt[i,j]-1) * opt;
         PRE_NOT_POST : strengths[i,j]:=strengths[i,j]
                              + ETA * ipt[i,j] * (2*opt-1)
        end;
      if strengths[i,j] > 1
        then strengths[i,j]:=1;
      if strengths[i,j] < 0
        then strengths[i,j]:=0
end;

Don't forget to call the update_strengths procedure many times, so that the strengths can adjust to their final values. In the following code, I have not defined the procedure copy_input, but this simply copies training pattern number pattern to the array inp[].

for training_run := 1 to 10000 do
  for pattern := 1 to NUM_TRAINING_PATTERNS do
    begin copy_input(pattern);
          update_strengths
    end;

Generally, Hebbian networks learn very slowly. You will find you have to present input patterns many times (often thousands of times) to make them learn correctly, and even then they don't always get it right! If you have, say, 8 training patterns (possibly 4 happy faces and 4 sad ones), then you would present each face with its appropriate output to the net in turn, update the weights, then repeat that for the other 7 faces, and then repeat the whole procedure as many times as are needed.

Running the network

Having set up the connections, you now want to run the network to see if it can correctly classify the happy and sad face pictures that it has been trained upon. This involves putting the test pattern at the inputs, multiplying each of the input elements by the weight of its connection and adding up the results. This will produce an answer for the output, which will probably not be exactly 0 or 1, but will hopefully be close to one or the other. If the answer is above 0.5, then the net has classified the face as happy (i.e. output = 1). If less than 0.5, then that means sad (i.e. output = 0):

procedure run_net;
var i,j : integer;
    sum : real;
begin sum:=0;
      for i:=1 to 5 do
       for j:=1 to 5 do
        sum := sum + strengths[i,j] * ipt[i,j];
      if sum > 0.5
       then writeln('This is a happy face.')
       else writeln('This is a sad face.')
end;

Two-dimensional Hebbian networks

It is a fairly easy matter to extend the output to two dimensions as well as the input. In this case, the simple output variable, opt, is replaced by an array opt[] which represents an output grid:

In this diagram, I have only shown the connections to one output element as including them all would make the diagram too cluttered. Since all the output elements are connected to all the input elements, the strengths[] array now has to have 4 dimensions, not 2. In this case strengths[a,b,c,d] would represent the strength of the connection between element [a,b] of the input pattern to element [c,d] of the output pattern. Apart from that, the code would be very similar to that of the previous example:

var strengths : array [1..5,1..5,1..3,1..3] of real;

procedure initialise_strengths;
var i,j,k,l : integer;
begin for i:=1 to 5 do
       for j:=1 to 5 do
        for k:=1 to 3 do
         for l:=1 to 3 do
          strengths[i,j,k,l]:=random(1000)/1000
end;

procedure update_strengths;
var i,j,k,l : integer;
begin for i:=1 to 5 do
       for j:=1 to 5 do
        for k:=1 to 3 do
         for l:=1 to 3 do
          case train_type of
           LTP : strengths[i,j,k,l]:=strengths[i,j,k,l]
                                       + ETA * ipt[i,j] * opt[k,l];
           POST_NOT_PRE : strengths[i,j,k,l]:=strengths[i,j,k,l]
                                + ETA * (2*ipt[i,j]-1) * opt[k,l];
           PRE_NOT_POST : strengths[i,j,k,l]:=strengths[i,j,k,l]
                                + ETA * ipt[i,j] * (2*opt[k,l]-1)
          end;
      if strengths[i,j,k,l] > 1
        then strengths[i,j,k,l]:=1;
      if strengths[i,j,k,l] < 0
        then strengths[i,j,k,l]:=0
end;

procedure run_net;
var i,j,k,l : integer;
    result : array [1..3,1..3] of real;
begin for k:=1 to 3 do
       for l:=1 to 3 do
        begin result[k,l]:=0;
              for i:=1 to 5 do
               for j:=1 to 5 do
                result[k,l] := result[k,l] + strengths[i,j,k,l] * ipt[i,j]
        end;
   {Put in code here to compare the resulting output of the net with
        the trained patterns}
end;

You see that the output grid needn't be the same size (or even the same shape) as the input pattern. In this case, I am assuming that the output grid is 3-by-3 and the input is 5-by-5.

The only real major change is the routine that runs the net. In this case, we can't simply sum the weighted inputs to produce a final answer, as the answer will come out as a grid of numbers. We have to take the grid of numbers that is produced and comapre it with all the desired outputs from the training patterns to see which one is the closest. That desired output pattern then represents the neural network's choice.

The source code section gives an example of a Hebbian net which has been trained on the "what-where" task, which is a task where a simple shape is placed somewhere on a grid and the network has to identify the shape (i.e. say what it is) and also locate it (say where it is). The where task is relatively easy for the Hebbian net to learn, but the what task (being linearly inseparable) is beyond its simple scope.

What do the egg-heads say about Hebbian links?

You want to read Hebb's original book? It's rather dull, I must warn you. He has also written other things since 1949. I happen to know he was already a fully-fledged professor in 1949 - imagine how old he must have been in 1980!

Data Mining - An example of Hebbian Links


Introduction
Top
Structure