
The Structure of (most) Neural Networks

Neuroscientists tell us that the human brain consists of tens of billions of specialised cells called neurons. There are many different sorts of neurons, but they can be "boiled down" to the following (very) simple example:

The main cell body receives signals from other neurons via the spindly extensions on the left, called dendrites because they resemble the branches of a tree. The neuron "builds up" the incoming signal, and when the signal reaches a certain threshold level the neuron "fires", discharging the signal down its long axon and over to other cells through connections at the end of the axon called synapses. These are connected to the dendrites of other cells.

This model is simplified further when it is simulated on a computer. Generally, the model used is as follows:

The neuron is the circle at the centre of the diagram. It has several inputs which are the lines on the left (a real brain cell can have thousands of inputs) and a single output, the line on the right. The neurons are "wired up" so that the outputs of some feed the inputs of others. If you compare the computer model with the diagram above it, you will see that the synapse connections have disappeared. This is compensated for by the fact that several other cells will take their input from the output of this one neuron.

The inputs to the cell arrive in the form of signals down the input lines. In the human brain, the signals pass from one cell to the next chemically, across the synapses. The signals build up in the cell and eventually it discharges through the output - we say that the cell fires. Then the cell can start building up signals again.

In the human brain, neurons are connected in extremely complex networks with countless interconnections. An artificial neural net has a much simpler structure. The most common structure is called a feedforward Multi-Layer Perceptron, where neurons are arranged in layers, with the outputs of the neurons in each layer being connected to the inputs of the neurons in the layer above.

Usually, the net has three layers (it can be shown that, given enough neurons in the hidden layer, three layers are in principle all that you need to approximate any continuous function), called the Input layer, the Hidden layer and the Output layer. The input to the entire net is presented to the input layer (at the bottom of the diagram), and is fed up the layers, with the output layer providing the output from the neural net that the user requires.

Each connection from one node to another carries a strength which indicates how important the connection is. Strong connections have more influence on the node they connect into than weaker ones. They contribute more to the firing of the cell. The information carried by the network is stored in the differing strengths of the node connections.
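The idea of connection strengths can be sketched in a few lines of Python. This is only a minimal illustration of a simulated neuron: the function name neuron_fires, the strengths 0.9 and 0.2 and the threshold of 0.5 are all assumptions chosen for the example, not values taken from any particular network.

```python
def neuron_fires(inputs, strengths, threshold):
    """Return 1 if the strength-weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(x * s for x, s in zip(inputs, strengths))
    return 1 if total >= threshold else 0


# Hypothetical neuron with one strong connection (0.9) and one weak one (0.2).
STRENGTHS = [0.9, 0.2]
THRESHOLD = 0.5

strong_only = neuron_fires([1, 0], STRENGTHS, THRESHOLD)  # signal on the strong input
weak_only = neuron_fires([0, 1], STRENGTHS, THRESHOLD)    # signal on the weak input
print(strong_only, weak_only)  # prints: 1 0
```

A signal arriving down the strong connection is enough to make this neuron fire on its own, whereas the same signal arriving down the weak connection is not.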

More and more nowadays neural networks are being implemented in specially created integrated circuits. However, most programmers simulate neural networks using software. The strengths between the nodes are called weights in the program and are stored as numbers.
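As a rough sketch of what such a simulation might look like, the Python below stores the weights of a tiny three-layer feedforward net as lists of numbers and feeds an input up through the layers. Everything specific here is an assumption made for illustration: the function names, the use of simple threshold neurons, the layer sizes and the weight values themselves. The input layer is taken to pass its values straight through to the hidden layer.

```python
def layer_output(inputs, weights, thresholds):
    """Compute the outputs of one layer of threshold neurons.

    weights[j][i] is the weight on the connection from input i to neuron j.
    """
    outputs = []
    for neuron_weights, threshold in zip(weights, thresholds):
        total = sum(x * w for x, w in zip(inputs, neuron_weights))
        outputs.append(1 if total >= threshold else 0)
    return outputs


def feedforward(inputs, hidden_weights, hidden_thresholds,
                output_weights, output_thresholds):
    """Feed the inputs through the hidden layer and then the output layer."""
    hidden = layer_output(inputs, hidden_weights, hidden_thresholds)
    return layer_output(hidden, output_weights, output_thresholds)


# Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, -1.0]]
HIDDEN_THRESHOLDS = [1.5, 0.5]
OUTPUT_WEIGHTS = [[1.0, 1.0]]
OUTPUT_THRESHOLDS = [0.5]

result = feedforward([1, 1], HIDDEN_WEIGHTS, HIDDEN_THRESHOLDS,
                     OUTPUT_WEIGHTS, OUTPUT_THRESHOLDS)
print(result)  # prints: [1]
```

Note that the only thing that distinguishes this net from any other net of the same shape is the numbers in the weight lists, which is where the information carried by the network lives.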

Defining some terms

Earlier, I used the term "feedforward". This means that the connections between one layer and the next only run in one direction. There are connections from layer 1 to layer 2, from layer 2 to layer 3 etc. but no connections in the other direction. The opposite of a feedforward net is called a recurrent net, which has feedback connections. The following diagram shows a recurrent net, which is basically a feedforward net with a few feedback connections added:

Recurrent networks often have what are called attractor states. This means that signals passing through the recurrent net are fed back and changed until they fall into a repeating pattern, which is then stable (i.e. it repeats itself indefinitely as it rattles round the loop). This is a little like a ball being placed on a slope and released - it rolls downhill until it reaches the bottom of a valley, and then stops. The input signals change until they reach one of these attractor states, and then they remain stable. The secret with recurrent networks is to train the weights so that the attractor states are the ones that you want.
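To see an attractor state in action, here is a minimal sketch in Python. It uses a small Hopfield-style network, which is one particular kind of recurrent net and is my own choice for the example rather than anything described above. The function names, the five-element pattern and the simple Hebbian rule used to set the weights are all assumptions made for illustration.

```python
def train_weights(pattern):
    """Set the weights with a simple Hebbian rule so that `pattern` becomes an attractor."""
    n = len(pattern)
    return [[0 if i == j else pattern[i] * pattern[j] for j in range(n)]
            for i in range(n)]


def settle(state, weights, max_sweeps=10):
    """Update the nodes one at a time until a whole sweep changes nothing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            total = sum(weights[i][j] * state[j] for j in range(len(state)))
            new_value = 1 if total >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break  # the state is now stable: an attractor has been reached
    return state


STORED = [1, -1, 1, -1, 1]        # hypothetical pattern we want as an attractor
weights = train_weights(STORED)

noisy = [1, 1, 1, -1, 1]          # the stored pattern with its second value flipped
settled = settle(noisy, weights)
print(settled)  # prints: [1, -1, 1, -1, 1]
```

The corrupted input plays the part of the ball released on the slope: it is fed round the loop and changed until it lands on the stored pattern, after which further updates leave it alone.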

The sorts of problems that neural networks are asked to solve can also be classified as linearly separable or linearly inseparable. The best way to understand this is to consider a neural network with two inputs and one single output:

I have labelled the two inputs x and y deliberately, as we can represent them as co-ordinates on an x-y graph. Let's suppose that we want the network to produce a 1 output if input y has a higher value than input x. The x-y graph can be divided into two regions, as follows:

The yellow region shows all the combinations of x and y that should produce the output 1, and the green region shows the combinations of the two inputs that should produce the output 0. This sort of classification is referred to as linearly separable as it is possible to draw a straight line between the two regions.
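A single threshold neuron is enough to draw that straight line. The Python sketch below is one way of doing it, assuming a weight of -1 on input x, a weight of +1 on input y and a threshold of zero. These values and the function name are my own choices for the example, and many other weight settings would do the same job.

```python
def y_greater_than_x(x, y):
    """Single threshold neuron that outputs 1 when y is greater than x, else 0."""
    weight_x, weight_y = -1.0, 1.0       # hypothetical hand-picked weights
    total = weight_x * x + weight_y * y  # equals y - x
    return 1 if total > 0 else 0


print(y_greater_than_x(1, 2))    # point above the line y = x, prints: 1
print(y_greater_than_x(2, 1))    # point below the line y = x, prints: 0
```

The dividing line between the two regions is exactly the set of points where the weighted sum is zero, which here is the line y = x.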

It's a different matter when you want the network to produce a 1 output when x and y have the same sign (i.e. both negative or both positive) and a 0 output when they have different signs. This is shown by the graph below:

In this case, the problem is called linearly inseparable as it is impossible to draw a single straight line on the graph which will separate the two regions, i.e. with only yellow on one side, and only green on the other. Needless to say, linearly separable tasks are a lot easier to teach to neural networks than linearly inseparable ones. Indeed, there are certain types of network (simple perceptrons and Hebbian networks, for example) which simply can't learn linearly inseparable tasks.
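The difference can be demonstrated with a small Python experiment. To keep things simple, the sketch assumes that the inputs only take the values -1 and +1, tries a coarse grid of weight and threshold settings for a single threshold neuron, and then compares it with a hand-wired net that has two hidden neurons. The function names, the grid and the hand-picked weights are all assumptions for illustration. The grid search is a demonstration rather than a proof, although a little algebra shows that no setting outside the grid works either.

```python
from itertools import product

POINTS = [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def same_sign_target(x, y):
    """The output we want: 1 when x and y have the same sign, else 0."""
    return 1 if x * y > 0 else 0


def single_unit(x, y, weight_x, weight_y, threshold):
    """A single threshold neuron with two inputs."""
    return 1 if weight_x * x + weight_y * y >= threshold else 0


def two_layer(x, y):
    """Hand-wired net: two hidden threshold neurons feeding one output neuron."""
    both_positive = 1 if x + y >= 1.5 else 0    # fires only for (1, 1)
    both_negative = 1 if -x - y >= 1.5 else 0   # fires only for (-1, -1)
    return 1 if both_positive + both_negative >= 0.5 else 0


# Try every weight and threshold from -3.0 to 3.0 in steps of 0.5.
GRID = [v / 2 for v in range(-6, 7)]
single_unit_solutions = [
    (wx, wy, t)
    for wx, wy, t in product(GRID, repeat=3)
    if all(single_unit(x, y, wx, wy, t) == same_sign_target(x, y) for x, y in POINTS)
]
two_layer_correct = all(two_layer(x, y) == same_sign_target(x, y) for x, y in POINTS)

print(len(single_unit_solutions))  # prints: 0
print(two_layer_correct)           # prints: True
```

Each hidden neuron draws one straight line of its own, and the output neuron combines the two, which is how the extra layer gets round the limitation of a single line.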

