Single and Multi-Layer Perceptrons

A perceptron is another name for a simulated neuron in a computer program. In the years since neurons were first simulated in software, they have gone under various names, depending on who was doing the simulating - a common one you will hear is adaline (ADAptive LINear Element - geddit?). Nowadays, the terminology has pretty much settled down. Each individual cell is called either a node or a perceptron, ...

... a neural network consisting of a layer of these things between the input and the output is a single layer perceptron ...

... and a network consisting of several layers of these stacked on top of each other between input and output is called a multi-layer perceptron.

As you can imagine, multi-layer perceptrons are more powerful than single-layer perceptrons. In 1969, Marvin Minsky and Seymour Papert wrote a book called Perceptrons in which they proved mathematically that single-layer perceptrons couldn't cope with classification tasks that were linearly inseparable - that is, tasks where no single straight line (or, in more dimensions, flat plane) can separate one class from the other. The exclusive-or (XOR) function is the classic example. This book was influential and marked a turning point in the fortunes of neural networks. Up to that time, people had put all their faith in single-layer perceptrons (and had achieved some impressive things with them). Suddenly, Minsky and Papert had shown that they were severely limited!

Effectively, the book killed off research into neural networks for about fifteen years. True, people did know that you could get more power out of perceptrons by stacking the layers up, but they didn't know how to train multi-layer perceptrons. Researchers didn't take neural networks seriously again until about the mid-eighties, when the back-propagation algorithm for training multi-layer perceptrons was discovered (or, more accurately, rediscovered and popularised, since it had been derived earlier without attracting much attention).
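To make the single-layer limitation concrete, here is a minimal sketch in Python. It assumes a single node with a simple threshold (step) activation, trained with the classic perceptron learning rule. The function names, learning rate and epoch limit are my own choices for illustration, not anything taken from Minsky and Papert's book. The node learns AND, which is linearly separable, but never settles on XOR, which is not.

```python
def step(z):
    """Threshold activation: fire (1) if the weighted sum is positive."""
    return 1 if z > 0 else 0


def train_perceptron(samples, epochs=100, lr=1.0):
    """Train one two-input node with the perceptron learning rule.

    Returns (weights, bias, converged). 'converged' is True only if a
    whole pass over the samples produced no classification errors.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            out = step(w[0] * x1 + w[1] * x2 + b)
            delta = target - out
            if delta != 0:
                # Nudge the weights towards the correct answer.
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:
            return w, b, True
    return w, b, False


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_weights, and_bias, and_converged = train_perceptron(AND)
xor_weights, xor_bias, xor_converged = train_perceptron(XOR)

print("AND learned:", and_converged)
print("XOR learned:", xor_converged)
```

Raising the epoch limit will not help with XOR: an error-free pass would mean the node had found a straight line separating the two classes, and no such line exists.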

Two or Three Layers?

There is some confusion when it comes to counting the layers in a multi-layer perceptron. Take another look at the MLP diagram above. That neural network contains two actual layers of processing nodes - represented by the circles. There is also a series of points at the bottom of the diagram where the input values are fed into the network. I, personally, would call that a two-layer network (simply counting the layers of processing nodes), but many researchers would call it a three-layer network (counting the points where the input is passed to the network as a separate layer). These researchers would refer to the three "layers" as the "input layer", "hidden layer", and "output layer".

It can be shown that a network such as the one in the diagram can, in theory, solve pretty much any classification problem, provided there are enough nodes in each layer. Indeed, researchers publish papers with titles such as "Three layers are enough for any problem." But when you meet such people, make sure you ask them to specify exactly what they mean by a layer.
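As a small illustration of what the extra layer buys you, here is a sketch in Python of a network with two layers of processing nodes that computes exclusive-or (XOR), a problem that is not linearly separable. The weights are set by hand rather than trained, and the names (node, xor_mlp, h_or, h_and) and the step activation are my own assumptions for the example.

```python
def step(z):
    """Threshold activation: fire (1) if the weighted sum is positive."""
    return 1 if z > 0 else 0


def node(inputs, weights, bias):
    """One processing node: weighted sum of the inputs, then threshold."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def xor_mlp(x1, x2):
    # First layer of processing nodes (the "hidden layer").
    h_or = node((x1, x2), (1, 1), -0.5)    # fires if at least one input is 1
    h_and = node((x1, x2), (1, 1), -1.5)   # fires only if both inputs are 1
    # Second layer of processing nodes (the "output layer"):
    # fires when h_or is on and h_and is off.
    return node((h_or, h_and), (1, -1), -0.5)


outputs = [xor_mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)  # [0, 1, 1, 0]
```

By my way of counting, this is a two-layer network: one layer holding h_or and h_and, and one layer holding the output node. Someone who counts the points where x1 and x2 enter as a layer in their own right would call exactly the same network a three-layer one.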
