


The heights were plotted on a chart as follows, with the horizontal axis showing the various heights of the people and the vertical axis showing the number of people with a given height:

Normally we would plot this graph as a bar chart, but when there are a large number of people, with heights that differ from each other by only a millimetre or two, then the bars become very thin, and it becomes possible to join the tops of the bars using a smooth line. For this reason, we tend to plot graphs that show vast amounts of data as smooth lines rather than bar charts.
Note the shape of the curve. It starts off low, then gradually climbs uphill, before levelling off and dropping in the same way. The curve is symmetrical (i.e. one side is a mirror image of the other) and the line of symmetry goes through the highest point of the curve. Such a curve is almost universal when it comes to measuring the distributions of various things in the world. We have found it by measuring the heights of 10,000 people, but you would get the same sort of curve if you measured the lengths of people's middle fingers, the lifetimes of lightbulbs, or the value of small change in people's pockets.
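To see how thin bars build up into this shape, here is a minimal Python sketch. It does not use real measurements: it simulates 10,000 heights with the standard library's random.gauss, and the centre of 170 cm, the spread of 10 cm, the 2 cm bar width and the function name simulate_height_counts are all assumptions chosen purely for illustration.

```python
import random
from collections import Counter


def simulate_height_counts(n_people=10_000, mean_cm=170.0, sd_cm=10.0,
                           bin_cm=2, seed=1):
    """Count simulated heights into bars that are bin_cm wide.

    All the default values are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)  # fixed seed so every run gives the same bars
    counts = Counter()
    for _ in range(n_people):
        height = rng.gauss(mean_cm, sd_cm)
        # Round the height down to the start of the bar it falls in.
        bin_start = int(height // bin_cm) * bin_cm
        counts[bin_start] += 1
    return counts


counts = simulate_height_counts()

# Print a sideways bar chart: one '#' for every 20 people in a bar.
for bin_start in sorted(counts):
    print(f"{bin_start:3d} cm  {'#' * (counts[bin_start] // 20)}")
```

Turned on its side, the printout is low at both ends and highest near the middle. Making bin_cm smaller and n_people larger gives thinner bars and a smoother outline.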
Because the curve is so common, it has a special name - the normal distribution curve. It is also sometimes called a bell-shaped curve (for obvious reasons) or a Gaussian distribution curve (after the mathematician Carl Friedrich Gauss who researched it). The middle (highest) point represents the average value of all the heights - indeed if we calculated the mean, the median and the modal height of everyone in town, they would all lie at exactly this point.
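The way the three averages coincide can be checked on a small scale. The sketch below uses Python's standard statistics module on nine made-up heights (in centimetres) that are arranged symmetrically around 170. The numbers are invented for illustration and are not taken from the town's data.

```python
import statistics

# Nine invented heights in cm, symmetrical about 170.
heights = [160, 165, 165, 170, 170, 170, 175, 175, 180]

mean_height = statistics.mean(heights)      # total of the heights / how many
median_height = statistics.median(heights)  # middle value once sorted
modal_height = statistics.mode(heights)     # most frequent value

print(mean_height, median_height, modal_height)  # all three are 170
```

With real measurements the three values would agree only approximately, since no real sample is perfectly symmetrical.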
Most people's heights lie fairly close to the average value, but there are a few dwarfs (towards the left side of the graph) and a few giants (towards the right side). The graph agrees with common sense, saying that the more people's heights differ from the average value, the rarer those people are.
However, suppose I go ten miles down the road and repeat the procedure, measuring the heights of the people in the next town. Again, I plot a graph showing these results, and then I compare the two graphs, as in the following diagram:


Both these graphs have the same middle point (line of symmetry), i.e. they both have the same mean, median and mode. Both graphs are normally distributed. However, one glance tells us that they are very different from each other! The graph of the next town is much narrower, showing that the people there don't differ as much from the average as in my town. In my town, anyone who is taller or shorter than average doesn't stand out much as there is quite a wide range, but in the next town, anyone who is even slightly taller or shorter than average is unusual (and probably has people pointing at him in the street).
Just describing these two graphs in terms of the mean, median and mode is therefore not enough. We also need some way of specifying the width of the graph - so that we can distinguish fat graphs from thin ones. This measure is called the standard deviation.
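As a rough illustration of why the average alone is not enough, the following Python sketch compares two tiny made-up "towns" of five people each. The heights are invented, and statistics.pstdev is simply the standard library's ready-made routine for the standard deviation of a whole population, used here as a stand-in for a calculation done by hand.

```python
import statistics

# Invented heights in cm: same average, very different spread.
my_town = [150, 160, 170, 180, 190]
next_town = [166, 168, 170, 172, 174]

mean_my = statistics.mean(my_town)
mean_next = statistics.mean(next_town)

# pstdev treats the list as the whole population, not a sample of it.
spread_my = statistics.pstdev(my_town)
spread_next = statistics.pstdev(next_town)

print(f"My town:   mean {mean_my}, standard deviation {spread_my:.2f}")
print(f"Next town: mean {mean_next}, standard deviation {spread_next:.2f}")
```

Both towns have a mean of 170, but the standard deviation comes out at about 14.14 for my town and about 2.83 for the next town, so this single figure captures the difference between the fat graph and the thin one.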
The standard deviation is a single figure which describes the width of the curve at a specific height. After all, it's no good just stating what the width of the curve is - these curves get narrower as you go up them, and wider as you go down them. You have to state at what height you are measuring the width. In the case of the standard deviation, if the highest point of the curve is N units up from the horizontal axis, then the standard deviation is the distance from the line of symmetry out to the curve (i.e. half the full width of the curve) at the height N/√e units up, where e ≈ 2.718, as shown in the following diagram:

In practice, this means that you are told the half-width of the curve at the height of about 0.607 times its maximum height, so if the highest point of the curve were 1000 units up from the axis, then you would be told the half-width of the curve at the point 607 units up from the axis.
There are two ways of calculating the standard deviation, both equally good. Each is given in statistical textbooks as a formula, but I have put them in this tutorial as both a formula and a set of instructions. In general, if you can remember the formula, then you can turn it into the set of instructions. If you can remember the set of instructions, then you can reconstruct the formula from it.
The symbol for standard deviation is σ. This is the Greek letter "sigma" (in its lower case form), which is the Greek version of the letter S. We will also in a moment meet the upper case letter "sigma" (Σ), as it also appears in the formulae.