## Moments of a Probability Distribution

We are now familiar with some of the properties of probability distributions. On this page we will introduce a set of numbers that describe various properties of such distributions. Some of these have already been encountered in our previous discussion, but now we will see that these fit into a pattern of quantities called moments of the distribution.

### Moments

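Let $f(x)$ be any function which is defined and positive on an interval $[a,b]$. We might refer to the function as a distribution, whether or not we consider it to be a probability density distribution. Then we will define the following moments of this function:

$$M_0 = \int_a^b f(x)\,dx, \qquad M_1 = \int_a^b x\,f(x)\,dx, \qquad M_2 = \int_a^b x^2 f(x)\,dx, \qquad \ldots, \qquad M_j = \int_a^b x^j f(x)\,dx.$$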

Observe that moments of any order are defined by integrating the distribution multiplied by a suitable power of $x$ over the interval $[a,b]$. In practice, however, only the moments up to the second are commonly used to describe attributes of a distribution.
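As a rough numerical illustration of these definitions, the sketch below approximates a moment of order $j$ with a simple midpoint rule. The helper name `moment`, the example function $f(x) = 2x$ on $[0,1]$, and the number of subintervals are assumptions chosen purely for illustration.

```python
def moment(f, a, b, j, n=10_000):
    """Approximate the j-th moment of f on [a, b] with the midpoint rule.

    This estimates the integral of x**j * f(x) over [a, b]
    using n subintervals of equal width.
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += (x ** j) * f(x)
    return total * h


# Hypothetical example distribution: f(x) = 2x on [0, 1].
def f(x):
    return 2.0 * x


m0 = moment(f, 0.0, 1.0, 0)  # exact value is 1
m1 = moment(f, 0.0, 1.0, 1)  # exact value is 2/3
m2 = moment(f, 0.0, 1.0, 2)  # exact value is 1/2
print(m0, m1, m2)
```

The printed values should agree with the exact integrals $1$, $2/3$ and $1/2$ to within the small error of the midpoint rule.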

### Moments of a Probability Density Distribution

In the particular case that the distribution is a probability density, $p(x)$, on $[a,b]$, we have already established the following:

$$M_0 = \int_a^b p(x)\,dx = 1, \qquad M_1 = \int_a^b x\,p(x)\,dx = \bar{x}.$$

This follows from the facts that probability distributions are normalized so that the area under the curve is always 1 (hence the zeroth moment is 1), and that the average, or mean, of the distribution is defined by the integral that also happens to be the first moment. In the past we have used the symbol $\bar{x}$ to represent the mean or average value of $x$, but the symbol $\mu$ is also often used for this quantity.
But what role does the second moment,

$$M_2 = \int_a^b x^2\,p(x)\,dx,$$

play? We will shortly see that the second moment helps describe the way that the "mass" or probability density is distributed about its mean. For this purpose, we must describe the notion of variance or standard deviation.

### Variance and Standard Deviation

Two kids of roughly the same size can balance on a teeter-totter by sitting very close to the point at which the beam pivots as shown in the diagram below.

They can also achieve a balance by sitting at the very ends of the beam, equally far away as shown in the next diagram.

In both cases, the center of mass of the distribution is at the same place: precisely at the pivot point. However, the mass is distributed very differently in these two cases. In the first case, the mass is clustered close to the center, whereas in the second, it is distributed further away. The line segment under the two diagrams represents how far away the masses are from the center of mass. In the first case, this distance is small. In the second case it is larger.

If we want to be able to describe how mass is distributed, we need to talk about attributes of the mass distribution other than just where its center of mass is located. Similarly, if we want to explain to someone how a probability density distribution is distributed about its mean, we would have to consider moments higher than the first. This is precisely what we shall do below. We will use the idea of the variance to describe whether the distribution is clustered close to its mean, or spread out over a great distance from the mean.

The variance, $V$, is defined as the average value of the quantity $(x-\mu)^2$, the squared distance from the mean $\mu$. This average is taken over the whole distribution. (The reason for the square is that we would not like values to the left and right of the mean to cancel out.)

The standard deviation, $\sigma$, is defined as the square root of the variance: $\sigma = \sqrt{V}$, where $V$ denotes the variance.

If we had a random variable that takes on only discrete values $x_i$, with probability $p_i$, and this discrete probability distribution has mean $\mu$, we would define the variance as the average given by

$$V = \sum_i (x_i - \mu)^2\, p_i.$$

Note that it is not necessary to divide by the number of values because the sum of the discrete probabilities is 1, i.e. $\sum_i p_i = 1$. Now for a continuous probability density $p(x)$ on $[a,b]$, with mean $\mu$, we define similarly

$$V = \int_a^b (x-\mu)^2\, p(x)\,dx.$$

The standard deviation is then

$$\sigma = \sqrt{V} = \sqrt{\int_a^b (x-\mu)^2\, p(x)\,dx}.$$
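To connect these definitions back to the teeter-totter picture, here is a minimal sketch of the discrete case. It treats the two children as a discrete distribution with probability 1/2 at each seat. The function name `discrete_variance` and the seat positions (measured from the pivot, in arbitrary units) are assumptions made for illustration.

```python
import math


def discrete_variance(values, probs):
    """Return (mean, variance, standard deviation) of a discrete distribution.

    The variance is the probability-weighted average of (x - mean)**2.
    """
    mean = sum(x * p for x, p in zip(values, probs))
    var = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    return mean, var, math.sqrt(var)


# Hypothetical seat positions relative to the pivot; each child carries half the "mass".
near = discrete_variance([-0.5, 0.5], [0.5, 0.5])
far = discrete_variance([-2.0, 2.0], [0.5, 0.5])

print(near)  # (0.0, 0.25, 0.5)
print(far)   # (0.0, 4.0, 2.0)
```

Both arrangements have the same mean (the pivot), but the variance and standard deviation are much larger when the children sit far from the center.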

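Let us see what this implies about the connection between the variance and the moments of the distribution. Writing $V$ for the variance, $\mu$ for the mean, and $p(x)$ for the probability density on $[a,b]$, we expand the square in the equation for variance and calculate that

$$V = \int_a^b (x-\mu)^2\, p(x)\,dx = \int_a^b \left(x^2 - 2\mu x + \mu^2\right) p(x)\,dx.$$

Thus

$$V = \int_a^b x^2\, p(x)\,dx \;-\; 2\mu \int_a^b x\, p(x)\,dx \;+\; \mu^2 \int_a^b p(x)\,dx.$$

We recognize the integrals in the above expression, since they are simply moments of the probability distribution: the first is the second moment $M_2$, the second is the first moment $M_1 = \mu$, and the third is the zeroth moment $M_0 = 1$. Plugging in these facts, we arrive at

$$V = M_2 - 2\mu\cdot\mu + \mu^2 = M_2 - \mu^2.$$

Thus the variance is clearly related to the second moment and to the mean of the distribution. Further, the standard deviation is then

$$\sigma = \sqrt{V} = \sqrt{M_2 - \mu^2}.$$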
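The relationship between the variance, the second moment, and the mean can be checked numerically. The sketch below computes the variance two ways: directly, as the average of the squared distance from the mean, and as the second moment minus the square of the mean. The density $p(x) = 2x$ on $[0,1]$ and the helper `integrate` (a simple midpoint rule) are assumptions chosen for illustration.

```python
def integrate(g, a, b, n=10_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def p(x):
    # Hypothetical probability density on [0, 1]; its total area is 1.
    return 2.0 * x


a, b = 0.0, 1.0
mu = integrate(lambda x: x * p(x), a, b)        # first moment (mean)
m2 = integrate(lambda x: x ** 2 * p(x), a, b)   # second moment

# Variance computed directly from its definition.
v_direct = integrate(lambda x: (x - mu) ** 2 * p(x), a, b)
# Variance computed from the moments.
v_moments = m2 - mu ** 2

print(v_direct, v_moments)
```

For this density the exact values are $\mu = 2/3$ and a second moment of $1/2$, so both computations should give approximately $1/2 - 4/9 = 1/18$.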

### Example

Consider the continuous distribution in which the probability density is constant, $p(x) = C$, for values of $x$ in the interval $[a,b]$ and zero for values outside this interval. Such a distribution is called a uniform distribution. (It has the shape of a rectangular band of height $C$ and base $b-a$.) It is easy to see that the value of the constant $C$ should be $1/(b-a)$ so that the area under this rectangular band will be 1, in keeping with the property of a probability distribution.
We compute that

$$M_0 = \int_a^b \frac{1}{b-a}\,dx = \frac{b-a}{b-a} = 1$$

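(this was already known to us, since we have determined that the zeroth moment of any probability density is 1). We also find that

$$M_1 = \int_a^b \frac{x}{b-a}\,dx = \frac{1}{b-a}\left.\frac{x^2}{2}\right|_a^b = \frac{b^2 - a^2}{2(b-a)}.$$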

This last expression can be simplified by factoring $b^2 - a^2 = (b-a)(b+a)$, leading to

$$\mu = M_1 = \frac{(b-a)(b+a)}{2(b-a)} = \frac{a+b}{2}.$$

Thus we have found that the mean is in the center of the interval [a,b], as expected. The median would be at the same place by a simple symmetry argument: half the area is to the left and half the area is to the right of this point.

To find the variance we might first calculate the second moment,

$$M_2 = \int_a^b \frac{x^2}{b-a}\,dx.$$

It can be shown by simple integration that this yields the result

$$M_2 = \frac{b^3 - a^3}{3(b-a)} = \frac{a^2 + ab + b^2}{3}.$$

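We would then compute the variance as the second moment minus the square of the mean, $\mu = (a+b)/2$:

$$V = M_2 - \mu^2 = \frac{b^3 - a^3}{3(b-a)} - \frac{(a+b)^2}{4}.$$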

After simplification, we get

$$V = \frac{(b-a)^2}{12}.$$

The standard deviation is then

$$\sigma = \sqrt{V} = \frac{b-a}{2\sqrt{3}}.$$
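The results for the uniform distribution can be verified with exact rational arithmetic. The sketch below uses the closed form obtained by integrating $x^j/(b-a)$ over $[a,b]$. The function name `uniform_moment` and the endpoints $a = 2$, $b = 5$ are assumptions chosen for illustration.

```python
import math
from fractions import Fraction


def uniform_moment(a, b, j):
    """Exact j-th moment of the uniform density 1/(b - a) on [a, b].

    Integrating x**j / (b - a) from a to b gives
    (b**(j+1) - a**(j+1)) / ((j + 1) * (b - a)).
    """
    a, b = Fraction(a), Fraction(b)
    return (b ** (j + 1) - a ** (j + 1)) / ((j + 1) * (b - a))


a, b = 2, 5  # hypothetical interval endpoints

m0 = uniform_moment(a, b, 0)
mean = uniform_moment(a, b, 1)
m2 = uniform_moment(a, b, 2)
variance = m2 - mean ** 2
sigma = math.sqrt(variance)

print(m0, mean, m2, variance)  # 1 7/2 13 3/4
print(sigma)                   # approximately 0.866
```

With these endpoints the mean is $(a+b)/2 = 7/2$ and the variance is $(b-a)^2/12 = 3/4$, which matches the formulas derived in the example.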