Calculus Online: Lab 3
Welcome to Lab 3 of Math 100 Sections 103, 104, 107 and 109.
Probability Density Functions
In this lab, we will explore probability density functions in a little more detail than we did in class. Please read carefully, because some of the material will be new to you.
Probability density functions are a very important type of function since they give us a means of organizing a large collection of data. In fact, whether you are in the natural sciences, commerce or the social sciences, chances are good that you will encounter a probability density function in your future.
To get started, let's imagine that we are studying a particular type of fish. One of the things we would like to understand is approximately how big this type of fish is. The natural thing to do is to measure the size of many of these fish. This gives us a lot of data, and we would like some convenient way of analyzing the data. One way would be to use a bar graph like we see on the right.
Let's look carefully at this graph: the possible sizes of fish have been divided into ten ranges (which we'll call bins). Over each bin, the area of the bar represents the number of fish in that bin.
For that reason, the height of the bar over a bin represents the density of fish in that bin; that is, it tells us the ratio of the number of fish in the bin to the width of the bin. At first, that may seem a little strange, but we'll see one good reason for doing it this way at the end of the lab. For now, just notice that the number of fish in a bin depends on how big the bin is. Looking at the density will partially negate the effect of the bin size on our graph.
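To make the relationship between count, width, and height concrete, here is a minimal sketch in Python. The fish lengths, the range of 10 to 20 centimetres, and the function name `bin_densities` are all made up for illustration; they are not data from the lab.

```python
def bin_densities(sizes, lo, hi, n_bins):
    """Split [lo, hi) into n_bins equal bins; return (width, counts, densities)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in sizes:
        if lo <= s < hi:
            # Guard against floating-point rounding at the top edge.
            index = min(int((s - lo) / width), n_bins - 1)
            counts[index] += 1
    # Height of each bar = number of fish in the bin / width of the bin.
    densities = [c / width for c in counts]
    return width, counts, densities


# Hypothetical fish lengths in centimetres (illustrative only).
fish_sizes = [12.1, 13.4, 14.0, 14.2, 15.5, 15.9, 16.3, 17.8, 18.0, 19.6]

width, counts, densities = bin_densities(fish_sizes, 10.0, 20.0, 5)
for i, (c, d) in enumerate(zip(counts, densities)):
    # The area of each bar (height * width) recovers the count in that bin.
    print(f"bin {i}: count={c}, height={d}, area={d * width}")
```

Multiplying each height by the bin width gives back the count, which is exactly the statement that the area of a bar represents the number of fish in its bin.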
It is very important that you understand what we've just done. The next question will make sure you do.
Question 1 (2 marks)
If we would like to get a better feeling for the size of the fish, we could increase the number of bins. You will see the effect of doubling the number of bins to 20 on the right. This is constructed from the graph with 10 bins by dividing each bin into two new ones.
First notice that when the number of bins doubles, the width of each bin is halved. However, the combined area over the two new bins, which represents the number of fish in them, is the same as the area over the original bin.
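The following sketch checks this area-preservation claim numerically. It uses the same kind of made-up fish lengths as an illustration (the data, the range, and the helper name `bin_counts` are assumptions, and it uses 5 and 10 bins rather than 10 and 20 to keep the output short).

```python
def bin_counts(sizes, lo, hi, n_bins):
    """Count how many sizes fall in each of n_bins equal bins on [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in sizes:
        if lo <= s < hi:
            index = min(int((s - lo) / width), n_bins - 1)
            counts[index] += 1
    return width, counts


# Hypothetical fish lengths in centimetres (illustrative only).
fish_sizes = [12.1, 13.4, 14.0, 14.2, 15.5, 15.9, 16.3, 17.8, 18.0, 19.6]

coarse_width, coarse = bin_counts(fish_sizes, 10.0, 20.0, 5)
fine_width, fine = bin_counts(fish_sizes, 10.0, 20.0, 10)

for i, count in enumerate(coarse):
    # Area of the original bar: height (count / width) times width.
    coarse_area = (count / coarse_width) * coarse_width
    # Combined area of the two bars that replace it.
    fine_area = sum((fine[j] / fine_width) * fine_width for j in (2 * i, 2 * i + 1))
    print(f"original bin {i}: area {coarse_area}, two new bins: area {fine_area}")
```

The heights of the two new bars usually differ from the original height, but their combined area does not change, because no fish are added or removed when a bin is split.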
Here's what happens as we further increase the number of bins.
Notice that as the size of the bins becomes very small, the graph starts to look like the graph of a continuous function (shown on the left). We call this function a probability density function and denote it by $f(x)$. (Next term, we will define a probability density function to be something slightly different, but this is good enough for now.)
Here is what this graph means: very few fish have a size in a range where this function is close to zero. However, many fish are of a size where this function is large. We sometimes express this in terms of probability: we say that a fish is unlikely to have a size in a range where the function is close to zero. And a fish is likely to have a size in a range where the function has a large value.
A fire swept through a forest a few years ago. When the diameters of the trees in the forest are measured, the resulting probability density function is as below.
Let's remember the crucial fact about probability density functions: the area under the curve represents the number of occurrences in a given range.
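As a rough numerical illustration of this fact, the sketch below approximates the area under a density curve with a midpoint Riemann sum. The density `f`, the class size of 200, and the score range of 0 to 100 are hypothetical choices made for this example, not the curves shown in the lab.

```python
N_STUDENTS = 200  # hypothetical class size (an assumption for this sketch)


def f(x):
    """Hypothetical density of scores on [0, 100]; total area is N_STUDENTS."""
    return N_STUDENTS * 6.0 * x * (100.0 - x) / 100.0**3


def area_under(func, a, b, n=1000):
    """Approximate the area under func between a and b with a midpoint sum."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


total = area_under(f, 0.0, 100.0)    # occurrences over the whole range
below_50 = area_under(f, 0.0, 50.0)  # occurrences with x below 50
between = area_under(f, 40.0, 60.0)  # occurrences with x between 40 and 60

print(total, below_50, between)
```

Under these assumptions the area over the whole range comes out close to 200, the area to the left of 50 close to 100 (the curve is symmetric), and the area between 40 and 60 close to 59.2.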
When you drag the dot around in Question 3, you are exploring a new function, which is called the cumulative distribution function and denoted $F(x)$. We define $F(x)$ to be the area under the graph to the left of $x$. Since our graph is the probability density function $f(x)$, this is the same as saying that $F(x)$ measures the number of students scoring lower than $x$.
There is a very important and wonderful relationship between these two functions; namely,
the derivative of the cumulative distribution function is the probability density function.
To see why this is so, remember that
$$F'(x) = \lim_{h \to 0} \frac{F(x+h) - F(x)}{h}.$$
Let's think about the right hand side of that last equation: $F(x+h)$ measures the number of students scoring less than $x + h$, and $F(x)$ measures the number of students scoring less than $x$. This means that the difference $F(x+h) - F(x)$ measures the number of students scoring between $x$ and $x + h$. This is represented by the area under the graph between $x$ and $x + h$, as is shown to the right.
Now this area is approximately the same as the area of the rectangle whose width is $h$ and whose height is $f(x)$. This means that
$$\frac{F(x+h) - F(x)}{h} \approx \frac{f(x)\,h}{h} = f(x).$$
We recognize this ratio as an average rate of change of $F$.
As $h$ decreases toward zero, this ratio approaches the derivative, $F'(x)$,
and so we see that
$$F'(x) = f(x).$$
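The argument above can be checked numerically. In the sketch below, `f` is a hypothetical density of scores for a class of 200 students and `F` is its cumulative distribution function, written in closed form. Both functions and the class size are assumptions chosen for illustration. The difference quotient of `F` should approach `f(x)` as `h` shrinks.

```python
N_STUDENTS = 200  # hypothetical class size (an assumption for this sketch)


def f(x):
    """Hypothetical density of scores on [0, 100]."""
    return N_STUDENTS * 6.0 * x * (100.0 - x) / 100.0**3


def F(x):
    """Area under f to the left of x (the antiderivative of f with F(0) = 0)."""
    return N_STUDENTS * 6.0 * (50.0 * x**2 - x**3 / 3.0) / 100.0**3


def difference_quotient(func, x, h):
    """Average rate of change of func over the interval [x, x + h]."""
    return (func(x + h) - func(x)) / h


x = 30.0
for h in (10.0, 1.0, 0.1, 0.01):
    ratio = difference_quotient(F, x, h)
    print(f"h={h}: ratio={ratio}, f(x)={f(x)}, error={abs(ratio - f(x))}")
```

The printed error shrinks as `h` decreases, which is the numerical counterpart of the statement that the derivative of the cumulative distribution function is the probability density function.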
How did you like this lab? Was it interesting, challenging, frustrating, difficult, boring, fun? Please let us know by sending us email.