The Chain Rule tells us how to differentiate composite functions, and since so many
functions can be written as composites, it is a vitally important tool
for computing derivatives. Before we write down the Chain Rule, let's
revisit our earlier example.
An Example
Previously, we considered the composite of two linear functions:
and
. We found that the composite was
. Since these are linear functions, their
derivatives are constants; that is, they do not vary from point to
point. Notice that
The relationship between these numbers is no coincidence. To see
why, let's consider the rate of change of the variable
in terms of
. We will choose two points
and
. These points also give us
Notice that we have
Since the graphs of these functions are just straight lines, relating
their slopes also relates their derivatives.
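
The slope relationship for linear functions can be checked numerically. The sketch below is only an illustration: the two linear functions (slopes 3 and 2) and the two sample points are hypothetical choices, not necessarily the ones used in the earlier example.

```python
# Hypothetical linear functions, chosen only for illustration.
def g(x):
    # inner function u = g(x), slope 3
    return 3.0 * x + 1.0

def f(u):
    # outer function y = f(u), slope 2
    return 2.0 * u - 5.0

def slope(func, a, b):
    # rate of change of func between the inputs a and b
    return (func(b) - func(a)) / (b - a)

# Two sample x-values and the corresponding u-values.
x0, x1 = 1.0, 4.0
u0, u1 = g(x0), g(x1)

du_dx = slope(g, x0, x1)                   # change in u per unit change in x
dy_du = slope(f, u0, u1)                   # change in y per unit change in u
dy_dx = slope(lambda x: f(g(x)), x0, x1)   # change in y per unit change in x

print(du_dx, dy_du, dy_dx)
```

The slope of the composite equals the product of the two individual slopes, whichever two points are chosen.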

Another Example
This last question is important and sometimes a source of
confusion when understanding the general Chain Rule. You can see from
the picture that the derivative
is related to
the derivative
. We will now see why.
The Chain Rule
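The Chain Rule says that, for a composite $f(g(x))$,
$$(f \circ g)'(x) = f'(g(x))\, g'(x).$$
In the Leibniz notation, with $y = f(u)$ and $u = g(x)$, this may be written as
$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}.$$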

We have seen how this works in the examples above. To understand
it more generally, let's use the fact that the derivative provides a
convenient linear approximation for a
function. More specifically, consider a point
with
corresponding points
and
. We have the following linear approximations:
Now
From here, we can use the linear approximation for
with
to obtain
Now if we form the difference quotient for the composite, we find
that
As
becomes small, the approximation improves, so
that in the limit we obtain the Chain Rule.
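
One way to write out the steps of this argument is sketched below. The symbol names are assumptions made for this sketch, since the formulas are not shown in the text: $u = g(x)$ is the inner function, $y = f(u)$ the outer one, $x_0$ the chosen point, and $u_0 = g(x_0)$.

```latex
% Assumed names: u = g(x), y = f(u), base point x_0, u_0 = g(x_0).
\begin{align*}
g(x) &\approx g(x_0) + g'(x_0)\,(x - x_0), \\
f(u) &\approx f(u_0) + f'(u_0)\,(u - u_0), \\
f(g(x)) &\approx f(u_0) + f'(u_0)\,\bigl(g(x) - u_0\bigr) \\
        &\approx f(u_0) + f'(u_0)\,g'(x_0)\,(x - x_0), \\
\frac{f(g(x)) - f(g(x_0))}{x - x_0} &\approx f'(g(x_0))\,g'(x_0).
\end{align*}
```

The third line applies the linear approximation for $f$ at $u = g(x)$, and the fourth substitutes the linear approximation for $g$. Subtracting $f(u_0) = f(g(x_0))$ and dividing by $x - x_0$ gives the last line.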

Some examples
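
As one illustrative case, let's take the composite sin(x^2), with inner function g(x) = x^2 and outer function f(u) = sin(u). These functions, the sample point, and the step size are chosen here for illustration only. The sketch compares the Chain Rule value with a numerical central-difference estimate of the derivative of the composite.

```python
import math

# Illustrative choice of functions: the composite is sin(x^2).
def g(x):
    # inner function u = g(x)
    return x * x

def f(u):
    # outer function y = f(u)
    return math.sin(u)

def g_prime(x):
    # derivative of the inner function
    return 2.0 * x

def f_prime(u):
    # derivative of the outer function
    return math.cos(u)

def chain_rule_derivative(x):
    # Chain Rule: derivative of the outer function evaluated at g(x),
    # times the derivative of the inner function at x
    return f_prime(g(x)) * g_prime(x)

def central_difference(func, x, h=1e-6):
    # symmetric difference quotient, a numerical estimate of func'(x)
    return (func(x + h) - func(x - h)) / (2.0 * h)

x = 1.3  # arbitrary sample point
exact = chain_rule_derivative(x)
numeric = central_difference(lambda t: f(g(t)), x)

print(exact, numeric)
```

The two printed values should agree to several decimal places. The small remaining gap comes from the finite step size in the difference quotient.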
