UBC Calculus Online Course Notes

The Chain Rule

The Chain Rule tells us how to differentiate composite functions, and since so many functions can be written as composites, it is a vitally important tool for computing derivatives. Before we write down the Chain Rule, let's think about our earlier example.


An Example

Previously, we considered the composite of two linear functions: $  y = f(x) = 1000 - \frac x2  $ and $  z = g(y) = 400 - \frac y5  $ . We found that the composite was $  z = g(f(x)) = 200 + \frac x{10}  $ . Since these are linear functions, their derivatives are constants; that is, they do not vary from point to point. Notice that

 
\begin{eqnarray*} 
 & \frac{dz}{dy} = -\frac 15 \hspace{1in} \frac{dy}{dx} = -\frac 12 & \\ 
 & \frac{dz}{dx} = \frac 1{10} 
\end{eqnarray*}

The relationship between these numbers is no coincidence. To see why, let's consider the rate of change of the variable $  z  $ in terms of $  x  $ . We will choose two points $  x_0  $ and $  x_1  $ . These points also give us


\begin{eqnarray*} 
 y_0  = f(x_0), & & y_1 = f(x_1) \\ 
 z_0 = g(y_0) = g(f(x_0)), & &  z_1 = g(y_1) = g(f(x_1)) 
\end{eqnarray*}

Notice that we have

\[ 
  \frac{z_1 - z_0}{x_1 - x_0} = 
  \frac{z_1 - z_0}{y_1 - y_0} \cdot \frac{y_1 - y_0}{x_1 - x_0} = \left(-\frac 15\right) 
\cdot \left(-\frac 12\right) = \frac 1{10} 
 \]

Since the graphs of these functions are just straight lines, we have related the slopes and hence the derivatives.
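
The relationship between the three slopes can also be checked numerically. The following is only a minimal sketch: the sample points `x0 = 100` and `x1 = 300` and the helper name `slope` are arbitrary choices made for illustration and do not come from the notes.

```python
# The two linear functions from the example above.
def f(x):
    return 1000 - x / 2

def g(y):
    return 400 - y / 5

def slope(func, a, b):
    """Slope of the secant line of func between the points a and b."""
    return (func(b) - func(a)) / (b - a)

# Arbitrary sample points (an assumption made for this illustration).
x0, x1 = 100.0, 300.0
y0, y1 = f(x0), f(x1)

dy_dx = slope(f, x0, x1)                      # slope of f
dz_dy = slope(g, y0, y1)                      # slope of g
dz_dx = slope(lambda x: g(f(x)), x0, x1)      # slope of the composite

print(dy_dx, dz_dy, dz_dx)                    # -0.5 -0.2 0.1
print(abs(dz_dy * dy_dx - dz_dx) < 1e-12)     # True
```

Because the functions are linear, any other pair of distinct sample points should give the same three slopes.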


Another Example

We'll begin by recalling the following fact from high school. Suppose that $  a  $ is some positive number. Then the graph of the function $  y=g(ax)  $ is obtained from the graph of $  y=g(x)  $ by scaling it horizontally by a factor of $  \frac 1a  $ , which compresses the graph toward the $  y  $ axis when $  a > 1  $ . The following demonstration will illustrate this fact. If you move the red ball in the rightmost graph, you can see how the function is changed. Also shown are the tangent lines at two points.


For your consideration:

  1. Describe what happens to the graph as you drag the ball to the left.

  2. What happens to the tangent line at the origin?

  3. What happens to the tangent line attached to the ball?

  4. By $  h(x) = g(2x)  $ , we'll denote the composite of $  g  $ with $  f(x) = 2x  $ . What is $  h^\prime(0)  $ in terms of $  g^\prime(0)  $ ?

  5. What is $  h^\prime(1)  $ in terms of a derivative of $  g  $ ?

This last question is important, and it is sometimes a source of confusion when learning the general Chain Rule. You can see from the picture that the derivative $  h^\prime(1)  $ is related to the derivative $  g^\prime(2) = g^\prime(f(1))  $ , not to $  g^\prime(1)  $ . We will now see why.
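
To make the point about where $  g^\prime  $ is evaluated concrete, here is a small numerical sketch. It assumes the illustrative choice $  g(x) = \sin x  $ , which is not the function used in the demonstration, and it estimates $  h^\prime(1)  $ with a central difference (the helper `central_difference` and its step size are likewise assumptions made for this sketch).

```python
import math

# Illustrative choice of g (an assumption; the notes do not specify g).
def g(x):
    return math.sin(x)

def g_prime(x):
    return math.cos(x)

# h is the composite of g with f(x) = 2x.
def h(x):
    return g(2 * x)

def central_difference(func, x, step=1e-6):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)

h_prime_at_1 = central_difference(h, 1.0)
with_g_prime_at_2 = 2 * g_prime(2.0)   # g' evaluated at f(1) = 2
with_g_prime_at_1 = 2 * g_prime(1.0)   # g' evaluated at 1 (the common mistake)

print(f"estimate of h'(1): {h_prime_at_1:.6f}")
print(f"2 * g'(2):         {with_g_prime_at_2:.6f}")
print(f"2 * g'(1):         {with_g_prime_at_1:.6f}")
```

With this choice of $  g  $ , the estimate should agree closely with $  2g^\prime(2)  $ and differ noticeably from $  2g^\prime(1)  $ .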


The Chain Rule

The Chain Rule says that

\[  
  (g\circ f)^\prime(x) = g^\prime(f(x))\cdot f^\prime(x) 
 \]

In the Leibniz notation, this may be written as

\[  \frac{dz}{dx} = \left.\frac{dz}{dy}\right|_{y=f(x)} \cdot \frac{dy}{dx}. 
 \]

We have seen how this works in the examples above. To understand it more generally, let's use the fact that the derivative provides a convenient linear approximation for a function. More specifically, consider a point $  x_0  $ with corresponding points $  y_0 = f(x_0)  $ and $  z_0 = g(y_0) = (g\circ f)(x_0)  $ . We have the following linear approximations:


\begin{eqnarray*} 
  f(x_0 + h) & \cong & f(x_0) + f^\prime(x_0) h = y_0 + f^\prime(x_0)h\\ 
  g(y_0 + k) & \cong & g(y_0) + g^\prime(y_0) k 
\end{eqnarray*}

Now

\[ 
(g\circ f)(x_0 + h) = g(f(x_0 + h)) \cong g(y_0 + f^\prime(x_0) h) 
 \]

From here, we can use the linear approximation for $  g  $ with $  k = f^\prime(x_0)h  $ to obtain

 
\begin{eqnarray*} 
	(g\circ f)(x_0 + h) & \cong & g(y_0 + f^\prime(x_0) h) \\ 
                & \cong & g(y_0) + g^\prime(y_0)f^\prime(x_0)h \\ 
	& = & (g\circ f)(x_0) + g^\prime(f(x_0))f^\prime(x_0) h 
\end{eqnarray*}

Now if we form the difference quotient for the composite, we find that

\[  \frac{(g\circ f)(x_0 + h) - (g\circ f)(x_0)}{h} \cong 
g^\prime(f(x_0))f^\prime(x_0) 
 \]

As $  h  $ becomes small, the approximation improves, so that in the limit we obtain the Chain Rule.
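
The way the difference quotient settles toward $  g^\prime(f(x_0))f^\prime(x_0)  $ can be observed numerically. The following is a minimal sketch, assuming the illustrative choices $  f(x) = x^2  $ , $  g(y) = \sin y  $ and $  x_0 = 1  $ ; none of these appear in the notes, and the list of step sizes is arbitrary.

```python
import math

# Illustrative choices, not taken from the notes: f(x) = x^2 and g(y) = sin(y).
def f(x):
    return x ** 2

def f_prime(x):
    return 2 * x

def g(y):
    return math.sin(y)

def g_prime(y):
    return math.cos(y)

def composite(x):
    return g(f(x))

x0 = 1.0
# Value predicted by the Chain Rule: g'(f(x0)) * f'(x0).
chain_rule_value = g_prime(f(x0)) * f_prime(x0)

step_sizes = [1e-1, 1e-2, 1e-3, 1e-4]
errors = []
for h in step_sizes:
    # Difference quotient of the composite at x0 with step h.
    difference_quotient = (composite(x0 + h) - composite(x0)) / h
    errors.append(abs(difference_quotient - chain_rule_value))
    print(f"h = {h:g}: quotient = {difference_quotient:.6f}, error = {errors[-1]:.2e}")

print(f"Chain Rule value: {chain_rule_value:.6f}")
```

For these step sizes, the error should shrink roughly in proportion to $  h  $ , which is what the linear approximation argument suggests.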


Some Examples

Here is how we might use the Chain Rule. Let's consider the function $  h(x) = (x^3 + 1)^2  $ . Notice that this is the composite of $  f(x) = x^3 + 1  $ and $  g(y) = y^2  $ . Since $  f^\prime(x) = 3x^2  $ and $  g^\prime(y) = 2y  $ , we find that

\[ 
h^\prime(x) = g^\prime(f(x)) f^\prime(x) = 2f(x)\cdot 3x^2 = 
2(x^3 + 1)\cdot 3x^2 = 6x^2(x^3 + 1). 
 \]
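
As a sanity check on this result, one can compare it with two independent computations: the derivative of the expanded polynomial $  x^6 + 2x^3 + 1  $ , which is $  6x^5 + 6x^2  $ , and a numerical estimate. The sketch below does both; the sample points, the step size and the helper `central_difference` are arbitrary choices made for illustration.

```python
def h(x):
    return (x ** 3 + 1) ** 2

def h_prime_chain(x):
    """Derivative obtained from the Chain Rule: 6x^2 (x^3 + 1)."""
    return 6 * x ** 2 * (x ** 3 + 1)

def h_prime_expanded(x):
    """Derivative of the expanded form x^6 + 2x^3 + 1, term by term."""
    return 6 * x ** 5 + 6 * x ** 2

def central_difference(func, x, step=1e-5):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)

# Arbitrary sample points (an assumption made for this illustration).
sample_points = [-2.0, -0.5, 0.0, 1.0, 1.5]
for x in sample_points:
    print(
        f"x = {x:5.2f}: chain = {h_prime_chain(x):10.4f}, "
        f"expanded = {h_prime_expanded(x):10.4f}, "
        f"numeric = {central_difference(h, x):10.4f}"
    )
```

The three columns should agree up to the small error of the numerical estimate.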