Section 5.10 The Chain Rule

Subsection 5.10.1 Review: Rate of Change and Composition

We start by reminding ourselves that a rate of change is a ratio of changes for two variables. If \(y\) is a function of \(x\), say \(y=f(x)\), then the rate of change \(\left.\frac{dy}{dx}\right|_a=f'(a)\) is the rate of change of \(y\) with respect to \(x\) at the value \(x=a\). This measures the instantaneous ratio of changes in \(y\) from \(f(a)\) to changes in \(x\) from \(a\). At any value \(x\) close to \(a\), this means that \begin{equation*} y-f(a) \approx \left.\frac{dy}{dx}\right|_{a} \cdot (x-a). \end{equation*} Changes in the value of \(y\) are approximately proportional to changes in \(x\) from \(a\) and the derivative \(f'(a)\) is the proportionality constant.
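As a quick numerical illustration of this approximation, the sketch below compares an actual change in \(y\) with the change predicted by the derivative. The function \(f(x)=\sin(x)\) and the values \(a=1\), \(x=1.01\) are assumptions chosen only for illustration; they do not come from the text.

```python
import math

def f(x):
    # Illustrative function (an assumption, not from the text).
    return math.sin(x)

def f_prime(x):
    # Derivative of sin(x), known in closed form.
    return math.cos(x)

def linear_change(a, x):
    """Change in y predicted by the derivative: f'(a) * (x - a)."""
    return f_prime(a) * (x - a)

a, x = 1.0, 1.01
actual_change = f(x) - f(a)          # y - f(a)
approx_change = linear_change(a, x)  # f'(a) * (x - a)
print(actual_change, approx_change)
```

The two printed values should be close but not identical, since the derivative only gives an approximate proportionality for \(x\) near \(a\).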

Second, we remind ourselves that compositions correspond to chains of dependent variables. Suppose that \(u\) is a function of \(x\), say \(u=g(x)\), and \(y\) is in turn a function of \(u\), say \(y=f(u)\). We would write this chain as \begin{equation*} \left\{ \begin{matrix} u=g(x) \\ y=f(u) \end{matrix} \right..\end{equation*} Using substitution, we can also write \(y\) directly as a function of \(x\) using composition: \begin{equation*} y=f(g(x)) = (f \circ g)(x). \end{equation*}

Now consider a particular value \(x=a\) and ask: how do we determine the rate of change of \(y\) with respect to \(x\) when \(y\) is defined by such a composition? A change in \(x\) from \(a\), \(\Delta x = x-a\), would lead to a change in \(u\) from \(g(a)\) using the rate of change \begin{equation*} \Delta u = u-g(a) \approx \left.\frac{du}{dx}\right|_{a} \cdot (x-a) = g'(a) \cdot \Delta x. \end{equation*} In a similar way, a change in \(u\) from its starting value \(g(a)\) would lead to a change in \(y\) from \(f(g(a))\) using the rate of change \begin{equation*} \Delta y = y - f(g(a)) \approx \left. \frac{dy}{du}\right|_{g(a)} \cdot (u-g(a)) = f'(g(a))\cdot \Delta u. \end{equation*} Putting these two links of the chain together, we find that \begin{equation*} \Delta y \approx \left. \frac{dy}{du} \right|_{g(a)} \cdot \left. \frac{du}{dx} \right|_{a} \cdot \Delta x = f'(g(a)) \cdot g'(a) \cdot \Delta x. \end{equation*}
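The two-step approximation can be traced numerically. The sketch below assumes the illustrative choices \(g(x)=x^2\), \(f(u)=\sin(u)\), \(a=1\) and \(\Delta x = 0.01\), none of which appear in the text, and compares each approximate change with the actual one.

```python
import math

def g(x):
    return x ** 2          # illustrative inner function, g'(x) = 2x

def g_prime(x):
    return 2 * x

def f(u):
    return math.sin(u)     # illustrative outer function, f'(u) = cos(u)

def f_prime(u):
    return math.cos(u)

a = 1.0
dx = 0.01                  # Delta x = x - a

# First link of the chain: Delta u is approximately g'(a) * Delta x.
du_actual = g(a + dx) - g(a)
du_approx = g_prime(a) * dx

# Whole chain: Delta y is approximately f'(g(a)) * g'(a) * Delta x.
dy_actual = f(g(a + dx)) - f(g(a))
dy_approx = f_prime(g(a)) * g_prime(a) * dx

print(du_actual, du_approx)
print(dy_actual, dy_approx)
```

Note that \(f'\) is evaluated at \(g(a)\), the starting value of \(u\), while \(g'\) is evaluated at \(a\).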

Graphically, this is illustrated in the figure below. The inputs and outputs of the functions for \(g\) and \(f\) are illustrated as number lines. The input \(a\) to the function \(g\) is mapped to the output \(g(a)\). A nearby input \(x\) is mapped to an output \(g(x)\) that is not too far from \(g(a)\). The differences are the values \(\Delta x = x-a\) and \(\Delta u = g(x)-g(a)\). In composition, the outputs \(g(a)\) and \(g(x)\) act as inputs to \(f\).

[Figure: number lines showing the input \(a\) and a nearby input \(x\) mapped by \(g\) to \(g(a)\) and \(g(x)\), which then act as inputs to \(f\).]

The derivative gives the approximate ratio of the change in output values to the change in input values. The smaller the change in input, the closer the approximation. (This is why the derivative must be defined as a limit of the average rate of change.) When functions are in composition, each function effectively scales a small change in its input by a factor equal to its derivative. The overall change in the output is therefore governed by the product of the derivatives.
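To see the approximation tighten as the change in input shrinks, the sketch below computes the ratios \(\Delta u/\Delta x\), \(\Delta y/\Delta u\) and \(\Delta y/\Delta x\) for several step sizes. The functions \(g(x)=3x+1\) and \(f(u)=u^2\) and the point \(a=1\) are assumptions made for illustration only.

```python
def g(x):
    return 3 * x + 1      # illustrative inner function, g'(x) = 3

def f(u):
    return u ** 2         # illustrative outer function, f'(u) = 2u

a = 1.0
# f'(g(a)) * g'(a) = (2 * 4) * 3 = 24
product_of_derivatives = (2 * g(a)) * 3

errors = []
for dx in (0.1, 0.01, 0.001):
    du = g(a + dx) - g(a)            # change in the intermediate variable
    dy = f(g(a + dx)) - f(g(a))      # change in the final output
    ratio = dy / dx                  # average rate of change of y in x
    errors.append(abs(ratio - product_of_derivatives))
    print(dx, du / dx, dy / du, ratio)
```

For each step, the last printed ratio is the product of the two before it (up to rounding), and the list `errors` shrinks as `dx` shrinks.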

Subsection 5.10.2 The Chain Rule for Derivatives

The chain rule formalizes the ideas of the previous subsection. It states that a composition \(f(g(x))\) has a derivative given by \begin{equation*} \frac{d}{dx} [ f(g(x)) ] = f'(g(x)) \cdot g'(x). \end{equation*} Pay close attention to the inputs of \(f'\) and \(g'\). Compare them with the inputs used in the previous subsection, where \(f'\) was evaluated at \(g(a)\) and \(g'\) at \(a\). The inputs are different because the functions \(f\) and \(g\) have different inputs in the composition.
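The formula can be checked against a numerical derivative. The following is a minimal sketch, assuming the illustrative functions \(f(u)=\sqrt{u}\) and \(g(x)=x^2+1\); the helper names `chain_rule_derivative` and `central_difference` are hypothetical and introduced only for this example.

```python
import math

def chain_rule_derivative(f_prime, g, g_prime, x):
    """Evaluate f'(g(x)) * g'(x): f' receives g(x), while g' receives x."""
    return f_prime(g(x)) * g_prime(x)

def central_difference(h, x, step=1e-6):
    """Numerical estimate of h'(x) from a symmetric difference quotient."""
    return (h(x + step) - h(x - step)) / (2 * step)

def g(x):
    return x ** 2 + 1              # illustrative inner function

def g_prime(x):
    return 2 * x

def f(u):
    return math.sqrt(u)            # illustrative outer function

def f_prime(u):
    return 0.5 / math.sqrt(u)

def composite(x):
    return f(g(x))                 # y = f(g(x))

x0 = 2.0
exact = chain_rule_derivative(f_prime, g, g_prime, x0)
numeric = central_difference(composite, x0)
print(exact, numeric)
```

A common mistake is to evaluate \(f'\) at \(x\) rather than at \(g(x)\); replacing `f_prime(g(x))` with `f_prime(x)` in the helper would make the two printed values disagree.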

In Leibniz notation, with \(u=g(x)\) and \(y=f(u)\), the chain rule is often abbreviated as \begin{equation*} \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}. \end{equation*} Notice that this form almost looks like an algebraic simplification in which the symbol \(du\) on the right cancels to give the expression on the left.


Find the derivative of \(f(x)=(2x+1)^2\) using the chain rule and compare the result to what you get if you expand \(f(x)\) before differentiation.
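One way to check the comparison is to evaluate both forms of the derivative at a few inputs. The sketch below assumes you have already worked out both answers by hand; the function names and sample points are hypothetical choices for illustration.

```python
def derivative_via_chain_rule(x):
    # Outer function u**2 has derivative 2u; inner function 2x + 1 has derivative 2.
    return 2 * (2 * x + 1) * 2

def derivative_via_expansion(x):
    # (2x + 1)**2 expands to 4x**2 + 4x + 1, whose derivative is 8x + 4.
    return 8 * x + 4

sample_points = [-2, -1, 0, 1, 3]
agreement = [
    derivative_via_chain_rule(x) == derivative_via_expansion(x)
    for x in sample_points
]
print(agreement)
```

Agreement at sample points is only a sanity check; simplifying \(2(2x+1)\cdot 2\) algebraically shows that the two forms are the same expression.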


Find the derivative of \(f(x) = 3(x^2+3x)^7\).