Subsection 6.7.1 Review: Rate of Change and Composition

We start by reminding ourselves that a rate of change is a ratio of changes for two variables. If \(y\) is a function of \(x\text{,}\) say \(y=f(x)\text{,}\) then the rate of change \(\left.\frac{dy}{dx}\right|_a=f'(a)\) is the rate of change of \(y\) with respect to \(x\) at the value \(x=a\text{.}\) This measures the instantaneous ratio of changes in \(y\) from \(f(a)\) to changes in \(x\) from \(a\text{.}\) At any value \(x\) close to \(a\text{,}\) this means that

Changes in the value of \(y\) are approximately proportional to changes in \(x\) from \(a\) and the derivative \(f'(a)\) is the proportionality constant.
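
As a concrete illustration of this proportionality, the statement can be written symbolically and checked with numbers. The function \(f(x)=x^2\) and the point \(a=3\) used below are chosen only as an example and do not come from the text above. With \(\Delta x = x-a\) and \(\Delta y = f(x)-f(a)\text{,}\)

\begin{equation*}
\Delta y \approx f'(a)\,\Delta x.
\end{equation*}

For \(f(x)=x^2\) we have \(f'(3)=6\text{.}\) Taking \(x=3.1\) gives \(\Delta x = 0.1\) and

\begin{equation*}
\Delta y = (3.1)^2 - 3^2 = 0.61 \approx 6 \cdot 0.1 = 0.6.
\end{equation*}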

Second, we remind ourselves that compositions correspond to chains of dependent variables. Suppose that \(u\) is a function of \(x\text{,}\) say \(u=g(x)\text{,}\) and \(y\) is in turn a function of \(u\text{,}\) say \(y=f(u)\text{.}\) We would write this chain as

\begin{equation*}
x \overset{g}{\mapsto} u \overset{f}{\mapsto} y.
\end{equation*}

By substitution, we can also write \(y\) directly as a function of \(x\) using composition:

\begin{equation*}
y=f(g(x)) = (f \circ g)(x).
\end{equation*}

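Now consider a particular value \(x=a\) and ask how we would determine the rate of change of \(y\) with respect to \(x\) when \(y\) is defined by such a composition. A change in \(x\) from \(a\text{,}\) \(\Delta x = x-a\text{,}\) leads to a change in \(u\) from \(g(a)\) governed by the rate of change \(g'(a)\text{:}\)

\begin{equation*}
\Delta u = g(x)-g(a) \approx g'(a)\,\Delta x.
\end{equation*}

That change in \(u\) in turn leads to a change in \(y\) from \(f(g(a))\) governed by the rate of change \(f'(g(a))\text{:}\)

\begin{equation*}
\Delta y = f(g(x))-f(g(a)) \approx f'(g(a))\,\Delta u \approx f'(g(a))\,g'(a)\,\Delta x.
\end{equation*}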

Graphically, this is illustrated in the figure below. The inputs and outputs of the functions \(g\) and \(f\) are shown as number lines. The input \(a\) to the function \(g\) is mapped to the output \(g(a)\text{.}\) A nearby input \(x\) is mapped to an output \(g(x)\) that is not too far from \(g(a)\text{.}\) The differences are the values \(\Delta x = x-a\) and \(\Delta u = g(x)-g(a)\text{.}\) In the composition, the outputs \(g(a)\) and \(g(x)\) act as inputs to \(f\text{.}\)

The derivative provides an approximate ratio of the change in output values to the change in input values. The smaller the change in input, the closer the approximation. (This is why the derivative must be defined as a limit of the average rate of change.) When functions are in composition, each function effectively scales the change it receives by the factor of its derivative, so the overall change in the output is the change in the input multiplied by the product of the derivatives.
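
A small numerical check may make the product of derivatives concrete. The functions \(g(x)=x^2\) and \(f(u)=u^3\) and the point \(a=1\) are hypothetical choices made only for this sketch. Here \(g'(1)=2\) and \(f'(g(1))=f'(1)=3\text{,}\) so the product of the derivatives is \(6\text{.}\) Taking \(x=1.01\text{,}\) so that \(\Delta x = 0.01\text{,}\)

\begin{equation*}
\begin{aligned}
\Delta u &= (1.01)^2 - 1^2 = 0.0201 \approx 2 \cdot 0.01 = 0.02,\\
\Delta y &= (1.0201)^3 - 1^3 \approx 0.0615 \approx 3 \cdot 0.0201 = 0.0603.
\end{aligned}
\end{equation*}

The overall ratio \(\Delta y/\Delta x \approx 6.15\) is close to the product \(3 \cdot 2 = 6\text{,}\) and it should move closer to \(6\) as \(\Delta x\) shrinks.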

Subsection 6.7.2 The Chain Rule for Derivatives

The chain rule formalizes the ideas in the previous paragraphs. It states that the derivative of a composition \(f(g(x))\) is given by

\begin{equation*}
\frac{d}{dx}\big[f(g(x))\big] = f'(g(x)) \cdot g'(x).
\end{equation*}

Pay close attention to the inputs of \(f'\) and \(g'\text{.}\) The derivative \(f'\) is evaluated at \(g(x)\text{,}\) while \(g'\) is evaluated at \(x\text{.}\) Compare those values to what we had to do in the previous paragraphs. The inputs are different because the functions \(f\) and \(g\) have different inputs in the composition.

To use the chain rule, we must identify the chain (composition) that is involved, which is why we need to learn to recognize compositions. For a function such as \(f(x)=(2x+1)^2\text{,}\) the chain can be found by recognizing that the quantity being squared is \(2x+1\text{,}\) so the first step is \(u=2x+1\text{.}\) Then \(y=f(x)\) can be rewritten as \(y=u^2\text{.}\) We can also find the derivatives of each step in the chain:
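
Assuming the function under discussion is \(f(x)=(2x+1)^2\) (inferred from the description above, since the original statement of the example is not shown), the steps in the chain and their derivatives could be written out as follows.

\begin{equation*}
u = 2x+1, \quad \frac{du}{dx} = 2; \qquad y = u^2, \quad \frac{dy}{du} = 2u.
\end{equation*}

Multiplying the two rates and substituting \(u=2x+1\) gives

\begin{equation*}
\frac{dy}{dx} = \frac{dy}{du}\cdot\frac{du}{dx} = 2u \cdot 2 = 4(2x+1) = 8x+4.
\end{equation*}

As a check, expanding first gives \(f(x)=4x^2+4x+1\text{,}\) whose derivative is also \(8x+4\text{.}\)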

Our function has an intermediate formula \(u=x^2+3x\) that is then raised to the 7th power and multiplied by 3. That is, if \(y=f(x)\text{,}\) then \(y=3u^7\text{.}\) We would write this as a chain, along with the derivative of each step:
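
Assuming the function here is \(f(x)=3(x^2+3x)^7\) (inferred from the description above, since the original statement of the example is not shown), a sketch of the chain and the derivative of each step is

\begin{equation*}
u = x^2+3x, \quad \frac{du}{dx} = 2x+3; \qquad y = 3u^7, \quad \frac{dy}{du} = 21u^6.
\end{equation*}

The chain rule then multiplies these rates, with \(u\) replaced by its formula in terms of \(x\text{:}\)

\begin{equation*}
\frac{dy}{dx} = \frac{dy}{du}\cdot\frac{du}{dx} = 21u^6\,(2x+3) = 21\,(x^2+3x)^6\,(2x+3).
\end{equation*}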