The derivative rules we have learned thus far focus on the arithmetic operations that combine expressions into more complex expressions: addition, subtraction, multiplication, and division. Another operation that combines expressions is composition. A function \(f\) represents a map from an independent variable to a dependent variable, say \(f : x \mapsto y\text{.}\) Composition occurs when the output from one function becomes the input of another. The chain rule provides the differentiation rule for composition.
In this section, we develop the chain rule. We begin by reviewing the idea of a chain of variables and its relation to function composition. The chain rule is based on the derivative being the limiting rate of change. By considering how an increment of change in the independent variable propagates through the chain, we will see that the rates of change at each step in the chain are multiplied together. After a few examples of using the chain rule for formulas, we then explore a few examples of the chain rule for related rates.
We start by reminding ourselves that a rate of change is a ratio of changes for two variables. If \(y\) is a function of \(x\text{,}\) say \(x \mapsto y=f(x)\text{,}\) then \(\left.\frac{dy}{dx}\right|_a=f'(a)\) is the rate of change of \(y\) with respect to \(x\) at the value \(x=a\text{.}\) This measures the instantaneous ratio of changes in \(y\) from \(f(a)\) to changes in \(x\) from \(a\text{.}\) At any value \(x\) close to \(a\text{,}\) this means that \(f(x)-f(a) \approx f'(a)\,(x-a)\text{.}\)
Changes in the value of \(y\) are approximately proportional to changes in \(x\) from \(a\) and the derivative \(f'(a)\) is the proportionality constant.
Second, we remind ourselves that compositions correspond to chains of dependent variables. Suppose that \(u\) is a function of \(x\text{,}\) say \(u=g(x)\text{,}\) and \(y\) is subsequently a function of \(u\text{,}\) say \(y=f(u)\text{.}\) We would write this chain as \(x \mapsto u \mapsto y\text{.}\)
Using substitution, we could also just write that \(y\) is a function of \(x\) using composition, \(y=f(g(x))\text{.}\)
Now, let us consider a particular value for \(x\) and ask how we would determine the rate of change of \(y\) with respect to \(x\) when it is defined with such a composition. A change in \(x\) from \(a\text{,}\) \(\Delta x = x-a\text{,}\) would lead to a change in \(u\) from \(g(a)\) using the rate of change \(g'(a)\text{:}\) \(\Delta u \approx g'(a)\,\Delta x\text{.}\)
In a similar way, a change in \(u\) from its starting value \(g(a)\) would lead to a change in \(y\) from \(f(g(a))\) using the rate of change \(f'(g(a))\text{:}\) \(\Delta y \approx f'(g(a))\,\Delta u\text{.}\)
Putting these two results of the chain together, we find that \(\Delta y \approx f'(g(a))\,g'(a)\,\Delta x\text{.}\)
Graphically, this is illustrated in the figure below. The inputs and outputs of the functions for \(g\) and \(f\) are illustrated as maps between number lines. The input \(x=a\) to the function \(g : x \mapsto u\) is mapped to the output \(u=g(a)\text{.}\) A nearby input \(x\) is mapped to an output \(g(x)\) that is not too far from \(g(a)\text{.}\) The differences are the values \(\Delta x = x-a\) and \(\Delta u = g(x)-g(a)\text{.}\) In composition, the outputs \(g(a)\) and \(g(x)\) act as inputs to \(f\text{.}\)
The derivative provides an approximate ratio of the change in output values to the change in input values. The smaller the change in the input, the closer the approximation. This is why the derivative must be defined as a limit of the average rate of change. When functions are in composition, each function effectively scales the difference in its input by a factor equal to its derivative. So the overall change in the output results from the product of the derivatives.
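A quick numerical experiment can make this scaling concrete. The sketch below is only an illustration: the functions \(g(x)=2x+1\) and \(f(u)=u^2\text{,}\) the point \(a=1\text{,}\) and the helper name `ratios` are choices made here, not something fixed by the discussion above. It compares the ratios \(\Delta u/\Delta x\text{,}\) \(\Delta y/\Delta u\text{,}\) and \(\Delta y/\Delta x\) as \(\Delta x\) shrinks.

```python
# Illustrative choices (assumptions, not from the text): g(x) = 2x + 1, f(u) = u^2, a = 1.
def g(x):
    return 2 * x + 1

def f(u):
    return u ** 2

def ratios(a, dx):
    """Return (du/dx, dy/du, dy/dx) as ratios of finite changes starting at x = a."""
    du = g(a + dx) - g(a)          # change in the intermediate variable u
    dy = f(g(a + dx)) - f(g(a))    # change in the final variable y
    return du / dx, dy / du, dy / dx

for dx in (0.1, 0.01, 0.001):
    du_dx, dy_du, dy_dx = ratios(1.0, dx)
    # The overall ratio equals the product of the two step ratios (up to rounding).
    print(dx, du_dx, dy_du, dy_dx, du_dx * dy_du)
```

With these choices, \(g'(1)=2\) and \(f'(g(1))=f'(3)=6\text{,}\) so the overall ratio should approach \(2\cdot 6=12\) as \(\Delta x\) shrinks.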
The chain rule formalizes the ideas in the previous paragraphs. It states that the derivative of a composition \(f(g(x))\) is given by \(\frac{d}{dx}[f(g(x))] = f'(g(x)) \, g'(x)\text{.}\)
Pay close attention to the inputs of \(f'\) and \(g'\text{.}\) Compare those values to what we had to do in the previous paragraphs. The inputs are different because the functions \(f : u \mapsto y\) and \(g : x \mapsto u\) have different inputs in the composition.
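If we have an explicit chain representation, \(x \mapsto u=g(x) \mapsto y=f(u)\text{,}\)
then the chain rule can be rewritten: \(\frac{dy}{dx} = \left.\frac{dy}{du}\right|_{u=g(x)} \cdot \frac{du}{dx}\text{.}\)
The chain rule is often abbreviated as \(\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}\text{.}\)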
Notice that this form almost looks like algebra would cancel the symbol \(du\) on the right to give the formula \(\frac{dy}{dx}\) on the left.
Find the derivative of \(f(x)=(2x+1)^2\) using the chain rule and compare the result to what you get if you expand \(f(x)\) before differentiation.
To use the chain rule, we must identify the chain or composition that is involved. The last operation in this formula is the act of squaring. What do we square? This will be the way that we identify \(u=2x+1\text{.}\) Then the final output is \(y=u^2\text{.}\) We can find the derivatives of each step in the chain:
Consequently, we have
The notation \(u=2x+1\) is simply a reminder that when writing the derivative \(\frac{dy}{du}=2u\) we will ultimately replace \(u=2x+1\text{.}\)
The other approach is to expand \(f(x)\) to a form that is easier to differentiate.
This is a simple polynomial form that has a simple derivative:
We can see that this is actually the same as our earlier derivative if we factor out the common factor of 4.
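The displayed steps of this example can be reconstructed from the surrounding text. The following is a worked version under the identification \(u=2x+1\) and \(y=u^2\) stated above; the layout is a choice made here.

```latex
% Chain: x -> u = 2x + 1 -> y = u^2
\begin{align*}
\frac{du}{dx} &= \frac{d}{dx}[2x+1] = 2, &
\frac{dy}{du} &= \frac{d}{du}[u^2] = 2u, \\
\frac{dy}{dx} &= \frac{dy}{du}\cdot\frac{du}{dx}
  = 2u \cdot 2 = 4(2x+1).
\end{align*}
% Expanding before differentiating gives the same result.
\begin{align*}
f(x) &= (2x+1)^2 = 4x^2 + 4x + 1, \\
f'(x) &= 8x + 4 = 4(2x+1).
\end{align*}
```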
We could avoid the chain rule in the previous example because expanding the square of our expression could be calculated fairly simply. When this is not possible, the chain rule must be used.
Find the derivative of \(f(x) = 3(x^2+3x)^7\text{.}\)
Our function has an intermediate formula \(u=x^2+3x\) that is then raised to the 7th power and multiplied by 3. That is, if \(y=f(x)\) then \(y=3u^7\text{.}\) We would write this as a chain, along with the derivative of each step:
The chain rule implies
Note that we had to substitute the formula for \(u\) to find our final result.
In the language of function composition, we could instead do this by writing \(f(x)\) as a composition \(f(x) = g(h(x))\text{:}\)
The chain rule would be written:
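For reference, both versions of this calculation can be written out as follows. This is a sketch reconstructed from the chain \(u=x^2+3x\text{,}\) \(y=3u^7\) given above; naming the input of the outer function \(g\) as \(u\) is an assumption made here for readability.

```latex
% Chain form: x -> u = x^2 + 3x -> y = 3u^7
\begin{align*}
\frac{du}{dx} &= 2x+3, &
\frac{dy}{du} &= 3\cdot 7u^6 = 21u^6, \\
\frac{dy}{dx} &= \frac{dy}{du}\cdot\frac{du}{dx}
  = 21u^6\,(2x+3) = 21(x^2+3x)^6(2x+3).
\end{align*}
% Composition form: f(x) = g(h(x)) with g(u) = 3u^7 and h(x) = x^2 + 3x
\begin{align*}
g'(u) &= 21u^6, &
h'(x) &= 2x+3, \\
f'(x) &= g'(h(x))\,h'(x) = 21(x^2+3x)^6(2x+3).
\end{align*}
```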
Negative and rational powers are much simpler with the chain rule. Using negative powers in composition often helps us avoid needing the quotient rule.
Find \(f''(x)\) where \(f(x) = \frac{3}{x^2+1}\text{.}\)
The first derivative can be found using the quotient or reciprocal rule.
We could also have done this using a chain rule. The relevant chain and associated derivatives are given:
Consequently, we know \(f'(x) = \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}\) and
To calculate the second derivative, we differentiate \(f'(x)\text{.}\) We could use either the quotient rule or the product rule with negative powers. In the first case, we find
where we have used the chain rule on \(u^2\) with \(u=x^2+1\) to obtain
Notice that the numerator of \(f''(x)\) has \(x^2+1\) as a common factor, which cancels with one of the corresponding factors in the denominator. A simplified version of \(f''(x)\) is therefore given by
The other approach to finding the second derivative is to start with the product representation of \(f'(x)\) and differentiate using the product rule. In order to differentiate \((x^2+1)^{-2}\text{,}\) we use the chain rule on \(u^{-2}\) with \(u=x^2+1\text{:}\)
This will give us
Remembering that negative exponents correspond to powers in the denominator, we can see this formula requires a common denominator \((x^2+1)^3\) to simplify
We found the same answer both ways. Derivative rules are self-consistent.
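The two routes to \(f''(x)\) can be laid out side by side. The steps below are a reconstruction consistent with the description above (the chain \(u=x^2+1\text{,}\) \(y=3u^{-1}\) is the natural reading of "the relevant chain"), not a transcription of the original displays.

```latex
% First derivative by the chain rule: u = x^2 + 1, y = 3u^{-1}
\begin{align*}
\frac{du}{dx} &= 2x, \qquad \frac{dy}{du} = -3u^{-2}, \\
f'(x) &= -3(x^2+1)^{-2}\cdot 2x = \frac{-6x}{(x^2+1)^2}.
\end{align*}
% Second derivative by the quotient rule
\begin{align*}
\frac{d}{dx}\left[(x^2+1)^2\right] &= 2(x^2+1)\cdot 2x = 4x(x^2+1), \\
f''(x) &= \frac{(-6)(x^2+1)^2 - (-6x)\cdot 4x(x^2+1)}{(x^2+1)^4}
        = \frac{(x^2+1)\left(18x^2-6\right)}{(x^2+1)^4}
        = \frac{18x^2-6}{(x^2+1)^3}.
\end{align*}
% Second derivative by the product rule on f'(x) = -6x (x^2+1)^{-2}
\begin{align*}
\frac{d}{dx}\left[(x^2+1)^{-2}\right] &= -2(x^2+1)^{-3}\cdot 2x = -4x(x^2+1)^{-3}, \\
f''(x) &= -6(x^2+1)^{-2} + (-6x)\left(-4x(x^2+1)^{-3}\right)
        = \frac{-6(x^2+1) + 24x^2}{(x^2+1)^3}
        = \frac{18x^2-6}{(x^2+1)^3}.
\end{align*}
```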
There may be times when the chain rule must be used more than once. Any time the last operation is a function acting on a single inner expression (such as a power), rather than an arithmetic operation joining two expressions (such as a sum or product), we need to use the chain rule.
If \(f(x) = (\sqrt{3x}+2)^4\text{,}\) compute \(f'(x)\text{.}\)
The last operation in \(f(x)\) is raising an expression to the power 4. The derivative will require a chain rule. The first step is to differentiate this last operation.
We need to continue by finding the derivative of the inner expression \(u=\sqrt{3x}+2\text{.}\) This is a sum, and the second term in the sum is a constant. The derivative of a constant is zero. We need to compute the derivative of \((3x)^{1/2}\text{,}\) which is another composition. The expression \(3x\) is raised to a power \(\frac{1}{2}\text{.}\) We need the chain rule one more time.
Substituting this into our original formula for \(f'(x)\text{,}\) we find
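Written out, the nested use of the chain rule described above might look like the following. This is a sketch assembled from the steps in the text; the final simplified form is one of several equivalent ways to present the answer.

```latex
% Outer step: the fourth power acting on u = \sqrt{3x} + 2
\begin{align*}
f'(x) &= 4\left(\sqrt{3x}+2\right)^3 \cdot \frac{d}{dx}\left[\sqrt{3x}+2\right].
\end{align*}
% Inner step: the power 1/2 acting on 3x (the constant 2 contributes zero)
\begin{align*}
\frac{d}{dx}\left[(3x)^{1/2}\right] &= \tfrac{1}{2}(3x)^{-1/2}\cdot 3 = \frac{3}{2\sqrt{3x}}.
\end{align*}
% Substituting the inner derivative into the outer step
\begin{align*}
f'(x) &= 4\left(\sqrt{3x}+2\right)^3 \cdot \frac{3}{2\sqrt{3x}}
       = \frac{6\left(\sqrt{3x}+2\right)^3}{\sqrt{3x}}.
\end{align*}
```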
Derivative rules are fundamentally about relationships between instantaneous rates. The chain rule is no exception. The biggest difference in the rates that are related by the chain rule and other related rates problems is that the chain rule involves different independent variables for different steps in the chain.
Consider a temperature-dependent chemical reaction. At 20 degrees Celsius, the reaction generates a product at a rate of 30 grams per minute. For small changes in temperature, the reaction can generate an additional 5 grams per minute per degree increase in temperature. If the temperature is cooling at a rate of 0.05 degrees per minute, what is happening to the reaction?
Conceptually, we recognize some variables in this problem: the temperature \(T\) (in degrees Celsius), the time \(t\) (in minutes), and the reaction rate \(R\) (in grams per minute). Because temperature is changing in time, we know there is a map \(t \mapsto T\text{.}\) Similarly, because the reaction rate depends on temperature, there is another map \(T \mapsto R\text{.}\) In combination, we have a chain \(t \mapsto T \mapsto R\text{.}\)
We identify the values at the instant \(t\) in question. We know \(T=20\) and \(\frac{dT}{dt} = -0.05\text{.}\) (Why?) Similarly, we know \(R=30\) and \(\frac{dR}{dT} = 5\text{.}\) The chain rule tells us the rate of change of the final variable in the chain with respect to the original independent variable in the chain:
That is, the reaction rate is decreasing at a rate of 0.25 grams per minute per minute. (\(R\) has units of grams per minute so \(\frac{dR}{dt}\) has units of grams per minute per minute.)
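The arithmetic behind this conclusion, with units carried along, can be sketched as follows (values taken from the problem statement; the unit bookkeeping is added here as a check).

```latex
\begin{equation*}
\frac{dR}{dt} = \frac{dR}{dT}\cdot\frac{dT}{dt}
  = \left(5\ \tfrac{\text{g/min}}{{}^\circ\text{C}}\right)
    \left(-0.05\ \tfrac{{}^\circ\text{C}}{\text{min}}\right)
  = -0.25\ \tfrac{\text{g/min}}{\text{min}}.
\end{equation*}
```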
As an ice cube melts, it maintains the shape of a cube. At one particular instant, each side of the cube is 30 mm and the volume of the cube is decreasing at a rate of 500 \(\text{mm}^3/\text{s}\text{.}\) What is the rate of change of the length of the sides at that instant?
Start by identifying the variables in the problem. The state of the ice cube is characterized by the time, the length of the sides, and the total volume. Let \(t\) be the time (in seconds), \(s\) the length of a side (in millimeters), and \(V\) the volume (in cubic millimeters).
Next identify the functions defining relations between the variables. We know that the length and volume are both functions of time, so we know there are maps \(t \mapsto s\) and \(t \mapsto V\text{.}\) This is not a chain because \(t\) is the independent variable for both maps. We also know that the volume is a function of the length of a side, \(s \mapsto V = s^3\text{.}\) From this, we can identify a chain, \(t \mapsto s \mapsto V\text{.}\)
We finish by creating an equation relating our rates. Because our variables are related by a chain, the chain rule establishes this relationship:
The problem gives us \(\frac{dV}{dt} = -500\) \(\text{mm}^3/\text{s}\text{.}\) The equation \(V=s^3\) is an explicit formula from which we can compute a derivative
At the instant in question, \(s = 30\) mm so that \(\frac{dV}{ds} = 3(30)^2 = 2700\) \(\text{mm}^3/\text{mm}\text{.}\) The related rates equation involved three rates, two of which we now know. Solving for \(\frac{ds}{dt}\text{,}\) we find
That is, \(\frac{ds}{dt} = -\frac{5}{27}\) \(\text{mm}/\text{s}\text{,}\) so the lengths of the sides are decreasing at a rate of \(\frac{5}{27}\) \(\text{mm}/\text{s}\text{.}\)
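The full calculation for this example, collected in one place, can be sketched as follows. It uses only the chain \(t \mapsto s \mapsto V\) and the values given in the problem; the decimal approximation is added here for convenience.

```latex
\begin{align*}
\frac{dV}{dt} &= \frac{dV}{ds}\cdot\frac{ds}{dt}, &
\frac{dV}{ds} &= \frac{d}{ds}[s^3] = 3s^2, \\
-500 &= 3(30)^2\,\frac{ds}{dt} = 2700\,\frac{ds}{dt}, &
\frac{ds}{dt} &= \frac{-500}{2700} = -\frac{5}{27} \approx -0.185\ \text{mm/s}.
\end{align*}
```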
In some examples, there are multiple equations relating the variables. In that case, there will also be multiple equations relating their rates.
Many water coolers have cups in the shape of a circular cone. The volume \(V\) of a cone can be calculated in terms of the radius \(r\) of the circular base and the height of the cone \(h\) by
As water fills the cup, the volume of water creates a smaller cone than the cup but one with similar dimensions.
Suppose a cup has a height of 12 cm and a radius at the top of 5 cm. Water is filling the cup at a rate of 80 \(\text{cm}^3/\text{s}\text{.}\) When the cup is filled to a depth of 6 cm, how fast is the depth changing?
We will work through two different approaches to solving this problem. The first method considers two equations that relate our variables and creates two equations for the related rates. The second method uses the two equations relating the variables to build a single function, from which we derive one related rates equation.
There are three basic dependent variables: the height of water in the cup, the radius of the circle at the top of the water level, and the volume of water in the cup. All of these change with respect to the independent variable of time. Let \(t\) measure time in seconds, let \(h\) measure the height of water, let \(r\) measure the radius at the top of the water level, and let \(V\) measure the volume of water in the cup. Interpreting the given information, we should note the values of variables at the instant in question. The rate at which water fills the cup is given as a volume per unit time, which we interpret as saying \(\frac{dV}{dt} = 80\text{.}\) The depth of the water informs us that \(h=6\text{.}\) The question asks us to determine \(\frac{dh}{dt}\text{.}\)
The volume of water is related to the radius and height by the equation
In addition, we know that the radius and height must be similar dimensions to the radius and height of the cup itself. This means that the ratios of corresponding sides must be equal, giving a second equation
If we solve for \(r\text{,}\) we find \(r = \frac{5}{12} h\text{.}\)
From the equations relating the dependent variables, we can differentiate to develop equations relating their rates of change. The volume is the constant \(\frac{1}{3}\pi\) times the product \(r^2 h\text{,}\) so we use the product rule, and the derivative of \(r^2\) with respect to \(t\) requires the chain rule:
We can also differentiate the equation defining \(r\) to relate the rates for \(r\) and \(h\text{:}\)
With these equations and the data, we can solve for \(\frac{dh}{dt}\text{.}\) Our related rates equation involves the variable \(r\text{,}\) for which we do not have a value. We can use the similar dimensions equation to solve for \(r\text{,}\)
Substituting the values of variables and rates into the related rates equation for \(\frac{dV}{dt}\text{,}\) we find
As this equation has both rates \(\frac{dr}{dt}\) and \(\frac{dh}{dt}\text{,}\) we substitute into the equation our relation \(\frac{dr}{dt} = \frac{5}{12} \frac{dh}{dt}\text{:}\)
Consequently, we conclude the height of water is rising at a rate slightly higher than 4 \(\text{cm}/\text{s}\) (about 4.07 \(\text{cm}/\text{s}\)).
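The steps of this first method can be written out as follows. This is a reconstruction from the description above (cone volume formula, similar-triangle ratio \(5/12\text{,}\) product and chain rules), so the intermediate layout is an assumption rather than the original display.

```latex
% Equations relating the variables, and the rates obtained by differentiating in t
\begin{align*}
V &= \tfrac{1}{3}\pi r^2 h, &
\frac{r}{h} &= \frac{5}{12} \;\Longrightarrow\; r = \tfrac{5}{12}h, \\
\frac{dV}{dt} &= \tfrac{1}{3}\pi\left(2r\,\frac{dr}{dt}\,h + r^2\,\frac{dh}{dt}\right), &
\frac{dr}{dt} &= \tfrac{5}{12}\,\frac{dh}{dt}.
\end{align*}
% At the instant h = 6, the radius is r = (5/12)(6) = 5/2
\begin{align*}
80 &= \tfrac{1}{3}\pi\left(2\cdot\tfrac{5}{2}\cdot 6\,\frac{dr}{dt} + \tfrac{25}{4}\,\frac{dh}{dt}\right)
    = \tfrac{1}{3}\pi\left(30\,\frac{dr}{dt} + \tfrac{25}{4}\,\frac{dh}{dt}\right) \\
   &= \tfrac{1}{3}\pi\left(30\cdot\tfrac{5}{12} + \tfrac{25}{4}\right)\frac{dh}{dt}
    = \tfrac{1}{3}\pi\cdot\tfrac{75}{4}\,\frac{dh}{dt}
    = \tfrac{25\pi}{4}\,\frac{dh}{dt}, \\
\frac{dh}{dt} &= \frac{320}{25\pi} = \frac{64}{5\pi} \approx 4.07\ \text{cm/s}.
\end{align*}
```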
The second method uses substitution earlier in the process. Instead of substituting the rate of change from related rates, this approach seeks to write an equation so that \(V\) is only a function of \(h\text{.}\) (We choose \(h\) because it is that variable's rate of change that is desired.) Because \(r=\frac{5}{12}h\text{,}\) we can create a single equation relating \(V\) and \(h\text{:}\)
Once we have the equation relating volume and height of the water, we can differentiate to find a single related rates equation using the constant multiple rule and the chain rule for the power \(h^3\text{:}\)
At this point, we can substitute our known values and solve for \(\frac{dh}{dt}\text{:}\)
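For comparison, the second method can be sketched in the same style. The substitution \(r=\frac{5}{12}h\) comes from the text; the simplified constants are computed here and agree with the first method's answer.

```latex
% Volume as a function of h alone
\begin{align*}
V &= \tfrac{1}{3}\pi\left(\tfrac{5}{12}h\right)^2 h = \frac{25\pi}{432}\,h^3, \\
\frac{dV}{dt} &= \frac{25\pi}{432}\cdot 3h^2\,\frac{dh}{dt} = \frac{25\pi}{144}\,h^2\,\frac{dh}{dt}.
\end{align*}
% Substituting dV/dt = 80 and h = 6
\begin{align*}
80 &= \frac{25\pi}{144}\,(6)^2\,\frac{dh}{dt} = \frac{25\pi}{4}\,\frac{dh}{dt}, &
\frac{dh}{dt} &= \frac{64}{5\pi} \approx 4.07\ \text{cm/s}.
\end{align*}
```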
A composition or chain occurs when the output of one function acts as the input to another function.
The derivative measures the limiting ratio of changes in the output to the input for small changes in the input. Consequently, in a composition or chain of functions, the overall rate of change is the product of the rates of change for each step.
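The chain rule states that \(\frac{d}{dx}[f(g(x))] = f'(g(x)) \, g'(x)\text{.}\)
Represented as a chain \(u=g(x)\) and \(y=f(u)\) so that \(y=f(g(x))\text{,}\) the chain rule would be written \(\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}\text{.}\)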
This is the derivative of the outer operation times the derivative of the inner expression.
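The summary rule can also be checked numerically. The sketch below is a sanity check, not a proof: it reuses the function \(3(x^2+3x)^7\) from the example earlier in this section, while the helper names (`inner`, `outer`, `chain_rule_derivative`, `numeric_derivative`), the step size, and the test point \(x=0.5\) are choices made here for illustration.

```python
def inner(x):
    """Inner step of the chain: u = g(x) = x^2 + 3x."""
    return x ** 2 + 3 * x

def outer(u):
    """Outer step of the chain: y = f(u) = 3u^7."""
    return 3 * u ** 7

def chain_rule_derivative(x):
    """dy/dx = (dy/du at u = g(x)) * (du/dx) = 21 u^6 * (2x + 3)."""
    u = inner(x)
    return 21 * u ** 6 * (2 * x + 3)

def numeric_derivative(func, x, h=1e-6):
    """Central difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 0.5  # arbitrary test point chosen for illustration
exact = chain_rule_derivative(x0)
approx = numeric_derivative(lambda x: outer(inner(x)), x0)
print(exact, approx)  # the two values should agree to several digits
```

The comparison uses a central difference because its error shrinks with the square of the step size, which makes agreement with the chain rule formula easy to see at a modest step.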
Use the given rates to find the unknown rate.
Given \(\frac{dy}{du} = 4\) and \(\frac{du}{dx} = -3\text{,}\) find \(\frac{dy}{dx}\text{.}\)
Hint: Imagine a chain \(x \mapsto u \mapsto y\) and apply the chain rule.
Given \(\frac{dF}{dP} = 1.2\) and \(\frac{dP}{dt} = 40\text{,}\) find \(\frac{dF}{dt}\text{.}\)
Hint: \(t \mapsto P \mapsto F\text{.}\)
Given \(\frac{dR}{dt} = 50000\) and \(\frac{dp}{dt} = -2\text{,}\) find \(\frac{dR}{dp}\text{.}\)
Hint: \(t \mapsto p \mapsto R\text{.}\)
Compute the derivatives.
\(\displaystyle \frac{d}{dx}[(3x+2)^3]\)
\(\displaystyle \frac{d}{dx}[(x^2+1)^5]\)
\(\displaystyle \frac{d}{dx}[(2x-5)^{-2}]\)
\(\displaystyle \frac{d}{dx}[\sqrt{4x+1}]\)
\(\displaystyle \frac{d}{dx}[\frac{3}{\sqrt{x^2+4}}]\)
\(\displaystyle \frac{d}{dx}[(x^3+2x)^{-2/3}]\)
\(\displaystyle \frac{d}{dx}[x^4(x^2+1)^3]\)
\(\displaystyle \frac{d}{dx}[x \sqrt{2x+1}]\)
\(\displaystyle \frac{d}{dx}[(3x+1)^{4}(2x-5)^3]\)
\(\displaystyle \frac{d}{dx}[\frac{3x}{(2x+1)^2}]\)
For \(f(x)=x^3(2x+1)^5\text{,}\) find \(f''(x)\text{.}\)
For \(g(x)=\frac{3}{x^2+1}\text{,}\) find \(g''(x)\text{.}\)
For \(h(x)=\sqrt{x^3-1}\text{,}\) find \(h''(x)\text{.}\)
Use the values of \(f(x)\) and \(g(x)\) and their derivatives from the following table to calculate the indicated values.
\(x\) | 0 | 1 | 2 | 3 | 4 | 5 |
\(f(x)\) | 5 | 3 | 1 | 0 | 2 | 4 |
\(g(x)\) | 1 | 4 | 5 | 3 | 2 | 0 |
\(f'(x)\) | -3 | -2 | -1 | 0 | 3 | 5 |
\(g'(x)\) | 4 | 2 | -1 | -2 | -4 | -3 |
For \(h(x) = f(g(x))\text{,}\) find \(h(2)\) and \(h'(2)\text{.}\)
For \(h(x) = g(f(x))\text{,}\) find \(h(2)\) and \(h'(2)\text{.}\)
For \(h(x) = g(2x-3)\text{,}\) find \(h(3)\) and \(h'(3)\text{.}\)
For \(h(x) = f(x^2)\text{,}\) find \(h(2)\) and \(h'(2)\text{.}\)
For \(h(x) = f^2(x) = (f(x))^2\text{,}\) find \(h(1)\) and \(h'(1)\text{.}\)
For \(h(x) = f(2g(x))\text{,}\) find \(h(0)\) and \(h'(0)\text{.}\)
Related Rates
A ripple in a pond spreads as a circle whose radius grows at a speed of 30 \(\text{cm}/\text{s}\text{.}\) At what rate is the area enclosed by the ripple increasing?
An oil spill in the ocean is spreading as a circle such that the total area is increasing at a constant rate. After 10 hours, the circle has a radius of 0.1 km. What is the instantaneous rate of change of the radius at this time?
A bacteria colony grows on its substrate in the shape of a circle. Your colleague suggests that the colony only grows along the outer edge such that the rate of change of the area should be proportional to the circumference. Show that this predicts a constant rate of change for the radius.
A spherical balloon is being filled with air at a rate of 0.5 cubic meters per minute. How fast is the radius increasing when the balloon has a radius of 20 cm?
A spherical balloon is being filled with air at a rate of 0.5 cubic meters per minute. At what radius will the balloon have its radius growing at a rate of 1 centimeter per second?
A pile of sand takes the form of a circular cone. As the sand falls, the pile always maintains the same slope so that the height and diameter have the same proportions. When the pile is 2 meters high, the diameter is 4 meters. If the sand pile at that instant is getting taller at a rate of 0.2 meters per minute, at what rate (cubic meters per minute) is sand being added to the pile?