Section 5.6 Taylor's Theorem, Polynomial Approximation
Recall that linear approximation can be used to approximate values of functions. Of course, if we have a calculator at hand, such approximations may seem unnecessary; they are most useful when the true value of a number is unknown or difficult to compute directly. However, we also want to know how accurate our approximation is, that is, we want an upper bound on the error of our approximation. In general,
\begin{equation*}
\boxed{\text{error} = \text{true value} - \text{approximate value}}
\end{equation*}
After all, technically any number is an approximation to any other number; what makes an approximation useful is knowing how large its error can be. In addition, if the true value is unknown, the error cannot be computed directly from this formula, so instead we look for an upper bound on it.
Subsection 5.6.1 Linear Approximations and Error
Recall that the linear approximation \(L\) of \(f\) about \(a\) is given by,
\begin{equation*}
L(x) = f(a) + f'(a)(x - a)
\end{equation*}
Then, the error \(E(x)\) of this approximation is given by,
\begin{align*}
E(x) \amp = f(x) - L(x)\\
E(x) \amp = f(x) - f(a) - f'(a)(x - a)
\end{align*}
In delta notation, this is,
\begin{equation*}
E(x) = \Delta y - f'(a) \Delta x
\end{equation*}
Graphically, \(E(x)\) represents the vertical distance at \(x\) between the graph of \(f\) and the tangent line to that graph. In general, if \(x\) is near \(a\text{,}\) then \(E(x)\) is small.
Consider the function \(f(x) = x^3\text{.}\) For a point \(a\text{,}\) the tangent line to \(f\) at \(x = a\) is given by \(L(x) = a^3 + 3a^2(x - a)\text{.}\) Then, the error in the linear approximation is given by,
\begin{equation*}
E(x) = x^3 - \brac{a^3 + 3a^2 (x - a)}
\end{equation*}
This can be simplified and factored directly, but the computation is simpler in delta notation. Writing \(x = a + \Delta x\text{,}\) we have \(\Delta y = (a + \Delta x)^3 - a^3 = 3a^2 \Delta x + 3a (\Delta x)^2 + (\Delta x)^3\) and \(f'(a) \Delta x = 3a^2 \Delta x\text{.}\) Then,
\begin{align*}
E(x) \amp = 3a^2 \Delta x + 3a (\Delta x)^2 + (\Delta x)^3 - 3a^2 \Delta x\\
\amp = 3a (\Delta x)^2 + (\Delta x)^3\\
\amp = (\Delta x)^2 (3a + \Delta x)
\end{align*}
Notice that as \(\Delta x \to 0\text{,}\) \(E(x) \to 0\text{,}\) due to the factor of \((\Delta x)^2\) in the expression. Moreover, \(E(x) \to 0\) quite quickly, because \(\Delta x\) is being squared.
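For a concrete check, take \(a = 1\) and \(\Delta x = 0.1\) (so \(x = 1.1\)). The factored formula gives,
\begin{equation*}
E(1.1) = (0.1)^2 (3 \cdot 1 + 0.1) = 0.031
\end{equation*}
and indeed \(f(1.1) = (1.1)^3 = 1.331\) while \(L(1.1) = 1 + 3(0.1) = 1.3\text{,}\) a difference of exactly \(0.031\text{.}\)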
Graphically, it can be seen that the vertical distance between a function \(f\) and its tangent line grows in a roughly quadratic way as \(x\) moves away from \(a\text{.}\) Also, intuitively, \(E\) depends on how quickly the curve bends away from its tangent line. In other words, if \(\abs{f''}\) is small, then \(E\) will be small, and if \(\abs{f''}\) is large, then \(E\) can be large. In summary,
Theorem 5.6.2. Linear approximation error bound.
Let \(f\) be a function, twice differentiable on an interval containing \(a\) and \(x\text{.}\) Then, the error \(E(x) = f(x) - L(x)\) in the linear approximation \(L(x) = f(a) + f'(a)(x - a)\) satisfies,
\begin{equation*}
\boxed{E(x) = \frac{f''(c)}{2}(x - a)^2}
\end{equation*}
for some \(c\) between \(a\) and \(x\text{.}\)
The proof of this theorem is more advanced. Intuitively, this says that the error is proportional to the square of the distance between \(x\) and \(a\text{.}\) Also, the error depends on the magnitude of the second derivative \(f''\) for values between \(x\) and \(a\text{.}\) Notice here that the value \(c\) is unknown.
Proof.
First, assume that \(x > a\) (the case \(x \lt a\) is similar). Apply the generalized MVT to \(E(x) = f(x) - f(a) - f'(a)(x - a)\) and the function \(g(x) = (x - a)^2\) on the interval \([a,x]\text{.}\) Then, for some \(c \in (a,x)\text{,}\)
\begin{align*}
\frac{E(x) - E(a)}{(x - a)^2 - (a - a)^2} \amp = \frac{E'(c)}{2(c - a)}\\
\frac{E(x)}{(x - a)^2} \amp = \frac{E'(c)}{2(c - a)}
\end{align*}
Also, note that \(E(a) = 0\text{.}\) Then, again, applying the generalized MVT to \(E'(x) = f'(x) - f'(a)\) and \(h(x) = 2(x - a)\) on the interval \([a,c]\) gives that,
\begin{align*}
\frac{E'(c) - E'(a)}{2(c - a) - 2(a - a)} \amp = \frac{E''(u)}{2}\\
\frac{E'(c)}{2(c - a)} \amp = \frac{E''(u)}{2} = \frac{f''(u)}{2}
\end{align*}
for some \(u \in (a,c)\text{,}\) where we have used that \(E'(a) = 0\) and that \(E''(x) = f''(x)\text{.}\) Putting these together,
\begin{equation*}
\frac{E(x)}{(x - a)^2} = \frac{f''(u)}{2}
\end{equation*}
and so \(E(x) = \frac{f''(u)}{2} (x - a)^2\text{,}\) which is the claimed formula with \(c = u\text{.}\)
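This formula can be checked against the earlier example \(f(x) = x^3\text{,}\) for which \(f''(x) = 6x\text{.}\) The theorem predicts \(E(x) = \frac{f''(c)}{2}(\Delta x)^2 = 3c(\Delta x)^2\) for some \(c\) between \(a\) and \(x = a + \Delta x\text{,}\) and the direct computation gave,
\begin{equation*}
E(x) = (\Delta x)^2 (3a + \Delta x) = 3\left(a + \frac{\Delta x}{3}\right)(\Delta x)^2
\end{equation*}
so here \(c = a + \frac{\Delta x}{3}\text{,}\) which indeed lies between \(a\) and \(a + \Delta x\text{.}\)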
Corollary 5.6.3.
If \(\abs{f''(t)} \lt M\) for all \(t\) between \(a\) and \(x\text{,}\) then,
\begin{equation*}
\boxed{\abs{E(x)} \leq \frac{M}{2}(x - a)^2}
\end{equation*}
Intuitively, the error depends on the magnitude of the concavity of the graph of \(f\text{.}\)
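For example, consider estimating \(\sqrt{4.1}\) using the linearization of \(f(x) = \sqrt{x}\) at \(a = 4\text{.}\) Since \(f'(x) = \frac{1}{2\sqrt{x}}\text{,}\) the linearization is \(L(x) = 2 + \frac{1}{4}(x - 4)\text{,}\) so \(L(4.1) = 2.025\text{.}\) Also, \(f''(x) = -\frac{1}{4x^{3/2}}\text{,}\) so for \(t\) between 4 and 4.1,
\begin{equation*}
\abs{f''(t)} = \frac{1}{4t^{3/2}} \leq \frac{1}{4 \cdot 4^{3/2}} = \frac{1}{32}
\end{equation*}
Then, the corollary gives,
\begin{equation*}
\abs{E(4.1)} \leq \frac{1/32}{2}(4.1 - 4)^2 = \frac{(0.1)^2}{64} \approx 0.00016
\end{equation*}
Indeed, \(\sqrt{4.1} = 2.02484\dots\text{,}\) so the actual error is about \(0.00015\text{.}\)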
Subsection 5.6.2 Quadratic Approximations
Just as the tangent line gives a linear approximation of a function, matching the second derivative as well gives a quadratic approximation, which is typically more accurate near the point of interest. The following approximations hold for \(x\) near 0; the general construction is developed in the next subsection.
Example 5.6.4. Sine.
\begin{equation*}
\sin{x} \approx x
\end{equation*}
Example 5.6.5. Cosine.
\begin{equation*}
\cos{x} \approx 1 - \frac{1}{2}x^2
\end{equation*}
Example 5.6.6. Exponential.
\begin{equation*}
e^x \approx 1 + x + \frac{1}{2} x^2
\end{equation*}
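As a quick numerical check, at \(x = 0.2\) the true values and the approximations are,
\begin{align*}
\sin(0.2) \amp = 0.19866\dots \amp 0.2 \amp = 0.2\\
\cos(0.2) \amp = 0.98006\dots \amp 1 - \frac{1}{2}(0.2)^2 \amp = 0.98\\
e^{0.2} \amp = 1.22140\dots \amp 1 + 0.2 + \frac{1}{2}(0.2)^2 \amp = 1.22
\end{align*}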
Subsection 5.6.3 Taylor Polynomials
Intuitively, the linearization of \(f\) at \(x = a\text{,}\)
\begin{equation*}
L(x) = f(a) + f'(a)(x - a)
\end{equation*}
describes the behavior of \(f\) near \(a\) better than any other degree 1 polynomial, in that \(L\) matches \(f\) and its derivative \(f'\) at \(a\text{,}\) i.e. \(L(a) = f(a)\) and \(L'(a) = f'(a)\text{.}\)
A better approximation of \(f\) can be obtained by using quadratic or higher-degree polynomials which match more derivatives at \(x = a\text{.}\) For example, consider,
\begin{equation*}
T_2(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2
\end{equation*}
which satisfies \(T_2(a) = f(a), T_2'(a) = f'(a)\text{,}\) and also \(T_2''(a) = f''(a)\) (notice the factor of 2 in the denominator in \(T_2\) which allows for the last matching). In a similar way, \(T_2\) describes the behavior of \(f\) near \(a\) better than any other degree 2 polynomial. In general, if \(f\) is \(n\) times differentiable on some open interval containing \(a\text{,}\) then the polynomial,
\begin{equation*}
T_n(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \dots + \frac{f^{(n)}(a)}{n!}(x - a)^n
\end{equation*}
is the unique polynomial which matches \(f\) and its first \(n\) derivatives at \(x = a\text{,}\)
\begin{equation*}
T_n(a) = f(a) \qquad T_n'(a) = f'(a) \qquad \cdots \qquad T_n^{(n)}(a) = f^{(n)}(a)
\end{equation*}
and describes \(f\) near \(x = a\) better than any other polynomial of degree at most \(n\text{.}\) This is called a Taylor polynomial of \(f\text{.}\)
Definition 5.6.7.
Let \(f\) be \(n\)-times differentiable, \(a \in \mathbb{R}\text{.}\) Then, the \(n\)th order Taylor polynomial centered at \(a\) is given by,
\begin{equation*}
\boxed{T_n(x) = f(a) + f'(a)(x - a) + \dots + \frac{f^{(n)}(a)}{n!}(x - a)^n = \sum_{k=0}^n \frac{f^{(k)}(a)}{k!} (x - a)^k}
\end{equation*}
A Taylor polynomial centered at \(a = 0\) is also called a MacLaurin polynomial.
In this way, \(T_1(x)\) is the linearization of \(f\text{,}\) and \(T_2(x)\) is a quadratic approximation of \(f\text{,}\) etc. The 0th-order Taylor polynomial, \(T_0\text{,}\) is a constant function \(T_0(x) = f(a)\text{.}\) Also, in general,
\begin{equation*}
T_n(x) = T_{n-1}(x) + \frac{f^{(n)}(a)}{n!}(x - a)^n
\end{equation*}
Note that \(T_n\) is sometimes called the \(n\)th degree Taylor polynomial; however, \(T_n\) may have degree lower than \(n\) if \(f^{(n)}(a) = 0\text{.}\) Taylor polynomials are named after Brook Taylor (1685-1731).
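For example, consider the MacLaurin polynomials of \(f(x) = \sin{x}\text{.}\) Here \(f(0) = 0\text{,}\) \(f'(0) = 1\text{,}\) \(f''(0) = 0\text{,}\) \(f'''(0) = -1\text{,}\) and \(f^{(4)}(0) = 0\text{,}\) so,
\begin{equation*}
T_1(x) = x \qquad T_2(x) = x \qquad T_3(x) = x - \frac{1}{6}x^3 \qquad T_4(x) = x - \frac{1}{6}x^3
\end{equation*}
This recovers the approximation \(\sin{x} \approx x\) from Example 5.6.4, and shows that \(T_n\) can indeed have degree lower than \(n\text{:}\) here \(T_2\) has degree 1.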
Subsection 5.6.4 Taylor's Theorem (Taylor's Formula)
Taylor polynomials are used to approximate functions near a value \(x = a\text{.}\) Taylor's theorem provides information about the error of the approximation.
Definition 5.6.8.
The error term (or remainder term) of a Taylor approximation is defined by,
\begin{equation*}
\boxed{E_n(x) = f(x) - T_n(x)}
\end{equation*}
Theorem 5.6.9.
Let \(f\) be \((n+1)\)-times differentiable on an interval containing \(a\) and \(x\text{.}\) Then, the error term is given by,
\begin{equation*}
E_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!} (x - a)^{n+1}
\end{equation*}
for some \(c\) between \(a\) and \(x\text{.}\) More explicitly,
\begin{equation*}
\boxed{f(x) = \underbrace{f(a) + f'(a)(x - a) + \dots + \frac{f^{(n)}(a)}{n!}(x - a)^n}_{T_n(x)} + \underbrace{\frac{f^{(n+1)}(c)}{(n + 1)!}(x - a)^{n+1}}_{E_n(x)}}
\end{equation*}
This equation is called Taylor's formula.
Notice that the error term of Taylor's theorem is just the next term in the Taylor approximation, except with the derivative evaluated at some unknown point \(c\) between \(a\) and \(x\text{,}\) rather than at \(a\text{.}\)
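For example, taking \(n = 1\) recovers the linear approximation error bound of Theorem 5.6.2,
\begin{equation*}
E_1(x) = \frac{f''(c)}{2!}(x - a)^2
\end{equation*}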
Corollary 5.6.10.
If the \((n+1)\)-st derivative of \(f\) is bounded, say \(\abs{f^{(n+1)}(t)} \leq M\) for all \(t\) between \(x\) and \(a\text{,}\) then by Taylor's theorem,
\begin{equation*}
\boxed{\abs{E_n(x)} \leq \frac{M}{(n + 1)!}\abs{x - a}^{n+1}}
\end{equation*}
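For example, consider approximating \(\sin{x}\) by its MacLaurin polynomial \(T_2(x) = x\text{.}\) Here \(f'''(t) = -\cos{t}\text{,}\) so \(\abs{f'''(t)} \leq 1\) for all \(t\text{,}\) and the bound gives,
\begin{equation*}
\abs{\sin{x} - x} = \abs{E_2(x)} \leq \frac{1}{3!}\abs{x}^3
\end{equation*}
At \(x = 0.1\text{,}\) this bound is \(\frac{(0.1)^3}{6} = 0.0001666\dots\text{,}\) while the actual error is \(0.1 - \sin(0.1) = 0.0001665\dots\text{.}\)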
Subsection 5.6.5 Proof of Taylor's Theorem
Proof.
For \(n = 0\text{,}\) Taylor's theorem is,
\begin{equation*}
f(x) = f(a) + f'(c)(x - a)
\end{equation*}
for some \(c\) between \(a\) and \(x\text{.}\) This is just the MVT. For general \(n\text{,}\) the result follows from the theorems of the next subsection: by Theorem 5.6.13, \(E_n\) and its first \(n\) derivatives vanish at \(a\text{,}\) and since \(T_n\) is a polynomial of degree at most \(n\text{,}\) \(E_n^{(n+1)}(x) = f^{(n+1)}(x)\text{.}\) Applying Theorem 5.6.14 to \(g = E_n\) then gives Taylor's formula.
Subsection 5.6.6 Misc
Theorem 5.6.11.
Let \(p(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n\) be a polynomial. Then, for any \(a \in \mathbb{R}\text{,}\) \(p\) can be written in the form,
\begin{equation*}
p(x) = c_0 + c_1 (x - a) + c_2 (x - a)^2 + \dots + c_n (x - a)^n = \sum_{k=0}^n c_k (x - a)^k
\end{equation*}
Proof.
We write \(x = (x - a) + a\text{.}\) Then,
\begin{equation*}
p(x) = a_0 + a_1 \brac{(x - a) + a} + a_2 \brac{(x - a) + a}^2 + \dots + a_n \brac{(x - a) + a}^n
\end{equation*}
Then, expand (using the binomial theorem), and collect like powers of \(x - a\text{.}\)
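For instance, with \(p(x) = x^2\) and \(a = 1\text{,}\)
\begin{equation*}
x^2 = \brac{(x - 1) + 1}^2 = (x - 1)^2 + 2(x - 1) + 1
\end{equation*}
so \(c_0 = 1\text{,}\) \(c_1 = 2\text{,}\) and \(c_2 = 1\text{.}\)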
Theorem 5.6.12.
If \(p(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n\) is a polynomial, then the coefficients are given by,
\begin{equation*}
a_k = \frac{p^{(k)}(0)}{k!}
\end{equation*}
Proof.
The \(k\)th derivative of \(p(x)\) has constant term \(k! a_k\text{.}\) Thus, \(p^{(k)}(0) = k! a_k\text{,}\) and the result follows.
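For instance, if \(p(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3\text{,}\) then,
\begin{equation*}
p''(x) = 2a_2 + 6a_3 x \qquad \text{and so} \qquad p''(0) = 2!\,a_2
\end{equation*}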
Theorem 5.6.13.
Let \(T_n(x) = a_0 + a_1 (x - a) + a_2 (x - a)^2 + \dots + a_n (x - a)^n\) be the \(n\)th Taylor polynomial of \(f\) centered at \(x = a\text{.}\) Then, the error term \(E_n(x) = f(x) - T_n(x)\) is such that,
\begin{equation*}
E_n(a) = E_n'(a) = \dots = E_n^{(n)}(a) = 0
\end{equation*}
and so the \(n\)th Taylor polynomial of \(E_n\) at \(a\) is 0.
Theorem 5.6.14.
If \(g(x)\) is \((n+1)\)-times differentiable on some interval containing \(x\) and \(a\text{,}\) and if \(g(a) = g'(a) = \dots = g^{(n)}(a) = 0\text{,}\) then there exists \(c\) between \(a\) and \(x\) such that,
\begin{equation*}
g(x) = \frac{g^{(n+1)}(c)}{(n+1)!} (x - a)^{n+1}
\end{equation*}
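For \(n = 0\text{,}\) this says that if \(g(a) = 0\text{,}\) then,
\begin{equation*}
g(x) = g'(c)(x - a)
\end{equation*}
for some \(c\) between \(a\) and \(x\text{,}\) which is just the MVT applied to \(g\text{.}\)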