Subsection 6.5.1 Coordinate Systems
Recall that a basis is a linearly independent spanning set: every vector can be written as a linear combination of basis vectors. In addition, bases are useful because it turns out that every vector in a subspace \(H\) can be written uniquely as a linear combination of a basis of \(H\text{.}\)
Theorem 6.5.1. Unique Representation theorem.
Let \(\mathcal{B} = \set{\vec{b}_1, \dots, \vec{b}_p}\) be a basis of a subspace \(H\text{.}\) Then, for each \(\vec{x} \in H\text{,}\) there exist unique weights \(c_1, \dots, c_p\) such that,
\begin{equation*}
\vec{x} = c_1 \vec{b}_1 + \dots + c_p \vec{b}_p
\end{equation*}
Proof.
Suppose that \(\vec{x}\) has two representations as linear combinations of \(\mathcal{B}\text{,}\) say \(\vec{x} = c_1 \vec{b}_1 + \dots + c_p \vec{b}_p\) and \(\vec{x} = d_1 \vec{b}_1 + \dots + d_p \vec{b}_p\text{.}\) Then, subtracting these two representations,
\begin{equation*}
\vec{0} = \vec{x} - \vec{x} = (c_1 - d_1) \vec{b}_1 + \dots + (c_p - d_p) \vec{b}_p
\end{equation*}
Then, since \(\mathcal{B}\) is linearly independent, every weight must be equal to 0. Thus, \(c_i - d_i = 0\) for every \(i\text{,}\) or \(c_i = d_i\) for every \(i\text{.}\) That is, the representations are the same.
Definition 6.5.2.
Let \(\mathcal{B} = \set{\vec{b}_1, \dots, \vec{b}_p}\) be a basis of a subspace \(H\text{.}\) Then, for \(\vec{x} \in H\text{,}\) the coordinates of \(\vec{x}\) relative to the basis \(\mathcal{B}\) are the weights \(c_1, \dots, c_p\) such that \(\vec{x} = c_1 \vec{b}_1 + \dots + c_p \vec{b}_p\text{,}\) and the vector,
\begin{equation*}
[\vec{x}]_{\mathcal{B}} = \begin{bmatrix} c_1 \\ \vdots \\ c_p \end{bmatrix}
\end{equation*}
is called the coordinate vector of \(\vec{x}\) relative to \(\mathcal{B}\).
Intuitively, the coordinates of a vector \(\vec{x}\) are a “name” for \(\vec{x}\) that uniquely identifies it among all vectors in \(H\text{.}\) In this way, each basis acts as a different naming system. Alternatively, different bases can be thought of as different reference frames or different points of view on a given vector.
Then, consider the question of how to determine the coordinates of a vector \(\vec{x}\) relative to a given basis, given its coordinates in the standard basis.
Example 6.5.3. Motivating example.
Consider \(\mathbb{R}^2\text{,}\) and the basis \(\mathcal{B} = \set{\vec{b}_1, \vec{b}_2}\text{,}\) where \(\vec{b}_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \vec{b}_2 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}\text{.}\) Consider the vector \(\vec{x} = \begin{bmatrix} 4 \\ 5 \end{bmatrix}\) (in the standard basis coordinates), and consider the coordinates of this vector in the basis \(\mathcal{B}\text{.}\) The coordinates \(c_1, c_2\) of \(\vec{x}\) relative to \(\mathcal{B}\text{,}\) by definition, satisfy the equation,
\begin{equation*}
c_1 \begin{bmatrix} 2 \\ 1 \end{bmatrix} + c_2 \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \end{bmatrix}
\end{equation*}
Or, in matrix form,
\begin{equation*}
\begin{bmatrix} 2 \amp -1 \\ 1 \amp 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \end{bmatrix}
\end{equation*}
Then, determining \(c_1, c_2\) involves solving this linear system, for example either by using row operations, or using inverse matrices. Using either method, the solution is \(c_1 = 3, c_2 = 2\text{,}\) and so \(\vec{x} = 3 \vec{b}_1 + 2\vec{b}_2\text{,}\) and,
\begin{equation*}
[\vec{x}]_{\mathcal{B}} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}
\end{equation*}
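The computation in this example can be checked numerically. Below is a minimal sketch using NumPy (not part of the text): the basis vectors are placed as the columns of a matrix, and the resulting linear system is solved for the coordinates.

```python
import numpy as np

# Basis vectors b1 = (2, 1) and b2 = (-1, 1) as the columns of a matrix
B = np.array([[2.0, -1.0],
              [1.0,  1.0]])
x = np.array([4.0, 5.0])  # x in standard coordinates

# Solve B c = x for the coordinate vector c = [x]_B
c = np.linalg.solve(B, x)
print(c)  # [3. 2.], i.e. x = 3 b1 + 2 b2
```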
More generally, in \(\mathbb{R}^n\text{,}\) let \(\mathcal{B} = \set{\vec{b}_1, \dots, \vec{b}_n}\) be a basis, and let \(\vec{x}\) be a vector (specified in the standard basis). Then, if \([\vec{x}]_{\mathcal{B}} = (c_1, \dots, c_n)\text{,}\) then by definition,
\begin{equation*}
c_1 \vec{b}_1 + \dots + c_n \vec{b}_n = \vec{x}
\end{equation*}
In other words, the following equation holds,
\begin{equation*}
\begin{bmatrix} \vec{b}_1 \amp \dots \amp \vec{b}_n \end{bmatrix} [\vec{x}]_{\mathcal{B}} = \vec{x}
\end{equation*}
The matrix \(\begin{bmatrix} \vec{b}_1 \amp \dots \amp \vec{b}_n \end{bmatrix}\) is denoted by \(P_{\mathcal{B}}\text{,}\) and is called the change-of-coordinates matrix from \(\mathcal{B}\) to the standard basis in \(\mathbb{R}^n\text{.}\) Then,
\begin{equation*}
\boxed{\vec{x} = P_{\mathcal{B}} [\vec{x}]_{\mathcal{B}}}
\end{equation*}
i.e. left-multiplication by \(P_{\mathcal{B}}\) converts the coordinate vector \([\vec{x}]_{\mathcal{B}}\) into \(\vec{x}\text{.}\) Conversely, the columns of \(P_{\mathcal{B}}\) form a basis of \(\mathbb{R}^n\) (by assumption), so \(P_{\mathcal{B}}\) is invertible. Then,
\begin{equation*}
\boxed{[\vec{x}]_{\mathcal{B}} = P_{\mathcal{B}}^{-1} \vec{x}}
\end{equation*}
i.e. left-multiplication by the inverse of the change-of-coordinates matrix \(P_{\mathcal{B}}^{-1}\) converts \(\vec{x}\) to its coordinate vector \([\vec{x}]_{\mathcal{B}}\text{.}\)
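The two boxed formulas can be illustrated with the basis from the previous example; a short numerical sketch, assuming NumPy:

```python
import numpy as np

# Change-of-coordinates matrix P_B: columns are the basis vectors of B
# (here b1 = (2, 1) and b2 = (-1, 1), as in the example above)
P_B = np.array([[2.0, -1.0],
                [1.0,  1.0]])

x_B = np.array([3.0, 2.0])         # [x]_B, coordinates relative to B
x = P_B @ x_B                      # x = P_B [x]_B, standard coordinates
x_B_back = np.linalg.inv(P_B) @ x  # [x]_B = P_B^{-1} x

print(x)         # [4. 5.]
print(x_B_back)  # [3. 2.]
```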
Subsection 6.5.2 Change of Basis
Different bases may be useful for different purposes, so it is useful to understand how to convert the coordinates of a vector relative to one basis into its coordinates relative to another.
Example 6.5.4. Motivating example: Two bases in \(\mathbb{R}^2\).
Consider two bases, \(\mathcal{B} = \set{\vec{b}_1, \vec{b}_2}, \mathcal{C} = \set{\vec{c}_1, \vec{c}_2}\text{,}\) and let \(\vec{x}\) have coordinates \([\vec{x}]_{\mathcal{B}} = (3,1)\text{.}\) Then, consider the coordinates of \(\vec{x}\) with respect to \(\mathcal{C}\text{,}\) say \([\vec{x}]_{\mathcal{C}} = (a_1, a_2)\text{.}\) Then, by definition,
\begin{equation*}
\vec{x} = 3\vec{b}_1 + \vec{b}_2
\end{equation*}
and,
\begin{equation*}
\vec{x} = a_1 \vec{c}_1 + a_2 \vec{c}_2
\end{equation*}
Again, the goal is to solve for \(a_1, a_2\text{.}\) The key here is that we need to know how to represent the \(\mathcal{B}\) basis vectors \(\vec{b}_1, \vec{b}_2\) in terms of the “new” basis vectors \(\vec{c}_1, \vec{c}_2\text{.}\) For this example, suppose that \(\vec{b}_1 = 4\vec{c}_1 + \vec{c}_2\) and \(\vec{b}_2 = -6 \vec{c}_1 + \vec{c}_2\text{.}\) Then, to determine \(\vec{x}\) in terms of \(\vec{c}_1, \vec{c}_2\text{,}\) we can substitute the relation between \(\vec{b}_1, \vec{b}_2\) and \(\vec{c}_1, \vec{c}_2\text{,}\)
\begin{align*}
\vec{x} \amp = 3(4\vec{c}_1 + \vec{c}_2) + (-6 \vec{c}_1 + \vec{c}_2)\\
\vec{x} \amp = 6 \vec{c}_1 + 4\vec{c}_2 \amp\amp \text{collecting like terms}
\end{align*}
Thus,
\begin{equation*}
[\vec{x}]_{\mathcal{C}} = \begin{bmatrix} 6 \\ 4 \end{bmatrix}
\end{equation*}
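The same substitution can be organized as a matrix-vector product, anticipating the general pattern; a brief numerical sketch, assuming NumPy:

```python
import numpy as np

# Columns are [b1]_C = (4, 1) and [b2]_C = (-6, 1),
# read off from b1 = 4 c1 + c2 and b2 = -6 c1 + c2
P_BtoC = np.array([[4.0, -6.0],
                   [1.0,  1.0]])
x_B = np.array([3.0, 1.0])  # [x]_B

x_C = P_BtoC @ x_B  # [x]_C
print(x_C)  # [6. 4.]
```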
More generally, let \([\vec{x}]_{\mathcal{B}} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\text{,}\) and let,
\begin{align*}
\vec{b}_1 \amp = a_{11} \vec{c}_1 + a_{21} \vec{c}_2\\
\vec{b}_2 \amp = a_{12} \vec{c}_1 + a_{22} \vec{c}_2
\end{align*}
Then,
\begin{align*}
\vec{x} \amp = x_1 \vec{b}_1 + x_2 \vec{b}_2\\
\amp = x_1 (a_{11} \vec{c}_1 + a_{21} \vec{c}_2) + x_2 (a_{12} \vec{c}_1 + a_{22} \vec{c}_2)\\
\amp = (a_{11} x_1 + a_{12} x_2) \vec{c}_1 + (a_{21} x_1 + a_{22} x_2) \vec{c}_2
\end{align*}
Thus,
\begin{equation*}
[\vec{x}]_{\mathcal{C}} = \begin{bmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \end{bmatrix}
\end{equation*}
Notice that this can be decomposed in terms of a matrix-vector product,
\begin{equation*}
[\vec{x}]_{\mathcal{C}} = \begin{bmatrix} a_{11} \amp a_{12} \\ a_{21} \amp a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\end{equation*}
Denoting this matrix by \(P_{\mathcal{B} \rightarrow \mathcal{C}}\text{,}\)
\begin{equation*}
[\vec{x}]_{\mathcal{C}} = P_{\mathcal{B} \rightarrow \mathcal{C}} [\vec{x}]_{\mathcal{B}}
\end{equation*}
Here, \(P_{\mathcal{B} \rightarrow \mathcal{C}}\) is a matrix which converts vectors in \(\mathcal{B}\) coordinates to \(\mathcal{C}\) coordinates (hence the subscript \(\mathcal{B} \rightarrow \mathcal{C}\)). Also, the columns of \(P_{\mathcal{B} \rightarrow \mathcal{C}}\) are the coordinates of the basis vectors of \(\mathcal{B}\) in terms of \(\mathcal{C}\) coordinates.
This generalizes to vectors in \(\mathbb{R}^n\text{.}\)
Theorem 6.5.5. Change of basis.
Let \(\mathcal{B} = \set{\vec{b}_1, \dots, \vec{b}_n}, \mathcal{C} = \set{\vec{c}_1, \dots, \vec{c}_n}\) be bases of \(\mathbb{R}^n\text{.}\) Then, there is a unique \(n \times n\) matrix \(P_{\mathcal{B} \rightarrow \mathcal{C}}\text{,}\) called the change-of-coordinates matrix from \(\mathcal{B}\) to \(\mathcal{C}\), such that,
\begin{equation*}
\boxed{[\vec{x}]_{\mathcal{C}} = P_{\mathcal{B} \rightarrow \mathcal{C}} [\vec{x}]_{\mathcal{B}}}
\end{equation*}
Further, the columns of \(P_{\mathcal{B} \rightarrow \mathcal{C}}\) are the coordinate vectors of the basis vectors of \(\mathcal{B}\) with respect to \(\mathcal{C}\text{,}\)
\begin{equation*}
P_{\mathcal{B} \rightarrow \mathcal{C}} = \begin{bmatrix} [\vec{b}_1]_{\mathcal{C}} \amp [\vec{b}_2]_{\mathcal{C}} \amp \dots \amp [\vec{b}_n]_{\mathcal{C}} \end{bmatrix}
\end{equation*}
The matrix \(P_{\mathcal{B} \rightarrow \mathcal{C}}\) is invertible, because its columns are the coordinate vectors of the linearly independent set \(\mathcal{B}\text{,}\) and hence form a collection of \(n\) linearly independent vectors. Then,
\begin{equation*}
\brac{P_{\mathcal{B} \rightarrow \mathcal{C}}}^{-1} [\vec{x}]_{\mathcal{C}} = [\vec{x}]_{\mathcal{B}}
\end{equation*}
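When both bases are specified in standard coordinates, each column \([\vec{b}_i]_{\mathcal{C}}\) can be found by solving \(P_{\mathcal{C}} [\vec{b}_i]_{\mathcal{C}} = \vec{b}_i\text{,}\) which gives \(P_{\mathcal{B} \rightarrow \mathcal{C}} = P_{\mathcal{C}}^{-1} P_{\mathcal{B}}\text{.}\) A sketch with hypothetical bases (not from the text), assuming NumPy:

```python
import numpy as np

# Hypothetical bases of R^2, given in standard coordinates (as columns)
P_B = np.array([[1.0, 2.0],
                [0.0, 1.0]])
P_C = np.array([[1.0, 1.0],
                [1.0, 2.0]])

# Each column [b_i]_C solves P_C [b_i]_C = b_i,
# so P_{B->C} = P_C^{-1} P_B
P_BtoC = np.linalg.solve(P_C, P_B)

# Converting B -> C and back recovers the original coordinates
x_B = np.array([3.0, 1.0])
x_C = P_BtoC @ x_B
x_B_back = np.linalg.solve(P_BtoC, x_C)
print(np.allclose(x_B_back, x_B))  # True
```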
Notice that this is a generalization of the previous matrix \(P_{\mathcal{B}}\text{,}\) which converted from a basis \(\mathcal{B}\) to the standard basis \(\mathcal{E} = \set{\vec{e}_1, \dots, \vec{e}_n}\text{.}\) By above, the change-of-coordinates matrix \(P_{\mathcal{B} \rightarrow \mathcal{E}}\) has columns given by the coordinates of the basis \(\mathcal{B}\) with respect to the standard basis \(\mathcal{E}\text{.}\) Since \(\mathcal{B}\) is given in terms of the standard basis, we have that,
\begin{equation*}
P_{\mathcal{B} \rightarrow \mathcal{E}} = \begin{bmatrix} \vec{b}_1 \amp \vec{b}_2 \amp \dots \amp \vec{b}_n \end{bmatrix}
\end{equation*}