Subsection 5.3.1 Linear Transformations and Matrices
Consider a linear transformation in the plane, \(T: \mathbb{R}^2 \rightarrow \mathbb{R}^2\text{.}\) Recall that any vector \(\vec{x} = (x_1, x_2) \in \mathbb{R}^2\) can be written in terms of the standard basis vectors \(\ihat = (1,0)\) and \(\jhat = (0,1)\) as,
\begin{equation*}
\vec{x} = x_1 \ihat + x_2 \jhat
\end{equation*}
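For instance, the vector \((3, -2)\) decomposes as \(3\ihat - 2\jhat\text{,}\) since \(3(1,0) - 2(0,1) = (3, -2)\text{.}\)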
Then, consider the image of \(\vec{x}\) under \(T\text{,}\)
\begin{equation*}
T(\vec{x}) = T(x_1 \ihat + x_2 \jhat)
\end{equation*}
By linearity, this is equivalent to,
\begin{align*}
T(\vec{x}) \amp = T(x_1 \ihat + x_2 \jhat)\\
\amp = x_1 T(\ihat) + x_2 T(\jhat)
\end{align*}
Thus, if we know the images \(T(\ihat)\) and \(T(\jhat)\) of the standard basis vectors, we can determine the image \(T(\vec{x})\) of any vector \(\vec{x}\text{.}\) In this way, the transformation \(T\) is “completely determined” by the two images \(T(\ihat)\) and \(T(\jhat)\text{.}\) Intuitively, if you know \(T(\ihat)\) and \(T(\jhat)\text{,}\) then you know \(T\text{.}\)
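For example, suppose a linear transformation \(T\) satisfies \(T(\ihat) = (0, 1)\) and \(T(\jhat) = (-1, 0)\) (rotation by \(90^\circ\) counterclockwise, chosen here purely for illustration). Then, for any \(\vec{x} = (x_1, x_2)\text{,}\)
\begin{equation*}
T(\vec{x}) = x_1 T(\ihat) + x_2 T(\jhat) = x_1 (0, 1) + x_2 (-1, 0) = (-x_2, x_1)
\end{equation*}
so the two images alone pin down the value of \(T\) at every point in the plane.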
Further, the transformation \(T\) can also be written as a matrix-vector product,
\begin{align*}
T(\vec{x}) \amp = x_1 T(\ihat) + x_2 T(\jhat)\\
\amp = \begin{bmatrix} T(\ihat) \amp T(\jhat) \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\end{align*}
Here, the matrix \(\begin{bmatrix} T(\ihat) \amp T(\jhat) \end{bmatrix}\) has two columns, given by the images of the vectors \(\ihat\) and \(\jhat\text{.}\) If we denote this matrix by \(A\text{,}\) then the equation becomes,
\begin{equation*}
T(\vec{x}) = A\vec{x}
\end{equation*}
That is, we have written \(T(\vec{x})\) as a matrix transformation. Notice that the matrix \(A\) only depends on the transformation \(T\text{,}\) and not on the input vector \(\vec{x}\text{.}\)
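Continuing the illustrative rotation example from above, the images \(T(\ihat) = (0,1)\) and \(T(\jhat) = (-1,0)\) become the columns of \(A\text{,}\) and
\begin{equation*}
A = \begin{bmatrix} 0 \amp -1 \\ 1 \amp 0 \end{bmatrix}, \qquad A\vec{x} = \begin{bmatrix} 0 \amp -1 \\ 1 \amp 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -x_2 \\ x_1 \end{bmatrix}
\end{equation*}
which agrees with the formula \(T(\vec{x}) = (-x_2, x_1)\) computed earlier.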
This can be generalized to a linear transformation \(T: \mathbb{R}^n \rightarrow \mathbb{R}^m\text{.}\)
Theorem 5.3.1. Matrix representation of a linear transformation.
Let \(T: \mathbb{R}^n \rightarrow \mathbb{R}^m\) be a linear transformation. Then, there exists a unique matrix \(A\) such that,
\begin{equation*}
T(\vec{x}) = A\vec{x} \qquad \text{for all $\vec{x} \in \mathbb{R}^n$}
\end{equation*}
Further, \(A\) is precisely the \(m \times n\) matrix whose \(j\)th column is given by \(T(\vec{e}_j)\text{,}\) where \(\vec{e}_j\) is the \(j\)th column of the \(n \times n\) identity matrix. In other words,
\begin{equation*}
A = \begin{bmatrix} T(\vec{e}_1) \amp \cdots \amp T(\vec{e}_n) \end{bmatrix}
\end{equation*}
The matrix \(A\) is called the standard matrix for the linear transformation \(T\text{.}\)
In this way, matrices provide a numerical language for understanding linear transformations. Intuitively, the term linear transformation focuses on the properties of the mapping, whereas matrix transformation describes how the mapping is implemented.
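As a brief illustration of the general case, consider the projection \(T: \mathbb{R}^3 \rightarrow \mathbb{R}^2\) given by \(T(x_1, x_2, x_3) = (x_1, x_2)\) (again, an example chosen only for illustration). Here \(T(\vec{e}_1) = (1, 0)\text{,}\) \(T(\vec{e}_2) = (0, 1)\text{,}\) and \(T(\vec{e}_3) = (0, 0)\text{,}\) so the standard matrix is the \(2 \times 3\) matrix
\begin{equation*}
A = \begin{bmatrix} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \end{bmatrix}
\end{equation*}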
Proof.
For \(\vec{x} \in \mathbb{R}^n\text{,}\)
\begin{equation*}
\vec{x} = x_1 \vec{e}_1 + \dots + x_n \vec{e}_n
\end{equation*}
Then,
\begin{align*}
T(\vec{x}) \amp = T(x_1 \vec{e}_1 + \dots + x_n \vec{e}_n)\\
\amp = x_1 T(\vec{e}_1) + \dots + x_n T(\vec{e}_n)\\
\amp = \begin{bmatrix} T(\vec{e}_1) \amp \cdots \amp T(\vec{e}_n) \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \\
\amp = A\vec{x}
\end{align*}
For uniqueness, suppose \(B\) is any matrix satisfying \(T(\vec{x}) = B\vec{x}\) for all \(\vec{x} \in \mathbb{R}^n\text{.}\) Taking \(\vec{x} = \vec{e}_j\) shows that the \(j\)th column of \(B\) is \(B\vec{e}_j = T(\vec{e}_j)\text{,}\) which is precisely the \(j\)th column of \(A\text{.}\) Hence \(B = A\text{.}\)