
Section 4.2 Matrix Equations, Matrix-Vector Product

Next, we will combine matrices and vectors, in order to write a system of linear equations as an equation relating matrices and vectors.

Subsection 4.2.1 Product of a Matrix with a Vector

Definition 4.2.1.

Let \(A = \begin{bmatrix} a_{ij} \end{bmatrix}\) be an \(m \times n\) matrix, \(\vec{x} \in \mathbb{R}^n\) be a vector, \(\vec{x} = (x_1, \dots, x_n)\text{.}\) Then, the product of \(A\) and \(\vec{x}\text{,}\) is the vector defined by,
\begin{equation*} \boxed{A\vec{x} = \begin{bmatrix} a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n \\ a_{21} x_1 + a_{22} x_2 + \dots + a_{2n} x_n \\ \vdots \\ a_{m1} x_1 + a_{m2} x_2 + \dots + a_{mn} x_n \end{bmatrix} = \begin{bmatrix} \sum_{k=1}^n a_{1k} x_k \\ \sum_{k=1}^n a_{2k} x_k \\ \vdots \\ \sum_{k=1}^n a_{mk} x_k \end{bmatrix}} \end{equation*}
In general, to determine the \(i\)th entry of the vector \(A\vec{x}\text{,}\) compute the “product” of the \(i\)th row of \(A\) with \(\vec{x}\text{,}\) as,
\begin{equation*} \begin{bmatrix} a_{i1} \amp a_{i2} \amp \dots \amp a_{in} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} a_{i1} x_1 + a_{i2} x_2 + \dots + a_{in} x_n \end{bmatrix} \end{equation*}
This operation of multiplying two vectors of matching lengths (in particular, a row vector and a column vector) to produce a single number is called the dot product of the two vectors. More generally, for vectors \(\vec{a} = (a_1, \dots, a_n), \vec{b} = (b_1, \dots, b_n)\text{,}\)
\begin{equation*} \boxed{\begin{bmatrix} a_1 \amp a_2 \amp \dots \amp a_n \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = a_1 b_1 + a_2 b_2 + \dots + a_n b_n} \end{equation*}
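The row-by-row recipe above translates directly into a short computation. The following is a minimal sketch in plain Python (the helper names `dot` and `matvec` are our own, not standard library functions):

```python
def dot(a, b):
    """Dot product of two equal-length vectors: a1*b1 + a2*b2 + ... + an*bn."""
    return sum(ai * bi for ai, bi in zip(a, b))

def matvec(A, x):
    """Product A x: the i-th entry is the dot product of row i of A with x."""
    return [dot(row, x) for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]          # a 2 x 3 matrix
x = [1, 0, -1]           # a vector in R^3
print(matvec(A, x))      # [1*1 + 2*0 + 3*(-1), 4*1 + 5*0 + 6*(-1)] = [-2, -2]
```

Note that the number of columns of \(A\) must match the length of \(\vec{x}\text{,}\) and the result has as many entries as \(A\) has rows.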

Subsection 4.2.2 Matrix Form of a Linear System

Multiplication of a matrix and a vector is defined precisely so that it aligns with systems of linear equations. In particular, consider the system of equations,
\begin{alignat*}{3} a_{11}x_1 \amp + a_{12}x_2 \amp + \dots \amp + a_{1n}x_n \amp = b_1\\ a_{21}x_1 \amp + a_{22}x_2 \amp + \dots \amp + a_{2n}x_n \amp = b_2\\ \vdots \quad \amp \qquad \vdots \amp \amp \qquad \vdots \amp \vdots\\ a_{m1}x_1 \amp + a_{m2}x_2 \amp + \dots \amp + a_{mn}x_n \amp = b_m \end{alignat*}
\begin{equation*} \underbrace{A = \begin{bmatrix} a_{11} \amp a_{12} \amp \dots \amp a_{1n} \\ a_{21} \amp a_{22} \amp \dots \amp a_{2n} \\ \vdots \amp \vdots \amp \ddots \amp \vdots \\ a_{m1} \amp a_{m2} \amp \dots \amp a_{mn} \end{bmatrix}}_{\text{matrix of coefficients}} \qquad \underbrace{\vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}}_{\text{vector of unknowns}} \qquad \underbrace{\vec{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}}_{\text{constant vector}} \end{equation*}
Then, the entire system can be written compactly as the single matrix equation \(A\vec{x} = \vec{b}\text{.}\)
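As a small numerical sketch (in plain Python, with a hand-rolled `matvec` helper), a \(2 \times 2\) system can be packaged as \(A\vec{x} = \vec{b}\text{,}\) and substituting a solution back into \(A\vec{x}\) reproduces \(\vec{b}\text{:}\)

```python
# The system    x1 + 2*x2 = 5
#             3*x1 + 4*x2 = 6
# in matrix form A x = b:
A = [[1, 2],
     [3, 4]]
b = [5, 6]

def matvec(A, x):
    """Product A x, computed row by row."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# This system has solution x = (-4, 4.5); substituting it back gives b.
x = [-4, 4.5]
print(matvec(A, x))   # [5.0, 6.0]
```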

Subsection 4.2.3 Linear Combination Interpretation

From the previous definition of a matrix-vector product, we can expand out the product as,
\begin{align*} A\vec{x} = \begin{bmatrix} a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n \\ a_{21} x_1 + a_{22} x_2 + \dots + a_{2n} x_n \\ \vdots \\ a_{m1} x_1 + a_{m2} x_2 + \dots + a_{mn} x_n \end{bmatrix} \amp = \begin{bmatrix} a_{11} x_1 \\ a_{21} x_1 \\ \vdots \\ a_{m1} x_1 \end{bmatrix} + \begin{bmatrix} a_{12} x_2 \\ a_{22} x_2 \\ \vdots \\ a_{m2} x_2 \end{bmatrix} + \dots + \begin{bmatrix} a_{1n} x_n \\ a_{2n} x_n \\ \vdots \\ a_{mn} x_n \end{bmatrix}\\ \amp = x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} + \dots + x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} \end{align*}
That is, \(A\vec{x}\) is a linear combination of the columns of \(A\text{,}\) with weights given by the corresponding entries of \(\vec{x}\text{.}\)
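The column-wise viewpoint gives a second, equivalent way to compute the product. The sketch below (plain Python, our own `matvec_columns` name) accumulates \(x_j\) times column \(j\text{,}\) rather than taking row-by-row dot products:

```python
def matvec_columns(A, x):
    """A x computed as a linear combination of the columns of A."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):                    # for each column j of A ...
        for i in range(m):
            result[i] += x[j] * A[i][j]   # ... add x_j times column j
    return result

A = [[1, 2],
     [3, 4],
     [5, 6]]
x = [10, 1]
print(matvec_columns(A, x))   # 10*[1,3,5] + 1*[2,4,6] = [12, 34, 56]
```

Both computations visit exactly the same products \(a_{ij} x_j\text{;}\) they differ only in the order of summation.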

Subsection 4.2.4 Matrix Equations as Linear Systems

This provides yet another interpretation of linear systems: the equation \(A\vec{x} = \vec{b}\) has a solution precisely when \(\vec{b}\) can be written as a linear combination of the columns of \(A\text{.}\)

Subsection 4.2.5 The Identity Matrix

Recall that for real numbers, multiplying a number by 1 results in the number being unchanged. We say that 1 is the multiplicative identity for real numbers, and that \(a \cdot 1 = a\) for all \(a \in \mathbb{R}\text{.}\) Similarly, we can define a matrix which, when multiplied by any vector, leaves it unchanged. This matrix is called an identity matrix.

Definition 4.2.5.

An identity matrix is a square matrix with 1's along the main diagonal (from top left to bottom right) and 0's everywhere else. In particular, the \(n \times n\) identity matrix is denoted by \(I_n\text{,}\) and is given by,
\begin{equation*} I_n = \begin{bmatrix} 1 \amp 0 \amp \dots \amp 0 \\ 0 \amp 1 \amp \dots \amp 0 \\ \vdots \amp \vdots \amp \ddots \amp \vdots \\ 0 \amp 0 \amp \dots \amp 1 \end{bmatrix} \end{equation*}
For example, the \(2 \times 2\) identity matrix \(I_2\) and the \(3 \times 3\) identity matrix \(I_3\) are given by,
\begin{equation*} I_2 = \begin{bmatrix} 1 \amp 0 \\ 0 \amp 1 \end{bmatrix} \qquad I_3 = \begin{bmatrix} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{bmatrix} \end{equation*}
The identity matrix acts as a multiplicative identity, in that it has the property that \(I_n \vec{x} = \vec{x}\) for all \(\vec{x} \in \mathbb{R}^n\text{.}\) Indeed, for example,
\begin{equation*} I_2 \vec{x} = \begin{bmatrix} 1 \amp 0 \\ 0 \amp 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \vec{x} \end{equation*}
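A quick numerical check of the identity property (plain Python; `identity` and `matvec` are our own helper names, building \(I_n\) from its definition):

```python
def identity(n):
    """The n x n identity matrix I_n: 1's on the main diagonal, 0's elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matvec(A, x):
    """Product A x, computed row by row."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

x = [7, -2, 3]
print(matvec(identity(3), x))   # [7, -2, 3], i.e. I_3 x = x
```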

Subsection 4.2.6 Properties of the Matrix-Vector Product

The matrix-vector product distributes over vector addition and respects scalar multiplication. To see why, first consider the case of two columns. Let \(A = \begin{bmatrix} \vec{a}_1 \amp \vec{a}_2 \end{bmatrix}, \vec{u} = (u_1, u_2), \vec{v} = (v_1, v_2)\text{.}\) Then,
\begin{align*} A(\vec{u} + \vec{v}) \amp = \begin{bmatrix} \vec{a}_1 \amp \vec{a}_2 \end{bmatrix} \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \end{bmatrix}\\ \amp = (u_1 + v_1) \vec{a}_1 + (u_2 + v_2) \vec{a}_2\\ \amp = u_1 \vec{a}_1 + v_1 \vec{a}_1 + u_2 \vec{a}_2 + v_2 \vec{a}_2\\ \amp = \brac{u_1 \vec{a}_1 + u_2 \vec{a}_2} + \brac{v_1 \vec{a}_1 + v_2 \vec{a}_2}\\ \amp = A\vec{u} + A\vec{v} \end{align*}
Also,
\begin{align*} A(c\vec{u}) \amp = \begin{bmatrix} \vec{a}_1 \amp \vec{a}_2 \end{bmatrix} \begin{bmatrix} cu_1 \\ cu_2 \end{bmatrix}\\ \amp = cu_1 \vec{a}_1 + cu_2 \vec{a}_2\\ \amp = c\brac{u_1 \vec{a}_1 + u_2 \vec{a}_2}\\ \amp = c (A\vec{u}) \end{align*}
More generally, let \(A = \begin{bmatrix} \vec{a}_1 \amp \dots \amp \vec{a}_n \end{bmatrix}, \vec{u} = (u_1, \dots, u_n), \vec{v} = (v_1, \dots, v_n)\text{.}\) Then,
\begin{align*} A(\vec{u} + \vec{v}) \amp = \begin{bmatrix} \vec{a}_1 \amp \dots \amp \vec{a}_n \end{bmatrix} \begin{bmatrix} u_1 + v_1 \\ \vdots \\ u_n + v_n \end{bmatrix}\\ \amp = (u_1 + v_1) \vec{a}_1 + \dots + (u_n + v_n) \vec{a}_n\\ \amp = u_1 \vec{a}_1 + \dots + u_n \vec{a}_n + v_1 \vec{a}_1 + \dots + v_n \vec{a}_n\\ \amp = A\vec{u} + A\vec{v} \end{align*}
Also,
\begin{align*} A(c\vec{u}) \amp = \begin{bmatrix} \vec{a}_1 \amp \dots \amp \vec{a}_n \end{bmatrix} \begin{bmatrix} cu_1 \\ \vdots \\ cu_n \end{bmatrix}\\ \amp = cu_1 \vec{a}_1 + \dots + cu_n \vec{a}_n\\ \amp = c\brac{u_1 \vec{a}_1 + \dots + u_n \vec{a}_n}\\ \amp = c(A\vec{u}) \end{align*}
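These two properties, \(A(\vec{u} + \vec{v}) = A\vec{u} + A\vec{v}\) and \(A(c\vec{u}) = c(A\vec{u})\text{,}\) can be spot-checked numerically. A minimal sketch in plain Python (`matvec`, `add`, and `scale` are our own helpers, not a library API):

```python
def matvec(A, x):
    """Product A x, computed row by row."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def add(u, v):
    """Entrywise sum of two vectors."""
    return [ui + vi for ui, vi in zip(u, v)]

def scale(c, u):
    """Scalar multiple c u."""
    return [c * ui for ui in u]

A = [[1, 2],
     [3, 5]]
u, v, c = [1, -1], [2, 4], 3

# A(u + v) == Au + Av
assert matvec(A, add(u, v)) == add(matvec(A, u), matvec(A, v))
# A(cu) == c(Au)
assert matvec(A, scale(c, u)) == scale(c, matvec(A, u))
print("both linearity properties hold for this example")
```

A single numerical example does not prove the properties, of course; the derivations above show they hold for every matrix and every pair of vectors of compatible sizes.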