Section 10.1 Matrix Factorizations, LU Factorization

Recall that factoring an algebraic expression involves writing it as a product of two or more expressions, and is helpful in various ways. A factorization of a matrix \(A\) expresses it as a product of two or more matrices.

Subsection 10.1.1 LU Factorization

Let \(A\) be an \(m \times n\) matrix. Suppose that \(A\) can be converted to row echelon form without using row exchanges. Then, \(A\) can be written in the form \(A = LU\text{,}\) where \(L\) is an \(m \times m\) lower triangular matrix, and \(U\) is an \(m \times n\) upper triangular matrix.
The matrix \(U\) is the REF of \(A\text{,}\) and so is naturally upper triangular. The matrix \(L\) is the product of the inverses of the elementary matrices used to convert \(A\) to \(U\text{,}\) so it is just the identity matrix with additional non-zero entries below the diagonal, corresponding to the row replacement operations.
\begin{equation*} \begin{array}{rcc} A = \amp \begin{bmatrix} 1 \amp 0 \amp 0 \amp 0 \\ \ast \amp 1 \amp 0 \amp 0 \\ \ast \amp \ast \amp 1 \amp 0 \\ \ast \amp \ast \amp \ast \amp 1 \end{bmatrix} \amp \begin{bmatrix} \blacksquare \amp \ast \amp \ast \amp \ast \amp \ast \\ 0 \amp \blacksquare \amp \ast \amp \ast \amp \ast \\ 0 \amp 0 \amp 0 \amp \blacksquare \amp \ast \\ 0 \amp 0 \amp 0 \amp 0 \amp 0 \end{bmatrix} \\ \amp L \amp U \end{array} \end{equation*}
This is called an LU factorization of \(A\text{.}\) The matrix \(L\) is called unit lower triangular (ULT) because it is lower triangular and it has all 1's on the main diagonal.
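For example, here is an LU factorization of a small matrix (chosen just for illustration):
\begin{equation*} \begin{bmatrix} 2 \amp 1 \\ 6 \amp 4 \end{bmatrix} = \begin{bmatrix} 1 \amp 0 \\ 3 \amp 1 \end{bmatrix} \begin{bmatrix} 2 \amp 1 \\ 0 \amp 1 \end{bmatrix} \end{equation*}
Here, \(L\) is ULT, and \(U\) is the REF of \(A\) obtained by the single row replacement \(R_2 \to R_2 - 3R_1\text{.}\)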
If the LU factorization of \(A\) exists, \(A = LU\text{,}\) then the linear system \(A\vec{x} = \vec{b}\) becomes \(LU\vec{x} = \vec{b}\text{,}\) or, \(L(U\vec{x}) = \vec{b}\text{.}\) Then, we can solve this system for \(\vec{x}\) in two steps. First, solve \(L\vec{y} = \vec{b}\) for \(\vec{y}\text{.}\) Then, solve \(U\vec{x} = \vec{y}\) for \(\vec{x}\text{.}\) Both of these systems are easy to solve because their coefficient matrices are triangular. In this way, we solve the original system by solving two simpler systems.
Intuitively, we are “peeling back the layers” of \(L(U\vec{x}) = \vec{b}\text{.}\)
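To make the two-step solve concrete, here is a minimal Python sketch (the function names and example data are ours, not from any particular library), assuming \(L\) and \(U\) have already been computed:

import numpy as np

def forward_substitution(L, b):
    # Solve L y = b where L is unit lower triangular.
    # Since the diagonal entries are all 1, no division is needed.
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = b[i] - L[i, :i] @ y[:i]
    return y

def back_substitution(U, y):
    # Solve U x = y where U is upper triangular with non-zero diagonal.
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

# "Peel back the layers" of L(Ux) = b, using the example factorization above.
L = np.array([[1.0, 0.0], [3.0, 1.0]])
U = np.array([[2.0, 1.0], [0.0, 1.0]])
b = np.array([3.0, 10.0])
y = forward_substitution(L, b)  # solve L y = b
x = back_substitution(U, y)     # solve U x = y
print(x)                        # [1. 1.], the solution of A x = b

Note that back substitution as written assumes \(U\) is square with non-zero diagonal entries; for a singular or non-square \(U\text{,}\) the system \(U\vec{x} = \vec{y}\) would instead be solved by row reduction.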

Subsection 10.1.2 Performing LU Factorization

If \(A\) can be reduced to row echelon form \(U\) using only row replacements which add a multiple of one row to another row below it, then there exist elementary matrices \(E_1, E_2, \dots, E_p\) (say there are \(p\) steps), all unit lower triangular, such that,
\begin{equation*} E_p \cdots E_1 A = U \end{equation*}
Then,
\begin{gather*} A = (E_p \cdots E_1)^{-1} U\\ A = (E_1^{-1} \cdots E_p^{-1}) U \end{gather*}
Thus, \(A = LU\text{,}\) where,
\begin{equation*} L = (E_p \cdots E_1)^{-1} = E_1^{-1} \cdots E_p^{-1} \end{equation*}
Indeed, \(L\) is ULT: each of \(E_1, \dots, E_p\) is ULT, the product of ULT matrices is ULT, and the inverse of a ULT matrix is ULT.
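For instance, in the \(2 \times 2\) case,
\begin{equation*} \begin{bmatrix} 1 \amp 0 \\ a \amp 1 \end{bmatrix} \begin{bmatrix} 1 \amp 0 \\ b \amp 1 \end{bmatrix} = \begin{bmatrix} 1 \amp 0 \\ a + b \amp 1 \end{bmatrix}, \qquad \begin{bmatrix} 1 \amp 0 \\ a \amp 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 \amp 0 \\ -a \amp 1 \end{bmatrix} \end{equation*}
and both results are again ULT.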
To prove the claim about inverses, let \(A\) be a ULT matrix; we show that \(A\) is row equivalent to \(I\) by reducing the augmented matrix \(\begin{bmatrix} A \mid I \end{bmatrix}\text{.}\) The entries below the \((1,1)\)-entry \(a_{11} = 1\) can be changed to 0 by adding multiples of row 1 to the rows below. Since row 1 has no other non-zero entries, the \((2,2)\)-entry is unaffected; also, these row operations only affect the first column of the right-hand block, below the main diagonal. Next, the \((2,2)\)-entry is the next pivot, and row replacements can change all entries below it to 0. Now the \((2,2)\)-entry is the only non-zero entry in its row, so the entries to its right, in particular the \((3,3)\)-entry, are not affected; again, these operations only affect the 2nd column of the right-hand block, below the main diagonal. Continuing in this way, the diagonal entries are all of the pivots, and \(A\) is reduced to \(I\text{,}\) so \(A\) is invertible, and the augmented matrix becomes \(\begin{bmatrix} I \mid A^{-1} \end{bmatrix}\text{.}\) Each row operation only added multiples of a pivot row to rows below it, so \(A^{-1}\) is unit lower triangular.
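The following Python sketch illustrates this construction on a small example (the matrix \(A\) and the helper row_replacement are chosen here purely for illustration):

import numpy as np

def row_replacement(n, j, i, c):
    # Elementary matrix adding c times row i to row j
    # (unit lower triangular whenever j > i).
    E = np.eye(n)
    E[j, i] = c
    return E

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [2.0, 2.0, 3.0]])

# Three row replacements reduce A to REF (no row exchanges needed):
# R2 <- R2 - 2 R1, R3 <- R3 - R1, then R3 <- R3 - R2.
E1 = row_replacement(3, 1, 0, -2.0)
E2 = row_replacement(3, 2, 0, -1.0)
E3 = row_replacement(3, 2, 1, -1.0)

U = E3 @ E2 @ E1 @ A                           # E_p ... E_1 A = U
L = np.linalg.inv(E1) @ np.linalg.inv(E2) @ np.linalg.inv(E3)

print(np.allclose(A, L @ U))  # True: A = L U
print(L)                      # unit lower triangular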

Subsection 10.1.3 Algorithm for LU Factorization

It turns out that computing \(E_1^{-1} \cdots E_p^{-1}\) is much easier than computing \((E_p \cdots E_1)^{-1}\text{.}\) That is, inverting each elementary matrix first and then taking the product is easier than computing the product first and then inverting. Each inverse is immediate: the inverse of the elementary matrix that adds \(c\) times row \(i\) to row \(j\) is the elementary matrix that adds \(-c\) times row \(i\) to row \(j\text{.}\) In practice, then, \(L\) can be read off during the row reduction itself: the \((j,i)\)-entry of \(L\) is simply the multiple of row \(i\) that was subtracted from row \(j\) to create the zero in position \((j,i)\) of \(U\text{.}\)
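Here is a minimal sketch of the resulting algorithm in Python, assuming a square matrix whose pivots are all non-zero so that no row exchanges are needed (a practical implementation would also pivot for numerical stability):

import numpy as np

def lu_no_pivot(A):
    # Compute A = L U without row exchanges.
    # Assumes every pivot encountered is non-zero.
    n = A.shape[0]
    L = np.eye(n)
    U = A.astype(float)
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Multiplier that zeros out U[j, i]. Inverting the
            # corresponding elementary matrix just flips the sign,
            # so the multiplier itself is the (j, i)-entry of L.
            m = U[j, i] / U[i, i]
            U[j, i:] -= m * U[i, i:]
            L[j, i] = m
    return L, U

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [2.0, 2.0, 3.0]])
L, U = lu_no_pivot(A)
print(np.allclose(A, L @ U))  # True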

Subsection 10.1.4 Number of Operations for LU Factorization

Recall that for an arbitrary \(n \times n\) matrix \(A\text{,}\) converting \(A\) to REF requires about \(\frac{2n^3}{3}\) flops. Then, further performing back substitution on this (upper triangular) REF matrix requires about \(n^2\) flops.
Then, suppose we want to solve many linear systems with the same coefficient matrix, \(A\vec{x} = \vec{b}_1, A\vec{x} = \vec{b}_2, \dots, A\vec{x} = \vec{b}_k\) (here the \(\vec{b}_i\) just denote arbitrary constant vectors, not necessarily the columns of a matrix \(B\)). To solve each of these systems individually would require performing row reduction and back substitution, a total of \(\frac{2n^3}{3} + n^2\) flops for each system, and so \(k \left( \frac{2n^3}{3} + n^2 \right)\) flops in total. Instead, using LU factorization, we perform the row reduction once to determine \(U\text{,}\) and determine \(L\) by inverting the elementary matrices. In general, computing matrix inverses is expensive, but elementary matrices have very few non-zero entries, so the inversion and multiplication are very fast. Thus, the factorization costs about \(\frac{2n^3}{3}\) flops in total.
Then, for each system, we solve the two triangular systems \(L\vec{y} = \vec{b}_i\) (by forward substitution) and \(U\vec{x} = \vec{y}\) (by back substitution), each requiring about \(n^2\) flops, so \(2n^2\) flops per system. Thus, in total, there are about,
\begin{equation*} \frac{2n^3}{3} + 2kn^2 \quad \text{flops} \end{equation*}
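For concreteness, take \(n = 1000\) and \(k = 100\) (values chosen purely for illustration). Solving every system from scratch costs about
\begin{equation*} k \left( \frac{2n^3}{3} + n^2 \right) \approx 6.7 \times 10^{10} \quad \text{flops,} \end{equation*}
while factoring once and then substituting costs about
\begin{equation*} \frac{2n^3}{3} + 2kn^2 \approx 8.7 \times 10^{8} \quad \text{flops,} \end{equation*}
a saving of roughly a factor of 77.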