LU Factorization

The goal of this notebook is to give a brief explanation of LU Factorization roughly following TAK § 4.3.

We will not explain how to compute the LU factorization here; the method is very close to Gaussian elimination. To learn how to carry it out, see the description in TAK beginning on page 100.

Imports

The following are standard imports.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import math as m
from mpmath import mp, iv

We add the following import for this notebook (and going forward).

In [2]:
from scipy import linalg

Why use LU Factorization?

Recall that the main motivation for Gaussian elimination was that it can be used to solve the equation $A x =b$. But suppose you have several vectors $b_1, \ldots, b_n$ and for each $i \in \{1, \ldots, n\}$ you want to find a solution vector $x_i$ satisfying $A x_i = b_i$. Then you would need to repeat Gaussian elimination $n$ times, even though the forward elimination phase would be exactly the same in each case. It turns out that we do not need to repeat this calculation if we keep better track of what we are doing, and this saves a substantial amount of work compared with running Gaussian elimination from scratch each time.

What is LU Factorization?

An LU factorization of a matrix $A$ is obtained by writing $A$ as a product of two matrices, $$A = LU,$$ where $L$ is a lower triangular matrix with ones on the diagonal and $U$ is an upper triangular matrix. The matrix $U$ is what you would obtain by applying forward elimination to $A$ without pivoting.

For example, if $$A=\begin{bmatrix} 2 & -1 & 4\\ 6 & -2 & 10\\ -2 & 3 & -11 \end{bmatrix},$$ then $A=LU$ where $$L=\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ -1 & 2 & 1 \end{bmatrix} \quad \text{and} \quad U = \begin{bmatrix} 2 & -1 & 4 \\ 0 & 1 & -2 \\ 0 & 0 & -3 \end{bmatrix}.$$

In [3]:
A = np.array([[  2,  -1,   4],
              [  6,  -2,  10],
              [ -2,   3, -11]])
L = np.array([[1,  0, 0],
              [3,  1, 0],
              [-1, 2, 1]])
U = np.array([[2, -1,  4],
              [0,  1, -2],
              [0,  0, -3]])
(A == L@U).all()
Out[3]:
True

Why is it useful?

We will see how knowing that $A=LU$ can help you solve the equation $Ax=b$.

Observe that solving $Ax=b$ is the same as solving $LUx = b$. First consider the equation $Ly = b$. We can solve this for $y$ in a straightforward way (by forward substitution) since $L$ is lower triangular with ones on the diagonal.

Once we've found $y$, we need to find $x$ to satisfy $Ux = y$. Since $U$ is upper triangular, this can be solved by back substitution. Once we find $x$, we are done since $$A x = LUx = L y = b.$$

Note that once we find the factorization $A=LU$, we can use this method to solve $Ax=b$ in time linear in the number of entries of $A$. This is the same amount of time it takes just to read all the entries of $A$, so (up to a constant factor) this is as fast as you can hope to solve the equation, unless $A$ has a special form you can take advantage of.
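
As a concrete illustration of this two-stage solve, here is a minimal sketch using the $L$ and $U$ defined above together with scipy's `linalg.solve_triangular` (already available from the import earlier); the right-hand side `b_demo` is just an example vector chosen for this check.

In [ ]:
b_demo = np.array([1., 2., 3.])
# forward substitution: solve L y = b
y = linalg.solve_triangular(L, b_demo, lower=True, unit_diagonal=True)
# back substitution: solve U x = y
x = linalg.solve_triangular(U, y)   # upper triangular is the default
np.allclose(A @ x, b_demo)          # x solves the original system A x = b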

What about pivoting?

LU factorization doesn't work if a pivot entry is zero, and it can lead to large numerical errors if a pivot entry is very small. The solution, as with Gaussian elimination, is to allow the algorithm to swap rows. Keeping track of the row swaps results in a different factorization, $$A = PLU$$ where $P$ is a permutation matrix and $L$ and $U$ are as before.
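
For instance (jumping ahead slightly to scipy's `lu` function, which we discuss below), the matrix `A0` in the sketch here has a zero in the first pivot position, so no factorization $A_0=LU$ with unit lower triangular $L$ exists; scipy handles this by returning a permutation that swaps the two rows. The name `A0` is just chosen for this example.

In [ ]:
A0 = np.array([[0., 1.],
               [1., 2.]])
P0, L0, U0 = linalg.lu(A0)
print(P0)                            # the permutation swaps the two rows
print(np.allclose(A0, P0 @ L0 @ U0)) # still a valid factorization of A0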

A permutation matrix is a matrix $P$ with exactly one $1$ in every row and column, and with every other entry zero. For example $$P=\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}.$$ It is called a permutation matrix because multiplying a column vector by it rearranges the entries of the vector. For example, the matrix above satisfies $$P \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4\end{bmatrix} = \begin{bmatrix} 2 \\ 4 \\ 1 \\ 3\end{bmatrix}.$$

Note that for a permutation matrix $P$, we have $$P^{-1}=P^T,$$ where $P^T$ denotes the transpose of $P$.
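
We can verify both facts numerically for the matrix above (this is just an illustrative check; the names `P_example` and `v` are chosen here):

In [ ]:
P_example = np.array([[0, 1, 0, 0],
                      [0, 0, 0, 1],
                      [1, 0, 0, 0],
                      [0, 0, 1, 0]])
v = np.array([1, 2, 3, 4])
print(P_example @ v)                                    # the rearranged vector [2 4 1 3]
print(np.allclose(P_example.T @ P_example, np.eye(4)))  # P^T P = I, so P^{-1} = P^T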

Using scipy

The scipy package has a linear algebra module called linalg. This module has an LU factorization function called lu, which actually computes a $PLU$ factorization. We will demonstrate its use on the matrix $A$ defined above.

In [4]:
A
Out[4]:
array([[  2,  -1,   4],
       [  6,  -2,  10],
       [ -2,   3, -11]])
In [5]:
P, L, U = linalg.lu(A)

Here we check that it really is a factorization.

In [6]:
(np.abs(A - P@L@U) < 10**-8).all()
Out[6]:
True

Now let's look at the individual matrices. First $P$:

In [7]:
P
Out[7]:
array([[0., 0., 1.],
       [1., 0., 0.],
       [0., 1., 0.]])

We can see that $P$ is a permutation matrix. Now $L$:

In [8]:
L
Out[8]:
array([[ 1.        ,  0.        ,  0.        ],
       [-0.33333333,  1.        ,  0.        ],
       [ 0.33333333, -0.14285714,  1.        ]])

Observe that $L$ is lower triangular with ones on the diagonal. Now look at $U$.

In [9]:
U
Out[9]:
array([[ 6.        , -2.        , 10.        ],
       [ 0.        ,  2.33333333, -7.66666667],
       [ 0.        ,  0.        , -0.42857143]])

Observe that $U$ is upper triangular.

Solving $Ax=b$ using PLU Factorization

We'll assume $A$ is an $n \times n$ matrix for the discussion below. If we have the PLU factorization $A=PLU$, then we can solve $Ax=b$ in a series of steps. Our equation becomes $$PLUx = b.$$ Assuming $P$, $L$ and $U$ are invertible, we can write $$x = U^{-1}\big(L^{-1}(P^{-1}b)\big).$$ So first we find $z=P^{-1} b$, then $y=L^{-1}z$, and finally $x = U^{-1}y$.

(Remark: $P$ and $L$ are always invertible. The matrix $U$ is invertible as long as its diagonal entries are all non-zero.)
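
As a quick check on the remark, we can confirm that the $U$ computed by scipy above has no zero diagonal entries (the tolerance below is an arbitrary choice for this check, not anything scipy prescribes):

In [ ]:
# U is invertible exactly when none of its diagonal entries are (numerically) zero
np.all(np.abs(np.diag(U)) > 1e-12)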

Step 1: The permutation matrix.

First we solve $Pz = b$ for $z$. Recalling that $P^{-1}=P^T$, we can do this:

In [10]:
def permutation_solve(P, b):
    # Since P is a permutation matrix, P^{-1} = P^T, so z = P^T b solves P z = b.
    Pt = P.transpose()
    return Pt @ b
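
To see it in action, we can check that the output really satisfies $Pz=b$ for the permutation matrix $P$ computed by scipy above; the right-hand side `b_example` is just an example vector for this check.

In [ ]:
b_example = np.array([1., 2., 3.])
z = permutation_solve(P, b_example)
np.allclose(P @ z, b_example)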

Step 2: The lower triangular matrix

Now we want to solve $Ly = z$ for $y$. Expanding this out, we want to solve: $$\begin{bmatrix} 1 & 0 & \ldots \\ l_{1,0} & 1 & 0 & \ldots \\ l_{2,0} & l_{2,1} & 1 & 0 & \ldots \\ \vdots & \vdots & \ddots & \ddots & \ddots \\ l_{n-1,0} & l_{n-1,1} & \dots & l_{n-1,n-2} & 1 \end{bmatrix} \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{n-1}\end{bmatrix} = \begin{bmatrix} z_0 \\ z_1 \\ z_2 \\ \vdots \\ z_{n-1}\end{bmatrix}. $$

  • The first row tells us that $y_0=z_0$.
  • The second row tells us that $y_1 = z_1 - l_{1,0} y_0$.
  • The third row tells us that $y_2 = z_2 - ( l_{2,0} y_0 + l_{2,1} y_1)$.
  • In general, we can write $y_j$ as $z_j$ minus a dot product, $$y_j = z_j - \begin{bmatrix} l_{j,0} \\ \vdots \\ l_{j,j-1} \end{bmatrix} \cdot \begin{bmatrix} y_0 \\ \vdots \\ y_{j-1} \end{bmatrix}.$$
In [11]:
def lower_triangular_solve(L, z):
    # Solve L y = z by forward substitution, assuming L is lower
    # triangular with ones on the diagonal.
    n = len(z)
    y = np.zeros(n)
    y[0] = z[0]
    for j in range(1,n):
        y[j] = z[j] - L[j,:j].dot(y[:j])
    return y
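
As an optional sanity check, we can compare the result against scipy's built-in triangular solver on an arbitrary example vector; the keywords `lower=True` and `unit_diagonal=True` describe the shape of $L$.

In [ ]:
z_example = np.array([1., 2., 3.])
y_ours = lower_triangular_solve(L, z_example)
y_scipy = linalg.solve_triangular(L, z_example, lower=True, unit_diagonal=True)
np.allclose(y_ours, y_scipy)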

Step 3: The upper triangular matrix

Now we solve $Ux = y$ for $x$. This is the same as the back substitution step in Gaussian elimination, so we will not detail how it works. (See the Gaussian elimination notebook.)

In [12]:
def upper_triangular_solve(U, y):
    # Solve U x = y by back substitution, assuming U is upper
    # triangular with non-zero diagonal entries.
    n = len(y)
    x = np.zeros(n)
    x[n-1] = y[n-1] / U[n-1,n-1]
    for i in range(n-2, -1, -1):
        x[i] = (y[i] - U[i,i+1:].dot(x[i+1:]))/U[i,i]
    return x
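
Again, an optional sanity check against `linalg.solve_triangular`, which treats its first argument as upper triangular by default; the example vector is arbitrary.

In [ ]:
y_example = np.array([1., 2., 3.])
x_ours = upper_triangular_solve(U, y_example)
x_scipy = linalg.solve_triangular(U, y_example)   # upper triangular by default
np.allclose(x_ours, x_scipy)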

Step 4: Putting the steps together

We want to find $x$ such that $Ax=b$. We know that $A=PLU$. We can find $z$ such that $Pz=b$. We can find $y$ such that $Ly=z$. We can find $x$ such that $Ux=y$. Then we have $$Ax = PLUx = PL y = Pz = b.$$

In [13]:
def plu_solve(P, L, U, b):
    # Solve A x = b given the factorization A = P L U.
    z = permutation_solve(P, b)
    y = lower_triangular_solve(L, z)
    x = upper_triangular_solve(U, y)
    return x

We will check this with our matrix $A=PLU$ from above. Recall $A$ is as below.

In [14]:
A
Out[14]:
array([[  2,  -1,   4],
       [  6,  -2,  10],
       [ -2,   3, -11]])

Let $b=\begin{bmatrix} 1 \\ 2 \\ 3\end{bmatrix}.$

In [15]:
b = np.array([1,2,3])
b
Out[15]:
array([1, 2, 3])

Let's check that our function works. We use our above method to solve for $x$.

In [16]:
x = plu_solve(P,L,U,b)
x
Out[16]:
array([ 2., -5., -2.])

To see that it works, we compute $Ax$ and check that it matches $b$ (up to round-off error).

In [17]:
A @ x
Out[17]:
array([1., 2., 3.])
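
For comparison, scipy can also bundle these steps for us: `linalg.lu_factor` computes the factorization in a packed form, and `linalg.lu_solve` then performs the permutation and the two triangular solves for any right-hand side. A minimal sketch reusing $A$ and $b$ from above:

In [ ]:
lu, piv = linalg.lu_factor(A)          # packed PLU factorization of A
x_scipy = linalg.lu_solve((lu, piv), b)
np.allclose(A @ x_scipy, b)

This is the approach to prefer in practice, since the factorization is computed once and can be reused for many right-hand sides, exactly as described at the start of this notebook.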