Solving Linear Systems (NumPy)
Most linear algebra courses start by considering how to solve a system of linear equations.
In matrix notation, this linear system of \(n\) equations in \(n\) unknowns (\(x_0, \ldots, x_{n-1}\)) can be written as
$$A \vec{x} = \vec{b}$$
where
$$A = \begin{bmatrix} a_{0,0} & a_{0,1} & \cdots & a_{0,n-1} \\ a_{1,0} & a_{1,1} & \cdots & a_{1,n-1} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n-1,0} & a_{n-1,1} & \cdots & a_{n-1,n-1} \end{bmatrix}, \quad \vec{x} = \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{bmatrix}, \quad \vec{b} = \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_{n-1} \end{bmatrix}.$$
We'll look at how to use SciPy to numerically solve linear systems corresponding to square matrices. We'll also look at how to implement row operations (Gaussian elimination).
First we import the necessary libraries.
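A minimal version might look like this, assuming the conventional aliases np and la:

import numpy as np         # arrays and matrix operations
import scipy.linalg as la  # linear algebra routines (solve, null_space, ...)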
scipy.linalg.solve
Let's start with a 3x3 linear system:
Verify (by substitution) that \(\vec{x} = [0,2,-1]\) is a solution.
To solve a (square) linear system in Python we use the scipy.linalg.solve function.

- \(A\) must be a nonsingular square matrix to use solve.
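Here is a sketch with a hypothetical \(3\times 3\) system (the matrix below is our own stand-in, chosen so that \(\vec{x} = [0, 2, -1]\) is the solution):

import numpy as np
import scipy.linalg as la

# A hypothetical 3x3 system with solution x = [0, 2, -1]
A = np.array([[1, 1, 1],
              [2, 0, -1],
              [0, 1, 2]])
b = np.array([1, 1, 0])

x = la.solve(A, b)
print(x)  # [ 0.  2. -1.]

# Verify by substitution: A @ x should reproduce b
print(np.allclose(A @ x, b))  # True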
If \(\vec{b}\) is a 1D array, then solve returns the solution as a 1D array. However, if \(\vec{b}\) is a 2D array (for example, a column vector of shape \((n,1)\)), then solve returns the solution as a 2D array, as the following shows.
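A short sketch of the shape behavior, reusing the stand-in system from above:

import numpy as np
import scipy.linalg as la

A = np.array([[1, 1, 1],
              [2, 0, -1],
              [0, 1, 2]])

b1 = np.array([1, 1, 0])        # 1D right-hand side
b2 = np.array([[1], [1], [0]])  # 2D column vector

print(la.solve(A, b1).shape)  # (3,)   -- solution returned as a 1D array
print(la.solve(A, b2).shape)  # (3, 1) -- solution returned as a 2D column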
Note

scipy.linalg.solve requires the matrix to be square and invertible (\(\operatorname{det}(A) \ne 0\)).
Example
Consider another example: $$ 2x_1 + 3x_2 = 11 $$ $$ 3x_1 + 6x_2 = 7 $$
We can solve it with scipy.linalg.solve and verify the answer is correct by checking that \(A\vec{x} = \vec{b}\).
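In code (a sketch, assuming the imports above):

import numpy as np
import scipy.linalg as la

A = np.array([[2, 3],
              [3, 6]])
b = np.array([11, 7])

x = la.solve(A, b)
print(x)  # [15.         -6.33333333], i.e. x1 = 15 and x2 = -19/3

# Check the solution: A @ x should equal b
print(np.allclose(A @ x, b))  # True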
Example
Consider two systems of linear equations with the same coefficient matrix: $$ 2x_1 + 3x_2 = 11 \quad \quad \quad 2x_1 + 3x_2 = -1$$ $$ 3x_1 + 6x_2 = 7 \quad \quad \quad 3x_1 + 6x_2 = -3$$ Solving these two systems individually is equivalent to solving the single matrix equation
$$AX = B$$
where
$$A = \begin{bmatrix} 2 & 3 \\ 3 & 6 \end{bmatrix}, \quad B = \begin{bmatrix} 11 & -1 \\ 7 & -3 \end{bmatrix},$$
and each column of the unknown matrix \(X\) is the solution of the corresponding system.
We can use the solve command where b is a 2D array whose columns are the right-hand sides.
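A sketch of solving both systems in one call, where each column of B is one right-hand side:

import numpy as np
import scipy.linalg as la

A = np.array([[2, 3],
              [3, 6]])
B = np.array([[11, -1],
              [ 7, -3]])

X = la.solve(A, B)
print(X)
# Column 0 solves the first system: [15, -19/3]
# Column 1 solves the second system: [1, -1]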
Gaussian Elimination
The general procedure for solving a system of linear equations is Gaussian elimination. The goal is to apply row operations to "eliminate" all the variables except for one in each equation (if possible); the resulting linear system has the same set of solutions as the original, but is easier to solve. We generally do this on the matrix form of the system, and call the result the reduced row echelon form (rref) of the matrix. The elementary row operations we use are:
Elementary Row Operations
- Replace row \(j\) with \(k\) times row \(i\) added to row \(j\). \(\quad R_j \leftarrow kR_i +R_j\)
- Multiply row \(i\) by a scalar \(k\). \(\quad R_i \leftarrow kR_i\)
- Swap rows \(i\) and \(j\). \(\quad R_i \leftrightarrow R_j\)
For example, consider the matrix
$$A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 1 & -1 \\ 0 & -1 & 2 \end{bmatrix}.$$
Since we are working in Python we will index rows starting at \(0\). In other words, an \(n\times m\) matrix has rows \(R_0\) to \(R_{n-1}\). Here are the row operations taking it to rref:
$$R_0 \leftrightarrow R_1, \quad R_1 \leftarrow -2R_0 + R_1, \quad R_2 \leftarrow -R_1 + R_2, \quad R_0 \leftarrow R_1 + R_0, \quad R_1 \leftarrow -R_1.$$
NumPy (and SciPy) do not have row operations implemented. Not to worry, we can implement them ourselves. To make the implementation simpler we'll use elementary matrices and matrix multiplication. The main idea is as follows: to do a row operation on a matrix \(A\) (such as a row swap), we can do the row operation on the identity matrix first, then multiply this new matrix on the left of \(A\). This produces the same matrix as if we had done the row operation directly to \(A\).
For example, suppose we want to perform a row operation, say the swap \(R_0 \leftrightarrow R_1\), as a multiplication. First apply the row operation to the identity matrix:
$$I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \rightarrow E = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
Take this new matrix \(E\) (called an elementary matrix) and multiply it on the left of the matrix you want to perform the row operation on:
$$EA = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 1 & 0 \\ 1 & 1 & -1 \\ 0 & -1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 1 & 0 \\ 0 & -1 & 2 \end{bmatrix}.$$
Work through the product by hand to see how it performs the row operation. Row operations can therefore be performed by multiplying by elementary matrices.
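For instance, the swap \(R_0 \leftrightarrow R_1\) above can be done in code as follows (a minimal sketch, assuming NumPy is imported as np):

import numpy as np

A = np.array([[2, 1, 0],
              [1, 1, -1],
              [0, -1, 2]])

# Apply the row operation to the identity matrix to build E
E = np.eye(3)
E[[0, 1]] = E[[1, 0]]  # swap rows 0 and 1 of the identity

# Multiplying E on the left performs the swap on A
print(E @ A)
# [[ 1.  1. -1.]
#  [ 2.  1.  0.]
#  [ 0. -1.  2.]]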
The following is an implementation of the three elementary row operations. Notice that the main body of each function performs the row operation on the identity matrix to create the matrix \(E\), and then the return statement does the matrix multiplication \(E A\), which performs the row operation on the original matrix.
def add_row(A, k, i, j):
    "row j <- k times row i added to row j in matrix A."
    n = A.shape[0]
    E = np.eye(n)
    if i == j:
        # adding k times a row to itself just scales that row by k + 1
        E[j, j] = k + 1
    else:
        E[j, i] = k
    return E @ A

def scale_row(A, k, i):
    "row i <- k times row i in matrix A"
    n = A.shape[0]
    E = np.eye(n)
    E[i, i] = k
    return E @ A

def swap_row(A, i, j):
    "row i <-> row j in matrix A"
    n = A.shape[0]
    E = np.eye(n)
    if i != j:
        E[i, i], E[i, j], E[j, i], E[j, j] = 0, 1, 1, 0
    return E @ A
Example
We'll redo the example above, this time using our Python functions.
A = np.array([[2,1,0],[1,1,-1],[0,-1,2]])
print(A)
print("Swap rows 0 and 1:")
A1 = swap_row(A,0,1)
print(A1)
print("Add -2 times row 0 to row 1:")
A2 = add_row(A1,-2,0,1)
print(A2)
print("Add -1 times row 1 to row 2:")
A3 = add_row(A2,-1,1,2)
print(A3)
print("Add 1 times row 1 to row 0:")
A4 = add_row(A3,1,1,0)
print(A4)
print("Multiply row 1 by -1:")
A5 = scale_row(A4,-1,1)
print(A5)
[[ 2 1 0]
[ 1 1 -1]
[ 0 -1 2]]
Swap rows 0 and 1:
[[ 1. 1. -1.]
[ 2. 1. 0.]
[ 0. -1. 2.]]
Add -2 times row 0 to row 1:
[[ 1. 1. -1.]
[ 0. -1. 2.]
[ 0. -1. 2.]]
Add -1 times row 1 to row 2:
[[ 1. 1. -1.]
[ 0. -1. 2.]
[ 0. 0. 0.]]
Add 1 times row 1 to row 0:
[[ 1. 0. 1.]
[ 0. -1. 2.]
[ 0. 0. 0.]]
Multiply row 1 by -1:
[[ 1. 0. 1.]
[ 0. 1. -2.]
[ 0. 0. 0.]]
Solving a linear system with Gaussian Elimination
We'll consider the following linear system in the four unknowns \(x_1, x_2, x_3, x_4\) (note that \(x_2\) does not appear, so its coefficients are all zero); it has more unknowns than equations:
$$2x_1 + 4x_3 + 4x_4 = 14$$
$$-x_1 - x_3 = -3$$
$$-3x_1 + 6x_4 = 3$$
This means the coefficient matrix will not be square, so in this case we couldn't use solve even if we wanted to. Write it as an augmented matrix:
$$\left[\begin{array}{cccc|c} 2 & 0 & 4 & 4 & 14 \\ -1 & 0 & -1 & 0 & -3 \\ -3 & 0 & 0 & 6 & 3 \end{array}\right]$$
Use row operations to take it to rref.
A = np.array([[2,0,4,4],[-1,0,-1,0],[-3,0,0,6]])
b = np.array([14,-3,3]).reshape(3,1)
aug_mat = np.hstack([A,b])  # form the augmented matrix [A|b]
print(aug_mat)
np.hstack allows us to horizontally stack two arrays to form the augmented matrix.
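Here is one possible sequence of row operations, using the functions defined earlier, that takes the augmented matrix to rref:

M = scale_row(aug_mat, 1/2, 0)  # R0 <- (1/2)R0
M = add_row(M, 1, 0, 1)         # R1 <- R0 + R1
M = add_row(M, 3, 0, 2)         # R2 <- 3R0 + R2
M = add_row(M, -6, 1, 2)        # R2 <- -6R1 + R2
M = add_row(M, -2, 1, 0)        # R0 <- -2R1 + R0
print(M)
# [[ 1.  0.  0. -2. -1.]
#  [ 0.  0.  1.  2.  4.]
#  [ 0.  0.  0.  0.  0.]]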
At this point we have reduced the matrix to rref. We see there are two free variables (\(x_2\) and \(x_4\)). The general solution is
$$x_1 = -1 + 2x_4, \qquad x_3 = 4 - 2x_4, \qquad x_2, x_4 \text{ free},$$
or, in vector form with parameters \(s = x_2\) and \(t = x_4\),
$$\vec{x} = \begin{bmatrix} -1 \\ 0 \\ 4 \\ 0 \end{bmatrix} + s \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} 2 \\ 0 \\ -2 \\ 1 \end{bmatrix}.$$
rref
NumPy (and SciPy) do not have a built-in function to put a matrix in reduced row echelon form (rref). However, we can implement our own. To put a matrix into reduced row echelon form we define the function rref below. It returns a tuple of two elements: the first is the reduced row echelon form of the matrix, and the second is a tuple of indices of the pivot columns. The length of this tuple is the rank of the matrix.
def rref(A, tol=1.0e-12):
    '''input: a matrix (2D array)
    output: rref form of the matrix, and a tuple of pivot columns
    '''
    A = np.array(A, dtype=np.float64)  # work on a float copy so division and in-place updates are safe
    m, n = A.shape
    i, j = 0, 0
    jb = []  # list of pivot columns
    while i < m and j < n:
        # Find value and index of largest element in the remainder of column j
        k = np.argmax(np.abs(A[i:m, j])) + i
        p = np.abs(A[k, j])
        if p <= tol:
            # The column is negligible, zero it out
            A[i:m, j] = 0.0
            j += 1
        else:
            # Remember the column index
            jb.append(j)
            if i != k:
                # Swap the i-th and k-th rows
                A[[i, k], j:n] = A[[k, i], j:n]
            # Divide the pivot row i by the pivot element A[i, j]
            A[i, j:n] = A[i, j:n] / A[i, j]
            # Subtract multiples of the pivot row from all the other rows
            for k in range(m):
                if k != i:
                    A[k, j:n] -= A[k, j] * A[i, j:n]
            i += 1
            j += 1
    # Finished
    return A, tuple(jb)
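As a quick check, here is rref applied to the singular matrix from the row operations example above:

A = np.array([[2, 1, 0],
              [1, 1, -1],
              [0, -1, 2]])

R, pivots = rref(A)
print(R)
# [[ 1.  0.  1.]
#  [ 0.  1. -2.]
#  [ 0.  0.  0.]]
print(pivots)       # (0, 1)
print(len(pivots))  # 2, the rank of the matrix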
Example
Compute the rref of the matrix
Example
Compute the rref of the matrix
Example
For the linear system, find the rref of the augmented matrix.
Computing an Inverse with Gaussian Elimination
Recall, if we have an invertible matrix \(A\), then we can compute the inverse using Gaussian elimination as follows: augment \(A\) with the identity matrix, then row reduce until the left block is the identity,
$$[\, A \mid I \,] \longrightarrow [\, I \mid B \,].$$
Matrix \(B\) is the inverse of \(A\).
Note
If you are wondering why the inverse matrix can be computed in this way, it comes back to elementary matrices. Suppose \(n\) row operations are applied to take \(A\) to the identity \(I\). Let \(E_1, E_2, \ldots, E_n\) be the \(n\) elementary matrices corresponding to these row operations. Then, since row operations can be done by multiplying by elementary matrices, it follows that
$$E_n \cdots E_2 E_1 A = I.$$
Therefore, multiplying by \(A^{-1}\) on the right of each side:
$$E_n \cdots E_2 E_1 = E_n \cdots E_2 E_1 I = A^{-1}.$$
This means \(A^{-1}\) is obtained by doing the row operations on \(I\).
Example
Let's find the inverse of
Therefore, we have
We can verify we found the inverse by checking that \(A A^{-1} = I\).
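Here is a sketch of the procedure using our rref function, with a \(2\times 2\) matrix of our own choosing:

A = np.array([[2, 1],
              [1, 1]])

# Augment A with the identity and row reduce: [A | I] -> [I | A^{-1}]
aug, pivots = rref(np.hstack([A, np.eye(2)]))
A_inv = aug[:, 2:]
print(A_inv)
# [[ 1. -1.]
#  [-1.  2.]]

# Verify: A @ A_inv should be the identity
print(np.allclose(A @ A_inv, np.eye(2)))  # True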
Nullspace
The nullspace of a matrix can be computed with scipy.linalg.null_space. This returns an orthonormal basis for the nullspace.
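A short sketch, reusing the singular matrix from the row operations example:

import numpy as np
import scipy.linalg as la

A = np.array([[2, 1, 0],
              [1, 1, -1],
              [0, -1, 2]])

N = la.null_space(A)
print(N)  # one column: a unit vector proportional to [-1, 2, 1] (sign may differ)

# Every column of N satisfies A @ x = 0
print(np.allclose(A @ N, 0))  # True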