Matrix Decompositions (NumPy)
Eigen Decomposition (diagonalization)
An $n \times n$ matrix $A$ is diagonalizable if it can be written as $A = PDP^{-1}$ for some invertible matrix $P$ and diagonal matrix $D$. A beautiful result in linear algebra is that such a decomposition is possible if and only if $A$ has $n$ linearly independent eigenvectors. Therefore, to use SciPy to diagonalize a matrix we first compute the eigenvalues and eigenvectors with scipy.linalg.eig. This returns the 1D array of eigenvalues, which we can use to create the diagonal matrix $D$ with numpy.diag. We also get the matrix of eigenvectors from the .eig call, and this is the matrix $P$.
Example
Diagonalize the matrix $C = \begin{bmatrix} -3 & 5 & -5 \\ -7 & 9 & -5 \\ -7 & 7 & -3 \end{bmatrix}$.
import numpy as np
import scipy.linalg as la

C = np.array([[-3,5,-5],[-7,9,-5],[-7,7,-3]])
evals, evecs = la.eig(C)
D = np.diag(evals.real) # (1)
P = evecs # (2)
Pinv = la.inv(P)
print(D)
print(P)
1. Create the diagonal matrix of eigenvalues. .real just strips off the imaginary parts since the eigenvalues are real numbers in this case.
2. Create the matrix of eigenvectors $P$.
[[-3. 0. 0.]
[ 0. 4. 0.]
[ 0. 0. 2.]]
[[ 5.77350269e-01 -8.33283831e-16 -7.07106781e-01]
[ 5.77350269e-01 -7.07106781e-01 -7.07106781e-01]
[ 5.77350269e-01 -7.07106781e-01 3.33066907e-16]]
Let's check if $PDP^{-1} = C$.
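Continuing from the cell above, one way to run this check is with np.allclose:

```python
import numpy as np
import scipy.linalg as la

C = np.array([[-3,5,-5],[-7,9,-5],[-7,7,-3]])
evals, evecs = la.eig(C)
D = np.diag(evals.real)
P = evecs
Pinv = la.inv(P)

# P D P^{-1} should reproduce C up to floating point error
print(np.allclose(P @ D @ Pinv, C))  # True
```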
We can also use this idea to create a matrix with any eigenvalues and eigenvectors we want. For example, suppose we would like a $3 \times 3$ matrix with eigenvalues $\lambda_1 = -2$, $\lambda_2 = 3$, $\lambda_3 = 1$ and corresponding eigenvectors $(0,1,2)$, $(2,1,0)$, $(1,0,1)$. An example of such a matrix would be $A = \begin{bmatrix} 2 & 2 & -1 \\ 5/4 & 1/2 & -5/4 \\ 3/2 & -3 & -1/2 \end{bmatrix}$. In this case, $A = PDP^{-1}$, where $P$ has the chosen eigenvectors as its columns and $D$ is the diagonal matrix of the chosen eigenvalues.
Example
Let's check the matrix above has the desired eigenvalues and eigenvectors
A = np.array([[2,2,-1],[5/4,1/2,-5/4],[3/2,-3,-1/2]])
evals, evecs = la.eig(A)
D = np.diag(evals.real)
P = evecs
Pinv = la.inv(P)
print(D)
print(P)
[[-2. 0. 0.]
[ 0. 3. 0.]
[ 0. 0. 1.]]
[[-1.68199171e-16 8.94427191e-01 7.07106781e-01]
[ 4.47213595e-01 4.47213595e-01 -5.76888806e-16]
[ 8.94427191e-01 -2.71947991e-16 7.07106781e-01]]
We can see that the columns of $P$ are scaled versions of the eigenvectors $(0,1,2)$, $(2,1,0)$, $(1,0,1)$ we chose (an eigenvector is only determined up to a scalar multiple).
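The construction itself can be sketched directly, using the eigenvalues $-2, 3, 1$ and eigenvectors proportional to the columns of $P$ printed above:

```python
import numpy as np
import scipy.linalg as la

# desired eigenvectors as the columns of P, desired eigenvalues on the diagonal of D
P = np.array([[0.,2.,1.],[1.,1.,0.],[2.,0.,1.]])
D = np.diag([-2.,3.,1.])

A = P @ D @ la.inv(P)  # A now has exactly these eigenvalues and eigenvectors
print(A)
```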
Diagonalization of Symmetric Matrices
If $A$ is a symmetric matrix ($A^T = A$), then it can be diagonalized as $A = PDP^T$, where $P$ is an orthogonal matrix ($P^{-1} = P^T$) whose columns are orthonormal eigenvectors of $A$, and $D$ is the diagonal matrix of its eigenvalues. This is called orthogonal diagonalization.
Let's summarize some important theorems about symmetric matrices:
- Eigenvectors corresponding to different eigenvalues of a symmetric matrix are orthogonal.
- Principal Axis Theorem: a matrix is symmetric if and only if it is orthogonally diagonalizable.
- All the eigenvalues of a symmetric matrix are real numbers.
Example
Orthogonally diagonalize the symmetric matrix $A = \begin{bmatrix} 5 & -4 & -2 \\ -4 & 5 & -2 \\ -2 & -2 & 8 \end{bmatrix}$.
Solution: First we summarize how we would do this one by hand (check the results for yourself). The eigenvalues are $\lambda = 9$, with algebraic and geometric multiplicity two, and $\lambda = 0$. In other words, the eigenspace for $\lambda = 9$ is two-dimensional. If we formed the matrix $P$ from an orthonormal basis of eigenvectors, then $P$ would be orthogonal and $A = PDP^T$.
Let's now do the calculations.
A = np.array([[5,-4,-2],[-4,5,-2],[-2,-2,8]])
evals, evecs = la.eig(A)
evals = np.round(evals.real,2) # we know the eigenvalues are real numbers
print(evals)
print(evecs)
[ 9. -0. 9.]
[[ 0.74535599 -0.66666667 0. ]
[-0.59628479 -0.66666667 -0.4472136 ]
[-0.2981424 -0.33333333 0.89442719]]
Check if the matrix of eigenvectors is orthogonal by computing $P^TP$, rounding the result to 12 decimal places to get rid of any rounding errors:
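Filling in that check as a runnable cell (continuing with the evecs computed above):

```python
import numpy as np
import scipy.linalg as la

A = np.array([[5,-4,-2],[-4,5,-2],[-2,-2,8]])
evals, evecs = la.eig(A)

# P^T P should be the identity if the columns are orthonormal;
# round to 12 decimal places to remove rounding errors
print(np.round(evecs.T @ evecs, 12))
```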
Note
Luckily, scipy.linalg.eig
already returned an orthonormal collection of eigenvectors.
It is not guaranteed to do this in general; we just lucked out here. However, for symmetric matrices we should use scipy.linalg.eigh to construct an orthonormal collection of eigenvectors. If we use .eig and the eigenvectors for a specific eigenvalue aren't orthogonal, then we have a bit more work to do - see the next example below for how to proceed in this case.
Since the matrix of eigenvectors is orthogonal, it is the matrix $P$ in the orthogonal diagonalization $A = PDP^T$.
Let's order the eigenvalues so the largest ones come first (i.e. in descending order).
idx = evals.argsort()[::-1] # (1)
print('sorted indices: ', idx)
evals = evals[idx] # resort evals by idx
evecs = evecs[:,idx] # resort evecs by idx
print(evals)
print(evecs)
1. This looks at what it would take to sort evals in descending order (indicated by the [::-1], which reverses the ascending argsort), and then stores the index order which would put evals in descending order. We can then use this index list to sort both evals and evecs in the same way.
sorted indices: [1 0 2]
[ 9. 9. -0.]
[[ 0.74535599 0. -0.66666667]
[-0.59628479 -0.4472136 -0.66666667]
[-0.2981424 0.89442719 -0.33333333]]
Now we have evals and evecs sorted.
Let's set $P$ equal to the sorted evecs and check that $P^TP = I$:
array([[ 1.00000000e+00, 1.04854397e-16, 8.94346325e-17],
[ 1.04854397e-16, 1.00000000e+00, -3.08395285e-18],
[ 8.94346325e-17, -3.08395285e-18, 1.00000000e+00]])
Which is essentially the identity matrix (up to rounding error).
array([[ 9.00000000e+00, -7.76318940e-17, -5.93963572e-16],
[-1.10732349e-16, 9.00000000e+00, -1.01092166e-15],
[-2.48253415e-16, -8.27511384e-16, 7.40148683e-17]])
Which is essentially the diagonal matrix of eigenvalues (up to rounding error). We can check this using the np.allclose function, which checks whether corresponding entries in two arrays are equal (up to a tolerance).
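The whole check can be assembled into one cell. Here scipy.linalg.eigh is used instead of the .eig call above (a deliberate substitution: eigh guarantees an orthonormal collection of eigenvectors for a symmetric matrix):

```python
import numpy as np
import scipy.linalg as la

A = np.array([[5,-4,-2],[-4,5,-2],[-2,-2,8]])
evals, evecs = la.eigh(A)          # eigh: orthonormal eigenvectors, ascending eigenvalues

idx = evals.argsort()[::-1]        # reorder into descending order
evals, evecs = evals[idx], evecs[:, idx]

P, D = evecs, np.diag(evals)
print(np.allclose(P.T @ P, np.eye(3)))  # True: P is orthogonal
print(np.allclose(P.T @ A @ P, D))      # True: P^T A P is the diagonal matrix of eigenvalues
```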
Now that we've got a few of the subtleties out of the way, let's do another example and focus just on the computations. Again this example has an eigenvalue with algebraic/geometric multiplicity 2. We do the calculations in two ways:

- (preferred) Since this is a symmetric matrix, we compute eigenvectors using scipy.linalg.eigh, which returns an orthonormal collection of eigenvectors.
- Using scipy.linalg.eig, Python returns two eigenvectors associated with the repeated eigenvalue, and these vectors are not orthogonal. So we will illustrate how to fix this using scipy.linalg.orth, as well as using gram_schmidt.
Example
Orthogonally diagonalize the symmetric matrix $A = \begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix}$.
Method 1: scipy.linalg.eigh
A = np.array([[2,-1,-1],[-1,2,-1],[-1,-1,2]])
evals, evecs = la.eigh(A)
evals = evals.real
print(evals)
P = evecs
print(P)
[2.22044605e-15 3.00000000e+00 3.00000000e+00]
[[ 0.57735027 0.81649658 0. ]
[ 0.57735027 -0.40824829 -0.70710678]
[ 0.57735027 -0.40824829 0.70710678]]
Therefore, $P$ is orthogonal and $A = PDP^T$ with $D = \operatorname{diag}(0, 3, 3)$.
Method 2: Using scipy.linalg.eig
A = np.array([[2,-1,-1],[-1,2,-1],[-1,-1,2]])
evals, evecs = la.eig(A)
evals = evals.real
print(evals)
P = evecs
Computing $P^TP$ shows this isn't the identity, so the eigenbasis for the repeated eigenvalue $\lambda = 3$ isn't orthogonal.
First sort the eigenvalues and eigenvectors so eigenvalues are in decreasing order.
Method 2.1: Using scipy.linalg.orth:
Now grab the first two eigenvectors (those corresponding to $\lambda = 3$) and replace them with an orthonormal basis for their span using scipy.linalg.orth. The new matrix $P$ is:
[[-0.6731049 -0.46216498 -0.57735027]
[ 0.73679906 -0.35184345 -0.57735027]
[-0.06369416 0.81400843 -0.57735027]]
Let's check that $P^TAP$ is the diagonal matrix of eigenvalues:
array([[ 3.00000000e+00, -5.31972712e-16, 3.42394613e-16],
[-7.64376814e-16, 3.00000000e+00, -3.32289241e-16],
[ 2.59546329e-16, -2.97933590e-16, -6.40987562e-17]])
Which is the diagonal matrix of eigenvalues.
Therefore, $A = PDP^T$ with $D = \operatorname{diag}(3, 3, 0)$.
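Method 2.1 can be assembled into one runnable cell as follows. This is a sketch: the orthonormal basis scipy.linalg.orth returns for the $\lambda = 3$ eigenspace may differ from the matrix shown above by signs or a rotation.

```python
import numpy as np
import scipy.linalg as la

A = np.array([[2,-1,-1],[-1,2,-1],[-1,-1,2]])
evals, evecs = la.eig(A)
evals = evals.real

idx = evals.argsort()[::-1]            # descending order: 3, 3, 0
evals, evecs = evals[idx], evecs[:, idx].real

W = la.orth(evecs[:, :2])              # orthonormal basis for the lambda = 3 eigenspace
P = np.column_stack([W, evecs[:, 2]])  # last column: unit eigenvector for lambda = 0

print(np.round(P.T @ A @ P, 12))       # diag(3, 3, 0) up to rounding
```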
Method 2.2: Using gram_schmidt:
Since the Gram-Schmidt procedure constructs an orthogonal basis by working through the columns of the matrix from left to right, we can apply it directly to the sorted matrix of eigenvectors. Here gram_schmidt is our own built function. See the Gram Schmidt Orthogonalization section for its implementation.
[[ 0.29329423 0.76200076 -0.57735027]
[-0.80655913 -0.12700013 -0.57735027]
[ 0.5132649 -0.63500064 -0.57735027]]
Check if $P^TAP$ is the diagonal matrix of eigenvalues:
array([[3.00000000e+00, 1.48135454e-18, 2.55018705e-16],
[4.20044244e-17, 3.00000000e+00, 1.48298061e-16],
[3.66324758e-16, 2.96096777e-16, 7.39557099e-32]])
This looks like the diagonal matrix of eigenvalues. Let's get rid of the round-off errors to see they really are the same.
Therefore, $A = PDP^T$ with $D = \operatorname{diag}(3, 3, 0)$.
Notice the matrix that we found here for $P$ differs from the one found in Method 2.1: when an eigenvalue is repeated, the orthogonal diagonalization is not unique.
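Method 2.2 can be sketched end to end as well. Since the document's gram_schmidt lives in another section, a minimal stand-in is defined here; the assumption (flagged in the docstring) is that it orthonormalizes columns from left to right:

```python
import numpy as np
import scipy.linalg as la

def gram_schmidt(V):
    """Minimal stand-in for the document's function:
    orthonormalize the columns of V, left to right."""
    U = np.zeros_like(V, dtype=float)
    for j in range(V.shape[1]):
        w = V[:, j] - U[:, :j] @ (U[:, :j].T @ V[:, j])  # subtract projections onto earlier columns
        U[:, j] = w / la.norm(w)
    return U

A = np.array([[2,-1,-1],[-1,2,-1],[-1,-1,2]])
evals, evecs = la.eig(A)
idx = evals.real.argsort()[::-1]       # descending order: 3, 3, 0

P = gram_schmidt(evecs.real[:, idx])
print(np.round(P.T @ A @ P, 12))       # diag(3, 3, 0) up to rounding
```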
Lastly, we'll do a random example. In this case it is likely there will not be repeated eigenvalues, so the matrix of eigenvectors will be the orthogonal matrix we are looking for.
Example
To generate a random symmetric matrix we start with a random $5 \times 5$ integer matrix $B$ and form $A = BB^T$, which is always symmetric.
B = np.random.randint(-3,3,size = (5,5))
A = B@B.T # random symmetric matrix
print(A)
evals, evecs = la.eigh(A)
evals = evals.real
print('\n eigenvalues : \n')
print(evals)
print('\n Matrix P of eigenvectors : \n')
P = evecs
print(P)
[[ 15 -8 -7 13 9]
[ -8 10 6 -10 -3]
[ -7 6 11 -7 -15]
[ 13 -10 -7 16 6]
[ 9 -3 -15 6 27]]
eigenvalues :
[ 0.31505478 1.77815469 4.83218655 21.1541805 50.92042348]
Matrix P of eigenvectors :
[[-0.1758979 0.52779507 0.61679654 -0.32435171 0.4526039 ]
[-0.44664489 -0.39834996 0.63566325 0.38465841 -0.29965944]
[ 0.76371882 -0.03150937 0.4440324 -0.20641406 -0.41948681]
[ 0.03737476 -0.74552798 0.06602194 -0.49640648 0.43819274]
[ 0.43000576 -0.07710293 0.11819728 0.6766149 0.58083721]]
Let's check that $P^TAP$ is the diagonal matrix of eigenvalues:
[[ 0.31505478 0. -0. 0. 0. ]
[ 0. 1.77815469 0. -0. -0. ]
[-0. -0. 4.83218655 -0. -0. ]
[ 0. -0. 0. 21.1541805 -0. ]
[ 0. -0. -0. -0. 50.92042348]]
This is the diagonal matrix of eigenvalues.
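The check itself might look like the following cell. The generator is seeded here for reproducibility (an addition; the original draws a fresh random matrix, so the entries will differ from those above):

```python
import numpy as np
import scipy.linalg as la

rng = np.random.default_rng(0)     # seeded for reproducibility
B = rng.integers(-3, 3, size=(5, 5))
A = B @ B.T                        # random symmetric matrix

evals, evecs = la.eigh(A)
P, D = evecs, np.diag(evals)

print(np.allclose(P.T @ P, np.eye(5)))  # True: P is orthogonal
print(np.allclose(P.T @ A @ P, D))      # True: diagonal matrix of eigenvalues
```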
Jordan Decomposition
Not available in NumPy/SciPy. See notes in the SymPy section for how to do it there.
Singular-Value Decomposition (svd)
The theory of orthogonal diagonalization of symmetric matrices is extremely useful and powerful. SVD extends these ideas to matrices that are not symmetric, not diagonalizable, or perhaps not even square! In short, every matrix has an svd.
Definition of SVD
Let $A$ be an $m \times n$ matrix. A singular value decomposition (svd) of $A$ is a factorization $A = U \Sigma V^T$ where, in the full svd:

- $U$ is an $m \times m$ (square) orthogonal matrix,
- $V$ is an $n \times n$ (square) orthogonal matrix,
- $\Sigma$ is an $m \times n$ (rectangular) diagonal matrix of singular values.

In the condensed svd, with $r = \operatorname{rank}(A)$:

- $U$ is an $m \times r$ (rectangular) column orthogonal matrix,
- $V$ is an $n \times r$ (rectangular) column orthogonal matrix,
- $\Sigma$ is an $r \times r$ (square) diagonal matrix of non-zero singular values.
Key Fact:

- first $r$ columns of $U$: orthonormal basis for the column space of $A$ ($\operatorname{Col} A$)
- last $m-r$ columns of $U$: orthonormal basis for the left nullspace of $A$ ($\operatorname{Null} A^T$)
- first $r$ columns of $V$: orthonormal basis for the row space of $A$ ($\operatorname{Row} A$)
- last $n-r$ columns of $V$: orthonormal basis for the nullspace of $A$ ($\operatorname{Null} A$)

The diagonal (but rectangular) matrix $\Sigma$ has the singular values in descending order down its diagonal and zeros elsewhere.
Computing an SVD
To compute an svd of $A$ by hand:

- the square roots of the non-zero eigenvalues of $A^TA$ are the singular values $\sigma_i$; $\Sigma$ is the diagonal matrix made up of these values in descending order: $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0$
- the columns of $V$ are the orthonormal eigenvectors of $A^TA$, arranged in descending order of their eigenvalues
- the columns of $U$ are the eigenvectors of $AA^T$, arranged in descending order of their eigenvalues. Although, they are usually computed by noticing the first $r$ columns of $U$ are $u_i = \frac{1}{\sigma_i} A v_i$ and the last $m-r$ columns are an orthonormal basis for $\operatorname{Null}(A^T)$.
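As a sketch, these by-hand steps can be mirrored in code. To keep the nullspace bookkeeping out of the way, this assumes a square, full-rank matrix (the example matrix below is an arbitrary invertible one, not from the text); the signs of the columns may differ from what la.svd returns:

```python
import numpy as np
import scipy.linalg as la

A = np.array([[2.,-1.,4.],[4.,-1.,6.],[-1.,-1.,3.]])  # assumed full rank

# eigen-decompose A^T A; eigh returns ascending eigenvalues, so reverse
w, V = la.eigh(A.T @ A)
w, V = w[::-1], V[:, ::-1]

sigma = np.sqrt(w)        # singular values in descending order
U = A @ V / sigma         # u_i = (1/sigma_i) A v_i, broadcast over columns
Sig = np.diag(sigma)

print(np.allclose(U @ Sig @ V.T, A))  # True
```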
The condensed svd tosses away:

- the last $m-r$ columns of $U$ (i.e. an orthonormal basis of $\operatorname{Null}(A^T)$)
- the last $n-r$ columns of $V$ (i.e. an orthonormal basis of $\operatorname{Null}(A)$)
- the last rows and columns of $\Sigma$ (i.e. the zero rows and columns)

since these don't affect the product. Click the "condensed svd" tab above to see the changes to the definitions of $U$, $\Sigma$, and $V$.
Example
The matrix $A$ in this example has rank $r$, and its full svd has the form $A = U \Sigma V^T$, where both the first and third matrices are orthogonal. The condensed svd drops the zero rows and columns of $\Sigma$, along with the corresponding columns of $U$ and $V$.
We use scipy.linalg.svd
to find a decomposition in Python. This function returns a 3-tuple: the matrix $U$, a 1D array of the singular values, and the matrix $V^T$ (already transposed).
Warning
The matrices scipy.linalg.svd returns may be slightly different than you expected: an svd is not unique, so for example the signs of columns of $U$ and $V$ may differ from a hand computation.
Example
Find a singular value decomposition of $A = \begin{bmatrix} 1 & -1 & 3 \\ 3 & 1 & 1 \end{bmatrix}$.
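The output shown below is the raw return value of the call, produced by something like:

```python
import numpy as np
import scipy.linalg as la

A = np.array([[1,-1,3],[3,1,1]])
print(la.svd(A))  # 3-tuple: (U, singular values, Vt)
```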
(array([[-0.70710678, -0.70710678],
[-0.70710678, 0.70710678]]),
array([4. , 2.44948974]),
array([[-7.07106781e-01, 1.20868191e-16, -7.07106781e-01],
[ 5.77350269e-01, 5.77350269e-01, -5.77350269e-01],
[-4.08248290e-01, 8.16496581e-01, 4.08248290e-01]]))
Let's check that this gives a decomposition of $A$, i.e. that $U \Sigma V^T = A$.
U, d, Vt = la.svd(A)
# build sigma
Sig = np.zeros(A.shape) # construct mxn zeros matrix
Sig[:d.size,:d.size] = np.diag(d) # fill diagonal with d
print(U,'\n')
print(Sig,'\n')
print(Vt,'\n')
print(U@Sig@Vt)
[[-0.70710678 -0.70710678]
[-0.70710678 0.70710678]]
[[4. 0. 0. ]
[0. 2.44948974 0. ]]
[[-7.07106781e-01 1.20868191e-16 -7.07106781e-01]
[ 5.77350269e-01 5.77350269e-01 -5.77350269e-01]
[-4.08248290e-01 8.16496581e-01 4.08248290e-01]]
[[ 1. -1. 3.]
[ 3. 1. 1.]]
The product is indeed $A$.
LU Decomposition
The LU decomposition of a matrix $A$ is a factorization $A = PLU$, where $P$ is a permutation matrix, $L$ is a lower triangular matrix with 1's on the diagonal, and $U$ is an upper triangular matrix. A permutation matrix is constructed from the identity matrix by permuting its rows.
The command scipy.linalg.lu
returns a 3-tuple containing the permutation matrix $P$, the lower triangular matrix $L$, and the upper triangular matrix $U$.
Example
Find the LU-decomposition of $A = \begin{bmatrix} 2 & -1 & 4 \\ 4 & -1 & 6 \\ -1 & -1 & 3 \end{bmatrix}$.
(array([[0., 0., 1.],
[1., 0., 0.],
[0., 1., 0.]]),
array([[ 1. , 0. , 0. ],
[-0.25, 1. , 0. ],
[ 0.5 , 0.4 , 1. ]]),
array([[ 4. , -1. , 6. ],
[ 0. , -1.25, 4.5 ],
[ 0. , 0. , -0.8 ]]))
Typically, one would call the function using tuple assignment.
[[0. 0. 1.]
[1. 0. 0.]
[0. 1. 0.]]
[[ 1. 0. 0. ]
[-0.25 1. 0. ]
[ 0.5 0.4 1. ]]
[[ 4. -1. 6. ]
[ 0. -1.25 4.5 ]
[ 0. 0. -0.8 ]]
Let's check the product $PLU$ is indeed $A$.
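Filling in the tuple assignment and the check in one cell (the matrix $A$ here is the one whose factors are shown above):

```python
import numpy as np
import scipy.linalg as la

A = np.array([[2.,-1.,4.],[4.,-1.,6.],[-1.,-1.,3.]])
P, L, U = la.lu(A)  # tuple assignment

print(P, '\n')
print(L, '\n')
print(U, '\n')
print(np.allclose(P @ L @ U, A))  # True
```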
QR Decomposition
The QR decomposition of a matrix $A$ is a factorization $A = QR$, where $Q$ is an orthogonal matrix and $R$ is an upper triangular matrix.
The command scipy.linalg.qr
returns a 2-tuple containing the matrices $Q$ and $R$.
Example
Find the QR-decomposition of $A = \begin{bmatrix} 2 & -1 & 4 \\ 4 & -1 & 6 \\ -1 & -1 & 3 \end{bmatrix}$.
(array([[-0.43643578, 0.38939578, -0.81110711],
[-0.87287156, 0.03539962, 0.48666426],
[ 0.21821789, 0.92039002, 0.32444284]]),
array([[-4.58257569, 1.09108945, -6.32831882],
[ 0. , -1.34518542, 4.53115088],
[ 0. , 0. , 0.64888568]]))
Typically, one would call the function using tuple assignment.
[[-0.43643578 0.38939578 -0.81110711]
[-0.87287156 0.03539962 0.48666426]
[ 0.21821789 0.92039002 0.32444284]]
[[-4.58257569 1.09108945 -6.32831882]
[ 0. -1.34518542 4.53115088]
[ 0. 0. 0.64888568]]
Let's check the product $QR$ is indeed $A$.
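The corresponding tuple assignment and check:

```python
import numpy as np
import scipy.linalg as la

A = np.array([[2.,-1.,4.],[4.,-1.,6.],[-1.,-1.,3.]])
Q, R = la.qr(A)  # tuple assignment

print(np.allclose(Q @ R, A))             # True
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: Q is orthogonal
```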