Matrix Decompositions (SymPy)
Eigen Decomposition (diagonalization)
An \(n \times n\) (square) matrix \(A\) is diagonalizable if it can be decomposed (factored) as \(A = PDP^{-1}\) for some invertible matrix \(P\) and some diagonal matrix \(D\).
A beautiful result in linear algebra is that such a decomposition is possible if and only if \(A\) has \(n\) linearly independent eigenvectors. Moreover, the columns of \(P\) are the eigenvectors of \(A\) and the diagonal of \(D\) contains the corresponding eigenvalues.
To diagonalize a matrix in SymPy use the .diagonalize() method. This returns a tuple (P, D).
Example
Diagonalize the matrix \(A = \left[\begin{matrix} -3 & 5 & -5 \\ -7 & 9 & -5 \\ -7 & 7 & -3 \end{matrix}\right]\).
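A minimal sketch of the computation (the entries of A are taken from the matrix above; imports shown for completeness):

from sympy import Matrix, pprint
A = Matrix([[-3, 5, -5], [-7, 9, -5], [-7, 7, -3]])
P, D = A.diagonalize()  # columns of P are eigenvectors, D holds the eigenvalues
pprint(P)
pprint(D)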
\(\left[\begin{matrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 1 \\ \end{matrix}\right]\)
\(\left[\begin{matrix} -3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \\ \end{matrix}\right]\)
Let's check if \(PDP^{-1} = A\).
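For instance, reusing P and D from above:

pprint(P * D * P.inv())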
\(\left[\begin{matrix} -3 & 5 & -5 \\ -7 & 9 & -5 \\ -7 & 7 & -3 \end{matrix}\right]\)
We can also use this idea to create a matrix with any eigenvalues and eigenvectors we want. For example, suppose we would like a \(3\times 3\) matrix with the following eigenvalues/eigenvectors.
An example of such a matrix would be \(PDP^{-1}\), where \(P\) is the matrix whose columns are the desired eigenvectors and \(D\) is the diagonal matrix of the corresponding eigenvalues.
In this case, \(PDP^{-1} = \left[\begin{matrix} 2 & 2 & -1 \\ \frac{5}{4} & \frac{1}{2} & -\frac{5}{4} \\ \frac{3}{2} & -3 & -\frac{1}{2} \end{matrix}\right]\).
Example
Let's check that the matrix above has the desired eigenvalues and eigenvectors.
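A sketch of this check with floating-point entries (the decimals correspond to the fractions used below):

A = Matrix([[2, 2, -1], [1.25, 0.5, -1.25], [1.5, -3, -0.5]])
pprint(A.eigenvects())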
This will return an answer involving floating point numbers. If we want to force SymPy to work exclusively over the rational numbers we can define the entries to be rationals.
A = Matrix([[2,2,-1],[Rational(5,4),Rational(1,2),Rational(-5,4)],[Rational(3,2),-3,Rational(-1,2)]])
pprint(A.eigenvects())
\(\left[ \left( -2,1,\left[ \left[\begin{matrix} 0 \\ 1/2 \\ 1 \end{matrix}\right] \right] \right), \left( 1,1,\left[ \left[\begin{matrix} 1 \\ 0 \\ 1 \end{matrix}\right] \right] \right), \left( 3,1,\left[ \left[\begin{matrix} 2 \\ 1 \\ 0 \end{matrix}\right] \right] \right) \right]\)
We can see that the returned eigenvectors are scaled versions of \(\vec{v}_1\), \(\vec{v}_3\), and \(\vec{v}_2\), respectively.
Diagonalization of Symmetric Matrices
If \(A\) is an \(n \times n\) symmetric matrix (i.e. \(A^T = A\)) then it can be diagonalized by an orthogonal matrix \(P\) (i.e. \(P^{-1} = P^T\)): \(A = PDP^T\), where \(D\) is the diagonal matrix of eigenvalues, and \(P\) is the matrix of (orthonormal) eigenvectors. A matrix that can be diagonalized by an orthogonal matrix is called orthogonally diagonalizable.
Let's summarize some important theorems about symmetric matrices:
- Eigenvectors corresponding to different eigenvalues of a symmetric matrix are orthogonal.
- Principal Axis Theorem: a matrix is symmetric if and only if it is orthogonally diagonalizable.
- All the eigenvalues of a symmetric matrix are real numbers.
Example
Orthogonally diagonalize the symmetric matrix \(A = \left[\begin{matrix} 5 & -4 & -2 \\ -4 & 5 & -2 \\ -2 & -2 & 8 \end{matrix}\right]\).
Solution: First we summarize how we would do this one (check the results for yourself). The eigenvalues and eigenvectors are \(\lambda_1 = 9\) with eigenvectors \(\vec{v}_{1,1} = \left[\begin{matrix} -1 \\ 1 \\ 0 \end{matrix}\right]\), \(\vec{v}_{1,2} = \left[\begin{matrix} -1 \\ 0 \\ 2 \end{matrix}\right]\), and \(\lambda_2 = 0\) with eigenvector \(\vec{v}_2 = \left[\begin{matrix} 2 \\ 2 \\ 1 \end{matrix}\right]\).
In other words, \(\lambda_1\) has two corresponding eigenvectors (algebraic and geometric multiplicity are both 2), and \(\lambda_2\) has one corresponding eigenvector (algebraic and geometric multiplicity are both 1).
If we formed the matrix \(P\) whose columns are the eigenvectors, then \(P\) would diagonalize \(A\). However, \(P\) would not be orthogonal. We can see this since \(\vec{v}_{1,1}\) and \(\vec{v}_{1,2}\) are not orthogonal. What we do instead is find an orthonormal basis for the eigenspace \(\mathbb{S}_1 = \operatorname{span}\left\{ \vec{v}_{1,1},\vec{v}_{1,2}\right\}\). We use this orthonormal basis as the columns of \(P\) which correspond to \(\lambda_1\).
Let's now do the calculations.
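A minimal sketch of the computation (with \(A\) as stated above):

A = Matrix([[5, -4, -2], [-4, 5, -2], [-2, -2, 8]])
P, D = A.diagonalize()
pprint((P, D))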
\(\left( \left[ \begin{matrix} 2 & -1 & -1 \\ 2 & 1 & 0 \\ 1 & 0 & 2 \\ \end{matrix} \right], \left[ \begin{matrix} 0 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 9 \\ \end{matrix} \right] \right)\)
We can see \(P\) is not orthogonal, since the second and third columns are not orthogonal (also none of the columns are normalized). We can use Python to check this too:
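For example, with the \(P\) returned above:

pprint(P.T * P)  # compare with the identity matrix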
\(\left[ \begin{matrix} 9 & 0 & 0 \\ 0 & 2 & 1 \\ 0 & 1 & 5 \\ \end{matrix} \right]\)
Since \(P^TP \ne I\), \(P\) is not an orthogonal matrix.
We can use Gram-Schmidt on the columns of \(P\) to find an orthonormal basis of eigenvectors.
# Let's Gram-Schmidt the columns of P
vectorList = [P.col(i) for i in range(P.shape[1])]
pprint(GramSchmidt(vectorList,True)) # optional argument True normalizes vectors
\(\left[ \left[ \begin{matrix} \frac{2}{3} \\ \frac{2}{3} \\ \frac{1}{3} \end{matrix} \right], \left[ \begin{matrix} \frac{-\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \\ 0 \end{matrix} \right], \left[ \begin{matrix} \frac{-\sqrt{2}}{6} \\ \frac{-\sqrt{2}}{6} \\ \frac{2\sqrt{2}}{3} \end{matrix} \right] \right]\)
Let's set \(P\) to be the matrix of these eigenvectors and check it diagonalizes \(A\).
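One way to assemble the matrix (a sketch, reusing vectorList from above):

P_list = GramSchmidt(vectorList, True)
P = P_list[0]
for v in P_list[1:]:
    P = P.row_join(v)  # join the orthonormal eigenvectors as columns
pprint(P)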
\(\left[ \begin{matrix} \frac{2}{3} & -\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{6} \\ \frac{2}{3} & \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{6} \\ \frac{1}{3} & 0 & \frac{2\sqrt{2}}{3} \\ \end{matrix} \right]\)
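A quick orthogonality check:

pprint(P.T * P)  # should be the identity matrix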
\(\left[ \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{matrix} \right]\)
Therefore, \(P\) is orthogonal, and the following shows it diagonalizes \(A\).
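For instance:

pprint(P.T * A * P)  # P is orthogonal, so P.T equals P.inv()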
\(\left[ \begin{matrix} 0 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 9 \\ \end{matrix} \right]\)
Now that we have a few of the subtleties out of the way, let's do another example and focus just on the computations. Again, this example has an eigenvalue with algebraic/geometric multiplicity 2.
Example
Orthogonally diagonalize the symmetric matrix
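The setup presumably mirrors the previous example (a sketch, with A the matrix of this example):

P, D = A.diagonalize()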
\(P\) is not orthogonal, so let's apply Gram-Schmidt to its columns.
vectorList = [P.col(i) for i in range(P.shape[1])]
P_list = GramSchmidt(vectorList,True)
P = P_list[0]
for i in [1,2]:
    P = P.row_join(P_list[i])
P
\(\left[ \begin{matrix} \frac{\sqrt{3}}{3} & \frac{\sqrt{3}}{3} & \frac{\sqrt{3}}{3} \\ -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} & 0 \\ -\frac{\sqrt{6}}{6} & -\frac{\sqrt{6}}{6} & \frac{\sqrt{6}}{3} \\ \end{matrix} \right]\)
Verify \(P\) is orthogonal:
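A sketch of the check (using the \(P\) constructed above):

pprint(P.T * P)  # should be the identity matrix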
\(\left[ \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{matrix} \right]\)
Therefore, \(P\) is orthogonal, and the following shows \(P\) diagonalizes \(A\).
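For instance (assuming A is the matrix of this example):

pprint(P.T * A * P)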
\(\left[ \begin{matrix} 0 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \\ \end{matrix} \right]\)
Jordan Decomposition
If a matrix \(A\) has \(s\) linearly independent eigenvectors, then it is similar to a matrix \(J\) that is in Jordan form, with \(s\) square blocks on the diagonal: \(J = \left[\begin{matrix} J_1 & & \\ & \ddots & \\ & & J_s \end{matrix}\right]\).
Each block \(J_i\) has one eigenvector, one eigenvalue, and \(1\)'s just above the diagonal.
An example of such a Jordan matrix is \(J = \left[\begin{matrix} 8 & 1 & 0 & 0 & 0 \\ 0 & 8 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{matrix}\right]\).
The double eigenvalue \(\lambda_1 = 8\) has only a single eigenvector \(\vec{e}_1 = [1,0,0,0,0]\). As a result, \(\lambda_1=8\) appears only in a single block \(J_1\). The triple eigenvalue \(\lambda_2 = 0\) has two eigenvectors, \(\vec{e}_3 = [0,0,1,0,0]\) and \(\vec{e}_5 = [0,0,0,0,1]\), which correspond to the two Jordan blocks \(J_2\) and \(J_3\). If \(A\) had \(5\) eigenvectors, all blocks would be \(1\) by \(1\) and \(J\) would be diagonal.
To compute the Jordan form of a matrix use the .jordan_form() method. It returns the tuple \((P, J)\).
Example
Compute the Jordan form of the following matrix.
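A sketch of the computation (assuming A is the matrix of this example):

P, J = A.jordan_form()
pprint([P, J])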
\(\left[ \left[\begin{matrix} 0 & -1 & 0 & 0 & 1\\ 0 & 1 & 1 & 1 & 0\\ -1 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ -1 & 0 & 0 & 0 & 0\\ \end{matrix}\right], \left[\begin{matrix} 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 8 & 1\\ 0 & 0 & 0 & 0 & 8\\ \end{matrix}\right] \right]\)
Let's check that \(P^{-1}AP = J\).
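For instance:

pprint(P.inv() * A * P)  # should reproduce J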
Example
Find the Jordan form of
Since \(J\) has \(0\)'s down the diagonal and two blocks, this means \(\lambda=0\) is an eigenvalue with algebraic multiplicity \(3\) and geometric multiplicity \(2\).
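We can confirm the multiplicities with .eigenvects() (a sketch, assuming A is the matrix of this example):

pprint(A.eigenvects())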
\(\left[ \left(0,3,\left[ \left[\begin{matrix}1\\0\\0\end{matrix}\right], \left[\begin{matrix}0\\-2\\1\end{matrix}\right] \right]\right)\right]\)
Singular-Value Decomposition (SVD)
The theory of orthogonal diagonalization of symmetric matrices is extremely useful and powerful. SVD extends these ideas to matrices that are not symmetric, not diagonalizable, or perhaps not even square! In short, every matrix has an SVD.
Definition of SVD
Let \(A\) be an \(m\times n\) matrix of rank \(r\). An SVD of \(A\) is a decomposition of the form \(A = U\Sigma V^T\), where
In the full SVD:
- \(U\) is an \(m\times m\) (square) orthogonal matrix,
- \(V\) is an \(n\times n\) (square) orthogonal matrix,
- \(\Sigma\) is an \(m\times n\) (rectangular) diagonal matrix (of \(r\) non-zero singular values \(\sigma_1, \ldots, \sigma_r\)).
In the condensed SVD:
- \(U\) is an \(m\times r\) (rectangular) column orthogonal matrix,
- \(V\) is an \(n\times r\) (rectangular) column orthogonal matrix,
- \(\Sigma\) is an \(r\times r\) (square) diagonal matrix (of \(r\) non-zero singular values \(\sigma_1, \ldots, \sigma_r\)).
Key Fact: \(U\) and \(V\) give orthonormal bases for all four fundamental subspaces:
- first \(r\) columns of \(U\): column space of \(A\) (\(= \operatorname{Col}(A)\))
- last \(m-r\) columns of \(U\): left nullspace of \(A\) (\(= \operatorname{Null}(A^T)\))
- first \(r\) columns of \(V\): row space of \(A\) (\(= \operatorname{Col}(A^T)\))
- last \(n-r\) columns of \(V\): nullspace of \(A\) (\(= \operatorname{Null}(A)\))
The diagonal (but rectangular) matrix \(\Sigma\) contains the square roots of the nonzero eigenvalues from \(A^TA\) (which are the same as the nonzero eigenvalues from \(AA^T\)). These are called the nonzero singular values of \(A\).
Computing an SVD
To compute \(U\), \(V\) and \(\Sigma\):
- the square roots of the non-zero eigenvalues of \(A^TA\) are the singular values \(\sigma_1, \ldots, \sigma_r\), and \(\Sigma\) is the diagonal matrix made up of these values in descending order, \(\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0\) (see the short sketch after this list);
- the columns of \(V\) are the eigenvectors of \(A^TA\), arranged in descending order of their eigenvalues;
- the columns of \(U\) are the eigenvectors of \(AA^T\), arranged in descending order of their eigenvalues. In practice, they are usually computed by noting that the first \(r\) columns of \(U\) are \(\vec{u}_i = \frac{1}{\sigma_i} A\vec{v}_i\) and the last \(m-r\) columns are an orthonormal basis for \(\operatorname{Null}(A^T)\).
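Here is a small sketch of the first step, using a \(2\times 3\) matrix purely for illustration:

from sympy import Matrix, sqrt, pprint
A = Matrix([[1, -1, 3], [3, 1, 1]])
evals = (A.T * A).eigenvals()  # dict {eigenvalue: multiplicity}
sigmas = [sqrt(lam) for lam in evals if lam != 0]  # non-zero singular values; sort in descending order to build Sigma
pprint(sigmas)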
The condensed SVD tosses away:
- the last \(m-r\) columns of \(U\) (i.e. an orthonormal basis of \(\operatorname{Null}(A^T)\))
- the last \(n-r\) columns of \(V\) (i.e. an orthonormal basis of \(\operatorname{Null}(A)\))
- the last \(n-r\) columns and last \(m-r\) rows of \(\Sigma\) (i.e. the zero columns and rows)
since these don't affect the product. See the condensed definitions of \(U\), \(V\) and \(\Sigma\) above (they are smaller, hence 'condensed').
Example
The \(2 \times 3\) matrix
has rank \(2\) and decomposes as
where both the first and third matrices are orthogonal. The condensed SVD is
We use SymPy's .singular_value_decomposition method to find a decomposition in Python. This function returns the condensed SVD as the 3-tuple \((U, S, V)\).
Warning
The matrices \(U\) and \(V\) are not necessarily unique. For instance, the columns corresponding to the same eigenvalues can be rearranged and scaled by \(\pm 1\). Therefore, .singular_value_decomposition may return something slightly different from what you expected.
Example
Find a singular value decomposition of the rank \(2\) matrix \(A = \left[\begin{matrix} 1 & -1 & 3 \\ 3 & 1 & 1 \end{matrix}\right]\).
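A minimal sketch of the computation (with A as above):

A = Matrix([[1, -1, 3], [3, 1, 1]])
U, S, V = A.singular_value_decomposition()
pprint([U, S, V])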
\(\left[ \left[\begin{matrix} -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \end{matrix}\right], \left[\begin{matrix} \sqrt{6} & 0 \\ 0 & 4 \\ \end{matrix}\right], \left[\begin{matrix} \frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{3}}{3} & 0 \\ -\frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} \\ \end{matrix}\right] \right]\)
Notice that the singular values are sorted in ascending order, and that this is the condensed SVD.
Let's check that this gives a decomposition of \(A\).
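For instance (for a real matrix, V.T is the transpose appearing in \(A = USV^T\)):

pprint(U * S * V.T)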
\(\left[\begin{matrix} 1 & -1 & 3 \\ 3 & 1 & 1 \end{matrix}\right]\)
The product is indeed \(A\). Therefore, the condensed SVD of \(A\) is \(A = USV^T\), with \(U\), \(S\), and \(V\) as computed above.
Since \(A\) has rank \(2\), the condensed SVD returns a column orthogonal matrix \(V\), not a full orthogonal matrix. We can extend \(V\) to a full orthogonal matrix by adding another orthonormal column, which will lie in the nullspace of \(A\). The simplest way to do this is to find any vector not currently in the column space of \(V\), append it, and then apply Gram-Schmidt to replace it with an orthonormal vector.
First we look at the columns of \(V\):
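For instance:

pprint(V)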
\(\left[\begin{matrix} \frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2}\\ \frac{\sqrt{3}}{3} & 0 \\ -\frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} \end{matrix}\right]\)
The vector \(\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\) is not in the column space, so let's add it as a column.
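A sketch (the name V_aug is just illustrative):

V_aug = V.row_join(Matrix([0, 0, 1]))  # append [0,0,1] as a new column
pprint(V_aug)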
\(\left[\begin{matrix}\frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} & 0\\\frac{\sqrt{3}}{3} & 0 & 0\\- \frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} & 1\end{matrix}\right]\)
Now the columns of this matrix are not orthonormal, so we could apply Gram-Schmidt. This would require making a list of the columns, applying GramSchmidt, and then turning the resulting list back into a matrix. It is easier to just do a QR-decomposition, since \(Q\) will be the matrix we want.
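A sketch (V_ext is an illustrative name for the extended matrix):

V_ext, _ = V_aug.QRdecomposition()  # Q from the QR decomposition is the orthogonal matrix we want
pprint(V_ext)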
\(\left[\begin{matrix}\frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} & - \frac{\sqrt{6}}{6}\\\frac{\sqrt{3}}{3} & 0 & \frac{\sqrt{6}}{3}\\- \frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} & \frac{\sqrt{6}}{6}\end{matrix}\right]\)
This is the extended matrix \(V\) we want, and next we just need to extend \(\Sigma\) to a \(2\times 3\) matrix by adding a column of zeros.
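A sketch (S_ext is an illustrative name):

S_ext = S.row_join(Matrix([0, 0]))  # append a column of zeros to make a 2x3 matrix
pprint(S_ext)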
\(\left[\begin{matrix}\sqrt{6} & 0 & 0\\0 & 4 & 0\end{matrix}\right]\)
Now let's check that \(A = U\Sigma_{\text{ext}}V_{\text{ext}}^T\):
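For instance:

pprint(U * S_ext * V_ext.T)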
\(\left[\begin{matrix} 1 & -1 & 3 \\ 3 & 1 & 1 \end{matrix}\right]\)
The product is indeed \(A\). Therefore, the (full) SVD of \(A\) is \(A = U\Sigma_{\text{ext}}V_{\text{ext}}^T\).
LU Decomposition
The LU decomposition of a matrix \(A\) is a factorization of the form \(PA = LU\), where \(P\) is a permutation matrix, \(L\) is a lower triangular matrix with \(1\)'s along the diagonal, and \(U\) is an upper triangular matrix.
A permutation matrix is constructed from the identity matrix \(I\) by permuting its rows.
The command LUdecomposition returns a 3-tuple \((L, U, perm)\), where perm is a list of row swap index pairs. If \(A\) is the original matrix, then A=(L*U).permuteBkwd(perm), and the row permutation matrix \(P\) such that \(PA = LU\) can be computed by P=eye(A.rows).permuteFwd(perm).
Example
Find the LU-decomposition of \(A = \left[\begin{matrix} 2 & -1 & 4 \\ 4 & -1 & 6 \\ -1 & -1 & 3 \end{matrix}\right]\).
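A minimal sketch of the computation (with A as above):

A = Matrix([[2, -1, 4], [4, -1, 6], [-1, -1, 3]])
pprint(A.LUdecomposition())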
\(\left( \left[\begin{matrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1/2 & -3/2 & 1 \\ \end{matrix}\right], \left[\begin{matrix} 2 & -1 & 4 \\ 0 & 1 & -2 \\ 0 & 0 & 2 \\ \end{matrix}\right], [~] \right)\)
Typically, one would call the function using tuple assignment.
A = Matrix([[2,-1,4],[4,-1,6],[-1,-1,3]])
L, U, perm = A.LUdecomposition()
P = eye(A.rows).permuteFwd(perm)
P
\(\left[\begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix}\right]\)
Let's check the product is \(A\):
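For instance, using permuteBkwd as described above:

pprint((L * U).permuteBkwd(perm))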
\(\begin{bmatrix} 2 & -1 & 4\\ 4 & -1 & 6\\ -1 & -1 & 3\\ \end{bmatrix}\)
QR Decomposition
The QR decomposition of a matrix \(A\) is a factorization of the form \(A = QR\), where \(Q\) is an orthogonal matrix, and \(R\) is an upper triangular matrix.
The command QRdecomposition returns a 2-tuple \((Q, R)\).
Example
Find the QR-decomposition of \(A = \left[\begin{matrix} 2 & -1 & 4 \\ 4 & -1 & 6 \\ -1 & -1 & 3 \end{matrix}\right]\).
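A minimal sketch of the computation (with A as above):

A = Matrix([[2, -1, 4], [4, -1, 6], [-1, -1, 3]])
pprint(A.QRdecomposition())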
\(\left( \left[\begin{matrix} \frac{2\sqrt{21}}{21} & \frac{-11\sqrt{798}}{798} & \frac{-5\sqrt{38}}{38} \\ \frac{4\sqrt{21}}{21} & \frac{-\sqrt{798}}{798} & \frac{3\sqrt{38}}{38} \\ \frac{-\sqrt{21}}{21} & \frac{-13\sqrt{798}}{798} & \frac{\sqrt{38}}{19} \\ \end{matrix}\right], \left[\begin{matrix} \sqrt{21} & \frac{-5\sqrt{21}}{21} & \frac{29\sqrt{21}}{21} \\ 0 & \frac{\sqrt{798}}{21} & \frac{-64\sqrt{798}}{399} \\ 0 & 0 & \frac{2\sqrt{38}}{19} \\ \end{matrix}\right] \right)\)
Typically, one would call the function using tuple assignment.
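For instance:

Q, R = A.QRdecomposition()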
Let's check the product is \(A\).
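For instance:

pprint(Q * R)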
\(\begin{bmatrix} 2 & -1 & 4\\ 4 & -1 & 6\\ -1 & -1 & 3\\ \end{bmatrix}\)