Gram-Schmidt Orthogonalization and Projection
Gram-Schmidt Procedure
Suppose we have a subspace \(\mathbb{S}\) of \(\mathbb{R}^n\) whose basis consists of \(k\) vectors \(\vec{v}_1,\vec{v}_2, \ldots , \vec{v}_k\).
The Gram-Schmidt procedure computes from this basis a new basis for \(\mathbb{S}\) that is orthonormal (that is, the vectors of of unit length and mutually orthogonal). That is,
where
- \(|\vec{w}_i| = 1\), for all \(i\),
- \(\vec{w}_i \cdot \vec{w}_j = 0\) for all \(i\ne j\).
The procedure usually constructs the vectors by, setting \(\vec{w}_1 = \vec{v}_1/|\vec{v}_1|\), and then recursively building the rest of the vectors by
It follows that, for each \(1\le i \le k\)
Here is an implementation of the Gram-Schmidt procedure. Even though we state the input is a set of linearly independent vectors, it can be used even if the vectors are linearly dependent. There will just be some zero vectors returned in the columns of \(A\) corresponding to the linearly dependent vectors. Therefore, these columns can be ignored.
def gram_schmidt(A):
'''input: A set of linearly independent vectors stored
as the columns of matrix A
outpt: An orthongonal basis for the column space of A.'''
# get the number of vectors.
A = np.copy(A).astype(np.float64) # create a local instance of the array
n = A.shape[1]
for j in range(n):
# For the vector in column j, find the perpendicular
# of the projection onto the previous orthogonal vectors.
for k in range(j):
A[:, j] -= np.dot(A[:, k], A[:, j]) * A[:, k]
# If original vectors aren't lin indep then we can check for this:
# (1)
if np.isclose(np.linalg.norm(A[:, j]), 0, rtol=1e-15, atol=1e-14, equal_nan=False):
A[:, j] = np.zeros(A.shape[0])
else:
A[:, j] = A[:, j] / np.linalg.norm(A[:, j])
return A
- if the perp is 0, then this column is in the span of the previous ones, so the original set of vectors is not linearly independent. We can't normalize the 0 vector so we need to treat this separately. Also, to see if the norm, which is a float, is 0 we us the
.isclose
function.
Example
Use the Gram-Schmidt procedure on the set
[[ 0.70710678 -0.63960215 0.30151134]
[ 0.70710678 0.63960215 -0.30151134]
[ 0. 0.42640143 0.90453403]]
The columns of this matrix form the orthonormal basis.
As a check, we can work through the Gram-Schmidt process by hand and determine that
# check against hand calculated solution:
np.array([[1/np.sqrt(2),1/np.sqrt(2),0],[-3/np.sqrt(22), 3/np.sqrt(22), 2/np.sqrt(22)],[1/np.sqrt(11), -1/np.sqrt(11), 3/np.sqrt(11) ]]).T
array([[ 0.70710678, -0.63960215, 0.30151134],
[ 0.70710678, 0.63960215, -0.30151134],
[ 0. , 0.42640143, 0.90453403]])
Another thing we can check is that the vectors returned by gram_schmidt
are orthonormal. This means that the matrix of these vectors must be orthogonal. Recall, a matrix \(Q\) is orthogonal if \(Q^TQ = I\).
- We round to account for floating point approximation errors.
Example
Use the Gram-Schmidt procedure on the set
Note: the third vector is in the span of the first two.
[[ 0.57735027 0.25819889 0. 0.31622777]
[ 0.57735027 -0.51639778 0. -0.63245553]
[ 0. 0.77459667 0. -0.63245553]
[ 0.57735027 0.25819889 0. 0.31622777]]
The third column is the zero vector and this indicates that \(\vec{v}_3\) is in the span of the previous two vectors. Therefore, we can ignore vector \(\vec{v}_3\).
As a check, we can work through the Gram-Schmidt process by hand and determine that
Gram-Schmidt via QR Decomposition
The Gram-Schmidt can be found hidden in the SciPy's QR decomposition. We'll cover a number of matrix decompositions methods in the next section, but for now we note that QR decomposition is a factorization of the form
where \(Q\) is orthogonal (i.e. columns are orthonormal), and \(R\) is upper triangular. This means that column \(i\) of \(A\) is a linear combination of the first \(i\) columns of \(Q\). Column \(i\) of \(R\) tells us exactly what this linear combination is. This is true for each column \(i\).
The function scipy.linalg.qr
returns the two matrices \(Q\) and \(R\) associated with matrix \(A\).
Let's redo the two examples above.
Example
Use scipy.linalg.qr
to find the Gram-Schmidt basis for the set
First form a column matrix, then return the matrix \(Q\) from QR decomposition.
array([[-0.70710678, 0.63960215, 0.30151134],
[-0.70710678, -0.63960215, -0.30151134],
[-0. , -0.42640143, 0.90453403]])
This is the same collection of orthonormal vectors we found above (up to scaling by \(-1\)).
For bases that don't span all of \(\mathbb{R}^n\), QR decomposition adds extra linearly independent columns. We can detect this by looking at the matrix \(R\) that gets returned.
Example
Use scipy.linalg.qr
to find the Gram-Schmidt basis for the set
First form a column matrix, then return the matrix \(Q\) from QR decomposition.
array([[-0.57735027, 0.25819889, -0.66225429, -0.40177015],
[-0.57735027, -0.51639778, -0.07871036, 0.62753859],
[-0. , 0.77459667, -0.07871036, 0.62753859],
[-0.57735027, 0.25819889, 0.74096465, -0.22576844]])
We may think we are done here, but recall the original set of vectors were linearly dependent. We can see this by looking at \(R\). We just round entries to 3 decimal places to make it easier to read.
array([[-1.732, -1.155, -2.887, -1.155],
[ 0. , 1.291, 1.291, 0.516],
[ 0. , 0. , -0. , 0.079],
[ 0. , 0. , 0. , -0.628]])
The third column has a 0 on the diagonal. This tells us that the third column of \(A\) is in the span of the first two columns of \(Q\), hence it is in the span of the first two columns of \(A\). Therefore, the columns of \(A\) were not linearly independent. We can remove this column from \(A\) and redo the calculation.
array([[-0.57735027, 0.25819889, 0.31622777, -0.70710678],
[-0.57735027, -0.51639778, -0.63245553, 0. ],
[-0. , 0.77459667, -0.63245553, 0. ],
[-0.57735027, 0.25819889, 0.31622777, 0.70710678]])
A square matrix is returned but we only care about the first 3 columns. These 3 columns is the same collection of orthonormal vectors we found above.
Projection onto a Subspace
Suppose we have an orthogonal basis \(\{ \vec{w}_1, \vec{w}_2, \ldots, \vec{w}_k \}\) for a subspace \(\mathbb{S}\). The the projection of a vector \(\vec{v}\) onto \(\mathbb{S}\) is given by:
We used this in the Gram-Schmidt procedure above. Here we just implement the projection and perpendicular as standalone functions in python so we can use them as needed.
def proj(v, A):
'''input : v is a vector, and A is a matrix whose
column vectors are form an orthogonal basis.
output: the projection of v onto the subspace spanned
columns of A.'''
num_rows = A.shape[0]
num_cols = A.shape[1]
w = np.zeros(num_rows)
for i in range(num_cols):
coli = A[:,i]
w += (np.dot(v,coli)/np.dot(coli,coli))*coli
return w
def perp(v,A):
'''input : v is a vector, and A is a matrix whose
column vectors are form an orthogonal basis.
output: the perpendicular of the projection of v onto
the subspace spanned columns of A.'''
return v - proj(v,A)
Example
Let \(\mathbb{S} = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 1 \\ -1\end{bmatrix} \right\}\) and let \(\vec{x} = \begin{bmatrix} 2\\5\\-7\\3 \end{bmatrix}\). Determine \(\operatorname{proj}_{\mathbb{S}}(\vec{x})\) and \(\operatorname{perp}_{\mathbb{S}}(\vec{x})\).
Note: Basis is already orthogonal, so we can project directly onto it.
In the next example, we use the Gram-Schmidt procedure to first find an orthogonal basis for the subspace, then we project the vector onto this.
Example
Project \(\vec{v} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\) onto the space with basis:
The first thing to do would be to find an orthogonal basis, we could use Gram-Schmidt to do this (see above), but here we didn't normalize:
Then projecting \(\vec{x}\) onto this basis we get
Also,
Using our python implementation, we can check that we get the same answer:
scipy.linalg.orth
There is a built-in command in SciPy which constructs an orthonormal basis for the column space of a matrix. Since such bases are not unique, the one returned by scipy.linalg.org
is unlikely to be the one we obtain from the Gram-Schmidt procedure (or QR decomposition).
For example:
There are instances where we just want any orthogonal basis for a space. In such cases we scipy.linalg.orth
will suffice (instead of using Gram-Schmidt). For instance, to project a vector \(\vec{v}\) onto a subspace \(\mathbb{S}\), any orthogonal basis will suffice. The answer is independent of the basis used.
Example
Let's revisit our last example above. The only change we'll make is that we will use scipy.linalg.orth
instead of gram_schmidt
:
v1 = np.array([1.,1,0])
v2 = np.array([-1,2,1])
A = np.vstack([v1,v2]).T
Q = la.orth(A) # (1)
v = np.array([1,1,1])
print('projection of v onto columnspace of Q:', proj(v,Q))
- Here we use
scipy.linalg.orth
instead ofgram_schmidt
.
This is the same result we found above.
Another example of where we use scipy.linalg.orth
instead of Gram-Schmidt is when we orthogonally diagonalize symmetric matrices in the Matrix Decomposition section.