# Gram–Schmidt process

In mathematics, particularly linear algebra and numerical analysis, the **Gram–Schmidt process** is a method for orthonormalising a set of vectors in an inner product space, most commonly the Euclidean space $\mathbb{R}^n$ equipped with the standard inner product. The Gram–Schmidt process takes a finite, linearly independent set $S = \{v_1, \ldots, v_k\}$ for $k \le n$ and generates an orthogonal set $S' = \{u_1, \ldots, u_k\}$ that spans the same $k$-dimensional subspace of $\mathbb{R}^n$ as $S$.

The method is named after Jørgen Pedersen Gram and Erhard Schmidt, but Pierre-Simon Laplace had been familiar with it before Gram and Schmidt.^{[1]} In the theory of Lie group decompositions it is generalized by the Iwasawa decomposition.

Applying the Gram–Schmidt process to the column vectors of a matrix with full column rank yields its QR decomposition: the matrix is factored into an orthogonal matrix and a triangular matrix.

## The Gram–Schmidt process

*Figure: the modified Gram–Schmidt process being executed on three linearly independent, non-orthogonal vectors of a basis for $\mathbb{R}^3$. The modification is explained in the section on numerical stability below.*

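We define the projection operator by

$$\operatorname{proj}_{\mathbf{u}}(\mathbf{v}) = \frac{\langle \mathbf{v}, \mathbf{u} \rangle}{\langle \mathbf{u}, \mathbf{u} \rangle}\, \mathbf{u},$$

where $\langle \mathbf{v}, \mathbf{u} \rangle$ denotes the inner product of the vectors $\mathbf{u}$ and $\mathbf{v}$. This operator projects the vector $\mathbf{v}$ orthogonally onto the line spanned by the vector $\mathbf{u}$. If $\mathbf{u} = \mathbf{0}$, we define $\operatorname{proj}_{\mathbf{0}}(\mathbf{v}) := \mathbf{0}$, i.e., the projection map $\operatorname{proj}_{\mathbf{0}}$ is the zero map, sending every vector to the zero vector.

The Gram–Schmidt process then works as follows:

$$\begin{aligned}
\mathbf{u}_1 &= \mathbf{v}_1, & \mathbf{e}_1 &= \frac{\mathbf{u}_1}{\|\mathbf{u}_1\|} \\
\mathbf{u}_2 &= \mathbf{v}_2 - \operatorname{proj}_{\mathbf{u}_1}(\mathbf{v}_2), & \mathbf{e}_2 &= \frac{\mathbf{u}_2}{\|\mathbf{u}_2\|} \\
\mathbf{u}_3 &= \mathbf{v}_3 - \operatorname{proj}_{\mathbf{u}_1}(\mathbf{v}_3) - \operatorname{proj}_{\mathbf{u}_2}(\mathbf{v}_3), & \mathbf{e}_3 &= \frac{\mathbf{u}_3}{\|\mathbf{u}_3\|} \\
&\;\;\vdots & &\;\;\vdots \\
\mathbf{u}_k &= \mathbf{v}_k - \sum_{j=1}^{k-1} \operatorname{proj}_{\mathbf{u}_j}(\mathbf{v}_k), & \mathbf{e}_k &= \frac{\mathbf{u}_k}{\|\mathbf{u}_k\|}
\end{aligned}$$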

The sequence $\mathbf{u}_1, \ldots, \mathbf{u}_k$ is the required system of orthogonal vectors, and the normalized vectors $\mathbf{e}_1, \ldots, \mathbf{e}_k$ form an ortho*normal* set. The calculation of the sequence $\mathbf{u}_1, \ldots, \mathbf{u}_k$ is known as *Gram–Schmidt orthogonalization*, while the calculation of the sequence $\mathbf{e}_1, \ldots, \mathbf{e}_k$ is known as *Gram–Schmidt orthonormalization*, as the vectors are normalized.
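The procedure can be sketched in a few lines of pure Python. This is a minimal illustration, not a reference implementation: the helper names `dot`, `proj` and `gram_schmidt` are chosen for this example, vectors are plain lists of floats, and the input is assumed to be linearly independent (otherwise the normalization step divides by zero).

```python
import math

def dot(a, b):
    """Standard inner product on R^n."""
    return sum(x * y for x, y in zip(a, b))

def proj(u, v):
    """Orthogonal projection of v onto the line spanned by u (zero map if u = 0)."""
    uu = dot(u, u)
    if uu == 0:
        return [0.0] * len(v)
    c = dot(v, u) / uu
    return [c * x for x in u]

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: returns (orthogonal u's, orthonormal e's)."""
    us = []
    for v in vectors:
        u = list(v)
        # Subtract the projections of the ORIGINAL v onto every earlier u_j.
        for w in us:
            p = proj(w, v)
            u = [a - b for a, b in zip(u, p)]
        us.append(u)
    # Normalize; assumes no u is the zero vector (independent input).
    es = [[x / math.sqrt(dot(u, u)) for x in u] for u in us]
    return us, es

# Illustrative input: three independent vectors in R^3.
us, es = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```

Every pair of distinct vectors in `us` then has inner product zero up to rounding, and every vector in `es` has unit length.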

To check that these formulas yield an orthogonal sequence, first compute $\langle \mathbf{u}_1, \mathbf{u}_2 \rangle$ by substituting the above formula for $\mathbf{u}_2$: we get zero. Then use this to compute $\langle \mathbf{u}_1, \mathbf{u}_3 \rangle$, again by substituting the formula for $\mathbf{u}_3$: we get zero. The general proof proceeds by mathematical induction.

Geometrically, this method proceeds as follows: to compute $\mathbf{u}_i$, it projects $\mathbf{v}_i$ orthogonally onto the subspace $U$ generated by $\mathbf{u}_1, \ldots, \mathbf{u}_{i-1}$, which is the same as the subspace generated by $\mathbf{v}_1, \ldots, \mathbf{v}_{i-1}$. The vector $\mathbf{u}_i$ is then defined to be the difference between $\mathbf{v}_i$ and this projection, which is guaranteed to be orthogonal to all of the vectors in the subspace $U$.

The Gram–Schmidt process also applies to a linearly independent countably infinite sequence $\{\mathbf{v}_i\}_i$. The result is an orthogonal (or orthonormal) sequence $\{\mathbf{u}_i\}_i$ such that for every natural number $n$, the algebraic span of $\mathbf{v}_1, \ldots, \mathbf{v}_n$ is the same as that of $\mathbf{u}_1, \ldots, \mathbf{u}_n$.

If the Gram–Schmidt process is applied to a linearly dependent sequence, it outputs the $\mathbf{0}$ vector on the $i$th step whenever $\mathbf{v}_i$ is a linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_{i-1}$. If an orthonormal basis is to be produced, the algorithm should test for zero vectors in the output and discard them, because no multiple of a zero vector can have a length of 1. The number of vectors output by the algorithm will then be the dimension of the space spanned by the original inputs.
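The zero-vector test can be sketched as follows. In floating-point arithmetic a dependent input rarely produces an exact zero, so this example compares the norm against a threshold `tol`; that threshold, its default value and the function name `orthonormal_basis` are assumptions of this sketch (an absolute tolerance is only sensible for inputs of moderate size).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt that discards (numerically) zero outputs.

    `tol` is an illustrative absolute threshold, not part of the method itself.
    """
    basis = []
    for v in vectors:
        u = list(v)
        for e in basis:
            c = dot(e, u)
            u = [a - c * b for a, b in zip(u, e)]
        norm = math.sqrt(dot(u, u))
        if norm > tol:
            # v contributed a new direction: keep the normalized vector.
            basis.append([x / norm for x in u])
        # Otherwise v was (numerically) a combination of earlier vectors.
    return basis

# The second vector is twice the first, so only two directions survive.
vs = [[1.0, 0.0, 1.0], [2.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
basis = orthonormal_basis(vs)
```

Here `len(basis)` gives the dimension of the span of the inputs, which is 2 for this input.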

A variant of the Gram–Schmidt process using transfinite recursion applied to a (possibly uncountably) infinite sequence of vectors $(v_\alpha)_{\alpha < \lambda}$ yields a set of orthonormal vectors $(u_\alpha)_{\alpha < \kappa}$ with $\kappa \le \lambda$ such that for any $\alpha \le \lambda$, the completion of the span of $\{u_\beta : \beta < \min(\alpha, \kappa)\}$ is the same as that of $\{v_\beta : \beta < \alpha\}$. In particular, when applied to an (algebraic) basis of a Hilbert space (or, more generally, a basis of any dense subspace), it yields a (functional-analytic) orthonormal basis. Note that in the general case the strict inequality $\kappa < \lambda$ often holds, even if the starting set was linearly independent, and the span of $(u_\alpha)_{\alpha < \kappa}$ need not be a subspace of the span of $(v_\alpha)_{\alpha < \lambda}$ (rather, it is a subspace of its completion).

## Example

### Euclidean space

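Consider the following set of vectors in $\mathbb{R}^2$ (with the conventional inner product):

$$S = \left\{ \mathbf{v}_1 = \begin{bmatrix} 3 \\ 1 \end{bmatrix},\ \mathbf{v}_2 = \begin{bmatrix} 2 \\ 2 \end{bmatrix} \right\}.$$

Now, perform Gram–Schmidt to obtain an orthogonal set of vectors:

$$\begin{aligned}
\mathbf{u}_1 &= \mathbf{v}_1 = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \\
\mathbf{u}_2 &= \mathbf{v}_2 - \operatorname{proj}_{\mathbf{u}_1}(\mathbf{v}_2) = \begin{bmatrix} 2 \\ 2 \end{bmatrix} - \frac{8}{10} \begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} -2/5 \\ 6/5 \end{bmatrix}.
\end{aligned}$$

We check that the vectors $\mathbf{u}_1$ and $\mathbf{u}_2$ are indeed orthogonal:

$$\langle \mathbf{u}_1, \mathbf{u}_2 \rangle = \left\langle \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \begin{bmatrix} -2/5 \\ 6/5 \end{bmatrix} \right\rangle = -\frac{6}{5} + \frac{6}{5} = 0,$$

noting that if the dot product of two vectors is 0 then they are orthogonal.

For non-zero vectors, we can then normalize the vectors by dividing out their sizes as shown above:

$$\begin{aligned}
\mathbf{e}_1 &= \frac{1}{\sqrt{10}} \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \\
\mathbf{e}_2 &= \frac{1}{\sqrt{40/25}} \begin{bmatrix} -2/5 \\ 6/5 \end{bmatrix} = \frac{1}{\sqrt{10}} \begin{bmatrix} -1 \\ 3 \end{bmatrix}.
\end{aligned}$$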

## Properties

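Denote by $\operatorname{GS}(\mathbf{v}_1, \ldots, \mathbf{v}_k)$ the result of applying the Gram–Schmidt process to a collection of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$. This yields a map $\operatorname{GS} \colon (\mathbb{R}^n)^k \to (\mathbb{R}^n)^k$.

It has the following properties:

- It is continuous.
- It is orientation preserving in the sense that $\operatorname{or}(\mathbf{v}_1, \ldots, \mathbf{v}_k) = \operatorname{or}(\operatorname{GS}(\mathbf{v}_1, \ldots, \mathbf{v}_k))$.
- It commutes with orthogonal maps: let $g \colon \mathbb{R}^n \to \mathbb{R}^n$ be orthogonal (with respect to the given inner product). Then we have

$$\operatorname{GS}(g(\mathbf{v}_1), \ldots, g(\mathbf{v}_k)) = \left( g(\operatorname{GS}(\mathbf{v}_1, \ldots, \mathbf{v}_k)_1), \ldots, g(\operatorname{GS}(\mathbf{v}_1, \ldots, \mathbf{v}_k)_k) \right).$$

Further, a parametrized version of the Gram–Schmidt process yields a (strong) deformation retraction of the general linear group $\mathrm{GL}(\mathbb{R}^n)$ onto the orthogonal group $\mathrm{O}(\mathbb{R}^n)$.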
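The statement that the process commutes with orthogonal maps can be checked numerically in a small case. The sketch below uses a plane rotation as the orthogonal map; the helper names `gs` and `rotate`, the input vectors and the angle `theta` are all illustrative choices, and a numerical check of one case is of course not a proof.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gs(vectors):
    """Gram-Schmidt orthonormalization of a list of independent vectors."""
    es = []
    for v in vectors:
        u = list(v)
        for e in es:
            c = dot(e, u)
            u = [a - c * b for a, b in zip(u, e)]
        n = math.sqrt(dot(u, u))
        es.append([x / n for x in u])
    return es

def rotate(v, theta):
    """Rotation of a vector in R^2 by the angle theta (an orthogonal map)."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1]]

vs = [[3.0, 1.0], [2.0, 2.0]]
theta = 0.7

lhs = gs([rotate(v, theta) for v in vs])    # rotate first, then orthonormalize
rhs = [rotate(e, theta) for e in gs(vs)]    # orthonormalize first, then rotate

# Largest componentwise difference between the two results.
max_diff = max(abs(a - b) for p, q in zip(lhs, rhs) for a, b in zip(p, q))
```

Up to rounding error, `max_diff` is zero, as the property predicts.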

## Numerical stability

When this process is implemented on a computer, the vectors $\mathbf{u}_k$ are often not quite orthogonal, due to rounding errors. For the Gram–Schmidt process as described above (sometimes referred to as "classical Gram–Schmidt") this loss of orthogonality is particularly bad; therefore, it is said that the (classical) Gram–Schmidt process is numerically unstable.

The Gram–Schmidt process can be stabilized by a small modification; this version is sometimes referred to as **modified Gram-Schmidt** or MGS.
This approach gives the same result as the original formula in exact arithmetic and introduces smaller errors in finite-precision arithmetic.
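Instead of computing the vector $\mathbf{u}_k$ as

$$\mathbf{u}_k = \mathbf{v}_k - \operatorname{proj}_{\mathbf{u}_1}(\mathbf{v}_k) - \operatorname{proj}_{\mathbf{u}_2}(\mathbf{v}_k) - \cdots - \operatorname{proj}_{\mathbf{u}_{k-1}}(\mathbf{v}_k),$$

it is computed as

$$\begin{aligned}
\mathbf{u}_k^{(1)} &= \mathbf{v}_k - \operatorname{proj}_{\mathbf{u}_1}(\mathbf{v}_k), \\
\mathbf{u}_k^{(2)} &= \mathbf{u}_k^{(1)} - \operatorname{proj}_{\mathbf{u}_2}\!\left(\mathbf{u}_k^{(1)}\right), \\
&\;\;\vdots \\
\mathbf{u}_k^{(k-1)} &= \mathbf{u}_k^{(k-2)} - \operatorname{proj}_{\mathbf{u}_{k-1}}\!\left(\mathbf{u}_k^{(k-2)}\right), \\
\mathbf{u}_k &= \mathbf{u}_k^{(k-1)}.
\end{aligned}$$

Each step finds a vector $\mathbf{u}_k^{(i)}$ orthogonal to $\mathbf{u}_i$. Because the projection is applied to the intermediate result rather than to $\mathbf{v}_k$, $\mathbf{u}_k^{(i)}$ is also orthogonalized against any errors introduced in the computation of $\mathbf{u}_k^{(i-1)}$.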

This method is used in the animation above, where the intermediate vector $\mathbf{v}'_3$ is used when orthogonalizing the blue vector $\mathbf{v}_3$.
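The difference between the two variants can be observed on a small, nearly dependent input. The sketch below is an illustration under stated assumptions: the function names, the test vectors and the value `eps = 1e-8` are chosen for this example (the vectors are picked so that `eps**2` is rounded away when added to 1 in double precision), and the exact figures depend on IEEE double-precision arithmetic.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(u):
    n = math.sqrt(dot(u, u))
    return [x / n for x in u]

def classical_gs(vectors):
    es = []
    for v in vectors:
        # All coefficients are computed from the ORIGINAL vector v.
        coeffs = [dot(e, v) for e in es]
        u = list(v)
        for c, e in zip(coeffs, es):
            u = [a - c * b for a, b in zip(u, e)]
        es.append(normalize(u))
    return es

def modified_gs(vectors):
    es = []
    for v in vectors:
        u = list(v)
        for e in es:
            # Each coefficient is computed from the partially reduced vector u.
            c = dot(e, u)
            u = [a - c * b for a, b in zip(u, e)]
        es.append(normalize(u))
    return es

eps = 1e-8  # eps**2 is lost when added to 1.0 in double precision
vs = [[1.0, eps, 0.0, 0.0], [1.0, 0.0, eps, 0.0], [1.0, 0.0, 0.0, eps]]

cgs = classical_gs(vs)
mgs = modified_gs(vs)

cgs_defect = abs(dot(cgs[1], cgs[2]))  # far from zero: orthogonality is lost
mgs_defect = abs(dot(mgs[1], mgs[2]))  # close to zero
```

On this input the second and third vectors produced by the classical variant are far from orthogonal, while the modified variant keeps their inner product near machine precision.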

## Algorithm

The stabilized Gram–Schmidt orthonormalization for Euclidean vectors can be implemented as follows, described here in MATLAB notation: the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$ (columns of the matrix **V**, so that **V(:,j)** is the $j$th vector) are replaced by orthonormal vectors (columns of **U**) which span the same subspace.
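As an illustration of the column-wise procedure just described, here is a minimal sketch in pure Python rather than MATLAB. The function name `gram_schmidt_columns` and the list-of-rows matrix layout are assumptions made for this example, and the columns of `V` are assumed to be linearly independent.

```python
import math

def gram_schmidt_columns(V):
    """Orthonormalize the columns of V (a list of n rows, each of length k).

    Returns U with the same shape whose columns are orthonormal and span
    the same subspace, assuming the columns of V are linearly independent.
    """
    n, k = len(V), len(V[0])
    U = [row[:] for row in V]  # work on a copy, leave V untouched
    for i in range(k):
        for j in range(i):
            # Remove the component of column i along the finished column j,
            # using the current (already partly reduced) column i.
            c = sum(U[r][j] * U[r][i] for r in range(n))
            for r in range(n):
                U[r][i] -= c * U[r][j]
        norm = math.sqrt(sum(U[r][i] ** 2 for r in range(n)))
        for r in range(n):
            U[r][i] /= norm
    return U

# Columns are (3, 1) and (2, 2).
V = [[3.0, 2.0],
     [1.0, 2.0]]
U = gram_schmidt_columns(V)
```

The inner loop over `j` performs a dot product and an update of length `n` for each pair of columns, which is where the quadratic dependence on `k` in the operation count comes from.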

The cost of this algorithm is asymptotically $O(nk^2)$ floating point operations, where $n$ is the dimensionality of the vectors (Golub & Van Loan 1996, §5.2.8).

## Via Gaussian elimination

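If the rows $\{v_1, \ldots, v_k\}$ are written as a matrix $A$, then applying Gaussian elimination to the augmented matrix $\left[ A A^{\mathsf{T}} \mid A \right]$ will produce the orthogonalized vectors in place of $A$.^{[2]} For example, taking $v_1 = \begin{bmatrix} 3 & 1 \end{bmatrix}$, $v_2 = \begin{bmatrix} 2 & 2 \end{bmatrix}$ as above, we have

$$\left[ A A^{\mathsf{T}} \mid A \right] = \left[ \begin{array}{rr|rr} 10 & 8 & 3 & 1 \\ 8 & 8 & 2 & 2 \end{array} \right].$$

Reducing this to row echelon form produces

$$\left[ \begin{array}{rr|rr} 1 & 0.8 & 0.3 & 0.1 \\ 0 & 1 & -0.25 & 0.75 \end{array} \right].$$

The normalized vectors are then

$$\begin{aligned}
\mathbf{e}_1 &= \frac{1}{\sqrt{0.3^2 + 0.1^2}} \begin{bmatrix} 0.3 & 0.1 \end{bmatrix} = \frac{1}{\sqrt{10}} \begin{bmatrix} 3 & 1 \end{bmatrix}, \\
\mathbf{e}_2 &= \frac{1}{\sqrt{0.25^2 + 0.75^2}} \begin{bmatrix} -0.25 & 0.75 \end{bmatrix} = \frac{1}{\sqrt{10}} \begin{bmatrix} -1 & 3 \end{bmatrix},
\end{aligned}$$

as in the example above.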
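A minimal sketch of this elimination, using exact rational arithmetic from Python's `fractions` module, is given below. The function name `gram_schmidt_by_elimination` and the input rows are assumptions of this example, and the rows are assumed to be linearly independent so that no pivot is zero. Only the operation "subtract a multiple of one row from a later row" is used, without rescaling rows, so the right-hand block ends up holding the unnormalized orthogonal vectors.

```python
from fractions import Fraction

def gram_schmidt_by_elimination(rows):
    """Orthogonalize `rows` by forward elimination on [A A^T | A]."""
    A = [[Fraction(x) for x in r] for r in rows]
    k = len(A)
    # Build the augmented matrix [A A^T | A].
    M = [[sum(a * b for a, b in zip(A[i], A[j])) for j in range(k)] + A[i][:]
         for i in range(k)]
    # Forward elimination: clear the entries below each pivot of the left block.
    for p in range(k):
        for r in range(p + 1, k):
            factor = M[r][p] / M[p][p]
            M[r] = [x - factor * y for x, y in zip(M[r], M[p])]
    # The right-hand block now holds the orthogonalized vectors.
    return [row[k:] for row in M]

us = gram_schmidt_by_elimination([[3, 1], [2, 2]])
inner = sum(a * b for a, b in zip(us[0], us[1]))
```

For the rows `[3, 1]` and `[2, 2]` the second output row is `[-2/5, 6/5]`, and `inner` is exactly zero because the arithmetic is rational.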

## Determinant formula

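The result of the Gram–Schmidt process may be expressed in a non-recursive formula using determinants:

$$\mathbf{e}_j = \frac{1}{\sqrt{D_{j-1} D_j}}
\begin{vmatrix}
\langle \mathbf{v}_1, \mathbf{v}_1 \rangle & \langle \mathbf{v}_2, \mathbf{v}_1 \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_1 \rangle \\
\langle \mathbf{v}_1, \mathbf{v}_2 \rangle & \langle \mathbf{v}_2, \mathbf{v}_2 \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_2 \rangle \\
\vdots & \vdots & \ddots & \vdots \\
\langle \mathbf{v}_1, \mathbf{v}_{j-1} \rangle & \langle \mathbf{v}_2, \mathbf{v}_{j-1} \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_{j-1} \rangle \\
\mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_j
\end{vmatrix},$$

$$\mathbf{u}_j = \frac{1}{D_{j-1}}
\begin{vmatrix}
\langle \mathbf{v}_1, \mathbf{v}_1 \rangle & \langle \mathbf{v}_2, \mathbf{v}_1 \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_1 \rangle \\
\langle \mathbf{v}_1, \mathbf{v}_2 \rangle & \langle \mathbf{v}_2, \mathbf{v}_2 \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_2 \rangle \\
\vdots & \vdots & \ddots & \vdots \\
\langle \mathbf{v}_1, \mathbf{v}_{j-1} \rangle & \langle \mathbf{v}_2, \mathbf{v}_{j-1} \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_{j-1} \rangle \\
\mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_j
\end{vmatrix},$$

where $D_0 = 1$ and, for $j \ge 1$, $D_j$ is the Gram determinant

$$D_j =
\begin{vmatrix}
\langle \mathbf{v}_1, \mathbf{v}_1 \rangle & \langle \mathbf{v}_2, \mathbf{v}_1 \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_1 \rangle \\
\langle \mathbf{v}_1, \mathbf{v}_2 \rangle & \langle \mathbf{v}_2, \mathbf{v}_2 \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_2 \rangle \\
\vdots & \vdots & \ddots & \vdots \\
\langle \mathbf{v}_1, \mathbf{v}_j \rangle & \langle \mathbf{v}_2, \mathbf{v}_j \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_j \rangle
\end{vmatrix}.$$

Note that the expression for $\mathbf{u}_j$ is a "formal" determinant, i.e. the matrix contains both scalars and vectors; the meaning of this expression is defined to be the result of a cofactor expansion along the row of vectors.

The determinant formula for Gram–Schmidt is computationally (exponentially) slower than the recursive algorithms described above; it is mainly of theoretical interest.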

## Alternatives

Other orthogonalization algorithms use Householder transformations or Givens rotations. The algorithms using Householder transformations are more stable than the stabilized Gram–Schmidt process. On the other hand, the Gram–Schmidt process produces the $j$th orthogonalized vector after the $j$th iteration, while orthogonalization using Householder reflections produces all the vectors only at the end. This makes only the Gram–Schmidt process applicable for iterative methods like the Arnoldi iteration.

Yet another alternative is motivated by the use of Cholesky decomposition for inverting the matrix of the normal equations in linear least squares. Let $V$ be a full column rank matrix whose columns need to be orthogonalized. The matrix $V^* V$ is Hermitian and positive definite, so it can be written as $V^* V = L L^*$ using the Cholesky decomposition. The lower triangular matrix $L$ with strictly positive diagonal entries is invertible. Then the columns of the matrix $U = V \left( L^{-1} \right)^*$ are orthonormal and span the same subspace as the columns of the original matrix $V$. The explicit use of the product $V^* V$ makes the algorithm unstable, especially if the product's condition number is large. Nevertheless, this algorithm is used in practice and implemented in some software packages because of its high efficiency and simplicity.
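The Cholesky-based approach can be sketched for the real case, where the conjugate transpose reduces to the ordinary transpose. The function name `cholesky_orthonormalize`, the hand-written Cholesky factorization and the list-of-rows matrix layout are assumptions of this example; the input is assumed to have full column rank.

```python
import math

def cholesky_orthonormalize(V):
    """Orthonormalize the columns of a real n-by-k matrix V via Cholesky.

    Forms G = V^T V = L L^T and returns U = V (L^{-1})^T.
    """
    n, k = len(V), len(V[0])
    # Gram matrix G = V^T V (this explicit product is the source of instability).
    G = [[sum(V[r][i] * V[r][j] for r in range(n)) for j in range(k)]
         for i in range(k)]
    # Cholesky factor L (lower triangular) with G = L L^T.
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = G[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # Solve U L^T = V one row at a time by forward substitution.
    U = []
    for row in V:
        u = []
        for j in range(k):
            s = row[j] - sum(L[j][m] * u[m] for m in range(j))
            u.append(s / L[j][j])
        U.append(u)
    return U

# Columns are (3, 1) and (2, 2).
V = [[3.0, 2.0],
     [1.0, 2.0]]
U = cholesky_orthonormalize(V)

# Entries of U^T U, which should be close to the identity matrix.
gram = [[sum(U[r][i] * U[r][j] for r in range(2)) for j in range(2)]
        for i in range(2)]
```

For a well-conditioned input such as this one, `gram` agrees with the identity matrix up to rounding; the instability mentioned above shows up only as the condition number of the Gram matrix grows.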

In quantum mechanics there are several orthogonalization schemes with characteristics better suited for certain applications than the original Gram–Schmidt process. Nevertheless, Gram–Schmidt remains a popular and effective algorithm for even the largest electronic structure calculations.^{[3]}

## References

1. *Linear Algebra: Theory and Applications*. Sudbury, Ma: Jones and Bartlett. pp. 544, 558. ISBN 978-0-7637-5020-6.
2. *The American Mathematical Monthly*. **98** (6): 544–549. doi:10.2307/2324877. JSTOR 2324877.
3. *SC '11 Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis*: 1:1–1:11. doi:10.1145/2063384.2063386.