Moore–Penrose pseudoinverse

Moore–Penrose pseudoinverse

In mathematics, and in particular linear algebra, a pseudoinverse A+ of a matrix A is a generalization of the inverse matrix.[1] The most widely known type of matrix pseudoinverse is the Moore–Penrose inverse,[2][3][4][5] which was independently described by E. H. Moore[6] in 1920, Arne Bjerhammar[7] in 1951, and Roger Penrose[8] in 1955. Earlier, Erik Ivar Fredholm had introduced the concept of a pseudoinverse of integral operators in 1903. When referring to a matrix, the term pseudoinverse, without further specification, is often used to indicate the Moore–Penrose inverse. The term generalized inverse is sometimes used as a synonym for pseudoinverse.
A common use of the pseudoinverse is to compute a "best fit" (least squares) solution to a system of linear equations that lacks a unique solution (see below under § Applications). Another use is to find the minimum (Euclidean) norm solution to a system of linear equations with multiple solutions. The pseudoinverse facilitates the statement and proof of results in linear algebra.
Notation
In the following discussion, the following conventions are adopted.
will denote one of the fields of real or complex numbers, denoted , , respectively. The vector space of matrices over is denoted by .
For , and denote the transpose and Hermitian transpose (also called conjugate transpose) respectively. If , then .
For , denotes the range (image) of (the space spanned by the column vectors of ) and denotes the kernel (null space) of .
Finally, for any positive integer , denotes the identity matrix.
Definition
(AA+ need not be the general identity matrix, but it maps all column vectors of A to themselves);
(A+ is a weak inverse for the multiplicative semigroup);
(AA+ is Hermitian);
(A+A is also Hermitian).
Properties
Existence and uniqueness
A matrix satisfying the first condition of the definition is known as a generalized inverse. If the matrix also satisfies the second definition, it is called a generalized reflexive inverse. Generalized inverses always exist but are not in general unique. Uniqueness is a consequence of the last two conditions.
Basic properties
If has real entries, then so does .
If is invertible, its pseudoinverse is its inverse. That is, .[10] []
The pseudoinverse of a zero matrix is its transpose.
The pseudoinverse of the pseudoinverse is the original matrix: .[10] []
Pseudoinversion commutes with transposition, conjugation, and taking the conjugate transpose:[10] [] , , .
The pseudoinverse of a scalar multiple of A is the reciprocal multiple of A+: for .
Identities
The following identities can be used to cancel certain subexpressions or expand expressions involving pseudoinverses. Proofs for these properties can be found in the proofs subpage.
Reduction to Hermitian case
The computation of the pseudoinverse is reducible to its construction in the Hermitian case. This is possible through the equivalences:
Products
then
The last property yields the equivalences
Projectors
and
is the orthogonal projector onto the range of (which equals the orthogonal complement of the kernel of ).
is the orthogonal projector onto the range of (which equals the orthogonal complement of the kernel of ).
is the orthogonal projector onto the kernel of .
is the orthogonal projector onto the kernel of .[9]
The last two properties imply the following identities:
Geometric construction
This description is closely related to the Minimum norm solution to a linear system.
Subspaces
Limit relations
The pseudoinverse are limits:
Continuity
Derivative
Examples
Since for invertible matrices the pseudoinverse equals the usual inverse, only examples of non-invertible matrices are considered below.
For the pseudoinverse is (Generally, the pseudoinverse of a zero matrix is its transpose.) The uniqueness of this pseudoinverse can be seen from the requirement , since multiplication by a zero matrix would always produce a zero matrix.
For the pseudoinverse is Indeed, and thus Similarly, and thus
For
For (The denominators are .)
For
For the pseudoinverse is Note that for this matrix, the left inverse exists and thus equals , indeed,
Special cases
Scalars
It is also possible to define a pseudoinverse for scalars and vectors. This amounts to treating these as matrices. The pseudoinverse of a scalar x is zero if x is zero and the reciprocal of x otherwise:
Vectors
The pseudoinverse of the null (all zero) vector is the transposed null vector. The pseudoinverse of a non-null vector is the conjugate transposed vector divided by its squared magnitude:
Linearly independent columns
Linearly independent rows
Orthonormal columns or rows
Orthogonal projection matrices
Circulant matrices
Construction
Rank decomposition
The QR method
which may be solved by forward substitution followed by back substitution.
Singular value decomposition (SVD)
The computational cost of this method is dominated by the cost of computing the SVD, which is several times higher than matrix–matrix multiplication, even if a state-of-the art implementation (such as that of LAPACK) is used.
Block matrices
Optimized approaches exist for calculating the pseudoinverse of block structured matrices.
The iterative method of Ben-Israel and Cohen
Another method for computing the pseudoinverse (cf. Drazin inverse) uses the recursion
Updating the pseudoinverse
Software libraries
The Python package NumPy provides a pseudoinverse calculation through its functions matrix.I and linalg.pinv; its pinv uses the SVD-based algorithm. SciPy adds a function scipy.linalg.pinv that uses a least-squares solver. High-quality implementations of SVD, QR, and back substitution are available in standard libraries, such as LAPACK. Writing one's own implementation of SVD is a major programming project that requires a significant numerical expertise. In special circumstances, such as parallel computing or embedded computing, however, alternative implementations by QR or even the use of an explicit inverse might be preferable, and custom implementations may be unavoidable.
The MASS package for R provides a calculation of the Moore–Penrose inverse through the ginv function.[22] The ginv function calculates a pseudoinverse using the singular value decomposition provided by the svd function in the base R package. An alternative is to employ the pinv function available in the pracma package.
The Octave programming language provides a pseudoinverse through the standard package function pinv and the pseudo_inverse() method.
Applications
Linear least-squares
, we have where and denotes the Frobenius norm.
Obtaining all solutions of a linear system
If the linear system
has any solutions, they are all given by[25]
Minimum norm solution to a linear system
If is satisfiable, the vector is a solution, and satisfies for all solutions.
If is satisfiable, the matrix is a solution, and satisfies for all solutions.
Condition number
Using the pseudoinverse and a matrix norm, one can define a condition number for any matrix:
A large condition number implies that the problem of finding least-squares solutions to the corresponding system of linear equations is ill-conditioned in the sense that small errors in the entries of A can lead to huge errors in the entries of the solution.[26]
Generalizations
In order to solve more general least-squares problems, one can define Moore–Penrose inverses for all continuous linear operators A : H1 → H2 between two Hilbert spaces H1 and H2, using the same four conditions as in our definition above. It turns out that not every continuous linear operator has a continuous linear pseudoinverse in this sense.[26] Those that do are precisely the ones whose range is closed in H2.
In abstract algebra, a Moore–Penrose inverse may be defined on a *-regular semigroup. This abstract definition coincides with the one in linear algebra.
See also
Proofs involving the Moore–Penrose inverse
Drazin inverse
Hat matrix
Inverse element
Linear least squares (mathematics)
Pseudo-determinant
Von Neumann regular ring