Pseudoinverse

A rectangular or rank-deficient matrix has no inverse, but it has a canonical best substitute. The pseudoinverse $\mathbf{A}^{+}$ inverts $\mathbf{A}$ where $\mathbf{A}$ acts invertibly, from the column space back to the row space, and sends the rest (the left nullspace) to zero. It packages least squares and the minimum-norm solution of an underdetermined system into one object, and it is read straight off the SVD.
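To make "read straight off the SVD" concrete, here is the standard formula, written under the assumption that $\mathbf{A}$ is $m \times n$ with rank $r$ and SVD $\mathbf{A} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{\rm T}$ (the names $\mathbf{U}$, $\boldsymbol{\Sigma}$, $\mathbf{V}$ are notation assumed here for illustration):

$$
\mathbf{A} = \sum_{i=1}^{r} \sigma_i\, \mathbf{u}_i \mathbf{v}_i^{\rm T}
\quad\Longrightarrow\quad
\mathbf{A}^{+} = \mathbf{V}\boldsymbol{\Sigma}^{+}\mathbf{U}^{\rm T} = \sum_{i=1}^{r} \frac{1}{\sigma_i}\, \mathbf{v}_i \mathbf{u}_i^{\rm T},
$$

where $\boldsymbol{\Sigma}^{+}$ is the $n \times m$ matrix with $1/\sigma_1, \dots, 1/\sigma_r$ on its diagonal and zeros everywhere else. Only the positive singular values are inverted; the zero ones stay zero.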

If $\mathbf{A}$ is square and invertible, every singular value $\sigma_i > 0$ and $\mathbf{A}^{+} = \mathbf{A}^{-1}$. Otherwise $\mathbf{A}^{+}$ inverts only the invertible part.
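As a rough numerical illustration, the sketch below builds a pseudoinverse from NumPy's SVD by inverting only the singular values above a cutoff. The helper name `pinv_from_svd`, the relative tolerance `tol`, and the two test matrices are assumptions made for this example, not part of the text.

```python
import numpy as np

def pinv_from_svd(A, tol=1e-12):
    """Pseudoinverse via the SVD: invert singular values above a cutoff, zero the rest."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_inv = np.zeros_like(s)
    keep = s > tol * s.max()          # treat tiny singular values as exactly zero
    s_inv[keep] = 1.0 / s[keep]
    # A^+ = V diag(s_inv) U^T
    return Vt.T @ (s_inv[:, None] * U.T)

# Square and invertible: the pseudoinverse is the ordinary inverse.
A_inv = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
print(np.allclose(pinv_from_svd(A_inv), np.linalg.inv(A_inv)))

# Rank-deficient 3x2 matrix: no inverse exists, but the pseudoinverse does.
A_def = np.array([[1.0, 2.0],
                  [2.0, 4.0],
                  [3.0, 6.0]])
print(np.allclose(pinv_from_svd(A_def), np.linalg.pinv(A_def)))
```

The cutoff matters in floating point: a singular value that should be zero usually comes out as something like `1e-16`, and inverting it would swamp the result.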

The SVD splits both spaces into the part the matrix sees and the part it annihilates. The first $r$ right singular vectors span the row space $C(\mathbf{A}^{\rm T})$, the rest span the nullspace $N(\mathbf{A})$; the first $r$ left singular vectors span the column space $C(\mathbf{A})$, the rest span the left nullspace $N(\mathbf{A}^{\rm T})$. On these pieces,

$$
\mathbf{A}\,\mathbf{v}_i = \sigma_i \mathbf{u}_i, \qquad \mathbf{A}^{+}\mathbf{u}_i = \tfrac{1}{\sigma_i}\mathbf{v}_i \quad (i \le r),
$$

and $\mathbf{A}^{+}$ kills $N(\mathbf{A}^{\rm T})$ just as $\mathbf{A}$ kills $N(\mathbf{A})$.
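The pairing of singular vectors can be checked numerically. The sketch below uses a small rank-one matrix chosen for illustration and relies on `np.linalg.svd`, `np.linalg.pinv`, and `np.linalg.matrix_rank`; the variable names are assumptions of this example.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])            # 3x2, rank 1
U, s, Vt = np.linalg.svd(A)           # full SVD: U is 3x3, Vt is 2x2
r = np.linalg.matrix_rank(A)
A_plus = np.linalg.pinv(A)

# On the first r singular pairs, A stretches and A^+ undoes the stretch.
for i in range(r):
    u_i, v_i = U[:, i], Vt[i, :]
    print(np.allclose(A @ v_i, s[i] * u_i),
          np.allclose(A_plus @ u_i, v_i / s[i]))

# The remaining right singular vectors span N(A); A sends them to zero.
null_part = A @ Vt[r:, :].T
# The remaining left singular vectors span N(A^T); A^+ sends them to zero.
left_null_part = A_plus @ U[:, r:]
print(np.allclose(null_part, 0.0), np.allclose(left_null_part, 0.0))
```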

Figure: $\mathbf{A}$ maps the row space onto the column space and the nullspace to zero; the pseudoinverse maps the column space back onto the row space and the left nullspace to zero.

$\mathbf{A}$ is a bijection from the row space onto the column space (stretching by the $\sigma_i$); $\mathbf{A}^{+}$ runs that bijection backward (shrinking by $1/\sigma_i$) and sends the left nullspace to $0$. So $\mathbf{A}^{+}\mathbf{A}$ is the orthogonal projection onto the row space and $\mathbf{A}\mathbf{A}^{+}$ is the orthogonal projection onto the column space.
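A quick numerical check of the two projections, as a sketch on an illustrative rank-one matrix. The names `P_row` and `P_col` are chosen for this example. For this matrix the row space is the line through $(1, 2)$, so the row-space projection should be $\tfrac{1}{5}\begin{bmatrix}1 & 2\\ 2 & 4\end{bmatrix}$.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])            # rank 1; row space is spanned by (1, 2)
A_plus = np.linalg.pinv(A)

P_row = A_plus @ A                    # 2x2, projects onto the row space
P_col = A @ A_plus                    # 3x3, projects onto the column space

for name, P in (("P_row", P_row), ("P_col", P_col)):
    symmetric = np.allclose(P, P.T)
    idempotent = np.allclose(P @ P, P)
    print(name, "symmetric:", symmetric, "idempotent:", idempotent)

# Each projection fixes its own subspace and kills the orthogonal complement.
print(np.allclose(P_col @ A, A))                           # columns of A are unchanged
print(np.allclose(P_row @ A.T, A.T))                       # rows of A are unchanged
print(np.allclose(P_row @ np.array([2.0, -1.0]), 0.0))     # (2, -1) lies in N(A)
```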

  • Independent columns ($r = n$, the least-squares setting): $\mathbf{A}^{\rm T}\mathbf{A}$ is invertible and $\mathbf{A}^{+} = (\mathbf{A}^{\rm T}\mathbf{A})^{-1}\mathbf{A}^{\rm T}$ is a left inverse, $\mathbf{A}^{+}\mathbf{A} = \mathbf{I}_n$. Then $\mathbf{A}^{+}\mathbf{b}$ is the least-squares solution.
  • Independent rows ($r = m$, the underdetermined setting): $\mathbf{A}\mathbf{A}^{\rm T}$ is invertible and $\mathbf{A}^{+} = \mathbf{A}^{\rm T}(\mathbf{A}\mathbf{A}^{\rm T})^{-1}$ is a right inverse, $\mathbf{A}\mathbf{A}^{+} = \mathbf{I}_m$. Then $\mathbf{A}^{+}\mathbf{b}$ is the shortest solution of $\mathbf{A}\mathbf{x} = \mathbf{b}$, as the theorem below shows.
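Both full-rank formulas can be tried on small examples. The sketch below uses an illustrative tall matrix `A` with independent columns and its transpose `B` as the wide matrix with independent rows; the right-hand sides `b` and `c` are made up for the example.

```python
import numpy as np

# Independent columns (r = n): tall matrix, least-squares setting.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])
A_plus = np.linalg.solve(A.T @ A, A.T)    # (A^T A)^{-1} A^T, via a linear solve
x_hat = A_plus @ b                        # least-squares solution, here (5, -3)
print(np.allclose(A_plus @ A, np.eye(2)))            # left inverse
print(np.allclose(A_plus, np.linalg.pinv(A)))        # agrees with the pseudoinverse

# Independent rows (r = m): wide matrix, underdetermined setting.
B = A.T
c = np.array([3.0, 5.0])
B_plus = B.T @ np.linalg.inv(B @ B.T)     # B^T (B B^T)^{-1}
x_min = B_plus @ c                        # shortest solution, here (0, 1, 2)
print(np.allclose(B @ B_plus, np.eye(2)))            # right inverse
print(np.allclose(B @ x_min, c))                     # x_min really solves B x = c
```

In practice `np.linalg.pinv` or `np.linalg.lstsq` is usually preferable to forming $\mathbf{A}^{\rm T}\mathbf{A}$ or $\mathbf{A}\mathbf{A}^{\rm T}$, since doing so squares the condition number.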

When $\mathbf{A}\mathbf{x} = \mathbf{b}$ has many solutions, they form an affine flat $\mathbf{x}_0 + N(\mathbf{A})$. The pseudoinverse picks out the one closest to the origin.
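A short sketch of why the pseudoinverse solution is the shortest one, assuming $\mathbf{b}$ lies in the column space so that solutions exist. Write $\mathbf{x}^{+} = \mathbf{A}^{+}\mathbf{b}$ (a name introduced here). It solves the system because $\mathbf{A}\mathbf{A}^{+}$ projects onto the column space, and it lies in the row space because that is the range of $\mathbf{A}^{+}$. Any other solution differs from it by some $\mathbf{n} \in N(\mathbf{A})$, and the nullspace is orthogonal to the row space, so

$$
\mathbf{x} = \mathbf{x}^{+} + \mathbf{n}, \qquad
\|\mathbf{x}\|^2 = \|\mathbf{x}^{+}\|^2 + \|\mathbf{n}\|^2 \;\ge\; \|\mathbf{x}^{+}\|^2,
$$

with equality only when $\mathbf{n} = \mathbf{0}$.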

This is the second half of a single picture: for an overdetermined system the pseudoinverse delivers the least-squares fit, and for an underdetermined one it delivers the minimum-norm solution. In the general rank-deficient case it does both at once, returning the shortest of all least-squares minimizers.
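The general rank-deficient case can be illustrated numerically. In the sketch below the matrix, the right-hand side, and the nullspace vector `n` are chosen by hand for the example: `b` is not in the column space, so no exact solution exists, and the least-squares minimizers form a whole line.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])                # rank 1: neither rows nor columns independent
b = np.array([1.0, 0.0, 0.0])             # not in the column space

x_plus = np.linalg.pinv(A) @ b            # pseudoinverse solution
n = np.array([2.0, -1.0])                 # spans N(A), since A @ n = 0
x_other = x_plus + 0.5 * n                # another least-squares minimizer

res_plus = np.linalg.norm(A @ x_plus - b)
res_other = np.linalg.norm(A @ x_other - b)
print(np.isclose(res_plus, res_other))                       # same residual
print(np.linalg.norm(x_plus) < np.linalg.norm(x_other))      # but x_plus is shorter

# np.linalg.lstsq also returns the minimum-norm minimizer when A is rank-deficient.
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x_lstsq, x_plus))
```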

The construction above uses a particular SVD, but $\mathbf{A}^{+}$ does not depend on the choice. It is the unique matrix satisfying the four Moore–Penrose conditions

$$
\mathbf{A}\mathbf{A}^{+}\mathbf{A} = \mathbf{A}, \quad \mathbf{A}^{+}\mathbf{A}\mathbf{A}^{+} = \mathbf{A}^{+}, \quad (\mathbf{A}\mathbf{A}^{+})^{\rm T} = \mathbf{A}\mathbf{A}^{+}, \quad (\mathbf{A}^{+}\mathbf{A})^{\rm T} = \mathbf{A}^{+}\mathbf{A}.
$$

The first two say $\mathbf{A}^{+}$ acts as an inverse where it can; the last two say the products $\mathbf{A}\mathbf{A}^{+}$ and $\mathbf{A}^{+}\mathbf{A}$ are symmetric, hence orthogonal projections onto the column space and row space respectively.
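The four conditions are easy to test numerically. The sketch below defines a hypothetical helper, `moore_penrose_checks`, that reports each condition in order. It then compares the true pseudoinverse with a left inverse `L` that is not the pseudoinverse; both matrices are made up for the example. The left inverse passes three conditions but fails the symmetry of $\mathbf{A}\mathbf{L}$, which is what rules it out.

```python
import numpy as np

def moore_penrose_checks(A, X):
    """Return the four Moore-Penrose conditions for a candidate X, in order."""
    return (
        np.allclose(A @ X @ A, A),          # A X A = A
        np.allclose(X @ A @ X, X),          # X A X = X
        np.allclose((A @ X).T, A @ X),      # A X symmetric
        np.allclose((X @ A).T, X @ A),      # X A symmetric
    )

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
print(moore_penrose_checks(A, np.linalg.pinv(A)))   # (True, True, True, True)

# A left inverse that is not the pseudoinverse: L @ A_col = [[1]].
A_col = np.array([[1.0],
                  [0.0]])
L = np.array([[1.0, 5.0]])
print(moore_penrose_checks(A_col, L))               # (True, True, False, True)
```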