Eigenvalues and Eigenvectors

Naturally, eigenvalues are defined only for square matrices because for non-square matrices, the transformation changes the dimensions of the resulting vector, hence $\Av \vv \neq \lambda \vv$ for any pair $(\lambda, \vv)$. An $n\times n$ matrix has at most $n$ linearly independent eigenvectors, but it need not have that many. A matrix with $n$ linearly independent eigenvectors is called diagonalizable; matrices that fall short (e.g., a Jordan block) are called defective.
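To make the diagonalizable/defective distinction concrete, here is a minimal sketch that counts the independent eigenvectors belonging to one eigenvalue as $n - \mathrm{rank}(\Av - \lambda \Iv)$. The helper name `geometric_multiplicity` and the two test matrices are illustrative choices, not part of the text above.

```python
import numpy as np

def geometric_multiplicity(A, lam):
    """Number of linearly independent eigenvectors of A for eigenvalue lam.

    This is the dimension of the null space of (A - lam * I).
    """
    n = A.shape[0]
    return n - np.linalg.matrix_rank(A - lam * np.eye(n))

# A 2x2 Jordan block: eigenvalue 2 is repeated, but only one eigenvector direction exists.
jordan_block = np.array([[2.0, 1.0],
                         [0.0, 2.0]])

# A scaled identity: eigenvalue 2 is repeated, and every direction is an eigenvector.
scaled_identity = 2.0 * np.eye(2)

print(geometric_multiplicity(jordan_block, 2.0))     # defective: fewer than n
print(geometric_multiplicity(scaled_identity, 2.0))  # diagonalizable: n of them
```

Both matrices have the same eigenvalues, so the eigenvalues alone do not tell us whether a matrix is diagonalizable.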

One intuitive way to understand eigenvectors is to view a matrix $\Av$ as a transformation applied to a vector $\vv$. A transformation can rotate and/or scale a vector. There’s a very nice video about this by Grant Sanderson where he explains this idea with beautiful visuals. In essence, the eigenvectors of a matrix (transformation) are those nonzero vectors that are not rotated off their own line, only scaled. The factor by which such a vector is scaled is what we call the eigenvalue; a negative eigenvalue flips the vector along the same line.

Interactive: The directions that don’t rotate

For a $2 \times 2$ matrix

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$

the faint gray arrows are 12 unit vectors evenly spaced around the unit circle. The accent-colored arrows show their images $A \vv$ under multiplication by $A$. Most input directions get rotated and scaled; the eigenvectors are the directions (at most two, unless $A$ is a multiple of the identity) for which the output stays on the same line as the input. Drag the sliders to find them.
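The same experiment can be sketched numerically. The snippet below measures how far each of 12 evenly spaced unit vectors is turned by $A$, then repeats the measurement for the eigenvectors returned by `numpy.linalg.eig`. The matrix entries and the helper `turning_angle` are assumptions made for illustration.

```python
import numpy as np

# Example matrix, chosen only for illustration.
A = np.array([[1.5, 0.5],
              [0.3, 1.2]])

def turning_angle(A, v):
    """Signed angle in radians from v to A @ v (2D only)."""
    w = A @ v
    cross = v[0] * w[1] - v[1] * w[0]  # proportional to sin(angle)
    dot = v @ w                        # proportional to cos(angle)
    return np.arctan2(cross, dot)

# 12 unit vectors evenly spaced around the unit circle.
angles = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
unit_vectors = np.column_stack([np.cos(angles), np.sin(angles)])

for theta, v in zip(angles, unit_vectors):
    turn = np.degrees(turning_angle(A, v))
    print(f"input at {np.degrees(theta):6.1f} deg is turned by {turn:7.2f} deg")

# Eigenvectors are the columns of the second return value of eig.
eigenvalues, eigenvectors = np.linalg.eig(A)
eig_turns = [turning_angle(A, eigenvectors[:, i]) for i in range(2)]
print("turning angle of the eigenvectors:", eig_turns)
```

Generic directions come out with a nonzero turning angle, while the eigenvector directions are turned by zero up to rounding error.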

When the discriminant $(a-d)^2 + 4bc$ goes negative, the eigenvalues become a complex conjugate pair and no real direction is preserved (the transformation has a genuine rotational component).
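As a small worked check, the sketch below solves the $2 \times 2$ characteristic equation $\lambda^2 - (a+d)\lambda + (ad - bc) = 0$ with the quadratic formula; its discriminant $(a+d)^2 - 4(ad-bc)$ simplifies to $(a-d)^2 + 4bc$. The function name `eigenvalues_2x2` and the two sample matrices are illustrative assumptions.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via the quadratic formula.

    Returns (lambda_1, lambda_2, discriminant). The eigenvalues are returned
    as complex numbers so that a negative discriminant is handled uniformly.
    """
    trace = a + d
    disc = (a - d) ** 2 + 4 * b * c  # equals trace**2 - 4 * det
    root = cmath.sqrt(disc)          # complex square root works for disc < 0
    return (trace + root) / 2, (trace - root) / 2, disc

# Positive discriminant: two real eigenvalues.
l1, l2, disc = eigenvalues_2x2(1.5, 0.5, 0.3, 1.2)
print(disc, l1, l2)

# A 90-degree rotation: negative discriminant, complex conjugate pair.
r1, r2, rdisc = eigenvalues_2x2(0.0, -1.0, 1.0, 0.0)
print(rdisc, r1, r2)
```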

Figure (interactive): A = [[1.50, 0.50], [0.30, 1.20]], λ₁ = 1.765, λ₂ = 0.935. Legend: unit vectors (before), A · v (after), eigenvector v₁, eigenvector v₂.

The key property of eigenvectors can be seen by looking at $\Av^k$ for some positive integer $k$. Let’s say $\vv$ is an eigenvector of $\Av$ with eigenvalue $\lambda$. Then we have

$$\Av^k\vv = \Av^{k-1}(\lambda \vv) = \lambda (\Av^{k-1}\vv) = \dots = \lambda^k\vv.$$

Thus, if $\vv$ is an eigenvector of $\Av$ with eigenvalue $\lambda$ then $\vv$ is also an eigenvector of $\Av^k$ with eigenvalue $\lambda^k$. This also tells us another key fact about matrices:
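The power rule is easy to check numerically. The sketch below compares $\Av^k \vv$ with $\lambda^k \vv$ for each eigenpair of a small example; the matrix and the exponent `k` are arbitrary illustrative choices.

```python
import numpy as np

A = np.array([[1.5, 0.5],
              [0.3, 1.2]])  # illustrative matrix
k = 5                       # illustrative exponent

eigenvalues, eigenvectors = np.linalg.eig(A)
A_to_k = np.linalg.matrix_power(A, k)

# Iterate over eigenpairs; the eigenvectors are the columns of `eigenvectors`.
checks = []
for lam, v in zip(eigenvalues, eigenvectors.T):
    checks.append(np.allclose(A_to_k @ v, lam**k * v))

print(checks)
```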

If $\Av$ is diagonalizable, its $n$ linearly independent eigenvectors form a basis for $\mathbb{C}^n$ (or $\mathbb{R}^n$ when the eigenvectors can be chosen real), so any vector $\xv$ can be expressed as a linear combination of the eigenvectors $\vv_i$:

$$\xv = \sum_{i=1}^{n} c_i\vv_i.$$
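Combining this expansion with the power rule for eigenvectors suggests $\Av^k \xv = \sum_{i=1}^{n} c_i \lambda_i^k \vv_i$. Here is a minimal sketch that finds the coefficients $c_i$ by solving a linear system and compares the two ways of computing $\Av^k \xv$. The matrix, the vector `x`, and the exponent are illustrative assumptions.

```python
import numpy as np

A = np.array([[1.5, 0.5],
              [0.3, 1.2]])  # illustrative diagonalizable matrix
x = np.array([1.0, 2.0])    # illustrative vector
k = 4

eigenvalues, V = np.linalg.eig(A)  # columns of V are the eigenvectors v_i

# Coefficients c_i of x in the eigenvector basis: solve V @ c = x.
c = np.linalg.solve(V, x)
x_rebuilt = V @ c

# A^k x computed directly, and through the eigenvector expansion.
direct = np.linalg.matrix_power(A, k) @ x
via_expansion = V @ (c * eigenvalues**k)

print(direct, via_expansion)
```

For large $k$ the term with the largest $|\lambda_i|$ dominates the sum, which is why repeated multiplication by $\Av$ tends to align a vector with the leading eigenvector.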

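A matrix $\Bv$ is said to be similar to $\Av$ if $\Bv = \Mv^{-1}\Av \Mv$ for some invertible matrix $\Mv$. Similar matrices have the same eigenvalues: if $\Av\vv = \lambda\vv$, then $\Bv(\Mv^{-1}\vv) = \Mv^{-1}\Av\vv = \lambda(\Mv^{-1}\vv)$.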

This simple result is very useful in computation. For example, software such as MATLAB uses a sequence of matrices $\Mv_1, \Mv_2, \ldots, \Mv_k$ to reduce $\Av$ to $\Bv_k$, where $\Bv_0 = \Av$ and $\Bv_i = \Mv_i^{-1}\Bv_{i-1}\Mv_i$ for $i>0$. Every $\Bv_i$ is similar to $\Av$, so the eigenvalues never change along the way. The sequence can be chosen carefully so that $\Bv_k$ approaches a diagonal matrix (or, for a general non-symmetric $\Av$, a triangular one), and the eigenvalues show up on the diagonal. This is far more practical for computing eigenvalues than finding the roots of the characteristic polynomial.
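One classical way to build such a sequence is the unshifted QR iteration, sketched below for a small symmetric matrix. This is a teaching sketch under stated assumptions, not a description of what MATLAB or any other package actually runs; production routines use more refined variants. The function name `qr_iteration`, the step count, and the matrices `A` and `M` are illustrative.

```python
import numpy as np

def qr_iteration(A, num_steps=100):
    """Unshifted QR iteration: B_i = Q_i^{-1} B_{i-1} Q_i = R_i Q_i."""
    B = np.array(A, dtype=float)
    for _ in range(num_steps):
        Q, R = np.linalg.qr(B)  # B = Q R with Q orthogonal
        B = R @ Q               # equals Q.T @ B @ Q, so B stays similar to A
    return B

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])  # symmetric example with distinct eigenvalues

B = qr_iteration(A)
approx_eigenvalues = np.sort(np.diag(B))
reference = np.linalg.eigvalsh(A)  # library eigenvalues, ascending order
print(B)
print(approx_eigenvalues, reference)

# Any invertible M gives a similar matrix with the same eigenvalues.
M = np.array([[1.0, 2.0],
              [0.0, 1.0]])
B_similar = np.linalg.inv(M) @ A @ M
similar_eigenvalues = np.sort(np.linalg.eigvals(B_similar))
```

For this example the off-diagonal entries of `B` shrink toward zero, and the diagonal agrees with the eigenvalues reported by `numpy.linalg.eigvalsh`.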

For a matrix $\Av$, the eigenspace associated with a set of eigenvalues is defined to be the subspace spanned by the eigenvectors associated with those eigenvalues.

Some obvious but important facts are as follows:

  • The sum of the eigenvalues of $\Av$ (counted with multiplicity) is ${\rm Trace}(\Av)$, which is the sum of the elements on the diagonal.
  • The product of the eigenvalues of $\Av$ (counted with multiplicity) is $\det \Av$.
  • In general, eigenvalues of $\Av + \Bv$ or $\Av\Bv$ cannot be inferred directly from eigenvalues of $\Av$ and $\Bv$.
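These three facts can be checked on small examples. In the sketch below, the matrix `A` is an arbitrary illustration, and `N1`, `N2` are two nilpotent matrices picked because all of their eigenvalues are zero while their sum and product have nonzero eigenvalues.

```python
import numpy as np

A = np.array([[1.5, 0.5],
              [0.3, 1.2]])  # illustrative matrix
eigenvalues = np.linalg.eigvals(A)

sum_matches_trace = np.isclose(eigenvalues.sum(), np.trace(A))
product_matches_det = np.isclose(eigenvalues.prod(), np.linalg.det(A))
print(sum_matches_trace, product_matches_det)

# Two matrices whose eigenvalues are all zero.
N1 = np.array([[0.0, 1.0],
               [0.0, 0.0]])
N2 = np.array([[0.0, 0.0],
               [1.0, 0.0]])

# The eigenvalues of the sum and of the product are not all zero.
eig_of_sum = np.sort(np.linalg.eigvals(N1 + N2))
eig_of_product = np.sort(np.linalg.eigvals(N1 @ N2))
print(np.linalg.eigvals(N1), np.linalg.eigvals(N2))
print(eig_of_sum, eig_of_product)
```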

The first two facts can be proved easily using the characteristic polynomial

$$P(\lambda) = \det(\Av - \lambda \Iv) = 0.$$

We know that the sum of the roots of such a polynomial is determined by the coefficient of $\lambda^{n-1}$, $n$ being the degree of the polynomial. Also, the constant term in the polynomial determines the product of the roots.
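As a worked instance, take the $2 \times 2$ case with entries $a, b, c, d$ (symbols chosen here for illustration). Expanding the determinant gives

$$\det(\Av - \lambda \Iv) = \det\begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix} = \lambda^2 - (a + d)\lambda + (ad - bc),$$

while factoring the same polynomial through its roots $\lambda_1, \lambda_2$ gives

$$(\lambda - \lambda_1)(\lambda - \lambda_2) = \lambda^2 - (\lambda_1 + \lambda_2)\lambda + \lambda_1\lambda_2.$$

Matching coefficients yields $\lambda_1 + \lambda_2 = a + d = {\rm Trace}(\Av)$ and $\lambda_1\lambda_2 = ad - bc = \det \Av$.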

If $\Sv$ is a symmetric matrix, then

  • Eigenvalues of $\Sv$ are real if $\Sv$ is real.
  • Eigenvectors of $\Sv$ corresponding to distinct eigenvalues are orthogonal, and an orthonormal basis of eigenvectors can always be chosen (even when eigenvalues are repeated, by picking an orthonormal basis within each eigenspace).
  • We always have a full set of eigenvectors, even when some eigenvalues are repeated. For example, consider the identity matrix. It has only one eigenvalue, $\lambda=1$, but every vector is an eigenvector.
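A quick numerical illustration of these properties, using `numpy.linalg.eigh` (NumPy's routine for symmetric/Hermitian matrices). The matrix `S` below is an assumed example, picked because it has a repeated eigenvalue.

```python
import numpy as np

# Symmetric example with eigenvalues 1, 1, 4 (one of them repeated).
S = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])

# eigh returns real eigenvalues in ascending order and eigenvectors as columns of Q.
eigenvalues, Q = np.linalg.eigh(S)

print(eigenvalues)
print(Q.T @ Q)  # close to the identity: the columns are orthonormal
```

Even though the eigenvalue 1 is repeated, we still get three mutually orthogonal unit eigenvectors.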

Let $\Sv$ be an $n\times n$ symmetric matrix with eigenvalues $\lambda_1, \ldots, \lambda_n$. If we consider the matrix

$$\Lambdav = \begin{pmatrix} \lambda_1 & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & \lambda_n \end{pmatrix},$$

we see that $\Sv$ and $\Lambdav$ are similar matrices. This means that there must be an invertible $\Mv$ such that $\Sv = \Mv\Lambdav \Mv^{-1}$ (equivalently, $\Lambdav = \Mv^{-1}\Sv\Mv$). It is not hard to see that $\Mv$ is the eigenvector matrix (columns of $\Mv$ are eigenvectors of $\Sv$), and we have the spectral decomposition $\Sv = \Mv\Lambdav \Mv^{-1} = \Qv\Lambdav \Qv^{\rm T}$, where we use $\Qv$ for $\Mv$ when the eigenvectors are chosen orthonormal so that $\Mv^{-1} = \Mv^{\rm T}$.
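The spectral decomposition can be verified by rebuilding $\Sv$ from its eigenvalues and orthonormal eigenvectors. This is a minimal sketch; the $2 \times 2$ matrix is an arbitrary symmetric example.

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 3.0]])  # illustrative symmetric matrix

eigenvalues, Q = np.linalg.eigh(S)  # columns of Q are orthonormal eigenvectors
Lambda = np.diag(eigenvalues)       # diagonal eigenvalue matrix

S_rebuilt = Q @ Lambda @ Q.T        # spectral decomposition Q Lambda Q^T
Lambda_from_S = Q.T @ S @ Q         # the reverse direction: Q^T S Q

print(S_rebuilt)
print(Lambda_from_S)
```

No matrix inverse is computed here: for orthonormal eigenvectors the transpose plays that role.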

For a diagonalizable square matrix $\Av$ (which may not be symmetric), we can factorize it as

$$\Av = \Xv\Lambdav \Xv^{-1},$$

where $\Xv$ is the eigenvector matrix and $\Lambdav$ is the diagonal eigenvalue matrix. This factorization is another way to look at the fact that the eigenvectors of powers of $\Av$ are the same as those of $\Av$, with correspondingly exponentiated eigenvalues: $\Av^k = \Xv\Lambdav^k \Xv^{-1}$. It is only when $\Av$ is real symmetric that we can use $\Xv^{-1} = \Xv^{\rm T}$, since the eigenvectors can be chosen orthonormal in that case (more generally, for a normal matrix, $\Xv^{-1} = \Xv^{\rm H}$, the conjugate transpose). A non-diagonalizable (defective) matrix does not admit this factorization, and one has to use the more general Jordan normal form instead.
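For a non-symmetric but diagonalizable matrix, the factorization still works as long as we use a genuine inverse. The sketch below rebuilds $\Av$ and one of its powers from the eigen-decomposition, and checks that the transpose is not a substitute for the inverse here. The matrix and exponent are illustrative assumptions.

```python
import numpy as np

A = np.array([[1.5, 0.5],
              [0.3, 1.2]])  # non-symmetric, diagonalizable example
k = 6

eigenvalues, X = np.linalg.eig(A)  # columns of X are the eigenvectors
X_inv = np.linalg.inv(X)

A_rebuilt = X @ np.diag(eigenvalues) @ X_inv     # X Lambda X^{-1}
A_power = X @ np.diag(eigenvalues**k) @ X_inv    # X Lambda^k X^{-1}
A_power_direct = np.linalg.matrix_power(A, k)

# The eigenvectors of this A are not orthogonal, so X^T differs from X^{-1}.
transpose_is_inverse = np.allclose(X_inv, X.T)

print(np.allclose(A_rebuilt, A), np.allclose(A_power, A_power_direct))
print(transpose_is_inverse)
```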