Eigenvalues and Eigenvectors

Naturally, eigenvalues are defined only for square matrices: a non-square $\Av$ maps $\vv$ to a vector of a different dimension, so $\Av \vv = \lambda \vv$ cannot hold for any pair $(\lambda, \vv)$. An $n\times n$ matrix has at most $n$ linearly independent eigenvectors, but it need not have that many. A matrix with $n$ linearly independent eigenvectors is called diagonalizable; matrices that fall short (e.g., a Jordan block of size two or more) are called defective.
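To make the diagonalizable/defective distinction concrete, here is a minimal NumPy sketch that counts the linearly independent eigenvectors belonging to a given eigenvalue. The helper name `geometric_multiplicity` and the tolerance are illustrative choices, not something from the text above.

```python
import numpy as np

def geometric_multiplicity(A, lam, tol=1e-10):
    """Dimension of the null space of (A - lam*I), i.e. the number of
    linearly independent eigenvectors for the eigenvalue lam."""
    n = A.shape[0]
    return n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)

# A 2x2 Jordan block: eigenvalue 2 is repeated, but there is only one
# eigenvector direction.
jordan = np.array([[2.0, 1.0],
                   [0.0, 2.0]])

# A diagonal matrix with the same repeated eigenvalue: every nonzero
# vector is an eigenvector.
diagonal = np.array([[2.0, 0.0],
                     [0.0, 2.0]])

jordan_count = geometric_multiplicity(jordan, 2.0)      # 1, so defective
diagonal_count = geometric_multiplicity(diagonal, 2.0)  # 2, so diagonalizable
```

Both matrices have the same eigenvalues, so the eigenvalues alone do not tell us whether a matrix is diagonalizable.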

Geometric interpretation

One intuitive way to understand eigenvectors is to view a matrix $\Av$ as a transformation applied to a vector $\vv$. In general, a transformation can rotate and/or scale a vector. There’s a very nice video about this by Grant Sanderson where he explains this idea with beautiful visuals. In essence, the eigenvectors of a matrix (transformation) are those vectors that are not rotated off their own line, but only scaled. The scale factor is what we call the eigenvalue; a negative eigenvalue flips the vector to point the opposite way along the same line.

Interactive: The directions that don’t rotate

For a $2 \times 2$ matrix

A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},

the faint gray arrows are 12 unit vectors evenly spaced around the unit circle. The accent-colored arrows show their images $A \vv$ under multiplication by $A$ . Most input directions get rotated and scaled; the eigenvectors are the (at most two) directions for which the output points the same way as the input. Drag the sliders to find them.

When the discriminant $(a-d)^2 + 4bc$ goes negative, the eigenvalues become a complex conjugate pair and no real direction is preserved (the transformation has a genuine rotational component).

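[Interactive figure. Slider values: $a = 1.50$, $b = 0.50$, $c = 0.30$, $d = 1.20$, so that $A = \begin{pmatrix} 1.50 & 0.50 \\ 0.30 & 1.20 \end{pmatrix}$ with $\lambda_1 = 1.765$ and $\lambda_2 = 0.935$. Legend: unit vectors (before), $A \vv$ (after), eigenvector $\vv_1$, eigenvector $\vv_2$.]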
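The eigenvalues quoted for the slider values $a = 1.50$, $b = 0.50$, $c = 0.30$, $d = 1.20$ can be reproduced with the quadratic formula, using the same discriminant $(a-d)^2 + 4bc$. The following is a small sketch in plain Python; the function name `eigenvalues_2x2` is made up for illustration.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the quadratic formula applied to
    lambda^2 - (a + d)*lambda + (a*d - b*c) = 0."""
    disc = (a - d) ** 2 + 4 * b * c   # discriminant from the text
    root = cmath.sqrt(disc)           # complex square root handles disc < 0
    return (a + d + root) / 2, (a + d - root) / 2, disc

# Slider values from the interactive figure: two real eigenvalues.
lam1, lam2, disc = eigenvalues_2x2(1.5, 0.5, 0.3, 1.2)
print(round(lam1.real, 3), round(lam2.real, 3))   # prints 1.765 0.935

# A 90-degree rotation: negative discriminant, complex conjugate pair.
rot1, rot2, rot_disc = eigenvalues_2x2(0.0, -1.0, 1.0, 0.0)
```

For the rotation matrix the discriminant is $-4$ and the eigenvalues come out as $\pm i$, which is the case where no real direction is preserved.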

Properties

The key property of eigenvectors can be seen by looking at $\Av^k$ for some positive integer $k$ . Let’s say $\vv$ is an eigenvector of $\Av$ with eigenvalue $\lambda$ . Then we have

\Av^k\vv = \Av^{k-1}(\lambda \vv) = \lambda (\Av^{k-1}\vv) = \dots = \lambda^k\vv.

Thus, if $\vv$ is an eigenvector of $\Av$ with eigenvalue $\lambda$ then $\vv$ is also an eigenvector of $\Av^k$ with eigenvalue $\lambda^k$ . This also tells us another key fact about matrices:

If $\Av$ is diagonalizable, its $n$ linearly independent eigenvectors form a basis for $\mathbb{C}^n$ (or $\mathbb{R}^n$ when the eigenvectors can be chosen real), so any vector $\xv$ can be expressed as a linear combination of the eigenvectors $\vv_i$ :

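\xv = \sum_{i=1}^{n} c_i\vv_i.

Applying $\Av^k$ term by term and using $\Av^k\vv_i = \lambda_i^k\vv_i$ gives $\Av^k\xv = \sum_{i=1}^{n} c_i\lambda_i^k\vv_i$, so the action of any power of $\Av$ on any vector is determined by the eigenvalues and eigenvectors alone.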
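As a numerical check of this expansion, the sketch below computes $\Av^k\xv$ in two ways: by scaling each coefficient $c_i$ by $\lambda_i^k$, and by multiplying out the matrix power directly. The matrix, the vector and the value of $k$ are arbitrary illustrative choices.

```python
import numpy as np

A = np.array([[1.5, 0.5],
              [0.3, 1.2]])
x = np.array([1.0, 2.0])
k = 5

# Columns of V are eigenvectors v_i; lams[i] is the matching eigenvalue.
lams, V = np.linalg.eig(A)

# Solve V c = x for the coefficients c_i in x = sum_i c_i v_i.
c = np.linalg.solve(V, x)

# A^k x from the expansion: sum_i c_i * lam_i^k * v_i.
via_eigen = V @ (c * lams**k)

# Reference: form A^k explicitly and apply it to x.
direct = np.linalg.matrix_power(A, k) @ x

print(np.allclose(via_eigen, direct))   # prints True
```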

Similarity

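A matrix $\Bv$ is said to be similar to $\Av$ if $\Bv = \Mv^{-1}\Av \Mv$ for some invertible matrix $\Mv$. Similar matrices have the same eigenvalues: if $\Bv\vv = \lambda\vv$, then $\Av(\Mv\vv) = \Mv\Bv\vv = \lambda(\Mv\vv)$, so $\Mv\vv$ is an eigenvector of $\Av$ with the same eigenvalue.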

This simple result is very useful in computation. For example, software such as MATLAB uses a sequence of matrices $\Mv_1, \Mv_2, \ldots, \Mv_k$ to reduce $\Av$ to $\Bv_k$, where $\Bv_0 = \Av$ and $\Bv_i = \Mv_i^{-1}\Bv_{i-1}\Mv_i$ for $i>0$. Every $\Bv_i$ is similar to $\Av$ and therefore has the same eigenvalues. The sequence can be chosen carefully so that $\Bv_k$ is (nearly) triangular, or diagonal when $\Av$ is symmetric, and the eigenvalues then show up on its diagonal. This is far more practical than finding the roots of the characteristic polynomial directly.
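One classical way to build such a sequence is the unshifted QR iteration, sketched below with NumPy. This is a deliberately simplified illustration and not a description of what MATLAB actually runs; production libraries typically add refinements such as a preliminary Hessenberg reduction and shifts. The example matrix and the iteration count are arbitrary.

```python
import numpy as np

A = np.array([[1.5, 0.5],
              [0.3, 1.2]])

B = A.copy()
for _ in range(100):
    Q, R = np.linalg.qr(B)   # factor B = Q R with Q orthogonal
    B = R @ Q                # R Q = Q^{-1} (Q R) Q, so the new B is similar to the old one

# B is now (numerically) upper triangular; its diagonal holds the eigenvalues.
qr_eigs = np.sort(np.diag(B))
ref_eigs = np.sort(np.linalg.eigvals(A))

print(np.allclose(qr_eigs, ref_eigs))   # prints True
```

Here each step uses $\Mv_i = \Qv$ from the current factorization. The plain iteration converges for this matrix because its two eigenvalues are real with different magnitudes; it is not guaranteed to converge in general.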

For a matrix $\Av$ , the eigenspace associated with a set of eigenvalues is defined to be the subspace spanned by the eigenvectors associated with those eigenvalues.

Some obvious but important facts are as follows:

The sum of the eigenvalues of $\Av$ (counted with algebraic multiplicity) is ${\rm Trace}(\Av)$, which is the sum of the elements on the diagonal.
The product of the eigenvalues of $\Av$ (counted the same way) is $\det \Av$.
In general, eigenvalues of $\Av + \Bv$ or $\Av\Bv$ cannot be inferred directly from eigenvalues of $\Av$ and $\Bv$ .

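The first two facts can be proved easily using the characteristic polynomial

P(\lambda) = \det(\Av - \lambda \Iv),

whose roots (the solutions of $P(\lambda) = 0$) are the eigenvalues of $\Av$. The sum of the roots of such a polynomial is determined by the coefficient of $\lambda^{n-1}$, $n$ being the degree of the polynomial; here that coefficient is $(-1)^{n-1}\,{\rm Trace}(\Av)$ against a leading coefficient of $(-1)^n$, which gives a sum of ${\rm Trace}(\Av)$. Likewise, the constant term determines the product of the roots, and here it is $P(0) = \det \Av$.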
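The three facts listed above are easy to check numerically. The sketch below uses an arbitrary random $4\times 4$ matrix for the trace and determinant identities, and a pair of hand-picked $2\times 2$ matrices (named `N1` and `N2` here purely for illustration) to show that the eigenvalues of a sum or product are not determined by the eigenvalues of the factors.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

# Eigenvalues may be complex for a nonsymmetric real matrix.
eigs = np.linalg.eigvals(A)
eig_sum = eigs.sum()     # should match trace(A)
eig_prod = eigs.prod()   # should match det(A)

# N1 and N2 both have eigenvalues {0, 0} ...
N1 = np.array([[0.0, 1.0],
               [0.0, 0.0]])
N2 = np.array([[0.0, 0.0],
               [1.0, 0.0]])

# ... yet N1 + N2 has eigenvalues {-1, 1} and N1 @ N2 has eigenvalues {0, 1}.
sum_eigs = np.sort(np.linalg.eigvals(N1 + N2))
prod_eigs = np.sort(np.linalg.eigvals(N1 @ N2))
```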

Symmetric matrices

If $\Sv$ is a symmetric matrix, then

Eigenvalues of $\Sv$ are real if $\Sv$ is real.
Eigenvectors of $\Sv$ corresponding to distinct eigenvalues are orthogonal, and an orthonormal basis of eigenvectors can always be chosen (even when eigenvalues are repeated, by picking an orthonormal basis within each eigenspace).
We always have a full set of eigenvectors, even when some eigenvalues are repeated. For example, consider the identity matrix. It has only one eigenvalue, $\lambda=1$ , but every vector is an eigenvector.

Let $\Sv$ be an $n\times n$ symmetric matrix with eigenvalues $\lambda_1, \ldots, \lambda_n$ . If we consider the matrix

\Lambdav = \begin{pmatrix} \lambda_1 & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & \lambda_n \end{pmatrix},

we see that $\Sv$ and $\Lambdav$ are similar matrices. This means that there must be an invertible $\Mv$ such that $\Sv = \Mv\Lambdav \Mv^{-1}$ (equivalently, $\Lambdav = \Mv^{-1}\Sv\Mv$). It is not hard to see that $\Mv$ is the eigenvector matrix (the columns of $\Mv$ are eigenvectors of $\Sv$, in the same order as the eigenvalues in $\Lambdav$), and we have the spectral decomposition $\Sv = \Mv\Lambdav \Mv^{-1} = \Qv\Lambdav \Qv^{\rm T}$, where we write $\Qv$ for $\Mv$ when the eigenvectors are chosen orthonormal, so that $\Qv^{-1} = \Qv^{\rm T}$.
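A quick way to see the spectral decomposition in action is NumPy's `np.linalg.eigh`, which is intended for symmetric (Hermitian) input and returns real eigenvalues together with orthonormal eigenvectors. The matrix below is an arbitrary symmetric example.

```python
import numpy as np

S = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# lams holds the eigenvalues in ascending order; the columns of Q are the
# corresponding orthonormal eigenvectors.
lams, Q = np.linalg.eigh(S)
Lambda = np.diag(lams)

reconstructed = Q @ Lambda @ Q.T   # should reproduce S
orthonormal = Q.T @ Q              # should be the identity matrix

print(np.allclose(reconstructed, S), np.allclose(orthonormal, np.eye(3)))   # prints True True
```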

General matrices

For a diagonalizable square matrix $\Av$ (which may not be symmetric), we can factorize it as

\Av = \Xv\Lambdav \Xv^{-1},

where $\Xv$ is the eigenvector matrix and $\Lambdav$ is the diagonal eigenvalue matrix. This factorization is another way to look at the fact that the eigenvectors of powers of $\Av$ are the same as those of $\Av$, with correspondingly exponentiated eigenvalues: $\Av^k = \Xv\Lambdav^k \Xv^{-1}$, because the inner factors $\Xv^{-1}\Xv$ cancel. It is only when $\Av$ is real symmetric that we can use $\Xv^{-1} = \Xv^{\rm T}$, since the eigenvectors can then be chosen orthonormal; more generally, for a normal matrix $\Xv$ can be chosen unitary, so that $\Xv^{-1}$ is its conjugate transpose. A non-diagonalizable (defective) matrix does not admit this factorization, and one has to use the more general Jordan normal form instead.
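The sketch below illustrates the factorization for a nonsymmetric but diagonalizable matrix, again with NumPy. It rebuilds $\Av$ and a power of $\Av$ from the eigenvector matrix, and confirms that the transpose of the eigenvector matrix is not its inverse in this case. The matrix and the exponent are arbitrary illustrative choices.

```python
import numpy as np

A = np.array([[1.5, 0.5],
              [0.3, 1.2]])   # not symmetric
k = 3

lams, X = np.linalg.eig(A)   # columns of X are (unit-length) eigenvectors
X_inv = np.linalg.inv(X)

A_rebuilt = X @ np.diag(lams) @ X_inv     # X Lambda X^{-1}, should equal A
A_power = X @ np.diag(lams**k) @ X_inv    # X Lambda^k X^{-1}, should equal A^k

# For this nonsymmetric matrix the eigenvectors are not orthogonal,
# so X^T X is not the identity and X^T cannot stand in for X^{-1}.
transpose_is_inverse = np.allclose(X.T @ X, np.eye(2))

print(np.allclose(A_rebuilt, A), transpose_is_inverse)   # prints True False
```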