
Eigenvalues and Eigenvectors

Naturally, eigenvalues are defined only for square matrices: for a non-square matrix, the transformation changes the dimension of the output vector, so $A\mathbf{v} = \lambda\mathbf{v}$ cannot hold for any pair $(\lambda, \mathbf{v})$. Typically, an $n\times n$ matrix has $n$ independent eigenvectors with $A\mathbf{v}_i = \lambda_i\mathbf{v}_i$ (though defective matrices have fewer).

One intuitive way to understand eigenvectors is to look at any matrix $A$ as a transformation applied to a vector $\mathbf{v}$. A transformation can rotate and/or scale a vector. There's a very nice video about this by Grant Sanderson where he explains the idea with beautiful visuals. In essence, the eigenvectors of a matrix (transformation) are those vectors that are not rotated but only scaled, without changing direction. The factor by which such a vector is scaled is what we call the eigenvalue.

The key property of eigenvectors can be seen by looking at $A^k$ for some positive integer $k$. Let's say $\mathbf{v}$ is an eigenvector of $A$ with eigenvalue $\lambda$. Then we have

$$A^k\mathbf{v} = A^{k-1}(\lambda \mathbf{v}) = \lambda (A^{k-1}\mathbf{v}) = \dots = \lambda^k\mathbf{v}.$$

Thus, if $\mathbf{v}$ is an eigenvector of $A$ with eigenvalue $\lambda$, then $\mathbf{v}$ is also an eigenvector of $A^k$ with eigenvalue $\lambda^k$: taking powers of a matrix preserves its eigenvectors and raises its eigenvalues to the same power.
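
This power property is easy to check numerically. Below is a minimal sketch using NumPy; the matrix is an arbitrary example chosen for illustration:

```python
import numpy as np

# Example matrix chosen for illustration; any diagonalizable matrix works.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lams, V = np.linalg.eig(A)
lam, v = lams[0], V[:, 0]   # one eigenvalue/eigenvector pair

# A^3 v should equal lambda^3 v.
A3 = np.linalg.matrix_power(A, 3)
print(np.allclose(A3 @ v, lam**3 * v))
```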

When $A$ has $n$ independent eigenvectors, they form a basis for $\mathbb{R}^n$, so any vector $\mathbf{x}$ can be expressed as a linear combination of the eigenvectors $\mathbf{v}_i$:

$$\mathbf{x} = \sum_{i=1}^{n} c_i\mathbf{v}_i.$$
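
Finding the coefficients $c_i$ amounts to solving a linear system: if $V$ has the eigenvectors as columns, then $Vc = \mathbf{x}$. A small NumPy sketch, with an arbitrary example matrix that has independent eigenvectors:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lams, V = np.linalg.eig(A)   # columns of V are eigenvectors

x = np.array([1.0, 5.0])
c = np.linalg.solve(V, x)    # coefficients in the eigenbasis

# x is recovered as the linear combination sum_i c_i v_i.
print(np.allclose(V @ c, x))
```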

A matrix $B$ is said to be similar to $A$ if $B = M^{-1}AM$ for some invertible matrix $M$. Similar matrices have the same eigenvalues: if $B\mathbf{x} = \lambda\mathbf{x}$ with $B = M^{-1}AM$, then $A(M\mathbf{x}) = \lambda(M\mathbf{x})$, so $\lambda$ is also an eigenvalue of $A$, with eigenvector $M\mathbf{x}$.

This simple result is very useful in computation. For example, software such as MATLAB uses a sequence of matrices $M_1, M_2, \ldots, M_k$ to reduce $A$ to $B_k$, where $B_0 = A$ and $B_i = M_i^{-1}B_{i-1}M_i$ for $i > 0$. The sequence can be chosen carefully so that $B_k$ approaches an upper-triangular matrix (diagonal when $A$ is symmetric); either way, the eigenvalues show up on the diagonal. This lets us compute the eigenvalues of a matrix much faster than working with the characteristic polynomial directly.
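
The underlying fact, that similar matrices share the same eigenvalues, can be checked directly. The matrices below are arbitrary examples: $A$ is triangular so its eigenvalues (1, 2, 3) are visible on its diagonal, and $M$ is any invertible matrix:

```python
import numpy as np

A = np.array([[3.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 1.0]])   # triangular: eigenvalues 3, 2, 1
M = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])   # invertible (det = 3)

B = np.linalg.inv(M) @ A @ M      # B is similar to A

# Similar matrices have identical eigenvalues (up to ordering).
eA = np.sort_complex(np.linalg.eigvals(A))
eB = np.sort_complex(np.linalg.eigvals(B))
print(np.allclose(eA, eB))
```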

For a matrix $A$, the eigenspace associated with a set of eigenvalues is defined to be the subspace spanned by the eigenvectors associated with those eigenvalues.

Some obvious but important facts are as follows:

  • The sum of the eigenvalues of $A$ is $\mathrm{Trace}(A)$, the sum of the entries on the diagonal.
  • The product of the eigenvalues of $A$ is $\det A$.
  • In general, the eigenvalues of $A + B$ or $AB$ cannot be inferred directly from the eigenvalues of $A$ and $B$.
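
The first two facts are easy to verify numerically; a quick NumPy check on an arbitrary example matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lams = np.linalg.eigvals(A)

print(np.isclose(lams.sum(), np.trace(A)))        # sum of eigenvalues equals the trace
print(np.isclose(lams.prod(), np.linalg.det(A)))  # product equals the determinant
```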

The first two facts can be proved easily using the characteristic polynomial

$$P(\lambda) = \det(A - \lambda I) = 0.$$

We know that the sum of the roots of such a polynomial is determined by the coefficient of $\lambda^{n-1}$, where $n$ is the degree of the polynomial, and the product of the roots is determined by the constant term.
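
NumPy can produce the characteristic polynomial's coefficients directly. Note that `np.poly` uses the monic convention $\det(\lambda I - A)$ rather than $\det(A - \lambda I)$, so the coefficient of $\lambda^{n-1}$ is $-\mathrm{Trace}(A)$ and the constant term is $(-1)^n \det A$. The example matrix is arbitrary:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
n = A.shape[0]

coeffs = np.poly(A)   # monic coefficients of det(lambda*I - A), highest degree first

print(np.isclose(coeffs[1], -np.trace(A)))                 # coefficient of lambda^{n-1}
print(np.isclose(coeffs[-1], (-1)**n * np.linalg.det(A)))  # constant term
```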

If $S$ is a symmetric matrix, then

  • Eigenvalues of $S$ are real if $S$ is real.
  • Eigenvectors of $S$ corresponding to distinct eigenvalues are orthogonal; within a repeated eigenvalue, they can be chosen orthogonal.
  • We may have a full set of eigenvectors even if some eigenvalues are repeated. For example, the identity matrix has only one eigenvalue, $\lambda = 1$, yet every nonzero vector is an eigenvector.
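
The first two properties can be seen with `np.linalg.eigh`, NumPy's eigenvalue routine for symmetric/Hermitian matrices; the example matrix below is arbitrary but symmetric:

```python
import numpy as np

S = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
lams, Q = np.linalg.eigh(S)   # eigh assumes a symmetric/Hermitian input

print(np.all(np.isreal(lams)))          # eigenvalues are real
print(np.allclose(Q.T @ Q, np.eye(3)))  # eigenvectors are orthonormal
```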

Let $S$ be an $n\times n$ symmetric matrix with eigenvalues $\lambda_1, \ldots, \lambda_n$. If we consider the matrix

$$\Lambda = \begin{pmatrix} \lambda_1 & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & \lambda_n \end{pmatrix},$$

we see that $S$ and $\Lambda$ are similar matrices. This means there must be an invertible $M$ with $S = M\Lambda M^{-1}$. It is not hard to see that $M$ is the eigenvector matrix (the columns of $M$ are eigenvectors of $S$). Since the eigenvectors of a symmetric matrix can be chosen orthonormal, we may take $M = Q$ with $Q^{-1} = Q^{\rm T}$, which gives the spectral decomposition $S = Q\Lambda Q^{\rm T}$.
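
A quick numerical check of the spectral decomposition, again with an arbitrary symmetric example:

```python
import numpy as np

S = np.array([[5.0, 2.0],
              [2.0, 1.0]])
lams, Q = np.linalg.eigh(S)
Lam = np.diag(lams)

# Spectral decomposition: S = Q Lam Q^T.
print(np.allclose(Q @ Lam @ Q.T, S))
```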

For a general square matrix $A$ (not necessarily symmetric) with a full set of $n$ independent eigenvectors, we can factorize it as

$$A = X\Lambda X^{-1},$$

where $X$ is the eigenvector matrix and $\Lambda$ is the diagonal eigenvalue matrix. This factorization is another way to see that the eigenvectors of powers of $A$ are the same as those of $A$, with the eigenvalues raised to the corresponding power: $A^k = X\Lambda^k X^{-1}$. It is only when $A$ is symmetric that we can use $X^{-1} = X^{\rm T}$, since the eigenvectors are orthogonal in that case.
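
The factorization and its consequence for powers can both be verified in a few lines; the example matrix is arbitrary but has distinct eigenvalues, so it is diagonalizable:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])   # not symmetric; eigenvalues 1 and 3 are distinct
lams, X = np.linalg.eig(A)
Lam = np.diag(lams)
Xinv = np.linalg.inv(X)

print(np.allclose(X @ Lam @ Xinv, A))   # A = X Lam X^{-1}
# Powers: A^5 = X Lam^5 X^{-1} (elementwise power works since Lam is diagonal).
print(np.allclose(X @ Lam**5 @ Xinv, np.linalg.matrix_power(A, 5)))
```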