Skip to content

Computing Eigenvalues

Spelling out the characteristic polynomial of an n×nn \times n matrix and finding its roots is not how one computes eigenvalues in practice. For nn above a few dozen the polynomial route is hopelessly ill-conditioned, and beyond n4n \approx 4 there is no closed form at all. Practical eigensolvers iterate, and they all descend from a single geometric idea: multiplying by A\Av many times stretches a vector along the dominant direction.

Why it works: if A\Av has a basis of eigenvectors x1,,xn\xv_1, \dots, \xv_n with eigenvalues λ1>λ2\lvert \lambda_1 \rvert > \lvert \lambda_2 \rvert \ge \cdots, write v0=cixi\vv_0 = \sum c_i \xv_i. Then

Akv0=i=1nciλikxi=λ1k ⁣(c1x1+i2ci(λi/λ1)kxi).\Av^k \vv_0 = \sum_{i=1}^n c_i \lambda_i^k \xv_i = \lambda_1^k\!\left( c_1 \xv_1 + \sum_{i \ge 2} c_i (\lambda_i/\lambda_1)^k \xv_i \right).

All ratios (λi/λ1)k0(\lambda_i/\lambda_1)^k \to 0, so after normalization vk\vv_k aligns with x1\xv_1 provided c10c_1 \ne 0. The error shrinks like the rate λ2/λ1\lvert \lambda_2/\lambda_1 \rvert per step.

Edit the 2×22 \times 2 matrix below and watch the iterate turn toward the dashed line through the dominant eigenvector. The convergence rate displayed is λ2/λ1\lvert \lambda_2 / \lambda_1 \rvert; matrices with close eigenvalues converge slowly, matrices with a well-separated leading eigenvalue converge fast. Setting a12=1,a21=1a_{12} = 1, a_{21} = -1 (a rotation) produces complex eigenvalues and no real eigenvector, and the iterate spirals.

A =
step 0; Rayleigh quotient vTAv = 3.000; dominant eigenvalue λ₁ = 3.000, convergence rate |λ₂/λ₁| = 0.333.

Power iteration finds the largest eigenvalue. To target any other one, replace A\Av with (AμI)1(\Av - \mu \Iv)^{-1}: its eigenvalues are 1/(λiμ)1/(\lambda_i - \mu), and the largest in modulus belongs to the λi\lambda_i closest to μ\mu. Iterating

vk+1=(AμI)1vk(AμI)1vk\vv_{k+1} = \frac{(\Av - \mu\Iv)^{-1} \vv_k}{\lVert (\Av - \mu\Iv)^{-1} \vv_k \rVert}

(solving a linear system each step rather than inverting explicitly) drives vk\vv_k to the eigenvector for the eigenvalue nearest μ\mu, at rate (λnearμ)/(λnextμ)\lvert (\lambda_{\text{near}} - \mu)/(\lambda_{\text{next}} - \mu) \rvert. Taking μ=ρk\mu = \rho_k, the running Rayleigh quotient itself, gives Rayleigh quotient iteration, which is locally cubically convergent for symmetric A\Av: each step roughly cubes the digits of accuracy.

For all the eigenvalues at once, the QR algorithm is the modern workhorse. Repeatedly factor and reassemble:

A0=A,Ak=QkRk  (QR factorization),Ak+1=RkQk.\Av_0 = \Av, \qquad \Av_k = \Qv_k \Rv_k \ \ (\text{QR factorization}), \qquad \Av_{k+1} = \Rv_k \Qv_k.

Each step is a similarity transform: Ak+1=RkQk=QkTAkQk\Av_{k+1} = \Rv_k \Qv_k = \Qv_k^{\rm T} \Av_k \Qv_k, so the spectrum is preserved.

In practice, the basic algorithm is accelerated in two ways: a one-time reduction of A\Av to Hessenberg form (zeros below the first subdiagonal) makes each QR step cheap, and an eigenvalue shift AkμkI\Av_k - \mu_k \Iv at every step turns the rate λj+1/λj\lvert \lambda_{j+1}/\lambda_j \rvert into a much steeper one (cubic, with the Wilkinson shift). The combination is the algorithm of LAPACK and every modern eigensolver.

Singular values are eigenvalues of ATA\Av^{\rm T}\Av, so any eigenvalue algorithm computes the SVD in principle. Forming ATA\Av^{\rm T}\Av explicitly is the same numerically poor idea as solving least squares by the normal equations: the condition number squares. Practical SVD codes instead bidiagonalize A\Av with Householder reflections, A=U1BV1T\Av = \Uv_1 \Bv \Vv_1^{\rm T} with B\Bv upper bidiagonal, and then run an implicit QR sweep on B\Bv that operates as if it were on BTB\Bv^{\rm T}\Bv without ever building it. The Golub–Reinsch algorithm packages this into the standard SVD routine.