Skip to content

Spectral Clustering

Not all learning runs on feature vectors. Often the data is a graph: nodes joined by weighted edges that record similarity, and the task is to split the nodes into clusters that are densely connected inside and sparsely connected across. Spectral clustering solves a relaxation of this combinatorial problem with linear algebra alone. It builds one symmetric matrix from the graph, the Laplacian, and reads the partition off its eigenvector for the second-smallest eigenvalue. The method ties together positive definiteness, the eigenvectors of a symmetric matrix, and the Rayleigh quotient.

Let GG have nn nodes and a symmetric weight matrix W\Wv with wij=wji0w_{ij} = w_{ji} \ge 0 and wii=0w_{ii} = 0, where wijw_{ij} measures the similarity between nodes ii and jj. The degree of node ii is di=jwijd_i = \sum_j w_{ij}.

The Laplacian’s value comes from the quadratic form it defines, which measures how much a function on the nodes varies across edges.

The quadratic form is a sum of penalties wij(xixj)2w_{ij}(x_i - x_j)^2, one per edge: it is small exactly when x\xv takes nearly equal values on strongly connected nodes. This is why the Laplacian detects cluster structure.

Connected components live in the nullspace

Section titled “Connected components live in the nullspace”

The values of x\xv that make the form vanish are precisely the functions constant on each piece of the graph.

A graph in one piece therefore has λ1=0\lambda_1 = 0 simple, and the next eigenvalue λ2>0\lambda_2 > 0 is the algebraic connectivity. When the graph is one piece but nearly splits into two, λ2\lambda_2 is small and its eigenvector is close to a step function that is constant on each near-component: the signal that there is a good cut to make.

To split the nodes into two groups AA and its complement, count the weight crossing the boundary, the cut cut(A)=iA,jAwij\operatorname{cut}(A) = \sum_{i \in A,\, j \notin A} w_{ij}. Minimizing the cut alone tends to peel off a single node, so the cut is balanced by the group sizes. The ratio cut objective cut(A)(1A+1Ac)\operatorname{cut}(A)\big(\tfrac{1}{|A|} + \tfrac{1}{|A^c|}\big) encodes both the boundary weight and the balance, and minimizing it over all partitions is NP-hard. Relaxing the discrete membership to a real vector turns it into an eigenvalue problem.

The relaxed solution is a real vector, so it is turned back into a partition by its sign: nodes with (q2)i>0(\qv_2)_i > 0 go to one cluster, the rest to the other. This recovers the intuition from the near-disconnected case, where q2\qv_2 already looks like the indicator of the split.

The widget builds the Laplacian of a small graph, computes its Fiedler vector, and colors each node by the sign of that vector. Edges crossing the resulting partition are highlighted: they are the cut, and the relaxation chooses the split that keeps their total weight small relative to the group sizes.

01234567
cluster A  cluster B  cut edges. Sign of the Fiedler vector splits the nodes 4 : 4, cutting 1 edge. Algebraic connectivity λ₂ = 0.291; a small λ₂ means a clean split exists.

Spectral clustering is the eigenvector face of cutting a graph. The combinatorial problem is hard, but its continuous relaxation is exactly a Rayleigh-quotient minimization, and the second eigenvector of the Laplacian carries the partition.