
Radon-Nikodym

The general construction of conditional expectation rests on a single theorem from measure theory: the Radon-Nikodym theorem. It says that whenever one measure is “dominated” by another in a precise sense, the dominated measure can be written as an integral against the dominating one. The integrand is a density function, generalizing the elementary notion of a probability density.

The intuition: $\mu$ is “at least as fine” as $\nu$. Anywhere $\mu$ assigns no mass, $\nu$ also assigns no mass. So $\mu$ controls $\nu$ in the sense that $\mu$-null sets are also $\nu$-null sets. The notation $\nu \ll \mu$ reflects this hierarchy.

Examples.

  • If $\nu$ has a density $f \ge 0$ with respect to $\mu$, meaning $\nu(A) = \int_A f \, d\mu$ for every $A$, then $\nu \ll \mu$. Whenever $\mu(A) = 0$, the integral vanishes regardless of $f$.
  • A point mass $\delta_0$ on $\R$ is not absolutely continuous w.r.t. Lebesgue measure $\lambda$: the set $\{0\}$ has $\lambda(\{0\}) = 0$ but $\delta_0(\{0\}) = 1$. The Lebesgue measure does not “see” individual points, but $\delta_0$ does.
  • The Cantor distribution is also not absolutely continuous w.r.t. $\lambda$: it sits on the Cantor set, which has Lebesgue measure zero but Cantor-measure one.
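On a finite sample space the definition can be checked mechanically, which may help make the examples above concrete. The following is a minimal sketch, assuming measures are stored as dictionaries of point masses; the helper name `is_absolutely_continuous` and the sample measures are illustrative and not part of the original text.

```python
from fractions import Fraction


def is_absolutely_continuous(nu, mu):
    """Return True if nu << mu for measures given as point masses on a finite set."""
    # On a finite space every set is a finite union of points, so it suffices
    # to check single points: mu({w}) == 0 must force nu({w}) == 0.
    points = set(mu) | set(nu)
    return all(nu.get(w, 0) == 0 for w in points if mu.get(w, 0) == 0)


# Hypothetical measures on the three-point space {a, b, c}.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
nu_good = {"a": Fraction(1, 4), "b": Fraction(3, 4), "c": Fraction(0)}
nu_bad = {"a": Fraction(1, 2), "b": Fraction(0), "c": Fraction(1, 2)}

ac_good = is_absolutely_continuous(nu_good, mu)  # nu_good vanishes on the mu-null point c
ac_bad = is_absolutely_continuous(nu_bad, mu)    # nu_bad charges c, like a point mass vs. Lebesgue
```

Here `nu_bad` plays the role of the point mass in the second example: it puts mass on a set that `mu` does not see.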

Two measures $\mu$ and $\nu$ are called mutually singular, denoted $\mu \perp \nu$, when there exists $A \in \cF$ with $\mu(A) = 0$ and $\nu(A^c) = 0$. Absolute continuity and singularity are opposite poles: every pair of $\sigma$-finite measures admits a unique decomposition $\nu = \nu_{ac} + \nu_s$ with $\nu_{ac} \ll \mu$ and $\nu_s \perp \mu$ (Lebesgue decomposition).
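In the finite case the Lebesgue decomposition can be written down directly: the absolutely continuous part keeps the mass of $\nu$ at points that $\mu$ charges, and the singular part keeps the rest. A small sketch under that assumption (dictionaries of point masses; the function name `lebesgue_decomposition` is illustrative):

```python
from fractions import Fraction


def lebesgue_decomposition(nu, mu):
    """Split nu into (nu_ac, nu_s) relative to mu on a finite space."""
    points = set(mu) | set(nu)
    zero = Fraction(0)
    # nu_ac lives where mu has mass, so mu-null sets are nu_ac-null.
    nu_ac = {w: (nu.get(w, zero) if mu.get(w, zero) > 0 else zero) for w in points}
    # nu_s lives on the mu-null set N = {w : mu(w) = 0}, so mu(N) = 0 and nu_s(N^c) = 0.
    nu_s = {w: (nu.get(w, zero) if mu.get(w, zero) == 0 else zero) for w in points}
    return nu_ac, nu_s


# Hypothetical measures on {a, b, c}; mu does not charge the point c.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
nu = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}

nu_ac, nu_s = lebesgue_decomposition(nu, mu)
```

The two parts add back up to `nu` pointwise, and `nu_s` is concentrated on the single `mu`-null point.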

In words: a measure is $\sigma$-finite when the whole space splits into countably many pieces $\Omega_1, \Omega_2, \dots$, each of finite measure. Every probability measure is $\sigma$-finite (take $\Omega_1 = \Omega$, all other $\Omega_n = \emptyset$). Lebesgue measure on $\R$ is $\sigma$-finite (take $\Omega_n = [-n, n]$), even though $\lambda(\R) = \infty$. Counting measure on an uncountable set is not $\sigma$-finite, and the conclusion of the Radon-Nikodym theorem can fail in that case.
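A standard counterexample shows concretely what goes wrong for counting measure; it is not taken from this section, and the symbol $c$ for counting measure is introduced here only for illustration. Take $\mu = c$ on $[0,1]$ with the Borel sets and $\nu = \lambda$. The only $c$-null set is $\emptyset$, so $\lambda \ll c$ holds trivially. If a density $f$ existed, evaluating on singletons would give

$$ 0 \;=\; \lambda(\{x\}) \;=\; \int_{\{x\}} f \, dc \;=\; f(x) \qquad \text{for every } x \in [0,1], $$

so $f \equiv 0$, and therefore

$$ \lambda([0,1]) \;=\; \int_{[0,1]} f \, dc \;=\; 0, $$

contradicting $\lambda([0,1]) = 1$. Absolute continuity holds, yet no density exists.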

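The theorem produces a measurable function $f \ge 0$ with $\nu(A) = \int_A f \, d\mu$ for every $A \in \cF$. This $f$ is called the Radon-Nikodym derivative of $\nu$ with respect to $\mu$, and is denoted

$$ f \;=\; \frac{d\nu}{d\mu}. $$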

The notation is chosen to make the identity above mnemonic:

$$ \int_A \frac{d\nu}{d\mu} \, d\mu \;=\; \int_A d\nu \;=\; \nu(A), $$

which reads as if $d\mu$ “cancels”. The cancellation is formal, not literal, but it captures the working calculus of densities.
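On a finite space the “cancellation” is literal division: the derivative at a point is the ratio of the two point masses. The sketch below assumes $\nu \ll \mu$ and dictionary-valued measures; `rn_derivative` and `integrate` are illustrative helper names, not an established API.

```python
from fractions import Fraction
from itertools import chain, combinations


def rn_derivative(nu, mu):
    """Density f = d(nu)/d(mu) on a finite space, assuming nu << mu."""
    # Where mu(w) = 0 the value of f is irrelevant (a mu-null set); pick 0.
    return {w: (nu[w] / mu[w] if mu[w] > 0 else Fraction(0)) for w in mu}


def integrate(f, mu, A):
    """Integral of f over the set A with respect to mu."""
    return sum((f[w] * mu[w] for w in A), Fraction(0))


# Hypothetical measures on {a, b, c}.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
nu = {"a": Fraction(1, 4), "b": Fraction(1, 2), "c": Fraction(1, 4)}

f = rn_derivative(nu, mu)

# Check nu(A) = integral of f over A against mu, for every subset A.
points = list(mu)
subsets = chain.from_iterable(combinations(points, r) for r in range(len(points) + 1))
identity_holds = all(
    integrate(f, mu, A) == sum((nu[w] for w in A), Fraction(0)) for A in subsets
)
```

Exact rational arithmetic via `Fraction` is used so that the identity can be tested with `==` and not up to floating-point tolerance.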

Reading the hypotheses.

  • $\nu \ll \mu$ is necessary. If $\nu$ assigned positive mass to a $\mu$-null set, no density against $\mu$ could reproduce that mass, since $\int_A f \, d\mu = 0$ on every $\mu$-null $A$.
  • $\sigma$-finiteness cannot be dropped either. Without it, the density may fail to exist or fail to be unique up to $\mu$-null sets.

Reading the conclusion. The single density $f$ encodes the entire measure $\nu$: every value $\nu(A)$ is recovered by integrating $f$ over $A$ against $\mu$. So $\nu$ and $f$ carry the same information, with $f$ being the more concrete object. Since any measure with a density is absolutely continuous, this shows that for $\sigma$-finite measures absolute continuity is both necessary and sufficient for the existence of a density.

Why this matters for conditional expectation


Given an integrable random variable $X$ on $(\Omega, \cF, \Pr)$ and a sub-$\sigma$-field $\cG \subseteq \cF$, the construction of $\E(X \mid \cG)$ goes as follows. Assume first that $X \ge 0$. Define a set function on $\cG$ by

$$ \nu(A) \;:=\; \int_A X \, d\Pr, \qquad A \in \cG. $$

Then $\nu$ is a finite measure on $(\Omega, \cG)$, and it is absolutely continuous with respect to the restriction $\Pr |_\cG$ of $\Pr$ to $\cG$: if $\Pr(A) = 0$ then $\int_A X \, d\Pr = 0$, so $\nu(A) = 0$.

Both $\nu$ and $\Pr |_\cG$ are finite measures on $(\Omega, \cG)$, hence $\sigma$-finite. The Radon-Nikodym theorem applied on the measurable space $(\Omega, \cG)$ produces a $\cG$-measurable density $Y$ such that

$$ \int_A Y \, d\Pr \;=\; \nu(A) \;=\; \int_A X \, d\Pr \qquad \text{for every } A \in \cG. $$

This $Y$ is exactly $\E(X \mid \cG)$. Conditions (1) and (2) of the definition are satisfied:

  • $Y$ is $\cG$-measurable by construction (it is a Radon-Nikodym derivative on $(\Omega, \cG)$).
  • The integration identity $\int_A Y \, d\Pr = \int_A X \, d\Pr$ holds for every $A \in \cG$.

For general integrable $X$, split $X = X^+ - X^-$ into positive and negative parts, apply the construction to each, and subtract. The details are worked out in the existence section.
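When $\cG$ is generated by a finite partition with blocks of positive probability, the construction above reduces to arithmetic: on each block $B$ the density is $\nu(B) / \Pr(B)$. The following sketch illustrates this special case only; the function name `conditional_expectation` and the fair-die example are assumptions made for illustration.

```python
from fractions import Fraction


def conditional_expectation(X, P, partition):
    """E(X | G) for G generated by a finite partition with P(block) > 0."""
    Y = {}
    for block in partition:
        p_block = sum(P[w] for w in block)           # P restricted to G, evaluated on the block
        nu_block = sum(X[w] * P[w] for w in block)   # nu(B) = integral of X over B against P
        for w in block:
            # The Radon-Nikodym density is constant on each block, hence G-measurable.
            Y[w] = nu_block / p_block
    return Y


# Hypothetical example: a fair die, X = face value, G = {odd, even}.
P = {w: Fraction(1, 6) for w in range(1, 7)}
X = {w: Fraction(w) for w in range(1, 7)}
partition = [{1, 3, 5}, {2, 4, 6}]

Y = conditional_expectation(X, P, partition)

# Integration identity on the generating blocks of G.
identity_holds = all(
    sum(Y[w] * P[w] for w in block) == sum(X[w] * P[w] for w in block)
    for block in partition
)
```

In this example `Y` equals 3 on the odd faces and 4 on the even faces, the averages of `X` over each block. Because `X` may take negative values here without harm, the finite case does not need the split into positive and negative parts.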

In one line: conditional expectation is a Radon-Nikodym derivative on the smaller $\sigma$-field.