Wald's Extension

Wald’s identity gives the mean of a random sum $Y = \sum_{i=1}^N X_i$. The natural follow-up: what is the full distribution of $Y$? Conditioning on $N$ and working with characteristic functions gives a clean answer: the characteristic function of $Y$ is the composition of the probability generating function of $N$ with the characteristic function of $X_1$.

Probability generating function

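For a random variable $Z$ taking values in $\{0, 1, 2, \ldots\}$, the probability generating function is $g_Z(t) = \E[t^Z] = \sum_{k=0}^{\infty} \Pr(Z = k)\, t^k$, which converges at least for $|t| \le 1$. The generating function uniquely determines the distribution of $Z$ (just as the characteristic function does for general distributions), and its derivatives at $t = 1$ recover the factorial moments: $g_Z'(1) = \E[Z]$, $g_Z''(1) = \E[Z(Z-1)]$, and so on, whenever these moments are finite.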
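As a small illustration of the derivative property, the following sketch evaluates a generating function and its first derivative for a finite distribution, assuming the usual definition $g_Z(t) = \E[t^Z]$. The function names and the fair-die example are hypothetical choices made here for illustration.

```python
def pgf(pmf, t):
    """Evaluate g_Z(t) = sum_k P(Z = k) * t**k for a pmf given as {k: probability}."""
    return sum(prob * t**k for k, prob in pmf.items())


def pgf_derivative(pmf, t):
    """Evaluate g_Z'(t) = sum_k k * P(Z = k) * t**(k - 1), differentiating term by term."""
    return sum(k * prob * t ** (k - 1) for k, prob in pmf.items() if k >= 1)


# Hypothetical example: Z is the outcome of a fair six-sided die.
die = {k: 1 / 6 for k in range(1, 7)}

mean_direct = sum(k * prob for k, prob in die.items())  # E[Z] computed from the pmf
mean_from_pgf = pgf_derivative(die, 1.0)                # g_Z'(1)

print(pgf(die, 1.0), mean_direct, mean_from_pgf)
```

The value $g_Z(1)$ is the total probability, and the two ways of computing the mean agree up to floating-point rounding.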

Theorem


Let $X_1, X_2, \ldots$ be i.i.d. with characteristic function $\varphi_{X_1}$, and let $N$ be a nonnegative integer-valued random variable with probability generating function $g_N$, independent of $\{X_i\}$. Set $Y = \sum_{i=1}^N X_i$, with the convention that $Y = 0$ when $N = 0$. Then the characteristic function of $Y$ is

$$ \varphi_Y(t) \;=\; g_N\!\big( \varphi_{X_1}(t) \big). $$

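In words, $\varphi_Y$ is the composition $g_N \circ \varphi_{X_1}$: random summation of the variables becomes function composition of their transforms. The composition is well defined because $|\varphi_{X_1}(t)| \le 1$ for every real $t$, and the power series defining $g_N$ converges on the closed unit disk.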
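The identity follows from conditioning on $N$. The derivation below is a sketch under the theorem's assumptions, written with the same `\E` expectation macro used elsewhere in this section and with the empty sum (the case $N = 0$) taken to be $0$:

```latex
\begin{aligned}
\varphi_Y(t) \;=\; \E\big[ e^{itY} \big]
  &= \sum_{n=0}^{\infty} \Pr(N = n)\, \E\big[ e^{it(X_1 + \cdots + X_n)} \,\big|\, N = n \big] \\
  &= \sum_{n=0}^{\infty} \Pr(N = n)\, \E\big[ e^{it(X_1 + \cdots + X_n)} \big]
     && \text{($N$ independent of $\{X_i\}$)} \\
  &= \sum_{n=0}^{\infty} \Pr(N = n)\, \varphi_{X_1}(t)^n
     && \text{($X_i$ i.i.d.)} \\
  &= g_N\!\big( \varphi_{X_1}(t) \big).
\end{aligned}
```

The series converges absolutely because $|\varphi_{X_1}(t)| \le 1$ and the probabilities $\Pr(N = n)$ sum to $1$.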

Worked example: Geometric number of exponential summands

Let $X_1, X_2, \ldots$ be i.i.d. $\text{Exponential}(\lambda)$, and let $N \sim \text{Geometric}(p)$ on $\{1, 2, \ldots\}$ with $\Pr(N = k) = (1-p)^{k-1} p$, independent of the $X_i$. The standard formulas are

$$ \varphi_{X_1}(t) \;=\; \frac{\lambda}{\lambda - it}, \qquad g_N(t) \;=\; \frac{p t}{1 - (1 - p) t}. $$

Apply the theorem:

$$ \begin{aligned} \varphi_Y(t) \;=\; g_N\!\big( \varphi_{X_1}(t) \big) &= \frac{p \cdot \frac{\lambda}{\lambda - it}}{1 - (1 - p) \cdot \frac{\lambda}{\lambda - it}} \\ &= \frac{p \lambda}{(\lambda - it) - (1 - p) \lambda} \\ &= \frac{p \lambda}{p \lambda - it}. \end{aligned} $$

This is the characteristic function of $\text{Exponential}(p\lambda)$. Since a characteristic function determines its distribution uniquely,

$$ Y \;=\; \sum_{i=1}^N X_i \;\sim\; \text{Exponential}(p \lambda). $$
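The algebra can be checked numerically by composing the two transforms and comparing the result with the $\text{Exponential}(p\lambda)$ characteristic function at a few points. This is a minimal sketch; the function names and the parameter values $\lambda = 2$, $p = 0.25$ are arbitrary choices made for illustration.

```python
def phi_exponential(t, lam):
    """Characteristic function of Exponential(lam): lam / (lam - i t)."""
    return lam / (lam - 1j * t)


def pgf_geometric(s, p):
    """Generating function of Geometric(p) on {1, 2, ...}: p s / (1 - (1 - p) s)."""
    return p * s / (1 - (1 - p) * s)


def phi_random_sum(t, lam, p):
    """Characteristic function of Y as the composition g_N(phi_X(t))."""
    return pgf_geometric(phi_exponential(t, lam), p)


lam, p = 2.0, 0.25  # illustrative values, not from the text
points = (-3.0, -0.5, 0.0, 1.0, 10.0)

# Largest gap between the composition and the Exponential(p * lam) closed form.
max_gap = max(
    abs(phi_random_sum(t, lam, p) - phi_exponential(t, p * lam)) for t in points
)
print(max_gap)
```

The gap should be at the level of floating-point rounding, which is consistent with the simplification above.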

The sum of a geometric number of i.i.d. exponentials is itself exponential, with the rate scaled by the success probability $p$. The mean is consistent with Wald’s identity:

$$ \E[Y] \;=\; \E[X_1] \cdot \E[N] \;=\; \frac{1}{\lambda} \cdot \frac{1}{p} \;=\; \frac{1}{p \lambda}. $$
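A Monte Carlo simulation gives an independent sanity check of both the mean and the distributional claim. The sketch below uses only the Python standard library; the helper names, the seed, and the parameter values are assumptions chosen for illustration. It compares the sample mean with $1/(p\lambda)$ and the empirical tail $\Pr(Y > x)$ with $e^{-p\lambda x}$ at $x = 1/(p\lambda)$.

```python
import math
import random


def sample_geometric(p, rng):
    """Draw N on {1, 2, ...} with P(N = k) = (1 - p)**(k - 1) * p
    by counting Bernoulli(p) trials until the first success."""
    n = 1
    while rng.random() >= p:  # failure, which has probability 1 - p
        n += 1
    return n


def sample_random_sum(lam, p, rng):
    """Draw Y = X_1 + ... + X_N with X_i ~ Exponential(lam), N ~ Geometric(p)."""
    n = sample_geometric(p, rng)
    return sum(rng.expovariate(lam) for _ in range(n))


def simulate(lam=2.0, p=0.25, trials=50_000, seed=0):
    """Return (sample mean, empirical tail at x, exact Exponential(p * lam) tail at x)."""
    rng = random.Random(seed)
    ys = [sample_random_sum(lam, p, rng) for _ in range(trials)]
    sample_mean = sum(ys) / trials
    x = 1.0 / (p * lam)  # evaluate the tail at the theoretical mean
    empirical_tail = sum(y > x for y in ys) / trials
    exact_tail = math.exp(-p * lam * x)
    return sample_mean, empirical_tail, exact_tail


sample_mean, empirical_tail, exact_tail = simulate()
print(sample_mean, empirical_tail, exact_tail)
```

With these parameters the theoretical mean is $1/(p\lambda) = 2$ and the exact tail is $e^{-1}$; the simulated values should be close to both, up to sampling error.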

Why this is stronger than Wald’s Identity

Wald’s identity gives only the first moment, $\E[Y] = \E[X_1] \, \E[N]$. Wald’s extension gives the entire characteristic function, which by the uniqueness theorem for characteristic functions determines the entire distribution. From it one can extract moments (by differentiating at $t = 0$, whenever they exist), recognize known distributions (as above), and prove distributional convergence results for random sums via Lévy’s continuity theorem.
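To illustrate moment extraction, the sketch below differentiates the characteristic function from the worked example numerically at $t = 0$, using the standard relations $\varphi_Y'(0) = i\,\E[Y]$ and $\varphi_Y''(0) = -\E[Y^2]$. The finite-difference step `h`, the function names, and the parameter values are illustrative assumptions; a symbolic derivative would serve equally well.

```python
def phi_y(t, lam, p):
    """Characteristic function of Y from the worked example: p*lam / (p*lam - i t)."""
    rate = p * lam
    return rate / (rate - 1j * t)


def moments_from_cf(lam, p, h=1e-4):
    """Estimate E[Y] and E[Y^2] by central finite differences of phi_Y at t = 0."""
    first = (phi_y(h, lam, p) - phi_y(-h, lam, p)) / (2 * h)
    second = (phi_y(h, lam, p) - 2 * phi_y(0.0, lam, p) + phi_y(-h, lam, p)) / h**2
    mean = (first / 1j).real         # phi'(0) = i E[Y]
    second_moment = (-second).real   # phi''(0) = -E[Y^2]
    return mean, second_moment


lam, p = 2.0, 0.25  # illustrative values, not from the text
mean, second_moment = moments_from_cf(lam, p)
variance = second_moment - mean**2
print(mean, second_moment, variance)
```

For $\text{Exponential}(p\lambda)$ the exact values are $\E[Y] = 1/(p\lambda)$ and $\E[Y^2] = 2/(p\lambda)^2$, so the estimates can be compared directly with the closed forms.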