The characteristic function is a powerful tool in probability theory, often serving as the "probability version" of the Fourier transform. For a random variable $X$, it is defined as $\varphi_X(t) = \mathbb{E}[e^{itX}]$.
Using Euler's formula, we can express this as:

$$\varphi_X(t) = \mathbb{E}[\cos(tX)] + i\,\mathbb{E}[\sin(tX)]$$
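As a quick sanity check, the two expressions are the same number up to floating-point error. A minimal NumPy sketch (the exponential sample standing in for $X$ and the value of $t$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)  # samples of an arbitrary X
t = 1.3

# CHF estimated directly as the sample mean of e^{itX} ...
phi_direct = np.mean(np.exp(1j * t * x))
# ... and via Euler's formula as E[cos(tX)] + i E[sin(tX)]
phi_euler = np.mean(np.cos(t * x)) + 1j * np.mean(np.sin(t * x))

assert abs(phi_direct - phi_euler) < 1e-9  # identical up to rounding
```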
One of the most useful properties of characteristic functions is handling sums of independent variables.
Let $X \sim \text{Poi}(\lambda)$. The probability mass function is:

$$\mathbb{P}(X=k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, \dots$$
We compute its characteristic function:
$$\begin{aligned}
\varphi_X(t) &= \mathbb{E}[e^{itX}] = \sum_{k=0}^\infty e^{itk}\, e^{-\lambda} \frac{\lambda^k}{k!} \\
&= e^{-\lambda} \sum_{k=0}^\infty \frac{(\lambda e^{it})^k}{k!} \\
&= e^{-\lambda} e^{\lambda e^{it}} = e^{\lambda(e^{it} - 1)}
\end{aligned}$$
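The closed form can be checked empirically: the sample average of $e^{itX}$ over Poisson draws should approximate $e^{\lambda(e^{it}-1)}$. A NumPy sketch (sample size, $\lambda$, and $t$ are ad-hoc choices):

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 2.5
x = rng.poisson(lam, size=200_000)  # Poisson(lambda) samples
t = 0.7

# Monte Carlo estimate of the CHF vs. the derived closed form
phi_empirical = np.mean(np.exp(1j * t * x))
phi_closed = np.exp(lam * (np.exp(1j * t) - 1))

assert abs(phi_empirical - phi_closed) < 0.01  # within Monte Carlo error
```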
Now let $Y \sim \text{Poi}(\eta)$ be independent of $X$.
Using the convolution property:
$$\begin{aligned}
\varphi_{X+Y}(t) &= \varphi_X(t)\, \varphi_Y(t) \\
&= e^{\lambda(e^{it} - 1)} e^{\eta(e^{it} - 1)} \\
&= e^{(\lambda + \eta)(e^{it} - 1)}
\end{aligned}$$
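The convolution property itself can also be observed numerically: for independent samples, the empirical CHF of $X+Y$ and the product of the individual empirical CHFs should both approach $e^{(\lambda+\eta)(e^{it}-1)}$. A NumPy sketch (all parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
lam, eta = 1.5, 3.0
x = rng.poisson(lam, size=200_000)
y = rng.poisson(eta, size=200_000)  # generated independently of x
t = 0.5

phi_sum = np.mean(np.exp(1j * t * (x + y)))                # CHF of X+Y
phi_product = np.mean(np.exp(1j * t * x)) * np.mean(np.exp(1j * t * y))
phi_closed = np.exp((lam + eta) * (np.exp(1j * t) - 1))    # Poi(lambda+eta) CHF

assert abs(phi_sum - phi_closed) < 0.01
assert abs(phi_product - phi_closed) < 0.01
```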
We observe that this is exactly the characteristic function of a random variable distributed as $\text{Poi}(\lambda + \eta)$.
Question
Does the observation above guarantee that $X+Y \sim \text{Poi}(\lambda + \eta)$?
General Question: Does the characteristic function uniquely determine the distribution?
Yes! The characteristic function is essentially a Fourier transform, and the Fourier transform is invertible. This means that if two random variables have the same characteristic function, they must have the same distribution (CDF).
We will formalize this with the Inversion Theorem next.
Let $X \sim \mathcal{N}(0, 1)$. The density is $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$.
$$\begin{aligned}
\varphi(t) &= \int_{-\infty}^\infty e^{itx} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty e^{-x^2/2 + itx} \, dx \\
&= \frac{e^{-t^2/2}}{\sqrt{2\pi}} \int_{-\infty}^\infty e^{-(x - it)^2/2} \, dx
\end{aligned}$$

where the last step completes the square: $-\frac{x^2}{2} + itx = -\frac{(x-it)^2}{2} - \frac{t^2}{2}$.
By substituting $y = x - it$ and noting that the integral of the Gaussian PDF is 1 (even with a complex shift, which can be rigorously justified using contour integration):

$$\int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}} e^{-(x-it)^2/2} \, dx = 1$$
Thus:
$$\varphi(t) = e^{-t^2/2}$$
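Since the standard Normal density decays fast, the defining integral can be approximated on a truncated grid and compared with $e^{-t^2/2}$. A NumPy sketch (the grid range and spacing are ad-hoc choices):

```python
import numpy as np

# Evaluate phi(t) = ∫ e^{itx} (1/√(2π)) e^{-x²/2} dx on a fine grid
x = np.linspace(-10.0, 10.0, 20_001)
dx = x[1] - x[0]
pdf = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

for t in (0.0, 0.5, 2.0):
    # Riemann sum; truncating at ±10 is harmless since the integrand
    # decays like a Gaussian
    phi_numeric = np.sum(np.exp(1j * t * x) * pdf) * dx
    assert abs(phi_numeric - np.exp(-t**2 / 2)) < 1e-6
```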
Let $Y \sim \mathcal{N}(\mu, \sigma^2)$.
We can write $Y = \mu + \sigma X$ where $X \sim \mathcal{N}(0, 1)$.
Using the linear transformation property of CHFs ($\varphi_{aX+b}(t) = e^{itb}\varphi_X(at)$):
$$\begin{aligned}
\varphi_Y(t) &= e^{it\mu} \varphi_X(\sigma t) \\
&= e^{it\mu} e^{-(\sigma t)^2/2} \\
&= e^{it\mu - \sigma^2 t^2/2}
\end{aligned}$$
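The affine-transformation step can be verified with sampled data: the empirical CHF of $\mu + \sigma X$ for standard Normal draws should approximate $e^{it\mu - \sigma^2 t^2/2}$. A NumPy sketch ($\mu$, $\sigma$, $t$, and the sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma = 1.0, 2.0
t = 0.4
x = rng.standard_normal(300_000)  # X ~ N(0, 1)
y = mu + sigma * x                # Y ~ N(mu, sigma^2)

phi_y = np.mean(np.exp(1j * t * y))                 # empirical CHF of Y
phi_closed = np.exp(1j * t * mu - sigma**2 * t**2 / 2)

assert abs(phi_y - phi_closed) < 0.01
```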
Let $Y \sim \mathcal{N}(\mu_1, \sigma_1^2)$ and $Z \sim \mathcal{N}(\mu_2, \sigma_2^2)$ be independent.
$$\begin{aligned}
\varphi_{Y+Z}(t) &= \varphi_Y(t)\, \varphi_Z(t) \\
&= \exp\left(it\mu_1 - \frac{\sigma_1^2 t^2}{2}\right) \exp\left(it\mu_2 - \frac{\sigma_2^2 t^2}{2}\right) \\
&= \exp\left(it(\mu_1 + \mu_2) - \frac{(\sigma_1^2 + \sigma_2^2)t^2}{2}\right)
\end{aligned}$$
This is precisely the characteristic function of a Normal distribution with mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$.
$$\implies Y+Z \sim \mathcal{N}(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$$
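This conclusion is easy to check with simulation: the sum of independent Normal samples should match the predicted mean, variance, and characteristic function. A NumPy sketch (all parameters and tolerances are ad-hoc choices sized to the Monte Carlo error):

```python
import numpy as np

rng = np.random.default_rng(4)
mu1, s1, mu2, s2 = 0.5, 1.0, -1.0, 2.0
y = rng.normal(mu1, s1, size=400_000)
z = rng.normal(mu2, s2, size=400_000)  # independent of y
w = y + z

# Moments match the predicted N(mu1+mu2, s1^2+s2^2)
assert abs(w.mean() - (mu1 + mu2)) < 0.02
assert abs(w.var() - (s1**2 + s2**2)) < 0.05

# Empirical CHF of the sum matches the Normal closed form
t = 0.6
phi_w = np.mean(np.exp(1j * t * w))
phi_closed = np.exp(1j * t * (mu1 + mu2) - (s1**2 + s2**2) * t**2 / 2)
assert abs(phi_w - phi_closed) < 0.01
```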
Importance of CHF
It was obvious that:
$\mathbb{E}[Y+Z] = \mu_1 + \mu_2$ (linearity of expectation)
$\text{Var}(Y+Z) = \sigma_1^2 + \sigma_2^2$ (additivity of variance for independent variables)
BUT, it was not obvious that $Y+Z$ follows a Normal distribution!
We know this only because of the Inversion Formula (uniqueness theorem) of characteristic functions: since the CHF matches that of a Normal, the variable must be Normal.