We are now ready to define expectation. Let $X$ be a random variable on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Then its expectation $\mathbb{E}X$ is simply the Lebesgue integral of $X$ on $(\Omega, \mathcal{F}, \mathbb{P})$.
Recall that $X$ is integrable if and only if $\mathbb{E}[|X|] < \infty$.
We have seen that $\mathbb{E}[X] = \mathbb{E}[X^+] - \mathbb{E}[X^-]$.
Since $\mathbb{E}[|X|] = \mathbb{E}[X^+] + \mathbb{E}[X^-]$, it follows that $X$ is integrable if and only if both $\mathbb{E}[X^+]$ and $\mathbb{E}[X^-]$ are finite.
Remark
We also allow exactly one of $\mathbb{E}[X^+]$ or $\mathbb{E}[X^-]$ to be $\infty$.
In that case, we say $\mathbb{E}[X] = \infty$ or $-\infty$, respectively.
If both are $\infty$, then the expectation is not defined.
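A classical instance of this last case is the standard Cauchy distribution, whose density is $g(y) = \frac{1}{\pi(1+y^2)}$: both tail integrals diverge, so the expectation is undefined. The truncated integral $\int_0^N y\, g(y)\, dy$ has the closed form $\frac{\log(1+N^2)}{2\pi}$, and the sketch below (a numerical illustration, not part of the text above) shows it growing without bound.

```python
import numpy as np

# Standard Cauchy density: g(y) = 1 / (pi * (1 + y^2)).
# The truncated positive-part integral int_0^N y * g(y) dy has the
# closed form log(1 + N^2) / (2 * pi), which diverges as N -> infinity,
# so E[X^+] = infinity (and by symmetry E[X^-] = infinity as well).
def truncated_positive_part(N):
    return np.log(1 + N**2) / (2 * np.pi)

for N in [1e2, 1e4, 1e6]:
    print(N, truncated_positive_part(N))  # keeps growing: no finite limit
```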
Since expectation is an integral, it inherits all properties of the integral:
Linearity: $\mathbb{E}[aX + bY] = a\mathbb{E}[X] + b\mathbb{E}[Y]$
Monotonicity: if $X \le Y$ a.s., then $\mathbb{E}[X] \le \mathbb{E}[Y]$
Triangle inequality: $|\mathbb{E}[X]| \le \mathbb{E}[|X|]$
(These hold whenever the quantities are well-defined).
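On a finite probability space these three properties can be checked by direct computation. Below is a minimal sketch with a hypothetical four-point space and arbitrarily chosen values for $X$ and $Y$ (none of these numbers come from the text).

```python
import numpy as np

# A finite probability space Omega = {w_0, w_1, w_2, w_3}.
# Hypothetical probabilities and random-variable values, chosen so that
# X <= Y pointwise (needed for the monotonicity check).
p = np.array([0.1, 0.2, 0.3, 0.4])   # P(w_i), sums to 1
X = np.array([-1.0, 0.0, 2.0, 5.0])  # X(w_i)
Y = np.array([0.0, 1.0, 3.0, 6.0])   # Y(w_i), with X <= Y everywhere

E = lambda Z: float(np.dot(p, Z))    # E[Z] = sum_i Z(w_i) * P(w_i)

a, b = 2.0, -3.0
assert np.isclose(E(a * X + b * Y), a * E(X) + b * E(Y))  # linearity
assert E(X) <= E(Y)                                       # monotonicity
assert abs(E(X)) <= E(np.abs(X))                          # triangle inequality
```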
Question
In elementary probability courses, expectation is defined as:
$$\mathbb{E}[X] = \begin{cases} \sum_i x_i\, \mathbb{P}(X = x_i) & \text{if } X \text{ is discrete} \\ \int x f(x)\, dx & \text{if } X \text{ has density } f \end{cases}$$
This is clearly not what we defined: the integral is not even over the same space ($\mathbb{R}$ rather than $\Omega$).
Why are they the same?
The answer lies in the Change of Variable Formula, which relates the abstract integral on $\Omega$ to an integral on $\mathbb{R}$ using the distribution of $X$.
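Concretely, writing $\mu = \mathbb{P} \circ X^{-1}$ for the distribution of $X$, the formula can be stated as follows (for any Borel function $f$ for which either side is well-defined):

```latex
\mathbb{E}[f(X)]
  = \int_\Omega f(X(\omega))\, \mathbb{P}(d\omega)
  = \int_{\mathbb{R}} f(y)\, \mu(dy)
```

The left-hand integral lives on the abstract space $\Omega$, while the right-hand one is an ordinary integral on $\mathbb{R}$ against the distribution $\mu$.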
Application
In particular, if $X$ is a random variable with density $g$ (meaning $\mu(dy) = g(y)\, dy$), then:
$$\mathbb{E}[f(X)] = \int f(y) g(y)\, dy$$
If $f(x) = x$, we recover $\mathbb{E}[X] = \int y g(y)\, dy$.
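The two sides of this identity can be compared numerically. The sketch below (an illustration with assumed choices, not from the text) takes $X \sim \mathrm{Exp}(1)$, so $g(y) = e^{-y}$ for $y \ge 0$, and $f(y) = y^2$; both approximations should land near $\mathbb{E}[X^2] = 2$.

```python
import numpy as np

# Check E[f(X)] = int f(y) g(y) dy for an assumed example:
# X ~ Exp(1) with density g(y) = exp(-y), y >= 0, and f(y) = y^2.
rng = np.random.default_rng(0)

f = lambda y: y**2
g = lambda y: np.exp(-y)

# Left side: expectation on Omega, approximated by a Monte Carlo average
# of f over samples of X.
samples = rng.exponential(scale=1.0, size=1_000_000)
lhs = f(samples).mean()

# Right side: ordinary integral on R, approximated by a Riemann sum
# over [0, 50] (the tail beyond 50 is negligible for this density).
y = np.linspace(0.0, 50.0, 200_000)
rhs = np.sum(f(y) * g(y)) * (y[1] - y[0])

print(lhs, rhs)  # both close to E[X^2] = 2
```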