Random Variables

Intuitively, a random variable represents a numerical value determined by the outcome of a random experiment. For example, if we roll a die, the outcome is “the face that shows up”, and a random variable $X$ could be “the number of dots on the face”.

In our formal framework, we define a random variable as a function mapping the sample space to real numbers. But not just any function — it must preserve the structure of our probability space.
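On a finite sample space this is easy to make concrete. Here is a minimal Python sketch (the outcome labels `face1`…`face6` are a made-up encoding, not anything from the text) treating the die-roll variable $X$ as an ordinary function on $\Omega$:

```python
# Sample space: the six faces of a die, as abstract outcome labels.
omega = ["face1", "face2", "face3", "face4", "face5", "face6"]

def X(outcome):
    """The random variable: number of dots shown on the face."""
    return int(outcome[-1])

# X assigns a real number to every outcome in Omega.
values = [X(w) for w in omega]
print(values)  # [1, 2, 3, 4, 5, 6]
```

The point is only that $X$ is deterministic: all the randomness lives in which outcome $\omega$ occurs, not in the function itself.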

[Figure: visualization of a measurable mapping]

A function $X : \Omega \to S$ is called measurable (with respect to $\cF$ and $\cA$) if $X^{-1}(A) \in \cF$ for every $A \in \cA$. If $(S, \cA) = (\R, \cB)$, then $X$ is called a random variable.

When we discuss events defined by the value of $X$, we often use the shorthand notation $\{X \in B\}$ for $\{\omega : X(\omega) \in B\}$.

We want to be able to answer probabilistic questions about $X$, such as “What is the probability that $X$ is greater than 5?”. In set notation, we are asking for $\Pr(\{\omega : X(\omega) > 5\})$.

For this probability to be defined, the subset $\{\omega : X(\omega) > 5\}$ must be an event (i.e., it must belong to $\cF$), because the probability measure $\Pr$ is only defined on $\cF$. The condition $X^{-1}(B) \in \cF$ ensures precisely this: for any “reasonable” question we ask about the value of $X$ (represented by a Borel set $B$), the set of outcomes satisfying it is an event we can measure.
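To see a preimage in action, here is a sketch on a finite space where every subset is an event: two fair dice, $X$ their sum, and the question “is $X > 5$?” answered by measuring the preimage $X^{-1}((5, \infty))$:

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of two die rolls, with the uniform measure.
omega = list(product(range(1, 7), repeat=2))

def X(w):
    """Random variable: the sum of the two dice."""
    return w[0] + w[1]

# The event {X > 5} is the preimage of (5, infinity) under X.
event = {w for w in omega if X(w) > 5}

# Pr({omega : X(omega) > 5}) under the uniform measure on Omega.
p = Fraction(len(event), len(omega))
print(p)  # 13/18
```

On a finite $\Omega$ with $\cF = 2^\Omega$ measurability is automatic; the condition only has teeth on richer spaces, where $\cF$ may be strictly smaller than the power set.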

Every random variable $X$ naturally comes with a $\sigma$-field, denoted $\sigma(X)$, that describes the information contained in $X$.

Intuition: $\sigma(X)$ is the smallest $\sigma$-field on $\Omega$ that makes $X$ a random variable. It contains exactly the events whose occurrence (or non-occurrence) can be determined just by knowing the value of $X$. If $\sigma(X)$ is smaller than $\cF$, it means $X$ “forgets” some information about the outcome $\omega$.
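This “forgetting” can be computed explicitly when $\Omega$ is finite and $X$ takes finitely many values: the atoms of $\sigma(X)$ are the preimages $X^{-1}(\{v\})$, and $\sigma(X)$ consists of all unions of atoms. A sketch, assuming $X$ is the parity of a single die roll:

```python
from itertools import combinations

# Finite Omega: the outcomes of one die roll.
omega = frozenset(range(1, 7))

def X(w):
    """Random variable: parity of the roll (0 = even, 1 = odd)."""
    return w % 2

# Atoms of sigma(X): the preimage of each value X takes.
atoms = [frozenset(w for w in omega if X(w) == v) for v in {X(w) for w in omega}]

# On a finite space, sigma(X) is the set of all unions of atoms.
sigma_X = set()
for r in range(len(atoms) + 1):
    for combo in combinations(atoms, r):
        sigma_X.add(frozenset().union(*combo))

print(len(sigma_X))  # 4 events, versus 2**6 = 64 in the full power set
```

Knowing only the parity, we can decide whether “the roll is even” occurred, but not whether “the roll is a 6” occurred: $\{6\} \in \cF$ but $\{6\} \notin \sigma(X)$.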

A random variable $X$ transports the probability measure from $(\Omega, \cF)$ to $(\R, \cB)$. This “transported” measure is what we usually call the distribution of $X$.

Everything we compute about the distribution of $X$ (like its PDF, CDF, mean, variance) is essentially a property of this pushforward measure $\mu_X$, defined by $\mu_X(B) = \Pr(X^{-1}(B))$ for $B \in \cB$.
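A sketch of the pushforward for the two-dice sum: once $\mu_X$ is tabulated (here as a PMF on the values of $X$), quantities like the mean can be computed from it alone, with no further reference back to $\Omega$:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of two die rolls, uniform measure.
omega = list(product(range(1, 7), repeat=2))

def X(w):
    """Random variable: the sum of the two dice."""
    return w[0] + w[1]

# Pushforward measure mu_X on singletons, i.e. the PMF of X:
# mu_X({v}) = Pr(X^{-1}({v})).
counts = Counter(X(w) for w in omega)
mu_X = {v: Fraction(c, len(omega)) for v, c in counts.items()}

# The mean of X, computed purely from mu_X.
mean = sum(v * p for v, p in mu_X.items())
print(mean)  # 7
```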

We say random variables $X$ and $Y$ are independent if the information they generate is independent, i.e., if the $\sigma$-fields $\sigma(X)$ and $\sigma(Y)$ are independent.

This is equivalent to the condition that for any Borel sets $C, D \in \cB$:

$$\Pr(X \in C,\, Y \in D) = \Pr(X \in C)\Pr(Y \in D)$$

or in terms of CDFs: $F_{X,Y}(x,y) = F_X(x)F_Y(y)$.
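The product rule can be checked directly on a finite space. A sketch with $X$ the first die and $Y$ the second (the particular sets $C$ and $D$ below are arbitrary choices for illustration):

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of two die rolls, uniform measure.
omega = list(product(range(1, 7), repeat=2))
n = len(omega)

def X(w):
    return w[0]  # first die

def Y(w):
    return w[1]  # second die

def pr(pred):
    """Probability of the event {w : pred(w)} under the uniform measure."""
    return Fraction(sum(1 for w in omega if pred(w)), n)

# Check Pr(X in C, Y in D) == Pr(X in C) * Pr(Y in D) for sample sets C, D.
C, D = {1, 2}, {4, 5, 6}
lhs = pr(lambda w: X(w) in C and Y(w) in D)
rhs = pr(lambda w: X(w) in C) * pr(lambda w: Y(w) in D)
print(lhs, rhs, lhs == rhs)  # 1/6 1/6 True
```

A full verification would range over all subsets $C, D$; this sketch checks one pair, which is enough to illustrate how the joint event factors.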