Techniques for calculating expected values

December 20, 2025

This is a list of techniques I’ve found useful for computing expected values. In the interest of clarity and intuition over strict theoretical correctness, I’ve opted to take some liberties with the notation and not to document every assumption.

  • Weighted average of possible values

    • $\mathbb{E}[X] = \displaystyle\sum_i x_i \mathbb{P}(X = x_i)$
    • This is the standard definition of expected value. It’s usually only useful if you can easily calculate the probability distribution $\mathbb{P}(X = x_i)$.
    • Example: If $R$ is the value of a single roll of a die, then
    $\mathbb{E}[R] = \left(1\cdot\frac16\right) + \left(2\cdot\frac16\right) + \cdots + \left(6\cdot\frac16\right) = \frac{7}{2}$
    A short computation below double-checks this.
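As a sanity check, here’s a minimal Python sketch (variable names are my own) that evaluates the weighted average directly, using exact fractions to avoid rounding:

```python
from fractions import Fraction

# E[R] as a weighted average: each face 1..6 occurs with probability 1/6.
expected_roll = sum(x * Fraction(1, 6) for x in range(1, 7))
print(expected_roll)  # 7/2
```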
  • Linearity of expectation

    • $\mathbb{E}[X + Y] = \mathbb{E}[X] + \mathbb{E}[Y]$
    • Example: If $S$ is the sum of two independent rolls of a die, then
    $\mathbb{E}[S] = \mathbb{E}[R_1 + R_2] = \mathbb{E}[R_1] + \mathbb{E}[R_2] = 7$
    • Example: Suppose $n$ people enter a restaurant and leave their hats at reception. At the end of dinner, each person gets a hat back at random. In expectation, how many people get their own hat back?
      • Let $H$ be the number of people who correctly get their own hat back and $H_i$ be an indicator variable for whether the $i$th person got their own hat back.
      • $\mathbb{E}[H] = \mathbb{E}[H_1] + \mathbb{E}[H_2] + \cdots + \mathbb{E}[H_n] = n\mathbb{E}[H_1] = n(1/n) = 1$
      • Note that $H_i$ and $H_j$ aren’t independent, but linearity of expectation works anyway! $H$ is a classic example of a random variable whose expected value is much easier to compute than its distribution. (The simulation sketched below bears this out.)
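Here’s a Monte Carlo sketch (function and variable names are mine) that estimates $\mathbb{E}[H]$ by shuffling hats; the estimate should hover near 1 for any $n$:

```python
import random

def simulate_hats(n: int, trials: int = 100_000) -> float:
    """Estimate the expected number of people who get their own hat back."""
    total = 0
    for _ in range(trials):
        hats = list(range(n))
        random.shuffle(hats)  # hand the hats back in a random order
        total += sum(1 for person, hat in enumerate(hats) if person == hat)
    return total / trials

print(simulate_hats(10))  # ≈ 1.0
```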
  • Law of total expectation

    • $\mathbb{E}[X] = \mathbb{E}[\mathbb{E}[X|Y]]$
    • Sometimes the random variable is simpler if you condition on another random variable.
    • Example: Suppose you roll a fair die and record the value. Then you continue rolling the die until you obtain a value at least as large as the first roll. In expectation, how many additional rolls are there after the first?
      • Let $N$ be the number of additional rolls after the first and $X_1$ be the value of the first roll.
      • Given $X_1 = i$, each subsequent roll succeeds with probability $(7-i)/6$, so $N | X_1$ is geometric with mean $6/(7-i)$.
      • Then $\mathbb{E}[N] = \mathbb{E}[\mathbb{E}[N|X_1]] = \mathbb{E}[6/(7 - X_1)] = \sum_{i=1}^{6} (1/6)(6/(7-i)) = 49/20$.
      • In this case, it’s hard to understand $N$ without first conditioning on $X_1$, but we can calculate $\mathbb{E}[N]$ by “averaging” over all the possible values of $X_1$. (The simulation below lands near $49/20 = 2.45$.)
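A quick simulation sketch (names are mine) to check the answer:

```python
import random

def rolls_after_first(trials: int = 200_000) -> float:
    """Estimate the expected number of additional rolls after the first."""
    total = 0
    for _ in range(trials):
        first = random.randint(1, 6)
        count = 0
        # Roll until we see a value at least as large as the first roll.
        while True:
            count += 1
            if random.randint(1, 6) >= first:
                break
        total += count
    return total / trials

print(rolls_after_first())  # ≈ 2.45
```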
  • Symmetry

    • If $(X_1, \ldots, X_n)$ are exchangeable and $\sum_{i=1}^{n} X_i = T$, then $\mathbb{E}[X_i] = \mathbb{E}[T] / n$.
    • Example: If you randomly break a stick of length $L$ into $n$ pieces, then the expected length of the leftmost piece is $L/n$.
    • Example: Suppose each day either 100 or 200 birds appear (each with 50% probability) and then divide themselves randomly among 15 statues. Let $B$ be the total number of birds on a given day and $B_i$ be the number of birds on the $i$th statue. Then $\mathbb{E}[B_i] = \mathbb{E}[B] / 15 = 150 / 15 = 10$. (The sketch below checks this for one statue.)
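A simulation sketch of the bird example, assuming each bird independently picks one of the 15 statues uniformly at random (names are mine):

```python
import random

def average_birds_on_statue(statue: int = 0, trials: int = 20_000) -> float:
    """Estimate E[B_i] for a single statue."""
    total = 0
    for _ in range(trials):
        n_birds = random.choice([100, 200])  # 100 or 200 birds, 50/50
        total += sum(1 for _ in range(n_birds) if random.randrange(15) == statue)
    return total / trials

print(average_birds_on_statue())  # ≈ 10.0
```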
  • Recursion

    • Example: What is the expected number of coin flips to see HTH?
      • There are 4 states: 0 (Start), 1 (Have seen H), 2 (Have seen HT), and 3 (Done).
      • Let $X_i$ represent the number of flips to transition from state $i$ to state 3. We want to compute $\mathbb{E}[X_0]$.
      • Obviously $X_3 = 0$, so $\mathbb{E}[X_3] = 0$.
      • From state 2, we flip once and transition to either state 3 (heads) or state 0 (tails), so $\mathbb{E}[X_2] = 1 + \mathbb{E}[X_3]/2 + \mathbb{E}[X_0]/2 = 1 + \mathbb{E}[X_0]/2$
      • From state 1, we flip once and transition to either state 1 (heads) or state 2 (tails), so $\mathbb{E}[X_1] = 1 + \mathbb{E}[X_1]/2 + \mathbb{E}[X_2]/2$
      • From state 0, we flip once and transition to either state 0 (tails) or state 1 (heads), so $\mathbb{E}[X_0] = 1 + \mathbb{E}[X_0]/2 + \mathbb{E}[X_1]/2$
      • We have three equations in three unknowns. Solving yields $\mathbb{E}[X_0] = 10$, $\mathbb{E}[X_1] = 8$, and $\mathbb{E}[X_2] = 6$. (The simulation below agrees.)
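A simulation sketch (names mine) that walks the same state machine and averages the flip counts:

```python
import random

def flips_until_hth(trials: int = 100_000) -> float:
    """Estimate the expected number of fair-coin flips to first see HTH."""
    total = 0
    for _ in range(trials):
        state, flips = 0, 0  # states: 0 = start, 1 = H, 2 = HT, 3 = done
        while state != 3:
            flips += 1
            heads = random.random() < 0.5
            if state == 0:
                state = 1 if heads else 0
            elif state == 1:
                state = 1 if heads else 2
            else:  # state == 2
                state = 3 if heads else 0
        total += flips
    return total / trials

print(flips_until_hth())  # ≈ 10.0
```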
  • Product of independent random variables

    • If $X$ and $Y$ are independent random variables, then $\mathbb{E}[XY] = \mathbb{E}[X]\mathbb{E}[Y]$
    • Example: A factory makes widgets. Each widget goes through 2 inspections. Each inspection has an independent 90% chance of passing. Let $I_1, I_2$ be the indicator variables for passing each respective inspection. Then $\mathbb{E}[I_1 I_2] = \mathbb{E}[I_1]\mathbb{E}[I_2] = 81\%$, which is exactly the probability of passing both inspections. (A quick check follows below.)
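A tiny check of the widget example (structure is mine):

```python
import random

trials = 100_000
total = 0
for _ in range(trials):
    i1 = random.random() < 0.9  # indicator: passed inspection 1
    i2 = random.random() < 0.9  # indicator: passed inspection 2 (independent)
    total += i1 * i2            # product of indicators = passed both
print(total / trials)  # ≈ 0.81
```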
  • Wald’s equation

    • If $X_1, X_2, \ldots$ are real and iid, and $N$ is a stopping time with finite mean (it can depend on the values seen so far, but can’t peek ahead), then $\mathbb{E}[X_1 + \cdots + X_N] = \mathbb{E}[N] \cdot \mathbb{E}[X_1]$.
    • Example: A pirate opens identical treasure chests one-by-one. Each chest contains between 1 and 6 coins, uniformly at random. The pirate stops when he opens a “cursed” chest containing only 1 coin. Let $N$ be the number of chests opened, $C$ be the total number of coins collected, and $C_i$ be the number of coins in the $i$th chest. Since $N$ is geometric with success probability $1/6$, $\mathbb{E}[N] = 6$ and $\mathbb{E}[C] = \mathbb{E}[\sum_{i=1}^{N} C_i] = \mathbb{E}[N] \cdot \mathbb{E}[C_1] = 6 \cdot 3.5 = 21$. (Simulated below.)
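A simulation sketch of the pirate example (names mine), counting the cursed chest’s coin in the haul:

```python
import random

def expected_haul(trials: int = 100_000) -> float:
    """Estimate the pirate's expected total coins, cursed chest included."""
    total = 0
    for _ in range(trials):
        while True:
            coins = random.randint(1, 6)  # each chest: uniform on 1..6
            total += coins
            if coins == 1:  # cursed chest: stop opening
                break
    return total / trials

print(expected_haul())  # ≈ 21.0
```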
  • Tail-sum

    • If $X$ is non-negative and continuous, then $\mathbb{E}[X] = \displaystyle\int_{0}^{\infty} \mathbb{P}(X > t) \,\mathrm{d}t$
    • If $X$ is non-negative and integer-valued, then $\mathbb{E}[X] = \displaystyle\sum_{t=0}^{\infty} \mathbb{P}(X > t)$
    • Example: Let $N$ be the number of coin flips needed to get the first head. The event $N > t$ occurs if and only if the first $t$ flips are all tails, so $\mathbb{P}(N > t) = 1/2^t$ and therefore $\mathbb{E}[N] = \sum_{t=0}^{\infty} \mathbb{P}(N > t) = 1 + 1/2 + 1/4 + \cdots = 2$. (Checked numerically below.)
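A numerical check of the tail-sum (the truncation level is my choice; the discarded tail is below $2^{-50}$):

```python
from fractions import Fraction

# Tail-sum for the first-head count N: P(N > t) = (1/2)^t.
approx = sum(Fraction(1, 2) ** t for t in range(51))
print(float(approx))  # ≈ 2.0
```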