Expectation, Variance, Covariance
1. Expectation
- Discrete: E[X] = sum_x x p(x)
- Continuous: E[X] = integral x f(x) dx
Linearity: E[aX+bY+c]=aE[X]+bE[Y]+c.
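A quick numerical sketch of linearity (the distributions and constants are arbitrary choices for illustration; note that X and Y are deliberately dependent, since linearity needs no independence):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two dependent samples: Y is built from X on purpose.
x = rng.normal(2.0, 1.0, size=n)
y = 0.5 * x + rng.normal(0.0, 1.0, size=n)

a, b, c = 3.0, -2.0, 5.0
lhs = np.mean(a * x + b * y + c)           # E[aX + bY + c], estimated
rhs = a * np.mean(x) + b * np.mean(y) + c  # aE[X] + bE[Y] + c

# The two sides match to floating-point precision, because the sample
# mean is itself a linear operator.
print(lhs, rhs)
```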
2. Variance
Var(X)=E[(X-E[X])^2]=E[X^2]-E[X]^2.
Scaling: Var(aX+b)=a^2 Var(X).
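The scaling rule can be checked the same way (distribution and constants again illustrative; the shift b drops out and the scale a enters squared):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=200_000)

a, b = 3.0, 7.0
lhs = np.var(a * x + b)  # Var(aX + b), estimated
rhs = a**2 * np.var(x)   # a^2 Var(X); the additive b has no effect

print(lhs, rhs)
```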
3. Covariance and Correlation
Cov(X,Y)=E[(X-E[X])(Y-E[Y])].
Corr(X,Y)=Cov(X,Y)/(sigma_X sigma_Y).
Uncorrelated does not imply independent: for X standard normal and Y = X^2, Cov(X,Y) = E[X^3] = 0, yet Y is a deterministic function of X.
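The standard-normal counterexample is easy to check numerically:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=500_000)
y = x**2  # fully determined by X, yet uncorrelated with it

# Cov(X,Y) = E[(X - E[X])(Y - E[Y])], estimated from the sample
cov = np.mean((x - x.mean()) * (y - y.mean()))
corr = cov / (x.std() * y.std())

# Both are near 0 despite the total dependence of Y on X.
print(cov, corr)
```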
4. Conditional Expectation
E[X|Y] is itself a random variable, measurable w.r.t. Y (i.e., a function of Y).
Law of Total Expectation
E[X]=E[E[X|Y]].
Law of Total Variance
Var(X)=E[Var(X|Y)] + Var(E[X|Y]).
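Both laws can be verified exactly for a simple two-mode mixture (the probabilities, means, and variances below are illustrative numbers, not from the text):

```python
# Y = A w.p. 0.7, Y = B w.p. 0.3; conditional means and variances of X:
p = {"A": 0.7, "B": 0.3}
mean = {"A": 10.0, "B": 20.0}
var = {"A": 1.0, "B": 4.0}

# Law of total expectation: E[X] = E[E[X|Y]]
e_x = sum(p[y] * mean[y] for y in p)

# Law of total variance: Var(X) = E[Var(X|Y)] + Var(E[X|Y])
e_cond_var = sum(p[y] * var[y] for y in p)            # within-mode spread
var_cond_mean = sum(p[y] * (mean[y] - e_x) ** 2 for y in p)  # between-mode spread
var_x = e_cond_var + var_cond_mean

print(e_x, var_x)  # E[X] = 13.0, Var(X) = 1.9 + 21.0 = 22.9
```

The decomposition splits Var(X) into within-mode noise and between-mode disagreement, which is exactly how it is used in hierarchical models.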
These identities are fundamental in probabilistic ML and hierarchical modeling.
5. Worked Example
Suppose a machine chooses mode Y in {A, B} with probabilities 0.7 and 0.3. If E[X|Y=A] = 10 and E[X|Y=B] = 20, then by the law of total expectation E[X] = 0.7*10 + 0.3*20 = 13.
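The same two-stage process can be simulated; the conditional spreads below are illustrative, since the example only fixes the conditional means:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

# Stage 1: choose the mode; Stage 2: draw X given the mode.
modes = rng.choice(["A", "B"], size=n, p=[0.7, 0.3])
x = np.where(modes == "A",
             rng.normal(10.0, 1.0, size=n),
             rng.normal(20.0, 2.0, size=n))

print(x.mean())  # close to 0.7*10 + 0.3*20 = 13
```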
6. Theorem (Variance of Sum)
Var(X+Y)=Var(X)+Var(Y)+2Cov(X,Y).
If X and Y are independent, Cov(X,Y) = 0, so the variance of a sum of independent variables is just the sum of their variances.
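A sketch with deliberately correlated samples, so the covariance term is nonzero and visibly needed (the construction of Y is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300_000
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(size=n)  # correlated with X by construction

lhs = np.var(x + y)
# bias=True makes np.cov use the same ddof=0 normalization as np.var,
# so the identity holds exactly for the sample moments.
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y, bias=True)[0, 1]

print(lhs, rhs)
```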
Exercises
- Compute covariance and correlation for two short sample vectors.
- Prove linearity of expectation without independence assumptions.
- Derive the variance of the average of n i.i.d. variables.
- Use total expectation in a two-stage random process.
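For the i.i.d.-average exercise, a Monte Carlo check (not a proof) of the target identity Var(X̄) = sigma^2/n; the variance, n, and number of repetitions are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 50, 100_000
sigma2 = 4.0

# Each row is one sample of n i.i.d. draws with variance sigma^2.
samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
var_of_avg = np.var(samples.mean(axis=1))

print(var_of_avg, sigma2 / n)  # both near 4/50 = 0.08
```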