Mutual Information and Channel Capacity

1. Mutual Information

The mutual information between random variables X and Y measures how much observing one reduces uncertainty about the other:

I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X).

Equivalent form: I(X;Y)=sum_{x,y} p(x,y) log( p(x,y)/(p(x)p(y)) ).
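The sum form above can be checked numerically. A minimal sketch in Python with NumPy (the helper name `mutual_information` and the example joint distribution are illustrative):

```python
import numpy as np

def mutual_information(p_xy):
    """Mutual information (bits) of a joint distribution given as a 2-D array."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of X
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of Y
    mask = p_xy > 0                         # 0 log 0 = 0 convention
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x * p_y)[mask])))

# Correlated 2x2 joint: X = Y with probability 0.9.
joint = np.array([[0.45, 0.05],
                  [0.05, 0.45]])
print(mutual_information(joint))   # = 1 - H_2(0.1), about 0.531 bits
```

The mask handles zero-probability cells, following the convention that terms with p(x,y) = 0 contribute nothing to the sum.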

2. Properties

  • Nonnegative: I(X;Y) >= 0
  • Symmetric: I(X;Y) = I(Y;X)
  • Zero if and only if X and Y are independent
  • Upper bounded: I(X;Y) <= min(H(X), H(Y))
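The first three properties can be spot-checked numerically. A sketch in Python (the helper `mi` just recomputes the sum form of the definition):

```python
import numpy as np

def mi(p_xy):
    """Mutual information (bits) of a joint distribution given as a 2-D array."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x * p_y)[mask])))

rng = np.random.default_rng(0)
for _ in range(100):
    p = rng.random((3, 4))
    p /= p.sum()                          # random joint distribution
    assert mi(p) >= -1e-12                # nonnegativity
    assert abs(mi(p) - mi(p.T)) < 1e-12   # symmetry

# Independent joint = product of marginals, so MI vanishes.
px, py = np.array([0.3, 0.7]), np.array([0.2, 0.8])
print(mi(np.outer(px, py)))               # 0.0 up to floating-point rounding
```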

3. Data Processing Inequality

If X -> Y -> Z forms a Markov chain (Z depends on X only through Y), then:

I(X;Z) <= I(X;Y).

Interpretation: no processing of Y, whether deterministic or randomized, can create new information about the original source X.
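A small numerical illustration of the inequality (a sketch; the joint distribution and the processing map f are arbitrary choices). Here Z = f(Y) with f(y) = y mod 2, which is a function of Y alone, so X -> Y -> Z is a Markov chain:

```python
import numpy as np

def mi(p_xy):
    """Mutual information (bits) of a joint distribution given as a 2-D array."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x * p_y)[mask])))

rng = np.random.default_rng(1)
p_xy = rng.random((3, 4))
p_xy /= p_xy.sum()                        # random joint distribution of X, Y

# Deterministic post-processing Z = f(Y) = y mod 2: merge Y's symbols.
# p(x, z) sums the joint over the Y symbols mapped to each z.
p_xz = p_xy[:, [0, 1]] + p_xy[:, [2, 3]]

print(mi(p_xy), mi(p_xz))                 # the second value never exceeds the first
```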

4. Channel Capacity

The capacity of a channel with transition probabilities p(y|x):

C = max_{p(x)} I(X;Y).

By Shannon's channel coding theorem, C is the supremum of rates (in bits per channel use) at which information can be transmitted with arbitrarily small error probability.

5. Binary Symmetric Channel Sketch

Each transmitted bit is flipped with crossover probability p. Capacity: C = 1 - H_2(p), where H_2(p) = -p log2 p - (1-p) log2(1-p) is the binary entropy function; the maximum is achieved by a uniform input distribution.
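The closed form can be sanity-checked against brute-force maximization of I(X;Y) over Bernoulli(q) inputs (a sketch; the grid resolution is an arbitrary choice):

```python
import numpy as np

def h2(p):
    """Binary entropy in bits; h2(0) = h2(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def bsc_mi(q, p):
    """I(X;Y) for a BSC with crossover p and input X ~ Bernoulli(q)."""
    r = q * (1 - p) + (1 - q) * p    # P(Y = 1)
    return h2(r) - h2(p)             # H(Y) - H(Y|X)

p = 0.1
capacity = 1 - h2(p)                 # closed form: 1 - H_2(p)
numeric = max(bsc_mi(q, p) for q in np.linspace(0, 1, 1001))
print(capacity, numeric)             # both about 0.531; maximized at q = 0.5
```

The grid search recovers the closed form because H(Y) is maximized (equal to 1 bit) when the input, and hence the output, is uniform.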

6. CS/ML Applications

  • Feature selection: score features by their MI with the target label
  • Representation learning: objectives that maximize MI between an input and its learned representation
  • Communication protocol efficiency: capacity bounds the achievable rate
  • Privacy leakage analysis: MI between released data and secrets quantifies leakage
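As a sketch of the first application, MI-based feature ranking on a toy discrete dataset (all names and the data-generating choices are illustrative, not a standard API):

```python
import numpy as np

def mi_from_counts(x, y):
    """Empirical mutual information (bits) between two discrete sequences."""
    x, y = np.asarray(x), np.asarray(y)
    joint = np.zeros((x.max() + 1, y.max() + 1))
    for xi, yi in zip(x, y):
        joint[xi, yi] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (px * py)[mask])))

rng = np.random.default_rng(0)
label = rng.integers(0, 2, size=1000)
informative = (label ^ (rng.random(1000) < 0.1)).astype(int)  # noisy copy of label
noise = rng.integers(0, 2, size=1000)                         # independent of label

scores = {"informative": mi_from_counts(informative, label),
          "noise": mi_from_counts(noise, label)}
print(scores)   # the informative feature scores far higher than the noise feature
```

Note that empirical MI estimated from counts is biased upward for finite samples, so the independent feature scores slightly above zero rather than exactly zero.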

Exercises

  1. Compute the MI of a simple 2x2 joint distribution.
  2. Show that MI is zero when X and Y are independent.
  3. Explain the data processing inequality in terms of an ML preprocessing pipeline.
  4. Plot the BSC capacity C = 1 - H_2(p) against the error probability p over [0, 1].