How to compute information entropy in a two-step decision?

I have a question which I think involves "conditional entropy" in the field of information theory. I am trying to wrap my head around it, but could use some help. Consider an example in which we have four houses. Eight people live in the first house, four people live in the second house, two people live in the third house, and two people live in the fourth house. So, four houses and sixteen people. If I simply choose one of these people at random, then that choice is a selection from among sixteen people, yielding an information entropy of 4 bits for that choice.
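To make that concrete, here is a quick Python check (just an illustrative sketch using the definition H = -sum(p * log2(p)) over sixteen equally likely people):

from math import log

def log2(x):
    return log(x, 2)

# a uniform choice among sixteen people: each person has probability 1/16
uniform = [1.0 / 16] * 16
print(-sum(p * log2(p) for p in uniform))  # 4.0 bits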

But now consider a two-step selection in which I first choose one house at random, and then choose one of the people in the selected house. The first step, picking one house from the four available, generates two bits of information entropy. But now, in the 25% of cases where I pick the first house, the second step adds another three bits for choosing one person from among the eight people who live there. In another 25% of the cases, I need only another two bits to select one person from the four that live in the second house. And finally, in fully half of the cases, I need only a single bit to pick one person from the pair that lives in either the third or the fourth house.

Somehow, it seems to me, the weighted average of bit-counts for the two-step approach should come out to the same four-bit total that the single-step method requires. But I cannot get the figures to add up, so clearly there is more to the math than I am considering. I was expecting to be able to simply add the entropy of the first step to the probability-weighted entropies of the second step, like so:

(picking a house) + (picking a person in that house)
    == log(4) + [(1/4)*log(8) + (1/4)*log(4) + (1/4)*log(2) + (1/4)*log(2)]

But this produces a result of 3.75 bits, and not the 4 bits that I am expecting. Here is a bit of Python that I used to evaluate this.

from math import log

def log2(x):
    return log(x, 2)

# entropy of the house choice plus the weighted average entropy
# of the person choice within each house
x = log2(4) + ((1.0/4)*log2(8) + (1.0/4)*log2(4) + (1.0/4)*log2(2) + (1.0/4)*log2(2))
print(x)  # prints 3.75

So, something is missing from my figures. Can anyone point me in the right direction?


If you choose a house at random (with uniform probability, UP for short) and then choose a resident at random (UP), you are not choosing one out of 16 UP -- you get a somewhat skewed distribution, which unsurprisingly yields lower entropy (UP maximizes entropy). Eight people are selected with probability 1/32 each, four with probability 1/16 each, and the remaining four with probability 1/8 each. This distribution has an entropy of 8*(1/32)*log2(32) + 4*(1/16)*log2(16) + 4*(1/8)*log2(8) = 1.25 + 1 + 1.5 = 3.75 bits, just as you computed with your different approach.
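A quick way to check this is to compute the entropy of that skewed per-person distribution directly; the following sketch (same style of Python as above) compares it with the uniform case:

from math import log

def log2(x):
    return log(x, 2)

def entropy(probs):
    # Shannon entropy in bits: H = -sum(p * log2(p))
    return -sum(p * log2(p) for p in probs)

# per-person probabilities under the two-step scheme:
# 8 people at 1/32, 4 people at 1/16, 4 people at 1/8
skewed = [1.0 / 32] * 8 + [1.0 / 16] * 4 + [1.0 / 8] * 4

print(entropy(skewed))           # 3.75 -- matches the two-step figure
print(entropy([1.0 / 16] * 16))  # 4.0  -- a truly uniform pick of one of 16 people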

