The
birthday paradox states that if there are 23 people in a room then there is roughly a 50/50 chance that two of them have the same birthday. This is not a
paradox in the sense of it leading to a logical contradiction; it is a paradox in the sense that it is a mathematical truth that contradicts common
intuition.
Calculating this probability (and related probabilities) is the birthday problem.
Note that, if you enter a room with 22 people, the chance that somebody there has the same birthday as you is not 50/50, but much lower. This is because the day of the year that must be the joint birthday is already given, namely, by your own birthday.
To compute the approximate probability that in a room of n people, at least two have the same birthday, we disregard leap years and assume that the 365 possible birthdays are equally likely.
The trick is to first calculate the probability that the n birthdays are different. This probability is given by
- <math>p = \frac{364}{365} \cdot \frac{363}{365} \cdot \frac{362}{365} \cdot \cdots \cdot \frac{365-n+1}{365}</math>
which can also be written:
- <math>p = { 365! \over 365^n (365-n)! }</math>
because the second person cannot have the same birthday as the first, the third cannot have the same birthday as the first two, etc.
If you compute the above probability p, then 1 - p is the probability that at least two persons have the same birthday. For n = 23 you will obtain a probability of about 0.507...
By contrast, the probability that someone in a room of n people has the same birthday as you is given by
- <math> 1- \left( \frac{364}{365} \right)^n </math>
which for
n = 22 gives only about 0.059.
The birthday paradox in its more generic sense applies to hash functions where the number of N-bit hashes you can generate before probably getting a collision is not 2N, but rather 2N/2. This is exploited by birthday attacks on cryptographical systems.
The birthday paradox and calculators
In his autobiography, Paul Halmos[?] wrote:
- "Hand-held calculators can be good things and they can have bad effects. The birthday problem can be used to exemplify a bad effect. A good way to attack the problem is to pose it in reverse: what's the largest number of people for which the probability is less than 1/2 that they all have different birthdays? .... the problem amounts to this: find the smallest n for which
- <math>\prod_{k=0}^{n-1}\left(1-\frac{k}{365}\right)<\frac{1}{2}.</math>
- The indicated product is dominated by
- <math>\left(\frac{1}{n}\sum_{k=0}^{n-1}\left(1-\frac{k}{365}\right)\right)^n<\left(\frac{1}{n}\int_0^n\left(1-\frac{x}{365}\right)\,dx\right)^n<\left(1-\frac{n}{730}\right)^n<e^{-n^2/730}.</math>
- The asserted domination comes from the celebrated relation between the geometric and arithmetic means; the next inequality comes from the definition of the definite integral, and the last one from 1 - x < e-x. .... The reasoning is based on important tools that all students of mathematics should have ready access to. The birthday problem used to be a splendid illustration of the advantages of pure thought over mechanical manipulation; the inequalities can be obtained in a minute or two, whereas the multiplications would take much longer, and be much more subject to error, whether the instrument is a pencil or an old-fashioned desk computer. .... What calculators do not yield is understanding, or mathematical facility, or a solid basis for more advanced, generalized theories. A pity."
External links
All Wikipedia text
is available under the
terms of the GNU Free Documentation License