Consider the probability that no two people out of a group of
will have matching birthdays out of
equally possible birthdays. Start with an arbitrary person's birthday, then note that the probability that the second person's birthday is different is
, that the third person's birthday is different from the first two is
, and so on, up through the
th person. Explicitly,
| (1) | |||
| (2) |
But this can be written in terms of factorials as
| (3) |
so the probability that two or more people out of a group of
do have the same birthday is therefore
| (4) | |||
| (5) |
In general, let denote the probability that a birthday is shared by exactly
(and no more) people out of a group of
people. Then the probability that a birthday is shared by
or more people is given by
| (6) |
In general, can be computed using the recurrence relation
| (7) |
(Finch 1997). However, the time to compute this recursive function grows exponentially with and so rapidly becomes unwieldy.
If 365-day years have been assumed, i.e., the existence of leap days is ignored, and the distribution of birthdays is assumed to be uniform throughout the year (in actuality, there is a more than 6% increase from the average in September in the United States; Peterson 1998), then the number of people needed for there to be at least a 50% chance that at least two share birthdays is the smallest such that
. This is given by
, since
| (8) | |||
| (9) |
The number of people needed to obtain
for
, 2, ..., are 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, ... (OEIS A033810). The minimal number of people to give a 50% probability of having at least
coincident birthdays is 1, 23, 88, 187, 313, 460, 623, 798, 985, 1181, 1385, 1596, 1813, ... (OEIS A014088; Diaconis and Mosteller 1989).
The probability can be estimated as
| (10) | |||
| (11) |
where the latter has error
| (12) |
(Sayrafiezadeh 1994).
can be computed explicitly as
| (13) | |||
| (14) |
where is a binomial coefficient and
is a hypergeometric function. This gives the explicit formula for
as
| (15) | |||
| (16) |
where is a regularized hypergeometric function.
A good approximation to the number of people such that
is some given value can be given by solving the equation
| (17) |
for and taking
, where
is the ceiling function (Diaconis and Mosteller 1989). For
and
, 2, 3, ..., this formula gives
, 23, 88, 187, 313, 459, 622, 797, 983, 1179, 1382, 1592, 1809, ... (OEIS A050255), which differ from the true values by from 0 to 4. A much simpler but also poorer approximation for
such that
for
is given by
| (18) |
(Diaconis and Mosteller 1989), which gives 86, 185, 307, 448, 606, 778, 965, 1164, 1376, 1599, 1832, ... for , 4, ... (OEIS A050256).
The "almost" birthday problem, which asks the number of people needed such that two have a birthday within a day of each other, was considered by Abramson and Moser (1970), who showed that 14 people suffice. An approximation for the minimum number of people needed to get a 50-50 chance that two have a match within days out of
possible is given by
| (19) |
(Sevast'yanov 1972, Diaconis and Mosteller 1989).