The birthday paradox, also known as the birthday problem, is considered a counterintuitive phenomenon in probability. It basically asks the question, if you have “n” people at a party, what is the probability that they will share a birthday?
First, here is a calculator for you to try inputting a number and seeing how the probability changes:
birthday paradox calculator
The probability of having a common birthday is:
Now let us get on to explaining the math behind the birthday paradox.
Let us try to solve this problem for just 2 people by hand. If there are say 2 people at a party what is the probability that they will share a birthday? Let us assume we are not dealing with leap years and that birthdays can only be from a set of 365 possible days. The first person has 365 possibilities for a birthday. The second person has 365 possibilities for a birthday. Assuming birthdays are distributed uniformly at random (this is not true in real life), together there are 365 times 365 possibilities of how these two people’s birthdays will fall out. From these possibilities only 365 of them (one for each day) feature the two people sharing a birthday. Thus, the probability is 365 divided by (365 * 365), or 1/365, which is 0.002739, or 0.2739% which is really miniscule.
Another way to solve this problem is to compute the probability that two people will have different birthdays and subtract that from 1 (the total probability mass). The first person has 365 possibilities to have a birthday and once this person’s birthday is known or assigned, the second person has 364 possibilities to have a distinct birthday. Thus the total possibilities are 365 * 364. The probability we are looking for is thus: (365 * 364) / (365 * 365), or 364/365. Subtracting this from 1 gives us the desired probability of 1/365.
Now let us make it more interesting. Let us assume we have 3 people and we wish to determine the probability that two out of these three people will share a birthday. Can we update our probability calculations? Again it is easy to determine the probability that the three will all have distinct birthdays and then subtract that from 1. Thus we compute 1 - (365 * 364 * 363)/(365 * 365 * 365), or approx 0.0082, or 0.82%.
Note that the probability of 3 people having a common birthday has increased from that of 2 people sharing a birthday, i.e., from 0.2739% to 0.82%.
In this manner if we keep continuing this calculation we can see the probabilities increasing and determine when they have crossed a threshold.
Birthday Paradox Calculator in Python
Here is a Python program to compute such probabilities as we increase the number of people:
The above Python program calculates and prints the probability of at least two people sharing the same birthday for groups of people ranging from 2 to 50.
The function prob_different_birthdays(n) calculates the probability that all n people in a group have different birthdays. It does this by multiplying the probabilities that each new person added to the group has a birthday different from those already in the group. For the first person, there are 365 days they could be born on, so the probability is 365/365 = 1. For the second person, there are 364 days left that they could be born on, so the probability is 364/365. For the third person, there are 363 days left, so the probability is 363/365, and so on. The function multiplies these probabilities together for n people.
Finally, as we discussed above, the function then subtracts this probability from 1 to get the probability that at least two people in the group share a birthday (since the event of all people having different birthdays and the event of at least two people sharing a birthday are complementary events).
The loop at the end of the program calls this function for each number of people from 2 to 50, and prints the number of people and the corresponding probability of at least two people sharing a birthday.
The output is:
The interesting thing to note about the above output is that the probability starts small (0.27%) but by the time we reach a threshold of 23 people we cross 50%! This is the birthday paradox, namely that only 23 people are needed for that probability to exceed 50%.
This paradox arises from the compounding power of exponents, where the number of comparisons between birthdays increases dramatically with the number of people in the group, leading to a higher probability of shared birthdays than expected. By the time we reach 50 people, note that the event is quite close to being a certainty (the probability is 97%). Of course to get certainty we will need to have at least 366 people.
The birthday paradox can also be understood in terms of the pigeonhole principle, which states that if you have n pigeons and are sorting them into m holes, where m is less than n, then at least one of the holes will contain more than one pigeon. Here the holes refer to birthdays and pigeons refer to people.
If we update our Python program to run not just from 2 to 50, but from 2 to 366, ie by updating the program to:
The output will look like (only results from 50 people are shown):
Wow - what happened? How can we get a probability of 1.0 with just 153 people? Something sounds fishy. The answer is clearly you will need 366 people to guarantee a probability of 1 but because of roundoff errors in multiplying so many small numbers (and then subtracting them from 1) the Python program starts producing a result of 1 by 153 people. We should be careful when we do numerical calculations because by default Python programs do not use infinite precision arithmetic (or work with numbers as fractions).
Here is a sample run:
The birthday paradox has various real-world applications, including the cryptographic attack called the birthday attack, which exploits this probabilistic model to reduce the complexity of finding a collision for a hash function.
Kodeclik is an online coding academy for kids and teens to learn real world programming. Kids are introduced to coding in a fun and exciting way and are challeged to higher levels with engaging, high quality content.