Imagine you were in a hurry and almost got on a bus when its doors rapidly closed right in front of you. "Don't worry," said an old lady at the bus stop, looking at your awkward situation. She told you that within the following 10 minutes, the next bus would definitely arrive. What is the probability that the next bus will arrive in no more than 8 minutes? The cumulative distribution function, abbreviated as CDF, is literally the answer to such a question. Let's see how it works and what a PMF has to do with it.
Intuition behind the CDF and its definition
Let's get back to the situation introduced above. You know for sure that the bus will arrive within the next 10 minutes. However, you don't know the exact moment when this will happen, or whether this bus system follows any other rules. The next bus may arrive within the next minute, or during the 10th minute.
Since you do not know any other rules, you can assume that the bus arrives with equal probability at any minute. Thus, the probability that the next bus will arrive within the first minute is $\frac{1}{10} = 0.1$. This means that the probability of its arrival within the first two minutes is the sum of the probabilities that the bus will arrive during the first minute and during the second one. That is, this probability is equal to $\frac{1}{10} + \frac{1}{10} = 0.2$.
Let's denote the number of minutes you wait for a bus as $X$ (rounded up: a wait of 3 minutes 20 seconds would mean that the bus arrived in the 4th minute of waiting). Then $P(X \le 2)$ is the probability that the waiting time is no more than two minutes, that is, that the bus will arrive in the first two minutes. The probability of such an event, as calculated above, is $0.2$.
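It should be pretty clear that if the bus arrives with equal probability at any moment within ten minutes, then it will arrive with a probability of $0.5$ in the first five minutes, as shown below:
$$P(X \le 5) = \frac{1}{10} + \frac{1}{10} + \frac{1}{10} + \frac{1}{10} + \frac{1}{10} = 0.5$$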
Accordingly, the arrival of the bus within the first eight minutes has a probability of $P(X \le 8) = 0.8$.
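The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch under the text's assumption that each of the 10 minutes is equally likely; the names `pmf` and `prob_within` are made up for this illustration.

```python
from fractions import Fraction

# Assumption from the text: the bus is equally likely to arrive
# during any of the 10 minutes, so each minute gets probability 1/10.
pmf = {minute: Fraction(1, 10) for minute in range(1, 11)}


def prob_within(minutes):
    """P(X <= minutes): add up the per-minute probabilities."""
    return sum(p for minute, p in pmf.items() if minute <= minutes)


for k in (2, 5, 8):
    print(f"P(X <= {k}) = {prob_within(k)} = {float(prob_within(k))}")
```

Exact fractions are used instead of floats so that the sums come out as 1/5, 1/2, and 4/5 without rounding noise.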
Now, how can you scale from this specific example to a more general case? Luckily, the CDF works beyond the bus waiting time example: it works for any random variable. Let's denote a random variable as $X$. Instead of the waiting time in your probability formula $P(X \le 8)$, $X$ now stands for a random variable that is not tied to a specific case. It is also reasonable that you can't limit a random variable's values to ten minutes; after all, now you want to deal with a more general case. Let's say that the random variable is bounded by some value $x$; then your formula takes the more general form $P(X \le x)$. This leads us to the formal definition of the CDF.
Formal definition of CDF
The cumulative distribution function of a real-valued random variable $X$ is the following function:
$$F_X(x) = P(X \le x),$$
where $P(X \le x)$ is the probability that $X$ takes a value less than or equal to $x$.
Since every value of the CDF is a probability, the CDF inherits the properties of probabilities, as you will see in the following sections.
Illustration of CDF
Let's plot the CDF graph for your bus example.
If $x$ is less than 1, then $F_X(x) = 0$, because the waiting time is rounded up: as soon as you start waiting at the bus stop, you start the count from $1$.
When $1 \le x < 2$, then $F_X(x) = 0.1$. If $2 \le x < 3$, then $F_X(x) = 0.2$.
You can probably already imagine what the whole graph will look like. As you might have guessed, with every new minute, the value of the function will increase by $0.1$. But can you guess what the largest value of such a function will be? Well, since a CDF value is a probability, it can't be greater than $1$. The function will take this value when you wait the entire 10 minutes; after all, the old lady at the bus stop told you that the bus would definitely arrive within 10 minutes. Then the whole graph will look like this:
Now you see the cumulative effect: when $x$ becomes bigger, the probability of $X$ being less than or equal to $x$ grows (or remains the same).
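The step shape described above can also be sketched in code. The function name `bus_cdf` is hypothetical, and the sketch assumes the uniform bus example with the waiting time rounded up to whole minutes from 1 to 10.

```python
import math


def bus_cdf(x):
    """F_X(x) = P(X <= x) for the uniform bus example (X takes values 1..10)."""
    if x < 1:
        return 0.0   # no probability mass below the first minute
    if x >= 10:
        return 1.0   # the bus has certainly arrived by minute 10
    # Between whole minutes the function stays flat: only completed
    # minutes contribute, each adding 1/10.
    return math.floor(x) / 10


for x in (0.5, 1, 1.9, 2, 7.3, 10, 12):
    print(x, bus_cdf(x))
```

The values at 1 and 1.9 coincide, which is the flat part of a step, while moving from 1.9 to 2 produces a jump of 0.1.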
Properties of CDF
As you have seen before, $F_X(x)$ is a non-decreasing function. This means that moving to the right along the x-axis will never decrease its value.
CDF's properties come from the fact that it is a probability function. Here they are:
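- Values of $F_X(x)$ lie between 0 and 1 (because the result of $P(X \le x)$ is a probability).
- For all $a \le b$, you have $F_X(a) \le F_X(b)$.
- $\lim_{x \to +\infty} F_X(x) = 1$, $\lim_{x \to -\infty} F_X(x) = 0$.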
The first limit means that no matter how high you set $x$, the CDF can't be higher than one.
The second limit also comes from the fact that the CDF is defined via probability. Therefore, it is bounded from below: it can't fall below zero, no matter how low you set $x$.
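These properties can be spot-checked numerically. The sketch below is an illustration rather than a proof: it evaluates the CDF of the uniform bus example on a finite grid of points, and the helper name `cdf` and the grid itself are choices made only for this example.

```python
from fractions import Fraction


def cdf(pmf, x):
    """F_X(x) = P(X <= x) for a discrete PMF given as {value: probability}."""
    return sum(p for value, p in pmf.items() if value <= x)


# Uniform bus example: each of the minutes 1..10 has probability 1/10.
pmf = {minute: Fraction(1, 10) for minute in range(1, 11)}

# Evaluate the CDF from -5.0 to 20.0 in steps of 0.25.
grid = [i / 4 for i in range(-20, 81)]
values = [cdf(pmf, x) for x in grid]

in_range = all(0 <= v <= 1 for v in values)
non_decreasing = all(a <= b for a, b in zip(values, values[1:]))

print(in_range, non_decreasing, values[0], values[-1])
```

Far to the left of the smallest possible value the function is 0, and far to the right of the largest one it is 1, which matches the two limits for this bounded example.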
Building CDF from scratch
At the very beginning, you assumed that the bus would arrive with an equal probability at any minute. You made this assumption because you didn't have any additional information. However, in real life that is often not the case: the bus doesn't have to arrive with the same probability during minute 1 and during minute 9.
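Imagine that the old lady from the bus stop also told you this: "The bus will not arrive in the next five minutes; most likely, it will arrive no earlier than 8 minutes from now". It would mean that $P(X \le 5) = 0$. Let's assume that the phrase "most likely" means that for every minute after the 8th, the probability of the bus arriving is the same value $p$, larger than the probability for any earlier minute; that is, $P(X = 9) = P(X = 10) = p$. Now you are left with the minutes 6, 7, and 8, and the sum of the probabilities for these minutes equals $1 - 2p$. Let's again assume that the probability is the same for all these minutes, that is, $\frac{1 - 2p}{3}$ for each of them.
If you put everything together, here's what you'll have: $P(X \le 5) = 0$, $P(X = 6) = P(X = 7) = P(X = 8) = \frac{1 - 2p}{3}$, and $P(X = 9) = P(X = 10) = p$.
By definition, you get the following CDF:
$$F_X(x) = \begin{cases}
0, & x < 6 \\
\frac{1 - 2p}{3}, & 6 \le x < 7 \\
\frac{2(1 - 2p)}{3}, & 7 \le x < 8 \\
1 - 2p, & 8 \le x < 9 \\
1 - p, & 9 \le x < 10 \\
1, & x \ge 10
\end{cases}$$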
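The same construction can be sketched in Python. The concrete number is an illustrative assumption, not a value taken from the scenario: each of the minutes 9 and 10 is given a probability of 0.35 (`p_late` below), which leaves 0.1 for each of the minutes 6, 7, and 8. All variable names are made up for this sketch.

```python
from fractions import Fraction
from itertools import accumulate

# Illustrative assumption: "most likely" is modelled as 0.35 for each
# of the two late minutes (9 and 10).
p_late = Fraction(35, 100)
# The remaining probability is split evenly between minutes 6, 7, and 8.
p_mid = (1 - 2 * p_late) / 3

pmf = {minute: Fraction(0) for minute in range(1, 6)}   # no bus in minutes 1..5
pmf.update({minute: p_mid for minute in (6, 7, 8)})
pmf.update({minute: p_late for minute in (9, 10)})
assert sum(pmf.values()) == 1   # a valid PMF must sum to one

# The CDF at each whole minute is the running sum of the PMF.
cdf = dict(zip(pmf, accumulate(pmf.values())))

for minute, value in cdf.items():
    print(minute, value)
```

With these numbers, the CDF stays at 0 through minute 5, rises by 1/10 at minutes 6, 7, and 8, and then by 7/20 at minutes 9 and 10, reaching 1.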
The graph will look like this:
CDF via PMF
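The probability mass function (PMF) gives the probability of each individual value of a discrete random variable, $P(X = x)$. For such a variable, the CDF is the running sum of the PMF: $F_X(x) = \sum_{k \le x} P(X = k)$. Hence, when it comes to building a CDF, the PMF shows the points where the CDF value changes, and by how much.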
Let's look at the graph that illustrates both the PMF (green) and the CDF (red) in your bus case:
The PMF shows that there is no chance of the bus arriving within the first 5 minutes, and specifies the weight of every remaining waiting minute. Each non-zero value of the PMF corresponds to an increase of the CDF by exactly that value.
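The link also works in the other direction: the size of each CDF jump gives back the PMF. Below is a minimal sketch of that idea. The CDF values at whole minutes are illustrative assumptions (a rise of 0.1 at minutes 6, 7, and 8, and of 0.35 at minutes 9 and 10), chosen only to match the shape of the bus story.

```python
from fractions import Fraction

# Illustrative CDF values at whole minutes (assumed numbers, see above).
cdf = {minute: Fraction(0) for minute in range(1, 6)}
cdf.update({
    6: Fraction(1, 10),
    7: Fraction(2, 10),
    8: Fraction(3, 10),
    9: Fraction(65, 100),
    10: Fraction(1),
})

pmf = {}
previous = Fraction(0)
for minute, value in cdf.items():
    pmf[minute] = value - previous   # size of the CDF jump at this minute
    previous = value

# Minutes where the PMF is non-zero are exactly where the CDF changes.
jump_points = [minute for minute, p in pmf.items() if p > 0]
print(jump_points)
```

Minutes 1 through 5 have a zero PMF value and therefore do not appear among the jump points.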
Conclusion
To sum up what this topic has covered, below are some crucial points.
- The CDF describes a distribution cumulatively: $F_X(x) = P(X \le x)$.
- The CDF is a non-decreasing function with values between 0 and 1.
- The PMF shows you where the CDF changes its value.