
Continuous random variable


Imagine yourself in a hurry to get to your office. You are driving down the road with a lot of obstacles (other cars, bumps, traffic lights, etc.). So you have to adjust the vehicle's speed depending on these obstacles: slow it down when you are close to one, and speed it up a little bit when the road is clear. Usually, you don't know which obstacles you will face on the road. So we can say that the process of slowing down and speeding up is random (probabilistic).

If you look at the speedometer from time to time, you may notice that your car's speed takes all values from a certain speed interval (for example from 0 to 60 km per hour). Now, if you set a function that would give the exact value of your car's speed at a random moment of the trip, you get a special type of function which is called a continuous random variable.

This topic will introduce you to the notion of continuous random variables.

Distribution function

We are already familiar with discrete random variables. Such a variable takes a set of values \left \{ x_{1}, x_{2}, ..., x_{k}, ... \right \} with the assigned probabilities \left \{ p_{1}, p_{2}, ..., p_{k}, ... \right \} (both sets may be finite or infinite, but always simultaneously). This means that if you conduct a lot of trials, each time obtaining a specific value of the discrete random variable, approximately a fraction p_{1} of the trials will give the result x_{1}, approximately a fraction p_{2} will give the result x_{2}, and so on. The more trials you conduct, the closer these fractions get to the specified probabilities.

Note that the sum of the values in the probability set equals one:

p_{1} + p_{2} + ... + p_{k} + ... = 1
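This discrete picture is easy to check by simulation. The sketch below (in Python, purely illustrative and not part of the original text) models a fair six-sided die: the assigned probabilities sum to one, and the observed fraction of each value approaches its probability as the number of trials grows.

```python
import random

# A fair six-sided die as a discrete random variable:
# values 1..6, each with probability 1/6.
values = [1, 2, 3, 4, 5, 6]
probs = {v: 1 / 6 for v in values}

# The probabilities must sum to one.
assert abs(sum(probs.values()) - 1.0) < 1e-12

# With many trials, the observed fraction of each value
# approaches its probability.
random.seed(0)
n = 100_000
counts = {v: 0 for v in values}
for _ in range(n):
    counts[random.choice(values)] += 1

for v in values:
    assert abs(counts[v] / n - probs[v]) < 0.01
```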

The problem is that you cannot describe a continuous random variable in the same simple way, because it takes values from an interval (which, again, may be finite or infinite), and there is no way to list all the points of a given interval. So in order to define a continuous random variable we have to use certain tricks. There are two possible approaches, which are deeply connected to each other.

The first approach to define a continuous random variable is to use a cumulative distribution function (CDF).

For a given continuous random variable X, the cumulative distribution function F(x) can be defined as the probability that X takes a value less than or equal to x (here x \in \mathbb{R} is allowed to be any real number):

F(x)=P(X\leq x)

Note that it doesn't matter whether the inequality here is strict or not, since the probability of X taking the exact value x is always zero.

As one can easily see from this definition, as x grows, the distribution function accumulates all the values which X can possibly take. That is why it is called "cumulative" (or sometimes "integral"). We will not use the word "cumulative" further, referring to this function simply as a distribution function.

Fig.1 shows an example of a particular distribution function, which is defined on the whole number line (from -\infty to +\infty), although the graph has been plotted only from -10 to 10.


Fig.1. An example of the distribution function

The basic properties of a distribution function can be derived directly from its definition. They may seem obvious but they need to be written down.

  • Its domain is the whole number line;
  • Its range is the closed interval \left [ 0, 1 \right ];
  • It is a non-decreasing function: if x_{1} \leq x_{2}, then F(x_{1}) \leq F(x_{2});
  • It is a continuous function;
  • \lim\limits_{x\to-\infty}F(x)=0 and \lim\limits_{x\to+\infty}F(x)=1.

Since the events X\le x and X>x are mutually exclusive and together they make up a certain event (one that happens with probability 1), we have P(X>x)=1-F(x). This allows us to express the probability of X taking a value in a given interval [a,b] in terms of the distribution function:

P(a\le X\le b)=1-P(X\le a\ \text{or}\ X>b)=1-(F(a)+1-F(b))=F(b)-F(a)
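This identity is easy to verify numerically. The Python sketch below is purely illustrative: it uses an exponential CDF F(x) = 1 - e^{-x} as an example distribution (any valid CDF would do) and checks that F(b) - F(a) agrees with the complement rule P(X > x) = 1 - F(x).

```python
import math

# An example CDF (exponential with rate 1), chosen here just for illustration.
def F(x: float) -> float:
    return 1 - math.exp(-x) if x > 0 else 0.0

# P(a <= X <= b) = F(b) - F(a)
a, b = 1.0, 2.0
p = F(b) - F(a)

# Cross-check with the complement rule: P(X > x) = 1 - F(x),
# so P(a <= X <= b) = P(X > a) - P(X > b).
assert abs(p - ((1 - F(a)) - (1 - F(b)))) < 1e-12
```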

Probability density function

Another way to describe a continuous random variable is to define its density function (or simply density), which shows how values of a random variable are "spread" over its range. You may also think of density as a relative likelihood that a random variable takes the given value.

Let's consider an example just to get more familiar with the concept. Suppose that bacteria of a certain species typically live 1 to 3 hours. The corresponding continuous random variable is the lifespan of any particular bacterium. What is the probability that a bacterium lives exactly, say, 2 hours (here we assume that the time can be measured with absolute precision)? It equals zero. But a lot of bacteria have a lifespan somewhat close to 2 hours. So if you take an interval of time around the 2-hour mark, there is a non-zero probability that a bacterium's lifespan belongs to that interval. This means that the density of our random variable is positive around 2 hours.

Now, if for some reason our bacteria tend to live fast and die young (so most of them do not last even 2 hours), the density function would take larger values on the interval [1,2] than on the interval [2,3]. Finally, since every bacterium lives at least 1 hour and none exceeds a 3-hour lifespan, the density equals zero for x<1 and for x>3.

Now let's give the formal definition. A continuous random variable X is said to have probability density function f(x) if the following equality holds for all real a and b with a \le b:

P(a\le X\le b)=\int\limits_a^b f(x)\mathrm{d}x

Using the power of calculus, we immediately see that f(x)=\frac{\mathrm{d}}{\mathrm{d}x}F(x) whenever the derivative exists. And this is the previously mentioned connection between the two ways of defining a continuous random variable. Which of them is more convenient depends on the situation. The distribution function is intuitively clearer, but in many cases you don't know the exact formula for it. Instead, you have sample observations, i.e. a set of values of a random variable you have to work with. In this case, you should use the density function.
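The connection f(x) = F'(x) can also be checked numerically. In the illustrative Python sketch below, the density of an example exponential distribution (F(x) = 1 - e^{-x}, so f(x) = e^{-x}) is recovered from the CDF by a central finite difference; the distribution is an assumption made just for this demonstration.

```python
import math

def F(x: float) -> float:
    # Example CDF: exponential with rate 1.
    return 1 - math.exp(-x) if x > 0 else 0.0

def density(x: float, h: float = 1e-6) -> float:
    # Central difference: f(x) ≈ (F(x + h) - F(x - h)) / (2h).
    return (F(x + h) - F(x - h)) / (2 * h)

# The numerical derivative matches the exact density f(x) = e^{-x}.
for x in [0.5, 1.0, 2.0]:
    assert abs(density(x) - math.exp(-x)) < 1e-4
```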

Fig.2 shows the graph of the density function corresponding to the distribution function shown above (see Fig.1).


Fig.2. An example of a probability density function

Now we can list the main properties of probability density functions. Recall that

P(a\leq X\leq b) = \int\limits_{a}^{b}f(x)\mathrm{d}x = F(b)-F(a)

From this formula and the properties of the distribution function we deduce:

  • \int\limits_{-\infty}^{+\infty}f(x)\mathrm{d}x=1

or, in terms of graphs, the area under the plot of the density function always equals 1 (this is because F(+\infty)=1).

  • Since F(x) is a non-decreasing function, f(x) is a non-negative function (i.e. f(x)\ge0 for all x\in\mathbb{R}).
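Both properties are easy to confirm numerically for a concrete density. The Python sketch below (a sanity check, not part of the original text) takes the uniform density on [0, 10] and approximates the area under its plot with a simple Riemann sum.

```python
def f(x: float, a: float = 0.0, b: float = 10.0) -> float:
    # Uniform density on [a, b].
    return 1 / (b - a) if a <= x <= b else 0.0

# Midpoint Riemann-sum approximation of the total area under f.
n = 100_000
lo, hi = -5.0, 15.0            # wide enough to cover [0, 10]
dx = (hi - lo) / n
area = sum(f(lo + (i + 0.5) * dx) * dx for i in range(n))

# Property 1: the total area equals 1 (up to discretization error).
assert abs(area - 1.0) < 1e-3
# Property 2: the density is non-negative everywhere.
assert all(f(lo + i * dx) >= 0 for i in range(n + 1))
```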

Now let's consider an important example.

Uniform distribution

Suppose you are going down to the subway on the escalator, just about to step out onto the platform. You want to estimate how long you will have to wait for the train, but the only thing you know for sure is that a train arrives every 10 minutes. We can assume that you reach the platform at a random moment in time. How can we calculate the probability that the waiting time is, for example, no more than 5 minutes?

Let us start by defining the density function for this case. We know that your arrival is equally likely to happen at any moment between two consecutive trains. Therefore, the probability density is constant and equals \frac{1}{10} on the time interval [0, 10], and equals 0 beyond this interval. This may be written in the following form:

f(x)=\begin{cases} 0, & x < 0 \\ \frac{1}{10}, & x \in \left [ 0, 10 \right ] \\ 0, & x > 10 \end{cases}

And now we can calculate the probability of waiting for no more than 5 minutes for a train to come, i.e. P(0\leq X\leq 5).

Now we may either directly use the definition of the density function and calculate the integral (which is quite simple) or choose a more visual and descriptive approach by drawing the corresponding graph. Let's do both.

By definition:

P(0\leq X\leq 5)=\int\limits_0^5 f(x)\mathrm{d}x=\int\limits_0^5 \frac{1}{10}\mathrm{d}x=\frac{1}{10}\cdot \left.x\right|_{0}^{5}=\frac{5-0}{10}=0.5

The plot of f(x) is shown in Fig.3.


Fig.3. Uniform distribution with a=0 and b=10

The pink rectangular area here is the probability in question. Let's calculate it:

P(0\leq X\leq5) = (5-0)\times0.1=0.5

Hence the probability of waiting no more than 5 minutes is 50%.
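We can also confirm this answer by simulation. The Python sketch below (illustrative only) draws many random arrival times uniformly from [0, 10] and measures the fraction of waits that last at most 5 minutes; the fraction settles near 0.5, as computed above.

```python
import random

random.seed(42)
n = 200_000

# Waiting time is uniform on [0, 10] minutes; count waits of <= 5 minutes.
hits = sum(1 for _ in range(n) if random.uniform(0, 10) <= 5)
estimate = hits / n

# The Monte Carlo estimate agrees with the exact answer 0.5.
assert abs(estimate - 0.5) < 0.01
```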

This example is a particular case of the so-called uniform distribution. In the general case it can be described as follows. A uniform distribution is a distribution whose density function has a simple property: the variable takes values from a particular interval \left [ a, b \right ], and the density equals \frac{1}{b-a} on this interval and 0 beyond it. Such a density function can be written this way:

f(x)=\begin{cases} \frac{1}{b-a}, & x \in \left [ a, b \right ] \\ 0, & x \notin \left [ a, b \right ] \end{cases}

This formula allows us to give an expression for the distribution function of a uniform distribution in the general case:

F(x)=\begin{cases} 0, & x < a \\ \frac{x-a}{b-a}, & x \in \left [ a, b \right ] \\ 1, & x > b \end{cases}

Fig.4 shows its typical plot.


Fig.4. Distribution function with a=0 and b=10
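The two general formulas translate directly into code. The Python sketch below (an illustrative implementation, with function names of our own choosing) implements the uniform density and distribution function and reproduces the subway answer with a = 0 and b = 10.

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    # Density: 1/(b - a) on [a, b], zero elsewhere.
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x: float, a: float, b: float) -> float:
    # Distribution function: 0 below a, linear on [a, b], 1 above b.
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

# Reproduce the subway example with a = 0, b = 10:
# P(0 <= X <= 5) = F(5) - F(0) = 0.5.
assert uniform_cdf(5, 0, 10) - uniform_cdf(0, 0, 10) == 0.5
assert uniform_pdf(3, 0, 10) == 0.1
```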

Conclusion

Now we know that the probability density function and the distribution function are used to define continuous random variables. Moreover, we have used the uniform distribution to see the connection between the probability density function and the distribution function via simple visualizations and formulas. And finally, we have figured out how to calculate the probability that a continuous random variable takes values from a certain interval.
