
In this topic, we'll focus on the empirical rule, a result from probability theory. This statistical rule helps to build risk models for decisions with several possible outcomes. For example, investors use the empirical rule to estimate the likely range of future stock prices, marketing managers use it to reduce the likelihood of failure in product sales, and casinos use it to design games of chance that maximize their profit.

Probability

Basically, probability is the degree to which something is likely to happen. A probability $P$ can be expressed as a proportion that ranges from $0$ to $1$, or as a percentage ranging from $0\%$ to $100\%$. Roughly speaking, $0$ indicates that the future event is impossible, and $1$ indicates certainty.

A scale of probability of an event in range from 0 to 1.

For example, when a fair coin is tossed, there are two possible outcomes: heads $H$ or tails $T$. So, the probability of the coin landing heads, and likewise tails, is $50\%$ or $0.5$:

$P(H) = P(T) = 0.5$

To determine the probability of a single event, we first need to know the number of favorable outcomes and the total number of possible outcomes. Let's say we need to determine the probability of getting $4$ when throwing a die. In total, we have six possible outcomes: $1, 2, 3, 4, 5, 6$. There's only one favorable outcome for us: $4$. Now that we know the number of outcomes, we use the probability formula:


$\text{Probability of an event} = \dfrac{\text{number of ways it can occur}}{\text{total number of outcomes}}$

The probability of rolling a $4$ when throwing a die can be calculated as $P(4) = 1 / 6 \approx 0.167$. However, in these examples, we know the exact number of possible outcomes, and it's rather small. But what if we have millions of outcomes? Or what if we don't even know the exact number of outcomes? That's when the law of large numbers (LLN) comes into play.
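The die calculation can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `classical_probability` is made up for illustration, and it assumes that all listed outcomes are equally likely.

```python
from fractions import Fraction

def classical_probability(favorable, possible):
    """Share of equally likely outcomes that count as favorable."""
    possible = list(possible)
    hits = [outcome for outcome in possible if outcome in favorable]
    return Fraction(len(hits), len(possible))

die_faces = range(1, 7)  # the six faces of a die, assumed equally likely
p_four = classical_probability({4}, die_faces)
print(p_four, round(float(p_four), 3))  # 1/6 0.167
```

Using `Fraction` keeps the result exact, so the rounding to $0.167$ happens only when the value is printed.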

Chebyshev's inequality

LLN states that if the same experiment or study is repeated independently many times, the average of the results of the trials tends to be close to the expected value, and it gets closer as the number of trials grows. Note that the independence of the experiments is critical: LLN requires a set of random variables that don't influence one another. Drawing coins from a bag one after another without putting them back, while knowing, for example, how many gold and silver coins it holds, doesn't count as a series of independent experiments, because each time the probability of drawing another silver coin depends on the number of silver coins drawn previously.
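A small simulation can make the law of large numbers tangible. The sketch below rolls a simulated fair die many times and records the share of fours at a few checkpoints. That share is the average of a 0/1 indicator, so LLN says it should approach $1/6$. The function name, the seed, and the checkpoints are illustrative choices, not part of any standard recipe.

```python
import random

def running_share_of_fours(n_rolls, checkpoints=(10, 100, 1_000, 10_000, 100_000), seed=0):
    """Roll a fair die n_rolls times; return the share of fours at each checkpoint."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    fours = 0
    shares = {}
    for i in range(1, n_rolls + 1):
        if rng.randint(1, 6) == 4:  # randint includes both endpoints
            fours += 1
        if i in checkpoints:
            shares[i] = fours / i
    return shares

shares = running_share_of_fours(100_000)
for n, share in shares.items():
    print(f"{n:>7} rolls: share of fours = {share:.4f} (1/6 is about 0.1667)")
```

The early checkpoints can be far from $1/6$, while the later ones should settle near it. Each roll here is independent of the others, which is exactly the condition LLN relies on.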

Chebyshev's inequality is closely related to the law of large numbers. The theorem states that at least a certain proportion of any data set must fall within a particular range around the mean value. The width of this range is measured in standard deviations of the data. The mean is the average of all the numbers in the data set. Chebyshev's theorem is significant because it applies to a wide range of distributions. It can be stated mathematically in the form of an inequality, which is why the theorem is often referred to as Chebyshev's inequality.

Let $X$ be a random variable with a finite mean $\mu$ and a finite standard deviation $\sigma$, and let $k > 0$ be any positive number. The probability that $X$ differs from $\mu$ by at least $k$ standard deviations can't be more than $\dfrac{1}{k^2}$:

$P(|X - \mu| \geq k\sigma) \leq \dfrac{1}{k^2}$
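Taking the complement of this event gives the form that is often more convenient in practice, a lower bound on the share of values that lie close to the mean. The two sample values of $k$ are chosen here only for illustration:

```latex
P(|X - \mu| < k\sigma) = 1 - P(|X - \mu| \geq k\sigma) \geq 1 - \frac{1}{k^2}

k = 2: \quad 1 - \frac{1}{2^2} = \frac{3}{4} = 75\%
\qquad
k = 3: \quad 1 - \frac{1}{3^2} = \frac{8}{9} \approx 88.9\%
```

Note that for $k \leq 1$ the right-hand side is zero or negative, so the bound says nothing useful in that case.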

Now, let's look at an example. Suppose a class takes a test. The average score is $75$ and the standard deviation is $5$. What is the proportion of scores that fall between $65$ and $85$?

The mean is $75$. $65$ and $85$ are $10$ points below and above the mean, respectively. The standard deviation is $5$. Consequently, you want to determine the proportion of scores that fall within $10 / 5 = 2$ standard deviations of the mean.

$P(|X - 75| \geq 2 \cdot 5) \leq \dfrac{1}{2^2}$

$P \leq \dfrac{1}{4}$ or $P \leq 25\%$

Here we calculated an upper bound on the probability that a student's score falls $2$ or more standard deviations away from the mean ($65$ or lower, or $85$ or higher). But we need the opposite: the proportion of scores that fall less than $2$ standard deviations away, so $100\% - 25\% = 75\%$. Therefore, $P \geq 75\%$, which means that at least $75\%$ of the scores fall within the range of $65$ to $85$.
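The same calculation can be written as a short function. This is a sketch under the assumption that the interval is symmetric about the mean. The name `chebyshev_within` is hypothetical and does not come from any library.

```python
def chebyshev_within(mean, std, low, high):
    """Chebyshev lower bound on P(low < X < high) for an interval symmetric about the mean."""
    if std <= 0:
        raise ValueError("std must be positive")
    if abs((mean - low) - (high - mean)) > 1e-9:
        raise ValueError("interval must be symmetric about the mean")
    k = (high - mean) / std  # half-width of the interval in standard deviations
    if k <= 1:
        return 0.0  # 1 - 1/k^2 is not positive here, so the bound is trivial
    return 1 - 1 / k**2

bound = chebyshev_within(mean=75, std=5, low=65, high=85)
print(f"At least {bound:.0%} of scores lie between 65 and 85")  # At least 75% ...
```

The function needs only the mean and the standard deviation, which reflects the point of the theorem: no assumption is made about the shape of the distribution.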

Three-sigma rule

However, a lot of data follows the pattern of a normal distribution. In this case, we usually use the much more convenient $68$-$95$-$99.7$ rule, also known as the $3\sigma$ rule. It is sometimes called the empirical rule because it originally came from observations: "empirical" means "based on observation". You can use this rule of thumb when you are told your data is normal or nearly normal, or when its distribution is bell-shaped with a single peak. Basically, it plays the same role as Chebyshev's inequality, but gives more precise predictions for such data. Let's take a look at it:

  1. About $68\%$ of the values fall within one standard deviation $(1\sigma)$ of the mean: $P(|X - \mu| \leq \sigma) \geq 0.68$

  2. About $95\%$ of the values fall within two standard deviations $(2\sigma)$ of the mean: $P(|X - \mu| \leq 2\sigma) \geq 0.95$

  3. About $99.7\%$ of the values fall within three standard deviations $(3\sigma)$ of the mean: $P(|X - \mu| \leq 3\sigma) \geq 0.997$

Applying the empirical rule to a normal distribution histogram.
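The three percentages can be checked numerically. For a normal distribution, the share of values within $k$ standard deviations of the mean equals $\operatorname{erf}(k / \sqrt{2})$, which Python's standard `math.erf` can evaluate. The sketch below also prints the Chebyshev bound $1 - 1/k^2$ for comparison. The two function names are illustrative.

```python
import math

def normal_share_within(k):
    """P(|X - mu| <= k * sigma) when X is normally distributed."""
    return math.erf(k / math.sqrt(2))

def chebyshev_share_within(k):
    """Chebyshev lower bound on the same probability, valid for any distribution."""
    return max(0.0, 1 - 1 / k**2)

for k in (1, 2, 3):
    print(f"{k} sigma: normal {normal_share_within(k):.4f}, "
          f"Chebyshev bound {chebyshev_share_within(k):.4f}")
```

The normal values come out at roughly $0.6827$, $0.9545$ and $0.9973$, while the Chebyshev bounds are $0$, $0.75$ and about $0.8889$. This is the sense in which the empirical rule is more precise: it uses the extra knowledge that the data is normal.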

For example, suppose we have observed daily sales in a store for some time. The values follow a normal distribution with a mean of $150\ 000$ dollars and a standard deviation of $20\ 000$ dollars. Then, according to the $3\sigma$ rule, sales lower than $150\ 000 - 20\ 000 \cdot 3 = 90\ 000$ and higher than $150\ 000 + 20\ 000 \cdot 3 = 210\ 000$ are practically impossible events: together they account for only about $0.3\%$ of days. This means that such sales figures can usually be left out of consideration.
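The sales example can be sketched with `statistics.NormalDist` from the Python standard library (available since Python 3.8). The variable names are illustrative, and the model simply takes the normality of daily sales at face value.

```python
from statistics import NormalDist

# Assumed model from the example: mean 150 000, standard deviation 20 000
sales = NormalDist(mu=150_000, sigma=20_000)

low = sales.mean - 3 * sales.stdev   # lower 3-sigma limit
high = sales.mean + 3 * sales.stdev  # upper 3-sigma limit

# Probability of a day falling outside the 3-sigma band
p_outside = sales.cdf(low) + (1 - sales.cdf(high))

print(low, high)                                        # 90000.0 210000.0
print(f"share of days outside the band: {p_outside:.4f}")  # about 0.0027
```

A probability of about $0.0027$ corresponds to roughly one day in $370$, so such values are rare rather than strictly impossible.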

However, it is often important to get precise results. Let's look at another example. The intelligence quotient (IQ) scores of the human population are normally distributed with a mean of $100$ and a standard deviation of $15$. What percentage of people have an IQ between $100$ and $130$? According to the empirical rule:

$68\%$ of people have an IQ between $85$ and $115$ (IQ more than $100 - 15 \cdot 1$ and less than $100 + 15 \cdot 1$).

$95\%$ of people have an IQ between $70$ and $130$ (IQ more than $100 - 15 \cdot 2$ and less than $100 + 15 \cdot 2$).

$99.7\%$ of people have an IQ between $55$ and $145$ (IQ more than $100 - 15 \cdot 3$ and less than $100 + 15 \cdot 3$).

But we don't need data about the whole interval of IQ between $70$ and $130$, only about the part from $100$ to $130$. This area is to the right of the mean if we look at the empirical rule graph.

The percentage of people with an IQ between 100 and 130.

In this case, to get the answer, we halve the empirical rule percentage: because the normal distribution is symmetric about the mean, half of the $95\%$ lies between $100$ and $130$, so $95\% / 2 = 47.5\%$.
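The rule-of-thumb answer can be compared with the value given by the normal distribution itself. The sketch below uses `statistics.NormalDist` from the standard library. The helper `share_between` is a made-up name for illustration.

```python
from statistics import NormalDist

def share_between(dist, a, b):
    """P(a <= X <= b) for a NormalDist instance."""
    return dist.cdf(b) - dist.cdf(a)

iq = NormalDist(mu=100, sigma=15)  # model from the example
exact = share_between(iq, 100, 130)
rule_of_thumb = 0.95 / 2           # half of the empirical rule's 95%

print(f"normal model: {exact:.4f}, empirical rule: {rule_of_thumb:.4f}")
```

The normal model gives about $47.7\%$, so the empirical rule's $47.5\%$ is a slight underestimate caused by rounding $95.45\%$ down to $95\%$.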

Conclusion

In this topic, we got acquainted with the notion of probability and with several methods used to calculate or bound the probability of a given event:

  • The probability formula is applied in straightforward cases with a few equally likely outcomes and looks as follows:

$\text{Probability of an event} = \dfrac{\text{number of ways it can occur}}{\text{total number of outcomes}}$

  • Chebyshev's theorem is particularly useful when only the mean and variance are known, because it applies to any probability distribution for which they are finite.
  • The empirical rule is used to predict probable outcomes for a random variable that follows a normal distribution.