Empirical rule

Provided by: Edvancium

In this topic, we'll focus on the empirical rule, a result from probability theory. This statistical rule helps build risk models for decisions with uncertain outcomes. For example, investors use the empirical rule to estimate the likely range of future stock prices, marketing managers use it to reduce the likelihood of failure in product sales, and casinos use it to design games of chance that maximize their profit.

Probability

Probability is a measure of how likely an event is to happen. A probability $P$ can be expressed as a proportion ranging from $0$ to $1$ or, equivalently, as a percentage ranging from $0\%$ to $100\%$. Roughly speaking, $0$ indicates that the event is impossible, and $1$ indicates that it is certain.

A scale of probability of an event in range from 0 to 1.

For example, when a fair coin is tossed, there are two possible outcomes: heads $H$ or tails $T$. So, the probability of the coin landing heads, and likewise tails, is $50\%$ or $0.5$:

$P(H)=P(T)=0.5$

To determine the probability of a single event when all outcomes are equally likely, we first need to know the number of favorable outcomes and the total number of possible outcomes. Let's say we need to determine the probability of getting $4$ when throwing a die. In total, we have six possible outcomes: $1, 2, 3, 4, 5, 6$. There's only one favorable outcome for us: $4$. Now that we know the number of outcomes, we use the probability formula:

$\text{Probability of an event}= \dfrac{\text{number of ways it can occur}}{\text{ total number of outcomes}}$
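As a minimal sketch, the formula can be written as a small Python function and applied to the die example. The function name `classical_probability` is illustrative (not part of any library), and the formula assumes that all outcomes are equally likely.

```python
from fractions import Fraction


def classical_probability(favorable, total):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(favorable, total)


# A fair six-sided die: one favorable outcome (rolling a 4) out of six.
outcomes = [1, 2, 3, 4, 5, 6]
favorable = [x for x in outcomes if x == 4]

p_four = classical_probability(len(favorable), len(outcomes))
print(p_four, round(float(p_four), 3))
```

Using `Fraction` keeps the result exact ($1/6$) and converts it to a decimal only for display.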

The probability of rolling a $4$ when throwing a die can be calculated as $P(4) = 1 / 6 ≈ 0.167$. However, in these examples, we know the exact number of possible outcomes, and it's rather small. But what if we have millions of outcomes? Or what if we don't even know the exact number of outcomes? That's when the law of large numbers (LLN) comes into play.

Chebyshev's inequality

The LLN states that if the same experiment or study is repeated independently many times, the average of the results of the trials tends to get closer to the expected value as the number of trials grows. Note that the independence of the experiments is critical: the LLN needs a set of random variables that do not influence one another. Taking coins out of a bag one after another without putting them back, while knowing, for example, how many gold and silver ones are there, doesn't count as a series of independent experiments, because each time the probability of taking out another silver coin depends on the number of silver coins taken out previously.
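The law of large numbers can be illustrated with a short simulation: the relative frequency of rolling a $4$ should drift toward $1/6$ as the number of independent rolls grows. This is only a sketch; the function name `relative_frequency_of_four` and the fixed seed are assumptions made for reproducibility.

```python
import random


def relative_frequency_of_four(n_rolls, seed=0):
    """Share of rolls that show a 4 in n_rolls independent fair-die rolls."""
    rng = random.Random(seed)  # private generator, so results are repeatable
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 4)
    return hits / n_rolls


# The more rolls we make, the closer the frequency tends to get to 1/6.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_four(n))
```

For a small number of rolls the frequency can be far from $1/6 ≈ 0.167$; for a large number it is typically very close.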

Chebyshev's inequality is closely related to the law of large numbers: it is the standard tool for proving its weak form. The theorem states that at least a certain proportion of any data set must fall within a given number of standard deviations of the mean. The mean is the average of all the numbers in the data set. Chebyshev's theorem is significant because it applies to a wide range of distributions. It can be stated mathematically in the form of an inequality, which is why the theorem is often referred to as Chebyshev's inequality.

Let $X$ be a random variable with a finite mean $μ$ and a finite standard deviation $σ$, and let $k > 0$ be any positive number. The probability that $X$ differs from $μ$ by at least $k$ standard deviations can't be more than ${1 \over k^2}$. The bound is only informative for $k > 1$, since for $k ≤ 1$ the right-hand side is at least $1$.

$P ( | X − μ | ≥ k σ ) ≤ {1 \over k^2}$

Now, let's look at an example. Suppose a class takes a test. The average score is $75$ and the standard deviation is $5$. What proportion of scores is guaranteed to fall between $65$ and $85$?

The mean is $75$ . $65$ and $85$ are $10$ points below and above the mean, respectively. The standard deviation is $5$ . Consequently, you want to determine the proportion of scores that fall within $10 / 5 = 2$ standard deviations of the mean.

$P ( | X − 75 | ≥ 2 \cdot 5 ) ≤ {1 \over 2^2}$

$P ≤ {1 \over 4}$ or $P ≤ 25\%$

Here we bounded the probability that a student's score lies $2$ or more standard deviations away from the mean ($65$ or below, or $85$ or above). But we need the opposite: the proportion of scores that lie less than $2$ standard deviations away, so $100\%-25\%=75\%$. Therefore, $P ≥ 75\%$, which means that at least $75\%$ of the scores fall between $65$ and $85$.
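The test-score calculation can be sketched in Python as follows. The helper name `chebyshev_tail_bound` is hypothetical. The second part is an extra illustration that is not in the text: it draws a skewed, non-normal sample (exponential, an assumption chosen only for the demo) and checks that the share of values at least $2$ standard deviations from the sample mean stays under the bound.

```python
import random
import statistics


def chebyshev_tail_bound(k):
    """Upper bound on P(|X - mu| >= k * sigma) for any k > 0."""
    if k <= 0:
        raise ValueError("k must be positive")
    # 1 / k^2 exceeds 1 for k < 1; a probability never does, so cap it.
    return min(1.0, 1.0 / k**2)


# Test-score example: mean 75, standard deviation 5, interval 65..85.
mean, std = 75, 5
k = (85 - mean) / std                      # number of standard deviations
outside_at_most = chebyshev_tail_bound(k)  # share outside the interval
inside_at_least = 1 - outside_at_most      # share guaranteed inside
print(k, outside_at_most, inside_at_least)

# Illustration on skewed data: the bound still holds for the sample itself.
rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(10_000)]
m = statistics.fmean(sample)
s = statistics.pstdev(sample)  # population standard deviation of the sample
far = sum(1 for x in sample if abs(x - m) >= 2 * s) / len(sample)
print(far <= chebyshev_tail_bound(2))
```

The observed share for the skewed sample is usually far below $25\%$: Chebyshev's inequality is a guarantee for any distribution, not a precise estimate.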

Three-sigma rule

Many real-world data sets, however, approximately follow a normal distribution. In this case, we usually use the much more convenient $68-95-99.7$ rule, also known as the $3σ$ rule. It is also called the empirical rule because it originally came from observations: "empirical" means "based on observation". You can use this rule of thumb when you are told your data is normal or nearly normal, that is, when it has a symmetric, bell-shaped distribution with a single peak. Like Chebyshev's inequality, it describes how much of the data lies within a given number of standard deviations of the mean, but because it assumes normality, it gives much more precise predictions. Let's take a look at it:

About $68\%$ of the values fall within one standard deviation $(1σ)$ of the mean: $P ( | X − μ | ≤ σ ) ≈ 0.68$
About $95\%$ of the values fall within two standard deviations $(2σ)$ of the mean: $P ( | X − μ | ≤ 2σ ) ≈ 0.95$
About $99.7\%$ of the values fall within three standard deviations $(3σ)$ of the mean: $P ( | X − μ | ≤ 3σ ) ≈ 0.997$
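The three percentages can be checked numerically. For a normal random variable, $P ( | X − μ | ≤ kσ )$ equals $\operatorname{erf}(k / \sqrt{2})$, which depends on neither $μ$ nor $σ$. The sketch below uses `math.erf` from the Python standard library; the function name `normal_within_k_sigma` is illustrative.

```python
import math


def normal_within_k_sigma(k):
    """P(|X - mu| <= k * sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sigma: {normal_within_k_sigma(k):.4f}")
# within 1 sigma: 0.6827
# within 2 sigma: 0.9545
# within 3 sigma: 0.9973
```

The rounded values $68\%$, $95\%$ and $99.7\%$ are what gives the rule its name.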

For example, suppose we have observed daily sales in a store for some time. The values follow a normal distribution with a mean of $150\ 000$ dollars and a standard deviation of $20\ 000$ dollars. Then, according to the $3σ$ rule, daily sales lower than $150\ 000 - 20\ 000\cdot3 = 90\ 000$ or higher than $150\ 000 + 20\ 000\cdot3 = 210\ 000$ dollars are very unlikely: together they account for only about $0.3\%$ of days. In practice, such values are often treated as practically impossible and left out of consideration.
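The sales example can be reproduced with `statistics.NormalDist` from the Python standard library (available since Python 3.8). This is a sketch under the text's assumption that daily sales are exactly normal; the variable names are illustrative.

```python
from statistics import NormalDist

mean_sales, std_sales = 150_000, 20_000

# Three-sigma limits around the mean.
lower = mean_sales - 3 * std_sales
upper = mean_sales + 3 * std_sales

# Probability that a day's sales fall outside these limits.
sales = NormalDist(mu=mean_sales, sigma=std_sales)
p_outside = sales.cdf(lower) + (1 - sales.cdf(upper))

print(lower, upper, round(p_outside, 4))
```

The probability of landing outside the limits is roughly $0.0027$, which matches the $100\% - 99.7\% = 0.3\%$ left over by the rule.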

The rule can also answer more specific questions. Let's look at another example. Intelligence quotient (IQ) scores in the human population are normally distributed with a mean of $100$ and a standard deviation of $15$. What percentage of people have an IQ between $100$ and $130$? According to the empirical rule:

About $68\%$ of people have an IQ between $85$ and $115$ (IQ more than $100-15\cdot1$ and less than $100+15\cdot1$).

About $95\%$ of people have an IQ between $70$ and $130$ (IQ more than $100-15\cdot2$ and less than $100+15\cdot2$).

About $99.7\%$ of people have an IQ between $55$ and $145$ (IQ more than $100-15\cdot3$ and less than $100+15\cdot3$).

But we don't need the whole interval of IQ between $70$ and $130$, only the part from $100$ to $130$. This area is to the right of the mean if we look at the empirical rule graph.

The percentage of people with an IQ between 100 and 130.

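Because the normal distribution is symmetric about the mean, the interval from $100$ to $130$ covers exactly half of the probability of the interval from $70$ to $130$. So, to get the answer, we halve the corresponding empirical rule percentage: $95\% / 2 = 47.5\%$.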
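As a quick check, the rule-of-thumb answer can be compared with the value computed from the normal distribution itself. The sketch uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# Empirical-rule estimate: half of the 95% that lies within two sigmas.
rule_estimate = 0.95 / 2

# Value from the normal cumulative distribution function.
exact = iq.cdf(130) - iq.cdf(100)

print(f"rule: {rule_estimate:.3f}, normal CDF: {exact:.4f}")
```

The CDF gives about $0.4772$, so the $47.5\%$ from the empirical rule is off by only a fraction of a percentage point.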

Conclusion

In this topic, we got acquainted with the notion of probability and with several ways to calculate or bound the probability of an event:

The probability formula: applied in straightforward cases with a small, known number of equally likely outcomes. It looks as follows:

$\text{Probability of an event}= \dfrac{\text{number of ways it can occur}}{\text{ total number of outcomes}}$

Chebyshev's theorem: particularly useful when only the mean and variance are known, as it applies to any probability distribution for which they are finite.
The empirical rule: used to estimate probable outcomes for a random variable that follows a normal distribution.
