Let's say the annual rainfall is modeled as a normal distribution with a mean of $80\ \text{cm}$ and a standard deviation of $20\ \text{cm}$. A farmer will have a crop only if the rainfall is between $40\ \text{cm}$ and $100\ \text{cm}$. What is the probability that the farmer will have a crop this year? To answer this question, you will dive into the topic of the normal distribution.

Normal random variable

We call a continuous random variable $X$ normal or Gaussian if its PDF is given by the following formula:

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-(x - \mu)^{2}/(2\sigma^{2})}$$

The normal distribution graph looks like a symmetric bell-shaped curve, centered around its mean.

normal distribution graph

The symbol $\mu$ is the mean, $\sigma$ is the standard deviation, and $\sigma^{2}$ is the variance of the normal distribution. $\sigma$ is assumed to be greater than zero.
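The PDF formula can be turned into a few lines of code. The sketch below is only an illustration: the function name `normal_pdf` is made up for this example, and the values $\mu = 80$ and $\sigma = 20$ are borrowed from the rainfall problem.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a normal distribution with mean mu and standard deviation sigma at x."""
    if sigma <= 0:
        raise ValueError("sigma must be greater than zero")
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


mu, sigma = 80.0, 20.0  # illustrative values taken from the rainfall problem

# The curve is highest at the mean...
print(normal_pdf(mu, mu, sigma))
# ...and symmetric around it: points at equal distance from the mean have equal density.
print(math.isclose(normal_pdf(mu - 10, mu, sigma), normal_pdf(mu + 10, mu, sigma)))
```

Note that the function returns a density, not a probability: probabilities come from areas under this curve.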

To denote that a continuous random variable $X$ has a normal distribution, you can use the notation $X \sim \mathcal{N}(\mu, \sigma^{2})$.

In probability theory, the normal distribution is considered one of the most important distributions. It describes, at least approximately, the distribution of values for many natural phenomena. It also has very convenient analytical properties.

Below are a few real-life examples of data that are approximately normally distributed:

  • Human height distribution across the population;
  • Human IQ;
  • Blood pressure;
  • Errors in measurements.

Standard normal distribution

A normal distribution where the mean is equal to zero ($\mu = 0$) and the standard deviation is equal to one ($\sigma = 1$) is called the standard normal distribution and is denoted as $\mathcal{N}(0,1)$. The graph is centered around the value $\mu = 0$. Have a look at it below.

standard normal distribution graph, centered at 0 with a peak of about 0.4, x in [-4, 4]

The PDF formula for the standard normal variable becomes the following:

$$f_X(x) = \frac{1}{\sqrt{2\pi}}e^{-x^{2}/2}$$

Standard normal table

The corresponding CDF of the standard normal distribution is denoted by $\Phi$:

$$\Phi(x) = P(X \leq x) = P(X < x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-t^{2}/2}\,dt$$

These values correspond to the area under the PDF up to the point $x$.

For example, the CDF value for $x = 1$ is $\Phi(1) = P(X \leq 1) = P(X < 1)$, and it is equal to the blue shaded area under the PDF of the standard normal distribution illustrated below:

standard normal distribution graph, centered at 0 with a peak of about 0.4, x in [-4, 4], area for x <= 1 shaded
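The interpretation of $\Phi$ as an area can be checked numerically. The sketch below makes two assumptions that are not part of the text above: it evaluates $\Phi$ through the error function, using the standard identity $\Phi(x) = \frac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)$, and it approximates the integral with a simple trapezoidal rule in which $-10$ stands in for $-\infty$. The helper names `phi` and `area_under_pdf` are made up for this example.

```python
import math


def phi(x: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def standard_pdf(t: float) -> float:
    """Standard normal PDF."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)


def area_under_pdf(upper: float, lower: float = -10.0, steps: int = 100_000) -> float:
    """Trapezoidal approximation of the area under the standard normal PDF.

    The lower limit -10 replaces minus infinity; the tail below it is negligible.
    """
    h = (upper - lower) / steps
    total = 0.5 * (standard_pdf(lower) + standard_pdf(upper))
    for i in range(1, steps):
        total += standard_pdf(lower + i * h)
    return total * h


print(round(phi(1.0), 5))             # 0.84134
print(round(area_under_pdf(1.0), 5))  # agrees with phi(1.0) to these decimals
```

Both numbers describe the same shaded area: one comes from a closed-form identity, the other from adding up thin slices under the curve.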

Don't be afraid of the formula: there is a table, known as the standard normal table or Z-table, which records CDF values for the standard normal distribution. It is a very useful tool for solving normal distribution problems.

standard normal table

From the table above, you can get the probability of an event that you are considering, if its distribution is normal. As for the example above, the CDF value for $x = 1$ is $\Phi(1) = P(X \leq 1) = P(X < 1) = 0.84134$. Thus, the probability of $X$ being less than one is about 84%.

If you want to see the Z-table in full, just refer to the special website.
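A few Z-table entries can also be reproduced in code. This is a minimal sketch, assuming the standard identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$; the helper names `phi` and `z_table_rows` are hypothetical, and the output is a flat list rather than the row-and-column layout of a printed table.

```python
import math


def phi(z: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_table_rows(z_values):
    """Return (z, Phi(z)) pairs rounded to five decimals, like Z-table entries."""
    return [(z, round(phi(z), 5)) for z in z_values]


# Print a small excerpt: one line per z value.
for z, probability in z_table_rows([-2.0, -1.0, 0.0, 1.0, 2.0]):
    print(f"{z:5.2f}  {probability:.5f}")
```

The line for $z = 1.00$ shows the same value, 0.84134, that is read from the table in the example above.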

Standardizing

You might wonder how to solve a general normal distribution problem, where the mean is not zero and the variance is not one. Obviously, you can't have an infinite number of tables for every possible mean and variance. The way to solve this is to convert a general normal distribution into the standard normal distribution, and this process is known as standardizing.

Let $X$ be a normal random variable with mean $\mu$ and variance $\sigma^{2} > 0$. In order to convert $X \sim \mathcal{N}(\mu, \sigma^{2})$ into a standard normal variable $Z \sim \mathcal{N}(0,1)$, you will use the Z-score formula:

$$Z = \frac{X-\mu}{\sigma}$$

$Z$ is the "z-score" (standard score): it is the value that you will look up in the standard normal table to get the required probability.
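The effect of standardizing can be seen on simulated data. The sketch below is an illustration under stated assumptions: the function name `z_score`, the seed, and the sample size are arbitrary choices, and $\mu = 80$, $\sigma = 20$ are taken from the rainfall problem. It draws values from a normal distribution, applies the Z-score formula to each one, and checks that the results have a mean close to zero and a standard deviation close to one.

```python
import random
import statistics


def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardize x: its distance from the mean, measured in standard deviations."""
    if sigma <= 0:
        raise ValueError("sigma must be greater than zero")
    return (x - mu) / sigma


rng = random.Random(0)  # fixed seed so that the run is reproducible
mu, sigma = 80.0, 20.0  # illustrative values taken from the rainfall problem

samples = [rng.gauss(mu, sigma) for _ in range(20_000)]
z_values = [z_score(x, mu, sigma) for x in samples]

print(statistics.fmean(z_values))   # close to 0
print(statistics.pstdev(z_values))  # close to 1
```

A simulation is not a proof, but it shows what the formula does: subtracting $\mu$ moves the center to zero, and dividing by $\sigma$ rescales the spread to one.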

Probability of getting a crop this year

In order to solve the general normal distribution problem, you have to follow a couple of simple steps:

  1. Express the problem in terms of normal distribution.
  2. Standardize the given normal distribution.
  3. Look up the probability in the standard normal table using the Z-score result (or results) and calculate.
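The three steps can be sketched as one small function. This is only an illustration: the name `interval_probability` is hypothetical, the table lookup is replaced by the identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$, and the numbers in the usage example ($\mu = 10$, $\sigma = 2$, interval from 8 to 12) are made up.

```python
import math


def phi(z: float) -> float:
    """Standard normal CDF, used here in place of a Z-table lookup."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def interval_probability(mu: float, sigma: float, lower: float, upper: float) -> float:
    """P(lower <= X <= upper) for a normal X with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be greater than zero")
    # Step 2: standardize both boundaries.
    z_lower = (lower - mu) / sigma
    z_upper = (upper - mu) / sigma
    # Step 3: look up both cumulative probabilities and subtract.
    return phi(z_upper) - phi(z_lower)


# Step 1: the problem is stated as P(8 <= X <= 12) for X with mean 10 and deviation 2.
print(round(interval_probability(10.0, 2.0, 8.0, 12.0), 5))  # 0.68269
```

Here both boundaries lie exactly one standard deviation from the mean, so the Z-scores are $-1$ and $1$.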

Now you can go back to the problem formulated at the very beginning of the topic. To get the probability that the farmer will have a crop this year, you should follow the algorithm above.

1. Express the problem in terms of normal distribution.

Let the annual rainfall be represented as a continuous random variable $X$. Since you know that it was modeled as a normal distribution, you can write it down in the form $X \sim \mathcal{N}(\mu, \sigma^{2})$. You also know the mean value, which is $\mu = 80\ \text{cm}$, and the standard deviation, which is $\sigma = 20\ \text{cm}$, so the variance is $\sigma^{2} = 400$. Now you can put everything together for this problem:

$$X \sim \mathcal{N}(80, 400)$$

You need to calculate the probability that the farmer will have a crop this year. It will happen only if the rainfall is between $40\ \text{cm}$ and $100\ \text{cm}$. Let's express it in terms of probability:

$$P(40 \leq X \leq 100)$$

2. Standardize the given normal distribution.

To standardize $X \sim \mathcal{N}(80, 400)$ into the standard normal variable $Z \sim \mathcal{N}(0,1)$, you will have to apply the Z-score formula $Z = \frac{X-\mu}{\sigma}$ to both boundaries of the interval:

$$P(40 \leq X \leq 100) = P\left(\frac{40-\mu}{\sigma} \leq \frac{X-\mu}{\sigma} \leq \frac{100-\mu}{\sigma}\right) = P\left(\frac{40-80}{20} \leq Z \leq \frac{100-80}{20}\right) = P(-2 \leq Z \leq 1)$$

3. Look up the probability in the standard normal table using the Z-score result (or results) and calculate.

The standard normal table only provides cumulative probabilities, that is, the probability of being at or below a given point. Thus, you have to rearrange the formula above to look like this:

$$P(-2 \leq Z \leq 1) = P(Z \leq 1) - P(Z \leq -2) = \Phi(1) - \Phi(-2)$$

Let's see them in the graph and get their values from the standard normal table.

$$\Phi(1) = P(Z \leq 1) = 0.84134$$

standard normal distribution graph, centered at 0 with a peak of about 0.4, x in [-4, 4], area for x <= 1 shaded

$$\Phi(-2) = P(Z \leq -2) = 0.02275$$

standard normal distribution graph, centered at 0 with a peak of about 0.4, x in [-4, 4], area for x <= -2 shaded

$$\Phi(1) - \Phi(-2) = P(Z \leq 1) - P(Z \leq -2) = 0.84134 - 0.02275 = 0.81859$$

standard normal distribution graph, centered at 0 with a peak of about 0.4, x in [-4, 4], area for -2 <= x <= 1 shaded

Thus, the probability of the farmer getting a crop this year is $0.81859$, which means it is about 82%.
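The result can be cross-checked without a table. The sketch below uses `NormalDist` from Python's standard `statistics` module, which is a tooling choice made for this example and not part of the method described above; the variable names are arbitrary. It works directly with the rainfall distribution, so no manual standardizing is needed.

```python
from statistics import NormalDist

# Rainfall modeled with mean 80 cm and standard deviation 20 cm.
rainfall = NormalDist(mu=80, sigma=20)

# P(40 <= X <= 100) as the difference of two CDF values.
exact = rainfall.cdf(100) - rainfall.cdf(40)

# The same quantity computed from the two Z-table values used above.
from_table = 0.84134 - 0.02275

print(round(exact, 5))       # 0.81859
print(round(from_table, 5))  # 0.81859
```

The two results agree to five decimals, which is the precision of the table values.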

Conclusion

Below is a summary of the concepts covered in this topic:

  • A continuous random variable $X$ is said to be normal or Gaussian if its PDF is given by the formula $f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-(x - \mu)^{2}/(2\sigma^{2})}$.

  • The normal distribution graph looks like a symmetric bell-shaped curve, centered around its mean.

  • To denote that a continuous random variable $X$ has a normal distribution, you can use the notation $X \sim \mathcal{N}(\mu, \sigma^{2})$.

  • A normal distribution where the mean is equal to zero ($\mu = 0$) and the standard deviation is equal to one ($\sigma = 1$) is called the standard normal distribution and is denoted as $\mathcal{N}(0,1)$.

  • The corresponding CDF of the standard normal distribution is the following: $\Phi(x) = P(X \leq x) = P(X < x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-t^{2}/2}\,dt$.

  • CDF values for the standard normal distribution are recorded in a table known as the standard normal table or Z-table.

  • The process of converting a general normal distribution into the standard normal distribution is known as standardizing.

In the following topics, you will learn where the normal distribution plays a key role, for example, in the central limit theorem. You will also learn about other distributions and their implementations, and you will apply your knowledge of random variables to learn some laws of statistics.
