Let's say the annual rainfall is modeled as a normal distribution with the mean of and the standard deviation of . A farmer will have crop only if the rainfall is between and . What is the probability that the farmer will have crop this year? To answer this question, you will dive into the topic of normal distribution.
Normal random variable
We can call a continuous random variable $X$ normal, or Gaussian, if its PDF can be described by the following formula:
$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
The normal distribution graph looks like a symmetric bell-shaped curve, centered around its mean.
The symbol $\mu$ is the mean, $\sigma$ is the standard deviation, and $\sigma^2$ is the variance of the normal distribution. $\sigma$ is assumed to be greater than zero.
To denote that a continuous random variable $X$ has a normal distribution, you can use the notation $X \sim N(\mu, \sigma^2)$.
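As a minimal sketch, the density of a normal random variable can be computed directly from its definition. The function name `normal_pdf` and the parameter values below are illustrative assumptions, not part of the original text; the standard library class `statistics.NormalDist` is used only as a cross-check.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma^2) at x, computed straight from the formula."""
    if sigma <= 0:
        raise ValueError("sigma must be greater than zero")
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)


# Illustrative parameters (hypothetical, not taken from the text).
mu, sigma = 10.0, 2.0

# The curve is symmetric around the mean: equal distances give equal densities.
left = normal_pdf(mu - 1.5, mu, sigma)
right = normal_pdf(mu + 1.5, mu, sigma)

# Cross-check against the standard library implementation.
library_value = NormalDist(mu, sigma).pdf(mu + 1.5)
print(left, right, library_value)
```

The check on `sigma` mirrors the requirement that the standard deviation be greater than zero.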
In probability theory, the normal distribution is considered one of the most important distributions. It describes the distribution of values for many natural phenomena well, and it has very convenient analytical properties.
Below are a few real-life examples of normally distributed data:
- Human height distribution across the population;
- Human IQ;
- Blood pressure;
- Errors in measurements.
Standard normal distribution
A normal distribution whose mean is equal to zero ($\mu = 0$) and whose standard deviation is equal to one ($\sigma = 1$) is called a standard normal distribution and is denoted as $Z \sim N(0, 1)$. The graph is centered around the value $0$. Have a look at it below.
The PDF formula for the standard normal variable becomes the following:
$$f_Z(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}$$
Standard normal table
The corresponding CDF of the standard normal distribution is denoted by $\Phi$:
$$\Phi(z) = P(Z \le z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{t^2}{2}}\, dt$$
These values correspond to the area under the PDF up to the required point $z$.
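To make the "area under the PDF" reading concrete, here is a rough sketch that approximates that area numerically with the trapezoidal rule. The helper names, the lower cutoff of -8 (below which the area is negligible), and the step count are assumptions chosen for illustration.

```python
import math


def std_normal_pdf(z: float) -> float:
    """PDF of the standard normal distribution."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


def area_up_to(z: float, lower: float = -8.0, steps: int = 20_000) -> float:
    """Trapezoidal approximation of the area under the PDF on [lower, z]."""
    h = (z - lower) / steps
    total = 0.5 * (std_normal_pdf(lower) + std_normal_pdf(z))
    for i in range(1, steps):
        total += std_normal_pdf(lower + i * h)
    return total * h


# Approximates the CDF value at z = 1.
print(area_up_to(1.0))
```

The result should agree with the tabulated CDF value for the same point to several decimal places.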
For example, the CDF value for $z = 1$ is $\Phi(1)$, and it is equal to the blue shaded area under the PDF of the standard normal distribution illustrated below:
Don't be afraid of the formula, because there is a table, known as a standard normal table or Z-table, which records CDF values for the standard normal distribution. It is a very useful tool for solving normal distribution problems.
Standard normal table
From the table above, you can get the probability of an event that you are considering, if its distribution is normal. As for the example above, the CDF value for $z = 1$ is $\Phi(1) = 0.84134$. Thus, the probability of $Z$ being less than one is about 84%.
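If no table is at hand, the same values can be computed. The sketch below assumes the well-known identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}\left(z/\sqrt{2}\right)\right)$ and uses `math.erf` from the Python standard library. The names `phi` and `z_table_row` are hypothetical helpers, and the row layout (second decimal of $z$ across the columns) is the usual Z-table convention rather than something stated in the text.

```python
import math


def phi(z: float) -> float:
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_table_row(z_tenths: float) -> list:
    """One Z-table row: Phi(z_tenths + 0.00), ..., Phi(z_tenths + 0.09)."""
    return [round(phi(z_tenths + k / 100.0), 5) for k in range(10)]


print(round(phi(1.0), 5))  # 0.84134, the table entry for z = 1.00
print(z_table_row(1.0))    # the row for z = 1.00 through z = 1.09
```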
If you want to see the Z-table in full, just refer to the special website.
Standardizing
You might wonder how to solve a general normal distribution problem, where the mean is not zero and the variance is not one. Obviously, you can't have an infinite number of tables for every combination of mean and variance. The solution is to convert the general normal distribution into the standard normal distribution; this process is known as standardizing.
Let $X$ be a normal random variable with mean $\mu$ and variance $\sigma^2$. In order to convert $X$ into the standard normal variable $Z$, you will use the Z-score formula:
$$Z = \frac{X - \mu}{\sigma}$$
$Z$ is the "z-score" (standard score): it is the value that you will look up in the standard normal table to get the required probability.
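As a small illustration of standardizing, the sketch below converts a value to its z-score and confirms that the probability looked up on the standard scale equals the probability on the original scale. The numbers are hypothetical and chosen only for the example; `statistics.NormalDist` stands in for the Z-table.

```python
from statistics import NormalDist


def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardize a value x drawn from N(mu, sigma^2)."""
    return (x - mu) / sigma


# Illustrative values (hypothetical, not from the text).
mu, sigma, x = 70.0, 5.0, 77.5

z = z_score(x, mu, sigma)              # (77.5 - 70) / 5 = 1.5
via_table = NormalDist().cdf(z)        # P(Z <= z), what a Z-table stores
direct = NormalDist(mu, sigma).cdf(x)  # P(X <= x) on the original scale
print(z, via_table, direct)
```

Both probabilities coincide, which is why a single table for the standard normal distribution is enough.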
Probability of getting crop this year
In order to solve a general normal distribution problem, you have to follow a few simple steps:
- Express the problem in terms of normal distribution.
- Standardize the given normal distribution.
- Look up the probability using the Z-score result (or results) in the standard normal table and calculate.
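The three steps above can be sketched as one short function. This is only an illustration under stated assumptions: `prob_between` is a hypothetical helper, the sample problem uses made-up parameters rather than the rainfall values, and `NormalDist().cdf` plays the role of the table lookup.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def prob_between(mu: float, sigma: float, low: float, high: float) -> float:
    """P(low < X < high) for X ~ N(mu, sigma^2), computed via standardizing."""
    # Step 2: standardize both boundaries.
    z_low = (low - mu) / sigma
    z_high = (high - mu) / sigma
    # Step 3: "look up" both CDF values and subtract them.
    return STANDARD_NORMAL.cdf(z_high) - STANDARD_NORMAL.cdf(z_low)


# Step 1 (hypothetical problem): X ~ N(100, 15^2), find P(85 < X < 115).
p = prob_between(mu=100.0, sigma=15.0, low=85.0, high=115.0)
print(round(p, 4))  # 0.6827
```

Both boundaries here lie exactly one standard deviation from the mean, so the result is the familiar "about 68%" figure.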
Now you can go back to the problem formulated at the very beginning of the topic. To get the probability that the farmer will have crop this year, you should follow the algorithm above.
1. Express the problem in terms of normal distribution.
Let annual rainfall be represented as a continuous random variable $X$. Since you know that it was modeled as a normal distribution, you can write it down in the form of $X \sim N(\mu, \sigma^2)$. You also know the mean value, which is , and the standard deviation, which is . Now you can put everything together for this problem:
You need to calculate the probability that the farmer will have crop this year. It will happen only if the rainfall is between and . Let's illustrate it in terms of probability:
2. Standardize the given normal distribution.
To standardize $X$ into the standard normal variable $Z$, you will have to apply the Z-score formula to both boundaries of interest:
3. Look up the probability using the Z-score result (or results) in the standard normal table and calculate.
The standard normal table only provides probability values up to a specific point of interest. Thus, you have to rearrange the formula above to look like this:
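In general form, the rearrangement can be written as follows. The symbols $a$ and $b$ are placeholders introduced here for the lower and upper rainfall bounds; they are not notation from the original problem.

$$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right)$$

If a table lists only non-negative values of $z$, the symmetry of the standard normal distribution gives the remaining values:

$$\Phi(-z) = 1 - \Phi(z)$$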
Let's see them in the graph and get their values from the standard normal table.
Thus, the probability of the farmer getting crop this year is , which means it is about 82%.
Conclusion
Below is a summary of the concepts covered in this topic:
- A continuous random variable $X$ is said to be normal or Gaussian if it has a PDF that can be described by the formula $f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$.
- The normal distribution graph looks like a symmetric bell-shaped curve, centered around its mean.
- To denote that a continuous random variable $X$ has a normal distribution, you can use the notation $X \sim N(\mu, \sigma^2)$.
- A normal distribution whose mean is equal to zero ($\mu = 0$) and whose standard deviation is equal to one ($\sigma = 1$) is called the standard normal distribution and is denoted as $Z \sim N(0, 1)$.
- The corresponding CDF of the standard normal distribution is the following: $\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{t^2}{2}}\, dt$.
- CDF values for the standard normal distribution are recorded in a table known as the standard normal table or Z-table.
- The process of converting a general normal distribution into the standard normal distribution is known as standardizing.
In the following topics, you will learn where the normal distribution plays a key role, for example in the central limit theorem. You will also learn about other distributions and their implementations, and you will apply your knowledge of random variables to learn some laws of statistics.