Quite often, the sample space does not help us determine the numerical characteristics of an experiment. An excellent example of this is an experiment with the sum of two dice. Here, we are always dealing with a set of pairs. However, it is much easier to work with one set of numbers representing the sum of the elements in a pair.
For this purpose, we will now become familiar with the concepts of a discrete random variable, probability distribution, and the independence of a random variable. These concepts will later help us define characteristics of a random variable, such as expected value and dispersion.
In fact, you deal with discrete random variables all the time in real life: when you pick an apple from the fridge, knowing that some apples are green and some are red, or when you choose a route through the park, knowing that some routes are long and others are short.
Discrete random variable
Imagine that your friend said they picked a number from 1 to 10 (the exact range doesn't matter) and asked you to guess it in three tries. We know that the sample space has finitely many elements, here 10, so it's discrete. We also know that your friend picked the number at random. So, we can say that this number is a discrete random variable. What would a stricter definition look like? And how would we use such an entity?
A random variable is an arbitrary function $X\colon \Omega \to \mathbb{R}$ from the sample space to the real numbers $\mathbb{R}$. However, we will only need a discrete random variable $X\colon \Omega \to \mathbb{Z}$, which is a function from the sample space to the integers $\mathbb{Z}$.
Let's discuss some more examples to better understand this concept.
Imagine tossing two dice. In this experiment, $X$ can be defined as:

$X((a, b)) = a + b,$

where $a$ is the result of the first die and $b$ is the result of the second. The argument here is a pair because the sample space is a set of pairs. All the possible outcomes are presented in the following picture.
Imagine that the first die shows 4 points and the second one shows 2 (any concrete values work the same way). Their sum will look like this:

$X((4, 2)) = 4 + 2 = 6.$
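To make the mapping concrete, here is a minimal Python sketch (an illustration, not part of the original example) that enumerates the sample space of pairs and applies $X$ to an outcome:

```python
from itertools import product

# Enumerate the sample space: all ordered pairs (a, b) with a, b in 1..6.
sample_space = list(product(range(1, 7), repeat=2))

def X(outcome):
    """The random variable X maps a pair of die results to their sum."""
    a, b = outcome
    return a + b

print(len(sample_space))  # 36 elementary outcomes
print(X((4, 2)))          # 6
```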
Let's move on to a coin toss, since it is also a random experiment. Let's say we tossed a coin $n$ times and want to find out how many times tails came up. In that case, $\Omega$ is the set of binary strings of length $n$, where $1$ stands for tails and $0$ for heads.
So, if we toss a coin three times and tails come up twice, one possible binary-string representation is $110$ (the strings $101$ and $011$ describe the other orderings). We can define $X$ as the sum of the bits in the string, so it gives us the number of tails: $X(110) = 1 + 1 + 0 = 2$.
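A short sketch of the same idea in Python, assuming strings of length 3 for brevity:

```python
from itertools import product

# The sample space: all binary strings of length 3, where '1' is a tail.
omega = ["".join(bits) for bits in product("01", repeat=3)]

def X(s):
    """X counts the tails (the '1' bits) in a string of tosses."""
    return s.count("1")

print(omega)     # ['000', '001', '010', ..., '111'] -- 8 strings
print(X("110"))  # 2 tails in three tosses
```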
We've already learned about one-dimensional random variables. But such variables can also have more than one dimension! For example, when you pick a seat at the cinema and expect to be seated next to your friend who already got a ticket somewhere in the middle, you are dealing with a two-dimensional random variable.
Distribution
Technically, we cannot calculate the probability of a random variable itself: it maps elementary outcomes to numbers and is not an event. However, as a workaround, we can define the probability of a random variable taking a specific value instead.
Let's assume that $\Omega$ is a finite set. The set of values of $X$ is finite, too; we can denote it as $\{x_1, x_2, \dots, x_n\}$. We can treat the event $\{X = x_i\}$ as the set of points of the sample space for which $X(\omega) = x_i$. Mathematically, this statement should be written as:

$\{X = x_i\} = \{\omega \in \Omega \mid X(\omega) = x_i\}.$
We can calculate the probability of $\{X = x_i\}$ by summing the probabilities of every outcome in this set:

$P(X = x_i) = \sum_{\omega\,:\,X(\omega) = x_i} P(\omega).$
So, a function that describes the probability of a variable taking a specific value is called the probability distribution. It can also be represented as a set of pairs:

$\{(x_1, P(X = x_1)), (x_2, P(X = x_2)), \dots, (x_n, P(X = x_n))\}.$
Let's take our example with two dice and calculate the distribution. There are $36$ different outcomes in this experiment ($6 \times 6$ ordered pairs). Now we have to calculate the probability of each possible sum. How do we do it? We have to use all sample points picked out by the event. So, the probability that the score sum is $2$ is $\frac{1}{36}$, as there is only one suitable outcome, $(1, 1)$. If we are looking for $P(X = 3)$, then there are two options, $(1, 2)$ and $(2, 1)$, which gives $\frac{2}{36} = \frac{1}{18}$. Now, you can do the calculations for other values and compare the result with the following diagram.
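If you want to verify your calculations, here is a small Python sketch that tallies all 36 pairs and prints the distribution of the sum:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely pairs produce each sum,
# then divide by 36 to get the probability distribution of X.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
distribution = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

for s, p in distribution.items():
    print(s, p)  # 2 -> 1/36, 3 -> 1/18, ..., 7 -> 1/6, ..., 12 -> 1/36
```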
Let's consider another example: the distribution of a random variable representing the number of siblings a person has. In this case, the random variable can take on the discrete values $0, 1, 2$, and so on. Let's assume that we have a sample of 10 people (the counts here are purely illustrative): 4 of them have no siblings, 3 have one sibling, 2 have two siblings, and 1 has three siblings. The following diagram represents the distribution in this case.
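The empirical distribution is easy to compute from such a sample; the counts below are the illustrative ones assumed above:

```python
from collections import Counter

# An assumed sample of 10 people, one entry per person's sibling count.
siblings = [0, 0, 0, 0, 1, 1, 1, 2, 2, 3]

counts = Counter(siblings)
empirical = {k: c / len(siblings) for k, c in sorted(counts.items())}
print(empirical)  # {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}
```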
There are two things to remember. First, a simple table or diagram is enough to represent any distribution. Second, our distribution is independent of $\Omega$: two different sample spaces can produce the same distribution. Thus, a variable's set of values and its distribution fully characterize every random variable.
Common discrete distributions
As you progress in probability, you will encounter many distributions, but here we will focus on the two most common ones.
First, let's consider the Bernoulli distribution. It is commonly used when we have one yes-no question, like tossing a coin. It is a special case of the Binomial distribution, which is used when we have a set of identical yes-no questions.
Let's discuss the Binomial distribution in more detail. Imagine that we toss a coin $n$ times and count how many times tails appear. We can generalize it even further: on every toss, there is a probability of success ($p$) and a probability of failure ($1 - p$). It is crucial for the experiments to be independent of each other. If we want to count the number of successful outcomes, we can refer to binary strings again. A random variable $X$ then gives the number of successes in a series of $n$ similar experiments, and the probability of exactly $k$ successes is:

$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}.$
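A minimal sketch of this formula in Python:

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# A fair coin tossed 3 times: the probability of exactly 2 tails.
print(binomial_pmf(2, 3, 0.5))  # 0.375, i.e. 3/8
```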
Independence
Let's go back to the dice example and consider one more random variable $Y$ that's equal to the product of the scores:

$Y((a, b)) = a \cdot b.$
If we need to calculate the probability of a set of values, we can refer to the joint distribution. In general, it is extremely difficult or even impossible to calculate a joint distribution. There is one important exception, and that is the independence of random variables. Two random variables $X$ and $Y$ are independent if:

$P(X = x,\ Y = y) = P(X = x) \cdot P(Y = y)$

for all $x$ and $y$.
So, $X$ and $Y$ from the dice example are not independent because:

$P(X = 12,\ Y = 36) = \frac{1}{36} \neq P(X = 12) \cdot P(Y = 36) = \frac{1}{36} \cdot \frac{1}{36} = \frac{1}{1296}.$
When the sum of two dice equals $12$, we know that $6$ is the outcome for each die, so the product is forced to equal $36$. However, if a random variable $U$ represents the outcome of the first die and $V$ represents the outcome of the second die, they are independent. For all $a$ and $b$, the following is true:

$P(U = a,\ V = b) = P(U = a) \cdot P(V = b) = \frac{1}{6} \cdot \frac{1}{6} = \frac{1}{36}.$
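We can confirm both claims numerically. The sketch below (the names `U`, `V`, and `prob` are our own) checks the dependence of the sum and the product, and the independence of the two individual dice:

```python
from fractions import Fraction
from itertools import product

pairs = list(product(range(1, 7), repeat=2))
P = Fraction(1, 36)  # every ordered pair is equally likely

def prob(event):
    """Probability of the set of pairs satisfying the predicate."""
    return sum(P for pair in pairs if event(pair))

# Sum X and product Y are dependent: the joint probability of
# X = 12 and Y = 36 differs from the product of the marginals.
joint = prob(lambda p: p[0] + p[1] == 12 and p[0] * p[1] == 36)
marginals = prob(lambda p: p[0] + p[1] == 12) * prob(lambda p: p[0] * p[1] == 36)
print(joint, marginals)  # 1/36 vs 1/1296

# The individual dice U and V are independent: equality holds everywhere.
print(all(
    prob(lambda p: p == (a, b))
    == prob(lambda p: p[0] == a) * prob(lambda p: p[1] == b)
    for a, b in pairs
))  # True
```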
Later we will see that the independence of random variables greatly simplifies many calculations.
Now we know the basic definitions and properties of random variables and distributions. But what are they needed for? Let's find out in the next section!
Calculations and expectations
If you know the distribution, you can estimate the expected value of a random variable. It is a measure of its average or long-term value. It represents the average outcome we would expect to observe if we repeated an experiment or observation many times.
Calculating the expected value can be a little tricky, and it deserves a separate topic. Here we'll just briefly discuss what can be concluded using the expected value. Imagine that you were asked whether it is challenging to get a sum of at least 30 when throwing 10 dice (the numbers here are illustrative). For each die, the expected value is $3.5$, so for 10 dice the sum will be roughly $35$. So it is not that challenging to get a sum of at least 30.
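A quick sketch of that estimate (the number of dice and the threshold are the illustrative values from above):

```python
from fractions import Fraction

# Expected value of one fair die: (1 + 2 + ... + 6) / 6 = 3.5.
e_one = sum(Fraction(k, 6) for k in range(1, 7))
print(e_one)           # 7/2

n_dice = 10            # an assumed number of dice
print(e_one * n_dice)  # 35, so a sum of at least 30 is quite attainable
```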
You can also estimate more properties using distributions, but this will be discussed in the following topics. There's one thing left before finishing our topic: hints for solving tasks.
Challenges with discrete random variables
Here are some points that you should remember when dealing with discrete random variables.
The gambler's fallacy. It is a cognitive bias that occurs when people believe that the outcome of a random event is more or less likely based on previous outcomes, even when the events are statistically independent. For example, let's consider a game of flipping a fair coin. If the coin lands on heads five times in a row, someone experiencing the gambler's fallacy might incorrectly believe that the next flip is more likely to result in tails, thinking that "tails is due" or "it's about time for tails to come up." However, in reality, the probability of getting heads or tails on each flip remains $0.5$, regardless of the previous outcomes. The outcome of one coin flip does not influence the outcome of subsequent flips.
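A simulation makes this easy to see. The sketch below estimates the probability of tails right after five heads in a row; independence keeps it near 0.5:

```python
import random

random.seed(0)
given = tails_next = 0
while given < 10_000:
    flips = [random.random() < 0.5 for _ in range(6)]  # True means heads
    if all(flips[:5]):        # condition: the first five flips are heads
        given += 1
        tails_next += not flips[5]
print(tails_next / given)     # approximately 0.5
```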
Conditions are important. Let's consider an example of rolling a fair six-sided die. Each outcome is equally likely to occur, so the probability of getting any specific number is $\frac{1}{6}$. Now, let's modify the example slightly. Suppose we have a biased six-sided die where (picking illustrative values) the probability of rolling a $1$ is $\frac{1}{4}$, the probability of rolling a $6$ is $\frac{1}{4}$, and the remaining numbers ($2$, $3$, $4$, and $5$) each have a probability of $\frac{1}{8}$. To calculate the probability of a compound event, let's say we want to find the probability of rolling an even number and then rolling a $6$ on the next roll. The probability of rolling an even number on the first roll is the sum of the probabilities of rolling a $2$, $4$, or $6$, which is $\frac{1}{8} + \frac{1}{8} + \frac{1}{4} = \frac{1}{2}$. Now, since we want to roll a $6$ on the next roll, the probability of rolling a $6$ is $\frac{1}{4}$. To find the probability of both events occurring, we multiply the individual probabilities together: $\frac{1}{2} \cdot \frac{1}{4} = \frac{1}{8}$.
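The same computation in Python, using the illustrative probabilities assumed above:

```python
from fractions import Fraction

# The assumed biased die: P(1) = P(6) = 1/4, the other faces 1/8 each.
p = {1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8),
     4: Fraction(1, 8), 5: Fraction(1, 8), 6: Fraction(1, 4)}
assert sum(p.values()) == 1  # sanity check: probabilities sum to 1

p_even = p[2] + p[4] + p[6]  # 1/2
p_six = p[6]                 # 1/4
print(p_even * p_six)        # 1/8, since the two rolls are independent
```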
Dealing with samples of large size. If you're dealing with a large sample, it might be useful to find a subset of the sample or redefine the problem as its opposite and then use the outcome to find what is asked. For example, suppose we flip a fair coin $n$ times and count the number of heads. The random variable in this case is the number of heads obtained; it can take values from $0$ to $n$, and its probability distribution is binomial. Redefining the problem as its opposite means considering the number of tails instead: since each flip is either heads or tails, the number of tails equals $n$ minus the number of heads, so it is also a random variable taking values from $0$ to $n$, and its distribution is binomial as well. Looking at the complement of a random variable like this gives a different perspective and can simplify calculating probabilities or comparing different outcomes.
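A short check of this symmetry, assuming a fair coin and the binomial PMF from earlier:

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p=Fraction(1, 2)):
    """Probability of exactly k successes in n fair trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n = 10
# P(heads = k) equals P(tails = n - k); for a fair coin the distribution
# is symmetric, so heads and tails are both Binomial(n, 1/2).
print(all(binomial_pmf(k, n) == binomial_pmf(n - k, n)
          for k in range(n + 1)))  # True
```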
Conclusion
Let's summarize what we have learned:
We deal with discrete random variables in everyday situations when we pick or choose something.
We can use a function from $\Omega$ to $\mathbb{Z}$ to avoid dealing with a specific $\Omega$.
We can consider the distribution of a discrete random variable to be a simple table where each of its values is associated with the probability of the variable taking that specific value. Diagrams help to visualize it.
The Bernoulli distribution is a special case of the binomial distribution with only one trial.
Dependence and independence affect calculations with random variables.
If you know the distribution, you can estimate the expected value of the random variable, which shows its average or long-term value.
Don't forget about the gambler's fallacy and the importance of conditions when dealing with random variables.
If you deal with large sample sizes, it might be useful to use subsets of the sample or even redefine the problem.