
Quite often, it is necessary to estimate the behavior of a random variable as a whole. Imagine you walk into a casino. You have enough money to spend the whole night playing and want to have a good time. There are two games in this casino where you can win double the amount of the bet. In the first one the bet is 100 and the probability of winning is 51%. In the second game the bet is 200 and you win with a probability of 55%. Naturally, you would play the second game, because in the long run you expect to win more (or lose less) than in the first game. We made the choice intuitively, based on the average outcome of each game. By doing this we essentially found and compared their expected values, in other words, the expected monetary outcomes of both bets. In this topic we will learn the formal definition of the expected value, how to calculate it, and where and when to use it.

Definition of expected value

Let's define the expected value in the case of a discrete random variable with a finite set of possible values. Consider a random variable X that takes the values x_1, x_2, \dots, x_i, \dots, x_k with probabilities p_1, p_2, \dots, p_i, \dots, p_k, respectively. Then the expected value of X is defined as

\mathbb{E}[X] = \sum_{i=1}^k x_i \cdot p_i = x_1 \cdot p_1 + x_2 \cdot p_2 + \cdots + x_k \cdot p_k

But what exactly stands behind this formula? Let's look at a classic example. The random variable X represents the outcome of a roll of a typical die: the possible values are 1, 2, 3, 4, 5, and 6, and we are equally likely to get any of them, each with probability \frac{1}{6}. Using the definition, we can calculate the expected value of X:

\mathbb{E}X = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + 3 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 5 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6} = 3.5

This means that every time we roll a die, the value we expect to get on average is 3.5. But if we had a different die, for example a 7-sided one with 2 of its sides showing the number 6, the expected value of our random variable would change:

\mathbb{E}[X] = 1 \cdot \frac{1}{7} + 2 \cdot \frac{1}{7} + 3 \cdot \frac{1}{7} + 4 \cdot \frac{1}{7} + 5 \cdot \frac{1}{7} + 6 \cdot \frac{2}{7} = \frac{27}{7} \approx 3.86
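Both die calculations can be reproduced with a few lines of Python. This is a sketch written directly from the definition above; the helper name expected_value is ours:

```python
from fractions import Fraction

def expected_value(values, probs):
    """E[X] = sum of x_i * p_i over all possible values."""
    probs = list(probs)
    assert sum(probs) == 1, "probabilities must sum to 1"
    return sum(x * p for x, p in zip(values, probs))

# Fair 6-sided die: each face 1..6 with probability 1/6.
fair = expected_value(range(1, 7), [Fraction(1, 6)] * 6)
print(fair)  # 7/2, i.e. 3.5

# 7-sided die with two faces showing 6: values 1..5 with probability 1/7 each,
# and the value 6 with probability 2/7.
odd_die = expected_value([1, 2, 3, 4, 5, 6], [Fraction(1, 7)] * 5 + [Fraction(2, 7)])
print(odd_die)  # 27/7, roughly 3.86
```

Using Fraction keeps the arithmetic exact, so the results match the hand computation digit for digit.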

Having mastered the dice example, let's apply our new formula somewhere else. Imagine you work in the analytical department of a bank, and your job is to decide whether to give out loans. A person comes to the bank and asks for a loan of 100 dollars at an interest rate of 10 percent. You have to decide whether to grant the loan.

For convenience, we shall consider that there are only two possible outcomes:

  • the person will pay off the loan, that is, they will return 110 dollars to the bank;

  • the person will not pay anything.

Also, assume that the bank knows (or at least can estimate) the probability of each of these events. In our example, let the client be a responsible young man who will repay the loan with probability 0.95 and will pay nothing with probability 0.05. To make a decision, you have to compute the expected profit:

10 dollars (potential profit, also known as interest on the loan) × 0.95 (probability of repayment of the loan) − 100 dollars (potential loss) × 0.05 (probability of total default of the loan) = 4.5 dollars > 0.

So "average" returns are positive — and following your report the bank decides to lend.
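The loan arithmetic above can be checked with a tiny Python sketch. The function expected_profit is our own helper for this two-outcome model, not any real banking API:

```python
def expected_profit(interest, principal, p_repay):
    """Two-outcome loan model: earn `interest` with probability p_repay,
    lose `principal` with probability 1 - p_repay."""
    return interest * p_repay - principal * (1 - p_repay)

profit = expected_profit(interest=10, principal=100, p_repay=0.95)
print(round(profit, 2))  # 4.5, so the expected return is positive
```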


Sometimes you can find different notations for the expected value of a random variable: \mathbb{E}X = \mathbb{E}[X] = \mathbb{E}(X) = \mu_X

But what about the expectation of discrete random variables with an infinite set of values? Well, the good news is that the formula stays almost the same:

\mathbb{E}[X] = \sum_{x_i \in R_x} x_i \cdot p_i

Here the random variable X takes values from the countably infinite set R_x = \{x_1, x_2, x_3, \dots\}. In this case, we can think of the expected value as the average of the observed data if we repeat the experiment an infinite number of times.

Finding the expectation in the examples above was easy. However, to find an infinite sum, you might need a little more math. Don't worry, the mathematicians have already done it for you! In a few minutes, you will find out how to calculate the expectation for the most useful random variables with an infinite set of values.

Unfortunately, the infinite sum mentioned above is often equal to infinity. Here is an example of such a pitiful situation.

Imagine you come to a casino, and you are offered a game of chance with the following rules. The pot starts at 2 dollars and is doubled every time a head appears; the first tail ends the game, and the player wins whatever is in the pot. So the player gets 2^k dollars, where k is the total number of tosses (the first k - 1 tosses are heads and the k-th is a tail). We already know that to estimate the potential gain, we need to calculate the expected value. Let's try to do this by definition:

\mathbb{E}X = 2 \cdot \frac{1}{2} + 4 \cdot \frac{1}{4} + 8 \cdot \frac{1}{8} + \cdots = 1 + 1 + 1 + \cdots = \infty

As you see, the expected win is an infinite amount of money. So, don't miss out!
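The divergence is easy to feel in a simulation. In this Python sketch (play_once is our own helper), the sample mean of the winnings does not settle down as the number of games grows, unlike for a variable with a finite expectation:

```python
import random

def play_once(rng):
    """One game: the pot starts at 2 and doubles on every head; the first tail ends it."""
    pot = 2
    while rng.random() < 0.5:  # heads with probability 1/2
        pot *= 2
    return pot

rng = random.Random(0)
for n in (100, 10_000, 1_000_000):
    mean = sum(play_once(rng) for _ in range(n)) / n
    print(n, mean)  # the sample mean does not stabilize as n grows
```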

Properties of expected value

Mathematical expectation has some useful properties that can seriously facilitate calculations.

For example, let's calculate the expected value of the sum of the dots rolled in 3 tosses of a die. Above, we have already calculated the expected value of one roll of a 6-sided die. Therefore, all we have to do is multiply the expected value we got previously by three:

\mathbb{E}[3X] = 3 \times \mathbb{E}[X] = 3 \times 3.5 = 10.5

This is possible because of the linearity of the expected value.

Let a, b \in \mathbb{R} be constants and X, Y be discrete random variables. Then:

\mathbb{E}[aX + bY] = a\mathbb{E}X + b\mathbb{E}Y

Take a look at another example. Let's say that now one die in the game is a normal 6-sided one, and the other is 3-sided. The random variable X corresponds to the number we get on the 6-sided die, and Y to the one we get on the 3-sided one. We'll calculate the expected value of one roll of both dice:

\mathbb{E}[X+Y] = \mathbb{E}[X] + \mathbb{E}[Y] = 3.5 + \left(1 \times \frac{1}{3} + 2 \times \frac{1}{3} + 3 \times \frac{1}{3}\right) = 3.5 + 2 = 5.5

Finally, we'll calculate the expected value of three rolls of the 6-sided die and four rolls of the 3-sided one:

\mathbb{E}[3X+4Y] = 3 \times \mathbb{E}[X] + 4 \times \mathbb{E}[Y] = 3 \times 3.5 + 4 \times 2 = 10.5 + 8 = 18.5
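The linearity computation can be verified by brute force over the joint distribution. This Python sketch assumes the two dice are independent (which only matters for enumerating the joint outcomes; linearity itself needs no independence):

```python
from fractions import Fraction
from itertools import product

six = [(v, Fraction(1, 6)) for v in range(1, 7)]    # 6-sided die X
three = [(v, Fraction(1, 3)) for v in range(1, 4)]  # 3-sided die Y

# E[3X + 4Y] computed directly over every joint outcome of X and Y
lhs = sum((3 * x + 4 * y) * px * py
          for (x, px), (y, py) in product(six, three))

# 3 E[X] + 4 E[Y] via linearity
ex = sum(v * p for v, p in six)    # 7/2 = 3.5
ey = sum(v * p for v, p in three)  # 2
print(lhs, 3 * ex + 4 * ey)        # both are 37/2 = 18.5
```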

One more significant property is necessary for comparing random variables and making estimates about them. For example, there are two mechanics in an auto repair shop, and they are paid in proportion to how many customers they serve per day. One of them is more experienced and manages to serve from 3 to 5 customers every day. The second mechanic has started working recently and performs tasks more slowly, managing to serve only 2-3 clients per day. It is clear that the "average" salary of the first mechanic will be bigger than that of the second one.

In such cases, we speak about the monotonicity of the expected value.

Here is the formal definition of this property: let X, Y be discrete random variables such that X \leq Y, that is, x_i \leq y_j for all i, j (every possible value of X is at most every possible value of Y). Then:

\mathbb{E}[X] \leq \mathbb{E}[Y]

But the expectation of a product of random variables is not always so easy to calculate. The reason is the non-multiplicative nature of the expected value.

Let X,YX, Yare discrete random variables, then E[XY]\mathbb{E}[X \cdot Y] is not necessarily equal to E[X]E[Y]\mathbb{E}[X]\cdot\mathbb{E}[Y]. Let's look at the following example: X:={0,p1=121,p2=12X := \begin{cases} 0, & p_1 = \frac{1}{2} \\ 1, & p_2 = \frac{1}{2} \end{cases}

Y := X^2 = \begin{cases} 0^2, & p_1 = \frac{1}{2} \\ 1^2, & p_2 = \frac{1}{2} \end{cases} = \begin{cases} 0, & p_1 = \frac{1}{2} \\ 1, & p_2 = \frac{1}{2} \end{cases}

What will the expectation of the product of these random variables be equal to?

\mathbb{E}[XY] = \mathbb{E}[X^3] = 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{1}{2}

\mathbb{E}[X] = \mathbb{E}[Y] = 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{1}{2}

\mathbb{E}[X] \cdot \mathbb{E}[Y] = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4} \neq \frac{1}{2} = \mathbb{E}[X \cdot Y]
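The counterexample is easy to reproduce in Python; the variable names below are ours:

```python
from fractions import Fraction

half = Fraction(1, 2)
# X takes 0 and 1 with probability 1/2 each; Y = X^2 has the same distribution.
outcomes = [(0, half), (1, half)]

e_x = sum(x * p for x, p in outcomes)             # E[X]  = 1/2
e_y = sum((x ** 2) * p for x, p in outcomes)      # E[Y]  = 1/2
e_xy = sum(x * (x ** 2) * p for x, p in outcomes) # E[XY] = E[X^3] = 1/2

print(e_xy, e_x * e_y)  # 1/2 vs 1/4: E[XY] != E[X] * E[Y] here
```

Note that X and Y are summed over the same outcomes, since Y is a function of X; this dependence is exactly what breaks the multiplicative property.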

However, for independent random variables the multiplicative property does hold: if X, Y are independent discrete random variables, then \mathbb{E}[X \cdot Y] = \mathbb{E}[X] \cdot \mathbb{E}[Y].

In contrast to the multiplicative property, the linearity of the expected value holds regardless of whether the discrete random variables are dependent or not.

Expected value of special discrete distributions

So how do we calculate the expectation for the most important discrete distributions?

Bernoulli distribution

Let's begin with the easiest one. As you may remember, a random variable X with the Bernoulli distribution takes the value 1 with probability p and the value 0 with probability 1 - p. If we insert these numbers into the formula for the expected value, we get:

\mathbb{E}[X] = \sum_{i=0}^1 x_i \cdot p_i = x_0 \cdot p_0 + x_1 \cdot p_1 = 0 \cdot (1-p) + 1 \cdot p = 0 + p = p

So, for a random variable X with the Bernoulli distribution, the expected value is simply the probability of taking the value 1:

\mathbb{E}[X] = p

Imagine you play a game, and your chances of winning are 70%. A random variable X describes the outcome of a single game, and naturally, it has the Bernoulli distribution, because there are only 2 possible outcomes: you either win or lose. Bearing in mind the probability of winning, you intuitively expect to win 70% of the games you play.
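A quick simulation illustrates this. In the Python sketch below (simulate_bernoulli is our own helper), the average of many Bernoulli draws settles near p:

```python
import random

def simulate_bernoulli(p, n, rng):
    """Average of n Bernoulli(p) draws; it should approach E[X] = p."""
    return sum(1 if rng.random() < p else 0 for _ in range(n)) / n

rng = random.Random(42)
print(simulate_bernoulli(0.7, 100_000, rng))  # close to 0.7
```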

Geometric distribution

Recall the formula of Geometric distribution:

P_x(k) = q^{k-1} \cdot p, for k = 1, 2, 3, \dots, where 0 < p < 1 and q = 1 - p.

Then the expected value of a random variable X with the geometric distribution above (the number of trials up to and including the first success) is:

\mathbb{E}[X] = \frac{1}{p}

Equivalently, the expected number of failures before the first success is \frac{1-p}{p} = \frac{q}{p}.

Here is how it works. Suppose a breeder is determined to breed a rare-colored puppy and wonders how many puppies will be born before the puppy of the desired color appears. Let the probability of success (the birth of a rare-colored puppy) be p = 0.1, so the probability of failure (the birth of a regular-colored puppy) is q = 0.9. The expected number of failures before the first success is:

\frac{q}{p} = \frac{0.9}{0.1} = 9

Thus, "on average", the breeder will have nine regular-colored puppies before a rare-colored puppy appears.
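The breeder's situation can be simulated. In this Python sketch (failures_before_success is our own helper), the average number of regular-colored puppies before the first rare one approaches 9:

```python
import random

def failures_before_success(p, rng):
    """Number of failures before the first success, each trial succeeding with probability p."""
    k = 0
    while rng.random() >= p:  # this trial is a failure
        k += 1
    return k

rng = random.Random(7)
n = 100_000
mean = sum(failures_before_success(0.1, rng) for _ in range(n)) / n
print(mean)  # close to 0.9 / 0.1 = 9
```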


Binomial distribution

As for X \sim \mathrm{Binomial}(n, p), the formula for the expected value is even simpler:

\mathbb{E}[X] = np

Let's look again at our breeder. Imagine that the breeder, frightened by such an impressive number of potential puppies, decided to get only 5 puppies, and now wonders how many rare-colored puppies out of these 5 there will be "on average". In this case, the probability of success (the birth of a rare-colored puppy) is p = 0.1. Let's calculate the expected value:

\mathbb{E}[X] = 5 \cdot 0.1 = 0.5

What does this result tell us? Of course, the breeder can't have half a puppy. It only means that, "on average", 5 attempts are sadly not enough for the dream to come true. If the breeder decides to have 10 puppies instead, then the expectation looks like this:

\mathbb{E}[X] = 10 \cdot 0.1 = 1

If you think about it, this is exactly the result we expected: if the probability of a rare-colored puppy's birth is 0.1, then we intuitively understand that one rare-colored puppy should appear in every 10 puppies.
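We can also confirm \mathbb{E}[X] = np by summing the binomial pmf directly. A short Python sketch (binomial_mean is our own helper):

```python
from math import comb

def binomial_mean(n, p):
    """E[X] for X ~ Binomial(n, p), summed directly from the pmf k * C(n, k) * p^k * (1-p)^(n-k)."""
    return sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

print(binomial_mean(5, 0.1))   # close to 5 * 0.1 = 0.5
print(binomial_mean(10, 0.1))  # close to 10 * 0.1 = 1.0
```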

Poisson distribution

At last, let's not forget the Poisson distribution formula:

P_x(k) = \frac{e^{-\lambda} \cdot \lambda^k}{k!}, for k = 0, 1, 2, \dots

This distribution seems to be the most complicated of all those considered. It may be hard to believe that when conducting a series of experiments, on average we will get exactly \lambda. But it's really so, because for the Poisson distribution

\mathbb{E}[X] = \lambda

Recall that when the Poisson distribution is used to approximate a binomial one, \lambda = n \cdot p. In fact, the parameter \lambda is chosen precisely as the expected value of the random variable under consideration, so this result once again confirms that the parameter was chosen correctly.
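The same direct check works here: truncating the infinite sum \sum_k k \cdot P(k) recovers \lambda, since the tail terms are negligible. A Python sketch (poisson_mean is our own helper):

```python
from math import exp

def poisson_mean(lam, terms=60):
    """Approximate E[X] for X ~ Poisson(lam) by truncating the series sum of k * P(X = k)."""
    p = exp(-lam)          # P(X = 0)
    total = 0.0
    for k in range(1, terms):
        p *= lam / k       # P(X = k) from P(X = k - 1)
        total += k * p
    return total

print(poisson_mean(3.0))   # very close to lambda = 3
```

Computing each pmf value from the previous one avoids overflowing factorials for larger k.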

Conclusion

Let's summarize what we have learned. First, we formally defined the expected value: the value you expect to get on average in every single attempt or event. Then we looked at its properties of linearity and monotonicity, which describe the mathematical behavior of expected values and make calculating them easier. Finally, we calculated the expected values for some special cases: the expected value of the geometric distribution is \frac{1}{p} (equivalently, \frac{1-p}{p} expected failures before the first success), the expected value of the binomial distribution is np, and the expected value of the Poisson distribution is \lambda.
