
Formula of total probability

First of all, it's important to remember the definition of mutually exclusive (disjoint) events: two events are mutually exclusive if they cannot both occur at the same time. For example, the events "a randomly chosen number is greater than 11" and "a randomly chosen number is less than 11" are disjoint.

Next, we need one more definition. When a sample space $\Omega$ is partitioned into mutually exclusive events $H_1, H_2, \dots$ (a finite or infinite collection) whose union forms the sample space itself, $H_1 \cup H_2 \cup \dots = \Omega$, the events $H_1, H_2, \dots$ are called exhaustive events. The events $H_1, H_2, \dots$ are also called hypotheses.

The last result in this section is the law of total probability. Suppose that $H_1, H_2, \dots$ are exhaustive events; then the probability of any event $A$ can be calculated in the following way ($n$ can be $+\infty$):

$$\mathbb{P}(A) = \sum\limits_{i=1}^{n}\mathbb{P}(A|H_i)\,\mathbb{P}(H_i)$$

Why is this formula correct? Event $A$ can be represented as a union of the events $A \cap H_i$. These events are pairwise non-intersecting, because the events $H_1, H_2, \dots$ are mutually exclusive. Therefore:

$$\mathbb{P}(A) = \sum_{i=1}^{n}\mathbb{P}(A \cap H_i)$$

Next, we use the definition of the conditional probability:

$$\mathbb{P}(A \cap H_i) = \mathbb{P}(A|H_i)\,\mathbb{P}(H_i)$$

Finally, combining the two, we obtain the formula stated above:

$$\mathbb{P}(A) = \sum_{i=1}^{n}\mathbb{P}(A|H_i)\,\mathbb{P}(H_i)$$

[Figure: the oval (event $A$) is divided into 4 parts $A \cap H_i$ by the hypotheses]

Let's see a simple example of using the total probability formula. Three factories produce the same pills. The first one produces $20\%$ of all pills, the second one $30\%$, and the third one $50\%$. Moreover, $5\%$ of the first factory's pills are defective, $3\%$ of the second's, and $4\%$ of the third's. What is the probability of buying a defective pill?

Let the event $H_i$ mean that a pill was produced by factory number $i$, and let the event $A$ be buying a defective pill.

We have $\mathbb{P}(H_1) = 0.2$, $\mathbb{P}(H_2) = 0.3$, $\mathbb{P}(H_3) = 0.5$, and $\mathbb{P}(A|H_1) = 0.05$, $\mathbb{P}(A|H_2) = 0.03$, $\mathbb{P}(A|H_3) = 0.04$.

Using the formula of total probability we get the answer:

$$\mathbb{P}(A) = \mathbb{P}(A|H_1)\,\mathbb{P}(H_1) + \mathbb{P}(A|H_2)\,\mathbb{P}(H_2) + \mathbb{P}(A|H_3)\,\mathbb{P}(H_3) = 0.05 \cdot 0.2 + 0.03 \cdot 0.3 + 0.04 \cdot 0.5 = 0.039$$
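The computation above can be sketched in a few lines of Python (a minimal illustration; the two lists simply restate the numbers from the problem):

```python
# Law of total probability: P(A) = sum over i of P(A|H_i) * P(H_i)
hyp_probs = [0.2, 0.3, 0.5]        # P(H_i): each factory's share of production
defect_rates = [0.05, 0.03, 0.04]  # P(A|H_i): defect rate at each factory

p_defective = sum(p_a_given_h * p_h
                  for p_a_given_h, p_h in zip(defect_rates, hyp_probs))
print(round(p_defective, 3))  # 0.039
```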

Bayes' theorem: simple form

In this section, we introduce the simple form of Bayes' formula.

Suppose there are events $A$ and $B$ with $\mathbb{P}(A) > 0$ and $\mathbb{P}(B) > 0$.

First, let's look at the definition of conditional probability:

$$\mathbb{P}(A|B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}$$

But, on the other hand:

$$\mathbb{P}(B|A) = \frac{\mathbb{P}(B \cap A)}{\mathbb{P}(A)} = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(A)}$$

Notice that both fractions have the same numerator, $\mathbb{P}(A \cap B)$, so we can equate them:

$$\mathbb{P}(A|B)\,\mathbb{P}(B) = \mathbb{P}(B|A)\,\mathbb{P}(A)$$

Dividing both sides by $\mathbb{P}(B)$ gives the simple form of Bayes' theorem:

$$\mathbb{P}(A|B) = \frac{\mathbb{P}(B|A)\,\mathbb{P}(A)}{\mathbb{P}(B)}$$
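The rearrangement can be made concrete with a tiny Python helper (the function name and the sample numbers below are illustrative, not from the text):

```python
def bayes(p_b_given_a, p_a, p_b):
    """Simple form of Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Hypothetical numbers: P(B|A) = 0.4, P(A) = 0.3, P(B) = 0.5
p_a_given_b = bayes(0.4, 0.3, 0.5)
print(round(p_a_given_b, 2))  # 0.24
```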

With this formula we can calculate a conditional probability without knowing the probability of $A \cap B$.

Extended Bayes' theorem

So, we have the formula $\mathbb{P}(A|B) = \frac{\mathbb{P}(B|A)\mathbb{P}(A)}{\mathbb{P}(B)}$. It's easy to remember and apply, but in real problems the probability of event $B$ is often unknown. In that case we can express $\mathbb{P}(B)$ with the formula of total probability:

$$\mathbb{P}(A|B) = \frac{\mathbb{P}(B|A)\,\mathbb{P}(A)}{\sum\limits_{i=1}^{n}\mathbb{P}(B|H_i)\,\mathbb{P}(H_i)}$$

where $H_1, \dots, H_n$ are hypotheses.

Okay, now let's apply Bayes' theorem with some hypothesis $H_k$ in place of $A$ and the event $A$ in place of $B$, to judge how trustworthy this hypothesis is.

Suppose we have hypotheses $H_1, H_2, \dots, H_n$ and some event $A$.

Using the previous formula, we know that

$$\mathbb{P}(H_k|A) = \frac{\mathbb{P}(A|H_k)\,\mathbb{P}(H_k)}{\mathbb{P}(A)}$$

We can use the formula of total probability for the event $A$, as we have done before:

$$\mathbb{P}(H_k|A) = \frac{\mathbb{P}(A|H_k)\,\mathbb{P}(H_k)}{\mathbb{P}(A)} = \frac{\mathbb{P}(A|H_k)\,\mathbb{P}(H_k)}{\sum\limits_{i=1}^{n}\mathbb{P}(A|H_i)\,\mathbb{P}(H_i)}$$

The last formula is called Bayes' theorem or Bayes' formula.

With Bayes' theorem we can recalculate the probabilities of the hypotheses when some information about the result of the experiment is known.

Indeed, if some event $A$ happened and we know all the probabilities $\mathbb{P}(A|H_i)$ and $\mathbb{P}(H_i)$, we can calculate the probability that a particular hypothesis $H_k$ actually occurred.

For example, let's return to the problem about the factories from the previous section. Now we need to find the probability that a purchased pill was produced by the first factory, given that the pill is defective.

As before, $H_i$ means that a pill was produced by factory number $i$, and the event $A$ is buying a defective pill. Formally, we need to find $\mathbb{P}(H_1|A)$. We have already calculated that $\mathbb{P}(A) = 0.039$.

Let's use Bayes' theorem:

$$\mathbb{P}(H_1|A) = \frac{\mathbb{P}(A|H_1)\,\mathbb{P}(H_1)}{\mathbb{P}(A|H_1)\,\mathbb{P}(H_1) + \mathbb{P}(A|H_2)\,\mathbb{P}(H_2) + \mathbb{P}(A|H_3)\,\mathbb{P}(H_3)} = \frac{0.05 \cdot 0.2}{0.039} \approx 0.256$$
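The same calculation in Python; as a sketch it also computes the posteriors of the other two factories, which the problem doesn't ask for but which conveniently sum to $1$ (a quick sanity check):

```python
hyp_probs = [0.2, 0.3, 0.5]        # priors P(H_i)
defect_rates = [0.05, 0.03, 0.04]  # likelihoods P(A|H_i)

# Denominator: P(A) by the law of total probability
p_a = sum(d * h for d, h in zip(defect_rates, hyp_probs))

# Bayes' theorem: P(H_i|A) = P(A|H_i) * P(H_i) / P(A)
posteriors = [d * h / p_a for d, h in zip(defect_rates, hyp_probs)]
print([round(p, 3) for p in posteriors])  # [0.256, 0.231, 0.513]
```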

In this way we recalculated the probability of having bought a pill from the first factory. If we didn't know that the pill was defective, the probability of having bought a pill from the first factory would simply be $\mathbb{P}(H_1) = 0.2$.

Example

Let's consider an example: two shooters toss a coin to decide which of them will shoot. If it comes up heads, the first one shoots; if it comes up tails, the second one. The probability that the first shooter hits the target is $1$; for the second one it is $10^{-4}$. Suppose we know the result of the experiment: the target was hit. For each shooter, what is the probability that he was the one who shot?

We have $2$ hypotheses: $H_1$ is that the first shooter shot, and $H_2$ is for the second. Also, let the event $A$ mean that the target was hit. We know that $\mathbb{P}(A|H_1) = 1$ and $\mathbb{P}(A|H_2) = 10^{-4}$. We need to calculate $\mathbb{P}(H_1|A)$ and $\mathbb{P}(H_2|A)$.

Notice that before we learn the result of the experiment, the coin toss gives $\mathbb{P}(H_1) = \mathbb{P}(H_2) = \frac{1}{2}$.

Now we are ready to use Bayes' formula.

$$\mathbb{P}(H_i|A) = \frac{\mathbb{P}(H_i)\,\mathbb{P}(A|H_i)}{\mathbb{P}(H_1)\,\mathbb{P}(A|H_1) + \mathbb{P}(H_2)\,\mathbb{P}(A|H_2)}$$

$$\mathbb{P}(H_1|A) = \dfrac{\frac{1}{2}\cdot 1}{\frac{1}{2} \cdot 1 + \frac{1}{2} \cdot 10^{-4}} \approx 0.9999$$

$$\mathbb{P}(H_2|A) = \dfrac{\frac{1}{2}\cdot 10^{-4}}{\frac{1}{2} \cdot 1 + \frac{1}{2} \cdot 10^{-4}} \approx 0.0001$$
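The two posteriors can be computed in one pass (a sketch; the variable names are mine):

```python
priors = [0.5, 0.5]      # fair coin: P(H_1) = P(H_2) = 1/2
hit_probs = [1.0, 1e-4]  # P(A|H_1), P(A|H_2)

p_hit = sum(l * p for l, p in zip(hit_probs, priors))  # P(A) by total probability
posteriors = [l * p / p_hit for l, p in zip(hit_probs, priors)]

print(posteriors[0])  # ~0.9999: the first shooter almost certainly shot
print(posteriors[1])  # ~0.0001
```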

So, while the two hypotheses start out equally likely, once we know the target was hit, the probability that the coin came up heads is $10^4$ times greater than the probability that it came up tails.

Conclusion

We have now learned two very important formulas: the formula of total probability and Bayes' formula. The formula of total probability helps us calculate a probability using exhaustive events, and Bayes' formula lets us recalculate the probabilities of hypotheses when the result of an experiment is known. Both formulas are widely used in the following topics.
