Keras 2.x Projects

Bayes' theorem

In the previous section, Bayesian decision theory, we learned how to calculate many types of probabilities. Now it is time to put these skills to use by deriving Bayes' theorem.

Let A and B be two dependent events. Previously, we said that the joint probability of the two events is calculated using the following formula:

    P(A ∩ B) = P(A) × P(B|A)
Or, similarly, using the following formula:

    P(A ∩ B) = P(B) × P(A|B)
Comparing the two formulas, it is clear that their left-hand sides are equal. This implies that their right-hand sides must also be equal, so we can write the following equation:

    P(A) × P(B|A) = P(B) × P(A|B)
By solving these equations for conditional probability, we get the following:

    P(A|B) = [P(A) × P(B|A)] / P(B)

Or, in a similar way, we can calculate the following:

    P(B|A) = [P(B) × P(A|B)] / P(A)

The proposed formulas represent the mathematical statement of Bayes' theorem. The use of one or the other depends on what we are looking for.
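As a minimal sketch, the theorem can be expressed as a short Python function; the function and parameter names here are illustrative, not part of any library:

```python
def bayes_posterior(prior, likelihood, marginal):
    """Bayes' theorem: posterior = prior * likelihood / marginal.

    prior      -- P(A), the probability of the hypothesis before the evidence
    likelihood -- P(B|A), the probability of the evidence given the hypothesis
    marginal   -- P(B), the overall probability of the evidence (must be non-zero)
    """
    return prior * likelihood / marginal

# Example values: prior 1/2, likelihood 1, marginal 3/4
print(bayes_posterior(0.5, 1.0, 0.75))  # -> 0.6666666666666666
```

Swapping the roles of the two events gives the second form of the theorem with the same function.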

Let's look at an example. Suppose you are given two coins. The first coin is fair (one head and one tail) and the second coin is biased (heads on both sides). You randomly choose a coin and toss it, getting heads as a result. What is the probability that you chose the second (biased) coin?

Let's start by distinguishing the various events that come into play. Let's identify these events:

  • A: The first coin was chosen
  • B: The second coin was chosen
  • C: After the toss comes a head

To avoid making mistakes, let's be clear about what we need to calculate. The problem asks for the probability of having chosen the second coin, given that the toss came up heads. In symbolic terms, we have to calculate P(B|C).

According to Bayes' theorem, we can write the following:

    P(B|C) = [P(B) × P(C|B)] / P(C)

Now, let's compute the three probabilities that appear in the previous equation. Remember that P(B|C) is called the posterior probability, and it is what we want to calculate. P(B) is called the prior probability; it is linked to the second event (B), and is equal to 1/2, since we choose at random between the two available coins:

    P(B) = 1/2

P(C|B) is called the likelihood and is equal to 1: it is the probability of getting heads given that the second coin was chosen, and since that coin has heads on both sides, this is a certain event. Therefore, we get the following:

    P(C|B) = 1

Finally, P(C) is called the marginal likelihood and is equal to 3/4. A quick way to see this is that the two coins have four faces in total (possible cases), of which three are heads (favorable cases); equivalently, by the law of total probability, P(C) = P(C|A)P(A) + P(C|B)P(B) = (1/2 × 1/2) + (1 × 1/2) = 3/4:

    P(C) = 3/4

At this point, we can substitute the calculated probabilities into Bayes' formula to get the result:

    P(B|C) = [P(B) × P(C|B)] / P(C) = (1/2 × 1) / (3/4) = 2/3
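The analytical result can be checked with a quick Monte Carlo simulation of the experiment. This is a sketch, and the function and variable names are illustrative:

```python
import random

def simulate_posterior(trials=100_000, seed=0):
    """Estimate P(second coin | heads) by repeating the coin experiment."""
    rng = random.Random(seed)
    heads = 0             # tosses that came up heads
    second_and_heads = 0  # heads tosses where the biased coin had been chosen
    for _ in range(trials):
        second = rng.random() < 0.5  # pick one of the two coins at random
        # the fair coin shows heads with probability 1/2;
        # the biased coin always shows heads
        is_heads = True if second else rng.random() < 0.5
        if is_heads:
            heads += 1
            if second:
                second_and_heads += 1
    return second_and_heads / heads

print(simulate_posterior())  # close to 2/3 ≈ 0.667
```

The estimate converges to 2/3 as the number of trials grows, in agreement with the value obtained from Bayes' theorem.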

Now that the concepts underlying Bayes' theorem are sufficiently clear, we can focus on a practical case of classification.