Comments from JL stirred memories of my own experience with self-learning. I remember trying to learn how to model biological phenomena by picking up John Maynard Smith’s Mathematical Ideas in Biology from the library many years ago. One problem that I had a really tough time understanding was the following exercise in probabilistic thinking:

Of three prisoners, Matthew, Mark and Luke, two are to be executed, but Matthew does not know which. He therefore asks the jailer ‘Since either Mark or Luke are certainly going to be executed, you will give me no information about my own chances if you give me the name of one man, either Mark or Luke, who is going to be executed’. Accepting this argument, the jailer truthfully replied ‘Mark will be executed’. Thereupon, Matthew felt happier, because before the jailer replied his own chances of execution were 2/3, but afterwards there were only two people, himself and Luke, who could be the one not to be executed, and so his chance of execution is only 1/2. Is Matthew right to feel happier?

This problem is also known as the “Serbelloni problem” and according to John Maynard Smith, it “nearly wrecked a conference in theoretical biology in 1966”. It seems that there is nothing wrong with Matthew’s intuition – one could reason that given the information that Mark would be executed, only two possibilities remain: (Mark, Matthew) or (Mark, Luke) would be executed. Since Matthew is in one of the two possible outcomes, his chance of dying is surely 1/2.

After checking the hint at the back of the book, I was puzzled to find that this was not the case. Maynard Smith merely said that an application of Bayes’ Theorem or “common sense” should solve the problem. I had some idea of Bayes’ Theorem at the time, so following the hint, one arrives at the answer of 2/3, the same as Matthew’s probability of dying prior to receiving any information from the guard. To be precise, let I be the information given by the guard and H be the hypothesis that Matthew will die. What is required is the conditional probability P(H|I). According to Bayes’ Theorem, this can be related to P(I|H) as follows:

P(H|I) = P(I|H)P(H) / P(I)

Let’s consider P(I|H). If Matthew is known to die, then the outcome must be either (Matthew, Mark) or (Matthew, Luke), each equally likely. Since the guard cannot name Matthew, his answer is forced in each case: he says Mark for (Matthew, Mark) and Luke for (Matthew, Luke), so he names Mark with probability 1/2. P(H) is just 2/3, since it is equal to P(Matthew, Mark) + P(Matthew, Luke), each of which occurs with probability 1/3.

How about P(I)? We could express P(I) as P(I,H)+P(I,H’), where H’ is the complement of H, and further rewrite it as P(I|H)P(H) + P(I|H’)P(H’). The first term is the same as the numerator. For the second term, P(H’) = 1/3, and P(I|H’) = 1/2, since if we know that Matthew does not die, then the only outcome is (Mark, Luke), and the guard reveals that Mark dies with probability 1/2. Putting everything together, we get

P(H|I) = (1/2)(2/3) / { (1/2)(2/3) + (1/2)(1/3)} = 2/3.
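As a sanity check on the arithmetic, the whole calculation can be carried out with exact fractions. Here is a quick sketch in Python (the variable names are my own shorthand for the quantities above):

```python
from fractions import Fraction

# H = "Matthew is executed", I = "the guard names Mark".
p_H = Fraction(2, 3)             # (Matthew, Mark) or (Matthew, Luke), each 1/3
p_I_given_H = Fraction(1, 2)     # guard names Mark only when the pair is (Matthew, Mark)
p_I_given_not_H = Fraction(1, 2) # pair must be (Mark, Luke); guard picks Mark half the time
p_not_H = 1 - p_H

# Total probability: P(I) = P(I|H)P(H) + P(I|H')P(H')
p_I = p_I_given_H * p_H + p_I_given_not_H * p_not_H

# Bayes' Theorem: P(H|I) = P(I|H)P(H) / P(I)
p_H_given_I = p_I_given_H * p_H / p_I
print(p_H_given_I)  # 2/3, unchanged from the prior
```

Working with Fraction rather than floats keeps the result exact, so the 2/3 falls out with no rounding.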

Bayes’ Theorem works, but mechanical application may not stimulate much intuition about conditional thinking. With more experience, P(H|I) could be computed as follows. Given the guard’s reply, only two outcomes remain possible: (Mark, Matthew) and (Mark, Luke). But they are not equally likely once we account for what the guard could have said. If the outcome is (Mark, Matthew), the guard must say Mark, since he cannot name Matthew; if the outcome is (Mark, Luke), he says Mark only half the time, choosing between Mark and Luke with probability 1/2 each. So the weight on (Mark, Matthew) is (1/3)(1) = 1/3, the weight on (Mark, Luke) is (1/3)(1/2) = 1/6, and the probability that Matthew dies is (1/3) / {(1/3) + (1/6)} = 2/3.
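The protocol can also be checked empirically. Below is a small Monte Carlo sketch of my own (the function name and parameters are hypothetical): it draws the pair to be executed, lets the guard answer as in the story, and estimates the chance that Matthew dies given that the guard says Mark.

```python
import random

def simulate(trials=200_000, seed=0):
    """Estimate P(Matthew dies | guard says Mark) under the story's protocol."""
    rng = random.Random(seed)
    says_mark = 0
    matthew_dies_and_says_mark = 0
    for _ in range(trials):
        # Each pair of prisoners is equally likely to be the one executed.
        pair = rng.choice([("Matthew", "Mark"), ("Matthew", "Luke"), ("Mark", "Luke")])
        if pair == ("Matthew", "Mark"):
            answer = "Mark"                    # guard cannot name Matthew
        elif pair == ("Matthew", "Luke"):
            answer = "Luke"                    # guard cannot name Matthew
        else:                                  # (Mark, Luke): guard picks one at random
            answer = rng.choice(["Mark", "Luke"])
        if answer == "Mark":
            says_mark += 1
            matthew_dies_and_says_mark += "Matthew" in pair
    return matthew_dies_and_says_mark / says_mark

print(simulate())  # close to 2/3, not 1/2
```

Notice that the simulation must model the guard’s choice, not just the surviving outcomes; skipping that step is exactly what leads to the mistaken 1/2.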

Things may be clearer with the help of the figure below. Maybe this is the “common sense” that Maynard Smith talked about.

Very nice. Just that I am uninterested in the theorem. I just use my own “calculation”.

It’s good to develop an intuition for conditional probabilities. I guess the theorem helps in more mundane tasks and also provides a general framework to look at conditional probabilities.