The Sleeping Beauty problem
[in which a non-math deacon tries to be smart about math, or possibly English?]
This is a post for smart math people, and for anyone else interested in math-related brain teasers. (Disclaimer: I’m interested in math-related brain teasers, but I’m not a math person.)
A couple of years ago I ran across the Sleeping Beauty problem, a brain teaser similar to statistical puzzles I have long been familiar with, like the Monty Hall problem and the Boy or Girl problem, for which it’s easy for the uninitiated to form an intuitive, reasonable-sounding opinion that is wrong.
What prompted me to write this short inquiry was the astonishing discovery that the answer to the Sleeping Beauty problem is … in dispute? There is no agreed-upon answer?? Multiple academic papers by smart math people have been published, arguing both sides, and there’s still no consensus?!?
How is this possible?!!
Excursus: Monty Hall and Boy or Girl?
Let’s do a couple of warm-up rounds to get into the spirit of this sort of thing (and to give readers whose thing this is definitely not a chance to bail!).
1. The Monty Hall problem
Here’s a classic formulation of the problem, via Wikipedia:
Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and [before your chosen door is opened] the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, ‘Do you want to pick door No. 2?’ Is it to your advantage to switch your choice?
You might think, intuitively, that “It’s two doors, it doesn’t matter whether I switch or not”—but you’d be wrong. The answer is that you should switch. Your initial guess had a 1/3 chance of being right, meaning there is a 2/3 chance that the car is behind one of the other doors. The host has just revealed which of those two doors the car isn’t behind. Since your initial guess was more likely wrong than not, the remaining mystery door is more likely right than not.
Not convinced? Look at it this way: Suppose that instead of three doors, you get a new car if you can pick the ace of spades out of a deck of cards. Obviously there’s a very low chance that the card you pick is correct. But if I, knowing where the ace of spades is, proceed to turn over 50 of the remaining 51 cards that are not the ace of spades, leaving only the card you picked and one mystery card, you’d have to be an idiot not to switch cards. Right?
That’s an extreme version of the problem, but the same principle applies, with much smaller differences in the odds, to the three-door Monty Hall problem. Make sense?
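If you'd rather see it than take my word for it, here is a minimal simulation sketch. The function names (`play`, `win_rate`), the trial count, and the seed are my own illustrative choices, and it assumes the host always opens every door except your pick and one other, never revealing the prize. The same code covers the three-door game and the 52-card version.

```python
import random


def play(num_doors: int, switch: bool, rng: random.Random) -> bool:
    """Play one round and return True if the player ends up with the prize."""
    prize = rng.randrange(num_doors)
    pick = rng.randrange(num_doors)
    # The host knows where the prize is and opens every door except the
    # player's pick and one other door, never revealing the prize.
    if pick == prize:
        # Any other door can be left closed; choose one at random.
        left_closed = rng.choice([d for d in range(num_doors) if d != pick])
    else:
        # The host has no choice: the prize door must stay closed.
        left_closed = prize
    final = left_closed if switch else pick
    return final == prize


def win_rate(num_doors: int, switch: bool, trials: int = 20_000, seed: int = 0) -> float:
    """Estimate the fraction of rounds won with the given strategy."""
    rng = random.Random(seed)
    wins = sum(play(num_doors, switch, rng) for _ in range(trials))
    return wins / trials


for doors in (3, 52):
    stay = win_rate(doors, switch=False)
    swap = win_rate(doors, switch=True)
    print(f"{doors} doors: stay wins {stay:.3f}, switch wins {swap:.3f}")
```

With three doors, staying should win about a third of the time and switching about two thirds; with 52 "doors," switching should win almost every time.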
2. Boy or Girl?
Here’s the version of the Boy or Girl problem I first heard: Your friend’s dog just had two puppies. You want a male puppy, and your friend has promised that if at least one of the puppies is a male, you can have it. Your friend says “Good news, I have a male for you!” What is the probability that the other puppy is also a male? (To simplify, let’s assume that the sex of any given puppy is a straight 50/50 proposition.)
Again, your intuitive answer might be “The sex of the two puppies is unrelated; one being a male has nothing to do with the other being male or female, so it’s 50/50.” But, again, you would be wrong! Look at it this way. 50/50 would be the correct answer to a slightly different scenario: Suppose you’re on the phone with your friend as the puppies are being born, and as the first one arrives, your friend says “Good news, I have a male for you!” In this scenario, knowing the sex of the first puppy born doesn’t affect (within the parameters of the problem) the odds of the second puppy being a male, so the odds of a second male are 50/50. In the actual problem, though, both puppies are already born, and your friend—knowing the sex of both of them—confirms that at least one is male (if not the first, then the second).
In other words, the one piece of information we have is that your friend’s dog did not have two female puppies (a 1/4 probability). Two male puppies is also a 1/4 probability, meaning that half the time any two puppies will be opposite sex. This point is easier to make with a simple grid:
First puppy male, second puppy male: 1/4
First puppy male, second puppy female: 1/4
First puppy female, second puppy male: 1/4
First puppy female, second puppy female: 1/4 (ruled out)
Statistically, there are four possibilities, not three: You could have first a male and then a female, but you could also have first a female and then a male, and those two possibilities together are as likely as the possibility of two puppies of the same sex, whether both males or both females. Of those four possibilities, we’ve eliminated one (two females), so the chance that the other puppy is also male is not 1/2, but 1/3.
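The counting argument is small enough to check by brute force. This is just a sketch: the helper name `prob_both_male` is mine, and it assumes, as the problem does, that each puppy is independently male or female with equal odds.

```python
from fractions import Fraction
from itertools import product

# All four equally likely (first-born, second-born) litters.
litters = list(product("MF", repeat=2))


def prob_both_male(condition) -> Fraction:
    """Probability of two males among the litters that satisfy `condition`."""
    kept = [litter for litter in litters if condition(litter)]
    both = [litter for litter in kept if litter == ("M", "M")]
    return Fraction(len(both), len(kept))


# The actual problem: we only learn that at least one puppy is male.
at_least_one_male = prob_both_male(lambda litter: "M" in litter)
# The phone-call scenario: we learn that the first-born puppy is male.
first_is_male = prob_both_male(lambda litter: litter[0] == "M")

print(at_least_one_male)  # 1/3
print(first_is_male)      # 1/2
```

The only difference between the two answers is which litters the condition throws away.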
Warm-ups are done! Let’s move on to the main challenge.
The Sleeping Beauty problem
Here’s how Wikipedia gives this one. (FWIW, I discovered the problem in a news article and formed my approach to it before reading Wikipedia.)
“Sleeping Beauty volunteers to undergo the following experiment and is told all of the following details: On Sunday she will be put to sleep. Once or twice, during the experiment, Sleeping Beauty will be awakened, interviewed, and put back to sleep with an amnesia-inducing drug that makes her forget that awakening. A fair coin will be tossed to determine which experimental procedure to undertake:
“If the coin comes up heads, Sleeping Beauty will be awakened and interviewed on Monday only. If the coin comes up tails, she will be awakened and interviewed on Monday and Tuesday. In either case, she will be awakened on Wednesday without interview and the experiment ends.
“Any time Sleeping Beauty is awakened and interviewed she will not be able to tell which day it is or whether she has been awakened before. During the interview Sleeping Beauty is asked: ‘What is your credence now for the proposition that the coin landed heads?’”
My intuitive (or naive?) answer is that the probability of a fair coin landing heads is 50/50 or 1/2, and so Sleeping Beauty should just say that her credence that the coin landed heads is exactly fifty-fifty. What seems to make the problem complicated, though, is that Sleeping Beauty would be interviewed twice for tails, and only once for heads. Thus, on any given interview, from Sleeping Beauty’s point of view, it might seem that there is a greater chance that she’s been awakened for a result of tails than for a result of heads. Obviously, if Sleeping Beauty were to be awakened and interviewed only for tails, then if you woke her up and interviewed her, she should say that her credence that the coin landed heads is zero. One might argue, then, that under the terms of the problem she should say her credence for heads is 1/3.
But I don’t buy it! As with the Boy or Girl problem, it may be helpful to compare a slightly different question: Suppose that, instead of asking Sleeping Beauty her credence that the coin landed heads, we asked her to guess whether the coin landed heads or tails. If Sleeping Beauty’s goal is to maximize correct answers and minimize wrong answers, she should guess tails, since she may be right twice, but could only be wrong once, whereas if she guesses heads, she could only be right once, but may be wrong twice.
The catch is that maximizing correct guesses and minimizing wrong ones doesn’t seem to be the problem posed to Sleeping Beauty! The question is “What is your credence for the proposition that the coin landed heads?” which I take to be equivalent to “What do you consider to be the probability that the coin landed heads?” (Is this really an English question, not a math problem?)
And it seems to me that, whether you ask her just once or a million times, the actual probability of any toss of a fair coin landing heads is 1/2, and she should always say 1/2. How many times you ask the question strikes me as smoke and mirrors.
This doesn’t seem that complicated to me. Yet if it were as straightforward as I think it is, it probably wouldn’t be a topic of dispute among smart math people, so I must be missing something, right? What is it?
Variations
Let’s try a couple of variations on the scenario (like we did with Monty Hall and Boy or Girl) that may help illuminate the question.
First, suppose we repeat the Sleeping Beauty experiment with 100 Sleeping Beauties. In that case, based on the toss of 100 fair coins, approximately half the Sleeping Beauties will be awakened once and half will be awakened twice. If you ask them to guess the outcome and they all guess tails, there will be twice as many right guesses as wrong guesses, so tails is clearly the “optimal” guess—again, assuming that the goal is to maximize correct guesses.
But if they’re asked about their credence for the proposition that the coin landed on heads—what they think the probability is that it’s heads—approximately half of them have a coin on heads, sooo they should always answer 1/2, right? The fact that half of them answer 1/2 once and the other half answer 1/2 twice seems irrelevant. 1/2 is the correct answer, and it’s the correct answer for every Sleeping Beauty in every interview in every scenario.
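To make the two ways of counting concrete, here is a rough simulation sketch. The function name `simulate`, the number of runs, and the seed are my own assumptions. It doesn't settle which number deserves to be called "credence"; it only shows that the fraction of experiments with a heads coin and the fraction of interviews with a heads coin are different quantities.

```python
import random


def simulate(num_experiments: int = 100_000, seed: int = 0) -> tuple[float, float]:
    """Return (share of experiments with heads, share of interviews with heads)."""
    rng = random.Random(seed)
    heads_experiments = 0
    heads_interviews = 0
    total_interviews = 0
    for _ in range(num_experiments):
        heads = rng.random() < 0.5
        # Heads: one interview (Monday). Tails: two (Monday and Tuesday).
        interviews = 1 if heads else 2
        total_interviews += interviews
        if heads:
            heads_experiments += 1
            heads_interviews += interviews
    return heads_experiments / num_experiments, heads_interviews / total_interviews


per_experiment, per_interview = simulate()
print(f"heads, counted per experiment: {per_experiment:.3f}")
print(f"heads, counted per interview:  {per_interview:.3f}")
```

Counted per experiment, heads should show up about half the time; counted per interview, about a third of the time. The first figure is the one I have in mind when I say 1/2, and the second is what makes tails the better guess.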
Now let’s consider a second variation—this one with real significance for how the question should be answered.
Suppose that, dispensing with the whole magic drug amnesia conceit, we recruit 100 volunteers and tell them that if a coin toss comes up heads, we will select one of them to ask what her credence is as to the outcome of the coin toss, and the other 99 will be asked nothing. If it comes up tails, we will ask all 100 of them what their credence is that the coin came up heads.
In this variation, if the coin comes up heads, the 99 volunteers who are not interviewed will know the outcome of the coin toss from their non-interview status. If it comes up tails, each volunteer, knowing only that she is being interviewed, and having no knowledge of the interview or non-interview status of the other volunteers, will have no way of knowing whether the coin came up tails or whether she was the sole volunteer selected for the question after a heads outcome. And this is also the case for the one volunteer chosen in the heads scenario.
What is different about this scenario is that the volunteer(s) being interviewed have one piece of meaningful information: They know that they are not in the no-interview scenario, in which a) the coin has come up heads and b) they have not been selected for the interview. Prior to the coin toss, each volunteer has a slightly higher than 50 percent chance of being interviewed: a 100 percent chance of being interviewed if the coin comes up tails, and a 1 in 100 chance of being interviewed if the coin comes up heads.
The slimness of the odds of being the one volunteer selected for the interview if the coin comes up heads is evidence: evidence supporting the conclusion, if one finds oneself being interviewed and thus not in the no-interview scenario, that the coin has very likely come up tails. While it’s true that the outcome of the coin toss is 50-50, the compound event of a heads outcome plus being the one volunteer selected (1/2 × 1/100 = 1/200) is much less likely than a tails outcome (1/2). This being the case, the interviewee should say that she gives little credence to the proposition that the coin landed heads. Of course this means there is a 50-50 chance that the one volunteer interviewed will be wrong, versus a 50-50 chance that all 100 volunteers will be right. Still, you have to make your decision based on available information.
The same logic applies, to a lesser degree, even if there are just two volunteers. Each volunteer has a 100 percent chance of being interviewed in the event of a tails outcome and a 50-50 chance of being interviewed in the event of a heads outcome. The compounded likelihood of a given volunteer being interviewed because she was selected after a heads outcome is thus 1/4, versus 1/2 for being interviewed after a tails outcome, which means that, granted that one is being interviewed, there is a 2/3 chance that the coin came up tails and a 1/3 chance that it came up heads. (It’s just like the Boy or Girl problem that way!)
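For anyone who wants the arithmetic spelled out, here is a small sketch of the conditional-probability calculation for the volunteer variation. The function names are hypothetical, and it assumes a fair coin and a uniformly random choice of the single interviewee on heads.

```python
from fractions import Fraction


def chance_of_interview(num_volunteers: int) -> Fraction:
    """Probability, before the toss, that a given volunteer gets interviewed."""
    half = Fraction(1, 2)
    # Tails: everyone is interviewed. Heads: one volunteer in num_volunteers.
    return half * 1 + half * Fraction(1, num_volunteers)


def credence_heads_given_interviewed(num_volunteers: int) -> Fraction:
    """P(heads | this volunteer is being interviewed)."""
    heads_and_interviewed = Fraction(1, 2) * Fraction(1, num_volunteers)
    return heads_and_interviewed / chance_of_interview(num_volunteers)


print(chance_of_interview(100))               # 101/200
print(credence_heads_given_interviewed(100))  # 1/101
print(credence_heads_given_interviewed(2))    # 1/3
```

With 100 volunteers an interviewee's credence in heads drops to 1/101, and with two volunteers it is the 1/3 worked out above.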
That said…
…this logic does not hold when the multiple interviews for a tails outcome are aggregated in a single volunteer with the intervention of an amnesia drug, as in the original Sleeping Beauty problem. In this scenario, Sleeping Beauty has no meaningful information. There is no non-interview scenario; all Sleeping Beauty knows is that she will be interviewed either way, one or more times. The fact that she will be interviewed more times for a tails outcome than a heads outcome is not evidence.
Thus, if asked to guess the outcome of the coin toss, with a goal of maximizing correct guesses and minimizing wrong ones, she should guess tails; but if asked her credence for the proposition that the coin landed heads, she should say 50/50.
Or so it seems to me. And it doesn’t seem complicated. What am I missing, smart math friends?!
I rear back in terror at the sight of these sorts of things, so: I can't help you!
So many questions.
What if we modify the Sleeping Beauty problem thusly: on tails she will be woken twice and interviewed as before, but for heads she will be woken twice, but only one of those times she’ll be interviewed?
For the 100 women problem: what if the causality is flipped? That is, no women are brought in until the coin is flipped. Heads, one is sought out, instructed and interviewed; tails, 100 are sought out and interviewed?
Does the shift from “credence” to “probability” shift the question from one about the coin toss itself to one of a likelihood within the system as a whole?