Kazani

@kazani

The math and assumptions behind the red-blue button thought experiment

1. Everyone in the world votes privately: red or blue.
2. If a majority presses blue, everyone lives. Otherwise, only the red voters live.

At first glance, red seems obviously rational. You live if blue wins, and you also live if blue loses. Blue only saves you if blue wins. But that conclusion depends on three assumptions:

1. You do not care about anyone else: pure self-interest.
2. Your decision is independent of everyone else's decision.
3. You should only consider what your choice directly causes.

Under those assumptions, red makes perfect sense. But can blue be rational too?

Argument 1: Even mild altruism can make blue rational

Imagine a majority-blue world where most people intend to vote blue, but each person has a tiny error rate, ε, of accidentally pressing red. Say ε = 0.001. Suppose blue has a safety margin of m votes. Would a slightly altruistic person switch from blue to red? We compare the expected benefit with the expected cost.

Benefit of red: You benefit only if blue was going to lose anyway. With you still voting blue, that requires at least m+1 other blue voters to make a mistake.
Probability: ε^(m+1)
Value: you save your own life.

Cost of red: Your red vote could destroy the safety margin. This happens if exactly m other blue voters make a mistake: just enough that blue would have squeaked through with your vote, but loses without it. In that case, blue would have survived if you had voted blue, but fails because you switched.
Probability: ε^m
Value: everyone you care about who voted blue dies.

Because ε is tiny, ε^m is much larger than ε^(m+1). Specifically, the pivotal case is about 1/ε times more likely. If ε = 0.001, the pivotal case is about 1,000 times more likely than the case where blue was already doomed.
So blue is rational if:

ε^m × (value of loved ones) > ε^(m+1) × (value of your life)

Divide both sides by ε^m:

(value of loved ones) > ε × (value of your life)

If ε = 0.001, you only need to value the combined lives of your blue-voting loved ones at more than 0.1% of your own life. You do not need saintly concern for 8 billion strangers. Ordinary family-directed altruism can mathematically override the selfish temptation to vote red. Therefore, majority-blue can be a stable, mathematically coherent equilibrium.

Argument 2: With enough correlation, blue can be selfishly optimal

Now compare two possible worlds.

All-red equilibrium: You die if you accidentally slip to blue while red wins. Death probability: roughly ε.

All-blue equilibrium: You die only if enough people accidentally slip to red that blue loses its majority. Death probability: roughly ε^N, where N, the number of simultaneous slips needed to flip the majority, is enormous.

So blue is not just safer for humanity. In a stable, all-blue world, blue is vastly safer for you personally. The gap between ε and ε^N is the difference between a 1-in-1,000 risk and something closer to 1-in-a-googol.

So why does classical Causal Decision Theory recommend red? Because CDT says your choice does not physically cause other people's choices. Therefore, you should treat their votes as fixed and pick the locally safest option.

But other decision theories, such as Evidential Decision Theory, Updateless Decision Theory, and Functional Decision Theory, take correlations seriously. If many people are running similar reasoning on the same puzzle, your choice is evidence about what they will choose. You are not causing their vote; your reasoning is correlated with theirs. If that correlation is strong enough, then your choosing blue is itself evidence that others are choosing blue, putting you in the vastly safer all-blue world. In that case, even a perfectly selfish rational agent can prefer blue. No altruism required.
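Argument 2's risk gap can be sketched numerically. The slip rate and the number of slips needed to flip a blue majority are illustrative placeholders, far smaller than the real-world N:

```python
# Sketch of the two equilibria from Argument 2 (illustrative numbers).

eps = 0.001          # per-voter slip rate (assumed)
flips_needed = 1000  # slips needed to overturn a stable blue majority;
                     # stands in for the post's enormous N

risk_all_red = eps                   # you die iff you personally slip to blue
risk_all_blue = eps ** flips_needed  # ~flips_needed voters must all slip at once

print(risk_all_red)   # 0.001
print(risk_all_blue)  # underflows to 0.0 in double precision: astronomically small
```

Even with a margin of only 1,000, the all-blue death risk is too small for a 64-bit float to represent, which is the ε-versus-ε^N gap the post describes.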
Summary

When someone says "the math rules out blue," they usually mean: assuming pure self-interest, causal independence, and Causal Decision Theory, red is rational. That is true. But those assumptions are doing the work.