When assessing theories (both quantum and classical) which predict very large universes (e.g. inflationary cosmology), there is a key philosophical ambiguity which must be resolved before one can make predictions with the theory. One way to resolve this ambiguity is with the assumption (or principle) of typicality. (The word "typical" was the most common word used in the debate between Page, Hartle and Srednicki, and other cosmologists. Philosophers usually use the term "Self-sampling assumption", SSA, for closely related ideas.)

This ambiguity is closely related to the anthropic principle, and does not arise for theories of small universes. Further, this is decidedly a philosophical question, but one which seems unavoidable for doing certain kinds of pure physics.

The Problem

Summary: in order to make predictions from theories of the universe containing multiple observers indistinguishable from us, we must make an assumption about which exact observer we are, or which of the observers we are likely to be.

First, consider a complete classical theory $ T_1 $ of a spatially and temporally finite universe, which specifies initial conditions as well as dynamical laws. Assume that $ T_1 $ describes the formation of the Milky Way, the development of the solar system, and the rise of life and humans on Earth. Further, assume that it predicts---for detailed reasons involving chemistry, biology, etc.---that life arises only once in the finite universe.

Making predictions with this theory is then merely a matter of computation. All predictions can be made unambiguously, e.g. "Will it rain tomorrow?", "Is there a Higgs boson with mass between 125 and 130 GeV?", "Will there be more than 3 supernovae in the Milky Way galaxy during the next century?". Falsifying the theory is merely a matter of comparing its predictions to observations.

Now, however, consider a second, still classical, theory $ T_2 $ which describes a much, much larger (though still finite) universe. In that universe, there are many galaxies which are observationally indistinguishable from the Milky Way. In fact, there are (say) 100 Earths, each with scientists who have made observations indistinguishable from ours. However, these Earths are not identical. Two of the Earths will be irradiated by a gamma-ray burst tomorrow (ending human life), and the other 98 will survive unscathed. What, then, is the prediction of $ T_2 $ about humanity's survival?

Finally, consider a third classical theory $ T_3 $ which, like some inflationary cosmologies, predicts a universe which is both spatially and temporally infinite. There will then be an infinitude of Earths indistinguishable from ours, scattered throughout both space and time, which will experience different future observations.

Unlike $ T_1 $, theories $ T_2 $ and $ T_3 $ require us to make an additional assumption about which of the observers we are likely to be before we can make predictions. This is a rather novel problem in the philosophy of science, whose resolution is not so obvious.

For a good, albeit verbose, statement of the problem, see Page's "Insufficiency of the Quantum State for Deducing Observational Probabilities".

The Xerographic Distribution

Summary: We must introduce a xerographic distribution over observers, which Hartle and Srednicki think need not be uniform.

Hartle and Srednicki (HS) argue in "Science in a very large universe" that the most general way to handle this problem is to define a xerographic distribution $ \chi(O_i) $ over observers $ O_i $. It is generally agreed that all possible solutions to the above problem can be expressed in terms of a xerographic distribution; the question is whether the general formalism of such distributions is useful.

The naive choice is simply the uniform distribution $ \chi(O_i) = 1/N $, where $ N $ is the number of observers. (For infinite $ N $, there are additional ambiguities in choosing the proper measure. But, as far as I can tell, this problem is logically separable.) HS call this assuming "Typicality" and it is generally done implicitly. HS raise two objections to this assumption.
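As a minimal sketch of how a xerographic distribution turns a multi-observer theory into a first-person prediction, the following Python applies the uniform choice $ \chi(O_i) = 1/N $ to the 100-Earth example of $ T_2 $. The function name `first_person_probability` and the representation of observers as a list of booleans are illustrative assumptions, not notation taken from HS.

```python
from fractions import Fraction

def first_person_probability(chi, outcome):
    """Sum chi(O_i) over the observers O_i for whom the outcome occurs.

    chi     -- list of weights, one per observer, summing to 1
    outcome -- list of booleans; outcome[i] is True if observer i sees the event
    """
    if sum(chi) != 1:
        raise ValueError("xerographic distribution must be normalized")
    return sum(w for w, hit in zip(chi, outcome) if hit)

# Toy version of T_2: 100 indistinguishable Earths, 2 of them irradiated tomorrow.
N = 100
irradiated = [i < 2 for i in range(N)]

# Uniform ("typical") xerographic distribution chi(O_i) = 1/N.
uniform_chi = [Fraction(1, N)] * N

p_doom = first_person_probability(uniform_chi, irradiated)
print(p_doom)  # 1/50
```

Any other normalized list of weights can be passed as `chi`; the non-uniform choices that HS allow simply change which observers dominate the sum.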

Objection #1: Arbitrariness of reference class

In order to make predictions, it seems reasonable that our "reference class" (that is, the set of observers over which we average according to $ \chi(O_i) $) should consist of those observers in the theory who have exactly the same data $ D_0 $ as we have, where $ D_0 $ contains all past observations and everything we know about ourselves (we're human, the Earth has a radius of 4,000 miles, the positron was discovered in 1932). We already know we aren't those observers who have different data, and we seemingly have no reason to prefer any of the observers who have our data, since they are indistinguishable.

But we have a problem if we want to make retrodictions---predictions about the past. We want our theories to explain what we already know to be true. For instance, we want a fundamental theory of cosmology to be able to say something about the CMB temperature, which we already know to be 3K.

Again, this isn't a problem for theories with a single observer; the entire history of the observable universe is predicted unambiguously by such theories, and that history can be compared with reality. But consider a theory which predicts some observers to observe 3K and some to observe 100K. Our reference class can't be observers with exactly our data; those observers see 3K by definition, and the theory is trivially confirmed. So we must widen our reference class. But what wider class should we choose? All observers with the ability to detect CMB radiation? All observers smart enough to know what the CMB is? All conscious observers? There is a large ambiguity in the choice of reference class, and the posterior likelihood we assign to theories will be (sometimes strongly) dependent on this choice. (This is very closely related to arguments about anthropic reasoning.)
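The dependence on reference class can be made concrete with a toy calculation. The sketch below assumes a uniform distribution within each class and uses an entirely hypothetical population of 100 observers; the counts are invented for illustration and come from no actual cosmological theory.

```python
from fractions import Fraction

# Hypothetical toy universe; these counts are invented for illustration only.
# Each entry: (CMB temperature measured in K, or None; shares exactly our data D_0)
observers = (
    [(3, True)] * 1          # observers sharing exactly our data
    + [(3, False)] * 9       # measure 3 K but differ from us in other data
    + [(100, False)] * 30    # measure 100 K
    + [(None, False)] * 60   # conscious, but unable to measure the CMB
)

# Three candidate reference classes, from narrowest to widest.
reference_classes = {
    "exactly our data": lambda temp, ours: ours,
    "can detect the CMB": lambda temp, ours: temp is not None,
    "all conscious observers": lambda temp, ours: True,
}

def likelihood_of_3K(in_class):
    """Fraction of the reference class that measures 3 K (uniform weighting)."""
    members = [(t, o) for t, o in observers if in_class(t, o)]
    hits = sum(1 for t, _ in members if t == 3)
    return Fraction(hits, len(members))

likelihoods = {name: likelihood_of_3K(f) for name, f in reference_classes.items()}
for name, value in likelihoods.items():
    print(f"{name}: {value}")
```

With these made-up counts the likelihood of measuring 3K is 1 for the narrowest class, 1/4 for CMB-capable observers, and 1/10 for all conscious observers, so the same theory is confirmed trivially, mildly disfavored, or more strongly disfavored depending only on the choice of class.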

Objection #2: Surprising conclusions

The typicality assumption leads to some surprisingly strong conclusions, both for prediction and retrodiction.

  • HS give the following retrodiction example. Suppose there are two theories, one which predicts only the 6 billion humans in the Solar System, and one which predicts both 6 billion humans (on Earth) and 100 trillion "Jovian" aliens (living nearly invisible in the atmosphere of Jupiter). If our reference class is "conscious observer" and our observation is "find ourselves to be human", then the latter theory seems extremely unlikely. It "predicts" us to be a Jovian with very high probability. But intuitively, it seems impossible for us to conclusively determine that there are no tiny Jovians on Jupiter without actually going and looking.
  • The following prediction example is widely used. Suppose we have a theory which predicts a spatially finite but temporally infinite universe. In this theory, the universe has a few trillion years of normal evolution before settling into heat death for the rest of eternity. During heat death, there is some non-zero probability per unit hypervolume that a human brain will materialize out of the thermal soup with exactly my current memories: a "Boltzmann Brain" (BB). Given infinite time, there will be an infinite number of BBs which are exactly the same as me, an "ordinary observer" (OO) (of which there is, say, only one because of the size of the universe and the finite time before heat death). Assuming typicality over observers with exactly my data (one OO plus infinitely many BBs), this theory predicts with high probability that my brain is floating in space and will quickly dissolve. Since this does not happen, we seem to have excluded this theory. That is, we have made a very strong assertion about the very distant future of the universe from the comfort of our armchair, knowing almost nothing about the underlying physics.
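The strength of both conclusions is easy to see numerically. The following sketch assumes equal priors for the two Solar System theories and replaces the infinite number of BBs with a finite count `n_bb` that is allowed to grow; both choices, and the function names, are assumptions made for illustration.

```python
def posterior_a(prior_a, like_a, like_b):
    """Posterior probability of theory A when A and B are the only candidates."""
    prior_b = 1.0 - prior_a
    return prior_a * like_a / (prior_a * like_a + prior_b * like_b)

# Retrodiction: humans only (theory A) versus humans plus Jovians (theory B).
humans = 6e9
jovians = 100e12
like_a = 1.0                          # every conscious observer in A is human
like_b = humans / (humans + jovians)  # chance a typical observer in B is human
post_a = posterior_a(0.5, like_a, like_b)  # equal priors are an assumption

print(f"P(human | Jovian theory) = {like_b:.2e}")
print(f"P(humans-only theory | we are human) = {post_a:.5f}")

# Prediction: one ordinary observer among n_bb Boltzmann brains.
def prob_ordinary(n_bb):
    """Uniform weight on the single OO among 1 + n_bb identical observers."""
    return 1.0 / (1 + n_bb)

for n_bb in (10, 10**6, 10**12):
    print(n_bb, prob_ordinary(n_bb))
```

Under typicality, the Jovian theory assigns a probability of roughly 6 in 100,000 to finding ourselves human, and the probability of being the ordinary observer tends to zero as the number of BBs grows.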

Several other surprising conclusions drawn in the literature are cited by HS, although I think they all fit into one of these two categories.

Theoretical Framework

Summary: Hartle and Srednicki argue that we should assess both theories and xerographic distributions with experimental evidence.

The xerographic distribution formalism immediately prompts this question: if we're not always to choose the uniform distribution, which should we choose? HS contend that xerographic distributions should be treated similarly to underlying theories: we start with Bayesian priors, make predictions, and then calculate posterior likelihoods. In this way, the xerographic distribution would be partially determined by experimental evidence. As with theories, we would need strong priors concerning reasonableness/beauty/simplicity in order to extract useful experimental predictions. It's always possible to choose a contrived xerographic distribution to retrodict any observation, just as it's possible to choose a contrived underlying theory to do the same. HS call the combination of underlying theory with xerographic distribution a "theoretical framework".

For the case of BBs, HS conclude that we have strong observational evidence against the theoretical framework consisting of both the eternal universe theory and the uniform xerographic distribution. However, HS maintain that if our prior preference for the eternal universe theory is strong, we can choose the "atypical" distribution where we favor the single OO over the infinitude of BBs. Using this distribution, we can continue to make normal experimental predictions, and we seemingly have no reason to prefer this theoretical framework to one consisting of a theory of a single ordinary observer (and no BBs) in a temporally finite universe, with the trivial xerographic distribution.
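A minimal sketch of this Bayesian bookkeeping for the BB case is given below. The three framework labels, the equal priors of 1/3, and the finite stand-in `N_BB` for the infinitude of BBs are all assumptions chosen for illustration; the observation being conditioned on is "my brain does not dissolve in the next moment".

```python
from fractions import Fraction

def update(frameworks):
    """Posterior over frameworks: proportional to prior times first-person likelihood."""
    weights = {name: prior * like for name, (prior, like) in frameworks.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

N_BB = 10**12  # hypothetical finite stand-in for "infinitely many" Boltzmann brains

# Each framework is (prior, likelihood that an observer with my data persists).
frameworks = {
    # Eternal universe, uniform chi: only the one OO out of 1 + N_BB persists.
    "eternal + uniform chi": (Fraction(1, 3), Fraction(1, 1 + N_BB)),
    # Eternal universe, chi concentrated on the single OO.
    "eternal + chi on OO": (Fraction(1, 3), Fraction(1)),
    # Finite universe with a single observer and the trivial chi.
    "finite + trivial chi": (Fraction(1, 3), Fraction(1)),
}

post = update(frameworks)
for name, p in post.items():
    print(f"{name}: {float(p):.3e}")
```

The observation all but eliminates the uniform framework, while the "atypical" eternal framework and the finite-universe framework end up with identical posteriors. Any preference between those two has to come from the priors.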


Summary: Most debaters agree that a uniform distribution must be chosen for predictions.

The problem with debates like these in the literature is that there is often no published, widely acknowledged conclusion. I'll do my best to summarize what I think is the general consensus, which is consistent with my private communication with Srednicki.

The case of retrodiction, and the accompanying ambiguity in choosing a reference class, remains murky. This is basically the problem of using the anthropic principle, and there is no consensus among either physicists or philosophers. It does seem, however, that once a reference class is chosen, most people think a uniform xerographic distribution should be used.

For prediction, most of the debaters (but not Hartle and Srednicki) think that a uniform distribution must be chosen, though basically all agree that the concept of a xerographic distribution is applicable. This means that some theories which predict infinite universes (such as certain inflationary cosmologies) can be discarded. Given a bunch of observers with truly indistinguishable information, it is hard to argue that we should prefer some over others.

Summary of the Literature

J. Hartle and M. Srednicki, "Are We Typical?", Phys. Rev. D 75, 123523 (2007). (preprint: arXiv:0704.2630)

Hartle and Srednicki collect examples of surprising results obtained by assuming typicality. They review Bayesian reasoning and claim that all we can conclude from our observations is that our data appears somewhere in the universe.

D. N. Page, "Typicality Defended", (preprint only?: arXiv:0707.4169)

The contents of this preprint seem to be subsumed in "Typicality Derived", although this paper contains a bit more discussion about what should constitute "data". It appears to be the first to use the terminology of "first-" and "third-person predictions".

D. N. Page, "Typicality Derived", Phys. Rev. D 78, 023514 (2008). (preprint: arXiv:0804.3592)

Page (echoing others) points out that if all we can conclude from our observations is that our data appears somewhere in the universe (as HS argue), then we basically can't do science, since large universes predict that all data appears somewhere.
He then reiterates his argument for the typicality assumption, giving a good example involving coin tosses.
Finally, he formalizes the method in a couple of equivalent ways and generalizes it so that it can handle BBs pseudo-rigorously.
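Page's first point can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that each of `n` observers independently records our exact data with some tiny probability `p`; the independence assumption and the particular numbers are not taken from Page's paper.

```python
import math

def prob_data_somewhere(p, n):
    """P(at least one of n independent observers records the data), each with chance p.

    Computes 1 - (1 - p)**n via log1p/expm1 so that tiny p does not round to zero.
    """
    return -math.expm1(n * math.log1p(-p))

p = 1e-20  # hypothetical chance that any one observer has exactly our data
for n in (1e10, 1e20, 1e30):
    print(f"n = {n:.0e}: P(data appears somewhere) = {prob_data_somewhere(p, n):.6g}")
```

Once `n` is much larger than `1/p`, the third-person likelihood is indistinguishable from 1 whatever the data are, so it cannot discriminate between theories that all predict sufficiently large universes.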

D. N. Page, "Insufficiency of the Quantum State for Deducing Observational Probabilities", Phys. Lett. B, Volume 678, Issue 1, 6 July 2009, Pages 41-44. (preprint: arXiv:0808.0722)

Page states the ambiguity in making predictions with theories of very large universes.

D. N. Page, "Born Again", (preprint only?: arXiv:0907.4152)

A proof, in case it wasn't obvious, that there is no observable which can be used in a naive application of the Born rule to make predictions which mimic the uniform xerographic distribution. Only tangentially related to the typicality discussion.

M. Srednicki and J. Hartle, "Science in a very large universe", Phys. Rev. D 81, 123524 (2010). (preprint: arXiv:0906.0042)

Hartle and Srednicki accept that if restricted to third-person predictions, their formulation would make theories of large universes impotent. So, they define first-person predictions and introduce the xerographic distribution (which had never been made explicit before, as far as I know).
Note that HS spend some time accusing Page of changing quantum mechanics, whereas they are only supplementing it for the purpose of calculating first-person probabilities. I think this is just semantics, and their only real point of disagreement is whether a uniform distribution must be chosen. (In his "reformulation" of QM, Page implicitly chooses the uniform distribution.)
Note also that HS seem to confuse the cases of retrodiction and prediction in their debate over typicality, the crucial difference between which I have emphasized above. In particular, they use the more convincing objections to anthropic reasoning (retrodiction) to argue against the typicality assumption for prediction. I find it hard to believe that they didn't notice how important this distinction is, but it's the only way I can make sense of their paper.

Nick Bostrom, "Anthropic Bias: Observation Selection Effects in Science and Philosophy", Routledge New York & London (2002).

I haven't read this, but apparently this is the go-to reference for all things anthropic. I wouldn't feel comfortable saying anything definitive about the anthropic principle until I had read this. In my past experience with Bostrom, he is very clear and insightful (a physicist's philosopher).