13 June 2025

The (Measurement Problem) Problem

The measurement problem is perhaps the best-known puzzle in quantum mechanics. A vast literature addresses this problem without ever resolving it. The measurement problem is also responsible for one of the most recognisable symbols of quantum mechanics, i.e. Schrödinger's cat.

The proposed solutions to this problem—the so-called "interpretations of quantum mechanics"—keep the mathematics as it is, and redefine reality to make sense of the maths. This procedure is problematic for several reasons: it inverts the scientific method of describing reality with mathematics (and it reifies mathematics), it's not testable, the different realities proposed are all mutually exclusive, and in the final analysis, changing reality to validate mathematics has no explanatory power.

Worse, the competing "interpretations" act more like ideologies in that they attract adherents who take sides and proselytise. I refer to these as "ideological" or "programmatic" interpretations because they are accompanied by a theoretical agenda. New ideological interpretations appear all the time, but none ever disappears. My intention here is not to spend time debating the merits of the ideological interpretations, all of which seem to me to be fundamentally unscientific, but to show that they are merely the froth on a much deeper set of philosophical issues.

The problem I wish to focus on is that the measurement problem does not, in fact, arise in the context of making measurements. Rather, it is a theoretical problem that arises from attempting to physically interpret a highly abstract mathematical procedure with no physical analogue.

In this essay, I will argue that the mathematical tail is wagging the metaphysical dog in quantum physics. Nature is not a mathematician. A problem arising from a mathematical procedure is being foisted onto reality rather than being dealt with in the mathematics.

Like many paradoxes, the associated difficulties seem to reside in how the problem is framed. So I want to begin this essay by attempting to accurately frame the situation in which the problem occurs.


What is the "Measurement Problem"?

The measurement problem hinges on precise distinctions between:

  • What the formalism says,
  • What the interpretations claim, and
  • What the observations show.

So we need to be as clear as possible about each of these.

The Formalism.

By the 1920s, a combination of empirical results and theoretical breakthroughs—such as Max Planck's description of blackbody radiation and Albert Einstein's description of the photoelectric effect—had shown that existing theories were insufficient to explain atoms.

Light had started off as vague "rays" and, under the influence of James Clerk Maxwell, had become electromagnetic waves propagating in space. Then light was observed to be quantised and to have particle-like properties. In 1923, Louis de Broglie proposed that all matter particles are also wave-like and have a "wavelength". At the time, this crystallised as the wave-particle duality postulate: matter is both a particle and a wave. As we will see, the tension implied by this duality continued to be influential, even after the advent of quantum field theory (QFT), which proposed that all matter is fundamentally excitations in fields.

Atomic spectra were known to consist of a handful of frequencies rather than a continuous rainbow, and some progress had been made on clarifying the relationships between the frequencies. Both Werner Heisenberg and Erwin Schrödinger were trying to explain atomic spectra in the wake of Planck and Einstein. Schrödinger aimed at a realist explanation in terms of wave mechanics using a Hamiltonian, rather than a Newtonian, formulation. He was inspired by the Hamilton-Jacobi equation because it draws a formal analogy between classical mechanics and wave optics. This suggested that classical trajectories might arise from some underlying wave phenomenon.

The result was the famous Schrödinger equation, which describes a "wavefunction", 𝜓. The equation can be written in many ways, though the most familiar forms came later. Where Schrödinger's formalism relied on differential equations, Heisenberg used matrices to describe virtual oscillations. Schrödinger himself proved that the two approaches were equivalent. Max Born and Pascual Jordan developed the matrix formulation into a kind of algebra with "functions" and "operators". An operator is a rule that can be applied to a function and returns another function.

There's a disconnect between what Schrödinger set out to do and how quantum mechanics turned out, one that he never came to terms with. Schrödinger was seeking a realist theory and wanted to treat 𝜓 as a physical wave in space. Heisenberg was influenced by positivism and wanted to explain phenomena purely in terms of what could be observed. His approach treated atomic spectral lines as resulting from virtual oscillators. These early ideas were soon replaced by the idea that 𝜓 is a vector in Hilbert space (named after the mathematician David Hilbert).

The modern formalism of quantum mechanics has two modes: evolution over time and extracting information. To make for a simpler narrative, I'm going to focus on what it tells us about the position of a particle.

Other things being equal, the wavefunction 𝜓 evolves smoothly and continuously in time according to the time-dependent Schrödinger equation. We can rewrite the Schrödinger equation to give us 𝜓 as a function of position: 𝜓(x). For Schrödinger, x was simply a coordinate and 𝜓(x) represented the amplitude of the wave at that coordinate. In his view, the modulus squared of the wavefunction at that point |𝜓(x)|² reflected the charge density.
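For reference, in one dimension the time-dependent equation can be written compactly as:

iħ ∂𝜓/∂t = Ĥ𝜓

where ħ is the reduced Planck constant and Ĥ is the Hamiltonian (total energy) operator. On Schrödinger's reading, this equation describes the smooth evolution of a real, physical wave.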

However, Max Born had other ideas. As described by John Earman (2022):

Born rejected Schrödinger's proposal because the spreading of the wave function seemed incompatible with the corpuscular [i.e. particle] nature of the electron, of which Born was convinced by the results of scattering experiments.

Born introduced a further level of abstraction. In Born’s view, 𝜓 does not describe a physical wave. It is an abstract mathematical function. The quantity |𝜓(x)|² gives the probability density for the electron, which tells us how probability is distributed over space. To find the probability of detecting the particle in a given range of x, one must integrate |𝜓(x)|² over that range. This is called "applying the Born rule". (Note that the procedure is slightly different for discrete quantities like spin).
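To make the procedure concrete, here is a minimal numerical sketch in Python. The Gaussian wavepacket and the range of x are purely illustrative choices on my part, not anything dictated by the formalism:

import numpy as np

# Illustrative wavefunction: a normalised Gaussian wavepacket centred on x = 0.
x = np.linspace(-10, 10, 2001)
sigma = 1.0
psi = (1.0 / (np.pi * sigma**2) ** 0.25) * np.exp(-x**2 / (2 * sigma**2))

# Born rule: |psi(x)|^2 is a probability density, so the probability of
# detecting the particle between a and b is the integral of |psi|^2 over [a, b].
density = np.abs(psi) ** 2
a, b = 0.0, 1.0
mask = (x >= a) & (x <= b)
prob = np.trapz(density[mask], x[mask])

print(f"P({a} <= x <= {b}) = {prob:.2f}")                  # about 0.42 for this wavepacket
print(f"total probability = {np.trapz(density, x):.2f}")   # 1.00

Note that what comes out is a probability for a region, not a position.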

Paul Dirac and John von Neumann added yet more layers of abstraction. Dirac invented his own operator algebra. Von Neumann recast 𝜓 in terms of vectors in an infinite-dimensional space—a Hilbert space. The Dirac-von Neumann formalism is how quantum mechanics is taught.

We can also retrieve another kind of position information: the expectation value of x. This is the average position we would obtain over many repeated measurements on identically prepared systems. It involves applying the position operator to 𝜓, producing another function, which is then integrated against the complex conjugate of 𝜓.
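In the position representation, the expectation value is:

⟨x⟩ = ∫ 𝜓*(x) x 𝜓(x) dx

That is, multiply 𝜓(x) by x (apply the position operator), then integrate the result against the complex conjugate of 𝜓.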

In the standard textbook formulation of quantum mechanics, 𝜓 represents a vector in an abstract vector space, and 𝜓(x) is a function representing that vector in terms of positions in space. It evolves smoothly and continuously; then we apply the Born rule, integrating |𝜓(x)|² dx over a region, to extract probabilities of particles appearing in defined regions of space. The physical interpretation of this formalism presents seemingly intractable problems.


Interpretations

Having attempted to present the formalism in neutral terms, we could simply stop at this point. In practice, however, we are compelled by inclination and convention to say what the theory means. And this involves saying what the mathematics means in physical terms. As noted above, by "interpretations" I'm not referring to ideological views like "Copenhagen" or "Many Worlds". Rather, I'm trying to draw attention to the contradictory assumptions that underlie these ideological views.

Traditionally, scientists aim to explain some phenomenon P in terms of its causes. If A causes P, then A explains P. In the abstract mathematics of vectors in Hilbert space, causality is not defined. However, rather than this fact shutting down speculation, it has enabled a proliferation of competing interpretive frameworks, many of which are treated as axiomatic.

There are numerous ideas about what, if anything, 𝜓 represents, and these include some very fundamental tensions: particle versus wave, realist versus anti-realist, ontic versus epistemic, observer-independent versus observer-dependent.

As noted, the wave-particle duality postulate remains at the heart of quantum mechanics. Waves and particles are not simply different kinds of entities. The mathematics used to describe them is also fundamentally different. Particles have qualities like position and momentum, while waves have properties like displacement, wavelength, and period. Quantum mechanics purports to replace waves and particles with a single unified mathematics, but what this means in physical terms remains unclear. And in the end, we still use the wavefunction to extract information on position and momentum (though not at the same time).

Quantum field theory reformulates particles as excitations of fields, but the legacy of dualistic terminology and beliefs persists, especially in the use of the wavefunction to compute particle positions (which is partly why I chose the example of position above).

The realist view is that reality extends to the atomic world, even though we cannot directly experience it (by "directly" here, I mean without apparatus). In this view, atoms, electrons, and photons are assumed to be real entities with real properties that we can either measure or infer. We are so used to science being committed to realism that it may seem strange to insist on this. However, the considered opinion is that classical physics fails to describe the atomic and subatomic scales, and this opened the door to antirealism.

The antirealist view is that if we cannot see or measure something, then it makes little sense to assert that it is real. Non-observable entities, such as electrons, are merely mathematical objects that we use for making calculations to account for experimental observations. Quantum theory emerged in the early twentieth century when positivism was having its brief moment in the sun. Positivists were against metaphysics on principle (based on a rather naive reading of Kant). Where something could not be observed, they argued that it should not have a place in our descriptions of reality. And in the 1920s, this included atoms, electrons, and protons. Heisenberg initially set out to eliminate anything unobservable from his formulation of quantum mechanics.

As noted, Schrödinger was a realist. As were Planck and Einstein, although Einstein had changed what "real" meant. Historian Mara Beller (1996) notes that Heisenberg, Pauli, Born, and Bohr all denied being positivists, but were given to making strongly positivist pronouncements in their work.

We can also note that, on the whole, scientific realism is strongly associated with a commitment to metaphysical reductionism. Reductionism holds that only substances are real. Substance is described as "fundamental". In this view, structures and systems are simply adventitious phenomena. Thus, physicists tend to view the "particles" of the standard model (or the fields of QFT) as the ultimate building blocks, the foundations of reality. Although, ironically, the fields of QFT are not observable, even in principle.

Another level of debate occurs over the ontological status of the wavefunction 𝜓. How one thinks about this issue is naturally influenced by one's pre-existing view about realism versus antirealism. For example, anti-realists are not inclined to see 𝜓 as real; realists are so inclined.

A major issue with such views is that assumptions can concatenate: if A, then B; if B, then C; if C, then D; and so on. For example, if I assume a realist stance, then it may seem natural to conclude that the wave function is real. And if the wavefunction is real, then something real must happen to it when we measure the location of a particle. And so on.

Yet realism is not a given; it is a metaphysical stance—one that rests on its own chain of assumptions.

Most of us fall one way or the other in the realism versus antirealism divide, and this limits what views are available in other contexts. Views in which the wavefunction is a real entity are called 𝜓-ontic, and views in which it merely represents our knowledge are called 𝜓-epistemic (I mentioned this in my essay about Probability).

A 𝜓-ontic view holds that the wavefunction is a real entity, which is obviously also a realist view. Some high-profile physicists, notably Sean Carroll, now routinely state "the wavefunction is real" without qualification. Such views are colloquially referred to as 𝜓-ontologies. For Carroll, this move is a prelude to introducing Hugh Everett's Many Worlds ideology, in which the reality of 𝜓 is axiomatic.

A 𝜓-epistemic view holds that the wave function only represents our knowledge of the system. The issue of what real process creates atomic phenomena is incidental. The problem here is that 𝜓(x), for example, on its own gives us no knowledge of position in space or time. To get knowledge out, we have to apply the Born rule, and even then, we don't obtain knowledge about position, but only about probabilities.

None of these views resolves all the issues. For example, 𝜓(x) appears to represent all possible locations without distinguishing between them. And since this situation arises prior to applying Born’s rule, it is typically interpreted as describing the system "before a measurement is made". However, if all positions are represented and none is distinguished—let alone selected—then the interpretation itself seems to require further interpretation.

If each interpretation needs another to explain it, we’re not solving the problem; we’re just piling words on top of mathematics. And this process can lead to infinite regress. Adding interpretive layers does nothing to resolve more fundamental ambiguities and contradictions, even when it obscures them.

The next dichotomy brings us to the heart of the measurement problem.

On one view, by far the dominant view, the observer plays a central role in causation at the atomic level. Readers are likely familiar with the phrase that observation collapses the wavefunction, even if they are vague about what it means. In my essay Observations and Superpositions, I refuted the idea that observation can occur prior to the outcome of the events we are observing. Still, this issue continues to arise and cause confusion.

However, the term "observer" is a misnomer. What "observation" means in this context is that a particle is detected by a particle detector. The "observer" is never a human being—we cannot see particles. The "observer" is a Geiger counter, a photomultiplier, a photographic plate, etc.

Eugene Wigner went one step further and postulated that "consciousness" collapses the wavefunction, and this idea has proved irresistible to mystics. "Consciousness" is in scare quotes because it is undefined by Wigner, and decades later, there is still no standard definition of this often reified abstraction. Physicists, by and large, view Wigner's theory as an embarrassing moment in an otherwise distinguished career and sweep it under the rug. The idea that "consciousness" is involved in deciding outcomes is simply nonsensical.

It is also possible to take the view that quantum phenomena are not affected by observation, and that the "collapse of the wave function" is, for example, random.


The Measurement Problem

The "measurement problem" can be viewed from a plurality of philosophical standpoints, and thus, there are many ideas about what it connotes. However, the basic problem is that to extract a probability, we have to switch from evolving the wavefunction in time to applying the Born rule.

This leads to a number of unresolved ontological questions, some of which are artificial. The ontological status of the wavefunction is generally decided prior to these questions, but there is an ongoing dissensus on this issue. A constant question for physics is how descriptions, especially mathematical descriptions, relate to reality (presuming our metaphysical commitments admit a reality). This leads to a number of deep questions:

  • What does it mean for the wavefunction to "collapse"?
  • If there is a change in mathematical procedure between time evolution and making a prediction, does this amount to a change in reality?
  • If we cannot make any distinction in terms of position using time evolution, what is the ontological status of position prior to applying the Born rule?
  • What is the ontological status of probability?
  • Is making a prediction using the Born rule equivalent to making a measurement?

Each question has multiple serious answers, and is surrounded by a halo of versions for non-specialists that are more or less inaccurate. The idea of wavefunction collapse, in terms of position, is an interpretation of three facts:

  1. During time evolution of the wavefunction 𝜓, we obtain no definite information about position.
  2. When we apply the Born rule to 𝜓(x), we get the probability of a particle appearing in a range.
  3. When we measure position, we get definite position information.

An obvious question is, given how definitely we can measure position, why is the formalism that supposedly predicts this so vague and indefinite?

During time evolution, an ontic interpretation is that the particle has no definite position. An epistemic interpretation is that we don't know anything about the position prior to measuring it. A popular interpretation is that the particle is in all positions simultaneously.

Quantum physicists adopted the term "superposition" from wave mechanics to describe this situation. In wave mechanics, superposition refers to the simple mathematical fact that if two waves are described by functions f(x) and g(x), then their combined displacement at x is just f(x) + g(x). There is nothing remarkable about this; it is commonly observed and well understood in classical terms.

[Image: superposition of two waves]
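As a minimal illustration of the classical case (the particular waves are arbitrary choices of mine), here is a Python sketch in which two equal waves half a cycle out of phase cancel exactly:

import numpy as np

# Two waves f(x) and g(x); their combined displacement is simply f(x) + g(x).
x = np.linspace(0, 2 * np.pi, 50)
f = np.sin(x)
g = np.sin(x + np.pi)    # the same wave, shifted by half a cycle

combined = f + g
print(np.allclose(combined, 0.0))   # True: the two waves cancel everywhere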

Quantum superposition, in the case of a single particle, can be read as "no position", "unknown position", or "all positions". We often read about electrons being "smeared out" in space. Or forming a cloud around the nucleus. But these images raise more questions than they answer (not least questions about how such an atom can be stable). My impression is that most scientists now lean towards "no position".

The epistemic position is more or less intuitive, but this does not make it right. As we have seen, physical intuition can be a poor guide to the physics of things we can't physically experience (like galaxies or atoms). Ignorance is our natural state. Still, what can it mean for a particle to have no definite position in an ontological sense? Is it extended in space? Does it not exist? No one can say.

How we view superposition determines how we view applying the Born rule. An epistemic view says that applying the Born rule changes our state from ignorant to knowledgeable (again, this is intuitive, but so what?). But if we take the wavefunction to be something real, then we are forced to seek a realist interpretation of applying the Born rule. Somehow, and no one knows how, the Born rule causes the wavefunction to do or be something different. If we start with the idea that all positions exist simultaneously, then applying the Born rule appears to do something dramatic, which has been called "collapsing the wavefunction". Which of these views actually applies is unclear.

And finally, we have the fact that when we measure the positions of particles, they appear in one definite place. Again, depending on how we interpret the wavefunction to start with, and how we interpret the outcome of applying the Born rule, there are several possible positions one can take on this.

It's seldom explicit, but it seems to be common to assume that making a measurement is functionally equivalent to applying the Born rule. This is why people say "measurement collapses the wavefunction". However, the result of the former is a single location in space, while the result of the latter is a probability of occurring within a range of space. In fact, it's not at all clear how taking a measurement relates to the quantum formalism.

The idea that the wavefunction "collapses" is a rather dramatic way of framing the issue. The idea of "collapse" requires that we view the wavefunction as real, in a realist framework. The time evolution is viewed as pregnant with possibilities, and applying the Born rule gives birth to a single possibility. And how we frame this in physical terms depends on our prior philosophical commitments as sketched out above.


Conclusion

One way to view science is that it seeks to replace guesses and superstition with objective knowledge that is accurate and precise to certain limits. Belief is merely a feeling about an idea, and thus highly subjective. The antipathy between belief and science is never far from the news.

And yet, beliefs play a central role in the story of quantum mechanics, even if it is merely the belief in instrumentalism. Some physicists believe that the wavefunction is real; others believe it is not real. Some believe that the wavefunction represents reality; others believe it represents knowledge about reality. Some believe that measurement collapses the wavefunction; others believe that wavefunction collapse is an incoherent idea. After a century of trying, physics alone seems incapable of resolving such issues.

Part of the reason that quantum theory seems so complicated is that it involves multiple layers of abstraction, each of which is subject to multiple competing interpretations. Some interpretations are accepted as axiomatic, and others are treated as postulates. Just when you think you've understood one aspect, you discover another that contradicts it. Commentators often take philosophical stances without sufficient justification. And at the popular level, explanations of quantum mechanics often do more harm than good.

It's almost guaranteed that any given statement about quantum mechanics beyond the bare mathematics will be contradicted by another point of view. It is ridiculously difficult to give a concise account of quantum mechanics that won't be shot down by someone.

However, given the choices on offer, a thoughtful person might well choose none of the above. My sense is that the ideological/programmatic interpretations, which keep the maths and redefine reality, are all incoherent. Notably, as I have pointed out many times now, all of these interpretations are mutually exclusive. For example, the ontology of Many Worlds is completely unrelated to the ontology of, say, Pilot Wave theory, which is unrelated to spontaneous collapse interpretations, which are unrelated to information-theoretic interpretations. Given the plurality of choices on offer and the lack of any coherent ideas that can distinguish between them, I can see how it makes sense to focus on what works. In the absence of a viable ontology, adopting an instrumentalist stance—in which we have a functioning mechanism for obtaining probabilities—makes a lot of sense. However, it abandons the attempt at explanation, which I find unsatisfactory.

For realism, at least, reality itself is a point of reference. A good theory is not only consistent with reality, it is consistent with other theories about reality. When two theories are diametrically opposed, like realism and anti-realism, then only one of them can reflect reality. One of them must be a bad theory. But when it comes to quantum mechanics, we still cannot tell which is which.

~~Φ~~


Bibliography

Beller, Mara. (1996). "The Rhetoric of Antirealism and the Copenhagen Spirit". Philosophy of Science 63(2): 183-204.

Earman, John. (2022) “The Status of the Born Rule and the Role of Gleason's Theorem and Its Generalizations: How the Leopard Got Its Spots and Other Just-So Stories.” [2022 preprint]

30 May 2025

Theory is Approximation

A farmer wants to increase milk production. They ask a physicist for advice. The physicist visits the farm, takes a lot of notes, draws some diagrams, then says, "OK, I need to do some calculations."

A week later, the physicist comes back and says, "I've solved the problem and I can tell you how to increase milk production".

"Great", says the farmer, "How?".

"First", says the physicist, "assume a spherical cow in a vacuum..."

What is Science?

Science is many things to many people. At times, scientists (or, at least, science enthusiasts) seem to claim that they alone know the truth of reality. Some seem to assume that "laws of science" are equivalent to laws of nature. Some go as far as stating that nature is governed by such "laws". 

Some believe that only scientific facts are true and that no metaphysics are possible. While this view is less common now, it was of major importance in the formulation of quantum theory, which still has problems admitting that reality exists. As Mara Beller (1996) notes:

Strong realistic and positivistic strands are present in the writings of the founders of the quantum revolution – Bohr, Heisenberg, Pauli and Born. Militant positivistic declarations are frequently followed by fervent denial of adherence to positivism (183).

On the other hand, some see science as theory-laden and sociologically determined. Science is just one knowledge system amongst many of equal value. 

However, most of us understand that scientific theories are descriptive and idealised. And this is the starting point for me. 

In practising science, I had ample opportunity to witness hundreds or even thousands of objective (or observer-independent) facts about the world. The great virtue of the scientific experiment is that you get the same result, within an inherent margin of error associated with measurement, no matter who does the experiment or how many times they do it. The simplest explanation of this phenomenon is that the objective world exists and that such facts are consistent with reality. Thus, I take knowledge of such facts to constitute knowledge about reality. The usual label for this view is metaphysical realism.

However, I don't take this to be the end of the story. Realism has a major problem, identified by David Hume in the 1700s. The problem is that we cannot know reality directly; we can only know it through experience. Immanuel Kant's solution to this has been enormously influential. He argues that while reality exists, we cannot know it. In Kant's view, those qualities and quantities we take to be metaphysical—e.g. space, time, causality, etc.—actually come from our own minds. They are ideas that we impose on experience to make sense of it. This view is known as transcendental idealism. One can see how denying the possibility of metaphysics (positivism) might be seen as (one possible) extension of this view. 

It's important not to confuse this view with the idea that only mind is real. This is the basic idea of metaphysical idealism. Kant believed that there is a real world, but we can never know it. In my terms, there is no epistemic privilege.

Where Kant falls down is that he lacks any obvious mechanism to account for shared experiences and intersubjectivity (the common understanding that emerges from shared experiences). We do have shared experiences. Any scenario in which large numbers of people do coordinated movements can illustrate what I mean. For example, 10,000 spectators at a tennis match turning their heads in unison to watch a ball be batted back and forth. If the ball is not objective, or observer-independent, how do the observers manage to coordinate their movements? While Kant himself argues against solipsism, his philosophy doesn't seem to consider the possibility of comparing notes on experience, which places severe limits on his idea. I've written about this in Buddhism & The Limits of Transcendental Idealism (1 April 2016).

In a pragmatic view, then, science is not about finding absolute truths or transcendental laws. Science is about idealising problems in such a way as to make a useful approximation of reality. And constantly improving such approximations. Scientists use these approximations to suggest causal explanations for phenomena. And finally, we apply the understanding gained to our lives in the form of beliefs, practices, and technologies. 


What is an explanation?

In the 18th and 19th centuries, scientists confidently referred to their approximations as "laws". At the time, a mechanistic universe and transcendental laws seemed plausible. They were also gathering the low-hanging fruit, those processes which are most obviously consistent and amenable to mathematical treatment. By the 20th century, as mechanistic thinking waned, new approximations were referred to as "theories" (though legacy use of "law" continued). And more recently, under the influence of computers, the term "model" has become more prevalent.

A scientific theory provides an explanation for some aspect of reality, which allows us to understand (and thus predict) how what we observe will change over time. However, even the notion of explanation requires some unpacking.

In my essay, Does Buddhism Provide Good Explanations? (3 Feb 23), I noted Faye's (2007) typology of explanation:

  • Formal-Logical Mode of Explanation: A explains B if B can be inferred from A using deduction.
  • Ontological Mode of Explanation: A explains B if A is the cause of B.
  • Pragmatic Mode of Explanation: a good explanation is an utterance that addresses a particular question, asked by a particular person whose rational needs (especially for understanding) must be satisfied by the answer.
In this essay, I'm striving towards the pragmatic mode and trying to answer my own questions. 

Much earlier (18 Feb 2011), I outlined an argument by Thomas Lawson and Robert McCauley (1990) which distinguished explanation from interpretation.

  • Explanationist: Knowledge is the discovery of causal laws, and interpretive efforts simply get in the way.
  • Interpretationist: Inquiry about human life and thought occurs in irreducible frameworks of values and subjectivity. 
"When people seek better interpretations they attempt to employ the categories they have in better ways. By contrast, when people seek better explanations they go beyond the rearrangement of categories; they generate new theories which will, if successful, replace or even eliminate the conceptual scheme with which they presently operate." (Lawson & McCauley 1990: 29)

The two camps are often hostile to each other, though some intermediate positions exist between them. As I noted, Lawson and McCauley see this as somewhat performative:

Interpretation presupposes a body of explanation (of facts and laws), and seeks to (re)organise empirical knowledge. Explanation always contains an element of interpretation, but successful explanations winnow and increase knowledge. The two processes are not mutually exclusive, but interrelated, and both are necessary.

This is especially true for physics where explanations often take the form of mathematical equations that don't make sense without commentary/interpretation.  


Scientific explanation.

Science mainly operates, or aims to operate, in the ontological/causal mode of explanation: A explains B if (and only if) A is the cause of B. However, it still has to satisfy the conditions for being a good pragmatic explanation:  "a good explanation is an utterance that addresses a particular question, asked by a particular person whose rational needs (especially for understanding) must be satisfied by the answer."

As noted in my opening anecdote, scientific models are based on idealisation, in which an intractably complex problem is idealised until it becomes tractable. For example, in kinematic problems, we often assume that the centre of mass of an object is where all the mass is. It turns out that when we treat objects as point masses in kinematics problems, the computations are much simpler and the results are sufficiently accurate and precise for most purposes. 

Another commonly used idealisation is the assumption that the universe is homogeneous or isotropic at large scales. In other words, as we peer out into the farthest depths of space, we assume that matter and energy are evenly distributed. As I will show in the forthcoming essay, this assumption seems to be both theoretically and empirically false. And it seems that so-called "dark energy" is merely an artefact of this simplifying assumption. 

Many theories have fallen because they employed a simplifying assumption that distorted the answers enough to make them unsatisfying.

A "spherical cow in a vacuum" sounds funny, but a good approximation can simplify a problem just enough to make it tractable and still provide sufficient accuracy and precision for our purposes. It's not that we should never idealise a scenario or make simplifying assumptions. The fact is that we always do this. All physical theories involve starting assumptions. Rather, the argument is pragmatic. The extent to which we idealise problems is determined by the ability of the model to explain phenomena to the level of accuracy and precision that our questions require. 

For example, if our question is, "How do we get a satellite into orbit around the moon?" we have a classic "three-body" problem (with four bodies: Earth, moon, sun, and satellite). Such problems are mathematically very difficult to solve. So we have to idealise and simplify the problem. For example, we can decide to ignore the gravitational attraction caused by the satellite, which is real but tiny. We can assume that space is relatively flat throughout. We can note that relativistic effects are also real but tiny. We don't have to slavishly use the most complex explanation for everything. Given our starting assumptions, we can just use Newton's law of gravitation to calculate orbits. 
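As a rough Python sketch of the kind of calculation this idealisation permits (the figures for the Moon are approximate, and a circular orbit is itself a further simplification):

import math

# Idealised circular orbit around the Moon, using Newton's law of gravitation,
# treating both bodies as point masses and ignoring the Earth, sun, and
# relativistic effects (the simplifying assumptions discussed above).
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
M_moon = 7.35e22       # mass of the Moon, kg (approximate)
r = 1.74e6 + 100e3     # orbital radius: lunar radius + 100 km altitude, m

# For a circular orbit, gravity supplies the centripetal force:
# G*M*m/r^2 = m*v^2/r, so v = sqrt(G*M/r).
v = math.sqrt(G * M_moon / r)
period = 2 * math.pi * r / v

print(f"orbital speed: about {v/1000:.2f} km/s")          # about 1.6 km/s
print(f"orbital period: about {period/60:.0f} minutes")   # about 2 hours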

We got to relativity precisely because someone asked a question that Newtonian approaches could not explain, i.e. why does the orbit of Mercury precess at the rate it does? A pure two-body Newtonian calculation gives no precession at all, and perturbations from the other planets cannot account for all of the precession that is observed. But in Einstein's reformulation of gravity as the geometry of spacetime, the residual precession is expected and can be calculated.
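For illustration, the standard first-order relativistic correction can be computed in a few lines (the orbital figures for Mercury are approximate):

import math

# First-order general-relativistic perihelion precession per orbit:
# delta_phi = 6*pi*G*M / (c^2 * a * (1 - e^2))
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_sun = 1.989e30     # mass of the sun, kg (approximate)
c = 2.998e8          # speed of light, m/s
a = 5.79e10          # Mercury's semi-major axis, m (approximate)
e = 0.2056           # Mercury's orbital eccentricity

delta_phi = 6 * math.pi * G * M_sun / (c**2 * a * (1 - e**2))   # radians per orbit

orbits_per_century = 100 * 365.25 / 88.0    # Mercury's year is about 88 days
arcsec_per_century = math.degrees(delta_phi * orbits_per_century) * 3600

print(f"predicted excess precession: about {arcsec_per_century:.0f} arcseconds per century")
# about 43 arcseconds per century, the residue that Newtonian calculations leave unexplained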


Models

I was in a physical chemistry class in 1986 when I realised that what I had been learning through school and university was a series of increasingly sophisticated models, and the latest model (quantum physics) was still a model. At no point did we get to reality. There did seem to me to be a reality beyond the models, but it seemed to be forever out of reach. I had next to no knowledge of philosophy at that point, so I struggled to articulate this thought, and I found it dispiriting. In writing this essay, I am completing a circle that I began as a naive 20-year-old student.

This intuition about science crystallised into the idea that no one has epistemic privilege. By this I mean that no one—gurus and scientists included—has privileged access to reality. Reality is inaccessible to everyone. No one knows the nature of reality or the extent of it. 

We all accumulate data via the same array of physical senses. That data feeds virtual models of world and self created by the brain. Those models then feed information to our first-person perspective, using the sensory apparatus of the brain to present images to our mind's eye. This means that what we "see" is at least two steps removed from reality. This limit applies to everyone, all the time.

However, when we compare notes on our experience, it's clear that some aspects of experience are independent of any individual observer (objective) and some of them are particular to individual observers (subjective). By focusing on and comparing notes about the objective aspects of experience, we can make reliable inferences about how the world works. This is what rescues metaphysics from positivism on one hand and superstition on the other. 

We can all make inferences from sense data. And we are able to make inferences that prove to be reliable guides to navigating the world and allow us to make satisfying causal explanations of phenomena. Science is an extension of this capacity, with added concern for accuracy, precision, and measurement error. 

Since reality is the same for everyone, valid models of reality should point in the same direction. Perhaps different approaches will highlight different aspects of reality, but we will be able to see how those aspects are related. This is generally the case for science. A theory about one aspect of reality has to be consistent, even compliant, with all the other aspects. Or if one theory is stubbornly out of sync, then that theory has to change, or all of science has to change. Famously, Einstein discovered several ways in which science had to change. For example, Einstein proved that time is particular rather than universal.  Every point in space has its own time. And this led to a general reconsideration of the role of time in our models and explanations. 


Sources of Error

A scientific measurement is always accompanied by an estimate of the error inherent in the measurement apparatus and procedure. Which gives us a nice heuristic: If a measurement you are looking at is not accompanied by an indication of the errors, then the measurement is either not scientific, or it has been decontextualised and, with the loss of this information, has been rendered effectively unscientific.

Part of every good scientific experiment is identifying sources of error and trying to eliminate or minimise them. For example, if I measure my height with three different rulers, will they all give the same answer? Perhaps I slumped a little on the second measurement? Perhaps the factory glitched, and one of the rulers is faulty? 

In practice, a measurement is accurate to some degree, precise to some degree, and contains inherent measurement error to some degree. And each degree should be specified to the extent that it is known.

Accuracy is itself a quantity; it reflects how close to reality the measurement is.

Precision represents how finely we are making distinctions in quantity.

Measurement error reflects uncertainty introduced into the measurement process by the apparatus and the procedure.

Now, precision is relatively easy to know and control. We often use the heuristic that a ruler can be read to half its smallest division. So a ruler marked with millimetres is considered precise to 0.5 mm.

Let's say I want to measure the width of my tea cup. I have three different rulers. But I also note that the cup has rounded edges, so knowing where to measure from is a judgment call. I estimate that this will add a further 1 mm of error. Here are my results:

  • 83.5 ± 1.5 mm.
  • 86.0 ± 1.5 mm.
  • 84.5 ± 1.5 mm

The average is 84.7 ± 1.5 mm. So we would say that we think the true answer lies between 83.2 and 86.2 mm. And note that even though I have an outlier (86.0 mm), this is in fact within the margin of error.

As I was measuring, I noted another potential source of error. I was guesstimating where the widest point was. And I think this probably adds another 1-2 mm of measurement error. When considering sources of error in a measurement, one's measurement procedure is often a source. In science, clearly stating one's procedure allows others to notice problems the scientists might have overlooked. Here, I might have decided to mark the cup so that I measured at the same point each time. 

Now the trick is that there is no way to get behind the measurement and check with reality. So, accuracy has to be defined pragmatically as well. One way is to rely on statistics. For example, one makes many measurements and presents the mean value and the standard deviation (which requires more than three measurements). 
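A minimal Python sketch of this, using my three cup measurements (in practice one would want far more than three readings for the standard deviation to mean much):

import statistics

# The three width measurements of the cup, in millimetres.
readings = [83.5, 86.0, 84.5]

mean = statistics.mean(readings)
spread = statistics.stdev(readings)    # sample standard deviation

print(f"mean = {mean:.1f} mm")         # 84.7 mm
print(f"std dev = {spread:.1f} mm")    # about 1.3 mm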

The point is that error is always possible. It always has to be accounted for, preferably in advance. We can take steps to eliminate error. An approximation always relies on starting assumptions, and these are also a source of error. Keep in mind that this critique comes from scientists themselves. They haven't been blindly ignoring error all these years. 


Mathematical Models

I'm not going to dwell on this too much. But in science, our explanations and models usually take the form of an abstract symbolic mathematical equation. A simple, one-dimensional wave equation takes the general form:

y = f(x,t)

That is to say that the displacement of the wave (y) is a function of position (x) and time (t). In other words, the displacement depends both on where we look (position in space) and on when we look (time). This describes a wave that, over time, moves in the x direction (left-right) and displaces in the y direction (up-down).

More specifically, we model simple harmonic oscillations using the sine function. In this case, we know that spatial changes are a function of position and temporal changes are a function of time. 

y(x) = sin(x)
y(t) = sin(t)

It turns out that the relationship between the two functions can be expressed as 

y(x,t) = sin(x ± t).

If the wave is moving right, we subtract time, and if the wave is moving to the left, we add it. 

The sine function smoothly changes between +1 and -1, but a real wave has an amplitude, and we can scale the function by multiplying it by the amplitude.

y(x,t) = A sin(x ± t).

And so on. We keep refining the model until we get to the general formula:

y(x,t) = A sin(kx ± ωt ± ϕ).

Where A is the maximum amplitude, k is the wavenumber (the spatial frequency), ω is the angular frequency, and ϕ is the phase.

The displacement is periodic in both space and time. Since k = 2π/λ (where λ is the wavelength), the function returns to the same spatial configuration whenever x changes by a whole number of wavelengths (x = nλ). Similarly, since ω = 2π/T (where T is the period), it returns to the same temporal configuration whenever t changes by a whole number of periods (t = nT).
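A minimal Python sketch of the model, with arbitrary parameter values, makes the double periodicity explicit:

import numpy as np

# General one-dimensional travelling wave: y(x, t) = A * sin(k*x - w*t + phi)
A, wavelength, T, phi = 2.0, 4.0, 0.5, 0.0
k = 2 * np.pi / wavelength    # wavenumber
w = 2 * np.pi / T             # angular frequency

def y(x, t):
    return A * np.sin(k * x - w * t + phi)

# Periodic in space: shifting x by one wavelength gives the same displacement.
print(np.isclose(y(1.0, 0.2), y(1.0 + wavelength, 0.2)))   # True
# Periodic in time: shifting t by one period gives the same displacement.
print(np.isclose(y(1.0, 0.2), y(1.0, 0.2 + T)))            # True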

What distinguishes physics from pure maths is that, in physics, each term in an equation has a physical significance or interpretation. The maths aims to represent changes in our system over time and space. 

Of course, this is idealised. It's one-dimensional. Each oscillation is identical to the last. The model has no friction. If I add a term for friction, it will only be an approximation of what friction does. But no matter how many terms I add, the model is still a model. It's still an idealisation of the problem. And the answers it gives are still approximations.


Conclusion

No one has epistemic privilege. This means that all metaphysical views are speculative. However, we need not capitulate to solipsism (we can only rely on our own judgements), relativism (all knowledge has equal value) or positivism (no metaphysics is possible). 

Because, in some cases, we are speculating based on comparing notes about empirical data. This allows us to pragmatically define metaphysical terms like reality, space, time, and causality in such a way that our explanations provide us with reliable knowledge. That is to say, knowledge we can apply and get expected results. Every day I wake up and the physical parameters of the universe are the same, even if everything I see is different. 

Reality is the world of observer-independent phenomena. No matter who is looking, when we compare notes, we broadly agree on what we saw. There is no reason to infer that reality is perfect, absolute, or magical. It's not the case that somewhere out in the unknown, all of our problems will be solved. As a historian of religion, I recognise the urge to utopian thinking and I reject it. 

Rather, reality is seen to be consistent across observations and over time. Note that I say "consistent", not "the same". Reality is clearly changing all the time. But the changes we perceive follow patterns. And the patterns are consistent enough to be comprehensible. 

The motions of stars and planets are comprehensible: we can form explanations for these that satisfactorily answer the questions people ask. The patterns of weather are comprehensible even when unpredictable. People, on the other hand, remain incomprehensible to me.

That said, all answers to scientific questions are approximations, based on idealisations and assumptions. Which is fine if we make clear how we have idealised a situation and what assumptions we have made. This allows other people to critique our ideas and practices. As Mercier and Sperber point out, it's only in critique that humans actually use reasoning (An Argumentative Theory of Reason, 10 May 2013).

We can approximate reality, but we should not attempt to appropriate it by insisting that our approximations are reality. Our theories and mathematics are always the map, never the territory. The phenomenon may be real, but the maths never is.  

This means that if our theory doesn't fit reality (or the data), we should not change reality (or the data); we should change the theory. No mathematical approximation is so good that it demands that we redefine reality. Hence, all of the quantum Ψ-ontologies are bogus. The quantum wavefunction is a highly abstract concept; it is not real. For a deeper dive into this topic, see Chang (1997), which requires a working knowledge of how the quantum formalism works, but makes some extremely cogent points about idealised measurements.

In agreeing that the scientific method and scientific explanations have limits, I do not mean to dismiss them. Science is by far the most successful knowledge-seeking enterprise in history. Science provides satisfactory answers to many questions. For better or worse, science has transformed our lives (and the lives of every living thing on the planet).

No, we don't get the kinds of answers that religion has long promised humanity. There is no certainty, we will never know the nature of reality, we still die, and so on. But then religion never had any good answers to these questions either. 

~~Φ~~


Beller, Mara. (1996). "The Rhetoric of Antirealism and the Copenhagen Spirit". Philosophy of Science 63(2): 183-204.

Chang, Hasok. (1997). "On the Applicability of the Quantum Measurement Formalism." Erkenntnis 46(2): 143-163. https://www.jstor.org/stable/20012757

Faye, Jan. (2007). "The Pragmatic-Rhetorical Theory of Explanation." In Rethinking Explanation. Boston Studies in the Philosophy of Science, 43-68. Edited by J. Persson and P. Ylikoski. Dordrecht: Springer.

Lawson, E. T. and McCauley, R. N. (1990). Rethinking Religion: Connecting Cognition and Culture. Cambridge: Cambridge University Press.


Note: 14/6/25. The maths is deterministic, but does this mean that reality is deterministic? 

23 May 2025

The Curious Case of Phlogiston

I'm fascinated by revisionist histories. I grew up in a British colony where we were systematically lied to about our own history. Events in the 1970s and 1980s forced us to begin to confront what really happened when we colonised New Zealand. At around the same time, modern histories began to appear to give us a more accurate account. James Belich's Making Peoples had a major impact on me. Michael King's Being Pakeha also struck a chord, as did Maurice Shadbolt's historical novel Monday's Warriors.

Most people who know a little bit about the history of science will have heard of phlogiston. The phlogiston theory is usually portrayed as exactly the kind of speculative metaphysics that was laid to rest by artful empiricists. Phlogiston became a symbol of the triumph of empiricism over superstition. As a student of chemistry, I imbibed this history and internalised it. 

The popular history (aka science folklore) has a Whiggish feel in the sense that Lavoisier is represented as making a rational leap towards the telos of the modern view. Such, we are led to believe, is the nature of scientific progress. My favourite encyclopedia repeats the standard folklore:

The phlogiston theory was discredited by Antoine Lavoisier between 1770 and 1790. He studied the gain or loss of weight when tin, lead, phosphorus, and sulfur underwent reactions of oxidation or reduction (deoxidation); and he showed that the newly discovered element oxygen was always involved. Although a number of chemists—notably Joseph Priestley, one of the discoverers of oxygen—tried to retain some form of the phlogiston theory, by 1800 practically every chemist recognized the correctness of Lavoisier’s oxygen theory.—Encyclopedia Britannica.

Compare this remark by Hasok Chang (2012b: time 19:00) in his inaugural lecture as Hans Rausing Professor of History and Philosophy of Science, at Cambridge University:

I became a pluralist about science because I could not honestly convince myself that the phlogiston theory was simply wrong or even genuinely inferior to Lavoisier's oxygen-based chemical theory.

When I was reading about the systematic misrepresentation of the work of J. J. Thomson and Ernest Rutherford in physics folklore, Chang's lecture came to mind. I discovered Chang 4 or 5 years ago and have long wanted to review his account of phlogiston, but was caught up in other projects. In this essay, I will finally explore the basis for Chang's scepticism about the accepted history of phlogiston. While I largely rely on his book, Chang pursued this theme in two earlier articles (2009, 2010).


Setting the Scene

The story largely takes place in the mid-late eighteenth century. The two principal figures are Joseph Priestley (1733 – 1804) and Antoine-Laurent de Lavoisier (1743 – 1794). 

A caveat is that while I focus on these two figures, the historical events involved dozens, if not hundreds, of scientists. Even in the 1700s, science was a communal and cooperative affair; a slow conversation amongst experts. My theme here is not "great men of history". My aim is to explore the historiography of science and reset my own beliefs. Chang's revisionist history of phlogiston is fascinating by itself, but I am intrigued by how Chang uses it as leverage in his promotion of pluralism in science. Priestley and Lavoisier are just two pegs to hang a story on. And both were, ultimately, wrong about chemistry. 

Chang (2012: 2-5) introduces Priestley at some length. He refers to him as "a paragon of eighteenth-century amateur science" who "never went near a university", while noting that he was also a preacher and a "political consultant" (from what I read, Priestley was really more of a commentator and pamphleteer). As a member of a Protestant dissenting church, Priestley was barred from holding any public office or working in fields such as law or medicine. In the 1700s, British universities were still primarily concerned with training priests for the Church of England. That said, Priestley was elected a fellow of the Royal Society in 1766, which at least gained him the ears of fellow scientists. Priestley is known for his work identifying different gases in atmospheric air. His early work was on "fixed air" (i.e. carbon dioxide), and he became a minor celebrity with his invention of carbonated water. He also discovered oxygen; more on this below.

However, Chang provides no similar introduction to Lavoisier. Rather, Lavoisier appears in a piecemeal way as a foil to his main character, Priestley. The disparity seems to be rhetorical. Part of Chang's argument for plurality in science is that Priestley was on the right track and has been treated poorly by historians of science. By focusing primarily on Priestley and treating Lavoisier as secondary, Chang might be seen as rebalancing a biased story.

I'm not sure that this succeeds, because as a reviewer, I now want to introduce Lavoisier to my readers, and I have to rely on third-party sources to do that. Chang doesn't just leave the reader hanging; he also misses an opportunity to put Lavoisier in context and to draw some obvious comparisons. That Priestley and Lavoisier inhabited very different worlds is apposite to any history of phlogiston.

Lavoisier was an aristocrat who inherited a large fortune at the age of five (when his mother died). He attended the finest schools, where he became fascinated by the sciences (such as they were at the time). This was followed by university studies, where Lavoisier qualified as a lawyer, though he never practised law (he did not need to). As an aristocrat, Lavoisier had access to the ruling elite, which gave him leverage in his dispute with Priestley. He was also something of a humanitarian and philanthropist, spending some of his fortune on such projects as clean drinking water, prison reform, and public education. Despite this, he was guillotined during the French Revolution after being accused of corruption in his role as a tax collector. He was later exonerated.

The contrasting social circumstances help to explain why Lavoisier was able to persuade scientists to abandon phlogiston for his oxygen theory. Lavoisier had money and class on his side in a world almost completely dominated by money and class. 

Having introduced the main players, we now need to backtrack a little to put their work in its historical context. In the 1700s, the Aristotelian idea that the world is made of earth, water, fire, and air was still widely believed. To be clear, both water and air were considered to be elemental substances. 18th-century medicine was still entirely rooted in this worldview.

Alchemy still fascinated the intelligentsia of the day. On one level, alchemists pursued mundane goals, such as turning lead into gold, and on another, they sought physical immortality (i.e. immortality in this life rather than in the afterlife).

The telescope and microscope were invented in the early 1600s. With the former, Galileo observed the Moon and Jupiter's satellites, becoming the first empirical scientist to upset the existing worldview by discovering new facts about the world. 

That worldview was still largely the synthesis of Christian doctrine with Aristotelian philosophy created by Thomas Aquinas (1225–1274). The microscope had also begun to reveal a level of structure to the world, and to life, that no one had previously suspected existed. The practice of alchemy began to give way to natural philosophy, i.e. the systematic investigation of properties of matter. Priestley and Lavoisier were not the only people doing this, by any means, but they were amongst the leading exponents of natural philosophy. 

One of the key phenomena that captured the attention of natural philosophers, for obvious reasons, was combustion. The ancient Greeks believed that fire was elemental and that combustion released the fire element latent in the fuel. This is the precursor to the idea of phlogiston as a substance.


Phlogiston Theory

The first attempt at a systematic account of phlogiston is generally credited to Georg Ernst Stahl (1659 – 1734) in Zymotechnia fundamentalis "Fundamentals of the Art of Fermentation" (1697). The term phlogiston derives from the Greek φλόξ phlóx "flame", and was already in use when it was applied to chemistry.

The basic idea was that anything which burns contains a mixture of ash and phlogiston. Combustion is the process by which phlogiston is expelled from matter, leaving behind ash. And we see this process happening in the form of flames. And thus, a combustible substance was one that contained phlogiston. Phlogiston was "the principle of inflammability". 

However, experimentation had begun to show interesting relationships between metals and metal-oxides (known at the time as calx). One could be turned into the other, and back again. For example, metallic iron gradually transforms into a reddish calx, which is a mixture of a couple of different oxides of iron. To turn iron-oxide back into iron, we mix it with charcoal or coke and heat it strongly. And this reversible reaction is common to all metals. 

Chemists used phlogiston to explain this phenomenon. Metals, they conjectured, were rich in phlogiston. This is why metals have such qualities as lustre, malleability, ductility, and electrical conductivity. In becoming a calx, the metal must be losing phlogiston, and by analogy, a calx is a kind of ash. On the other hand, charcoal and coke burn readily, so they must also be rich in phlogiston. When heated together, the phlogiston must move from the charcoal back into the calx, reconstituting the metal. 

This reversible reaction was striking enough for Immanuel Kant to use it, in The Critique of Pure Reason (1781), as an example of how science "began to grapple with nature in a principled way" (Chang 2012: 4).

Priestley is famous for having discovered oxygen, but as Chang emphasises, it was Lavoisier who called it that. Priestley called it dephlogisticated air, i.e. air from which phlogiston has been removed. "Air" in this context is the same as "gas" in modern parlance.

Priestley produced dephlogisticated air by heating mercury-oxide, releasing oxygen and leaving the pure metal. According to the phlogiston theory, such dephlogisticated air should readily support combustion because, being dephlogisticated, it would readily accept phlogiston from combustion. And so it proved. Combustion in dephlogisticated air was much more vigorous. Breathing the new air also made one feel invigorated. Priestley was the first human to breathe pure oxygen, though he tested it on a mouse first.

Formerly considered elemental, atmospherical air could now be divided into "phlogisticated air" and "dephlogisticated air". A missing piece was inflammable air (hydrogen), which was discovered by Henry Cavendish in 1766, when he was observing the effects of acids on metals. Cavendish had also combusted dephlogisticated air and inflammable air to make water. And Priestley had replicated this in his own lab.

Priestley and Cavendish initially suspected that inflammable air was in fact phlogiston itself, driven from the metal by the action of acids. A calx in acid produced no inflammable air, because it was already dephlogisticated. However, the fact that dephlogisticated air and phlogiston combined to make water was suggestive and led to an important refinement of the phlogiston theory.

They settled on the idea that inflammable air (hydrogen) was phlogisticated water, and that dephlogisticated air (oxygen) was actually dephlogisticated water. And thus, the two airs combined to form water. In this view, water is still elemental. 

It was Lavoisier who correctly interpreted this reaction to mean that water was not an element but a compound of hydrogen and oxygen (and it was Lavoisier who named inflammable air hydrogen, i.e. "water maker"). However, it is precisely here, Chang argues, that phlogiston proves itself to be the superior theory.

Chang notes that, without the benefit of hindsight, it's difficult to say what is so wrong with the phlogiston theory. It gave us a working explanation of certain chemical phenomena, and it made testable predictions that were accurate enough to be taken seriously. For its time, phlogiston was a perfectly good scientific theory. So the question then becomes, "Why do we see it as a characteristic example of a bad scientific theory disproved by empiricism?" Was it really such a bad theory?


A Scientific Blunder?

On one hand, Chang argues that, given the times, phlogiston theory was a step in the right direction: away from alchemical views and towards seeing electricity as the flow of a fluid, which in turn leads towards the modern view of chemical reactions as involving the flow or exchange of electrons. On the other hand, Lavoisier's theory is far from being "correct".

If the argument is that phlogiston was an ad hoc concept that could not be observed, then why is the same criticism not levelled against Lavoisier for the role of elemental light (lumière) or caloric in his theory? Caloric is roughly what we would now call "heat", and it is clearly not an elemental substance.

The terms "oxidation" and "reduction" (and the portmanteau "redox") are generalisations from Lavoisier's explanation of metals and metal-oxides. A metal-oxide can be "reduced" to the pure metal, and a metal oxidised to form the oxide. And one can make them go back and forth by altering the conditions.

While oxidation and reduction apply to metals and their oxides, such reactions are not typical. Most redox reactions don't involve metals or oxygen. When fluorine reacts with hydrogen, for example, we say that hydrogen is "oxidised" (gives up an electron) and that fluorine is "reduced" (gains an electron). And this terminology doesn't make much sense. Even with a BSc in chemistry, I always have to stop and think carefully about which label applies because it's not intuitive.

A commonly cited reason for the collapse of the phlogiston theory is that a metal gains weight in becoming a calx. The implication is that phlogiston theory was at a loss to explain this. On the face of it, the early versions of phlogiston theory argued that in becoming a calx the metal loses phlogiston, so we would expect it to lose weight rather than gain it. The idea that the metal combines with oxygen is correct in hindsight, and is how we understand the formation of metal-oxides today.

However, Priestley and another phlogistonist, Richard Kirwan, did have an explanation for weight gain. I've already noted that Priestley's ideas matured and that, latterly, he had concluded that inflammable air (hydrogen) was phlogisticated water, and dephlogisticated air (oxygen) was dephlogisticated water. In Priestley's mature view, the metal formed a calx by combination with water and the loss of phlogiston. The added weight was due to the dephlogisticated water. When the calx was reduced, the metal absorbed phlogiston and gave up water. 

Like Chang, when I review this explanation, keeping in mind the state of knowledge at the time, I can't see how Lavoisier's explanation is any better. Seen in the context of the times (late 18th century), there was nothing illogical about the phlogiston theory. It explained observations and made testable predictions. As Chang (2010: 50) says:

We really need to lose the habit of treating ‘phlogiston theory got X wrong’ as the end of the story; we also need to ask whether Lavoisier’s theory got X right, and whether it did not get Y and Z wrong.

Chang cites several historians of science commenting on this. For example, John McEvoy (1997) notes that...

by the end of the eighteenth century, almost every major theoretical claim that Lavoisier made about the nature and function of oxygen had been found wanting.

And Robert Siegfried (1988):

The central assumptions that had guided [Lavoisier's] work so fruitfully were proved empirically false by about 1815.

These comments are in striking contrast to the claim made by Britannica: "by 1800, practically every chemist recognized the correctness of Lavoisier’s oxygen theory". The story in Britannica is the widely accepted version of history. At the same time, Chang makes clear, the story in Britannica is simply false.

Lavoisier's theory of acids, his theory of combustion, and his theory of caloric were all clearly wrong from the viewpoint of modern chemistry. For example, Lavoisier claimed that all acids contain oxygen (the name oxygen means "acid maker"). However, hydrochloric acid (which we have in our stomachs) does not contain oxygen. Indeed, the action of acids is now attributed to their ability to produce hydrogen ions (aka naked protons, aka phlogisticated water), which are extremely reactive.

Moreover, as Chang (2012: 9) shows, the problems with Lavoisier's theory were well known to his contemporaries. Many scientists voiced their concerns at the time. The point is well taken. If we are judging by modern standards, then Lavoisier and Priestley were both wrong, Lavoisier no less than Priestley. Nonetheless, Lavoisier, with his fortune and his access to the French aristoi, had more leverage than dissenting Priestley.

That said, Lavoisier clearly won the argument. And the brief account of his triumph in Britannica is a classic example of the adage that the victors write history.


What We Lost

What Chang tries to do next is declared by the subtitle of section 2: "Why Phlogiston Should Have Lived" (2012: 14). The first section of the book is deliberately written relatively informally with the idea that a general reader could appreciate the argument. In this second section, however, he develops a much more philosophically rigorous approach and introduces a great deal more jargon, some of which is specific to his project.

My aim in this essay is to continue the discussion at the same level. This inevitably means losing exactly the nuances that Chang introduces and probably diverging from his intentions to some extent. I do recommend reading the rest of his argument. What follows is my all-too-brief interpretation of Chang's argument.

While his history is revisionist, Chang's point is not to promote a speculative counterfactual history (which is to say, a fictitious alternative history). Rather, he seeks to make an argument for pluralism, where pluralism means the coexistence of different explanations for a given phenomenon until such time as the best explanation emerges.

Chang argues that Lavoisier's view that oxygen was being exchanged in chemical reactions was clearly inferior and only applicable to metal/calx reactions. By the time this became clear, phlogiston was discredited and could not be revived. And Lavoisier's counterintuitive oxidation-reduction model became the norm in chemistry, and still is, despite its obvious disadvantages. 

The idea that phlogiston was being exchanged in chemical reactions was not a bad theory (for the time). Moreover, phlogiston was already conceptually linked to electricity. Getting from redox to the exchange of electrons took another century. Chang argues that the conceptual leap from phlogiston to the exchange of electrons could have been considerably easier than the leap that eventually had to be made from Lavoisier's theory.

Chang's argument for pluralism is not simply based on the two theories being equally false. Indeed, he goes to some pains to explain what they both got right. The point is that the phlogiston theory had untapped potential. In prematurely killing off phlogiston and adopting Lavoisier's oxygen theory (which, as we have seen, was disproved a few decades later), we actually retarded the progress of science. And when Lavoisier was proven wrong, we had no alternative theory and simply retained his awkward and misleading terminology.

Had we allowed the two theories to co-exist a little longer, so that Lavoisier's explanation could be thoroughly tested and proven false before it was adopted, there is a possibility that we might have lighted on the electron exchange theory of chemical reactions a century earlier than we did. Indeed, as hinted above, phlogiston was already linked to electricity. Seen with hindsight, the rush to judgment about chemical reactions meant that scientists of the late 18th and early 19th centuries missed a huge opportunity.

Chang is a pragmatist. He knows we cannot go back. His argument is that we should be alert to this situation in the present and the future and be less eager to settle on a theory where ambiguity remains. Arguably, the temporary triumph of the various Copenhagen interpretations of Schrödinger's equation was a similar example. We settled too early, for reasons unconnected to science, only to have the chosen theory be disproved some decades later. 

I don't read Chang as saying that we should hold on to pluralism no matter what. Only that, where there is room for doubt, we should allow multiple explanations to coexist, because we don't know in advance what the best answer will be. This only emerges over time. And a scientific theory can only benefit from responding to the challenges that other explanations pose.


Conclusions

Hasok Chang aims to demonstrate the value of pluralism through critiquing the history of the so-called "chemical revolution" identified with Lavoisier. And the case of phlogiston is both fascinating in its own right and a compelling study of how the lack of pluralism retarded the progress of science. 

While sources like Britannica follow science folklore in insisting on the "correctness" of the oxygen theory, historians of science tell us a different story. It may be true that Lavoisier's theory was widely adopted by 1800, but historians have shown that it was also largely falsified by 1815. By this time, the phlogiston theory had been "killed", as Chang puts it.

Chang attempts to show that phlogiston was not such a bad theory and that the oxygen theory was not such a good theory. Contrary to the usual Whiggish accounts, the triumph of Lavoisier's oxygen theory was not really an example of "scientific progress". Indeed, Chang supposes that adopting the oxygen theory actually retarded the progress of science, since it pointed away from the role of electricity in chemistry. This important insight took another century to emerge.

The phlogiston theory is arguably the better of the two theories that existed in the late 1700s. Chang argues that had phlogiston persisted just a little longer, at least until Lavoisier was disproved, we might have made the leap to seeing chemical reactions in terms of the flow of electricity between elements much earlier than we eventually did. And who knows what else this might have changed?

The point is not to inaugurate some kind of neo-phlogistonist movement or to speculate about counterfactual (alternative) histories. The point is that when we have competing theories, in the present, we should allow them to coexist rather than rushing to settle on one of them. 

Pluralism is a pragmatic approach to uncertainty. When different explanations are possible, we can compare and contrast the differences. Allowing such challenges is more likely to result in scientific progress than the rush to judgment or the overwhelming desire to have one right answer.

As noted at the outset, in this essay, I have largely overlooked the contributions of Priestley's and Lavoisier's contemporaries. I have emphasised the two main players, even more than Chang does, purely for narrative simplicity (and to keep this essay to a reasonable length). This might make it seem that it was something like a personal competition, when that doesn't seem to be the case. Think of this essay as a taster. My aim is to whet your appetite to go and discover Chang for yourself, or better, to go and read the original papers published at the time. See for yourself.


Coda

The pluralism that Chang praises in the case of chemistry is not the same kind of pluralism that exists in so-called "interpretations of quantum mechanics". Chang is in favour of having multiple explanations of a phenomenon until such time as the best explanation unequivocally emerges. But he also considers that the best explanations change over time as new data comes in. Chang is a pragmatist, and this seems to be the only viable approach to science. We do not and cannot acquire metaphysical certainty because there is no epistemic privilege with respect to reality. We are all inferring facts about reality based on experience, a procedure known to be fraught with difficulties that often go unnoticed.

Generally, in science, we see competing explanations that attempt to fit a new phenomenon into our pre-existing metaphysics. In crude terms, scientific theories are made to fit into existing views about reality, and new data changes our view of reality only rarely and often incrementally. Paradigms do change, but only with great reluctance. This conservatism is generally a good thing as long as it doesn't become dogmatic.

In stark contrast to the rest of science, in quantum physics, the mathematical approximations are considered infallible and inviolable, and scientists propose different realities in which the mathematics makes sense. They have become dogmatic about their theory and refuse to consider other models. It has not gone well.

As Sabine Hossenfelder said, "Theoretical physicists used to explain what was observed. Now they try to explain why they can’t explain what was not observed."

~~Φ~~


Bibliography

Chang, Hasok. (2009). "We Have Never Been Whiggish (About Phlogiston)". Centaurus 51(4): 239-264. https://doi.org/10.1111/j.1600-0498.2009.00150.x

Chang, Hasok. (2010). "The Hidden History of Phlogiston: How Philosophical Failure Can Generate Historiographical Refinement." HYLE – International Journal for Philosophy of Chemistry 16 (2): 47-79. Online.

Chang, Hasok. (2012a). Is Water H20? Evidence, Realism and Pluralism. Springer.

Chang, Hasok. (2012b). "Scientific Pluralism and the Mission of History and Philosophy of Science." Inaugural Lecture by Professor Hasok Chang, Hans Rausing Professor of History and Philosophy of Science, 11 October 2012. https://www.youtube.com/watch?v=zGUsIf9qYw8

Stahl, Georg Ernst. (1697). Zymotechnia fundamentalis.

16 May 2025

Observations and Superpositions

The role of observation in events has been a staple of quantum physics for decades and is closely associated with "the Copenhagen interpretation". On closer inspection, it turns out that everyone connected with Bohr's lab in Copenhagen had a slightly different view on how to interpret the Schrödinger equation. Worse, those who go back and look at Bohr's publications nowadays tend to confess that they cannot tell what Bohr's view was. For example, Adam Becker speaking to Sean Carroll (time index 21:21; emphasis added):

I don't think that there is any single Copenhagen interpretation. And while Niels Bohr and Max Born and Pauli, and Heisenberg and the others may have each had their own individual positions. I don't think that you can combine all of those to make something coherent...

...Speaking of people being mad at me, this is something that some people are mad at me for, they say, "But you said the Niels Bohr had this position?" I'm like, "No, I didn't, I didn't say that Niels Bohr had any position. I don't know what position he had and neither does anybody else."

So we should be cautious about claims made for "the Copenhagen interpretation", which seem to imply a consensus that never existed at Bohr's lab in Copenhagen.

That said, the idea that observation causes the wavefunction to collapse is still a staple of quantum physics. Despite playing a central role in quantum physics, "observation" is seldom precisely defined in scientific terms, or when it is defined, it doesn't involve any actual observation (I'll come back to this). The situation was made considerably worse when (Nobel laureate) Eugene Wigner speculated that it is "consciousness" that collapses the wave function. "Consciousness" is even less well-defined than "observation". While most academic physicists instantly rejected the role of consciousness in events, outside of physics it became a popular element of science folklore and New Ageism.

The idea that "observation" or "consciousness" are involved in "collapsing the wave function" is also an attachment point for Buddhists who wish to bolster their shaky faith by aligning it with science. The result of such legitimisation strategies is rather pathetic hand waving. Many Buddhists want reality to be reductive and idealist: they want "mind" to be the fundamental substance of the universe. This would align with some modern interpretations of traditional Buddhist beliefs about mind. But the idea is also to find some rational justification for Buddhist superstitions like karma and rebirth. As I showed at length in my book Karma and Rebirth Reconsidered, it simply does not work.

In this essay, I will show that it is trivially impossible for observation to play any role in causation at any level. I'm going to start by defining observation with respect to a person and exploring the implications of this, particularly with respect to Schrödinger's cat. I will also consider the post hoc rationalisation of observation qua "interaction" (sans any actual observation).


What is "An Observation"?

We may say that an observer, Alice, observes a process P giving rise to an event E, with an outcome O, when she becomes aware of P, E, and O. It is possible to be aware of each part individually, but in order to understand and explain what has happened, we really need to have some idea of what processes were involved, what kinds of events they engendered, and the specific outcomes of those events.

It's instructive to ask, "How does Alice become aware of external events?" Information from the process, event, and/or outcome of interest first has to reach her in some form. The fastest way that this can happen is for light from the process, event, and/or outcome to reach Alice's eyes. It always takes a finite amount of time for the light to reach her eye.

But light reaching Alice's eye alone does not create awareness. Rather, cells in the eye convert the energy of light into electrochemical energy (a nerve impulse). That pulse of energy travels along the optic nerve to the brain and is incorporated into her virtual world model and then, finally, presented to the first-person perspective. Only then does she become aware of it. And this part also takes a finite amount of time. Indeed, it takes far more time than the light does to arrive.
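
To put rough numbers on this, here is a minimal back-of-the-envelope sketch in Python. The 2-metre distance and the ~150 ms figure for visual awareness are illustrative assumptions on my part, chosen only to show the orders of magnitude involved.

```python
# Compare the light-travel delay with the neural-processing delay for an
# observer a couple of metres from an event. Both figures are illustrative.

c = 299_792_458              # speed of light in a vacuum (m/s)
distance_m = 2.0             # assumed distance from event to Alice's eye
neural_latency_s = 0.150     # assumed rough latency for visual awareness

light_time_s = distance_m / c

print(f"light travel time : {light_time_s * 1e9:.1f} nanoseconds")
print(f"neural processing : {neural_latency_s * 1e3:.0f} milliseconds")
print(f"awareness lags the event by roughly {neural_latency_s / light_time_s:,.0f} x the light delay")
```

Whatever exact numbers one prefers, the ordering is the same: the event is long over before awareness arrives.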

Therefore, the time at which Alice becomes aware of P, E, and O is an appreciable amount of time after E has happened and O is already fixed. There is no alternative definition of "observation" that avoids this limitation, since information cannot travel faster than the speed of light and the brain is always involved. The only other possibilities are, if anything, slower. Therefore:

Alice can only observe processes, events, and outcomes after the fact.

If observation is always after the fact, then observation can never play any causal role in the sequence of events because causes must precede effects, in all frames of reference. Therefore:

Observation can play no causal role
in processes, events, or outcomes.

This means that there is no way that "observation" (or "consciousness") can cause the collapse of the wavefunction. Rather, the collapse of the wavefunction has to occur first, then the light from that event has to travel to Alice's eye. There is no way around this physical limitation in our universe. And given the nature of wavefunctions—the outputs of which are vectors in a complex plane—this can hardly be surprising.

Observation is never instantaneous, let alone precognitive. And this means that all talk of observation causing "wavefunctions to collapse" is trivially false.

We could simply leave it at that, but it will be instructive to re-examine the best known application of "observation".


Schrödinger's Cat

Schrödinger's cat is only ever alive or dead. It is never both alive and dead. This was the point that Schrödinger attempted to make. Aristotle's law of noncontradiction applies: an object cannot both exist and not exist at the same time. We cannot prove this axiom from first principles, but if we don't accept it as an axiom, it renders all communication pointless. No matter what true statement I make, anyone can assert that the opposite is also true.

Schrödinger proposed his thought experiment as a reductio ad absurdum argument against Bohr and the others in Copenhagen. He was trying to show that belief in quantum superpositions leads to absurd, illogical consequences. He was right, in my opinion, but he did not win the argument (and nor will I).

This argument is broadly misunderstood outside of academic physics. This is because Schrödinger's criticism was taken up by physicists as an exemplification of the very effect it was intended to debunk. "Yes," cried the fans of Copenhagen type explanations, "this idea of both-alive-and-dead at the same time is exactly what we mean. Thanks." And so we got stuck with the idea that the cat is both alive and dead at the same time (which is nonsense). Poor old Schrödinger, he hated this idea (and didn't like cats) and now it is indelibly associated with him.

The general setup of the Schrödinger's cat thought experiment is that a cat is placed in a box. Inside the box, a random event may occur. If it occurs, the event triggers the death of the cat via a nefarious contraption. Once the cat is in the box, Alice doesn't know whether the cat is alive or dead. The cat is a metaphor for subatomic particles. We are supposed to believe that, before we measure them, they adopt a physical superposition of states: say, "spin up" and "spin down", or "position x" and "position y" at the same time. Then, at the point of measurement, they randomly adopt one or the other of the superposed states.

Here's the thing. The cat goes into the box alive. If the event happens, the cat dies. If it doesn't happen, the cat lives. And Alice doesn't know which until she opens the box. The uncertainty here is not metaphysical, it's epistemic. It's not that a cat can ever be in a state of both-alive-and-dead (it cannot); it's only that we don't know whether it is alive or dead. So this is a bad analogy.

Moreover, even when Alice opens the box, the light from the cat still takes some time to reach her eyes. Observation always trails behind events; it cannot anticipate or participate in events. Apart from reflected light, nothing is coming out from Alice that could participate in the sequence of events happening outside her body, let alone change the outcome.

Also, the cat has eyes and a brain. It is itself an "observer". 

Epistemic uncertainty cannot be mapped back to metaphysical uncertainty without doing violence to reason. A statement, "I don't know whether the cat is alive or dead," cannot be taken to imply that the cat is both alive and dead. This is definitely a category error for cats. Schrödinger's view was that it is also a category error for electrons and photons. And again, I agree with Schrödinger (and Einstein).

In that case, why do physics textbooks still insist on the nonsensical both-alive-and-dead scenario? It seems to be related to a built-in feature of the mathematics of spherical standing waves, which are at the heart of Schrödinger's equation (and many other features of modern science). The mathematics of standing waves was developed in the 18th century (i.e. it is thoroughly classical). Below, I quote from the MathWorld article on Laplace's equation (for a spherical standing wave) by Eric Weisstein (2025; emphasis added):

A function psi which satisfies Laplace's equation is said to be harmonic. A solution to Laplace's equation has the property that the average value over a spherical surface is equal to the value at the center of the sphere (Gauss's harmonic function theorem). Solutions have no local maxima or minima. Because Laplace's equation is linear, the superposition of any two solutions is also a solution.

The last sentence of this passage is similar to a frequently encountered claim in quantum physics, namely that solutions for individual quantum states can be added together to produce another valid solution of the wave equation. This is made out to be a special feature of quantum mechanics that defines the superposition of "particles".

Superposition of waves is nothing remarkable or "weird". Any time two water waves meet, for example, they superpose.


In this image, two wave fronts travel towards the viewer obliquely from the left and right at the same time (they appear to meet almost at right angles). The two waves create an interference pattern (the cross in the foreground) where the two waves are superposed. Waves routinely superpose. And this is known as the superposition principle.

The superposition principle, also known as superposition property, states that, for all linear systems, the net response caused by two or more stimuli is the sum of the responses that would have been caused by each stimulus individually.
The Penguin Dictionary of Physics.

For this kind of linear system, described by a function f, we can define superposition precisely: f(x) + f(y) = f(x + y)

In mathematical terms, each actual wave can be thought of as a solution to a wave equation. Because the wave equation is linear, the sum of two solutions is also a solution, and this is exactly the situation we see in the image: two waves physically adding together where they overlap, while at the same time retaining their identity.
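
A quick numerical check makes this concrete. The sketch below is my own illustration (the wave numbers, speed, and grid are arbitrary choices): it takes two travelling sine waves, each a solution of the one-dimensional wave equation, and confirms by finite differences that their sum also satisfies the equation.

```python
import numpy as np

# Check the superposition principle for the 1D wave equation u_tt = c^2 u_xx.
c = 1.0
x = np.linspace(0.0, 2.0 * np.pi, 1000)
t = 0.7
dx = x[1] - x[0]
dt = 1e-4

def travelling_wave(k):
    """u(x, t) = sin(k(x - ct)), a known solution of the wave equation."""
    return lambda x, t: np.sin(k * (x - c * t))

def residual(u):
    """Finite-difference estimate of max |u_tt - c^2 u_xx|; ~0 for a true solution."""
    u_tt = (u(x, t + dt) - 2 * u(x, t) + u(x, t - dt)) / dt**2
    u_xx = (u(x + dx, t) - 2 * u(x, t) + u(x - dx, t)) / dx**2
    return np.max(np.abs(u_tt - c**2 * u_xx))

u1 = travelling_wave(2)
u2 = travelling_wave(3)
u_sum = lambda x, t: u1(x, t) + u2(x, t)   # the superposed wave

for name, u in [("u1", u1), ("u2", u2), ("u1 + u2", u_sum)]:
    print(f"{name:8s} residual ≈ {residual(u):.1e}")
# All three residuals are tiny: the sum of two solutions is itself a solution.
```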

I've now identified three universal properties of spherical standing waves that are frequently presented as special features of quantum physics:

  • quantisation of energy
  • harmonics = higher energy states (aka orbitals)
  • superposition (of waves)

These structural properties of standing waves are not "secret", but they are almost always left out of narrative accounts of quantum physics. And yet, these are important intuitions to bring to bear when applying wave mechanics to describing real systems.
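
As a reminder of how classical these properties are, here is a minimal sketch of the first two for the simplest confined wave: a standing wave on a length L with fixed ends, the vibrating string of 18th-century mathematics. The length and wave speed below are arbitrary illustrative values.

```python
# Standing waves on a string of length L with fixed ends: only whole numbers
# of half-wavelengths fit, so the allowed wavelengths and frequencies are
# discrete ("quantised"), and the higher modes are the harmonics.

L = 1.0      # length of the confined region (m) -- arbitrary
v = 340.0    # wave speed (m/s) -- arbitrary

for n in range(1, 6):                  # n = 1 is the fundamental; n > 1 are harmonics
    wavelength = 2 * L / n             # only these wavelengths satisfy the boundary conditions
    frequency = n * v / (2 * L)        # the corresponding discrete frequencies
    print(f"mode n={n}: wavelength = {wavelength:.3f} m, frequency = {frequency:.1f} Hz")

# No intermediate frequencies are allowed. Quantisation and harmonics fall out
# of the boundary conditions on the wave, with nothing specifically quantum involved.
```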

Something else to keep in mind is that "quantisation" is an ad hoc assumption in quantum physics. It's postulated to be a fundamental feature of all quantum fields. The only problem is that all of the physical fields we know of—which is to say, the fields we can actually measure—are smooth and continuous across spacetime, including gravitational and electromagnetic fields. Scientists have imagined discontinuous or quantised fields, but they have never actually seen one.

Moreover, as far as I know, the only physical mechanism in our universe that is known to quantize energy, create harmonics, and allow for superposition is the standing wave. The logical deduction from these facts is that it is the standing wave structure of the atom that quantizes the energy of electrons and photons and creates electron orbitals. 

Quantization is a structural property of atoms, not a substantial property of fields. (Or more conventionally and less precisely, quantization is an emergent property, not a fundamental property). 

Also, as I have already explained, the coexistence of probabilities always occurs before any event, and those probabilities always collapse at the point when an event has a definite outcome. There is nothing "weird" about this; it's not a "problem". What is weird is the idea that hypostatising and reifying probabilities leads to some meaningful metaphysics. It has not, and it will not.

While the superposition of waves or probabilities is an everyday occurrence, the superposition of physical objects is another story. Physical objects occupy space in an exclusive way: if one object occupies a location, no other physical object can occupy that same location. Physical objects cannot superpose, and they are never observed to be superposed. And yet, the superposition of point particles is how physicists continue to explain the electron in an atom.

The electric field has been measured and is found to be smooth and continuous in spacetime, just as Maxwell predicted. Given this, simple logic and basic geometry dictate that if—

  1. the electrostatic field of the proton has spherical symmetry, and
  2. a hydrogen atom is electrostatically neutral, and
  3. the neutrality is assumed to be the result of the electron's electrostatic field,

—then the electron can only be in one configuration: it must be a sphere (or a close approximation of a sphere) completely surrounding the proton. This is the only way to ensure that all the field lines emerging from the proton terminate at the electron. Otherwise there are unbalanced forces, i.e. a net charge rather than neutrality. And a changing electric field dissipates energy, which electrons in atoms do not.

[Image: unbalanced forces]
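
For the geometric claim above, here is a minimal sketch using Coulomb's law and the shell theorem (standard electrostatics; the radius is an arbitrary illustrative value, roughly the Bohr radius). It shows that a spherical shell of charge -e surrounding a point charge +e produces no net field outside the shell, i.e. the combination is neutral as seen from outside.

```python
# Field of a point charge +e at the origin plus a thin spherical shell of
# total charge -e at radius R, treated with Coulomb's law / the shell theorem.

k = 8.9875e9       # Coulomb constant (N m^2 / C^2)
e = 1.602e-19      # elementary charge (C)
R = 5.3e-11        # shell radius, roughly the Bohr radius (m) -- illustrative

def field(r):
    """Radial E-field at distance r: +e point charge plus -e spherical shell."""
    E_proton = k * e / r**2
    E_shell = -k * e / r**2 if r > R else 0.0   # a uniform shell contributes nothing inside
    return E_proton + E_shell

for r in [0.5 * R, 2 * R, 10 * R]:
    print(f"r = {r:.2e} m: E = {field(r):.3e} N/C")
# Outside the shell the two contributions cancel exactly: no net external field.
```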

Now, if the electron is both a wave and a sphere, then the electron can only be a spherical standing wave. The Bohr model of the atom was incorrect and it surprises me greatly that this problem was not identified at the time. 

And if the electron is a spherical standing wave then, because these are universal features of standing waves, we expect the following (the first two points are illustrated in the sketch after this list):

  1. The energy of the electron in the H atom will be quantised.
  2. The electron will form harmonics corresponding to higher energy states and it will jump between them when it absorbs or emits photons.
  3. When two electron waves intersect, the sum of their amplitudes is also a solution to the wave equation.
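
For expectations (1) and (2), the standard empirical formula for hydrogen's quantised energy levels, E_n = -13.6 eV / n^2, lets us compute the photons emitted when the electron jumps between harmonics. The particular transitions below are just illustrative choices on my part.

```python
# Quantised energy levels of the hydrogen atom and the photons emitted when
# the electron drops from a higher harmonic to a lower one.

E1_eV = -13.6          # ground-state energy of hydrogen (eV)
HC_EV_NM = 1239.84     # h*c in eV*nm, to convert photon energy to wavelength

def energy(n):
    """Energy of the n-th allowed state, in eV: E_n = E1 / n^2."""
    return E1_eV / n**2

for n_hi, n_lo in [(2, 1), (3, 2), (4, 2)]:
    dE = abs(energy(n_hi) - energy(n_lo))    # photon energy for the jump (eV)
    wavelength = HC_EV_NM / dE               # photon wavelength (nm)
    print(f"{n_hi} -> {n_lo}: E = {dE:.2f} eV, wavelength ≈ {wavelength:.0f} nm")

# 3 -> 2 gives ~656 nm, the familiar red Balmer line of the hydrogen spectrum.
```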

Moreover, we can now take pictures of atoms using electron microscopes. Atoms are physical objects. In every single picture, atoms appear to be approximately spherical.


And yet mainstream quantum models do not quite treat atoms as real. Quantum physics is nowadays all about probabilities. The problem is that, as I established in an earlier essay, a probability cannot possibly balance an electrostatic field to create a neutral atom. Only a real electric field can do this. Schrödinger was right to be unconvinced by the probability interpretation, even if it works. But he was wrong about modelling a particle as a wave. 

Waves are observed to superpose all the time. Solid objects are never observed to do so. The only reason we even consider superposition for "particles" is the wave-particle duality postulate, which we now know to be inaccurate. "Particles" are waves.

As I understand it, the idea that our universe consists of 17 fields in which particles are "excitations" is a widely accepted postulate. And as such, one might have expected scientists to go back over the physics predicated on wave-particle duality and recast it in terms of only waves. Having the wave equation describe a wave would be a start.

I digress. Clearly the idea that observers influence outcomes is trivially false. So now we must turn to the common fudge of removing the observer from the observation.


Interaction as Observation

One way around the problems with observation is to redefine "observation" so that it excludes actual observations and observers; that is, to redefine "observation" to mean "some physical interaction". I'm sure I've mentioned this before because I used to think this was a good idea.

While we teach quantum physics in terms of isolated "particles" in empty, flat space, the fact is that the universe is crammed with matter and energy, especially in our part of the universe. Everything is interacting with everything that it can interact with, simultaneously in all the ways that it can interact, at every moment that it is possible to interact. Nothing in reality is ever simple.

In classical physics, we are used to being able to isolate experiments and exclude variables. This cannot ever happen at the nanoscale and below. An electron, for example, is surrounded by an electrostatic field which interacts with the fields around all other wavicles, near and far.

Electrons, for example, are all constantly pushing against each other via the electromagnetic force. If your apparatus contains electrons, their fields invariably interact with the electron you wish to study. This includes mirrors, beam-splitters, prisms, diffraction gratings, and double slits. The apparatus is not "classical"; it's part of the quantum system you study. At the nanoscale and below, there is no neutral apparatus.

Therefore, the idea that interaction causes the wavefunction to "collapse" is also untenable because in the real world wavicles are always interacting. In an H atom, for example, the electron and the proton are constantly and intensely interacting via the electromagnetic force. So the electron in an H atom could never be in a superposition.


Conclusions

Observation can only occur after the fact and is limited by the speed of light (or speed of causality).

Neither "observation" nor "consciousness" can play any role in the sequence of events, let alone a causal role.

Schrödinger's cat is never both alive and dead. And observation makes no difference to this (because observation can only ever be post hoc and acausal).

It is always the case, no matter what kind of system we are talking about, that probabilities for all possibilities coexist prior to an event and collapse as the event produces a specific outcome. But this is in no way analogous to waves superposing and should not be called "superposition".

All (linear) waves can superpose. All standing waves are quantised. All standing waves have harmonics.

Defining observation so as to eliminate the observer doesn't help as much as physicists might wish.

"Observation" is irrelevant to how we formulate physics.

The wave-particle duality postulate is still built into quantum mechanics, despite being known to be false.

For the last century, quantum physicists have been trying to change reality to fit their theory. Many different kinds of reality have been proposed to account for quantum theory: Copenhagen, Many Worlds, QBism, etc. I submit that proposing a wholly different reality to account for your theory is tantamount to insanity. The success in predicting probabilities seems to have caused physicists to abandon science. I don't get it, and I don't like it.

~~Φ~~


Bibliography

Weisstein, Eric W. (2025). "Laplace's Equation." MathWorld. https://mathworld.wolfram.com/LaplacesEquation.html
