
16 January 2026

How can a particle be in two places at once? (Superposition, Again)

Image of atom

A common question from lay people confronted with counterintuitive popular accounts of quantum physics is:
 
How can a particle be in two places at once?

The idea that a particle can be in two places at once is a common enough interpretation of the idea of quantum superposition, but this is not the only possible interpretation. Some physicists suggest that superposition means that we simply don't know the position, and some say that it means that the "position" is in fact smeared out into a kind of "cloud" (not an objective cloud). However, being in two places at once is an interpretation that lay people routinely encounter, and it has become firmly established in the popular imagination.

Note that while the idea is profoundly counterintuitive, physicists often scoff at intuition. Richard Feynman once said, "The universe is under no obligation to make sense to you." I suppose this is true enough, but it lets scientists off the hook too easily. The universe might be obligation-free, but science is not. I would argue precisely that science is obligated to make sense. For the first 350 years or so, science was all about making sense of empirical data. This approach was consciously rejected by Werner Heisenberg, Max Born, and Niels Bohr, who arrived instead at anti-realist conclusions.

But here's the thing. Atoms are unambiguously and unequivocally objective (their existence and properties are independent of the observer). We even have images of individual atoms now (above right). Electrons, protons, neutrons, and neutrinos are all objective entities. They exist, they persist, they take part in causal relations, and we can measure their physical properties such as mass, spin, and charge. The spectral absorption/emission lines associated with each atom are also objective.

It was the existence of emission lines, along with the photoelectric effect, that led Planck, Einstein, and Bohr to propose the first quantum theories of light and the atom. And if these lines are objective, then we expect them to have an objective cause. And since they obviously form a harmonic series, we ought to associate the lines with objective standing waves. The mathematics used to describe and predict the lines does describe a standing wave, but for reasons that are still not clear to me, physicists deny that an objective standing wave is involved. The standing wave is merely a mathematical calculation tool. Quantum mechanics is an antirealist scientific theory, which is an oxymoron.

However, we may say that if an entity like the atom in the image above has mass, then that mass has to be somewhere at all times. It may be relatively concentrated or distributed with respect to the centre of mass, but it is always somewhere. Mass is not abstract. Mass is physical and objective. Mass can definitely not be in two places at once. Similarly, electrical charge is a fundamental physical property. It also has to be somewhere. If we deny these objective facts, then all of physics goes down the toilet.

Moreover, if that entity with mass and charge is not at absolute zero, then it has kinetic energy: it is moving. If it is moving, that movement has a speed and a direction (i.e. velocity). At the nanoscale, there is built-in uncertainty regarding knowing both position and velocity at the same time, but we can, for example, know precisely where an electron is when it hits a detector (at the cost of not knowing its speed and direction at that moment).

Quantum theory treats such objective physical entities as abstractions. Bohr convinced his colleagues that we cannot have a realist theory of the subatomic. It's not something anyone can describe because it's beyond our ability to sense. This was long before images of atoms were available. 

The story of how we came to have an anti-realist theory of these objective entities and their objective behaviour would take me too far from my purpose in this essay, but it's something to contemplate. Mara Beller's book Quantum Dialogue goes into this issue in detail. Specifically, she points to the covert influence of logical positivism on the entire Copenhagen group.

The proposition that a particle can be in two places at once is not only wildly counterintuitive, but it breaks one of Aristotle's principles of reasoning: the principle of noncontradiction. This leaves logic in tatters and reduces knowledge to trivia. Lay people can only be confused by this, but I think that, secretly, many physicists are also confused.

To be clear:

  • No particle has ever been observed to be in different locations at the same time. When we observe particles, they are always in one place and (for example, in a cloud chamber) appear to follow a trajectory. Neither the location nor the trajectory is described by quantum physics.
  • No particle has ever been predicted to be in different locations at the same time. The Schrödinger equation simply cannot give us information about where a particle is.

So the question is, why do scientists like to say that quantum physics means that a particle can be in two places, or in two "states"*, at one time? To answer this, we need to look at the procedures that are employed in quantum mechanics and note a rather strange conclusion.

* One has to be cautious of the word "state" in this context, since it refers only to the mathematical description, not to the physical state of a system. And the distinction is seldom, if ever, noted in popular accounts.

What follows will involve some high school-level maths and physics.


The Schrödinger Equation

Heisenberg and Schrödinger developed their mathematical models to try to explain why the photons emitted by atoms have a specific quantum of energy (the spectral emission lines) rather than an arbitrary energy. Heisenberg used matrices and Schrödinger used differential equations, but the two approaches amount to the same thing. Even when discussing Schrödinger's differential equation, physicists still use matrix jargon like "eigenfunctions" indiscriminately.

The Schrödinger equation can take many forms, which does not help the layperson. However, the exact form doesn't matter for my purposes. What does matter is that they all include a Greek letter psi 𝜓. Here, 𝜓 is not a variable of the type we encounter in classical physics; it is a mathematical function. Physicists call 𝜓 the wavefunction. Let's dig into what this means.


Functions

A function, often denoted by f, is a mathematical rule. In high school mathematics, we all learn about simple algebraic functions of the type:

f(x) = x + 1

This rule says: whatever the current value of x is, take that value and add 1 to it.

So if x = 1 and we apply the rule, then f(x) = 2. If x = 2.5, then f(x) = 3.5. And so on.
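The rule can also be written in code. This is my own minimal sketch, not part of the original argument, but it shows the same thing: a function is just a rule applied to whatever value it is given.

```python
# A function is a rule: take the current value of x and add 1 to it.
def f(x):
    return x + 1

print(f(1))    # 2
print(f(2.5))  # 3.5
```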

A function can involve any valid mathematical operation or combinations of them. And there is no theoretical limit on how complex a function can be. I've seen functions that take up whole pages of books.

We often meet this formalism in the context of a Cartesian graph. For example, if the height of a line on a graph is proportional to its length along the x-axis, then we can express this mathematically by saying that y is a function of x. In maths notation:

y = f(x); where f(x) = x + 1.

Or simply: y = x + 1.

This particular function describes a line at +45° that crosses the y-axis at y = 1. Note also that if the height (y) and length (x) are treated as the two orthogonal sides of a right triangle, then we can begin to use trigonometry to describe how they change in relation to each other. Additionally, we can treat (x,y) as a matrix or as the description of a vector.

In physics, we would physically interpret an expression like y = x + 1 as showing how the value of y is proportional to the value of x. We also use calculus to show how one variable changes over time with respect to another, but I needn't go into this.


Wavefunctions and Hilbert Spaces

The wavefunction 𝜓 is a mathematical rule (where 𝜓 is the Greek letter psi, pronounced like "sigh"). If we specify it in terms of location on the x-axis, 𝜓(x) gives us one complex number (a + bi, where i = √-1) for every possible value of x. And unless otherwise specified, x can be any real number, which we write as x ∈ ℝ (read as "x is a member of the set of real numbers"). In practice, we usually specify a limited range of values for x.

All the values of 𝜓(x), taken together, can be considered to define a vector in an abstract notional "space" we call a Hilbert space, after the mathematician David Hilbert. The quantum Hilbert space has as many dimensions as there are values of x, and since x ∈ ℝ, this means it has infinitely many dimensions. A "space" with infinitely many dimensions seems totally unwieldy at first glance, but in fact this construction allows physicists to treat 𝜓(x) as a single mathematical object and do maths with it. It is this property that allows us to talk about operations like adding two wavefunctions (which becomes important below).
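One way to make this concrete is to sketch it numerically. This is my own illustration: a finite grid of 1000 points stands in for the infinitely many values of x, so the "Hilbert space" here is drastically truncated to 1000 dimensions.

```python
import numpy as np

# A finite grid standing in for the infinitely many values of x.
# Each sample of psi(x) is one component of a vector: one dimension
# of the (here drastically truncated) Hilbert space per value of x.
x = np.linspace(0.0, np.pi, 1000)

psi1 = np.sin(x)        # one candidate wavefunction (real-valued here)
psi2 = np.sin(2.0 * x)  # another

# Because each psi is now a single mathematical object (a vector),
# we can do maths with it, e.g. add two wavefunctions component-wise.
psi = psi1 + psi2
```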

We have to be careful here. In quantum mechanics, 𝜓 does not describe an objective, physical wave in space. Hilbert space is not an objective space. This is all just abstract mathematics. Moreover, there isn't an a priori universal Hilbert space containing every possible 𝜓. Every system produces a distinct abstract space.

That said, Sean Carroll and other proponents of the so-called "Many Worlds" interpretation first take the step of defining the system of interest as "the entire universe" and notionally assign this system a wavefunction 𝜓universe. However, there is no way to write down an actual mathematical function for such an entity since it would have infinitely many variables. Even if we could write it down, there is no way to compute any results from such a function: it has no practical value. In gaining a realist ontology, we lose all ability to get information without introducing massive simplifications. Formally, you can define a universal 𝜓. But in practice, to get predictions, you always reduce to a local system, which is nothing other than ordinary quantum mechanics without the Many Worlds metaphysical overlay. So in practice, Many Worlds offers no advantage over "shut up and calculate". And since the Many Worlds ontology is extremely bizarre, I fail to see the attraction.

It is axiomatic for the standard textbook approach to quantum mechanics—deriving from the so-called "Copenhagen interpretation"—that there is no objective interpretation of 𝜓. Neutrally, we may say that the maths needn't correspond to anything in the world, it just happens to give the right answers. The maths itself is agnostic; it doesn't require any physical interpretation. Bohr and co positivistically insisted that it's not possible to have a physical interpretation because we cannot know the world on that scale.

As readers likely know, the physics community is deeply divided over (a) the possibility of realist interpretations, i.e. the issue of 𝜓-ontology and (b) which, if any, realist interpretation of 𝜓 is the right one. There is a vast amount of confusion and disagreement amongst physicists themselves over what the maths represents, which does not help the layperson at all. But again, we can skip over this and stay focussed on the goal.


The Schrödinger equation in Practice

To make use of the Schrödinger equation, a physicist must carefully consider what kind of system they are interested in and define 𝜓 so that it describes that system. Obviously, this selection is crucial for getting accurate results. And this is a point we have to come back to.

When we set out to model an electron in a hydrogen atom, for example, we have to choose an expression for 𝜓 whose outputs correspond to the abstract mathematical "state" of that electron. There's no point in choosing some other expression, because it won't give accurate results. Ideally, there is one and only one expression that perfectly describes the system, but in practice, there may be many others that approximate it.

For the sake of this essay, I will discuss the case in which 𝜓 is a function of location. In one dimension, we can state this as: 𝜓(x). When working in three spatial dimensions and one time dimension, for technical reasons, we use spherical spatial coordinates, which are two angles and a length, as well as time: 𝜓(r,θ,φ,t). The three-dimensional maths is challenging, and physicists are not generally required to be able to derive these results from scratch. They only need to know how to apply the end results.

Schrödinger himself began by describing an electron trapped in a one-dimensional box, as perhaps the simplest example of a quantum system (this is an example of a spherical cow approximation). This is very often the first actual calculation that students of quantum mechanics perform. How do we choose the correct expression for this system? In practice, this (somewhat ironically) can involve using approximations derived from classical physics, as well as some trial and error.

We know that the electron is a wave, and so we expect it to oscillate with something like harmonic motion. In simple harmonic motion, the height of the wave on the y-axis changes as the sine of the position of the particle on the x-axis.

One of the simplest equations that satisfies our requirements, therefore, would be 𝜓(x) = sin x, though we must specify lower and upper limits for x reflecting the scale of the box.
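The particle-in-a-box scenario can be sketched numerically. This is my own illustration, using the textbook standing-wave modes and arbitrary units (L = 1 is an assumption made purely for the sketch).

```python
import numpy as np

# Particle in a 1-D box of length L: the "spherical cow" of quantum
# mechanics. Units are arbitrary; L = 1 is assumed for illustration.
L = 1.0
x = np.linspace(0.0, L, 100_000)
dx = x[1] - x[0]

def psi(n, x):
    """The n-th standing-wave mode: zero at both walls, as required."""
    return np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)

# Each mode is normalised: integrating |psi|^2 over the box gives 1.
total = (psi(3, x) ** 2).sum() * dx
print(round(total, 3))  # 1.0
```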

However, it is not enough to specify the wavefunction and solve it as we might do in wave mechanics. Rather, we first need to do another procedure. We apply an operator to the wavefunction.

Just as a function is a rule applied to a number to produce another number, an operator is a rule applied to a function that produces another function. In this method, we identify operators by giving them a "hat".

So, if p is momentum (for historical reasons), then the operator that we apply to the wavefunction so that it gives us information about momentum is p̂. And we can express this application as p̂𝜓. For my purposes, further details on operators (including Dirac notation) don't matter. However, we may say that this is a powerful mathematical approach that allows us to extract information about any measurable property for which an operator can be defined, from just one underlying function. It's actually pretty cool.

There is one more step, which is applying the Born rule. Again, for the purposes of this essay, we don't need to say more about this, except that when we solve p̂𝜓, the result is a vector (a quantity + a direction). The length of this vector is the amplitude associated with finding momentum p when we make a measurement at x. Applying the Born rule (squaring the amplitude) gives us the actual probability.

So the procedure for using the Schrödinger equation has several steps. Using the example of 𝜓(x), and finding the momentum p at some location x, we get something like this:

  • Identify an appropriate mathematical expression for the wavefunction 𝜓(x).
  • Apply the momentum operator: p̂𝜓(x).
  • Solve the resulting function (which gives us a vector).
  • Apply the Born Rule to obtain a probability.
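The last step, the Born rule, can be sketched numerically. This is my own illustration, reusing the box mode from earlier and asking about position rather than momentum for simplicity (my choice, not part of the procedure above).

```python
import numpy as np

# Ground-state mode of the 1-D box (assumed form, arbitrary units).
L = 1.0
x = np.linspace(0.0, L, 100_000)
dx = x[1] - x[0]
psi = np.sqrt(2.0 / L) * np.sin(np.pi * x / L)

# Born rule: |psi(x)|^2, integrated over a region, gives the
# probability of finding the particle there *when we measure*.
# It says nothing about where the particle "is" beforehand.
region = x < L / 2
prob = (np.abs(psi[region]) ** 2).sum() * dx
print(round(prob, 3))  # 0.5, by symmetry
```

Note that we had to supply the region of interest ourselves; the formalism only converts our choice into a probability.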

So far so good (I hope).

To address the question—How can a particle be in two places at once?—we need to go back to step one.


Superposition is Neither Super nor Related to Position.

It is de rigueur to portray superposition as a description of a physical situation, but this is not what was intended. For example, Dirac's famous quantum mechanics textbook presents superposition as an a priori requirement of the theory, not a consequence of it. Any wavefunction 𝜓 must, by definition, be capable of being written as a combination of two or more other wavefunctions: 𝜓 = 𝜓₁ + 𝜓₂. Dirac simply stated this as an axiom. He offered no proof, no evidence, no argument, and no rationale.

We might do this with a problem where using one 𝜓 results in overly complicated maths. For example, it's common to treat the double-slit experiment as two distinct systems involving slit 1 and slit 2: we might say that 𝜓₁ describes a particle going only through slit 1, and 𝜓₂ describes a particle going only through slit 2. The standard defence in this context looks like this:

  • The interference pattern is real.
  • The calculation that predicts it more or less requires 𝜓 = 𝜓₁ + 𝜓₂.
  • Therefore, the physical state of the system before measurement must somehow correspond to 𝜓₁ + 𝜓₂.

But the last step is exactly the kind of logic that quantum mechanics itself has forbidden. We cannot say what the state of the system is prior to measuring it. Ergo, we cannot say where the particle is before we measure it, and we definitely cannot say it's in two places at once.

To be clear, 𝜓 = 𝜓₁ + 𝜓₂ is a purely mathematical exercise that has no physical objective counterpart. According to the formalism, 𝜓 is not an objective wave. So how can 𝜓₁ + 𝜓₂ have any objective meaning? It cannot. Anything said about a particle "being in multiple states at once", or "taking both/many paths", or "being in two places at once" is all just interpretive speculation. We don't know. And the historically dominant paradigm tells us that we cannot know and we should not even ask.

To be clear, the Schrödinger equation does not and cannot tell us what happens during the double-slit experiment. It can only tell us the probable outcome. The fact that the objective effect appears to be caused by interference while the mathematical formalism involves 𝜓₁ + 𝜓₂ is entirely coincidental (according to the dominant paradigm).

Dirac fully embraced the idea that quantum mechanics is purely about calculating probabilities and that it is not any kind of physical description. A physical description of matter on the sub-atomic scale is not possible in this view, and his goal did not involve providing any such thing. His goal was only to perfect and canonise the mathematics that Heisenberg and Born had presented as a fait accompli in 1927:

“We regard quantum mechanics as a complete theory for which the fundamental physical and mathematical hypotheses are no longer susceptible of modification.”—Report delivered at the 1927 Solvay Conference.

I noted above that we have to specify some expression for 𝜓 that makes sense for the system of interest. If the expression is for some kind of harmonic motion, then we must specify things like the amplitude, frequency, direction of travel, and phase. Our choices here are not, and cannot be, derived from first principles. Rather, they must be arbitrarily specified by the physicist.

Now, there are infinitely many expressions of the type 𝜓(x) = sin(x). We can specify amplitude, etc., to any arbitrary level of detail.

  • The function 𝜓(x) = 2 sin(x) will have twice the amplitude.
  • The function 𝜓(x) = sin(2x) will have twice the frequency.
  • The function 𝜓(x) = sin(-x) will travel in the opposite direction.

And so on.

A physicist may use general knowledge and a variety of rules of thumb to decide which exact function suits their purposes. As noted, this may involve using approximations derived from classical physics. We need to be clear that nothing in the quantum mechanical formalism can tell us where a particle is at a given time or when it will arrive at a given location. Whoever is doing the calculation has to supply this information.

Obviously, there are very many expressions that could be used. But in the final analysis, we need to decide which expression is ideal, or most nearly so. 

For a function like 𝜓(x) = sin(x), for example, we can add some variables: 𝜓(x) = A sin(kx), where A can be understood as a scaling factor for amplitude, and k as a scaling factor for frequency. Both A and k can be any real number (A ∈ ℝ and k ∈ ℝ).

Even this very simple example clearly has an infinite number of possible variations, since ℝ is an infinite set. There are infinitely many possible functions 𝜓₁, 𝜓₂, 𝜓₃, and so on. Moreover, because of the nature of the mathematics involved, if 𝜓₁ and 𝜓₂ are both valid functions, then 𝜓₁ + 𝜓₂ is also a valid function. It was this property of linear differential equations that Dirac sought to canonise as superposition.
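The linearity property can be sketched numerically. This is my own illustration (not Dirac's): it checks that the density of a summed wavefunction contains a cross term, which is where the "interference" appears in the mathematics.

```python
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 1000)

# Two valid (complex-valued) wavefunctions on the grid.
psi1 = np.exp(1j * x)
psi2 = np.exp(2j * x)

# Linearity: the sum is also a valid wavefunction.
psi = psi1 + psi2

# But the Born-rule density of the sum is NOT the sum of the densities:
#   |psi1 + psi2|^2 = |psi1|^2 + |psi2|^2 + 2 Re(conj(psi1) * psi2)
# The cross term is the mathematical source of "interference".
cross = 2.0 * np.real(np.conj(psi1) * psi2)
assert np.allclose(np.abs(psi) ** 2,
                   np.abs(psi1) ** 2 + np.abs(psi2) ** 2 + cross)
```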

To my mind, there is an epistemic problem in that we have to identify the ideal expression from amongst the infinite possibilities. And having chosen one expression, we then perform a calculation, and it outputs probabilities for measurable quantities.

The 𝜓-ontologists try to turn this into a metaphysical problem. Sean Carroll likes to say "the wavefunction is real". 𝜓-ontologists then make the move that causes all the problems, i.e. they speculatively assert that the system is in all of these states until we specify (or measure) one. And thus "superposition" goes from being a mathematical abstraction to being an objective phenomenon, and it's only one more step to saying things like "a particle can be in two places at once".

I hope I've shown that such statements are incoherent at face value. But I hope I've also made clear that such claims are incoherent in terms of quantum theory itself, since the Schrödinger equation can never under any circumstances tell us where a particle is, only the probability of finding it in some volume of space that we have to specify in advance. 


Conclusion

The idea that a particle can be in two places at once is clearly nonsense even by the criteria of the quantum mechanics formalism itself. The whole point of denying the relevance of realism was to avoid making definite statements about what is physically happening on a scale that we can neither see nor imagine (according to the logical positivists).

So coming up with a definite, objective interpretation—like particles that are in two places at once—flies in the face of the whole enterprise of quantum mechanics. The fact that the conclusion is bizarre is incidental since it is incoherent to begin with.

The problem is that while particles are objective, our theory is entirely abstract. Particles have mass. Mass is not an abstraction; mass has to be somewhere. So we need an objective theory to describe this. Quantum mechanics is simply not that theory. And nor is quantum field theory.

I'm told that mathematically, Dirac's canonisation of superposition was a necessary move. And to be fair, the calculations do work as advertised. One can accurately and precisely calculate probabilities with this method. But no one has any idea what this means in physical terms; no one knows why it works or what causes the phenomena it is supposed to describe. When Richard Feynman said "No one understands quantum mechanics", this is what he meant. And nothing has changed since he said it.

It would help if scientists themselves could stop saying stupid things like "particles can be in two places at once". No, particles cannot be in two places at once, and nothing about quantum mechanics makes this true. There is simply no way for quantum mathematics, as we currently understand it, to tell us anything at all about where a particle is. The location of interest is something that the physicist doing the calculation has to supply for the Schrödinger equation, not something the equation can tell us (unlike in classical mechanics).

And if the equation cannot tell us the location of the particle, under any circumstances, then it certainly cannot tell us that it is in two places or many places. Simple logic alone tells us this much.

The Schrödinger equation can only provide us with probabilities. While there are a number of possible mathematical "states" the particle can be in, we do not know which one it is in until we measure it.

If we take Dirac and co at face value, then stating any pre-measurement physical fact is simply a contradiction in terms. Pretending that this is not problematic is itself a major problem. Had we been making steady progress towards some kind of resolution, it might be less ridiculous. But the fact is that a century has passed since quantum mechanics was proposed and physicists still have no idea how or why it works but still accept that "the fundamental physical and mathematical hypotheses are no longer susceptible of modification."

Feynman might have been right when he said that the universe is not obligated to make sense. But the fact is that science is obligated to make sense. That used to be the whole point of science, and it still is in every branch of science other than quantum mechanics. No one says of evolutionary theory, for example, that it is all a mysterious black box that we cannot possibly understand. And no one would accept this as an answer. Indeed, a famous cartoon by Sydney Harris gently mocks this attitude...


The many metaphysical speculations that are termed "interpretations of quantum mechanics" all take the mathematical formalism that explicitly divorces quantum mechanics from realism as canonical and inviolable. And then they all fail miserably to say anything at all about reality. And this is where we are.

It is disappointing, to say the least.

~~Φ~~

30 May 2025

Theory is Approximation

A farmer wants to increase milk production. They ask a physicist for advice. The physicist visits the farm, takes a lot of notes, draws some diagrams, then says, "OK, I need to do some calculations."

A week later, the physicist comes back and says, "I've solved the problem and I can tell you how to increase milk production".

"Great", says the farmer, "How?".

"First", says the physicist, "assume a spherical cow in a vacuum..."

What is Science?

Science is many things to many people. At times, scientists (or, at least, science enthusiasts) seem to claim that they alone know the truth of reality. Some seem to assume that "laws of science" are equivalent to laws of nature. Some go as far as stating that nature is governed by such "laws". 

Some believe that only scientific facts are true and that no metaphysics are possible. While this view is less common now, it was of major importance in the formulation of quantum theory, which still has problems admitting that reality exists. As Mara Beller (1996) notes:

Strong realistic and positivistic strands are present in the writings of the founders of the quantum revolution—Bohr, Heisenberg, Pauli and Born. Militant positivistic declarations are frequently followed by fervent denial of adherence to positivism (183).

On the other hand, some see science as theory-laden and sociologically determined. Science is just one knowledge system amongst many of equal value. 

However, most of us understand that scientific theories are descriptive and idealised. And this is the starting point for me. 

In practising science, I had ample opportunity to witness hundreds or even thousands of objective (or observer-independent) facts about the world. The great virtue of the scientific experiment is that you get the same result, within an inherent margin of error associated with measurement, no matter who does the experiment or how many times they do it. The simplest explanation of this phenomenon is that the objective world exists and that such facts are consistent with reality. Thus, I take knowledge of such facts to constitute knowledge about reality. The usual label for this view is metaphysical realism.

However, I don't take this to be the end of the story. Realism has a major problem, identified by David Hume in the 1700s. The problem is that we cannot know reality directly; we can only know it through experience. Immanuel Kant's solution to this has been enormously influential. He argues that while reality exists, we cannot know it. In Kant's view, those qualities and quantities we take to be metaphysical—e.g. space, time, causality, etc.—actually come from our own minds. They are ideas that we impose on experience to make sense of it. This view is known as transcendental idealism. One can see how denying the possibility of metaphysics (positivism) might be seen as (one possible) extension of this view. 

It's important not to confuse this view with the idea that only mind is real. This is the basic idea of metaphysical idealism. Kant believed that there is a real world, but we can never know it. In my terms, there is no epistemic privilege.

Where Kant falls down is that he lacks any obvious mechanism to account for shared experiences and intersubjectivity (the common understanding that emerges from shared experiences). We do have shared experiences. Any scenario in which large numbers of people do coordinated movements can illustrate what I mean. For example, 10,000 spectators at a tennis match turning their heads in unison to watch a ball be batted back and forth. If the ball is not objective, or observer-independent, how do the observers manage to coordinate their movements? While Kant himself argues against solipsism, his philosophy doesn't seem to consider the possibility of comparing notes on experience, which places severe limits on his idea. I've written about this in Buddhism & The Limits of Transcendental Idealism (1 April 2016).

In a pragmatic view, then, science is not about finding absolute truths or transcendental laws. Science is about idealising problems in such a way as to make a useful approximation of reality. And constantly improving such approximations. Scientists use these approximations to suggest causal explanations for phenomena. And finally, we apply the understanding gained to our lives in the form of beliefs, practices, and technologies. 


What is an explanation?

In the 18th and 19th centuries, scientists confidently referred to their approximations as "laws". At the time, a mechanistic universe and transcendental laws seemed plausible. They were also gathering the low-hanging fruit: those processes which are most obviously consistent and amenable to mathematical treatment. By the 20th century, as mechanistic thinking waned, new approximations were referred to as "theories" (though legacy use of "law" continued). And more recently, under the influence of computers, the term "model" has become more prevalent.

A scientific theory provides an explanation for some aspect of reality, which allows us to understand (and thus predict) how what we observe will change over time. However, even the notion of explanation requires some unpacking.

In my essay, Does Buddhism Provide Good Explanations? (3 Feb 23), I noted Faye's (2007) typology of explanation:

  • Formal-Logical Mode of Explanation: A explains B if B can be inferred from A using deduction.
  • Ontological Mode of Explanation: A explains B if A is the cause of B.
  • Pragmatic Mode of Explanation: a good explanation is an utterance that addresses a particular question, asked by a particular person whose rational needs (especially for understanding) must be satisfied by the answer.

In this essay, I'm striving towards the pragmatic mode and trying to answer my own questions.

Much earlier (18 Feb 2011), I outlined an argument by Thomas Lawson and Robert McCauley (1990) which distinguished explanation from interpretation.

  • Explanationist: Knowledge is the discovery of causal laws, and interpretive efforts simply get in the way.
  • Interpretationist: Inquiry about human life and thought occurs in irreducible frameworks of values and subjectivity. 

"When people seek better interpretations they attempt to employ the categories they have in better ways. By contrast, when people seek better explanations they go beyond the rearrangement of categories; they generate new theories which will, if successful, replace or even eliminate the conceptual scheme with which they presently operate." (Lawson & McCauley 1990: 29)

The two camps are often hostile to each other, though some intermediate positions exist between them. As I noted, Lawson and McCauley see this as somewhat performative:

Interpretation presupposes a body of explanation (of facts and laws), and seeks to (re)organise empirical knowledge. Explanation always contains an element of interpretation, but successful explanations winnow and increase knowledge. The two processes are not mutually exclusive, but interrelated, and both are necessary.

This is especially true for physics where explanations often take the form of mathematical equations that don't make sense without commentary/interpretation.  


Scientific Explanation

Science mainly operates, or aims to operate, in the ontological/causal mode of explanation: A explains B if (and only if) A is the cause of B. However, it still has to satisfy the conditions for being a good pragmatic explanation:  "a good explanation is an utterance that addresses a particular question, asked by a particular person whose rational needs (especially for understanding) must be satisfied by the answer."

As noted in my opening anecdote, scientific models are based on idealisation, in which an intractably complex problem is idealised until it becomes tractable. For example, in kinematic problems, we often assume that the centre of mass of an object is where all the mass is. It turns out that when we treat objects as point masses in kinematics problems, the computations are much simpler and the results are sufficiently accurate and precise for most purposes. 

Another commonly used idealisation is the assumption that the universe is homogeneous or isotropic at large scales. In other words, as we peer out into the farthest depths of space, we assume that matter and energy are evenly distributed. As I will show in the forthcoming essay, this assumption seems to be both theoretically and empirically false. And it seems that so-called "dark energy" is merely an artefact of this simplifying assumption. 

Many theories have fallen because a simplifying assumption distorted the answers enough to make them unsatisfying.

A "spherical cow in a vacuum" sounds funny, but a good approximation can simplify a problem just enough to make it tractable and still provide sufficient accuracy and precision for our purposes. It's not that we should never idealise a scenario or make simplifying assumptions. The fact is that we always do this. All physical theories involve starting assumptions. Rather, the argument is pragmatic. The extent to which we idealise problems is determined by the ability of the model to explain phenomena to the level of accuracy and precision that our questions require. 

For example, if our question is, "How do we get a satellite into orbit around the moon?" we have a classic "three-body" problem (with four bodies: Earth, moon, sun, and satellite). Such problems are mathematically very difficult to solve. So we have to idealise and simplify the problem. For example, we can decide to ignore the gravitational attraction caused by the satellite, which is real but tiny. We can assume that space is relatively flat throughout. We can note that relativistic effects are also real but tiny. We don't have to slavishly use the most complex explanation for everything. Given our starting assumptions, we can just use Newton's law of gravitation to calculate orbits. 
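To make the idealisation concrete, here is a minimal sketch of the simplified calculation. Treating the Moon as a point mass and ignoring the Earth, the Sun, the satellite's own gravity, and relativistic effects, Newton's law of gravitation gives the speed needed for a circular orbit. The constants are illustrative round figures, not mission-grade values.

```python
import math

# Idealised circular orbit: the Moon as a point mass, everything else ignored.
# Illustrative round-number constants (not mission-grade values).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_MOON = 7.342e22    # mass of the Moon, kg
R_MOON = 1.737e6     # radius of the Moon, m

def circular_orbit_speed(altitude_m: float) -> float:
    """Speed for a circular orbit from Newton's law of gravitation:
    GMm/r^2 = mv^2/r, hence v = sqrt(GM/r)."""
    r = R_MOON + altitude_m
    return math.sqrt(G * M_MOON / r)

v = circular_orbit_speed(100e3)  # 100 km above the lunar surface
print(f"{v:.0f} m/s")            # roughly 1.6 km/s
```

Despite all the simplifying assumptions, the answer is accurate enough for a first pass at mission planning; refinements are added only when the question demands them.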

We got to relativity precisely because someone asked a question that Newtonian approaches could not explain, i.e. why does the orbit of Mercury precess, and at what rate? In the Newtonian approximation, the orbit doesn't precess. But in Einstein's reformulation of gravity as the geometry of spacetime, a precession is expected and can be calculated. 


Models

I was in a physical chemistry class in 1986 when I realised that what I had been learning through school and university was a series of increasingly sophisticated models, and the latest model (quantum physics) was still a model. At no point did we get to reality. There did seem to me to be a reality beyond the models, but it seemed to be forever out of reach. I had next to no knowledge of philosophy at that point, so I struggled to articulate this thought, and I found it dispiriting. In writing this essay, I am completing a circle that I began as a naive 20-year-old student.

This intuition about science crystallised into the idea that no one has epistemic privilege. By this I mean that no one—gurus and scientists included—has privileged access to reality. Reality is inaccessible to everyone. No one knows the nature of reality or the extent of it. 

We all accumulate data via the same array of physical senses. That data feeds virtual models of world and self created by the brain. Those models then feed information to our first-person perspective, using the sensory apparatus of the brain to present images to the mind's eye. This means that what we "see" is at least two steps removed from reality. This limit applies to everyone, all the time.

However, when we compare notes on our experience, it's clear that some aspects of experience are independent of any individual observer (objective) and some of them are particular to individual observers (subjective). By focusing on and comparing notes about the objective aspects of experience, we can make reliable inferences about how the world works. This is what rescues metaphysics from positivism on one hand and superstition on the other. 

We can all make inferences from sense data. And we are able to make inferences that prove to be reliable guides to navigating the world and allow us to make satisfying causal explanations of phenomena. Science is an extension of this capacity, with added concern for accuracy, precision, and measurement error. 

Since reality is the same for everyone, valid models of reality should point in the same direction. Perhaps different approaches will highlight different aspects of reality, but we will be able to see how those aspects are related. This is generally the case for science. A theory about one aspect of reality has to be consistent, even compliant, with all the other aspects. Or if one theory is stubbornly out of sync, then that theory has to change, or all of science has to change. Famously, Einstein discovered several ways in which science had to change. For example, Einstein proved that time is particular rather than universal.  Every point in space has its own time. And this led to a general reconsideration of the role of time in our models and explanations. 


Sources of Error

A scientific measurement is always accompanied by an estimate of the error inherent in the measurement apparatus and procedure. Which gives us a nice heuristic: If a measurement you are looking at is not accompanied by an indication of the errors, then the measurement is either not scientific, or it has been decontextualised and, with the loss of this information, has been rendered effectively unscientific.

Part of every good scientific experiment is identifying sources of error and trying to eliminate or minimise them. For example, if I measure my height with three different rulers, will they all give the same answer? Perhaps I slumped a little on the second measurement? Perhaps the factory glitched, and one of the rulers is faulty? 

In practice, a measurement is accurate to some degree, precise to some degree, and contains inherent measurement error to some degree. And each degree should be specified to the extent that it is known.

Accuracy is itself a quantity: it reflects how close the measurement is to reality. 

Precision represents how finely we are making distinctions in quantity.

Measurement error reflects uncertainty introduced into the measurement process by the apparatus and the procedure.

Now, precision is relatively easy to know and control. We often use the heuristic that a ruler is precise to half the smallest division. So a ruler marked with millimetres is considered precise to 0.5 mm. 

Let's say I want to measure the width of my tea cup. I have three different rulers. But I also note that the cup has rounded edges, so knowing where to measure from is a judgment call. I estimate that this will add a further 1 mm of error. Here are my results: 

  • 83.5 ± 1.5 mm
  • 86.0 ± 1.5 mm
  • 84.5 ± 1.5 mm

The average is 84.7 ± 1.5 mm. So we would say that we think the true answer lies between 83.2 and 86.2 mm. And note that even though I have an outlier (86.0 mm), it is in fact within the margin of error. 

As I was measuring, I noted another potential source of error. I was guesstimating where the widest point was. And I think this probably adds another 1-2 mm of measurement error. When considering sources of error in a measurement, one's measurement procedure is often a source. In science, clearly stating one's procedure allows others to notice problems the scientists might have overlooked. Here, I might have decided to mark the cup so that I measured at the same point each time. 

Now the trick is that there is no way to get behind the measurement and check with reality. So, accuracy has to be defined pragmatically as well. One way is to rely on statistics. For example, one makes many measurements and presents the mean value and the standard deviation (which requires more than three measurements). 
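As a small illustration of the statistical approach, here is how the mean and sample standard deviation of measurements like my three cup widths can be computed with the Python standard library. With only three data points the standard deviation is a very rough estimate, which is why more measurements are needed in practice.

```python
import statistics

# The three cup-width measurements, in mm (each carrying ±1.5 mm of
# estimated instrument and procedure error).
measurements = [83.5, 86.0, 84.5]

mean = statistics.mean(measurements)    # best estimate of the true width
stdev = statistics.stdev(measurements)  # sample standard deviation (n - 1)

print(f"{mean:.1f} ± {stdev:.1f} mm")   # → 84.7 ± 1.3 mm
```

Note that the statistical spread (±1.3 mm) and the estimated instrument error (±1.5 mm) are different quantities; a full error budget would account for both.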

The point is that error is always possible. It always has to be accounted for, preferably in advance. We can take steps to eliminate error. An approximation always relies on starting assumptions, and these are also a source of error. Keep in mind that this critique comes from scientists themselves. They haven't been blindly ignoring error all these years. 


Mathematical Models

I'm not going to dwell on this too much. But in science, our explanations and models usually take the form of an abstract symbolic mathematical equation. A simple, one-dimensional wave equation takes the general form:

y = f(x,t)

That is to say, the displacement of the wave (y) is a function of position (x) and time (t): the displacement depends on both where and when we look. This describes a wave that, over time, moves in the x direction (left-right) and displaces in the y direction (up-down). 

More specifically, we model simple harmonic oscillations using the sine function. In this case, we know that spatial changes are a function of position and temporal changes are a function of time. 

y(x) = sin(x)
y(t) = sin(t)

It turns out that the relationship between the two functions can be expressed as 

y(x,t) = sin(x ± t).

If the wave is moving right, we subtract time, and if the wave is moving to the left, we add it. 

The sine function smoothly changes between +1 and -1, but a real wave has an amplitude, and we can scale the function by multiplying it by the amplitude.

y(x,t) = A sin(x ± t).

And so on. We keep refining the model until we get to the general formula:

y(x,t) = A sin(kx ± ωt ± ϕ).

Where A is the maximum amplitude, k is the wavenumber (the spatial frequency of the wave), ω is the angular frequency, and ϕ is the phase.

The displacement is periodic in both space and time. Since k = 2π/λ (where λ is the wavelength), the function returns to the same spatial configuration when x/n = λ (where n is a whole number). Similarly, since ω = 2π/T (where T is the period or wavetime), the function returns to the same temporal configuration when t/n = T.
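This periodicity is easy to check numerically. The sketch below uses arbitrary illustrative values for A, k, ω, and ϕ; the assertions confirm that the displacement repeats after one wavelength in space and one period in time.

```python
import math

# Arbitrary illustrative parameters, chosen only for the check.
A, k, w, phi = 2.0, 3.0, 5.0, 0.25

def y(x: float, t: float) -> float:
    """Rightward-travelling wave: y(x, t) = A sin(kx - wt + phi)."""
    return A * math.sin(k * x - w * t + phi)

wavelength = 2 * math.pi / k  # lambda, the spatial period
period = 2 * math.pi / w      # T, the temporal period

# Shifting by one wavelength (at fixed t), or by one period (at fixed x),
# returns the same displacement.
assert math.isclose(y(1.0, 1.0), y(1.0 + wavelength, 1.0), abs_tol=1e-9)
assert math.isclose(y(1.0, 1.0), y(1.0, 1.0 + period), abs_tol=1e-9)
```

The check works because adding λ to x adds exactly 2π to the argument of the sine function, and likewise for adding T to t.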

What distinguishes physics from pure maths is that, in physics, each term in an equation has a physical significance or interpretation. The maths aims to represent changes in our system over time and space. 

Of course, this is idealised. It's one-dimensional. Each oscillation is identical to the last. The model has no friction. If I add a term for friction, it will only be an approximation of what friction does. But no matter how many terms I add, the model is still a model. It's still an idealisation of the problem. And the answers it gives are still approximations.


Conclusion

No one has epistemic privilege. This means that all metaphysical views are speculative. However, we need not capitulate to solipsism (we can only rely on our own judgements), relativism (all knowledge has equal value) or positivism (no metaphysics is possible). 

Because, in some cases, we are speculating based on comparing notes about empirical data. This allows us to pragmatically define metaphysical terms like reality, space, time, and causality in such a way that our explanations provide us with reliable knowledge. That is to say, knowledge we can apply and get expected results. Every day I wake up and the physical parameters of the universe are the same, even if everything I see is different. 

Reality is the world of observer-independent phenomena. No matter who is looking, when we compare notes, we broadly agree on what we saw. There is no reason to infer that reality is perfect, absolute, or magical. It's not the case that somewhere out in the unknown, all of our problems will be solved. As a historian of religion, I recognise the urge to utopian thinking and I reject it. 

Rather, reality is seen to be consistent across observations and over time. Note that I say "consistent", not "the same". Reality is clearly changing all the time. But the changes we perceive follow patterns. And the patterns are consistent enough to be comprehensible. 

The motions of stars and planets are comprehensible: we can form explanations for these that satisfactorily answer the questions people ask. The patterns of weather are comprehensible even when unpredictable. People, on the other hand, remain incomprehensible to me.

That said, all answers to scientific questions are approximations, based on idealisations and assumptions. Which is fine if we make clear how we have idealised a situation and what assumptions we have made. This allows other people to critique our ideas and practices. As Mercier and Sperber point out, it's only in critique that humans actually use reasoning (An Argumentative Theory of Reason, 10 May 2013). 

We can approximate reality, but we should not attempt to appropriate it by insisting that our approximations are reality. Our theories and mathematics are always the map, never the territory. The phenomenon may be real, but the maths never is.  

This means that if our theory doesn't fit reality (or the data), we should not change reality (or the data); we should change the theory. No mathematical approximation is so good that it demands that we redefine reality. Hence, all of the quantum Ψ-ontologies are bogus. The quantum wavefunction is a highly abstract concept; it is not real. For a deeper dive into this topic, see Chang (1997), which requires a working knowledge of how the quantum formalism works, but makes some extremely cogent points about idealised measurements.

In agreeing that the scientific method and scientific explanations have limits, I do not mean to dismiss them. Science is by far the most successful knowledge-seeking enterprise in history. Science provides satisfactory answers to many questions. For better or worse, science has transformed our lives (and the lives of every living thing on the planet). 

No, we don't get the kinds of answers that religion has long promised humanity. There is no certainty, we will never know the nature of reality, we still die, and so on. But then religion never had any good answers to these questions either. 

~~Φ~~


Beller, Mara. (1996). "The Rhetoric of Antirealism and the Copenhagen Spirit". Philosophy of Science 63(2): 183-204.

Chang, Hasok. (1997). "On the Applicability of the Quantum Measurement Formalism." Erkenntnis 46(2): 143-163. https://www.jstor.org/stable/20012757

Faye, Jan. (2007). "The Pragmatic-Rhetorical Theory of Explanation." In Rethinking Explanation. Boston Studies in the Philosophy of Science, 43-68. Edited by J. Persson and P. Ylikoski. Dordrecht: Springer.

Lawson, E. T. and McCauley, R. N. (1990). Rethinking Religion: Connecting Cognition and Culture. Cambridge: Cambridge University Press.


Note: 14/6/25. The maths is deterministic, but does this mean that reality is deterministic? 

23 May 2025

The Curious Case of Phlogiston

I'm fascinated by revisionist histories. I grew up in a British colony where we were systematically lied to about our own history. Events in the 1970s and 1980s forced us to begin to confront what really happened when we colonised New Zealand. At around the same time, modern histories began to appear to give us a more accurate account. James Belich's Making Peoples had a major impact on me. Michael King's Being Pakeha also struck a chord, as did Maurice Shadbolt's historical novel Monday's Warriors.

Most people who know a little bit about the history of science will have heard of phlogiston. The phlogiston theory is usually portrayed as exactly the kind of speculative metaphysics that was laid to rest by artful empiricists. Phlogiston became a symbol of the triumph of empiricism over superstition. As a student of chemistry, I imbibed this history and internalised it. 

The popular history (aka science folklore) has a Whiggish feel in the sense that Lavoisier is represented as making a rational leap towards the telos of the modern view. Such, we are led to believe, is the nature of scientific progress. My favourite encyclopedia repeats the standard folklore:

The phlogiston theory was discredited by Antoine Lavoisier between 1770 and 1790. He studied the gain or loss of weight when tin, lead, phosphorus, and sulfur underwent reactions of oxidation or reduction (deoxidation); and he showed that the newly discovered element oxygen was always involved. Although a number of chemists—notably Joseph Priestley, one of the discoverers of oxygen—tried to retain some form of the phlogiston theory, by 1800 practically every chemist recognized the correctness of Lavoisier’s oxygen theory.—Encyclopedia Britannica.

Compare this remark by Hasok Chang (2012b: time 19:00) in his inaugural lecture as Hans Rausing Professor of History and Philosophy of Science, at Cambridge University:

I became a pluralist about science because I could not honestly convince myself that the phlogiston theory was simply wrong or even genuinely inferior to Lavoisier's oxygen-based chemical theory.

When I was reading about the systematic misrepresentation of the work of J. J. Thomson and Ernest Rutherford in physics folklore, Chang's lecture came to mind. I discovered Chang 4 or 5 years ago and have long wanted to review his account of phlogiston, but was caught up in other projects. In this essay, I will finally explore the basis for Chang's scepticism about the accepted history of phlogiston. While I largely rely on his book, Chang pursued this theme in two earlier articles (2009, 2010).


Setting the Scene

The story largely takes place in the mid-late eighteenth century. The two principal figures are Joseph Priestley (1733 – 1804) and Antoine-Laurent de Lavoisier (1743 – 1794). 

A caveat is that while I focus on these two figures, the historical events involved dozens, if not hundreds, of scientists. Even in the 1700s, science was a communal and cooperative affair; a slow conversation amongst experts. My theme here is not "great men of history". My aim is to explore the historiography of science and reset my own beliefs. Chang's revisionist history of phlogiston is fascinating by itself, but I am intrigued by how Chang uses it as leverage in his promotion of pluralism in science. Priestley and Lavoisier are just two pegs to hang a story on. And both were, ultimately, wrong about chemistry. 

Chang (2012: 2-5) introduces Priestley at some length. He refers to him as "a paragon of eighteenth-century amateur science" who "never went near a university", while noting that he was also a preacher and a "political consultant" (from what I read, Priestley was really more of a commentator and pamphleteer). As a member of a Protestant dissenting church, Priestley was barred from holding any public office or working in fields such as law or medicine. In the 1700s, British universities were still primarily concerned with training priests for the Church of England. That said, Priestley was elected a fellow of the Royal Society in 1766, which at least gained him the ears of fellow scientists. Priestley is known for his work identifying different gases in atmospheric air. He experimented extensively with "fixed air" (i.e. carbon dioxide) and became a minor celebrity with his invention of carbonated water. He also discovered oxygen; more on this below. 

However, Chang provides no similar introduction to Lavoisier. Rather, Lavoisier appears in a piecemeal way as a foil to his main character, Priestley. The disparity seems to be rhetorical. Part of Chang's argument for plurality in science is that Priestley was on the right track and has been treated poorly by historians of science. By focusing primarily on Priestley and treating Lavoisier as secondary, Chang might be seen as rebalancing a biased story.

I'm not sure that this succeeds, because as a reviewer, I now want to introduce Lavoisier to my readers, and I have to rely on third-party sources to do that. Chang doesn't just leave the reader hanging; he misses an opportunity to put Lavoisier in context and to draw some obvious comparisons. That Priestley and Lavoisier inhabited very different worlds is apposite to any history of phlogiston.

Lavoisier was an aristocrat who inherited a large fortune at the age of five (when his mother died). He attended the finest schools, where he became fascinated by the sciences (such as they were at the time). This was followed by university studies, where Lavoisier qualified as a lawyer, though he never practised law (he did not need to). As an aristo, Lavoisier had access to the ruling elite, which gave him leverage in his dispute with Priestley. He was also something of a humanitarian and philanthropist, spending some of his fortune on such projects as clean drinking water, prison reform, and public education. Despite this, he was guillotined during the French Revolution after being accused of corruption in his role as a tax collector. He was later exonerated.

The contrasting social circumstances help to explain why Lavoisier was able to persuade scientists to abandon phlogiston for his oxygen theory. Lavoisier had money and class on his side in a world almost completely dominated by money and class. 

Having introduced the main players, we now need to backtrack a little to put their work in its historical context. In the 1700s, the Aristotelian idea that the world is made of earth, water, fire, and air was still widely believed. To be clear, both water and air were considered to be elemental substances. 18th-century medicine was still entirely rooted in this worldview.

Alchemy still fascinated the intelligentsia of the day. On one level, alchemists pursued mundane goals, such as turning lead into gold, and on another, they sought physical immortality (i.e. immortality in this life rather than in the afterlife).

The telescope and microscope were invented in the early 1600s. With the former, Galileo observed the Moon and Jupiter's satellites, becoming the first empirical scientist to upset the existing worldview by discovering new facts about the world. 

That worldview was still largely the synthesis of Christian doctrine with Aristotelian philosophy created by Thomas Aquinas (1225–1274). The microscope had also begun to reveal a level of structure to the world, and to life, that no one had previously suspected existed. The practice of alchemy began to give way to natural philosophy, i.e. the systematic investigation of properties of matter. Priestley and Lavoisier were not the only people doing this, by any means, but they were amongst the leading exponents of natural philosophy. 

One of the key phenomena that captured the attention of natural philosophers, for obvious reasons, was combustion. The ancient Greeks believed that fire was elemental and that combustion released the fire element latent in the fuel. This is the precursor to the idea of phlogiston as a substance.


Phlogiston Theory

The first attempt at a systematic account of phlogiston is generally credited to Georg Ernst Stahl (1659 – 1734) in Zymotechnia fundamentalis "Fundamentals of the Art of Fermentation" (1697). The term phlogiston derives from the Greek φλόξ phlóx "flame", and was already in use when it was applied to chemistry.

The basic idea was that anything which burns contains a mixture of ash and phlogiston. Combustion is the process by which phlogiston is expelled from matter, leaving behind ash. And we see this process happening in the form of flames. And thus, a combustible substance was one that contained phlogiston. Phlogiston was "the principle of inflammability". 

However, experimentation had begun to show interesting relationships between metals and metal-oxides (known at the time as calx). One could be turned into the other, and back again. For example, metallic iron gradually transforms into a reddish calx, which is a mixture of a couple of different oxides of iron. To turn iron-oxide back into iron, we mix it with charcoal or coke and heat it strongly. And this reversible reaction is common to all metals. 

Chemists used phlogiston to explain this phenomenon. Metals, they conjectured, were rich in phlogiston. This is why metals have such qualities as lustre, malleability, ductility, and electrical conductivity. In becoming a calx, the metal must be losing phlogiston, and by analogy, a calx is a kind of ash. On the other hand, charcoal and coke burn readily, so they must also be rich in phlogiston. When heated together, the phlogiston must move from the charcoal back into the calx, reconstituting the metal. 

This reversible reaction was striking enough for Immanuel Kant to use it, in The Critique of Pure Reason (1781), as an example of how science "began to grapple with nature in a principled way" (Chang 2012: 4).

Priestley is famous for having discovered oxygen, but as Chang emphasises, it was Lavoisier who called it that. Priestley called it dephlogisticated air, i.e. air from which phlogiston has been removed. "Air" in this context is the same as "gas" in modern parlance.

Priestley produced dephlogisticated air by heating mercury-oxide, releasing oxygen and leaving the pure metal. According to the phlogiston theory, such dephlogisticated air should readily support combustion because, being dephlogisticated, it would readily accept phlogiston from combustion. And so it proved. Combustion in dephlogisticated air was much more vigorous. Breathing the new air also made one feel invigorated. Priestley was the first human to breathe pure oxygen, though he tested it on a mouse first.

Formerly considered elemental, atmospherical air could now be divided into "fixed air" and "dephlogisticated air". A missing piece was inflammable air (hydrogen), which was discovered by Henry Cavendish in 1766, when he was observing the effects of acids on metals. Cavendish had also combusted dephlogisticated air and inflammable air to make water. And Priestley had replicated this in his own lab.

Priestley and Cavendish initially suspected that inflammable air was in fact phlogiston itself, driven from the metal by the action of acids. A calx in acid produced no inflammable air, because it was already dephlogisticated. However, the fact that dephlogisticated air and phlogiston combined to make water was suggestive and led to an important refinement of the phlogiston theory.

They settled on the idea that inflammable air (hydrogen) was phlogisticated water, and that dephlogisticated air (oxygen) was actually dephlogisticated water. And thus, the two airs combined to form water. In this view, water is still elemental. 

It was Lavoisier who correctly interpreted this reaction to mean that water was not an element but a compound of hydrogen and oxygen (and it was Lavoisier who named inflammable air hydrogen, i.e. "water maker"). However, it is precisely here, Chang argues, that phlogiston proves itself to be the superior theory.

Chang notes that, without the benefit of hindsight, it's difficult to say what is so wrong with the phlogiston theory. It gave us a working explanation of certain chemical phenomena, and it made testable predictions that were accurate enough to be taken seriously. For its time, phlogiston was a perfectly good scientific theory. So the question then becomes, "Why do we see it as a characteristic example of a bad scientific theory disproved by empiricism?" Was it really such a bad theory?


A Scientific Blunder?

On one hand, Chang argues that, given the times, phlogiston theory was a step in the right direction, away from alchemical views and towards seeing electricity as the flow of a fluid, which then leads towards the modern view of chemical reactions involving the flow or exchange of electrons. And on the other hand, Lavoisier's theory is far from being "correct".

If the argument is that phlogiston was an ad hoc concept that could not be observed, then why is the same criticism not levelled against Lavoisier for the role of elemental light (lumière) or caloric in his theory? Caloric is what we would now call "heat", and it is clearly not elemental.

The terms "oxidation" and "reduction" (and the portmanteau "redox") are generalisations from Lavoisier's explanation of metals and metal-oxides. A metal-oxide can be "reduced" to the pure metal, and a metal oxidised to form the oxide. And one can make them go back and forth by altering the conditions.

While oxidation and reduction apply to metals and their oxides, such reactions are not typical. Most redox reactions don't involve metals or oxygen. When fluorine reacts with hydrogen, for example, we say that hydrogen is "oxidised" (gives up an electron) and that fluorine is "reduced" (gains an electron). And this terminology doesn't make much sense. Even with a BSc in chemistry, I always have to stop and think carefully about which label applies because it's not intuitive.

A commonly cited reason for the collapse of the phlogiston theory is that a metal gains weight in becoming a calx. The implication is that phlogiston theory was at a loss to explain this. Superficially, this seems right: in the early versions of the theory, a metal becoming a calx loses phlogiston, so we would expect it to lose weight rather than gain it. The idea that the metal combines with oxygen is correct in hindsight, and is how we see the formation of metal-oxides in the present.

However, Priestley and another phlogistonist, Richard Kirwan, did have an explanation for weight gain. I've already noted that Priestley's ideas matured and that, latterly, he had concluded that inflammable air (hydrogen) was phlogisticated water, and dephlogisticated air (oxygen) was dephlogisticated water. In Priestley's mature view, the metal formed a calx by combination with water and the loss of phlogiston. The added weight was due to the dephlogisticated water. When the calx was reduced, the metal absorbed phlogiston and gave up water. 

Like Chang, when I review this explanation, keeping in mind the state of knowledge at the time, I can't see how Lavoisier's explanation is any better. Seen in the context of the times (late 18th century), there was nothing illogical about the phlogiston theory. It explained observations and made testable predictions. As Chang (2010: 50) says:

We really need to lose the habit of treating ‘phlogiston theory got X wrong’ as the end of the story; we also need to ask whether Lavoisier’s theory got X right, and whether it did not get Y and Z wrong.

Chang cites several historians of science commenting on this. For example, John McEvoy (1997) notes that...

by the end of the eighteenth century, almost every major theoretical claim that Lavoisier made about the nature and function of oxygen had been found wanting.

And Robert Siegfried (1988):

The central assumptions that had guided [Lavoisier's] work so fruitfully were proved empirically false by about 1815.

These comments are in striking contrast to the claim made by Britannica: "by 1800, practically every chemist recognized the correctness of Lavoisier’s oxygen theory". The story in Britannica is the widely accepted version of history. Yet, as Chang makes clear, it is simply false.

Lavoisier's theory of acids, his theory of combustion, and his theory of caloric were all clearly wrong from the viewpoint of modern chemistry. For example, Lavoisier claimed that all acids contain oxygen (the name oxygen means "acid maker"). However, hydrochloric acid (which we have in our stomachs) does not contain oxygen. Indeed, the action of acids is now thought to be because of their ability to produce hydrogen ions (aka naked protons, aka phlogisticated water), which are extremely reactive.

Moreover, as Chang (2012: 9) shows, the problems with Lavoisier's theory were well known to his contemporaries. Many scientists voiced their concerns at the time. The point is well taken. If we are judging by modern standards, then Lavoisier and Priestley were both wrong, Lavoisier no less than Priestley. Nonetheless, Lavoisier, with his fortune and his access to the French aristoi, had more leverage than the dissenting Priestley.

That said, Lavoisier clearly won the argument. And the brief account of his triumph in Britannica is a classic example of the adage that the victors write history.


What We Lost

What Chang tries to do next is declared by the subtitle of section 2: "Why Phlogiston Should Have Lived" (2012: 14). The first section of the book is deliberately written relatively informally with the idea that a general reader could appreciate the argument. In this second section, however, he develops a much more philosophically rigorous approach and introduces a great deal more jargon, some of which is specific to his project.

My aim in this essay is to continue the discussion at the same level. This inevitably means losing exactly the nuances that Chang introduces, and probably diverging from his intentions to some extent. I do recommend reading the rest of his argument. What follows is my all-too-brief interpretation of Chang's argument.

While his history is revisionist, Chang's point is not to promote a speculative counterfactual history (which is to say, a fictitious alternative history). Rather, he seeks to make an argument for pluralism: the coexistence of different explanations for any given phenomenon, until such time as the best explanation emerges.

Chang argues that Lavoisier's view that oxygen was being exchanged in chemical reactions was clearly inferior and only applicable to metal/calx reactions. By the time this became clear, phlogiston was discredited and could not be revived. And Lavoisier's counterintuitive oxidation-reduction model became the norm in chemistry, and still is, despite its obvious disadvantages. 

The idea that phlogiston was being exchanged in chemical reactions was not a bad theory (for the time). Moreover, phlogiston was already conceptually linked to electricity. Getting from redox to the exchange of electrons took another century. Chang argues that the conceptual leap to the exchange of electrons would have been considerably easier starting from phlogiston than it actually was starting from Lavoisier's theory.

Chang's argument for pluralism is not simply based on the two theories being equally false. Indeed, he goes to some pains to explain what they both got right. The point is that the phlogiston theory had untapped potential. In prematurely killing off phlogiston and adopting Lavoisier's oxygen theory (which as we have seen was disproved a few decades later), we actually retarded the progress of science. And when Lavoisier was proven wrong, we had no alternative theory and simply retained his awkward and misleading terminology. 

Had we allowed the two theories to co-exist a little longer, so that Lavoisier's explanation could be thoroughly tested and proven false before it was adopted, there is a possibility that we might have lighted on the electron exchange theory of chemical reactions a century earlier than we did. Indeed, as hinted above, phlogiston was already linked to electricity. Seen with hindsight, the rush to judgment about chemical reactions meant that scientists of the late 18th and early 19th centuries missed a huge opportunity.

Chang is a pragmatist. He knows we cannot go back. His argument is that we should be alert to this situation in the present and the future and be less eager to settle on a theory where ambiguity remains. Arguably, the temporary triumph of the various Copenhagen interpretations of Schrödinger's equation was a similar example. We settled too early, for reasons unconnected to science, only to have the chosen theory be disproved some decades later. 

I don't read Chang as saying that we should hold on to pluralism no matter what. Only that, where there is room for doubt, we should allow multiple explanations to coexist, because we don't know in advance what the best answer will be. This only emerges over time. And a scientific theory can only benefit from responding to the challenges that other explanations pose.


Conclusions

Hasok Chang aims to demonstrate the value of pluralism through critiquing the history of the so-called "chemical revolution" identified with Lavoisier. And the case of phlogiston is both fascinating in its own right and a compelling study of how the lack of pluralism retarded the progress of science. 

While sources like Britannica follow science folklore in insisting on the "correctness" of the oxygen theory, historians of science tell us a different story. It may be true that Lavoisier's theory was widely adopted by 1800, but historians have shown that it was also largely falsified by 1815. By this time, the phlogiston theory had been "killed", as Chang puts it.

Chang attempts to show that phlogiston was not such a bad theory and that the oxygen theory was not such a good theory. Contrary to the usual Whiggish accounts, the triumph of Lavoisier's oxygen theory was not really an example of "scientific progress". Indeed, Chang supposes that adopting the oxygen theory actually retarded the progress of science, since it pointed away from the role of electricity in chemistry. This important insight took another century to emerge.

The phlogiston theory is arguably the better of the two theories that existed in the late 1700s. Chang argues that had phlogiston persisted just a little longer, at least until Lavoisier was disproved, we might have made the leap to seeing chemical reactions in terms of the flow of electricity between elements much earlier than we eventually did. And who knows what else this might have changed?

The point is not to inaugurate some kind of neo-phlogistonist movement or to speculate about counterfactual (alternative) histories. The point is that when we have competing theories, in the present, we should allow them to coexist rather than rushing to settle on one of them. 

Pluralism is a pragmatic approach to uncertainty. When different explanations are possible, we can compare and contrast the differences. Allowing such challenges is more likely to result in scientific progress than the rush to judgment or the overwhelming desire to have one right answer.

As noted at the outset, in this essay, I have largely overlooked the contributions of Priestley's and Lavoisier's contemporaries. I have emphasised the two main players, even more than Chang does, purely for narrative simplicity (and keeping this essay to a reasonable length). This might make it seem that it was something like a personal competition, when that doesn't seem to be the case. Think of this essay as a taster. My aim is to whet your appetite to go and discover Chang for yourself, or better, to go and read the original papers being published at the time. See for yourself.  


Coda

The pluralism that Chang praises in the case of chemistry is not the same kind of pluralism that exists in so-called "interpretations of quantum mechanics". Chang is in favour of having multiple explanations of a phenomenon until such time as the best explanation unequivocally emerges. But he also considers that the best explanations change over time as new data comes in. Chang is a pragmatist, and this seems to be the only viable approach to science. We do not and cannot acquire metaphysical certainty because there is no epistemic privilege with respect to reality. We are all inferring facts about reality based on experience, a procedure known to be fraught with difficulties that often go unnoticed.

Generally, in science, we see competing explanations that attempt to fit a new phenomenon into our pre-existing metaphysics. In crude terms, scientific theories are made to fit into existing views about reality, and new data changes our view of reality only rarely and often incrementally. Paradigms do change, but only with great reluctance. This conservatism is generally a good thing as long as it doesn't become dogmatic.

In stark contrast to the rest of science, in quantum physics, the mathematical approximations are considered infallible and inviolable, and scientists propose different realities in which the mathematics makes sense. They have become dogmatic about their theory and refuse to consider other models. It has not gone well.

As Sabine Hossenfelder said, "Theoretical physicists used to explain what was observed. Now they try to explain why they can’t explain what was not observed."

~~Φ~~


Bibliography

Chang, Hasok. (2009) "We Have Never Been Whiggish (About Phlogiston)". Centaurus 51(4): 239-264. https://doi.org/10.1111/j.1600-0498.2009.00150.x

Chang, Hasok. (2010). "The Hidden History of Phlogiston: How Philosophical Failure Can Generate Historiographical Refinement." HYLE – International Journal for Philosophy of Chemistry 16 (2): 47-79. Online.

Chang, Hasok. (2012a). Is Water H2O? Evidence, Realism and Pluralism. Springer.

Chang, Hasok. (2012b). "Scientific Pluralism and the Mission of History and Philosophy of Science." Inaugural Lecture by Professor Hasok Chang, Hans Rausing Professor of History and Philosophy of Science, 11 October 2012. https://www.youtube.com/watch?v=zGUsIf9qYw8

Stahl, Georg Ernst. (1697). Zymotechnia fundamentalis.

16 May 2025

Observations and Superpositions

The role of observation in events has been a staple of quantum physics for decades and is closely associated with "the Copenhagen interpretation". On closer inspection, it turns out that everyone connected with Bohr's lab in Copenhagen had a slightly different view on how to interpret the Schrödinger equation. Worse, those who go back and look at Bohr's publications nowadays tend to confess that they cannot tell what Bohr's view was. For example, Adam Becker speaking to Sean Carroll (time index 21:21; emphasis added):

I don't think that there is any single Copenhagen interpretation. And while Niels Bohr and Max Born and Pauli and Heisenberg and the others may have each had their own individual positions, I don't think that you can combine all of those to make something coherent...

...Speaking of people being mad at me, this is something that some people are mad at me for. They say, "But you said that Niels Bohr had this position?" I'm like, "No, I didn't. I didn't say that Niels Bohr had any position. I don't know what position he had, and neither does anybody else."

So we should be cautious about claims made for "the Copenhagen interpretation", which seem to imply a consensus that never existed at Bohr's lab in Copenhagen.

That said, the idea that observation causes the wavefunction to collapse is still a staple of quantum physics. Despite playing a central role in quantum physics, "observation" is seldom precisely defined in scientific terms, or when it is defined, it doesn't involve any actual observation (I'll come back to this). The situation was made considerably worse when (Nobel laureate) Eugene Wigner speculated that it is "consciousness" that collapses the wave function. "Consciousness" is even less well-defined than "observation". While most academic physicists instantly rejected the role of consciousness in events, outside of physics it became a popular element of science folklore and New Ageism.

The idea that "observation" or "consciousness" are involved in "collapsing the wave function" is also an attachment point for Buddhists who wish to bolster their shaky faith by aligning it with science. The result of such legitimisation strategies is rather pathetic hand waving. Many Buddhists want reality to be reductive and idealist: they want "mind" to be the fundamental substance of the universe. This would align with some modern interpretations of traditional Buddhist beliefs about mind. But the idea is also to find some rational justification for Buddhist superstitions like karma and rebirth. As I showed at length in my book Karma and Rebirth Reconsidered, it simply does not work.

In this essay, I will show that it is trivially impossible for observation to play any role in causation at any level. I'm going to start by defining observation with respect to a person and exploring the implications of this, particularly with respect to Schrödinger's cat. I will also consider the post hoc rationalisation of observation qua "interaction" (sans any actual observation).


What is "An Observation"?

We may say that an observer, Alice, observes a process P giving rise to an event E, with an outcome O, when she becomes aware of P, E, and O. It is possible to be aware of each part individually, but in order to understand and explain what has happened, we really need some idea of what processes were involved, what kinds of events they engendered, and the specific outcomes of those events.

It's instructive to ask, "How does Alice become aware of external events?" Information from the process, event, and/or outcome of interest first has to reach her in some form. The fastest way that this can happen is for light from the process, event, and/or outcome to reach Alice's eyes. It always takes a finite amount of time for the light to reach her eye.

But light reaching Alice's eye does not, by itself, create awareness. Rather, cells in the eye convert the energy of light into electrochemical energy (a nerve impulse). That pulse of energy travels along the optic nerve to the brain and is incorporated into her virtual world model and then, finally, presented to her first-person perspective. Only then does she become aware of it. This part also takes a finite amount of time; indeed, it takes far longer than the light's journey.

Therefore, the time at which Alice becomes aware of P, E, and O, is some appreciable amount of time after E happens and O is already fixed. There is no alternative definition of "observation" that avoids this limitation, since information cannot travel faster than the speed of light and the brain is always involved. The only other possibilities are, if anything, slower. Therefore:

Alice can only observe processes, events, and outcomes after the fact.
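To put rough numbers on this, here is a minimal sketch. The 3 m distance and the ~100 ms visual-processing latency are illustrative assumptions of mine, not measured values; only the speed of light is a physical constant.

```python
# Back-of-envelope comparison of the two delays involved in "observing"
# an event: the light travel time and the neural processing time.
# The 3 m distance and ~100 ms perception latency are illustrative
# assumptions; the speed of light is the defined SI value.

C = 299_792_458.0  # speed of light in vacuum, m/s


def light_travel_time(distance_m: float) -> float:
    """Time (in seconds) for light to cross the given distance."""
    return distance_m / C


room = light_travel_time(3.0)  # light crossing a small room
neural = 0.1                   # assumed visual processing latency, s

print(f"light across 3 m  : {room:.2e} s")
print(f"neural processing : {neural:.2e} s")
print(f"the brain's delay is ~{neural / room:,.0f} times longer")
```

Even before any neural processing begins, the event is already over; the processing then multiplies the lag by several orders of magnitude.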

If observation is always after the fact, then observation can never play any causal role in the sequence of events because causes must precede effects, in all frames of reference. Therefore:

Observation can play no causal role
in processes, events, or outcomes.

This means that there is no way that "observation" (or "consciousness") can cause the collapse of the wavefunction. Rather, the collapse of the wavefunction has to occur first, and then the light from that event has to travel to Alice's eye. There is no way around this physical limitation in our universe. And given the nature of wavefunctions (the outputs of which are vectors in a complex plane), this can hardly be surprising.

Observation is never instantaneous, let alone precognitive. And this means that all talk of observation causing "wavefunctions to collapse" is trivially false.

We could simply leave it at that, but it will be instructive to re-examine the best known application of "observation".


Schrödinger's cat

Schrödinger's cat is only ever alive or dead. It is never both alive and dead. This was the point that Schrödinger attempted to make. Aristotle's law of noncontradiction applies: an object cannot both exist and not exist at the same time. We cannot prove this axiom from first principles, but if we don't accept it, all communication becomes pointless: no matter what true statement I make, anyone can assert that the opposite is also true.

Schrödinger proposed his thought experiment as a reductio ad absurdum argument against Bohr and the others in Copenhagen. He was trying to show that belief in quantum superpositions leads to absurd, illogical consequences. He was right, in my opinion, but he did not win the argument (and nor will I).

This argument is broadly misunderstood outside of academic physics. This is because Schrödinger's criticism was taken up by physicists as an exemplification of the very effect it was intended to debunk. "Yes," cried the fans of Copenhagen type explanations, "this idea of both-alive-and-dead at the same time is exactly what we mean. Thanks." And so we got stuck with the idea that the cat is both alive and dead at the same time (which is nonsense). Poor old Schrödinger, he hated this idea (and didn't like cats) and now it is indelibly associated with him.

The general setup of the Schrödinger's cat thought experiment is that a cat is placed in a box. Inside the box, a random event may occur. If it occurs, the event triggers the death of the cat via a nefarious contraption. Once the cat is in the box, Alice doesn't know whether the cat is alive or dead. The cat is a metaphor for subatomic particles. We are supposed to believe that they adopt a physical superposition of states: say, "spin up" and "spin down", or "position x" and "position y" at the same time before we measure them; then, at the point of measurement, they randomly adopt one or the other of the superposed states.

Here's the thing. The cat goes into the box alive. If the event happens, the cat dies. If it doesn't happen, the cat lives. And Alice doesn't know which until she opens the box. The uncertainty here is not metaphysical, it's epistemic. It's not that a cat can ever be in a state of both-alive-and-dead (it cannot); it's only that we don't know whether it is alive or dead. So this is a bad analogy.

Moreover, even when Alice opens the box, the light from the cat still takes some time to reach her eyes. Observation always trails behind events; it cannot anticipate or participate in them. Apart from reflected light, nothing emanates from Alice that could participate in the sequence of events happening outside her body, let alone change the outcome.

Also, the cat has eyes and a brain. It is itself an "observer". 

Epistemic uncertainty cannot be mapped back to metaphysical uncertainty without doing violence to reason. A statement, "I don't know whether the cat is alive or dead," cannot be taken to imply that the cat is both alive and dead. This is definitely a category error for cats. Schrödinger's view was that it is also a category error for electrons and photons. And again, I agree with Schrödinger (and Einstein).

In that case, why do physics textbooks still insist on the nonsensical both-alive-and-dead scenario? It seems to be related to a built-in feature of the mathematics of spherical standing waves, which are at the heart of Schrödinger's equation (and many other features of modern science). The mathematics of standing waves was developed in the 18th century (i.e. it is thoroughly classical). Below, I quote from the MathWorld article on Laplace's equation (for a spherical standing wave) by Eric Weisstein (2025; emphasis added):

A function ψ which satisfies Laplace's equation is said to be harmonic. A solution to Laplace's equation has the property that the average value over a spherical surface is equal to the value at the center of the sphere (Gauss's harmonic function theorem). Solutions have no local maxima or minima. Because Laplace's equation is linear, the superposition of any two solutions is also a solution.
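The mean-value property Weisstein describes is easy to check numerically. The harmonic function f(x, y) = x² − y² used below is my own illustrative choice, not MathWorld's (it satisfies Laplace's equation, since f_xx + f_yy = 2 − 2 = 0):

```python
import math

# Gauss's mean-value property: for a harmonic function, the average over
# a circle equals the value at the circle's centre.
# f(x, y) = x^2 - y^2 is harmonic: f_xx + f_yy = 2 - 2 = 0.

def f(x: float, y: float) -> float:
    return x * x - y * y

def circle_average(cx: float, cy: float, r: float, n: int = 100_000) -> float:
    """Average of f over n equally spaced points on a circle."""
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        total += f(cx + r * math.cos(theta), cy + r * math.sin(theta))
    return total / n

centre = f(2.0, 1.0)                     # value at the centre: 3.0
average = circle_average(2.0, 1.0, 5.0)  # average over a circle of radius 5
print(centre, average)                   # the two agree to high precision
```

The agreement holds whatever centre and radius you choose, which is the content of Gauss's theorem.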

The last sentence of Weisstein's passage is similar to a frequently encountered claim in quantum physics: that solutions for individual quantum states can be added together to produce another valid solution of the wave equation. This is made out to be a special feature of quantum mechanics that defines the superposition of "particles".

Superposition of waves is nothing remarkable or "weird". Any time two water waves meet, for example, they superpose.


In this image, two wave fronts travel towards the viewer obliquely from the left and right at the same time (they appear to meet almost at right angles). The two waves create an interference pattern (the cross in the foreground) where the two waves are superposed. Waves routinely superpose, and this is known as the superposition principle.

The superposition principle, also known as superposition property, states that, for all linear systems, the net response caused by two or more stimuli is the sum of the responses that would have been caused by each stimulus individually.
The Penguin Dictionary of Physics.

For this type of linear function, we can define superposition precisely: f(x) + f(y) = f(x+y)

In mathematical terms, each actual wave can be thought of as a solution to a wave equation. The sum of the waves is also a solution, which corresponds to the situation we see in the image: two waves physically adding together where they overlap, while at the same time retaining their identity.
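This can be verified directly. The sketch below (the wave speed, wavenumbers, and amplitudes are arbitrary illustrative choices) checks by finite differences that the sum of two travelling-wave solutions of the one-dimensional wave equation u_tt = c²·u_xx is itself a solution:

```python
import math

c = 2.0  # wave speed; arbitrary illustrative value

def u1(x: float, t: float) -> float:
    """A right-moving sinusoidal wave (a solution of the wave equation)."""
    return math.sin(3.0 * (x - c * t))

def u2(x: float, t: float) -> float:
    """A left-moving wave with a different wavenumber and amplitude."""
    return 0.7 * math.sin(5.0 * (x + c * t))

def u(x: float, t: float) -> float:
    """The superposition: the pointwise sum of the two waves."""
    return u1(x, t) + u2(x, t)

def wave_residual(f, x: float, t: float, h: float = 1e-4) -> float:
    """u_tt - c^2 * u_xx via central differences; ~0 for a true solution."""
    u_tt = (f(x, t + h) - 2.0 * f(x, t) + f(x, t - h)) / (h * h)
    u_xx = (f(x + h, t) - 2.0 * f(x, t) + f(x - h, t)) / (h * h)
    return u_tt - c * c * u_xx

# The residual of the summed wave is zero up to discretisation error,
# i.e. the superposition is also a solution.
print(wave_residual(u, 0.3, 1.2))
```

Each component can still be recovered from the sum (subtracting u2 returns u1), which is what the image shows: interference where the waves overlap, with each wave continuing on unchanged.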

I've now identified three universal properties of spherical standing waves that are frequently presented as special features of quantum physics:

  • quantisation of energy
  • harmonics = higher energy states (aka orbitals)
  • superposition (of waves)

These structural properties of standing waves are not "secret", but they are almost always left out of narrative accounts of quantum physics. And yet, these are important intuitions to bring to bear when applying wave mechanics to describing real systems.

Something else to keep in mind is that "quantisation" is an ad hoc assumption in quantum physics. It's postulated to be a fundamental feature of all quantum fields. The only problem is that all of the physical fields we know of—which is to say the fields we can actually measure—are smooth and continuous across spacetime: including gravitational fields and electromagnetic fields. Scientists have imagined discontinuous or quantized fields, but they have never actually seen one.

Moreover, as far as I know, the only physical mechanism in our universe that is known to quantize energy, create harmonics, and allow for superposition is the standing wave. The logical deduction from these facts is that it is the standing wave structure of the atom that quantizes the energy of electrons and photons and creates electron orbitals. 

Quantization is a structural property of atoms, not a substantial property of fields. (Or more conventionally and less precisely, quantization is an emergent property, not a fundamental property). 
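A classical standing wave on a string illustrates the point. The length and wave speed below are arbitrary illustrative values; the discreteness comes entirely from the boundary conditions, which is to say from structure:

```python
# Modes of a classical standing wave on a string of length L with fixed
# ends. The boundary conditions admit only wavelengths that fit exactly,
# so the allowed frequencies form a discrete ("quantised") set.
# L and v are arbitrary illustrative values.

L = 1.0    # string length, m
v = 340.0  # wave speed on the string, m/s

def mode(n: int) -> tuple[float, float]:
    """Wavelength and frequency of the n-th harmonic (n = 1, 2, 3, ...)."""
    wavelength = 2.0 * L / n
    frequency = n * v / (2.0 * L)
    return wavelength, frequency

for n in range(1, 5):
    wl, freq = mode(n)
    print(f"n={n}: wavelength = {wl:.3f} m, frequency = {freq:.1f} Hz")
```

No in-between frequencies are possible; nothing about the string's material is "quantised", only its modes.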

Also, as I have already explained, the coexistence of probabilities always occurs before any event, and those probabilities always collapse at the point when an event has a definite outcome. There is nothing "weird" about this; it's not a "problem". What is weird is the idea that hypostatizing and reifying probabilities leads to some meaningful metaphysics. It has not, and it will not.

While the superposition of waves or probabilities is an everyday occurrence, the superposition of physical objects is another story. Physical objects occupy space in an exclusive way: if one object is in a location, no other physical object can also be in that location. Physical objects cannot superpose, and they are never observed to be superposed. And yet, the superposition of point particles is how physicists continue to explain the electron in an atom.

The electric field has been measured and found to be smooth and continuous in spacetime, just as Maxwell predicted. Given this, simple logic and basic geometry dictate that if—

  1. the electrostatic field of the proton has spherical symmetry, and
  2. a hydrogen atom is electrostatically neutral, and
  3. the neutrality is assumed to be the result of the electron's electrostatic field,

—then the electron can only be in one configuration: it must be a sphere (or a close approximation of a sphere) completely surrounding the proton. This is the only way to ensure that all the field lines emerging from the proton terminate at the electron. Otherwise there are unbalanced forces: a net charge rather than neutrality. And a changing electric field dissipates energy, which electrons in atoms do not.

Unbalanced forces

Now, if the electron is both a wave and a sphere, then the electron can only be a spherical standing wave. The Bohr model of the atom was incorrect and it surprises me greatly that this problem was not identified at the time. 

And if the electron is a spherical standing wave then, because these are universal features of standing waves, we expect:

  1. The energy of the electron in the H atom will be quantised.
  2. The electron will form harmonics corresponding to higher energy states and it will jump between them when it absorbs or emits photons.
  3. When two electron waves intersect, the sum of their amplitudes is also a solution to the wave equation.

Moreover, we can now take pictures of atoms using electron microscopes. Atoms are physical objects. In every single picture, atoms appear to be approximately spherical.


And yet mainstream quantum models do not quite treat atoms as real. Quantum physics is nowadays all about probabilities. The problem is that, as I established in an earlier essay, a probability cannot possibly balance an electrostatic field to create a neutral atom. Only a real electric field can do this. Schrödinger was right to be unconvinced by the probability interpretation, even if it works. But he was wrong about modelling a particle as a wave. 

Waves are observed to superpose all the time. Solid objects are never observed to do so. The only reason we even consider superposition for "particles" is the wave-particle duality postulate, which we now know to be inaccurate. "Particles" are waves.

As I understand it, the idea that our universe consists of 17 fields in which particles are "excitations" is a widely accepted postulate. And as such, one might have expected scientists to go back over the physics predicated on wave-particle duality and recast it in terms of only waves. Having the wave equation describe a wave would be a start.

I digress. Clearly the idea that observers influence outcomes is trivially false. So now we must turn to the common fudge of removing the observer from the observation.


Interaction as Observation

One way around the problems with observation is to redefine "observation" so that it excludes actual observations and observers: "observation" comes to mean "some physical interaction". I'm sure I've mentioned this before, because I used to think this was a good idea.

While we teach quantum physics in terms of isolated "particles" in empty, flat space, the fact is that the universe is crammed with matter and energy, especially in our part of the universe. Everything is interacting with everything that it can interact with, simultaneously in all the ways that it can interact, at every moment that it is possible to interact. Nothing in reality is ever simple.

In classical physics, we are used to being able to isolate experiments and exclude variables. This cannot ever happen at the nanoscale and below. An electron, for example, is surrounded by an electrostatic field which interacts with the fields around all other wavicles, near and far.

Electrons are all constantly pushing against each other via the electromagnetic force. If your apparatus contains electrons, their fields invariably interact with the electron you wish to study. This includes mirrors, beam-splitters, prisms, diffraction gratings, and double slits. The apparatus is not "classical"; it's part of the quantum system you study. At the nanoscale and below, there is no neutral apparatus.

Therefore, the idea that interaction causes the wavefunction to "collapse" is also untenable because in the real world wavicles are always interacting. In an H atom, for example, the electron and the proton are constantly and intensely interacting via the electromagnetic force. So the electron in an H atom could never be in a superposition.


Conclusions

Observation can only occur after the fact and is limited by the speed of light (or speed of causality).

Neither "observation" nor "consciousness" can play any role in the sequence of events, let alone a causal role.

Schrödinger's cat is never both alive and dead. And observation makes no difference to this (because observation can only ever be post hoc and acausal).

It is always the case, no matter what kind of system we are talking about, that probabilities for all possibilities coexist prior to an event and collapse as the event produces a specific outcome. But this is in no way analogous to waves superposing and should not be called "superposition".

All (linear) waves can superpose. All standing waves are quantised. All standing waves have harmonics.

Defining observation so as to eliminate the observer doesn't help as much as physicists might wish.

"Observation" is irrelevant to how we formulate physics.

The wave-particle duality postulate is still built into quantum mechanics, despite being known to be false.

For the last century, quantum physicists have been trying to change reality to fit their theory. Many different kinds of reality have been proposed to account for quantum theory: Copenhagen, Many Worlds, QBism, etc. I submit that proposing a wholly different reality to account for your theory is tantamount to insanity. The success in predicting probabilities seems to have caused physicists to abandon science. I don't get it, and I don't like it.

~~Φ~~


Bibliography

Weisstein, Eric W. (2025) "Laplace's Equation." MathWorld. https://mathworld.wolfram.com/LaplacesEquation.html
