Wavefunctions are functions (think f(x) = y) that take a system state as input and produce something like the square root of a probability as output. The “square root of a probability” takes the form of a set (vector) of complex numbers. These functions are called wavefunctions because, in practice, they appear as solutions to wave equations. There are at least three quirks that seem off-putting about wavefunctions when first encountered, which I’ll phrase as three questions:
Why are they waves?
Why are they complex-valued?
Why are there multiple outputs?
I’m going to ignore the first question for now, because it’s an empirical question, not a mathematical one. Being empirical means the relevant experiments and theories become important for any explanation, and I don’t feel like I know enough of the history to say something definitive about those.
So we’re focusing on why wavefunctions are complex-valued vectors.
Note that this post assumes familiarity with linear algebra. If you’re not familiar with linear algebra, please watch 3Blue1Brown’s lecture series on the topic first. He’s a smart cookie, and he explains things well.
Quaternions
The arithmetic of spatial quantities (rotation angles, lengths, areas, and so on) is given by what’s called a Clifford Algebra. As far as I can tell, Clifford Algebras were discovered as a generalization of quaternions, which are like imaginary/complex numbers designed to support rotations in 3D space. Quaternions were constructed as solutions to the following equations.
i² = -1
j² = -1
k² = -1
ijk = -1
From the first three equations, you can probably imagine why complex numbers work as a starting point for understanding quaternions. Intuitively, i, j, and k each represent a different 90-degree rotation, so the first three equations state that applying two 90-degree rotations (a 180-degree rotation) to a value is the same thing as negating that value. Think of a number line, add a second dimension, and it should make sense. The fact that there are three ways to do this encodes the fact that, in 3D space, there are 3 axes around which you can rotate something.
You can work out the last equation visually to confirm that it is correct. (“If you rotate 90 degrees along one axis, then 90 degrees along another axis, the result is the same as rotating 90 degrees along the third axis.”) That equation is interesting though because it implicitly encodes the fact that you can apply at most two of i, j, k before the total rotation “collapses” back to an equivalent rotation consisting of only two-or-fewer of i, j, k. The hidden consequence is that the “state” of a rotation can be represented by two numbers. Intuitively, if you pick an arbitrary point on the surface of a sphere to call the origin, you can use a 2D coordinate to represent any other point on the sphere. In fact, under this scheme, there would be two distinct ways to describe any given point (except the origin) without involving shenanigans like doing loops around the sphere. Though I won’t discuss it, the fact that there are two representations for each non-zero rotation is related to quantum spin.
One thing to quickly note here. When you represent a rotation as a point on a sphere relative to some origin, addition becomes a natural concept, albeit in a projective space rather than a normal Euclidean one. With the original i, j, k representation, addition didn’t quite make intuitive sense; we had to multiply elements.
Anyway, let’s solve that last equation for one of the rotations.
ijk = -1
ijk(k) = -1(k)
ij(kk) = -k
ij(-1) = -k
ij = k
The order of multiplications matters: ij = k, but ji = -k. You can try it out yourself.
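If you would rather try it out in code than by hand, here is a minimal sketch. It assumes quaternions are stored as (w, x, y, z) tuples and multiplied with the standard Hamilton product; the tuple layout and the names qmul and qneg are choices made for this example, not anything from the post. Since the product rule already bakes in the defining equations, this is a consistency check rather than a derivation.

```python
# Quaternions stored as (w, x, y, z), meaning w + x*i + y*j + z*k.
# The layout and the helper names are assumptions made for this sketch.

def qmul(p, q):
    """Hamilton product of two quaternions."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )

def qneg(q):
    """Negate every component of a quaternion."""
    return tuple(-c for c in q)

I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)
MINUS_ONE = (-1, 0, 0, 0)

print(qmul(I, J) == K)                   # ij = k
print(qmul(J, I) == qneg(K))             # ji = -k
print(qmul(qmul(I, J), K) == MINUS_ONE)  # ijk = -1
```

Swapping the arguments of qmul is the quickest way to see that the order matters.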
Anyway, with this, we can represent our algebra of rotations with just real-valued scalars plus two new elements (i and j) and their corresponding equations (i² = j² = -1). It turns out, miraculously, that the last equation (ijk = -1) isn’t needed because it follows from just the first two, plus our definition of k and the fact that swapping i and j flips the sign (ji = -ij, as noted above).
ijk = ij(ij)
= -ij(ji)
= -i(jj)i
= -i(-1)i
= ii
= -1
So in the end, the original definition of quaternions was more verbose than it needed to be. We only really needed to know two things if we wanted to derive quaternions:
From any point on the surface of a sphere, we have two “dimensions” of rotations, arbitrarily labeled i and j.
Squaring any rotation i or j yields -1.
As you’ll see soon, these rules can be generalized to deal with many other things, including areas, volumes, and even infinitesimals.
Clifford Algebras
As I mentioned earlier, the arithmetic of spatial quantities is given by Clifford Algebras. A Clifford Algebra is a set of numbers attached to spatial quantities, similar to how quaternions have “numbers” i and j attached to 90-degree rotations. A Clifford Algebra can be constructed from the following inputs:
Pick a number of dimensions and the kind of value each dimension deals with.
Define a function for squaring a unit vector in each of your dimensions.
Here are some examples.
For complex numbers, start with 1 dimension of real values, and let the squaring function be {i² = -1}.
For quaternions, start with 2 dimensions of real values, and let the squaring functions be {i² = -1, j² = -1}. Two equations, one for each dimension.
For infinitesimals in Euclidean space, start with 3 dimensions of real values, and let the squaring functions be {dx² = dy² = dz² = 0}. If you work it out, you’ll see that any 1D infinitesimal quantity can be represented as a linear combination of dx, dy, and dz; multiplying any two of those gives you a 2D infinitesimal area; and multiplying all three of them gives you a 3D infinitesimal volume. For example, if you try to extrude dxdy along dx, you’ll get dxdxdy = (dx²)dy = 0dy = 0 for the volume, as expected since you would get a flat box.
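The three examples above can be sketched with one small multiplication routine. This is only an illustration under stated assumptions: products of unit directions are stored as sorted tuples of indices, the function name blade_product is made up for this sketch, and swapping two different directions is assumed to flip the sign (the same rule as ji = -ij for quaternions).

```python
def blade_product(a, b, squares):
    """Multiply two products of unit directions.

    a, b:    tuples of direction indices in increasing order, e.g. (0, 1)
             stands for "direction 0 times direction 1".
    squares: squares[n] is the scalar that direction n squares to.
    Returns (coefficient, tuple_of_remaining_directions).
    """
    idx = list(a) + list(b)
    sign = 1
    # Bubble sort the indices. Every swap exchanges two *different*
    # directions, which is assumed to flip the sign.
    n = len(idx)
    for i in range(n):
        for k in range(n - 1 - i):
            if idx[k] > idx[k + 1]:
                idx[k], idx[k + 1] = idx[k + 1], idx[k]
                sign = -sign
    # Equal directions are now adjacent; replace each pair by its square.
    coeff = sign
    out = []
    for g in idx:
        if out and out[-1] == g:
            out.pop()
            coeff *= squares[g]
        else:
            out.append(g)
    return coeff, tuple(out)

# Complex numbers: one direction with i² = -1.
print(blade_product((0,), (0,), [-1]))

# Quaternions: i = (0,), j = (1,), and k = ij = (0, 1).
quat = [-1, -1]
print(blade_product((0,), (1,), quat))      # ij
print(blade_product((1,), (0,), quat))      # ji
print(blade_product((0, 1), (0, 1), quat))  # k squared

# Infinitesimals: dx = (0,), dy = (1,), dz = (2,), all squaring to 0.
inf = [0, 0, 0]
print(blade_product((0,), (0, 1), inf))     # dx times dxdy
print(blade_product((0, 1), (2,), inf))     # dxdy times dz
```

An empty tuple in the result means a plain scalar, so a coefficient of -1 with an empty tuple is the number -1, and a coefficient of 0 means the product vanishes.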
To be clear, it’s very strange that such a simple set of rules can describe all of these spatial quantities. That’s math for you.
Anyway. This algebra is clearly very general, which is great, but any given Clifford Algebra has a lot of hidden relationships that would be nice to make more explicit. To take quaternions as an example, it’s not obvious from the Clifford Algebra construction that there should be a third rotation axis k = ij that spontaneously forms when we only specified two. To reveal things like this, we would need some way to represent the elements of a Clifford Algebra using objects whose properties we understand better.
There are other reasons to find a better representation. Two, in particular, are relevant for wavefunctions:
The pragmatic one: you need a numerical representation of Clifford Algebra elements if you want to use them in equations without making the equations incomprehensible.
As suggested by quaternions, the operations associated with Clifford Algebra elements seem to act on some underlying “state,” which is a vector. Whereas the Clifford Algebra elements can be intuitively combined through multiplications, it’s less intuitive what addition refers to, though it’s certainly possible within the algebra. (dydx + dz is a valid element, but what’s an area plus a length?) When we converted quaternion operations to vectors-from-a-starting-point, addition became a sensible operation. Maybe the vector view can illuminate other operations in the same way.
To find a good representation, we need to be mindful of which operations need to remain valid. In the case of Clifford Algebras, the new representation needs to preserve addition and multiplication between elements. As a bonus, it would be nice if it preserved the distinctions between two Clifford Algebra elements. We wouldn’t want two rotations in a Clifford Algebra to get collapsed into just one operation in the new representation, since that might make us miss some interesting truths. As a double bonus, it would be nice if the new representation only supported the elements that exist in the Clifford Algebra. That way, we don’t need to worry about checking whether an operation in the new representation is valid in the old representation.
Matrix rings
Those requirements, preserving addition, multiplication, and maintaining a 1-to-1 map between elements, are collectively referred to as an “algebra isomorphism.” It turns out that all Clifford Algebras we care about in physics are algebra-isomorphic to some set of matrices.
The i, j, and ij = k operations of quaternions are represented by, effectively, the Pauli matrices. (You can recover these matrices from the actual Pauli matrices by multiplying by the complex number i: the quaternion i is i times σz, j is i times σy, and k is i times σx.) You end up with 2x2 complex-valued matrices that look like this.
i = [ [i,0], [0,-i] ]
The i on the left-hand side corresponds to the quaternion rotation i. The i on the right-hand side corresponds to the complex number i.
j = [ [0,1], [-1,0] ]
ij = k = [ [0,i], [i,0] ]
Ditto here. On the left-hand side, i refers to the quaternion rotation. On the right-hand side, it’s the complex number.
You can check that matrix multiplication of i and j does actually yield k, and that any of these matrices squared yields -1.
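Here is one way to run that check, as a small sketch in plain Python using its built-in complex numbers (1j is the complex number i). The names QI, QJ, QK and the helper functions are just labels chosen for this example.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def scale(s, a):
    """Multiply every entry of a matrix by the scalar s."""
    return [[s * x for x in row] for row in a]

ONE = [[1, 0], [0, 1]]     # the scalar 1
QI = [[1j, 0], [0, -1j]]   # quaternion i
QJ = [[0, 1], [-1, 0]]     # quaternion j
QK = [[0, 1j], [1j, 0]]    # quaternion k

MINUS_ONE = scale(-1, ONE)

print(matmul(QI, QJ) == QK)             # ij = k
print(matmul(QJ, QI) == scale(-1, QK))  # ji = -k
print(all(matmul(m, m) == MINUS_ONE for m in (QI, QJ, QK)))  # squares are -1
```

Here "-1" means -1 times the identity matrix, which is how the scalar -1 shows up in the matrix picture.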
Combine this with the usual representation of the scalar 1 (multiplicative identity)…
1 = [ [1,0], [0,1] ]
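… And you end up with a basis for the quaternions inside the 2x2 complex matrices: every quaternion is a combination of these four matrices with real coefficients. You can add and multiply them like normal matrices, and as long as you stick to real coefficients, no matter what the result is, you can always convert it back to a quaternion.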
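To make the back-and-forth conversion concrete, here is a hedged sketch. It assumes the quaternion a + bi + cj + dk is stored as the tuple (a, b, c, d) and mapped to a times the identity plus b, c, d times the three matrices given earlier; the function names to_matrix and to_quaternion are made up for this example.

```python
def to_matrix(q):
    """Map a + b*i + c*j + d*k to a*1 + b*i + c*j + d*k as 2x2 complex matrices."""
    a, b, c, d = q
    return [[complex(a, b), complex(c, d)],
            [complex(-c, d), complex(a, -b)]]

def to_quaternion(m):
    """Read (a, b, c, d) back off the top row.

    Assumes m really is of the form produced by to_matrix.
    """
    return (m[0][0].real, m[0][0].imag, m[0][1].real, m[0][1].imag)

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

p = (1, 2, 3, 4)
q = (5, 6, 7, 8)

# Multiply in the matrix picture, then convert back to a quaternion.
product = to_quaternion(matmul(to_matrix(p), to_matrix(q)))
print(product)  # (-60.0, 12.0, 30.0, 24.0)
```

The printed result agrees with multiplying out (1 + 2i + 3j + 4k)(5 + 6i + 7j + 8k) by hand using ij = k and friends, which is what preserving multiplication means here.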
With this matrix representation, it becomes more clear what vectors the quaternions manipulate. Since 2x2 complex matrices act on 2D vectors, and since the algebra of quaternions is algebra-isomorphic to these 2x2 complex matrices, we can say that quaternions also act on 2D vectors.
Wavefunctions in spacetime
Wavefunctions assign a “compatible” representation of probability to each spatial quantity in spacetime. Because they deal with spatial quantities, we’ll need an appropriate Clifford Algebra if we want to manipulate them. Wavefunction outputs are additive, not multiplicative, so for calculations, we’ll need a vector representation based on those Clifford Algebra elements.
The Clifford Algebra for spacetime is given by:
4 total dimensions (1 for time, 3 for space) of real (floating point) values.
Squaring functions {t² = 1, x² = y² = z² = -1}. These are given by the Minkowski Metric, which describes how to calculate invariant intervals between events in spacetime. In special relativity, perceived distances and time intervals change depending on your velocity. The intervals given by the Minkowski Metric are special because they don’t change with velocity, and so they’re globally defined.
Note that this is for spacetime under the theory of special relativity! If you want to work with different models of spacetime or work with different invariants, you would need a different algebra!
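To see the "doesn’t change with velocity" claim numerically, here is a small sketch. It assumes units where the speed of light is 1 and uses the standard Lorentz boost along the x axis to switch between observers; the boost formula and the function names are additions for this example, not something the post spells out.

```python
import math

def interval(t, x, y=0.0, z=0.0):
    """Squared interval, with signs matching t² = 1 and x² = y² = z² = -1."""
    return t * t - x * x - y * y - z * z

def boost_x(t, x, v):
    """Coordinates of the same event for an observer moving at velocity v along x.

    Standard Lorentz boost, assuming units where c = 1 and |v| < 1.
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

t, x = 3.0, 1.0
for v in (0.0, 0.3, 0.6, 0.9):
    t2, x2 = boost_x(t, x, v)
    # t2 and x2 change with v, but the interval stays the same up to rounding.
    print(v, t2, x2, interval(t2, x2))
```

The time and distance columns change from row to row, while the last column stays at the same value up to floating-point rounding.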
The matrix representation for this particular Clifford Algebra is given by 2x2 quaternion-valued matrices, which can be represented by a 4x4 complex-valued matrix. As a result, the “compatible” representation for wavefunction values is a length-4 vector of complex values. In total, wavefunctions represent probabilities as a length-4 vector of complex numbers. Note that this is the most general case, and different particles might involve only a subset of this algebra acting on only a subspace of these vectors.
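The post does not say which 4x4 complex matrices to use. As an illustration only, the sketch below assumes the standard Dirac representation of the gamma matrices, built from the Pauli matrices, and checks that they obey the squaring functions above (t² = 1, x² = y² = z² = -1) and that different directions anticommute. All names here are choices made for the example.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(s, a):
    """Multiply every entry of a matrix by the scalar s."""
    return [[s * x for x in row] for row in a]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]      # Pauli matrices
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
I4 = block(I2, Z2, Z2, I2)

# Assumed choice: Dirac representation, ordered (t, x, y, z).
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + [
    block(Z2, s, scale(-1, s), Z2) for s in (SX, SY, SZ)
]
ETA = [1, -1, -1, -1]      # what each direction squares to

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

# Same direction: ab + ba = 2 * (its square). Different directions: ab + ba = 0.
all_ok = all(
    anticommutator(GAMMA[m], GAMMA[n]) == scale(2 * ETA[m] if m == n else 0, I4)
    for m in range(4) for n in range(4)
)
print(all_ok)  # True
```

Other representations would pass the same check; the only property the text relies on is that the matrices square and anticommute the way the Clifford Algebra says they should.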
In any case, I guess that answers our original two questions. Wavefunctions are complex-valued vectors because that’s how the underlying states are represented for relativistic spacetime.
Caveat: I’m not a physicist. (Why did you read this?!) If I made any mistakes, please let me know.
Check https://paperclip.substack.com/p/comments-on-understanding-wavefunction for a changelist.
I haven't read through all of this, but I'm sure you took a wrong turn, because at the end you're saying quantum wavefunctions are complex-valued because quaternions are how you should represent quantities in space-time, and that all this has something to do with spinors. And that's wrong in multiple ways.
A quantity is a spinor if it transforms in certain ways under rotations. A complex-valued object in space-time doesn't have to be a spinor; it can also be a scalar, vector, or tensor. Also, on the physics side, a large part of the significance of spinors is that they necessarily become fermionic when quantized.
I suggest reading some resources that don't try to reduce everything to Clifford algebras...