Dirac equation

The Dirac equation is a relativistic quantum mechanical wave equation invented by Paul Dirac in 1928. It provides a description of elementary spin-1/2 particles, such as electrons, that is fully consistent with the principles of quantum mechanics and largely consistent with the theory of special relativity. It also accounts in a natural way for the nature of particle spin and the existence of antiparticles.

Table of contents

1 Introduction
2 Derivation of the Dirac equation

2.1 Nature of the wavefunction
2.2 Energy spectrum
2.3 Hole theory

3 Electromagnetic interaction

3.1 Interaction Hamiltonian

4 Relativistically covariant notation
5 References

5.1 Selected Papers
5.2 Textbooks

Introduction

Since the Dirac equation was originally invented to describe the electron, we will generally speak of "electrons" in this article. The equation applies equally to other types of elementary spin-1/2 particles, such as neutrinos. A modified Dirac equation can be used to approximately describe protons and neutrons, which are made of smaller particles called quarks and are therefore not elementary particles.

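The Dirac equation is

$$\left(\alpha_0 mc^2 + \sum_{j=1}^{3}\alpha_j p_j c\right)\psi(\mathbf{x},t) = i\hbar\,\frac{\partial\psi}{\partial t}(\mathbf{x},t)$$

where m is the rest mass of the electron, c is the speed of light, p is the momentum operator, ħ is the reduced Planck constant, x and t are the space and time coordinates respectively, and ψ(x, t) is a four-component wavefunction. (The wavefunction has to be formulated as a four-component spinor, rather than a simple scalar, due to the demands of special relativity. The physical meanings of the components are discussed below.) The α's are linear operators acting on the wavefunction. When the wavefunction is written as a column matrix, they are represented by 4×4 matrices known as Dirac matrices. There is more than one way to choose a set of Dirac matrices; a convenient choice is

$$\alpha_0 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}, \qquad \alpha_1 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix},$$

$$\alpha_2 = \begin{bmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{bmatrix}, \qquad \alpha_3 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{bmatrix}$$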

The Dirac equation describes the probability amplitudes for a single electron. This single-particle theory gives a fairly good prediction of the spin and magnetic moment of the electron and explains much of the fine structure observed in atomic spectral lines. It also makes the peculiar prediction that there exists an infinite set of quantum states in which the electron possesses negative energy. This strange result led Dirac to predict, via a remarkable hypothesis known as "hole theory", the existence of particles behaving like positively-charged electrons. This prediction was verified by the discovery of the positron in 1932.

Despite these successes, the theory is flawed by its neglect of the possibility of creating and destroying particles, one of the basic consequences of relativity. This difficulty is resolved by reformulating it as a quantum field theory. Adding a quantized electromagnetic field to this theory leads to the modern theory of quantum electrodynamics (QED). For a more detailed discussion of the field formulation, refer to the article on Dirac field theory.

Derivation of the Dirac equation

The Dirac equation is a special case of the Schrödinger equation, which describes the time evolution of a quantum mechanical system:

$$H\,|\psi(t)\rangle = i\hbar\,\frac{d}{dt}|\psi(t)\rangle$$

For convenience, we will work in the position basis, in which the state of the system is represented by a wavefunction, ψ(x,t). In this basis, the Schrödinger equation becomes

$$H\,\psi(\mathbf{x},t) = i\hbar\,\frac{\partial\psi}{\partial t}(\mathbf{x},t)$$

where the Hamiltonian H now denotes an operator acting on wavefunctions rather than state vectors.

We have to specify the Hamiltonian so that it appropriately describes the total energy of the system in question. Let us consider a "free" electron isolated from all external force fields. For a non-relativistic model, we adopt a Hamiltonian analogous to the kinetic energy of classical mechanics (ignoring spin for the moment):

$$H = \sum_{j=1}^{3}\frac{p_j^2}{2m}$$

where the p's are the momentum operators in each of the three spatial directions j=1,2,3. Each momentum operator acts on the wavefunction as a spatial derivative:

$$p_j\,\psi(\mathbf{x},t) = -i\hbar\,\frac{\partial\psi}{\partial x_j}(\mathbf{x},t)$$

To describe a relativistic system, we have to find a different Hamiltonian. Assume that the momentum operators retain the above definition. According to Albert Einstein's famous mass-momentum-energy relationship, the total energy of a system is given by

$$E = \sqrt{(mc^2)^2 + \sum_{j=1}^{3}(p_j c)^2}$$

This suggests a wave equation of the form

$$\sqrt{(mc^2)^2 + \sum_{j=1}^{3}(p_j c)^2}\;\psi = i\hbar\,\frac{\partial\psi}{\partial t}$$

This is not a satisfactory equation, for it does not treat time and space on an equal footing, one of the basic tenets of special relativity. Dirac reasoned that, since the right side of the equation contains a first-order derivative in time, the left side should contain equally simple first-order derivatives in space (i.e., in the momentum operators). One way for this to happen is if the quantity in the square root is a perfect square. Suppose that

$$(mc^2)^2 + \sum_{j=1}^{3}(p_j c)^2 = \left(\alpha_0 mc^2 + \sum_{j=1}^{3}\alpha_j p_j c\right)^2$$

where the α's are constants to be determined. Expanding the square and comparing coefficients on each side, we obtain the following conditions for the α's:

$$\alpha_\mu^2 = I \quad (\mu = 0,1,2,3), \qquad \alpha_\mu\alpha_\nu + \alpha_\nu\alpha_\mu = 0 \quad (\mu \neq \nu)$$

Here, I stands for the identity element. These conditions may be written more concisely as

$$\{\alpha_\mu, \alpha_\nu\} = 2\,\delta_{\mu\nu}\,I, \qquad \mu,\nu = 0,1,2,3$$

where {...} is the anticommutator, defined as {A,B}≡AB+BA, and δ is the Kronecker delta, which has the value 1 if its two subscripts are equal and 0 otherwise.
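
To make the comparison of coefficients explicit, the square can be expanded term by term. The following is a sketch, assuming that the α's commute with the momentum operators and that the p_j commute with one another:

```latex
\begin{aligned}
\Bigl(\alpha_0 mc^2 + \sum_{j=1}^{3}\alpha_j p_j c\Bigr)^2
 &= \alpha_0^2\,(mc^2)^2 + \sum_{j=1}^{3}\alpha_j^2\,(p_j c)^2 \\
 &\quad + \sum_{j=1}^{3}\{\alpha_0,\alpha_j\}\,mc^3\,p_j
        + \sum_{1\le j<k\le 3}\{\alpha_j,\alpha_k\}\,c^2\,p_j p_k .
\end{aligned}
```

This equals (mc²)² + Σ (p_j c)² for arbitrary momenta only if every α squares to the identity and every pair of distinct α's anticommutes, which is the content of the conditions stated above.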

These conditions cannot be satisfied if the α's are ordinary numbers, but they can be satisfied if the α's are matrices. The matrices must be Hermitian, so that the Hamiltonian is Hermitian. The smallest matrices that work are 4×4 matrices, but there is more than one possible choice, or representation, of matrices. Although the choice of representation does not affect the properties of the Dirac equation, it does affect the physical meaning of the individual components of the wavefunction.

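In the introduction, we presented the representation used by Dirac. This representation can be more compactly written as

$$\alpha_0 = \begin{bmatrix} I & 0 \\ 0 & -I \end{bmatrix}, \qquad \alpha_j = \begin{bmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{bmatrix}$$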

where 0 and I are the 2×2 zero and identity matrices, respectively, and the σ_j's (j=1,2,3) are the Pauli matrices.
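
The conditions on the α's can be checked numerically for this representation. The following is a minimal sketch in plain Python; the helper names (`block`, `matmul`, `dagger`, `anticommutator`, `satisfies_dirac_conditions`) are illustrative and not part of the original text. It assembles the 4×4 matrices from the Pauli matrices in the block form α₀ = [[I, 0], [0, −I]], α_j = [[0, σ_j], [σ_j, 0]], and tests Hermiticity together with {α_μ, α_ν} = 2δ_μν I.

```python
# Pauli matrices and 2x2 building blocks, stored as nested lists.
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
MINUS_I2 = [[-1, 0], [0, -1]]
SIGMA = [
    [[0, 1], [1, 0]],       # sigma_1
    [[0, -1j], [1j, 0]],    # sigma_2
    [[1, 0], [0, -1]],      # sigma_3
]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from four 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(m):
    """Conjugate transpose."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def anticommutator(a, b):
    """{A, B} = AB + BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


# alpha_0 = [[I, 0], [0, -I]] and alpha_j = [[0, sigma_j], [sigma_j, 0]]
ALPHA = [block(I2, Z2, Z2, MINUS_I2)] + [block(Z2, s, s, Z2) for s in SIGMA]


def satisfies_dirac_conditions(alphas):
    """Check Hermiticity and {alpha_mu, alpha_nu} = 2 delta_mu_nu I."""
    for mu, a in enumerate(alphas):
        if dagger(a) != a:
            return False
        for nu, b in enumerate(alphas):
            expected = [[2 if (mu == nu and i == j) else 0 for j in range(4)]
                        for i in range(4)]
            if anticommutator(a, b) != expected:
                return False
    return True


print(satisfies_dirac_conditions(ALPHA))  # prints True
```

All entries are small integers or multiples of i, so the comparisons are exact and no floating-point tolerance is needed.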

It is now straightforward to carry out the square root, which gives the Dirac equation. The Hamiltonian in this equation,

$$H = \alpha_0 mc^2 + \sum_{j=1}^{3}\alpha_j p_j c$$

is called the Dirac Hamiltonian.
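
As a quick consistency check, the square of the Dirac Hamiltonian should reproduce the relativistic energy relation. The sketch below assumes a plane-wave state, so that the momentum operators can be replaced by ordinary numbers p_j, and uses hypothetical units with m = c = 1. The function names are illustrative. It verifies that H² = [(mc²)² + Σ (p_j c)²] I.

```python
# Dirac matrices in the Dirac representation, written out explicitly.
ALPHA0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ALPHA1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
ALPHA2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
ALPHA3 = [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]]


def dirac_hamiltonian(m, c, p):
    """H = alpha_0 m c^2 + sum_j alpha_j p_j c for a numeric momentum p."""
    terms = [(m * c**2, ALPHA0), (p[0] * c, ALPHA1),
             (p[1] * c, ALPHA2), (p[2] * c, ALPHA3)]
    return [[sum(coef * mat[i][j] for coef, mat in terms) for j in range(4)]
            for i in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def squares_to_energy(m, c, p, tol=1e-12):
    """Return True if H^2 equals E^2 times the identity, within tol."""
    h = dirac_hamiltonian(m, c, p)
    h2 = matmul(h, h)
    e2 = (m * c**2) ** 2 + sum((pj * c) ** 2 for pj in p)
    return all(abs(h2[i][j] - (e2 if i == j else 0.0)) < tol
               for i in range(4) for j in range(4))


# Hypothetical values: m = c = 1 and an arbitrary momentum vector.
print(squares_to_energy(1.0, 1.0, (0.3, -0.4, 1.2)))
```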

Nature of the wavefunction

Since the wavefunction ψ is acted on by the 4×4 Dirac matrices, it must be a four-component object. We will see, in the next section, that the wavefunction contains two sets of degrees of freedom, one associated with positive energies and the other with negative energies, with each set containing two degrees of freedom that describe the probability amplitudes for the spin to be pointing "up" or "down" along a specified direction.

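We may explicitly write the wavefunction as a column matrix:

$$\psi(\mathbf{x},t) = \begin{bmatrix} \psi_1(\mathbf{x},t) \\ \psi_2(\mathbf{x},t) \\ \psi_3(\mathbf{x},t) \\ \psi_4(\mathbf{x},t) \end{bmatrix}$$

The dual wavefunction can be written as a row matrix:

$$\psi^\dagger(\mathbf{x},t) = \begin{bmatrix} \psi_1^*(\mathbf{x},t) & \psi_2^*(\mathbf{x},t) & \psi_3^*(\mathbf{x},t) & \psi_4^*(\mathbf{x},t) \end{bmatrix}$$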

where the * superscript denotes complex conjugation. By comparison, the dual of a scalar (one-component) wavefunction is just its complex conjugate.

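As in ordinary single-particle quantum mechanics, the "absolute square" of the wavefunction gives the probability density of the particle at each position x and time t. In this case, the "absolute square" is obtained by matrix multiplication:

$$\psi^\dagger\psi\,(\mathbf{x},t) = \sum_{k=1}^{4}\psi_k^*(\mathbf{x},t)\,\psi_k(\mathbf{x},t)$$

The conservation of probability gives the normalization condition

$$\int \psi^\dagger\psi\,(\mathbf{x},t)\;d^3x = 1$$

By applying Dirac's equation, we can examine the local flow of probability:

$$\frac{\partial}{\partial t}\left(\psi^\dagger\psi\right) = -\nabla\cdot\mathbf{J}$$

The probability current J is given by

$$J_j = c\,\psi^\dagger\alpha_j\psi$$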

Multiplying J by the electron charge e yields the electric current density j carried by the electron.
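
The local conservation law can be checked directly from the Dirac equation. The following is a sketch of the steps, assuming the free Dirac Hamiltonian H = α₀mc² − iħc Σ α_j ∂/∂x_j with Hermitian α's, and the probability current J_j = cψ†α_jψ:

```latex
\begin{aligned}
\frac{\partial\psi}{\partial t}
 &= \frac{mc^2}{i\hbar}\,\alpha_0\psi
    - c\sum_{j=1}^{3}\alpha_j\frac{\partial\psi}{\partial x_j},
\qquad
\frac{\partial\psi^\dagger}{\partial t}
  = -\frac{mc^2}{i\hbar}\,\psi^\dagger\alpha_0
    - c\sum_{j=1}^{3}\frac{\partial\psi^\dagger}{\partial x_j}\alpha_j, \\[4pt]
\frac{\partial}{\partial t}\bigl(\psi^\dagger\psi\bigr)
 &= \psi^\dagger\frac{\partial\psi}{\partial t}
    + \frac{\partial\psi^\dagger}{\partial t}\psi
  = -c\sum_{j=1}^{3}\frac{\partial}{\partial x_j}\bigl(\psi^\dagger\alpha_j\psi\bigr)
  = -\nabla\cdot\mathbf{J}.
\end{aligned}
```

The two mass terms cancel because α₀ is Hermitian, and the remaining terms combine into a total spatial derivative by the product rule.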

The values of the wavefunction components depend on the coordinate system. Dirac showed how ψ transforms under general changes of coordinate system, including rotations in three-dimensional space as well as Lorentz transformations between relativistic frames of reference. It turns out that ψ does not transform like a vector under rotations and is in fact a type of object known as a spinor.

Energy spectrum

It is instructive to find the energy eigenstates of the Dirac Hamiltonian. To do this, we must solve the time-independent Schrödinger equation,

$$H\,\psi_0(\mathbf{x}) = E\,\psi_0(\mathbf{x})$$

where ψ₀ is the time-independent part of the energy eigenfunction:

$$\psi(\mathbf{x},t) = \psi_0(\mathbf{x})\,e^{-iEt/\hbar}$$

Let us look for a plane-wave solution. For convenience, we align the z axis with the direction in which the particle is moving, so that

$$\psi_0 = w\,e^{ipz/\hbar}$$

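where w is a constant four-component spinor and p is the momentum of the particle, as we can verify by applying the momentum operator to this wavefunction. In the Dirac representation, the equation for ψ₀ reduces to the eigenvalue equation:

$$\begin{bmatrix} mc^2 & 0 & pc & 0 \\ 0 & mc^2 & 0 & -pc \\ pc & 0 & -mc^2 & 0 \\ 0 & -pc & 0 & -mc^2 \end{bmatrix} w = E\,w$$

For each value of p, there are two eigenspaces, both two-dimensional. One eigenspace contains positive eigenvalues, and the other negative eigenvalues, of the form:

$$E_\pm(p) = \pm\sqrt{(mc^2)^2 + (pc)^2}$$

The positive eigenspace is spanned, up to normalization, by the eigenstates:

$$\left\{ \begin{bmatrix} pc \\ 0 \\ \epsilon \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ pc \\ 0 \\ -\epsilon \end{bmatrix} \right\}$$

and the negative eigenspace by the eigenstates:

$$\left\{ \begin{bmatrix} -\epsilon \\ 0 \\ pc \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ \epsilon \\ 0 \\ pc \end{bmatrix} \right\}$$

where

$$\epsilon \equiv |E| - mc^2$$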

The first spanning eigenstate in each eigenspace has spin pointing in the +z direction ("spin up"), and the second eigenstate has spin pointing in the -z direction ("spin down").
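
The eigenvalue problem for a particle moving along z can be checked numerically. The sketch below assumes the matrix α₀mc² + α₃pc in the Dirac representation and unnormalized spinors built from pc and ε = |E| − mc². This explicit spinor form, the function names, and the units (m = c = 1, p = 0.75, which gives E = 1.25) are assumptions chosen for illustration.

```python
import math


def dirac_matrix_z(m, c, p):
    """Matrix alpha_0 m c^2 + alpha_3 p c acting on w (momentum p along z)."""
    mc2, pc = m * c**2, p * c
    return [
        [mc2, 0.0, pc, 0.0],
        [0.0, mc2, 0.0, -pc],
        [pc, 0.0, -mc2, 0.0],
        [0.0, -pc, 0.0, -mc2],
    ]


def eigenstates(m, c, p):
    """Return (|E|, positive-energy spinors, negative-energy spinors)."""
    mc2, pc = m * c**2, p * c
    energy = math.sqrt(mc2**2 + pc**2)
    eps = energy - mc2                      # epsilon = |E| - m c^2
    positive = [[pc, 0.0, eps, 0.0], [0.0, pc, 0.0, -eps]]   # spin up, down
    negative = [[-eps, 0.0, pc, 0.0], [0.0, eps, 0.0, pc]]   # spin up, down
    return energy, positive, negative


def is_eigenvector(matrix, w, value, tol=1e-12):
    """Return True if matrix @ w equals value * w within tol."""
    mw = [sum(row[k] * w[k] for k in range(4)) for row in matrix]
    return all(abs(mw[i] - value * w[i]) < tol for i in range(4))


# Hypothetical units with m = c = 1 and p = 0.75.
m, c, p = 1.0, 1.0, 0.75
energy, positive, negative = eigenstates(m, c, p)
matrix = dirac_matrix_z(m, c, p)
print(all(is_eigenvector(matrix, w, +energy) for w in positive))  # prints True
print(all(is_eigenvector(matrix, w, -energy) for w in negative))  # prints True
```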

In the non-relativistic limit, the ε spinor component reduces to the kinetic energy of the particle, which is negligible compared to pc:

$$\epsilon \approx \frac{p^2}{2m} \ll pc$$

In this limit, therefore, we can interpret the four wavefunction components as the respective amplitudes of (i) spin-up with positive energy, (ii) spin-down with positive energy, (iii) spin-up with negative energy, and (iv) spin-down with negative energy. This description is not accurate in the relativistic regime, where the non-zero spinor components have similar sizes.
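
The size of the small spinor component can be illustrated numerically. The sketch below assumes ε = |E| − mc² with |E| = √((mc²)² + (pc)²) and hypothetical units with m = c = 1. It prints the ratio ε/(pc), which is small for p ≪ mc and approaches 1 for p ≫ mc, together with the ratio of ε to the non-relativistic kinetic energy p²/2m.

```python
import math


def epsilon(m, c, p):
    """epsilon = |E| - m c^2 for a particle with momentum magnitude p."""
    mc2 = m * c**2
    return math.sqrt(mc2**2 + (p * c) ** 2) - mc2


# Hypothetical units with m = c = 1; p ranges from slow to ultra-relativistic.
m, c = 1.0, 1.0
for p in (0.01, 0.1, 1.0, 10.0):
    eps = epsilon(m, c, p)
    kinetic = p**2 / (2 * m)
    print(f"p={p:<5} eps/(pc)={eps / (p * c):.4f}  "
          f"eps/(p^2/2m)={eps / kinetic:.4f}")
```

For small p the second ratio is close to 1, consistent with ε reducing to the kinetic energy. For large p the two non-zero components become comparable, so the simple labelling of components by the sign of the energy no longer applies.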

Hole theory

The negative E solutions found in the preceding section are problematic, for relativistic mechanics tells us that the energy of a particle at rest (p = 0) should be E = mc² rather than E = -mc². Mathematically speaking, however, there seems to be no reason for us to reject the negative-energy solutions. Since they exist, we cannot simply ignore them, for once we include the interaction between the electron and the electromagnetic field, any electron placed in a positive-energy eigenstate would decay into negative-energy eigenstates of successively lower energy by emitting excess energy in the form of photons. Real electrons obviously do not behave in this way.

To cope with this problem, Dirac introduced the hypothesis, known as hole theory, that the vacuum is the many-body quantum state in which all the negative-energy electron eigenstates are occupied. This description of the vacuum as a "sea" of electrons is called the Dirac sea. Since the Pauli exclusion principle forbids electrons from occupying the same state, any additional electron would be forced to occupy a positive-energy eigenstate, and positive-energy electrons would be forbidden from decaying into negative-energy eigenstates.

Dirac further reasoned that if the negative-energy eigenstates are incompletely filled, each unoccupied eigenstate – called a hole – would behave like a positively charged particle. The hole possesses a positive energy, since energy is required to create a particle–hole pair from the vacuum. Dirac initially thought that the hole was a proton, but Hermann Weyl pointed out that the hole should behave as if it had the same mass as an electron, whereas the proton is over a thousand times heavier. The hole was eventually identified as the positron, experimentally discovered by Carl Anderson in 1932.

By necessity, hole theory assumes that the negative-energy electrons in the Dirac sea interact neither with each other nor with the positive-energy electrons. Without this assumption, the Dirac sea would produce a huge (in fact infinite) amount of negative electric charge, which must somehow be balanced by a sea of positive charge if the vacuum is to remain electrically neutral. However, it is quite unsatisfactory to postulate that positive-energy electrons should be affected by the electromagnetic field while negative-energy electrons are not. For this reason, physicists have abandoned hole theory in favour of Dirac field theory, which bypasses the problem of negative energy states by treating positrons as true particles. (Caveat: in certain applications of condensed matter physics, the underlying concepts of "hole theory" are certainly valid. The sea of conduction electrons in an electrical conductor, called a Fermi sea, contains electrons with energies up to the chemical potential of the system. An unfilled state in the Fermi sea behaves like a positively-charged electron, though it is referred to as a "hole" rather than a "positron". The negative charge of the Fermi sea is balanced by the positively-charged ionic lattice of the material.)

Electromagnetic interaction

So far, we have considered an electron that is not in contact with any external fields. Proceeding by analogy with the Hamiltonian of a charged particle in classical electrodynamics, we can modify the Dirac Hamiltonian to include the effect of an electromagnetic field. The revised Hamiltonian is (in SI units):

$$H = \alpha_0 mc^2 + \sum_{j=1}^{3}\alpha_j\left[p_j - eA_j(\mathbf{x},t)\right]c + e\varphi(\mathbf{x},t)$$

where e is the electric charge of the electron, and A and φ are the electromagnetic vector and scalar potentials, respectively. Here, the potentials are written as exactly-specified functions of the time t and the position operator x. This is a semiclassical approximation that is valid when the quantum fluctuations of the field (i.e., the emission and absorption of photons) are not important.

By setting φ = 0 and working in the non-relativistic limit, Dirac solved for the top two components in the positive-energy wavefunctions (which, as discussed earlier, are the dominant components in the non-relativistic limit), obtaining

$$\left(\frac{1}{2m}\sum_{j=1}^{3}\left[p_j - eA_j(\mathbf{x},t)\right]^2 - \frac{\hbar e}{2m}\sum_{j=1}^{3}\sigma_j B_j(\mathbf{x},t)\right)\begin{bmatrix}\psi_1 \\ \psi_2\end{bmatrix} = \left(E - mc^2\right)\begin{bmatrix}\psi_1 \\ \psi_2\end{bmatrix}$$

where B = ∇ × A is the magnetic field acting on the particle. This is precisely the Pauli equation for a non-relativistic spin-1/2 particle, with magnetic moment ħe/2m (i.e., a spin g-factor of 2). The actual magnetic moment of the electron is larger than this, though only by about 0.12%. The shortfall is due to quantum fluctuations in the electromagnetic field, which have been neglected.
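
For a sense of scale, the Dirac value of the electron magnetic moment, ħe/2m, can be evaluated numerically. The sketch below uses CODATA 2018 values for the constants, which are an assumption not given in the text, and applies the roughly 0.12% correction quoted above to estimate the actual moment. The function name `dirac_moment` is illustrative.

```python
# CODATA 2018 values in SI units (assumed here; not given in the text).
E_CHARGE = 1.602176634e-19      # elementary charge, C
HBAR = 1.054571817e-34          # reduced Planck constant, J s
M_ELECTRON = 9.1093837015e-31   # electron rest mass, kg


def dirac_moment(charge, mass):
    """Magnitude of the magnetic moment hbar*|e|/(2m) predicted for g = 2."""
    return abs(charge) * HBAR / (2.0 * mass)


mu_dirac = dirac_moment(E_CHARGE, M_ELECTRON)

# The text quotes an actual moment larger by about 0.12%.
mu_corrected = mu_dirac * (1.0 + 0.0012)

print(f"Dirac prediction: {mu_dirac:.4e} J/T")
print(f"With ~0.12% correction: {mu_corrected:.4e} J/T")
```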

For several years after the discovery of the Dirac equation, most physicists believed that it also described the proton and the neutron, which are both spin-1/2 particles. However, beginning with the experiments of Stern and Frisch in 1933, the magnetic moments of these particles were found to disagree significantly with the predictions of the Dirac equation. The proton has a magnetic moment 2.79 times larger than predicted (with the proton mass inserted for m in the above formulas), i.e., a g-factor of 5.58. The neutron, which is electrically neutral, has a g-factor of -3.83. These "anomalous magnetic moments" were the first experimental indication that the proton and neutron are not elementary particles. They are in fact composed of smaller particles called quarks.

Interaction Hamiltonian

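It is noteworthy that the Hamiltonian can be written as the sum of two terms:

$$H = H_{\mathrm{el}} + H_{\mathrm{int}}$$

where H_el is the Dirac Hamiltonian for a free electron and H_int is the Hamiltonian of the electromagnetic interaction. The latter may be written as

$$H_{\mathrm{int}} = e\varphi(\mathbf{x},t) - ec\sum_{j=1}^{3}\alpha_j A_j(\mathbf{x},t)$$

It has the expectation value

$$\langle H_{\mathrm{int}}\rangle = \int \psi^\dagger H_{\mathrm{int}}\,\psi\;d^3x = \int\left(\rho\varphi - \mathbf{j}\cdot\mathbf{A}\right)d^3x$$

where ρ is the electric charge density and j is the electric current density. The integrand in the final expression is the interaction energy density. It is a relativistically covariant scalar quantity, as we can see by writing it in terms of the current-charge four-vector j = (ρc,j) and the potential four-vector A = (φ/c,A):

$$\rho\varphi - \mathbf{j}\cdot\mathbf{A} = \sum_{\mu,\nu=0}^{3}\eta_{\mu\nu}\,j^\mu A^\nu$$

where η is the metric of flat spacetime:

$$\eta = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}$$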
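
The four-vector form of the interaction energy density can be checked with a small numerical example. The sketch below assumes the metric signature (+, −, −, −), that is, η = diag(1, −1, −1, −1), and uses arbitrary illustrative values for ρ, j, φ and A. The function names are hypothetical.

```python
# Flat-spacetime metric with signature (+, -, -, -), assumed here.
ETA = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]


def covariant_energy_density(rho, j, phi, a, c):
    """Contract j = (rho*c, j) with A = (phi/c, A) using the metric ETA."""
    j4 = [rho * c, *j]
    a4 = [phi / c, *a]
    return sum(ETA[mu][nu] * j4[mu] * a4[nu]
               for mu in range(4) for nu in range(4))


def three_vector_energy_density(rho, j, phi, a):
    """rho*phi - j.A written with ordinary three-vectors."""
    return rho * phi - sum(jk * ak for jk, ak in zip(j, a))


# Arbitrary illustrative values (not physical data).
rho, phi, c = 3.0, 5.0, 2.0
j = (1.0, 2.0, 3.0)
a = (0.5, -1.0, 2.0)
print(covariant_energy_density(rho, j, phi, a, c))   # prints 10.5
print(three_vector_energy_density(rho, j, phi, a))   # prints 10.5
```

The factors of c cancel between the time components ρc and φ/c, which is why the two expressions agree for any value of c.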

Relativistically covariant notation

Let us return to the Dirac equation for the free electron. It is often useful to write the equation in a relativistically covariant form, in which the derivatives with time and space are treated on the same footing.

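To do this, first recall that the momentum operator p acts like a spatial derivative:

$$\mathbf{p}\,\psi(\mathbf{x},t) = -i\hbar\,\nabla\psi(\mathbf{x},t)$$

Multiplying each side of the Dirac equation by α₀ (recalling that α₀²=I) and plugging in the above definition of p, we obtain

$$i\hbar c\left(\alpha_0\frac{\partial}{c\,\partial t} + \sum_{j=1}^{3}\alpha_0\alpha_j\frac{\partial}{\partial x_j}\right)\psi = mc^2\,\psi$$

Now, define four gamma matrices:

$$\gamma^0 = \alpha_0, \qquad \gamma^j = \alpha_0\alpha_j \quad (j = 1,2,3)$$

These matrices possess the property that

$$\{\gamma^\mu, \gamma^\nu\} = 2\,\eta^{\mu\nu}\,I, \qquad \mu,\nu = 0,1,2,3$$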

where η once again stands for the metric of flat spacetime. These relations define a Clifford algebra called the Dirac algebra.
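
The defining relation of the Dirac algebra can be verified numerically. The sketch below assumes the gamma matrices are built as γ⁰ = α₀ and γʲ = α₀αⱼ from the Dirac-representation α's, with metric η = diag(1, −1, −1, −1), and checks that {γ^μ, γ^ν} = 2η^μν I. The function names are illustrative.

```python
# Dirac matrices alpha_0..alpha_3 in the Dirac representation.
ALPHA = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]],
]

# Diagonal of the flat-spacetime metric, signature (+, -, -, -).
ETA_DIAG = [1, -1, -1, -1]


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


# gamma^0 = alpha_0 and gamma^j = alpha_0 alpha_j for j = 1, 2, 3.
GAMMA = [ALPHA[0]] + [matmul(ALPHA[0], ALPHA[j]) for j in (1, 2, 3)]


def satisfies_clifford(gammas):
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} I entry by entry."""
    for mu in range(4):
        for nu in range(4):
            ab = matmul(gammas[mu], gammas[nu])
            ba = matmul(gammas[nu], gammas[mu])
            diagonal = 2 * ETA_DIAG[mu] if mu == nu else 0
            for i in range(4):
                for j in range(4):
                    expected = diagonal if i == j else 0
                    if ab[i][j] + ba[i][j] != expected:
                        return False
    return True


print(satisfies_clifford(GAMMA))  # prints True
```

Note that the α's themselves do not satisfy this relation, since each α squares to +I, whereas the spatial gamma matrices square to −I.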

The Dirac equation may now be written, using the position-time four-vector x = (ct,x), as

$$i\hbar c\sum_{\mu=0}^{3}\gamma^\mu\frac{\partial\psi}{\partial x^\mu} = mc^2\,\psi$$