Derivation of the partition function

The partition function provides a link between the microscopic properties of atoms and molecules (e.g., size, shape, and characteristic energy levels) and the bulk thermodynamic properties of matter. To understand the partition function, how it can be derived, and why it works, it is important to recognize that these bulk thermodynamic properties reflect the average behavior of the atoms and molecules. For example, the pressure of a gas is just the average force per unit area exerted by its particles as they collide with the container walls. It does not matter which particular particles strike the wall at any given time, or even the force with which a given particle strikes the wall. Nor is it necessary to consider the fluctuations in pressure as different numbers of particles hit the walls, since for a macroscopic sample these fluctuations are extremely small compared with the average. Only the average force produced by all the particles over time is important in determining the pressure. The same holds for other properties: it is the average behavior that matters. The partition function provides a way to determine the most likely average behavior of atoms and molecules given information about the microscopic properties of the material.

To derive the partition function, consider a system composed of N molecules. Although the system has a constant total energy E, the energy may be distributed among the molecules in many different ways. As molecules interact, the energy is continually redistributed. Energy is exchanged not only between molecules but also among the various modes of motion (e.g., rotation and vibration). Instead of attempting to determine the energy of each individual molecule at every instant, we focus on the population of each energetic state. In other words, we would like to determine how many molecules, n_i, are on average in a particular energetic state, E_i. Over time the population of each state remains almost constant, although the individual molecules in each state may change at every collision.

In order to proceed we assume the principle of equal a priori probabilities. This means that we assume that all states corresponding to a given energy are equally probable. For example, vibrational states of a given energy are just as likely to be populated as rotational or electronic states of the same energy. We also assume that the molecules are independent in the sense that the total energy of the system is equal to the sum of the energies of each individual particle.

At any instant there will be n₀ molecules in the state with energy E₀, n₁ with E₁, and so on. The complete specification of populations n₀, n₁,... for each energy state gives the instantaneous configuration of the system. For convenience we may write a particular configuration as {n₀, n₁,...}. We'll also take E₀ to correspond to the lowest energy level or the ground state.

A large number of configurations are possible. For instance, one possible configuration is {N,0,0,...}, with all of the molecules in the ground state, E₀. Another possible configuration is {N-1,1,0,...}, where one of the molecules is in the excited state, E₁. Of these two configurations, the second is much more likely, since any of the N molecules could be the one in the excited state, giving a total of N possible arrangements of molecules. On the other hand, there is only one way to obtain the first configuration, since all of the molecules must be in the ground state. If the system were free to fluctuate between these two configurations, we would expect to find it most frequently in the second, especially for large values of N. Since the system would most often be found in the second configuration, we would also expect the characteristics of the system to be dominated by the characteristics of that configuration.

The number of arrangements, W, corresponding to a given configuration {n₀, n₁,...} is given by:

$$W = \frac{N!}{n_0!\, n_1!\, n_2! \cdots} = \frac{N!}{\prod_i n_i!} \tag{1}$$

This expression comes from combinatorics (and is applied in probability theory) and corresponds to the number of distinguishable ways N objects can be sorted into bins with n_i objects in bin i.
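As a minimal sketch of this count, the following Python snippet evaluates W = N!/(n₀! n₁! ...) for a configuration given as a list of populations. The function name `arrangements` and the value N = 10 are illustrative choices, not part of the original derivation.

```python
from math import factorial


def arrangements(config):
    """Number of distinguishable ways to sort sum(config) molecules
    into states so that state i holds config[i] of them."""
    w = factorial(sum(config))
    for n_i in config:
        # Each intermediate quotient is an integer, so // is exact here.
        w //= factorial(n_i)
    return w


N = 10  # illustrative number of molecules
w_ground = arrangements([N, 0, 0])           # configuration {N,0,0}
w_one_excited = arrangements([N - 1, 1, 0])  # configuration {N-1,1,0}
print(w_ground, w_one_excited)
```

With these inputs the all-ground-state configuration has a single arrangement, while the configuration with one excited molecule has N of them, matching the argument made for {N,0,0,...} and {N-1,1,0,...}.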

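When working with large numbers it is often convenient to work with ln(W) instead of W itself. For this case:

$$\ln W = \ln N! - \sum_i \ln n_i! \tag{2}$$

Applying Stirling's approximation,

$$\ln x! \approx x \ln x - x$$

and the fact that

$$N = \sum_i n_i$$

gives

$$\ln W = N \ln N - \sum_i n_i \ln n_i \tag{3}$$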
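The quality of the Stirling form, ln(W) ≈ N ln N − Σ n_i ln n_i, can be checked numerically. The sketch below compares it with the exact value of ln(W), computed from the log-gamma function (ln x! = lgamma(x + 1)). The configuration {5000, 3000, 2000} is an arbitrary illustrative choice, and the function names are hypothetical.

```python
from math import lgamma, log


def ln_w_exact(config):
    """Exact ln W = ln N! - sum_i ln n_i!, using ln x! = lgamma(x + 1)."""
    n_total = sum(config)
    return lgamma(n_total + 1) - sum(lgamma(n + 1) for n in config)


def ln_w_stirling(config):
    """Stirling form: ln W ~ N ln N - sum_i n_i ln n_i (empty states add 0)."""
    n_total = sum(config)
    return n_total * log(n_total) - sum(n * log(n) for n in config if n > 0)


config = [5000, 3000, 2000]  # illustrative populations, N = 10000
exact = ln_w_exact(config)
approx = ln_w_stirling(config)
relative_error = abs(exact - approx) / exact
print(exact, approx, relative_error)
```

For populations in the thousands the two values agree to well within one percent, and the agreement improves as the populations grow, which is why the approximation is harmless for macroscopic systems.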

We showed previously that the configuration {N-1,1,0,...} dominates {N,0,0,...} because there are more ways to obtain it. We would expect there to be other configurations that dominate both of these. In fact, we would expect the configuration with the largest value of W to dominate all other configurations. We can find this dominant configuration by maximizing W with respect to the populations n_i. Because ln(W) increases monotonically with W, both reach their maximum at the same configuration, so for convenience we will instead find the maximum of ln(W).

One way to find the maximum of ln(W) is to solve the equation:

$$\frac{\partial \ln W}{\partial n_i} = 0 \quad \text{for every } i \tag{4}$$

However, Equation (4) applies to the situation in which any arbitrary configuration {n₀, n₁,...} is possible. In reality there are constraints on the system that must be accounted for. First, since the total number of molecules is fixed at N, the values of n_i cannot all be chosen arbitrarily. Instead, only configurations in which:

$$\sum_i n_i = N \tag{5}$$

are possible. Also, the total energy of the system is fixed at E. Therefore, since the total energy is the sum of the energies of all the individual molecules:

$$\sum_i n_i E_i = E \tag{6}$$
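For a very small system, the constrained maximum can be found by brute force, which may help make the two constraints concrete. The sketch below assumes evenly spaced levels E_i = i·ε (an assumption made only for this illustration), enumerates every configuration with Σ n_i = N and Σ i·n_i = E/ε, and picks the one with the largest W. The function names and the values N = 3, E = 3ε are hypothetical.

```python
from math import factorial


def arrangements(config):
    """W = N! / (n_0! n_1! ...) for a configuration of populations."""
    w = factorial(sum(config))
    for n_i in config:
        w //= factorial(n_i)
    return w


def configurations(n_molecules, energy_units, n_levels):
    """Yield every (n_0, ..., n_{L-1}) with sum(n_i) == n_molecules and
    sum(i * n_i) == energy_units, assuming levels E_i = i * epsilon."""
    def fill(level, molecules_left, energy_left):
        if level == 0:
            # Level 0 carries no energy, so it takes all remaining molecules.
            if energy_left == 0:
                yield (molecules_left,)
            return
        max_here = min(molecules_left, energy_left // level)
        for n_here in range(max_here + 1):
            for lower in fill(level - 1, molecules_left - n_here,
                              energy_left - n_here * level):
                yield lower + (n_here,)

    yield from fill(n_levels - 1, n_molecules, energy_units)


# Illustrative system: 3 molecules sharing 3 units of energy over 4 levels.
allowed = list(configurations(3, 3, 4))
dominant = max(allowed, key=arrangements)
print(allowed)
print(dominant, arrangements(dominant))
```

Only three configurations satisfy both constraints here, and {1,1,1,0} has the most arrangements. Enumeration becomes impractical for macroscopic N, which is why an analytical method is needed.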

We can find the maximum of ln(W) subject to the constraints on N and E expressed in Equations (5) and (6) using the method of Lagrange multipliers as follows. First, we rearrange the constraint equations so that each equals zero:

$$N - \sum_j n_j = 0, \qquad E - \sum_j n_j E_j = 0 \tag{7}$$

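Next, we create a new function by multiplying the constraints by the arbitrary constants -α' and β, and adding them to the original function, ln(W), to get:

$$f(n_0, n_1, \ldots) = \ln W - \alpha' \Bigl(N - \sum_j n_j\Bigr) + \beta \Bigl(E - \sum_j n_j E_j\Bigr) \tag{8}$$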

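Taking the derivative of Equation (8) with respect to n_i (with N held fixed in ln(W)) and setting the result to zero gives:

$$\frac{\partial f}{\partial n_i} = -\ln n_i - 1 + \alpha' - \beta E_i = 0 \tag{9}$$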

We define a new parameter α = α' - 1, giving:

$$-\ln n_i + \alpha - \beta E_i = 0$$

Solving this for n_i gives the most probable population of state E_i:

$$n_i = e^{\alpha - \beta E_i} = e^{\alpha}\, e^{-\beta E_i} \tag{10}$$

Finally, we must evaluate the constants α and β. Substituting Equation (10) back into Equation (5) and solving for exp(α) gives:

$$e^{\alpha} = \frac{N}{\sum_i e^{-\beta E_i}} \tag{11}$$

Changing the subscript to j and substituting this result back into Equation (10) gives the Maxwell-Boltzmann distribution:

$$n_i = \frac{N\, e^{-\beta E_i}}{\sum_j e^{-\beta E_j}} \tag{12}$$
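The remaining constant β is fixed by the energy constraint: the populations n_i ∝ exp(-βE_i) must reproduce the total energy E. As a hedged numerical sketch, the snippet below finds β by bisection for a given mean energy per molecule, E/N. It assumes β > 0 and that the target lies between the mean energies at the two bracketing values; the function names, the bracket, and the two-level example (energies 0 and 1 in arbitrary units) are all illustrative.

```python
from math import exp, log


def mean_energy(beta, energies):
    """Average energy per molecule for populations proportional to
    exp(-beta * E_i)."""
    weights = [exp(-beta * e) for e in energies]
    return sum(e * w for e, w in zip(energies, weights)) / sum(weights)


def solve_beta(energies, target_mean, lo=1e-9, hi=50.0, tol=1e-12):
    """Bisection for beta; relies on mean_energy decreasing as beta grows."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(mid, energies) > target_mean:
            lo = mid  # too much energy: beta must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Illustrative two-level system with a mean energy of 0.25 per molecule.
beta = solve_beta([0.0, 1.0], 0.25)
print(beta, log(3.0))
```

For this two-level case the answer can be checked by hand: a quarter of the molecules must be excited, so exp(-β) = 1/3 and β = ln 3.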

The Boltzmann distribution gives the most probable energy distribution of molecules in a system. It can further be shown that β = 1/kT, where k is Boltzmann's constant and T is the absolute temperature in kelvins. The term in the denominator is called the partition function, here denoted q, and is defined as follows:

$$q = \sum_j e^{-\beta E_j} = \sum_j e^{-E_j / kT} \tag{13}$$
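To make the distribution concrete, the following sketch computes the partition function and the most probable populations for a hypothetical three-level system at 300 K, using β = 1/kT. The level energies, the molecule count, and the function names are assumptions chosen for illustration; the value of Boltzmann's constant is the exact SI value.

```python
from math import exp

BOLTZMANN_K = 1.380649e-23  # Boltzmann's constant in J/K (exact SI value)


def partition_function(energies, temperature):
    """Sum of exp(-E_j / kT) over all states; energies in joules."""
    beta = 1.0 / (BOLTZMANN_K * temperature)
    return sum(exp(-beta * e) for e in energies)


def populations(n_molecules, energies, temperature):
    """Most probable populations n_i = N exp(-E_i/kT) / partition function."""
    beta = 1.0 / (BOLTZMANN_K * temperature)
    q = partition_function(energies, temperature)
    return [n_molecules * exp(-beta * e) / q for e in energies]


# Hypothetical system: three levels spaced 2.0e-21 J apart, 1000 molecules.
levels = [0.0, 2.0e-21, 4.0e-21]
pops = populations(1000, levels, 300.0)
print(pops)
```

The populations sum to N by construction, and the ratio of any two populations depends only on the energy difference between the states and on the temperature.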

The partition function provides a measure of the total number of energetic states that are accessible at a particular temperature and can be related to many different thermodynamic properties (see Statistical Mechanics).
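The reading of the partition function as a count of accessible states can be illustrated numerically. The sketch below uses a hypothetical ladder of ten evenly spaced levels and works in reduced units, with energies and kT both measured in units of the level spacing; these choices and the names are assumptions for illustration only.

```python
from math import exp


def partition_function(energies, kT):
    """Sum of exp(-E_j / kT), with energies and kT in the same units."""
    return sum(exp(-e / kT) for e in energies)


# Hypothetical ladder of ten levels with unit spacing (reduced units).
levels = [float(i) for i in range(10)]

q_cold = partition_function(levels, 0.05)    # kT far below the spacing
q_warm = partition_function(levels, 1.0)     # kT equal to the spacing
q_hot = partition_function(levels, 1000.0)   # kT far above the top level
print(q_cold, q_warm, q_hot)
```

At low temperature the sum is close to 1, since only the ground state is appreciably populated. As the temperature rises the sum grows toward the total number of levels, here 10, because every state becomes nearly equally accessible.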