According to this convention, when an index variable appears twice in a single term, it implies that we are summing over all of its possible values. In typical applications, these are 1,2,3 (for calculations in Euclidean space), or 0,1,2,3 or 1,2,3,4 (for calculations in Minkowski space), but they can have any range, even (in some applications) an infinite set. Furthermore, abstract index notation uses Einstein notation without requiring any range of values.
Sometimes, the index is required to appear once as a superscript and once as a subscript; in other applications, all indices are subscripts. See Dual vector space and Tensor product.
In the traditional usage, one has in mind a vector space V with finite dimension n, and a specific basis of V. We can write the basis vectors as e_{1},e_{2},...,e_{n}. Then if v is a vector in V, it has coordinates v_{1},...,v_{n} relative to this basis.
The basic rule is: v = v_{i}e_{i}. Here the repeated index i is summed over, so the right-hand side abbreviates v_{1}e_{1} + v_{2}e_{2} + ... + v_{n}e_{n}.
The i is known as a dummy index since the result is not dependent on it; thus we could also write, for example: v = v_{j}e_{j}.
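The summation over a repeated index can be sketched numerically. The sketch below uses NumPy's `einsum`, whose subscript strings follow the same convention; the basis and coordinates are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical 3-dimensional example: basis vectors e_1, e_2, e_3
# (rows of `basis`) and coordinates v_i relative to that basis.
basis = np.array([[1.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0],
                  [0.0, 0.0, 2.0]])
coords = np.array([2.0, -1.0, 0.5])   # v_1, v_2, v_3

# Explicit sum over the dummy index i: v = v_i e_i
v_explicit = sum(coords[i] * basis[i] for i in range(3))

# einsum expresses the same rule: the repeated index i is summed.
v_einsum = np.einsum('i,ij->j', coords, basis)

print(v_explicit)                       # [ 1. -1.  1.]
print(np.allclose(v_explicit, v_einsum))  # True
```

Renaming the dummy index (using `'k,kj->j'`, say) changes nothing, which is exactly the point of calling it a dummy index.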
In contexts where the index must appear once as a subscript and once as a superscript, the basis vectors e_{i} retain subscripts but the coordinates become v^{i} with superscripts. Then the basic rule is: v = v^{i}e_{i}.
We have also used a superscript for the dual basis, which fits in with a convention requiring summed indices to appear once as a subscript and once as a superscript. In this case, if L is an element in V*, then: L = L_{i}e^{i}.
The real purpose of the Einstein notation is for formulas and equations that make no mention of the chosen basis. For example, if L and v are as above, then L(v) = L_{i}v^{i}, and this value is independent of the basis.
If V is Euclidean n-space R^{n}, then there is a standard basis for V, in which e_{i} is (0,...,0,1,0,...,0), with the 1 in the ith position. Then n-by-n matrices can be thought of as elements of V* ⊗ V. We can also think of vectors in V as column vectors, or n-by-1 matrices; elements of V* are row vectors, or 1-by-n matrices.
In these examples, all indices will appear as superscripts. (Ultimately, this is because V has an inner product and the chosen basis is orthonormal, as explained in the next section.)
If H is a matrix and v is a column vector, then Hv is another column vector. To define w := Hv, we can write: w^{i} = H^{ij}v^{j}.
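The index form of the matrix-vector product can be checked against ordinary matrix multiplication; the matrix and vector below are hypothetical values for illustration.

```python
import numpy as np

# w^i = H^{ij} v^j: the repeated index j is summed.
H = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([1.0, -1.0])

w_einsum = np.einsum('ij,j->i', H, v)
w_matmul = H @ v   # ordinary matrix-vector product

print(w_einsum)  # [-1. -1.]
```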
The distributive law, that H(u + v) = Hu + Hv, can be written: H^{ij}(u^{j} + v^{j}) = H^{ij}u^{j} + H^{ij}v^{j}.
The transpose of a column vector is a row vector with the same components, and the transpose of a matrix is another matrix whose components are given by swapping the indices. Suppose that we're interested in the product of v^{T} and H^{T}. If w (a row vector) is this product, then: w^{i} = v^{j}H^{ij}.
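A quick numerical check of the transpose product (hypothetical values again): the components of v^{T}H^{T} agree with those of Hv, since v^{T}H^{T} = (Hv)^{T}.

```python
import numpy as np

# w^i = v^j H^{ij}: the repeated index j is summed.
H = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([1.0, -1.0])

w_einsum = np.einsum('j,ij->i', v, H)
w_direct = v @ H.T   # row vector times transposed matrix

print(np.allclose(w_einsum, w_direct))  # True
```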
The dot product of two vectors u and v can be written: u · v = u^{i}v^{i}.
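The dot product in index form is a single contraction; the vectors below are hypothetical values.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, -1.0, 0.5])

# u . v = u^i v^i: one repeated index, hence one sum.
dot_einsum = np.einsum('i,i->', u, v)

print(dot_einsum)            # 3.5
print(dot_einsum == u @ v)   # True
```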
You may have noticed in these examples that we often introduced a vector w that would normally not have to be given a specific name using coordinate-free notation. This vector wouldn't need to be given a specific name using only index notation either, but the translation between the notations is easier to describe by giving it a name.
If you review the above examples, you'll find that all of them through the distributive law make sense if a summed index must appear once as a subscript and once as a superscript. But the examples from the transpose on don't make sense in that case. This is because they implicitly use the standard inner product on Euclidean space, while the earlier examples do not.
In some applications, there is no inner product on V. In these cases, requiring a summed index to appear once as a subscript and once as a superscript can help one avoid errors in calculation, in much the same way as dimensional analysis does. Perhaps more significantly, the inner product may be a primary object of study that shouldn't be suppressed in the notation; this is the case, for example, in general relativity. Then the difference between a subscript and a superscript can be quite significant.
When an inner product is explicitly referred to, its components are often referred to as g_{ij}. Note that g_{ij} = g_{ji}. Then the formula for the dot product becomes: u · v = g_{ij}u^{i}v^{j}. We can also use the inner product to lower an index: u_{i} = g_{ij}u^{j}.
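A minimal numerical sketch, assuming a hypothetical 2-dimensional symmetric, positive-definite metric g_{ij}, of the dot product u · v = g_{ij}u^{i}v^{j} and of lowering an index:

```python
import numpy as np

# Hypothetical non-orthonormal metric g_{ij} (symmetric).
g = np.array([[2.0, 1.0],
              [1.0, 3.0]])
u = np.array([1.0, 2.0])   # u^i
v = np.array([0.0, 1.0])   # v^i

# u . v = g_{ij} u^i v^j: two repeated indices, two sums.
dot = np.einsum('ij,i,j->', g, u, v)

# Lowering an index: u_i = g_{ij} u^j.
u_lower = np.einsum('ij,j->i', g, u)

print(dot)                            # 7.0
print(np.allclose(u_lower @ v, dot))  # contracting u_i with v^i agrees
```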
Similarly, we can raise an index using the corresponding inner product on V*. The components of this inner product are g^{ij}, which is (as a matrix) the inverse of g_{ij}. If you raise an index and then lower it (or the other way around), then you get back where you started. If you raise the i in g_{ij}, then you get the Kronecker delta δ^{i}_{j}, and if you raise the j in δ^{i}_{j}, then you get g^{ij}.
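The raising/lowering round trip and the Kronecker delta identity can be verified numerically; the metric below is the same kind of hypothetical 2-dimensional example.

```python
import numpy as np

g = np.array([[2.0, 1.0],
              [1.0, 3.0]])
g_inv = np.linalg.inv(g)   # g^{ij}, the matrix inverse of g_{ij}

u = np.array([1.0, 2.0])   # u^i

u_lower = np.einsum('ij,j->i', g, u)            # lower: u_i = g_{ij} u^j
u_back = np.einsum('ij,j->i', g_inv, u_lower)   # raise: u^i = g^{ij} u_j

print(np.allclose(u_back, u))  # True: lower then raise returns u

# Raising one index of g_{ij} gives the Kronecker delta.
delta = np.einsum('ik,kj->ij', g_inv, g)
print(np.allclose(delta, np.eye(2)))  # True
```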
If the chosen basis of V is orthonormal, then g_{ij} = δ_{ij} and u_{i} = u^{i}. In this case, the formula for the dot product from the previous section may be recovered. But if the basis is not orthonormal, then this will not be true; thus, if you're studying the inner product and can't know ahead of time whether a given basis is orthonormal, you'll need to refer to g_{ij} explicitly. Furthermore, if the inner product is not positive-definite (as is the case, for example, in special relativity), then g_{ij} = δ_{ij} will not be true even if the basis is chosen to be orthonormal, since you will sometimes have -1 instead of 1 when i = j. Thus, raising and lowering indices are important operations in these applications.
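As a sketch of the non-positive-definite case, take the Minkowski metric of special relativity with signature (-,+,+,+) in an orthonormal basis; lowering an index then flips the sign of the time component, so u_{i} and u^{i} genuinely differ.

```python
import numpy as np

# Minkowski metric in an orthonormal basis, signature (-, +, +, +).
g = np.diag([-1.0, 1.0, 1.0, 1.0])

u = np.array([1.0, 2.0, 0.0, 0.0])    # u^i
u_lower = np.einsum('ij,j->i', g, u)  # u_i = g_{ij} u^j

print(u_lower)  # [-1.  2.  0.  0.] -- time component flipped, so u_i != u^i
```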