Tensors

Let's review the definitions of trigonometric functions. Let's say you have a right triangle, with one of its acute-angle vertices at the origin. The angle at that vertex is θ, the horizontal side is x, the vertical side is y, and the hypotenuse is r.

cos [theta] = x/r

sin [theta] = y/r

tan [theta] = y/x

Often it is convenient to change the axes of the coordinate system you are using. When you change the axes, the coordinates are different.

In three dimensions, each new coordinate is given by

u cos [alpha] + v cos [beta] + w cos [gamma]

If you use l, m, and n to mean the cosines of the angles between the old and new axes, you have

ul + vm + wn

You can simplify it further by calling the coefficients u₁, u₂, and u₃, and the cosines of the angles l₁, l₂, l₃

[summation of] u_i l_i where i = 1, 2, 3

Let's say you have two sets of axes. The first is x_i. The second is x_j' and has the same origin as x_i, just rotated.

Here I've drawn it as two dimensions but let's say you have the same thing in three dimensions.

How do you change the coordinates from one system to another? Just multiply by the cosine of the angle between them.

cos [theta] = l_ij

x_j' = (cos [theta]) (x_i) = l_ijx_i

which is the same going back, so

x_i = l_ijx_j'

This formula works not just for position but for displacement, velocity and acceleration. It also works for force as long as the equations of motion are true in all reference frames. You can break this apart into each of the coordinates of the point. In three dimensions, there are three equations, each of which has the following form:

A_j' = l_ijA_i or

A_i = l_ijA_j'

Together, these three equations comprise the components of a vector. This is what defines a vector. The rotation and translation of axes can be drawn like this.

This is why a vector can also be drawn as an arrow.

This is a vector.

x_j' = l_ij x_i

x_i = l_ij x_j'

These are the components of a vector.

A_j' = l_ij A_i

A_i = l_ij A_j'
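As a minimal sketch of the transformation x_j' = l_ij x_i, the Python below builds the direction cosines for new axes rotated by an angle about the z axis and applies them in both directions. The function names and the choice of a rotation about z are illustrative assumptions, not something taken from the text above.

```python
import math

def direction_cosines_z(theta):
    """l[i][j] = cosine of the angle between old axis i and new axis j,
    assuming the new axes are the old ones rotated by theta about z."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def to_new(l, x):
    """x'_j = l_ij x_i (sum over the repeated index i)."""
    return [sum(l[i][j] * x[i] for i in range(3)) for j in range(3)]

def to_old(l, x_new):
    """x_i = l_ij x'_j (sum over the repeated index j)."""
    return [sum(l[i][j] * x_new[j] for j in range(3)) for i in range(3)]

l = direction_cosines_z(math.radians(30))
x = [1.0, 2.0, 3.0]
x_new = to_new(l, x)        # components along the rotated axes
x_back = to_old(l, x_new)   # recovers x up to rounding error
```

The components change, but the length of the vector does not, which is one way to see that both sets of numbers describe the same arrow.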

Let's say you were to have two of each of the variables.

A_j' B_l' = l_ijl_kl A_i B_k

A_iB_k = l_ijl_kl A_j' B_l'

These are the individual components, and if you combine them you have this.

k_jl' = l_ijl_kl k_ik

k_ik = l_ijl_kl k_jl'

This is called a second order tensor. You cannot draw it geometrically like you can a vector but you can deal with it algebraically as easily as a vector. A scalar is a zero order tensor. A vector is a first order tensor. This is a second order tensor, which is also called a dyad. In physics, you frequently use second order tensors. Rarely, you use third and fourth order tensors. You almost never use higher orders, although mathematically you could have any order by just having higher multiples of the number of variables.
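Here is a small sketch of the rule k_jl' = l_ij l_kl k_ik, assuming the same kind of rotation about the z axis as an example of direction cosines. It builds a tensor from the products A_i B_k and checks that transforming the tensor gives the same result as transforming the two vectors first. All function names are hypothetical.

```python
import math

def direction_cosines_z(theta):
    """Direction cosines l[i][j] for axes rotated by theta about z (an assumed example)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform_vector(l, a):
    """A'_j = l_ij A_i."""
    return [sum(l[i][j] * a[i] for i in range(3)) for j in range(3)]

def transform_tensor(l, k):
    """k'_jm = l_ij l_nm k_in (sum over i and n)."""
    return [[sum(l[i][j] * l[n][m] * k[i][n]
                 for i in range(3) for n in range(3))
             for m in range(3)]
            for j in range(3)]

def outer(a, b):
    """Components A_i B_n of the product of two vectors."""
    return [[a[i] * b[n] for n in range(3)] for i in range(3)]

l = direction_cosines_z(math.radians(40))
A = [1.0, 2.0, 3.0]
B = [-1.0, 0.5, 4.0]
k_new = transform_tensor(l, outer(A, B))
k_expected = outer(transform_vector(l, A), transform_vector(l, B))
```

The two results agree up to rounding error, which is what it means for the nine products A_i B_k to transform as a second order tensor.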

If you look at the tensor k_ik, you see there are two subscripts, each of which could have values 1, 2, or 3. Therefore, there are 3² = 3 x 3 = 9 combinations. For this reason, the tensor can be written as a matrix.


            k11  k12  k13
K =         k21  k22  k23
            k31  k32  k33

Even though it's a 3 x 3 matrix, it's a second order tensor, since k has two subscripts.

This is a common notation for tensors. The components k₁₁, k₂₂, and k₃₃ are called the diagonal. Often you have zeroes off the diagonal. The sum of the components of the diagonal is called the trace or spur.

(K + L)_ik = K_ik + L_ik which is a tensor

K_jl' = l_ijl_kl K_ik

l_ijl_kj = [delta]_ik which is the same for any rotation

δ_ik is an isotropic tensor which means it is unaltered by rotation. There are no isotropic tensors of the first order. The only isotropic second order tensors are multiples of δ_ik. The only isotropic third order tensors are multiples of ε_ikm. There are three independent ones of the fourth order which are

[delta]_ik[delta]_mp

[delta]_im[delta]_kp + [delta]_ip[delta]_km

[delta]_im[delta]_kp - [delta]_ip[delta]_km
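The isotropy of δ_ik can be checked numerically. The sketch below assumes a rotation about the z axis as the example set of direction cosines. It verifies the orthogonality relation, the sum over j of l_ij l_kj, and then transforms δ_ik as a second order tensor to see that it comes back unchanged.

```python
import math

def direction_cosines_z(theta):
    """Direction cosines for axes rotated by theta about z (an assumed example)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

delta = [[1.0 if i == k else 0.0 for k in range(3)] for i in range(3)]
l = direction_cosines_z(math.radians(25))

# Orthogonality of the direction cosines: sum over j of l_ij l_kj
orthogonality = [[sum(l[i][j] * l[k][j] for j in range(3))
                  for k in range(3)]
                 for i in range(3)]

# delta transformed as a second order tensor: delta'_jm = l_ij l_nm delta_in
delta_new = [[sum(l[i][j] * l[n][m] * delta[i][n]
                  for i in range(3) for n in range(3))
              for m in range(3)]
             for j in range(3)]
```

Both arrays equal the identity up to rounding error, for this angle or any other you substitute.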

It's possible to multiply a tensor by a vector, and then write it in dyadic notation.

A.K = A_iK_ik or K.A = K_ikA_k

If you have a tensor K_ik and you exchange rows and columns, you get another tensor K_ki.

If K_ik = K_ki, the tensor is symmetrical. If K_ik = - K_ki, the tensor is antisymmetrical. If K_ik is a tensor, two others are K_ik + K_ki and K_ik - K_ki. The first of these is unaltered if i and k are exchanged and is a symmetrical tensor. The second has all the components reversed, and is an antisymmetrical tensor. Any tensor K_ik can be written as the sum of a symmetrical and antisymmetrical tensor.

K_ik = 1/2(K_ik + K_ki) + 1/2(K_ik - K_ki)
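The decomposition is easy to try on a concrete matrix. This is a minimal sketch with hypothetical function names and an arbitrary example tensor.

```python
def symmetric_part(K):
    """(1/2)(K_ik + K_ki)"""
    n = len(K)
    return [[0.5 * (K[i][k] + K[k][i]) for k in range(n)] for i in range(n)]

def antisymmetric_part(K):
    """(1/2)(K_ik - K_ki)"""
    n = len(K)
    return [[0.5 * (K[i][k] - K[k][i]) for k in range(n)] for i in range(n)]

K = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
S = symmetric_part(K)       # unchanged when i and k are exchanged
A = antisymmetric_part(K)   # changes sign when i and k are exchanged
total = [[S[i][k] + A[i][k] for k in range(3)] for i in range(3)]  # equals K
```

Note that the diagonal of A comes out as zero, leaving only three independent values.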

The diagonal components of an antisymmetrical tensor K_ik must vanish, and since for the others K_ik = -K_ki, only three independent quantities are needed to specify an antisymmetric tensor, which then takes the form


              0        K12    -K31
Kik =        -K12      0       K23
              K31     -K23     0

If K_ik = -K_ki, then K₁₁ = - K₁₁, K₂₂ = - K₂₂, and K₃₃ = - K₃₃, which is only true if K₁₁ = K₂₂ = K₃₃ = 0.

You now have three values, K₁₂, K₂₃, and K₃₁. However, the vector K_i, where i has a value for each of the three axes, x, y, and z, also has three values. Therefore these three values can be used to define a vector. Therefore, an antisymmetrical tensor can be written as a vector.

δ_ik is the Kronecker delta, and in Euclidean space, is equal to the metric tensor. ε_ikm is the permutation tensor.

Sometimes vectors can be most simply written in tensor notation. For instance, the dot product between two vectors can be simply written as

u . v = u_i v_i

where repeated indices are summed so that

u_i v_i = u₁ v₁ + u₂ v₂ + . . . + u_m v_m

The cross product of two vectors can be written as

(u x v)_i = [epsilon]_ikm u_k v_m
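As a sketch of the index notation, the Python below computes the dot product as u_i v_i and the i-th component of the cross product as ε_ikm u_k v_m, summing over repeated indices. Indices run over 0, 1, 2 here rather than 1, 2, 3, and the closed-form expression used for ε is one convenient choice among several.

```python
def levi_civita(i, k, m):
    """Permutation symbol for indices 0, 1, 2: +1 for even permutations,
    -1 for odd permutations, 0 if any index repeats."""
    return (i - k) * (k - m) * (m - i) // 2

def dot(u, v):
    """u . v = u_i v_i"""
    return sum(u[i] * v[i] for i in range(3))

def cross(u, v):
    """(u x v)_i = epsilon_ikm u_k v_m"""
    return [sum(levi_civita(i, k, m) * u[k] * v[m]
                for k in range(3) for m in range(3))
            for i in range(3)]

x_axis = [1, 0, 0]
y_axis = [0, 1, 0]
z_from_cross = cross(x_axis, y_axis)   # [0, 0, 1]
w = cross([1, 2, 3], [4, 5, 6])        # [-3, 6, -3]
```

The cross product is perpendicular to both of its factors, so dot(w, [1, 2, 3]) and dot(w, [4, 5, 6]) are both zero.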

A vector is usually portrayed as a 1 x n matrix. If it's portrayed as an n x 1 matrix, it's called a column vector. A two-component complex column vector is called a spinor. It's actually much more complicated than that, but this is sufficient for our present purposes. Here's a typical spinor describing a fermion of arbitrary helicity.

In cosmology, the structure of spacetime is described by a metric. The simplest is flat spacetime which is described by the Minkowski metric, which is as follows.

Minkowski metric, g^uv

t-   +1   0   0   0
x-    0  -1   0   0
y-    0   0  -1   0
z-    0   0   0  -1

This is a second order tensor where the upper left hand value is +1, the other values in the diagonal are -1, and you have zeroes off the diagonal. The Minkowski metric is represented by g_uv or g^uv = Diag[+1, -1, -1, -1]

Notice that the time coordinate is +1, and the three spatial coordinates are -1. Some people use the opposite sign convention with the time coordinate -1, and the space coordinates +1. When a quantity used in physics has four components, the first being the time component and the remaining three being the spatial components, and it transforms between inertial frames in the same way as the coordinates (ct, x, y, z), it's called a 4-vector. It's a first order tensor in four-dimensional spacetime, so it's a vector with four components, t, x, y, and z. A 4-vector has the following general form.

A^u = (a^t, a^x, a^y, a^z) = (a⁰, a¹, a², a³) = (a⁰, a)

Since the three spatial components form an ordinary three-dimensional vector, they are often represented by a single boldface variable, (a¹, a², a³) = a. It's also sometimes represented by a variable with an arrow over it.

If you multiply a 4-vector by the Minkowski metric g_uv, this has the effect of putting a minus sign on the space part. The result is called a 4-covector, and you switch from superscripts to subscripts.

A_u = g_uv A^v = (a^t, -a^x, -a^y, -a^z) = (a⁰, -a¹, -a², -a³) = (a⁰, -a)
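Lowering an index is just a matrix multiplication by Diag[+1, -1, -1, -1]. Here is a minimal sketch with a hypothetical function name and made-up component values.

```python
# Minkowski metric with the (+, -, -, -) sign convention used in the text
MINKOWSKI = [[1, 0, 0, 0],
             [0, -1, 0, 0],
             [0, 0, -1, 0],
             [0, 0, 0, -1]]

def lower_index(A_upper):
    """A_u = g_uv A^v (sum over v)."""
    return [sum(MINKOWSKI[u][v] * A_upper[v] for v in range(4))
            for u in range(4)]

A_upper = [2, 1, 3, 5]            # (a0, a1, a2, a3), illustrative values
A_lower = lower_index(A_upper)    # [2, -1, -3, -5]
```

Because this metric is its own inverse, applying lower_index a second time returns the original components.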

In flat space, the Minkowski metric g^uv is the same as the metric tensor, which is actually a function that computes the distance between two points in general space. It's derived from a generalization of the Pythagorean theorem. In curved space, meaning in the presence of a gravitational field, the metric tensor has to be a coordinate dependent field transforming in the right way to keep the proper time invariant.

In Newtonian mechanics, you can change from one inertial frame to another by translations and rotations. This is only within the three spatial coordinates since time is not included. In special relativity, you can change from one inertial frame to another within not just space but spacetime, so all four coordinates are included. You have the translations. You also have a group of transformations called Lorentz transformations, which include rotations and boosts, which are changing from a motionless inertial frame to one moving at a constant velocity. The translations and Lorentz transforms together form the Poincare transforms. If something is unchanged under Lorentz transforms, it's Lorentz invariant. If it's unchanged under Poincare transforms, it's Poincare invariant.

4-vectors are Poincare invariant. That means they are valid in all inertial frames regardless of coordinate system. The components might change from one reference frame to another but the 4-vector is not changed. It's valid for all inertial observers. L^u_v is the Lorentz transform. It is a transformation tensor that gives relations between different inertial reference frames.

A'^u = L^u_v A^v

Lorentz transform, L^u_v

 [gamma]        -[gamma]v/c     0     0
-[gamma]v/c      [gamma]        0     0
 0               0              1     0
 0               0              0     1

where γ is the Lorentz factor, γ = 1/[squareroot of (1 - (v/c)²)]. Also β = v/c.

From this, you get the relation between a given quantity at rest, and then its measured relativistic value.

length L = L₀/[gamma]

time t = [gamma]t₀

volume V = V₀/[gamma]

temperature T = T₀/[gamma]

heat Q = Q₀/[gamma]

entropy density S = [gamma]S₀

Some quantities are unchanged, such as pressure and entropy.
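The Lorentz transform and the Lorentz factor can be sketched in a few lines of Python. This assumes a boost along the x axis written in terms of β = v/c, with -γβ in both off-diagonal entries, and uses made-up values for the event, the rest length, and the rest time. The function names are hypothetical.

```python
import math

def lorentz_factor(beta):
    """gamma = 1 / sqrt(1 - beta^2), with beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def boost_x(beta):
    """Lorentz transform L^u_v for a boost along x."""
    g = lorentz_factor(beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(L, A):
    """A'^u = L^u_v A^v (sum over v)."""
    return [sum(L[u][v] * A[v] for v in range(4)) for u in range(4)]

def interval(A):
    """(a0)^2 - (a1)^2 - (a2)^2 - (a3)^2, the same in every inertial frame."""
    return A[0] ** 2 - A[1] ** 2 - A[2] ** 2 - A[3] ** 2

beta = 0.6
gamma = lorentz_factor(beta)                 # about 1.25
event = [5.0, 3.0, 0.0, 0.0]                 # (ct, x, y, z)
event_new = apply(boost_x(beta), event)      # about (4, 0, 0, 0)

measured_length = 10.0 / gamma               # L = L0 / gamma
measured_time = gamma * 4.0                  # t = gamma t0
```

The components of the event change, but interval(event) and interval(event_new) agree up to rounding error.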

Usually 4-vectors contain only real components. However, it is possible for them to contain imaginary components. An example is the polarization 4-vector.

i = [squareroot of -1] All complex numbers have the form a + bi. For non-imaginary real numbers, b = 0. For imaginary numbers, b is not 0. For pure imaginary numbers a = 0. If you have a number a + bi, the complex conjugate of the number is a - bi. Here is the general form of a 4-vector.

A = (a⁰, a)

If you take into account imaginary numbers, it becomes

A = (a⁰_r + ia⁰_i, a_r + ia_i)

We use the subscripts r and i to keep track which are real versus imaginary. The complex conjugate of the 4-vector A is as follows.

A* = (a⁰_r - ia⁰_i, a_r - ia_i)

The 4-vector is frequently used in physics. Here is a list of some 4-vectors. [h bar] = h/2π where h is Planck's constant. h = 6.626 x 10^-34 Js. 1 joule = 1 kgm²/s². c is the speed of light, which is the velocity of a massless particle from the reference frame of a particle with mass. c = 2.99792458 x 10⁸ m/s. m₀ is the rest mass of a particle, and q is the charge of a particle. I give the 4-vector, how it's defined, and then its units. These are all Lorentz vectors.

4-vector - definition A = (a₀, a) - (SI units)

4-position - R = (ct, r) - (m)

4-velocity - U = γ(c, u) - (m/s)

4-acceleration - A = γ(dγ/dt c, dγ/dt u + γa) - (m/s²)

4-momentum - P = (E/c, p) = γm₀(c, u) = m₀U - (kgm/s)

4-force - F = γ(dE/cdt, f) - (kgm/s²)

4-displacement - dR = (cdt, dr) - (m)

4-wave vector - K = (w/c, k) = (w/c, (w/v_phase)n) = 1/[h bar] (E/c, p) - (rad/m)

4-current density - J = (cp, j) = p₀γ(c, u) = p₀U - (C/m²s)

4-number flux - N = γn₀(c, u) = n₀U - (1/m²s)

4-polarization - E = (e⁰, e) - (none since it contains imaginary components)

4-gradient - d = d/dx_u = (d/cdt, -[nabla]) - (1/m)

4-vector potential_EM - A_EM = (V_EM/c, a_EM) - (kgm/Cs)

4-momentum_EM - P_EM = (E/c + qV_EM/c, p + qa_EM) - (kgm/s, including EM potentials)

4-gradient_EM - D = (d/cdt + iq/[h bar](V_EM/c), -[nabla] + iq/[h bar]a_EM) - (1/m, including EM potentials)

If you multiply the 4-gradient by itself you get

dd = (d/cdt, -[nabla]).(d/cdt, -[nabla]) = (d²/c²dt²)-[nabla].[nabla]

This is known as the d'Alembertian operator. It is usually symbolized by a square. It was invented by d'Alembert. Who invented the Fourier series? This might sound like a self-evident question but remember that Monroe was not responsible for the Monroe Doctrine. D'Alembert did work on the vibrating string, and in the process of that he and Euler came up with the Fourier series in 1747. D. Bernoulli got the solution as a sine series in 1753, and what we call the Fourier sine theorem followed from that. Fourier tried to prove it in his work on heat conduction titled, "Analytical Theory of Heat" published in 1822, but the attempt at a proof was inaccurate and almost incoherent. It was actually proven by Dirichlet in 1829.

You use tensors to provide generally valid relations between 4-vectors. Since the components of 4-vectors are altered by change of axes, the components of tensors have to be able to change also, so they can still provide relations between the same 4-vectors. Here is the tensor transformation law.

g'_{[alpha][beta]} = [partial derivative of x^u with respect to x'^[alpha]] [partial derivative of x^v with respect to x'^[beta]]g_uv

For Cartesian coordinates, it would make no difference if we turn the fractions upside-down.

[partial derivative of x^u with respect to x'^[alpha]] = [partial derivative of x'^[alpha] with respect to x^u] = cos [theta]

where θ is the angle of rotation between the two axes. However, this is not in general true.

If you have the position 4-vector with lowered index, the components are x_u = (ct, -x, -y, -z). These are the covariant components. They transform the same way as the basis vectors. If you multiply by the metric tensor, you get x^u = (ct, x, y, z) which are the contravariant components. They transform oppositely to the basis vectors. For a general 4-vector with contravariant components A^u and covariant components A_u, you have

A'^u = [partial derivative of x'^u with respect to x^v]A^v

A'_u = [partial derivative of x^v with respect to x'^u]A_v

It is also possible to contract tensors. If you multiply A^u by A_u, you get a scalar. The process is called contracting, and the result is called the size or norm of the 4-vector A^u. A^u A_u is invariant. A^u A^u would not be invariant. The effects of arbitrary coordinate changes do not cancel unless you have one upstairs and one downstairs index.

Let's say you have A^u B_u = 1, and A^u is a 4-vector. Therefore, B_u must also be a 4-vector in order for the right hand side to be a scalar. This method of deducing the nature of the quantities in an equation is called manifest covariance.

A coordinate derivative is when a derivative acts on a single coordinate.

The index must be downstairs since the derivative acts on only one coordinate.

[partial derivative_u] x^v = [delta]_u^v = Diag[1, 1, 1, 1]

Notice we once again see δ, which is an isotropic tensor, meaning that its components are the same in all frames. It must exist in order to define the inverse matrix of a tensor. If you contract a partial derivative with an upstairs index with a partial derivative with a downstairs index, you have the d'Alembertian operator.

The scalar product of two 4-vectors is

a_u b^u = a⁰ b⁰ - a¹ b¹ - a² b² - a³ b³
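As a sketch of why this scalar product is useful, the Python below applies it to the 4-momentum P = γm₀(c, u) from the list above. The product of P with itself comes out as (m₀c)² whether the particle is at rest or moving. Units with c = 1 and the particular mass and velocity are assumptions chosen to keep the numbers simple.

```python
import math

def scalar_product(a, b):
    """a_u b^u = a0 b0 - a1 b1 - a2 b2 - a3 b3, from contravariant components."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def four_momentum(m0, u, c):
    """P = gamma m0 (c, u) for rest mass m0 and ordinary velocity u."""
    speed_squared = sum(ui * ui for ui in u)
    gamma = 1.0 / math.sqrt(1.0 - speed_squared / (c * c))
    return [gamma * m0 * c] + [gamma * m0 * ui for ui in u]

c = 1.0     # assumed units in which c = 1
m0 = 2.0    # illustrative rest mass
P_rest = four_momentum(m0, [0.0, 0.0, 0.0], c)
P_moving = four_momentum(m0, [0.6, 0.0, 0.0], c)

norm_rest = scalar_product(P_rest, P_rest)        # (m0 c)^2
norm_moving = scalar_product(P_moving, P_moving)  # also (m0 c)^2, up to rounding
```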

The metric tensor is a tool used to raise and lower indices.

A_u = g_uv A^v

For example, in special relativity, the 4-derivatives are

[partial derivative_u] = ([partial derivative with respect to ct], [nabla]) = ([partial derivative with respect to ct], [partial derivative with respect to x], [partial derivative with respect to y], [partial derivative with respect to z])

[partial derivative^u] = ([partial derivative with respect to ct], -[nabla]) = ([partial derivative with respect to ct], -[partial derivative with respect to x], -[partial derivative with respect to y], -[partial derivative with respect to z])

so therefore [partial derivative_u]a^u = [partial derivative of a⁰ with respect to ct] + [partial derivative of a¹ with respect to x] + [partial derivative of a² with respect to y] + [partial derivative of a³ with respect to z] = [partial derivative of a⁰ with respect to ct] + [nabla].a

Therefore conservation laws, such as for the 4-current, can be written as ∂^uJ_u = 0.

Tensor equations with indices in the same relative positions on either side must be generally valid. The equation is called generally covariant. This has nothing to do with the word "covariant" in "covariant components".

The determinant of a tensor is defined as follows.

det(g_{[alpha][beta]}) =

|g11   g12|
|         | =
|g21   g22|

g₁₁ g₂₂ - g₁₂ g₂₁

since in matrices


|a           b|
|             | =  ad-bc
|c           d|

g = det g_uv

Then you have

[integral of] [squareroot of -g]pd⁴x^u = constant

[squareroot of -g] is called the Jacobian. For Minkowski spacetime, g is -1, so the Jacobian is +1 or -1. [squareroot of -g]p is the scalar density. An object formed from a tensor and n powers of the square root of -g is called a tensor density of weight n.

If the Jacobian is positive, that is called a proper transformation. If the Jacobian is negative, that is called an improper transformation. In special relativity, a tensor density will transform like a tensor if you restrict yourself to proper transformations. However, with improper transformations, or spatial inversion, tensor densities of odd weights will change sign. These quantities are called pseudotensors. Pseudotensors include pseudovectors and pseudoscalars. The most famous example is the antisymmetric Levi-Civita tensor, ε^αβγδ.

[epsilon]_{[alpha][beta][gamma][delta]} = g[epsilon]^{[alpha][beta][gamma][delta]}

It is equal to +1 if the indices are an even permutation of 0123, and -1 if they are an odd permutation. It's 0 for any other value of indices. Here, a "permutation of 0123" is an ordering of the numbers 0, 1, 2, 3, which can be obtained by starting with 0123 and repeatedly exchanging pairs of digits. An even permutation is obtained by an even number of such exchanges, and an odd permutation is obtained by an odd number.
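The rule for even and odd permutations can be sketched by counting inversions, meaning pairs of indices that appear out of order. An even count corresponds to an even number of exchanges. Counting inversions is one convenient way to get the parity, chosen here for brevity, and the function name is hypothetical.

```python
def levi_civita_4(indices):
    """Value of the 4-index permutation symbol for indices drawn from 0, 1, 2, 3."""
    if sorted(indices) != [0, 1, 2, 3]:
        return 0  # a repeated index means this is not a permutation of 0123
    inversions = sum(1
                     for a in range(4)
                     for b in range(a + 1, 4)
                     if indices[a] > indices[b])
    return 1 if inversions % 2 == 0 else -1

identity = levi_civita_4((0, 1, 2, 3))       # +1, zero exchanges
one_swap = levi_civita_4((1, 0, 2, 3))       # -1, one exchange
two_swaps = levi_civita_4((1, 0, 3, 2))      # +1, two exchanges
repeated = levi_civita_4((0, 0, 2, 3))       # 0, repeated index
```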

The Christoffel symbol of the first kind is represented in various ways including

These are called components of the affine connection. They are defined by

[ij, k] =

| ij |
|  k | =

g_uk[Christoffel symbol, capital gamma]^u_ij = g_uk e^u . [partial derivative of e_i with respect to q^j] = e_k . [partial derivative of e_i with respect to q^j]

where g_uv is the metric tensor and e_i = [partial derivative of r with respect to qⁱ]

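Here is the Christoffel symbol expressed in terms of the metric tensor.

[ij, k] = (1/2)([partial derivative of g_ik with respect to q^j] + [partial derivative of g_jk with respect to q^i] - [partial derivative of g_ij with respect to q^k])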

The following define the covariant derivative.

DA^u = dA^u + [Christoffel symbol]^u_{[alpha][beta]}A^[alpha]dx^[beta]

DA_u = dA_u - [Christoffel symbol]^[alpha]_u[beta]A_[alpha]dx^[beta]

You can generalize equations by replacing ordinary derivatives by covariant derivatives. The equation of motion for a particle would become

F^u = m(DU^u/d[tau])

In order to simplify the notation, partial derivatives are often represented by a comma. Covariant derivatives are often represented by a semicolon.

V^u,_v = [partial derivative of V^u with respect to x^v] = [partial derivative_v V^u]

V^u;_v = DV^u/[partial derivative]x^v = V^u,_v + [Christoffel symbol]^u_[alpha]v V^[alpha]

For instance, for Maxwell's equations, you have F^uv;_v = -u₀J^u

In physics, you frequently use the energy-momentum tensor T^uv. Its value depends on the system. For instance, for a cold fluid with density p₀ in its rest frame, the only non-zero component is the upper left T⁰⁰ = c²p₀. In a general frame, you have

T^uv = c²p₀

 [gamma]^2            -[gamma]^2[beta]         0     0
-[gamma]^2[beta]       [gamma]^2[beta]^2       0     0
 0                     0                       0     0
 0                     0                       0     0

where β = v/c and γ = 1/[squareroot of 1 - (v/c)²]. This is similar to the Lorentz transform. You have two powers of γ, one for the change in number density, and one for the relativistic mass increase. For a perfect fluid, the rest-frame T^uv is

T^uv = Diag[c²[rho], p, p, p]

In a general frame, the energy momentum tensor of a perfect fluid is

T^uv = ([rho] + p/c²)U^uU^v - pg^uv
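The general-frame formula can be checked against the rest-frame form numerically. This is a minimal sketch, assuming the 4-velocity U = γ(c, u) and the Diag[+1, -1, -1, -1] metric from earlier, with made-up values for ρ, p, and c. The function names are hypothetical.

```python
import math

METRIC = [[1.0, 0.0, 0.0, 0.0],
          [0.0, -1.0, 0.0, 0.0],
          [0.0, 0.0, -1.0, 0.0],
          [0.0, 0.0, 0.0, -1.0]]

def four_velocity(u, c):
    """U = gamma (c, u) for ordinary velocity u."""
    gamma = 1.0 / math.sqrt(1.0 - sum(ui * ui for ui in u) / (c * c))
    return [gamma * c] + [gamma * ui for ui in u]

def perfect_fluid_tensor(rho, p, U, c):
    """T^uv = (rho + p/c^2) U^u U^v - p g^uv"""
    return [[(rho + p / c ** 2) * U[a] * U[b] - p * METRIC[a][b]
             for b in range(4)]
            for a in range(4)]

def trace(T):
    """g_uv T^uv, a scalar that is the same in every frame."""
    return sum(METRIC[a][a] * T[a][a] for a in range(4))

c, rho, p = 2.0, 3.0, 0.5    # illustrative values only
T_rest = perfect_fluid_tensor(rho, p, four_velocity([0.0, 0.0, 0.0], c), c)
T_moving = perfect_fluid_tensor(rho, p, four_velocity([1.2, 0.0, 0.0], c), c)
```

In the rest frame the diagonal comes out as c²ρ, p, p, p. In the moving frame the components change, but the trace stays equal to c²ρ - 3p.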

In order to explain the fact that the Universe is homogeneous and isotropic, we invented inflation, which predicted that the Universe is flat. Fortunately for us, we observed that the Universe is flat since the cosmic microwave background is isotropic. However, then we determined that the amount of matter in the Universe was insufficient to cause the Universe to be flat. We then had to invent some sort of dark energy to flatten the Universe. However, this dark energy would also predict that the expansion of the Universe is accelerating. Then, in 1998, measurements of the redshift of distant supernovae showed that the expansion of the Universe really is accelerating, so it's consistent. Primary candidates for the dark energy are the cosmological constant, a rolling scalar field called quintessence, or topological defects.

In 1998, a balloon called Boomerang floated around Antarctica for ten days, and measured the anisotropy of the cosmic microwave background to high precision. The conclusion is that the Universe is very close to flat.

However, on small scales, the Universe is curved. General relativity uses this curvature to explain gravitational effects. The Minkowski metric describes flat spacetime. How do you describe curved spacetime?

In 1884, Edwin A. Abbott wrote a book titled "Flatland" about two dimensional creatures. If a group of such creatures were living on the surface of a large sphere, how would they know it wasn't a plane? Gauss was the first to recognize that you could do this by measuring the angles of a triangle. The angles of a triangle on a plane, or in Euclidean space, always add up to 180° or π radians. The angles of a triangle on a positively curved surface, such as a sphere, add up to more than that. The angles of a triangle on a negatively curved surface, such as a saddle surface, add up to less than that.
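You can try the triangle test numerically. The sketch below takes a triangle on a unit sphere whose sides are arcs of great circles, and measures the angle at each corner between the directions in which the two sides leave it. The corner points are an assumed example (one eighth of the sphere), and the function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def tangent_toward(a, b):
    """Unit tangent at point a of the great circle heading toward b.
    Both a and b are assumed to be unit vectors."""
    ab = dot(a, b)
    t = [bi - ab * ai for ai, bi in zip(a, b)]
    norm = math.sqrt(dot(t, t))
    return [ti / norm for ti in t]

def corner_angle(a, b, c):
    """Angle at corner a between the sides a-b and a-c."""
    cosine = dot(tangent_toward(a, b), tangent_toward(a, c))
    return math.acos(max(-1.0, min(1.0, cosine)))

A = (1.0, 0.0, 0.0)
B = (0.0, 1.0, 0.0)
C = (0.0, 0.0, 1.0)
angle_sum = corner_angle(A, B, C) + corner_angle(B, C, A) + corner_angle(C, A, B)
angle_sum_degrees = math.degrees(angle_sum)   # about 270, not 180
excess = angle_sum - math.pi                  # about pi/2, the area of this triangle
```

For a unit sphere, the amount by which the sum exceeds π equals the area of the triangle, so creatures measuring larger and larger triangles would see larger and larger discrepancies.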

This is actually a specific example of a more general case called parallel transport. Imagine that you have a square on a plane. If you move a vector around this square, it's always either parallel or perpendicular to the line it's on. The vector is always facing in the same direction. Let's say you do the same thing with a triangle on a sphere where each of its angles is 90°. You can get the vector pointing in different directions depending on which direction it takes around the triangle.

Take the path integral around the loop. The extent of curvature is the change in the vector which is proportional to both the vector itself, due to rotation, and the distance along the loop, to first order, so that the total change in going around a small loop is given by

[delta]V^u = '[R^u_{[alpha][gamma][beta]} [path integral ([capital delta]x^[beta]dx^[alpha] - [capital delta]x^[alpha]dx^[beta])]]

Since Γ is a function of position in the loop, it can't be taken out of the loop. Make first order expansions of V^u and Γ as functions of the total displacement from the starting point Δx^u. For Γ, the expansion is a first-order Taylor expansion. For V^u, what matters is the first order change in V^u due to parallel transport.

There's no first order term because

[path integral of] dx^[beta] = 0

around the loop. Writing this twice and permuting α and β gives you

where R^u_αγβ is the Riemann tensor. The Riemann tensor R^u_αγβ is defined as

which can be written as

The Riemann tensor is a fourth order tensor. However, if you multiply A^u by A_u, you get a scalar, where A^u = g^uv A_v. In the same way, by contracting an upstairs index with a downstairs index, you can reduce a fourth order tensor to a second order tensor.

R_{[alpha][beta]} = R^[gamma]_{[alpha][gamma][beta]}

R_{[alpha][beta]} is the Ricci tensor. You can do the same thing again and contract it to the curvature scalar R.

R = g^uvR_uv

The following is the Einstein tensor G^uv.

G^uv = R^uv - (1/2)g^uvR

Notice that the Einstein tensor has zero covariant divergence.