Beyond three dimensions

n-space

Let R be the set of real numbers (on which are defined the usual operations of multiplication, addition and additive inversion). Let Rⁿ be the set of vectors (u₁ u₂ ... u_n)^T. In the last section, we treated the geometry of R³. Properly understood, the same notions make sense in Rⁿ for any n.

Dot-product and norm

Suppose u and v are vectors in Rⁿ. Their dot-product is given by the formula

u·v = u₁v₁ + u₂v₂ + ... + u_nv_n.

In particular, u·u is never negative, so the norm of u can be defined to be the nonnegative scalar |u| such that

|u|² = u·u.

Note in particular that u·u and |u| are positive if (and only if) u is not 0.
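
The two definitions above translate directly into a few lines of Python. This is only a minimal sketch: the names dot and norm are our own choices, and vectors are assumed to be given as plain lists of numbers.

```python
import math

def dot(u, v):
    """Dot-product u.v of two vectors with the same number of entries."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of entries")
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Norm |u|: the nonnegative square root of u.u."""
    return math.sqrt(dot(u, u))

u = [1.0, 2.0, 2.0, 4.0]   # a vector in R^4
print(dot(u, u))           # 25.0
print(norm(u))             # 5.0
```

Nothing in the code depends on the number of entries, which is the point of working in Rⁿ.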

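Call two vectors parallel if one is a scalar multiple of the other. (Note that 0 is a multiple of every vector.) Assume now that u is not 0. If u and v are parallel, then ku - v = 0 for some scalar k; otherwise, the equation is true for no scalar k, so that |ku - v|² is positive for every k. Now |ku - v|² = k²|u|² - 2k u·v + |v|², which is a quadratic polynomial in k. It has exactly one real root in the first case and no real root in the second, so its discriminant 4(u·v)² - 4|u|²|v|² is zero in the first case and negative in the second. (If u is 0, then u and v are parallel and both sides of the inequality below are 0.) Considering the two cases separately yields the Cauchy–Schwarz Inequality: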

|u·v| ≤ |u||v|,

with equality if and only if u and v are parallel.
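
The inequality can be spot-checked numerically. The sketch below is an illustration, not a proof: the helper name cauchy_schwarz_gap is hypothetical, and because of floating-point rounding the equality case shows up only as a gap very close to 0.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def cauchy_schwarz_gap(u, v):
    """Return |u||v| - |u.v|, which the inequality says is never negative."""
    return norm(u) * norm(v) - abs(dot(u, v))

u = [1.0, 2.0, 3.0, 4.0]
not_parallel = [2.0, 0.0, -1.0, 5.0]
parallel = [-2.0, -4.0, -6.0, -8.0]        # equals -2u

print(cauchy_schwarz_gap(u, not_parallel))  # clearly positive
print(cauchy_schwarz_gap(u, parallel))      # zero, up to rounding
```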

Because of the Cauchy–Schwarz Inequality, if neither u nor v is 0, there is exactly one real number θ between 0 and π (inclusive) such that

u·v = |u||v| cos θ;

so you can think of θ as the angle between u and v. In particular, u and v are orthogonal when u·v = 0.
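
Solving the last equation for θ gives a way to compute the angle. The following is a minimal sketch, assuming both vectors are nonzero; the function name angle is our own. Since math.acos returns a value from 0 to π, that is the range in which the angle is reported.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def angle(u, v):
    """Angle in radians between nonzero vectors u and v."""
    c = dot(u, v) / (norm(u) * norm(v))
    # Rounding can push c slightly outside [-1, 1]; clamp before acos.
    c = max(-1.0, min(1.0, c))
    return math.acos(c)

print(angle([1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 3.0, 0.0]))  # orthogonal: pi/2
print(angle([1.0, 1.0], [1.0, 0.0]))                      # close to pi/4
```

The clamp is a design choice to guard against rounding; Cauchy–Schwarz guarantees that the exact quotient already lies between -1 and 1.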

Expanding (u + v)·(u + v) by means of the formula for the dot-product, one finds

|u + v|² = |u|² + 2u·v + |v|².

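By Cauchy–Schwarz, u·v ≤ |u·v| ≤ |u||v|, so the right-hand side is at most |u|² + 2|u||v| + |v|² = (|u| + |v|)². Taking square roots yields the triangle inequality: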

|u + v| ≤ |u| + |v|.

One also has the Pythagorean Theorem: |u + v|² = |u|² + |v|² when u and v are orthogonal.
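
Both facts can be checked on particular vectors. The sketch below uses vectors in R⁴ chosen by us for illustration; their entries are small whole numbers, so the floating-point comparisons happen to be exact here, which need not hold for arbitrary inputs.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

u = [3.0, 0.0, 4.0, 0.0]
v = [0.0, 2.0, 0.0, -1.0]   # orthogonal to u, since u.v = 0
w = [1.0, 1.0, 1.0, 1.0]    # not orthogonal to u

# Triangle inequality: |u + w| <= |u| + |w|.
print(norm(add(u, w)) <= norm(u) + norm(w))                 # True
# Pythagorean Theorem for the orthogonal pair u, v.
print(dot(add(u, v), add(u, v)) == dot(u, u) + dot(v, v))   # True
```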

Everything here makes sense geometrically, but it all follows from algebraic facts.

An alternative definition

Instead of starting with the dot-product, we can define the norm so that

|u|² = u₁² + u₂² + ... + u_n².

Then we can define the dot-product by the equation

4u·v = |u + v|² - |u - v|².
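
To see that this alternative definition agrees with the original one, here is a small sketch that computes the dot-product from squared norms alone. The names norm_sq and dot_from_norm are hypothetical.

```python
def norm_sq(u):
    """Squared norm |u|^2, defined directly as the sum of squared entries."""
    return sum(a * a for a in u)

def dot_from_norm(u, v):
    """Dot-product recovered from the identity 4u.v = |u + v|^2 - |u - v|^2."""
    s = [a + b for a, b in zip(u, v)]
    d = [a - b for a, b in zip(u, v)]
    return (norm_sq(s) - norm_sq(d)) / 4

u = [1, 2, 3]
v = [4, 5, 6]
print(dot_from_norm(u, v))                # 32.0
print(sum(a * b for a, b in zip(u, v)))   # 32, from the original formula
```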

Linear transformations

Theoretical definition

A linear transformation from Rⁿ to R^m is a function T such that T(x) is in R^m when x is in Rⁿ, and satisfying the following rules for all x and y in Rⁿ and all scalars k:

T(x + y) = T(x) + T(y),

T(kx) = kT(x).

Let e_i be the vector in Rⁿ which has 1 in row i and 0 everywhere else. If u is in Rⁿ, then

u = u₁e₁ + u₂e₂ + ... + u_ne_n ,

and therefore

T(u) = u₁T(e₁) + u₂T(e₂) + ... + u_nT(e_n).

This will justify the following:

Practical definition

A linear transformation from Rⁿ to R^m is the function of multiplication by an m×n matrix; if A is such a matrix, then the corresponding linear transformation is denoted T_A , and satisfies the rule:

T_A(x) = Ax.

In particular, column i of A is just T_A(e_i). So if T is an arbitrary linear transformation, then, by the formula for T(u) above, it is multiplication by the matrix

(T(e₁) T(e₂) ... T(e_n));

we can denote this matrix by [T]. The equations

[T_A] = A and T_[T] = T

symbolize the equivalence of the two definitions of linear transformations.
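
The passage from a transformation to its matrix can be carried out mechanically. The sketch below builds [T] column by column from the values T(e_i) and then checks that multiplication by [T] agrees with T. The function names and the sample transformation T are our own, chosen only for illustration; matrices are assumed to be stored as lists of rows.

```python
def standard_basis(n):
    """The vectors e_1, ..., e_n of R^n."""
    return [[1 if j == i else 0 for j in range(n)] for i in range(n)]

def matrix_of(T, n):
    """Matrix [T] whose column i is T(e_i), returned as a list of rows."""
    columns = [T(e) for e in standard_basis(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def apply(A, x):
    """The product Ax of a matrix A (list of rows) with a vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

# Hypothetical example: a linear transformation from R^3 to R^2.
def T(x):
    return [x[0] + 2 * x[1], 3 * x[2] - x[1]]

A = matrix_of(T, 3)
print(A)                    # [[1, 2, 0], [0, -1, 3]]
print(apply(A, [1, 1, 1]))  # [3, 2]
print(T([1, 1, 1]))         # [3, 2]
```

Note that matrix_of produces a meaningful answer only if T really is linear; the code does not check this.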

Linear operators

A linear transformation from Rⁿ to itself is a linear operator on Rⁿ. Suppose T is such. Its eigenvalues and eigenvectors are defined as for [T]. Suppose λ is an eigenvalue of T, with corresponding eigenvector u ; then u is not 0, and

T(u) = λu ;

so T sends the line of scalar multiples of u into itself (although T reverses directions along this line if λ is negative, and collapses u to 0 if λ = 0). If [T] happens to be a diagonal matrix, then its diagonal entries are just the eigenvalues of T, and the vectors e_i are eigenvectors. In this case, T effects a dilation, a contraction, a reflection (combined with a dilation or contraction, unless the eigenvalue is -1) or a 'collapse' in the direction of e_i, depending on whether the corresponding eigenvalue is greater than 1, between 0 and 1, negative, or 0; an eigenvalue of 1 leaves that direction unchanged.

A linear operator always has at least one eigenvalue if complex numbers are allowed, but the eigenvalues might not be real numbers. Such is the case in R² if T effects rotation through an angle which is not a multiple of two right angles. If the angle is θ, then T(e₁) = (cos θ sin θ)^T and T(e₂) = (-sin θ cos θ)^T.
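
For a 2×2 matrix, the eigenvalues are the roots of the polynomial x² - (trace)x + (determinant), so the rotation example can be examined directly. The sketch below assumes this standard characteristic polynomial; the function names are our own, and cmath.sqrt is used so that a negative discriminant gives complex roots instead of an error.

```python
import cmath
import math

def rotation_matrix(theta):
    """Matrix whose columns are T(e1) and T(e2) for rotation through theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def eigenvalues_2x2(A):
    """Roots of x^2 - (trace)x + det, the characteristic polynomial of A."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)

# A right angle: the eigenvalues are approximately i and -i, not real.
print(eigenvalues_2x2(rotation_matrix(math.pi / 2)))
# Two right angles: both eigenvalues are approximately -1, which is real.
print(eigenvalues_2x2(rotation_matrix(math.pi)))
```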

Linear transformations as functions

Composition of linear transformations corresponds to multiplication of matrices:

T_AT_B = T_AB.
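
Here is a small check of this rule on particular matrices of our own choosing, with matrices stored as lists of rows. It is a sketch of one instance, not a proof.

```python
def apply(A, x):
    """The product Ax of a matrix A (list of rows) with a vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(A, B):
    """The matrix product AB; A must have as many columns as B has rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [0, 1], [3, 0]]    # 3x2, so T_A goes from R^2 to R^3
B = [[2, 0, 1], [1, -1, 0]]     # 2x3, so T_B goes from R^3 to R^2
x = [1, 2, 3]

print(apply(A, apply(B, x)))    # T_A(T_B(x)) = [3, -1, 15]
print(apply(matmul(A, B), x))   # T_AB(x)     = [3, -1, 15]
```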

A linear transformation T is one-to-one if T(x) and T(y) are distinct whenever x and y are distinct. If T is one-to-one, and T is in fact a linear operator, then the matrix [T] is invertible, and T itself has an inverse, namely the linear operator T^-1 such that

[T^-1] = [T]^-1.
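
In R², the inverse matrix has a simple closed form, which makes the last equation easy to test. The sketch below assumes a 2×2 matrix with integer entries and uses exact fractions to avoid rounding; the function names are hypothetical.

```python
from fractions import Fraction

def apply(A, x):
    """The product Ax of a matrix A (list of rows) with a vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def inverse_2x2(A):
    """Inverse of a 2x2 integer matrix, or an error if there is none."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible, so T is not one-to-one")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

M = [[2, 1], [5, 3]]            # plays the role of [T]
M_inv = inverse_2x2(M)          # plays the role of [T^-1]
x = [7, -4]

print(apply(M_inv, apply(M, x)) == x)   # True: T^-1 undoes T
```

When the determinant is 0, some nonzero vector is sent to 0, so T cannot be one-to-one; the code reports that case as an error.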
