
Inner-product spaces

The vector space R^n, together with the dot-product, is an example of an inner-product space. In particular, R itself (as a vector space), with ordinary multiplication as the inner product, is an inner-product space.

Definitions

Suppose V is a vector space. A function f that converts ordered pairs of vectors in V into scalars is a bilinear form if it satisfies the rules:

- f(u + w, v) = f(u, v) + f(w, v) and f(cu, v) = c f(u, v);
- f(u, v + w) = f(u, v) + f(u, w) and f(u, cv) = c f(u, v).

If f is a bilinear form, it is symmetric if it satisfies:

- f(u, v) = f(v, u).

If f is a symmetric bilinear form, it is positive-definite if it satisfies:

- f(v, v) > 0 whenever v is not the zero vector.

(Note that f(0, 0) = 0 for any bilinear form f.)

An inner product is a positive-definite symmetric bilinear form.

An inner-product space is a vector space with an inner product; usually the inner product is denoted by angle brackets, so that <u, v> is the scalar that results from applying the inner product to the pair (u, v) of vectors. By positive-definiteness, the norm |v| of a vector v in an inner-product space can be defined to be the nonnegative square root of <v, v>.
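For concreteness, here is a minimal Python sketch (using NumPy; the names inner and norm are just illustrative) that spot-checks bilinearity, symmetry, and positive-definiteness for the dot-product on R^3 and computes the norm it induces.

```python
import numpy as np

def inner(u, v):
    # Candidate inner product: the ordinary dot-product on R^n.
    return float(np.dot(u, v))

def norm(v):
    # Norm induced by the inner product: the nonnegative square root of <v, v>.
    return np.sqrt(inner(v, v))

u = np.array([1.0, 2.0, -1.0])
v = np.array([0.5, 0.0, 3.0])
w = np.array([2.0, -1.0, 1.0])
c = 4.0

# Bilinearity (spot check in the first argument; the second is similar).
assert np.isclose(inner(c * u + w, v), c * inner(u, v) + inner(w, v))

# Symmetry.
assert np.isclose(inner(u, v), inner(v, u))

# Positive-definiteness (spot check): <v, v> > 0 for this nonzero v.
assert inner(v, v) > 0

print(norm(v))   # sqrt(0.25 + 0 + 9)
```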

Examples on R^n

There are bilinear forms that are not symmetric, and symmetric bilinear forms that are not positive-definite.

Indeed, suppose f is a bilinear form on R^n. Then f(u, v) is the sum of the terms

a_{ij} u_j v_i ,

where a_{ij} = f(e_j, e_i), and where i and j range from 1 to n. Therefore

f(u, v) = v^T A u ,

where A is the matrix (a_{ij}). The bilinear form f is hence symmetric if and only if the matrix A is symmetric.
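To see the matrix description in action, the following Python/NumPy sketch builds the matrix A = (a_{ij}) of a sample bilinear form on R^3 from the values f(e_j, e_i) and checks that f(u, v) = v^T A u; the particular form f is an arbitrary choice made for illustration.

```python
import numpy as np

# A sample bilinear form on R^3, chosen arbitrarily for illustration:
# f(u, v) = v^T M u for a fixed (not necessarily symmetric) matrix M.
M = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, -1.0],
              [4.0, 0.0, 5.0]])

def f(u, v):
    return float(v @ M @ u)

n = 3
E = np.eye(n)   # E[i] is the standard basis vector e_i

# Build A entrywise: a_ij = f(e_j, e_i).
A = np.array([[f(E[j], E[i]) for j in range(n)] for i in range(n)])

u = np.array([1.0, -2.0, 3.0])
v = np.array([0.5, 1.0, 2.0])

# The matrix A recovers f:  f(u, v) = v^T A u.
assert np.isclose(f(u, v), v @ A @ u)

# f is symmetric exactly when A is symmetric (here it is not).
print(np.allclose(A, A.T))   # False for this sample M
```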

Suppose that the bilinear form f is symmetric, and that the matrix A just named is diagonalizable; say

A = P D P^{-1} ,

where D is a diagonal matrix with diagonal entries d_i; these entries are the eigenvalues of A. Suppose further that P^{-1} = P^T (that is, P is an orthogonal matrix). Then

f(u, v) = v^T P D P^T u = (P^T v)^T D (P^T u) .

Hence f(Pu, Pv) = v^T D u, and therefore

f(Pu, Pv) = d_1 u_1 v_1 + d_2 u_2 v_2 + ... + d_n u_n v_n .

In particular, f is positive-definite if and only if each eigenvalue of A is positive.
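The eigenvalue criterion is easy to test numerically. The sketch below (Python/NumPy, with an arbitrarily chosen symmetric matrix) uses numpy.linalg.eigh, which produces exactly a decomposition A = P D P^T with P orthogonal, and then checks whether every eigenvalue is positive.

```python
import numpy as np

# An arbitrarily chosen symmetric matrix.
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

# eigh returns eigenvalues d and an orthogonal matrix P with A = P D P^T.
d, P = np.linalg.eigh(A)

assert np.allclose(P @ np.diag(d) @ P.T, A)      # A = P D P^T
assert np.allclose(P.T @ P, np.eye(3))           # P is orthogonal

# The symmetric bilinear form f(u, v) = v^T A u is positive-definite
# exactly when every eigenvalue is positive.
print(d, np.all(d > 0))
```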

It is in fact the case (although it is a challenge to prove) that every symmetric matrix is diagonalizable by an orthogonal matrix in the way just described. So, every n×n symmetric matrix A with positive eigenvalues determines on R^n the inner product given by the formula

<u, v> = v^T A u .

A special case is when A is diagonal; then the inner product is a so-called weighted-Euclidean inner product. If A is the identity matrix, then the inner product is just the Euclidean inner product, which is the dot-product.
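For example, a weighted Euclidean inner product on R^3 with positive weights 1, 2, 3 corresponds to the diagonal matrix with those entries; a minimal NumPy sketch:

```python
import numpy as np

# Weighted Euclidean inner product on R^3: <u, v> = 1*u1*v1 + 2*u2*v2 + 3*u3*v3.
w = np.array([1.0, 2.0, 3.0])   # positive weights
A = np.diag(w)                  # the corresponding diagonal matrix

def inner(u, v):
    return float(v @ A @ u)

u = np.array([1.0, 0.0, 2.0])
v = np.array([3.0, -1.0, 1.0])

# Same value as the explicit weighted sum.
assert np.isclose(inner(u, v), np.sum(w * u * v))

# With the identity matrix, this reduces to the ordinary dot-product.
assert np.isclose(float(v @ np.eye(3) @ u), np.dot(u, v))
```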

Examples not on R^n

Let C[a, b] denote the vector space of all continuous functions on the interval [a, b]. The formula

<f, g> = ∫_a^b f(x) g(x) dx

defines an inner product on C[a, b].

Let P_n denote the vector space of polynomials of degree at most n. If x_0, x_1, ..., x_n are distinct numbers, then the formula

<p, q> = p(x_0) q(x_0) + p(x_1) q(x_1) + ... + p(x_n) q(x_n)

defines an inner product on P_n.

Orthogonality

In an inner-product space, two vectors are orthogonal if their inner product is zero.

If V is an inner-product space, and W is a subspace of V, then the orthogonal complement of W in V is the subspace of V comprising every vector that is orthogonal to every vector of W. (The orthogonal complement of W is often denoted W⊥, read "W perp".)

Note that the orthogonal complement of W in V is a subset of V, by definition; it is a subspace, by proof.

We can restate a fact noted earlier, thus:

Theorem. With respect to the dot-product, the orthogonal complement of the row space of a matrix is precisely the nullspace of the matrix.
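This can be observed numerically. The NumPy sketch below, with an arbitrarily chosen rank-2 matrix, extracts a basis of the nullspace from the singular value decomposition and checks that every row of the matrix (hence every vector in the row space) is orthogonal to it under the dot-product.

```python
import numpy as np

# A sample matrix of rank 2, chosen for illustration.
A = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.0],
              [1.0, 0.0, 1.0, 0.0]])

# Basis of the nullspace from the SVD: the right singular vectors
# beyond the (numerical) rank span the nullspace of A.
U, s, Vt = np.linalg.svd(A)
tol = 1e-10
null_basis = Vt[np.sum(s > tol):]        # rows of Vt spanning the nullspace

# Every row of A is orthogonal, under the dot-product, to every
# nullspace basis vector.
print(np.allclose(A @ null_basis.T, 0))  # True
```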

You can show that if the rank of a matrix A is equal to the number of columns of A, then the matrix A^T A is invertible. From this, you can show:

Theorem. If A is an arbitrary m×n matrix, and b is a vector in R^m, then the linear system

A^T A x = A^T b

is consistent, and there is a unique vector v in the column space of A such that v = Ax for some solution x of the given system.

The unique vector v in the Theorem is such that v - b is orthogonal to the column space of A. It can therefore be called the (orthogonal) projection of b into the column space of A; it is the vector in the column space whose distance from b is minimal.

The system A^T A x = A^T b is called the normal system associated with Ax = b; a solution to the normal system is a least-squares solution to the system Ax = b itself (even though this system may be inconsistent).
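A minimal numerical sketch of this recipe (Python/NumPy; the matrix A and vector b are arbitrary illustrations): it solves the normal system, forms the projection v = Ax, and checks that v - b is orthogonal to the columns of A; NumPy's own least-squares routine returns the same x.

```python
import numpy as np

# An inconsistent system Ax = b (more equations than unknowns).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Solve the normal system  A^T A x = A^T b.
x = np.linalg.solve(A.T @ A, A.T @ b)

v = A @ x     # projection of b into the column space of A

# v - b is orthogonal to every column of A ...
print(np.allclose(A.T @ (v - b), 0))          # True

# ... and x agrees with NumPy's least-squares routine.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_ls))                   # True
```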

If the matrix A has a single column, namely the vector a, then the vector v of the Theorem is just the vector

((b·a)/(a·a))a,

where ( · ) is the dot-product.

If u and v are vectors in an arbitrary inner-product space V, the vector

(<u, v>/<v, v>)v

is the (orthogonal) projection of u onto v, and is denoted by proj_v(u). In terms of the operation of projection, we can describe the Gram-Schmidt Process as follows:

Suppose {u_1, u_2, ..., u_r} is a finite, linearly independent set, spanning a subspace W of an inner-product space V. We can find a set {v_1, v_2, ..., v_r} such that:

- v_1 = u_1;
- for each k from 2 to r, v_k = u_k - proj_{v_1}(u_k) - proj_{v_2}(u_k) - ... - proj_{v_{k-1}}(u_k).

Then in fact:

- the span of {v_1, v_2, ..., v_r} is W;
- the vectors v_1, v_2, ..., v_r are nonzero and pairwise orthogonal.

The latter property means that {v_1, v_2, ..., v_r} is an orthogonal basis of its span; the former property means that this span is W. If u is a vector in V, then the vector

proj_{v_1}(u) + proj_{v_2}(u) + ... + proj_{v_r}(u)

is the (orthogonal) projection of u into W; denoted by proj_W(u), it is the unique vector v in W such that u - v is in the orthogonal complement of W.
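The process translates directly into code. Here is a short NumPy sketch that implements Gram-Schmidt with respect to the dot-product and then uses the resulting orthogonal basis to project a vector into W; the helper names proj, gram_schmidt and proj_W are just illustrative.

```python
import numpy as np

def proj(u, v):
    # (Orthogonal) projection of u onto v: (<u, v>/<v, v>) v, using the dot-product.
    return (np.dot(u, v) / np.dot(v, v)) * v

def gram_schmidt(us):
    # Given linearly independent u_1, ..., u_r, return orthogonal v_1, ..., v_r
    # with the same span: v_k = u_k - proj_{v_1}(u_k) - ... - proj_{v_{k-1}}(u_k).
    vs = []
    for u in us:
        v = u - sum((proj(u, w) for w in vs), np.zeros_like(u))
        vs.append(v)
    return vs

def proj_W(u, vs):
    # Projection of u into W = span(vs), for an orthogonal basis vs of W.
    return sum((proj(u, v) for v in vs), np.zeros_like(u))

us = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0])]
vs = gram_schmidt(us)

print(np.isclose(np.dot(vs[0], vs[1]), 0))               # the v's are orthogonal

u = np.array([3.0, 1.0, 2.0])
p = proj_W(u, vs)
print(np.allclose([np.dot(u - p, v) for v in vs], 0))    # u - p is orthogonal to W
```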

Applications

Fitting polynomial functions to points

Suppose that (x_0, y_0), (x_1, y_1), ..., (x_n, y_n) are n + 1 points in the Cartesian plane with distinct x-coordinates. Suppose m is a nonnegative integer no greater than n. We can form an (n+1)×(m+1) matrix A whose row i is

(1 x_i x_i^2 ... x_i^m) .

Let y be the vector (y_0 y_1 y_2 ... y_n)^T in R^{n+1}. There is a vector a in R^{m+1} (in fact a unique vector) such that Aa is the projection of y into the column space of A. (That is, a satisfies A^T A a = A^T y.) Let q_m be the polynomial

a_0 + a_1 x + a_2 x^2 + ... + a_m x^m ;

then q_m is the polynomial in P_m best fitted to the given points, in the sense that the sum of the squares (q_m(x_i) - y_i)^2 is as small as possible. In particular, if m = n, the polynomial meets the points exactly: we have

q_n(x_i) = y_i

for each i from 0 to n. In general, q_m is the projection of q_n into P_m with respect to the inner product on P_n described above.
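The fitting recipe is straightforward to carry out. The sketch below (Python/NumPy, with made-up data points) builds the matrix A, solves the normal system A^T A a = A^T y for a degree-1 fit, and compares the result with NumPy's polyfit.

```python
import numpy as np

# Made-up data points (x_i, y_i) with distinct x-coordinates.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 1.5, 3.5])

m = 1                                      # fit a polynomial of degree at most m
A = np.vander(x, m + 1, increasing=True)   # row i is (1, x_i, ..., x_i^m)

# Solve the normal system A^T A a = A^T y; a holds the coefficients a_0, ..., a_m.
a = np.linalg.solve(A.T @ A, A.T @ y)

def q(t):
    # The best-fit polynomial q_m(t) = a_0 + a_1 t + ... + a_m t^m.
    return sum(a[k] * t**k for k in range(m + 1))

print(a)            # coefficients of the least-squares line
print(q(1.5))       # value of the fitted polynomial at t = 1.5

# polyfit returns the same polynomial (coefficients in the opposite order).
print(np.allclose(a[::-1], np.polyfit(x, y, m)))    # True
```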

Approximating continuous functions by trigonometric functions

On the space of continuous functions on the interval [0, 2π], we have an inner product as described above. With respect to this inner product, every function on the following list is orthogonal to every other function:

1; cos x, cos 2x, cos 3x, ...; sin x, sin 2x, sin 3x, ....

Except for the constant function 1, they all have the same norm, namely √π, the square root of pi; but the norm of 1 is √(2π), the square root of twice pi. If f is a continuous function defined on the interval [0, 2π], then for any positive integer n, we can project f into the span of the set {1, cos x, cos 2x, ..., cos nx, sin x, sin 2x, ..., sin nx}; the projection of f is the function

a_0/2 + (a_1 cos x + b_1 sin x) + (a_2 cos 2x + b_2 sin 2x) + ... + (a_n cos nx + b_n sin nx) ,

where:

a_k = (1/π) ∫_0^{2π} f(x) cos kx dx for k from 0 to n, and

b_k = (1/π) ∫_0^{2π} f(x) sin kx dx for k from 1 to n.
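These coefficients can be approximated numerically. The sketch below (Python/NumPy, with f(x) = x on [0, 2π] as an arbitrary example) estimates a_k and b_k with a trapezoidal approximation of the integrals and evaluates the resulting trigonometric projection.

```python
import numpy as np

f = lambda x: x     # an arbitrary continuous function on [0, 2*pi]
n = 3               # project onto span{1, cos x, ..., cos nx, sin x, ..., sin nx}

xs = np.linspace(0.0, 2 * np.pi, 4001)
dx = xs[1] - xs[0]

def integral(values):
    # Trapezoidal approximation of the integral over [0, 2*pi].
    return np.sum((values[:-1] + values[1:]) / 2.0) * dx

fx = f(xs)
a = [integral(fx * np.cos(k * xs)) / np.pi for k in range(n + 1)]
b = [integral(fx * np.sin(k * xs)) / np.pi for k in range(1, n + 1)]

def projection(x):
    # a_0/2 + (a_1 cos x + b_1 sin x) + ... + (a_n cos nx + b_n sin nx)
    return a[0] / 2 + sum(a[k] * np.cos(k * x) + b[k - 1] * np.sin(k * x)
                          for k in range(1, n + 1))

# For f(x) = x the coefficients come out to a_0 = 2*pi, a_k = 0 and b_k = -2/k for k >= 1.
print(np.round(a, 3), np.round(b, 3))
print(projection(np.pi / 2))   # value of the approximation at x = pi/2
```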

