David Pierce | Mathematics | M.S.G.S.Ü.


Determinants

A definition

We wish to assign to every square matrix A a determinant, which will be a scalar denoted by det A. The required property is that the determinant of a square matrix be 0 precisely when the matrix is not invertible.

One possibility would be to let the determinant of every invertible matrix be 1. In fact this is the only possibility when the scalars that we use are not arbitrary real or complex numbers, but just 0 and 1, considered as constituting the “two-element field”: this means 0 and 1 add and multiply as usual, except that 1 + 1 = 0.

In the special case just described, the determinant function is multiplicative: for all square matrices A and B of the same size,

det(AB) = det A det B.

In the general case we shall want to retain this property. Since det A ≠ 0 when A is invertible, and multiplicativity gives det I = det(I·I) = (det I)², we thus require

det I = 1.

Since every invertible matrix is a product of elementary matrices, it will now be enough to define their determinants (if determinants do exist as desired). Thus we make the following definitions (which are compatible with multiplicativity).

  1. If E results from multiplying a row of I by the scalar a, then det E = a.
  2. If E results from interchanging two rows of I, then det E = -1.
  3. If E results from adding a multiple of one row of I to another, then det E = 1.
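
As a concrete illustration (a sketch of my own, not from the notes), the three kinds of elementary matrix can be built in Python as lists of lists; the comment on each constructor records the determinant that the corresponding rule assigns.

    def identity(n):
        # The n-by-n identity matrix I.
        return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

    def scale_row(n, i, a):
        # E results from multiplying row i of I by a; det E = a.
        E = identity(n)
        E[i][i] = a
        return E

    def swap_rows(n, i, j):
        # E results from interchanging rows i and j of I; det E = -1.
        E = identity(n)
        E[i], E[j] = E[j], E[i]
        return E

    def add_multiple(n, i, j, a):
        # E results from adding a times row j to row i of I; det E = 1.
        E = identity(n)
        E[i][j] = a
        return E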

If A is a product E1E2⋅⋅⋅Er of elementary matrices, then we must have

det A = det E1 det E2 ⋅⋅⋅ det Er.

However, we shall have to check that this is a valid definition, since the same matrix may have different factorizations as a product of elementary matrices.

Meanwhile, if we denote by diag(d1, ..., dn) the diagonal matrix with diagonal entries d1, …, dn, then we must have

det diag(d1, ..., dn) = d1⋅⋅⋅dn.

Moreover, if T is merely a triangular matrix (upper or lower) whose diagonal entries are d1, …, dn, then again det T = d1⋅⋅⋅dn.

Suppose now that, by rearranging the rows of diag(d1, ..., dn), we obtain a matrix F. This is just a matrix in which every row and every column has at most one nonzero entry. Working from left to right, interchanging rows, we return F to diagonal form. If the number of interchanges required is k, then we define

det F = (-1)^k d1⋅⋅⋅dn.

We use this to define the determinants of arbitrary square matrices. Suppose A is an n×n matrix (aij). For an arbitrary permutation σ (“sigma”) of the set {1,...,n}, let Aσ be the n×n matrix whose entry in row i, column j, is aij, provided σ(j) = i; otherwise the entry is 0. Then det Aσ is defined as for F. We now define det A to be the sum of the det Aσ for all possible values of σ (the number of such values being n!). In short,

det A = ∑σ det Aσ.

For a cleaner presentation of this definition, we note that, for some permutation σ,

F = diag(d1, ..., dn)σ.

If each of the scalars di is equal to 1, we define

sgn(σ) = det F;

this is the signum or sign of σ, and it is ±1. Now

det A = ∑σ sgn(σ) aσ(1)1⋅⋅⋅aσ(n)n,

or equivalently

det A = ∑σ sgn(σ) ∏1≤j≤n aσ(j)j.
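
As a sketch of how this sum can be evaluated literally (my own illustration; the names det and sign are not from the notes), Python's itertools.permutations enumerates the n! permutations. Indices are 0-based, and the sign of a permutation is computed by counting inversions, which agrees with the signum defined below.

    from itertools import permutations

    def sign(sigma):
        # +1 or -1 according to the parity of the number of inversions.
        s = 1
        for i in range(len(sigma)):
            for j in range(i + 1, len(sigma)):
                if sigma[i] > sigma[j]:
                    s = -s
        return s

    def det(A):
        # The permutation-sum formula, with n! terms.
        n = len(A)
        total = 0
        for sigma in permutations(range(n)):
            term = sign(sigma)
            for j in range(n):
                term *= A[sigma[j]][j]    # the entry a_{sigma(j), j}
            total += term
        return total

    assert det([[1, 2], [3, 4]]) == -2    # 1*4 - 2*3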

One can check that the determinant function as just defined is multiplicative, provided that the signum function is multiplicative. We can obtain the latter multiplicativity by using the definition

sgn(σ) = ∏1≤i<j≤n ([σ(i)−σ(j)]/[i−j]).
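
A minimal sketch of this product formula in Python (0-based indices again; exact arithmetic with fractions.Fraction avoids any rounding), together with a brute-force check that sgn(σ∘τ) = sgn(σ) sgn(τ) when n = 3:

    from fractions import Fraction
    from itertools import permutations

    def sgn(sigma):
        # The product of (sigma(i) - sigma(j)) / (i - j) over all i < j.
        p = Fraction(1)
        n = len(sigma)
        for i in range(n):
            for j in range(i + 1, n):
                p *= Fraction(sigma[i] - sigma[j], i - j)
        return int(p)    # the product is always +1 or -1

    # Multiplicativity: the sign of a composite is the product of the signs.
    for sigma in permutations(range(3)):
        for tau in permutations(range(3)):
            composite = tuple(sigma[tau[k]] for k in range(3))
            assert sgn(composite) == sgn(sigma) * sgn(tau)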

Properties

From the way their definition has been obtained, determinants of square matrices have the following properties.

  1. A is invertible if and only if det A is not zero.
  2. det I = 1.
  3. If E results from multiplying a row of I by the scalar a, then det E = a.
  4. If E results from interchanging two rows of I, then det E = -1.
  5. If E results from adding a multiple of one row of I to another, then det E = 1.
  6. det (E1E2...Er) = det E1 det E2 ... det Er when the Ek are elementary matrices.
  7. More generally, det (AB) = det A det B.
  8. det A^T = det A (since taking the transpose of an elementary matrix does not change its determinant).

A technique

A straightforward way to calculate the determinant of a square matrix A is this: using the elementary row-operations except the scaling of rows, reduce A to an upper-triangular matrix. Each row-interchange causes a change of sign of the determinant of the matrix; adding a multiple of one row to another causes no change. The determinant of the upper-triangular matrix is the product of its diagonal entries.
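
Here is a Python sketch of the technique, assuming exact rational entries (fractions.Fraction), so that the elimination involves no rounding; the name det_by_reduction is mine.

    from fractions import Fraction

    def det_by_reduction(A):
        # Reduce A to upper-triangular form using only row interchanges
        # and additions of multiples of rows, tracking the sign.
        A = [[Fraction(x) for x in row] for row in A]
        n = len(A)
        sign = 1
        for j in range(n):
            # Find a nonzero pivot in column j, at or below the diagonal.
            pivot = next((i for i in range(j, n) if A[i][j] != 0), None)
            if pivot is None:
                return Fraction(0)    # no pivot: the matrix is not invertible
            if pivot != j:
                A[j], A[pivot] = A[pivot], A[j]
                sign = -sign          # each interchange changes the sign
            for i in range(j + 1, n):
                m = A[i][j] / A[j][j]
                A[i] = [A[i][k] - m * A[j][k] for k in range(n)]
        product = Fraction(1)
        for j in range(n):
            product *= A[j][j]        # product of the diagonal entries
        return sign * product

    assert det_by_reduction([[0, 2], [3, 4]]) == -6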

More properties

Suppose two rows of a square matrix are identical. Then interchanging those identical rows must both change the sign of the determinant, and keep the determinant the same. Therefore the determinant must be zero.

Likewise, if one row is a multiple of another—in particular, if one row is zero—then the determinant is zero.

The same goes if we speak of columns instead of rows, since transposing a matrix does not change its determinant.

Another technique

An alternative definition of the determinant can be developed thus. Let A be the n×n matrix (aij). If n = 1, then det A is just the value of the single entry a11. Suppose n > 1. The minor of the entry aij is then the determinant of the (n−1)×(n−1) matrix resulting from deleting row i and column j of A. The cofactor of aij is (−1)^(i+j) times the minor of aij.

For each entry of the left column of A, take the product of that entry with its cofactor; the sum of these n products is det A. In fact, you can start with any column or row; take the products of its entries with their respective cofactors, add up the products—the result is det A.
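
A recursive Python sketch of expansion along the first column (0-based indices, so the cofactor sign (−1)^(i+j) with j = 0 reduces to (−1)^i):

    def det_cofactor(A):
        n = len(A)
        if n == 1:
            return A[0][0]            # the single entry a11
        total = 0
        for i in range(n):
            # The minor: delete row i and column 0.
            minor = [row[1:] for k, row in enumerate(A) if k != i]
            cofactor = (-1) ** i * det_cofactor(minor)
            total += A[i][0] * cofactor
        return total

    assert det_cofactor([[1, 2], [3, 4]]) == -2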

We can check that this definition of the determinant gives the same results as the first by noting that each definition gives the same value for det I, namely 1; also, that under each definition, the determinant of a matrix changes in the same way when elementary row-operations are applied.

Theory

Suppose A is the n×n matrix

A = (a1 a2 ... an).

  1. We know that interchanging two columns ai and aj changes the sign of the determinant of A.
  2. Suppose we replace the column ar with a vector x of variables. (So, x is (x1 x2 ... xn)^T.) Then the determinant of this new matrix is a linear polynomial in the variables xi. We can write this determinant as a function, say L(x). It is a linear function; that is,

    L(ax + by) = a L(x) + b L(y).

A way to summarize these two properties is to say that the determinant of a matrix is an alternating, multilinear function of its columns. (Don't worry about the terminology; it is not in the book.) In fact the determinant can be defined by these properties, together with the property that the determinant of the identity-matrix is 1.

A consequence

Suppose A, B and C are square matrices that are identical, except that their jth columns are the vectors a, b and a + b respectively. Then

det A + det B = det C .

Likewise for rows instead of columns.
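
A small numeric check of this consequence (my own example), using the familiar 2×2 formula ad − bc and letting column j be the first column:

    def det2(m):
        # Determinant of a 2x2 matrix [[a, b], [c, d]]: ad - bc.
        return m[0][0] * m[1][1] - m[0][1] * m[1][0]

    a, b = [2, 5], [7, 1]    # two choices for the first column
    s = [3, 4]               # the shared second column
    A = [[a[0], s[0]], [a[1], s[1]]]
    B = [[b[0], s[0]], [b[1], s[1]]]
    C = [[a[0] + b[0], s[0]], [a[1] + b[1], s[1]]]
    assert det2(A) + det2(B) == det2(C)    # -7 + 25 == 18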

Application to eigenvalues

Let A be an n×n matrix, and suppose there is a nonzero vector v and a scalar λ (the Greek letter lambda) such that

Av = λv .

Then λ is an eigenvalue or characteristic value of A, and v is a corresponding eigenvector or characteristic vector. We shall develop the notions more later; for now we can note that the equation can be rewritten thus:

(A − λI)v = 0,

and also (λI − A)v = 0. Since v is nonzero, the matrix λI − A has a nontrivial null space, so it is not invertible, and hence its determinant is zero. In other words, the eigenvalues λ of A are just the solutions of the equation

det(λI − A) = 0.

The left member of the equation is a polynomial of degree n in the variable λ; the equation is the characteristic equation of A.
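
For n = 2, with A = [[a, b], [c, d]], the characteristic equation is the quadratic λ^2 − (a + d)λ + (ad − bc) = 0, which the quadratic formula solves. A minimal Python sketch (assuming real eigenvalues; complex ones would need cmath.sqrt):

    import math

    def eigenvalues_2x2(a, b, c, d):
        # Solve det(λI - A) = λ^2 - (a + d)λ + (ad - bc) = 0.
        trace, determinant = a + d, a * d - b * c
        disc = trace * trace - 4 * determinant    # discriminant
        if disc < 0:
            raise ValueError("complex eigenvalues; use cmath.sqrt")
        root = math.sqrt(disc)
        return (trace + root) / 2, (trace - root) / 2

    # Example: [[2, 1], [1, 2]] has eigenvalues 3 and 1.
    assert eigenvalues_2x2(2, 1, 1, 2) == (3.0, 1.0)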


Last modified: Monday, 20 March 2017, 13:38:21 EET