Linear systems

The Linear In Linear Algebra

Ordinary algebra deals with polynomials, such as ax2 + bx + c or x3 + 27y3 or x3 - 2x2y + 4xy2 - 8y3. These are constructed out of variables such as x and y by three operations:

  1. multiplication of variables, to form x2 or xy2 for example;
  2. scaling, that is, multiplication by constants, whether literal (as in ax2) or numeral (as in 27y2);
  3. addition.

A linear polynomial is one formed by the last two operations only; in other words, it is a linear combination of variables. [A technical point: this definition of linear polynomials excludes polynomials such as ax + b, which not all writers would do.] Linear polynomials are the province of linear algebra. So, linear algebra is simpler than algebra in general! Yet it is complicated by the allowance of any number of variables, and by consideration of systems of any number of equations.

The general linear polynomial in n variables is

a1x1 + a2x2 + ... + anxn.

We shall understand the constants ai to be real numbers, for now; later, they will sometimes by complex numbers. We shall generally call the constants scalars.

Linear Systems

One can see the goal of linear algebra as the understanding of linear systems. The general linear system of m equations in n unknowns is

a11x1 + a12x2 + ... + a1nxn = b1
a21x1 + a22x2 + ... + a2nxn = b2
am1x1 + am2x2 + ... + amnxn = bm

The terms bi are the constant terms of the corresponding equations. A solution to the system is an n-tuple (x1, x2, ..., xn) that satisfies each equation. (Note what is, strictly speaking, an ambiguity: the symbols xj are variables in the system, but also constants making up a solution to the system.) The names of the variables xj are not important; their order is what is important. The information of the system is contained in its augmented matrix

a11 a12 ... a1n b1
a21 a22 ... a2n b2
am1 am2 ... amn bm

[You should see brackets the the left and right; but some browsers will just show stacks of accented letters.] This matrix is formed by adding the final column of constants bi to the coefficient matrix

a11 a12 ... a1n
a21 a22 ... a2n
am1 am2 ... amn

Having m rows and n columns, this is an m×n matrix. We may denote it by (aij); here, aij is understood to be the entry in row i and column j of the matrix (counting from the top and the left, respectively), and i and j are understood to range from 1 to m and from 1 to n respectively. (If there is uncertainty about which letter represents the row-number, and which the column-number, we can denote the matrix by (aij)ij.)

We can solve a linear system by manipulating the equations in certain ways; the corresponding manipulations of the augmented matrix are the elementary row-operations:

  1. multiplication of (the entries in) a row by a nonzero scalar;
  2. interchange of two rows;
  3. addition of a (scalar) multiple of one row to another.

By applying these operations in the technique called Gaussian elimination (or reduction), we arrive at a matrix in row-echelon form. To characterize this form, let us call the first non-zero entry of a row the pivot of the row. (The book does not use this term. If every entry of a row is zero, then the row has no pivot.) A matrix is in row-echelon form if it meets the following conditions:

  1. If a row besides the first has a pivot, then this pivot is to the right of the pivot of the preceding row. (In particular, the preceding row has a pivot; so the rows with no pivots are at the bottom of the matrix.)
  2. Every pivot is 1. (The book then calls it a leading 1.)

With an augmented matrix so reduced, we can read off the solution(s) to the corresponding linear system. In particular:

Once you have an augmented matrix in row-echelon form, you have two possible routes towards an explicit solution to the corresponding system: back-substitution and Gauss-Jordan elimination. It doesn't matter which one you use. Gauss-Jordan elimination yields a matrix in reduced row-echelon form, where every pivot is the unique non-zero entry in its column.

For every linear system, there is a corresponding homogeneous system, which results from replacing each constant term with zero. A homogeneous system is automatically consistent, and zero (that is, (0,...,0)) is a solution. If (r1, r2, ..., rn) is a solution to a linear system, and (x1, x2, ..., xn) is a solution to the corresponding homogeneous system, then

(x1 + r1, x2 + r2, ..., xn + rn)

is also a solution of the original system. (More on this later.)

Next section

CSS Validator HTML 4.01 Validator
Intended to be viewable with any browser