Linear systems

The Linear In Linear Algebra

Ordinary algebra deals with polynomials, such as ax² + bx + c or x³ + 27y³ or x³ - 2x²y + 4xy² - 8y³. These are constructed out of variables such as x and y by three operations:

multiplication of variables, to form x² or xy² for example;
scaling, that is, multiplication by constants, whether literal (as in ax²) or numeral (as in 27y²);
addition.

A linear polynomial is one formed by the last two operations only; in other words, it is a linear combination of variables. [A technical point: this definition of linear polynomials excludes polynomials such as ax + b, which not all writers would do.] Linear polynomials are the province of linear algebra. So, linear algebra is simpler than algebra in general! Yet it is complicated by the allowance of any number of variables, and by consideration of systems of any number of equations.

The general linear polynomial in n variables is

a₁x₁ + a₂x₂ + ... + a_nx_n.

We shall understand the constants a_i to be real numbers, for now; later, they will sometimes by complex numbers. We shall generally call the constants scalars.

Linear Systems

One can see the goal of linear algebra as the understanding of linear systems. The general linear system of m equations in n unknowns is

a₁₁x₁ + a₁₂x₂ + ... + a_1nx_n = b₁ a₂₁x₁ + a₂₂x₂ + ... + a_2nx_n = b₂ ................................ a_m1x₁ + a_m2x₂ + ... + a_mnx_n = b_m

The terms b_i are the constant terms of the corresponding equations. A solution to the system is an n-tuple (x₁, x₂, ..., x_n) that satisfies each equation. (Note what is, strictly speaking, an ambiguity: the symbols x_j are variables in the system, but also constants making up a solution to the system.) The names of the variables x_j are not important; their order is what is important. The information of the system is contained in its augmented matrix

é ê ê ë a₁₁ a₁₂ ... a_1n b₁ a₂₁ a₂₂ ... a_2n b₂ .................. a_m1 a_m2 ... a_mn b_m ù ú ú û

[You should see brackets the the left and right; but some browsers will just show stacks of accented letters.] This matrix is formed by adding the final column of constants b_i to the coefficient matrix

é ê ê ë a₁₁ a₁₂ ... a_1n a₂₁ a₂₂ ... a_2n ............... a_m1 a_m2 ... a_mn ù ú ú û

Having m rows and n columns, this is an m×n matrix. We may denote it by (a_ij); here, a_ij is understood to be the entry in row i and column j of the matrix (counting from the top and the left, respectively), and i and j are understood to range from 1 to m and from 1 to n respectively. (If there is uncertainty about which letter represents the row-number, and which the column-number, we can denote the matrix by (a_ij)_ij.)

We can solve a linear system by manipulating the equations in certain ways; the corresponding manipulations of the augmented matrix are the elementary row-operations:

multiplication of (the entries in) a row by a nonzero scalar;
interchange of two rows;
addition of a (scalar) multiple of one row to another.

By applying these operations in the technique called Gaussian elimination (or reduction), we arrive at a matrix in row-echelon form. To characterize this form, let us call the first non-zero entry of a row the pivot of the row. (The book does not use this term. If every entry of a row is zero, then the row has no pivot.) A matrix is in row-echelon form if it meets the following conditions:

If a row besides the first has a pivot, then this pivot is to the right of the pivot of the preceding row. (In particular, the preceding row has a pivot; so the rows with no pivots are at the bottom of the matrix.)
Every pivot is 1. (The book then calls it a leading 1.)

With an augmented matrix so reduced, we can read off the solution(s) to the corresponding linear system. In particular:

If the final column has a pivot, then the system is inconsistent (has no solution).
Otherwise, the system is consistent (has a solution); moreover:
- If each column except the last has a pivot, then the system has a unique solution.
- If column j has no pivot, and j < n + 1, then x_j is a free variable. The variables that are not free variables are leading variables; their values are determined by the values of the free variables. In particular, when there are free variables, then the system has infinitely many solutions.

Once you have an augmented matrix in row-echelon form, you have two possible routes towards an explicit solution to the corresponding system: back-substitution and Gauss-Jordan elimination. It doesn't matter which one you use. Gauss-Jordan elimination yields a matrix in reduced row-echelon form, where every pivot is the unique non-zero entry in its column.

For every linear system, there is a corresponding homogeneous system, which results from replacing each constant term with zero. A homogeneous system is automatically consistent, and zero (that is, (0,...,0)) is a solution. If (r₁, r₂, ..., r_n) is a solution to a linear system, and (x₁, x₂, ..., x_n) is a solution to the corresponding homogeneous system, then

(x₁ + r₁, x₂ + r₂, ..., x_n + r_n)

is also a solution of the original system. (More on this later.)

Next section