1.5 Systems of Equations
How to solve systems of simultaneous linear equations using Gaussian elimination and LU decomposition.
In this section, we’ll discuss how to solve systems of linear equations: perhaps 2, 3, 100, or
even 1000 simultaneous equations, and correspondingly many unknowns.
Simple circuit problems can have tens to hundreds of equations and unknowns. It is not
uncommon for complex circuits to be in the thousands of equations or beyond.
We’re not going to discuss nonlinear equations because their complexity is hard to reason
about. For example, consider the equation:
sin(x) = 0
This nonlinear equation has just 1 equation and 1 unknown, but has infinitely many solutions: x = nπ for all integers n. Additionally, as discussed in Linear & Nonlinear, we can find a solution for a nonlinear system by making a series of locally linear approximations using the Newton-Raphson Method.
A Linear Equation
To start, we need to define what we mean by a linear equation. A linear equation is one that can be written as:

$$r_1 x_1 + r_2 x_2 + \cdots + r_n x_n = c$$

where each coefficient $r_i$ is a constant, a number which may be zero, in which case we don't usually write the $x_i$ term at all, and $c$ is a constant as well.
For example, these two equations are each linear:

y − 3x = 0
10k = 30

They are unrelated because they don't fit into the same space of variables. They do not form a system.
An equation may need a bit of manipulation before it matches this standard form. For example:

y = 5(x − 2)
y = 5x − 10
y − 5x = −10
If an equation can’t be manipulated to fit this form, it is nonlinear. (It may be possible to
construct a linear approximation, however: see the Algebraic Approximations and Linear &
Nonlinear sections.)
For example, x² = 1 has a polynomial term of order 2.
These sorts of equations will not be addressed here, but are still solvable with multiple
numerical iterations using the same techniques shown here as a foundation.
A System of Equations
A system of equations simply means that we have multiple equations, all of which must be
satisfied at the same time, and multiple unknowns, which are shared between the
equations.
An equation with unknowns is a search problem: we are searching for the value of the
unknowns that will make the equation be true. The equation is true when the left side equals
the right side.
When we jump to having multiple equations and multiple unknowns, we have to think about
not just whether our one equation is a true statement, but whether all of the equations in
our system are true at the same time and for the same values of the unknowns.
Consider this system of 2 equations and 2 unknowns:

x + 2y = 4
x − y = 1
Let’s search for a solution by trying different values for the unknowns. We have to choose
specific values for all of the unknowns in order to evaluate the left-hand sides of each
equation, so we are searching over all possible values of all unknowns.
If we choose x = 0, y = 0, this will make both equations false. Therefore, x = 0, y = 0 is not a solution to this system.

If we choose x = 0, y = 2, this will make the first equation true. But it will make the second equation false. All of the equations must be true for the solution to be valid. Therefore, x = 0, y = 2 is not a solution to this system.

If we choose x = 2, y = 1, this will make both equations true. Therefore, x = 2, y = 1 is a solution to this system.
In fact, it is a unique solution point: it is the only solution to this system. Why? Because if, for example, we were to increase x by a little bit, then one equation would require y to increase a little bit to remain valid, and the other equation would require it to decrease. The variable y can't both increase and decrease at the same time.
A Geometric Interpretation
If we only have two unknowns, it’s easy to map these to a two-dimensional x-y plane.
Continuing with the same two-equation example above, we could convert both equations to
slope-intercept form:
y = −(1/2)x + 2
y = x − 1
And now we can use the CircuitLab software to plot these two lines:
[Interactive schematic: System of Two Linear Equations, circuitlab.com/c63c25cy62k6t]

Exercise: Click the circuit, click "Simulate," then "Run DC Sweep." You'll see the two lines plotted, with an intersection at x = 2, y = 1. This shows how we can quickly use the DC Sweep mode of a circuit simulator as a simple but flexible and powerful graphing tool.
There are only three possible outcomes for any system of equations. Either:

1. No solution exists.
2. One unique solution exists.
3. Infinitely many solutions exist.
Using our two-equation, two-unknown example from earlier, we can consider these three
cases. (If you haven’t plotted it, do so now!)
If our two equations produce two perfectly parallel lines (but not the same line), there is no
intersection. This means that there is no solution. No point in the x-y plane lies on both
lines, so no x-y pair satisfies both equations at once.
If our two equations produce the exact same line overlapping with itself, then all points are
intersecting. There are infinitely many solutions. All points on that line are possible solutions
of the system of equations.
If the lines are not parallel, then they are guaranteed to have a single intersection point.
This point is our one unique solution to the system. This is the normal case, and usually the
one we’re interested in, but it’s important to see what can go wrong and what’s necessary for
our solution to exist at all.
You can quickly modify the plotted equations above to illustrate the two other cases for
yourself. For example, see this one for two equations with no solution:
Exercise: Click to plot the no-solutions case.
In three dimensions, the situation is similar. With three unknowns, each linear equation specifies a plane in 3D space, and two non-parallel planes intersect in a line, not a point. That's why we can't just have an intersection between two planes (two equations) to uniquely solve a 3D linear system problem.

If we have a third equation (i.e. a third plane), we can take the intersection of that plane with the line that resulted from the intersection of the first two planes. In 3D, the intersection of a plane and a line can either be: empty (if the line is parallel to the plane), a single point (the normal case), or the entire line (if the line lies within the plane).
When we successfully solve a system of 3 equations and 3 unknowns, we’re in the case where
the first two equations intersected to produce a line, and that line then intersected with the
third equation’s plane to produce a single solution point.
Every equation in our system is like a hyperplane of dimension N−1, where N is the number of unknowns in the system. Why N−1? Because, in general, if you specify N−1 of the unknowns, then there's only 1 specific numerical value the remaining unknown can take for that equation to be satisfied:

$$x_N = \frac{c - r_1 x_1 - r_2 x_2 - \cdots - r_{N-1} x_{N-1}}{r_N}$$

(assuming the coefficient $r_N$ is nonzero).
While most people do not have good intuition for geometry above 3D (or good drawing
skills above 2D!), the geometric interpretation allows us to quickly see the possible cases of
solving any linear system.
Consider the smallest possible system, with 1 equation and 1 unknown:

2x = 10

This has exactly one solution, x = 5. But suppose instead the equation were:

0x = 10

There is no value of x for which this equation is true. Therefore, this system (of 1 equation and 1 unknown) has no solution.

The third possibility:

0x = 0

In this case, x can take on any value, whether x = 5 or x = −π or x = 6 × 10²³, and still the equation holds: there are infinitely many solutions.

It may appear that we've oversimplified, but in fact, there's nothing special about this example. These three outcomes are the only three possibilities with systems of any size, but they are illustrated here with the system of minimum size, N = 1 equations and unknowns.
−x + y = 3

This is the equation of a line, with 1 equation and 2 unknowns. It has infinitely many solution points: all points on that line are solutions to the equation. That is the definition of a line. With 2 unknowns and 1 equation constraining them, this infinite solution space happens to have 2 − 1 = 1 degrees of freedom. For example, we could write the line as a vector equation:

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} m + \begin{bmatrix} 0 \\ 3 \end{bmatrix}$$

where the single parameter m captures the one degree of freedom.
How can a 1 equation, 2 unknown system have zero solutions? Well, a simple way is:
0x + 0y = 3
We don't normally think about it, but when we evaluate a line such as −x + y = 3 at a single point, such as x = 10, we're really adding a second equation to the system:

−x + y = 3
x = 10

To evaluate at x = 10, we are actually first adding that second equation to our system, and then solving a system with 2 equations and 2 unknowns. This happens so fast that we don't consider it consciously or write it down, but that simple second equation is vital to producing a single solution point. Geometrically, we are intersecting the original line with a new vertical line defined as x = 10, and then finding the unique intersection point.
Can a system have more equations than unknowns? Consider this system of 2 equations and 1 unknown:

2x = 4
4x = 6

There is no value of x that satisfies both of these equations, so no solution exists. But now consider:

2x = 4
4x = 8

Both equations are satisfied by x = 2, so this system has a unique solution, even though the second equation tells us nothing the first didn't already. This observation leads to the idea of linear independence.
Linear Independence
We're not just interested in having N equations for N unknowns. We need all of those N equations to provide new information – new constraints – or else we would have the same solution space with just N-1 equations. In the example above:

2x = 4
4x = 8

the second equation is just twice the first, so it adds no new constraint: these two equations are not linearly independent.
But linear independence is not a pairwise property. That means we can’t figure it out by
only looking at one pair of equations at a time. We have to look at the whole collection of
equations.
For example, consider these three equations:

x + 2y + 3z = 1
x = −5
2y + 3z = 6
Any two of these equations taken alone are linearly independent, but all three together are
not. We can add the second and third equations together and get the first equation.
If a set of equations are not linearly independent, we can remove one (or possibly more) of
them and still have the same space of solutions.
There is no simple rule for inspecting a set of equations and quickly determining whether they are linearly independent. If you quickly see an obvious issue, such as one equation being just a nonzero multiple of another, or one equation being a linear combination of two others, you can immediately determine that the set is not linearly independent. However, in general, you may have to combine all previous N-1 equations to determine whether the Nth equation is linearly independent. In practice, determining linear independence is done by solving the system using Gaussian elimination or LU decomposition, as will be described below.
In the case where the system is not linearly independent, the right-hand-side constants determine whether there are zero solutions (if the duplicated equations are inconsistent with each other) or infinitely many solutions (if the duplicated equations are consistent with each other).
For example, we saw above that this system is not linearly independent:

x + 2y + 3z = 1
x = −5
2y + 3z = 6

Here the duplicated information is consistent: the second and third equations sum to the first on both sides, so the system has infinitely many solutions. Compare it to:

x + 2y + 3z = 1
x = −5
2y + 3z = 0

Now, while the left-hand sides of the second and third equations sum to match the first, the right-hand-side constants do not (−5 + 0 ≠ 1). This guarantees that this system has no solutions.
While the outcome of no solutions versus infinitely many solutions may seem quite different,
this is analogous to our 0x = 10
versus 0x = 0
in the 1 equation and 1 unknown base case
described earlier. In either case, it is the structure of the left-hand-side alone that guarantees
that we’ve lost our ability to produce a single solution point. That’s why, at least from the
perspective of solving linear systems of equations, linear independence is a property of
the left-hand-side coefficients only.
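In practice we can hand this test to the computer. Here is a minimal NumPy sketch (an illustration using the matrix rank, not a method named in the text) applied to the three-equation example above; the names and the rank-comparison approach are my own:

```python
import numpy as np

# Left-hand-side coefficients of the example system.
# Linear independence depends only on these.
A = np.array([[1.0, 2.0, 3.0],   # x + 2y + 3z
              [1.0, 0.0, 0.0],   # x
              [0.0, 2.0, 3.0]])  # 2y + 3z

# rank < number of equations means the rows are not linearly independent
print(np.linalg.matrix_rank(A))  # 2: only 2 of the 3 rows are independent

# Augmenting with the right-hand side distinguishes the two cases:
# same rank -> consistent (infinitely many solutions here),
# larger rank -> inconsistent (no solutions).
for B in (np.array([1.0, -5.0, 6.0]), np.array([1.0, -5.0, 0.0])):
    aug_rank = np.linalg.matrix_rank(np.column_stack([A, B]))
    print("consistent" if aug_rank == np.linalg.matrix_rank(A) else "no solutions")
```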
If there are more unknowns than linearly independent equations, our system is
underdetermined: we have unconstrained degrees of freedom, and will have infinitely
many solutions rather than a single point.
If there are more linearly independent equations than unknowns, we have a problem: it’s
actually impossible for this case to happen, because those equations won’t actually be
linearly independent.
Only N linearly independent equations in N unknowns can produce a single solution point,
which is what we’re usually looking for.
Matrix Representation
Let’s consider a system with 3 equations and 3 unknowns:
f + 2g + 3h = 1
f = −5
g + h = 1
This system has 3 linearly independent equations, and has a single valid solution: f = −5, g = −3, h = 4. You should quickly substitute these values into each equation and confirm that the solution is valid.
We can rewrite this system as a single matrix equation Ax = B:

$$\begin{bmatrix} 1 & 2 & 3 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} f \\ g \\ h \end{bmatrix} = \begin{bmatrix} 1 \\ -5 \\ 1 \end{bmatrix}$$

By performing the matrix multiplication on the left, we recover one equation from each row of A.
You should become comfortable going back and forth between the matrix representation
and the equation representation. They are truly identical. However, we often work in the
matrix representation because it’s more compact and it’s also the way we solve these
systems with computational tools. Learning to approach and solve problems both in the
equation representation and the matrix representation is valuable, and we will flip
back and forth in the rest of this book.
This particular problem is a simple matrix equation, and can easily be solved by any matrix
solver. For example, click here to solve this matrix equation on Wolfram Alpha.
Solving a linear system like this is so common that in many mathematical software packages (including MATLAB, Octave, and Sage) one simply uses the backslash ("\") operator to designate solving an equation: x = A\B.
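For instance, here is a minimal NumPy sketch, where `numpy.linalg.solve` plays the role of the backslash operator for the 3x3 system above:

```python
import numpy as np

# The system from above: f + 2g + 3h = 1, f = -5, g + h = 1
A = np.array([[1.0, 2.0, 3.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])
B = np.array([1.0, -5.0, 1.0])

x = np.linalg.solve(A, B)   # NumPy's equivalent of MATLAB's A\B
print(x)                    # [-5. -3.  4.]  ->  f = -5, g = -3, h = 4
```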
Augmented Matrices
To make it even more compact to represent systems of equations, it's common to omit the vector of unknowns and simply join the B vector onto the right-hand side of A, with a vertical bar separating them:

$$\left[\begin{array}{ccc|c} 1 & 2 & 3 & 1 \\ 1 & 0 & 0 & -5 \\ 0 & 1 & 1 & 1 \end{array}\right] \quad \text{for } (f, g, h)$$
This is called an augmented matrix, and it’s just another identical way of writing the system
of equations from above. There’s nothing special about it except as a convenient way of
writing these equations, and it happens to keep the left-hand-side coefficients and right-
hand-side constants of each row together nicely.
The ordering of the columns maps directly to the ordering of the unknowns (f, g, h) in the example above. However, we could have chosen a different column ordering, so long as we're careful to keep track as we work.
The ordering of the rows maps directly to the ordering of the equations. But a system of equations has the same meaning regardless of what order we write the equations. If we reorder the rows of A, we must also reorder the corresponding rows of B. However, the augmented matrix form [A|B] does this for us "automatically" by keeping the entire row together. We can write this augmented matrix in any row order and the system will have the same solution.
Are there any especially convenient orderings that make a system of equations easier to
solve? Yes. Let’s look at a few.
Diagonal Matrices
Here’s a system of 5 equations in 5 unknowns which takes practically zero effort to solve:
−v = 3
9w = 6
5x = 10
8y = 24
z = 99
It’s obvious that the equations don’t interact with each other at all, and we can solve each by
simply dividing the right-hand-side constant by the left-hand-side coefficient:
v = 3/(−1) = −3
w = 6/9 = 2/3
x = 10/5 = 2
y = 24/8 = 3
z = 99/1 = 99
In augmented matrix form, this system is:

$$\left[\begin{array}{ccccc|c} -1 & 0 & 0 & 0 & 0 & 3 \\ 0 & 9 & 0 & 0 & 0 & 6 \\ 0 & 0 & 5 & 0 & 0 & 10 \\ 0 & 0 & 0 & 8 & 0 & 24 \\ 0 & 0 & 0 & 0 & 1 & 99 \end{array}\right] \quad \text{for } (v, w, x, y, z)$$
The 5x5 coefficient matrix has a special shape: it’s a diagonal matrix, meaning it only has
nonzero entries on the main diagonal, and is strictly zero everywhere else.
Remember that for any system of equations we are free to rearrange the ordering of the rows, and additionally, we are free to rearrange the ordering of the columns so long as we keep track of the corresponding ordering of the unknown variables. That's why the easy-to-solve diagonal structure has more to do with the structure of the equations themselves than with the order in which we may first see them written.
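Because the equations don't interact, code for solving a diagonal system is essentially one line. A minimal NumPy sketch for the system above (the variable names are my own):

```python
import numpy as np

# The 5x5 diagonal system from above, in matrix form.
A = np.diag([-1.0, 9.0, 5.0, 8.0, 1.0])
B = np.array([3.0, 6.0, 10.0, 24.0, 99.0])

# For a diagonal matrix, "solving" is just element-wise division of each
# right-hand-side constant by its diagonal coefficient.
x = B / np.diag(A)
print(x)  # [-3.  0.6667  2.  3.  99.]  ->  (v, w, x, y, z)
```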
Triangular Matrices
Here’s a system of 4 equations in 4 unknowns which was selected to be easily solvable by
hand:
2w = 16
w − 3x = 14
−w + x + 10y = −10
w + x + 6y + 2z = 0
Make sure you can solve this system quickly by hand before you continue. You should find w = 8, x = −2, y = 0, z = −3.
This system is easy to solve because we can work one row at a time. Looking only at the first equation, we have one equation and one unknown, so we'll immediately find a value for w:

w = 16/2 = 8
Next, looking at the second row, we can substitute in the first value we found, w = 8. The second row then becomes 8 − 3x = 14, or −3x = 6, so x = −2.
This process continues on and on for each subsequent equation because each new
equation contains only variables we’ve already solved, plus one new unknown which is
now easily solved by doing a bit of subtraction and division.
$$\left[\begin{array}{cccc|c} 2 & 0 & 0 & 0 & 16 \\ 1 & -3 & 0 & 0 & 14 \\ -1 & 1 & 10 & 0 & -10 \\ 1 & 1 & 6 & 2 & 0 \end{array}\right] \quad \text{for } (w, x, y, z)$$
This shape of coefficient matrix is called a lower triangular matrix, meaning that nonzero
values are present only on the main diagonal and any cells below the main diagonal. The
area above the main diagonal is filled with strictly zero values.
The general procedure for solving a lower triangular system is:

1. Start at the first row, which contains a single nonzero variable term, and solve that equation directly with division.
2. For each subsequent row, substitute in the already-known values of the variables solved in the previous rows, again leaving a simple algebra problem with one new unknown, solved with a bit of subtraction and division.
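This row-at-a-time procedure is called forward substitution. Here is a minimal NumPy sketch of it (the function name is my own), applied to the lower triangular system above:

```python
import numpy as np

def forward_substitution(L, b):
    """Solve Lx = b one row at a time, exactly as described above:
    each row brings in one new unknown once the earlier ones are known."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        # subtract the already-known terms, then divide by the diagonal
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

# The lower triangular system from above
L = np.array([[ 2.0,  0.0,  0.0, 0.0],
              [ 1.0, -3.0,  0.0, 0.0],
              [-1.0,  1.0, 10.0, 0.0],
              [ 1.0,  1.0,  6.0, 2.0]])
b = np.array([16.0, 14.0, -10.0, 0.0])
print(forward_substitution(L, b))  # [ 8. -2.  0. -3.]  ->  (w, x, y, z)
```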
Similarly, we can mirror the situation. For example, consider this 3 equation and 3 unknown system:

x − y + z = 2
3y − z = 3
2z = 6

This time we start at the bottom: the last equation immediately gives z = 6/2 = 3. We then substitute that into the equation above it to find y, and so on.
This system written as an augmented matrix:

$$\left[\begin{array}{ccc|c} 1 & -1 & 1 & 2 \\ 0 & 3 & -1 & 3 \\ 0 & 0 & 2 & 6 \end{array}\right] \quad \text{for } (x, y, z)$$
The shape of this coefficient matrix is upper triangular, meaning that there are nonzero
values present only on the main diagonal and any cells above the main diagonal. All the cells
below the main diagonal are zero.
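The bottom-up mirror of forward substitution is called back-substitution. A minimal NumPy sketch (again, the function name is my own), applied to the upper triangular system above:

```python
import numpy as np

def back_substitution(U, b):
    """Solve Ux = b starting from the bottom row and working up."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # subtract the terms involving already-solved unknowns
        x[i] = (b[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

# The upper triangular system from above
U = np.array([[1.0, -1.0,  1.0],
              [0.0,  3.0, -1.0],
              [0.0,  0.0,  2.0]])
b = np.array([2.0, 3.0, 6.0])
print(back_substitution(U, b))  # [1. 2. 3.]  ->  (x, y, z)
```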
In general, most systems of equations we’re interested in solving will not happen to be one
of the three easy to solve cases (diagonal, lower triangular, upper triangular). However, we’ll
now show that it’s possible to solve any matrix problem by doing a series of simple
operations that convert it into a triangular matrix problem without changing the solution of
the system of equations.
Row Operations
Equations have some nice properties that we take for granted. If a = b, then:

a + m = b + m
ka = kb

These basic rules say that if we have an equation, we can always add a constant to both sides and it's still true, or we can multiply both sides by a constant and it's still true. Additionally, if a = b and c = d, we can add the two equations together:

a + c = b + d
Now consider a general system of two linear equations:

$$a_{11} x + a_{12} y = b_1$$
$$a_{21} x + a_{22} y = b_2$$

We can, for example, add the two equations together to create a new equation:

$$(a_{11} + a_{21}) x + (a_{12} + a_{22}) y = b_1 + b_2$$

This last line is a new equation that is a linear combination of the original two equations. It is not linearly independent of the original two. However, importantly, if we pick either one (and only one) of the original equations, alongside our new equation, those two form a new system of two equations that is linearly independent (assuming the original equations were linearly independent).
That is to say that we can modify our original system of equations in this way and get a new system of equations which is equivalent to (i.e. has the same solution as) the original system. Because this is usually done with the matrix representation as discussed above, and each equation is represented by a row in our matrix, these modifications are called "row operations." For example, to illustrate the same addition and overwrite to form a new system:
$$\left[\begin{array}{cc|c} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \end{array}\right] \longrightarrow \left[\begin{array}{cc|c} a_{11} & a_{12} & b_1 \\ (a_{11} + a_{21}) & (a_{12} + a_{22}) & (b_1 + b_2) \end{array}\right] \quad \text{for } (x, y)$$
Both the original & modified augmented matrices have the same solution. We can apply
further row operations to continue to manipulate the system. The goal of these row
operations will be to turn our original system into a triangular system which we know is easy
to solve.
An equation can always be scaled by multiplying both sides by a constant factor (as long as
that factor is not zero). This does not change the solution space of that equation.
We have to store the new, scaled equation in place of the original. We might write:
2R3 → R3
to indicate that we’re going to double row #3 and then store that new equation back in the
same row.
We can also reorder the equations. We might write:

R2 ↔ R3

to indicate that we're going to swap row #2 with row #3. (All reorderings of rows are possible through a series of pairwise row swaps.)
We can always add any nonzero multiple of one row to another row. This produces a linear
combination of the two parent rows.
In order to maintain all the information that is previously present in the original system, we
have to replace one of the two parent rows with our new linear combination.
We might write:

R4 − 3R1 → R4
to indicate that we’re going to take the fourth row, subtract three times the first row, and
then store the result again in the fourth row.
All three of these row operations produce a new system of equations which has the same
solution as the original system. Now, we’ll see how to use these operations to solve a system
of equations.
Gaussian Elimination
Solving a linear system using Gaussian elimination is a two-step process:
1. First, we’ll use row operations to change a matrix into an upper-triangular one.
2. Second, we’ll solve the upper-triangular system (which is easy to do, as we’ve shown
earlier).
Let's use this process to solve a system of 4 equations and 4 unknowns:

w + x + y = 11
2w − y = 0
8x + y + z = 12
w + y = 9

In augmented matrix form:

$$\left[\begin{array}{cccc|c} 1 & 1 & 1 & 0 & 11 \\ 2 & 0 & -1 & 0 & 0 \\ 0 & 8 & 1 & 1 & 12 \\ 1 & 0 & 1 & 0 & 9 \end{array}\right] \quad \text{for } (w, x, y, z)$$
For each row of the matrix, we'll look at a pivot value – the value that happens to be on the main diagonal – and use it to eliminate any nonzero matrix coefficients in the equations below it. For the first row, the pivot is the 1 in the upper-left corner, and the two nonzero values to be eliminated below it are the 2 (row 2) and the 1 (row 4) in the first column.
To remove those values we'll subtract the correct multiple (the multiple which cancels out the corresponding term) of the pivot row from each of the rows below it:

R2 − (2/1)R1 → R2
R4 − (1/1)R1 → R4

$$\left[\begin{array}{cccc|c} 1 & 1 & 1 & 0 & 11 \\ 0 & -2 & -3 & 0 & -22 \\ 0 & 8 & 1 & 1 & 12 \\ 0 & -1 & 0 & 0 & -2 \end{array}\right] \quad \text{for } (w, x, y, z)$$
(No operation was necessary on the 3rd row, because its corresponding cell in the pivot
column was already 0.)
Note that after these two row operations, we’ve made it so that the entire column below the
pivot cell is all zeros.
Now, we move to the 2nd row. The pivot is the −2 on the main diagonal, and the cells below it (the 8 and the −1 in the second column) are the ones we'll work on eliminating.
Again, we need two row operations to eliminate the 8 and the -1 values:
8
8
R − R →R
3 2 3
−2
−1
R − R →R
4 2 4
−2
−−−−−−−−−−→
⎡ 1 1 1 0 11 ⎤
⎢
0 −2 −3 0 −22 ⎥
⎢
⎥
⎢
⎥
for (w, x, y, z)
⎢
0 0 −11 1 −76 ⎥
⎢ ⎥
3
⎣ 0 0 0 9 ⎦
2
Now we move to the 3rd row. The pivot is the −11 on the main diagonal, and the 3/2 below it is the only cell left to eliminate. With just one remaining row, only one row operation is necessary to clear that cell.
R4 − ((3/2)/(−11))R3 → R4

$$\left[\begin{array}{cccc|c} 1 & 1 & 1 & 0 & 11 \\ 0 & -2 & -3 & 0 & -22 \\ 0 & 0 & -11 & 1 & -76 \\ 0 & 0 & 0 & 3/22 & -30/22 \end{array}\right] \quad \text{for } (w, x, y, z)$$

There are some unpleasant fractions flying around, but the result of this operation is that the coefficient matrix is now upper triangular!
Now that it's upper triangular, we just solve with back-substitution: start from the last row, and solve to find z:

(3/22)z = −30/22
z = −10
Next, we move up a row and look at the 3rd row, while substituting in the value of the one unknown z we've already solved:
−11y + z = −76
−11y − 10 = −76
−11y = −66
y = 6
Then the 2nd row, substituting in y = 6:

−2x − 3(6) = −22
−2x − 18 = −22
−2x = −4
x = 2
And finally the 1st row, substituting in all the other terms:
w + x + y = 11
w + 2 + 6 = 11
w = 3
And that’s it! We’ve solved for all our unknowns. Most of the work was involved in using
each pivot to eliminate all the nonzero values in the column below it. The resulting
upper triangular matrix was easy to solve bottom-up.
Gaussian elimination, as shown here, is a very mechanical way to use row operations to
transform an arbitrary matrix problem into a triangular one by eliminating nonzero terms
below the main diagonal, one at a time.
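As a concrete illustration, here is a minimal NumPy sketch of the whole procedure (the function name is my own): elimination to upper triangular form followed by back-substitution, applied to the example system above. It assumes nonzero pivots; we'll address that assumption next.

```python
import numpy as np

def gaussian_elimination(A, b):
    """Reduce the system to upper triangular form with row operations,
    then solve by back-substitution. Assumes every pivot is nonzero."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for p in range(n - 1):            # for each pivot row...
        for r in range(p + 1, n):     # ...eliminate the rows below it
            m = A[r, p] / A[p, p]     # multiple that cancels A[r, p]
            A[r, p:] -= m * A[p, p:]  # row operation: Rr - m*Rp -> Rr
            b[r]     -= m * b[p]
    # back-substitution on the resulting upper triangular system
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

# The 4x4 example system solved above
A = np.array([[1, 1, 1, 0], [2, 0, -1, 0], [0, 8, 1, 1], [1, 0, 1, 0]])
b = np.array([11, 0, 12, 9])
print(gaussian_elimination(A, b))  # [  3.   2.   6. -10.]  ->  (w, x, y, z)
```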
At each step, we found the pivot value A_pp on the main diagonal, and then used it to cancel out any nonzero terms A_np for n > p. The cancellation worked because we multiplied row p by the ratio of the terms, A_np / A_pp, so that:

$$A_{np} - \left(\frac{A_{np}}{A_{pp}}\right) A_{pp} = 0$$

in order to zero out that cell. However, this process breaks if we have to divide by zero, i.e. if A_pp = 0.
As an example, let’s consider the same four equations, but let’s swap row 1 and row 3 before
we begin:
8x + y + z = 12
2w − y = 0
w + x + y = 11
w + y = 9
$$\left[\begin{array}{cccc|c} 0 & 8 & 1 & 1 & 12 \\ 2 & 0 & -1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 11 \\ 1 & 0 & 1 & 0 & 9 \end{array}\right] \quad \text{for } (w, x, y, z)$$
There is no way we can use the 0 pivot cell in the first row to eliminate the nonzero values in the column below it. There is no multiple of 0 that will additively cancel out nonzero values.

If you encounter a case where the pivot value is zero, you must swap rows so that the pivot is not zero before you proceed. If this happens at an intermediate step, that's OK: you can pause and swap row p with some later row n > p, and then resume. (Don't swap with an earlier row n < p, because earlier rows have nonzero values to the left of the pivot column!)
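In code, a standard refinement of this fix is to always swap in the row with the largest-magnitude candidate pivot, known as partial pivoting (which also improves numerical accuracy). A hedged sketch amending the earlier example; the function name and error handling are my own:

```python
import numpy as np

def gaussian_elimination_pivoting(A, b):
    """Same elimination as before, but with partial pivoting: before
    using row p, swap in the row at or below p with the
    largest-magnitude entry in column p."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for p in range(n - 1):
        best = p + np.argmax(np.abs(A[p:, p]))   # best available pivot row
        if A[best, p] == 0.0:
            raise ValueError("no nonzero pivot: equations are not independent")
        if best != p:
            A[[p, best]] = A[[best, p]]          # row swap: Rp <-> Rbest
            b[[p, best]] = b[[best, p]]
        for r in range(p + 1, n):
            m = A[r, p] / A[p, p]
            A[r, p:] -= m * A[p, p:]
            b[r]     -= m * b[p]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

# The reordered system with a 0 in the pivot position is now no trouble:
A = np.array([[0, 8, 1, 1], [2, 0, -1, 0], [1, 1, 1, 0], [1, 0, 1, 0]])
b = np.array([12, 0, 11, 9])
print(gaussian_elimination_pivoting(A, b))  # [  3.   2.   6. -10.]
```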
LU Decomposition
Circuit simulators like CircuitLab (and other numerical software) usually approach Gaussian elimination with a small twist. Instead of only working on an NxN original coefficient matrix A to create the NxN upper triangular matrix (which we'll call U for upper), we'll simultaneously create an NxN lower triangular matrix (L for lower), such that A = LU, the matrix product of the lower and upper triangular matrices.
Notice that the Gaussian elimination process to make the matrix upper triangular did not depend on the right-hand-side constant values; while they were modified by the row operations, they did not determine the modifications of the coefficient rows, and were referenced only in the back-substitution step. Even so, if we wanted to use the same left-hand side A but re-solve our equations for a different set of constants B on the right, we would have to redo much of our work.

The LU decomposition avoids this. Once we have A = LU, we can solve Ax = B for any B in two steps:

1. Solve Ly = B to find a vector y, using forward substitution since L is lower triangular.
2. Solve Ux = y to find x, using back substitution since U is upper triangular.

This works because:

Ax = B
LUx = B
L(Ux) = B
Ly = B (where y ≡ Ux)
These two solves are very, very fast because they’re each working on a triangular matrix.
For our example system, the decomposition A = LU is:

$$\begin{bmatrix} 1 & 1 & 1 & 0 \\ 2 & 0 & -1 & 0 \\ 0 & 8 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 0 & -4 & 1 & 0 \\ 1 & \frac{1}{2} & -\frac{3}{22} & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & -2 & -3 & 0 \\ 0 & 0 & -11 & 1 \\ 0 & 0 & 0 & \frac{3}{22} \end{bmatrix}$$
Where do these L and U matrices come from? U is exactly the upper triangular matrix that Gaussian elimination produced above, and L simply records, below its diagonal of 1s, the multipliers we used at each elimination step (2, 1, −4, 1/2, and −3/22 in this example).

Solving Ly = B finds the same transformed right-hand-side constants that Gaussian elimination produces on the augmented matrix. Then, solving Ux = y just solves the same upper triangular system resulting from Gaussian elimination. The first step may look unfamiliar, but it's really just doing the same work that Gaussian elimination would do on the right-hand side.
In practice, many problems (including many circuit problems) involve re-solving the same coefficient matrix A with different constant values B repeatedly, so performing the relatively slow LU decomposition just once and then performing the fast triangular solves as many times as needed can be a tremendous speedup.
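SciPy exposes this factor-once, solve-many pattern directly. A minimal sketch using `scipy.linalg.lu_factor` and `scipy.linalg.lu_solve` (which also handle the permutation bookkeeping discussed next) on the example system:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[1.0, 1.0,  1.0, 0.0],
              [2.0, 0.0, -1.0, 0.0],
              [0.0, 8.0,  1.0, 1.0],
              [1.0, 0.0,  1.0, 0.0]])

# Do the (relatively slow) factorization once...
lu, piv = lu_factor(A)

# ...then re-solve cheaply for as many right-hand sides as needed.
print(lu_solve((lu, piv), np.array([11.0, 0.0, 12.0, 9.0])))  # [3. 2. 6. -10.]
print(lu_solve((lu, piv), np.array([1.0, 0.0, 0.0, 0.0])))    # a different B, same A
```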
What about the row swaps needed when a pivot is zero? These swaps can be recorded in a permutation matrix P, so that the decomposition becomes PA = LU. Then:

Ax = B
PAx = PB
LUx = PB

1. First solve Ly = PB to find y.
2. Then solve Ux = y.
Solving By Hand

For systems of N linear equations with N unknowns, write the equations with all unknown terms on the left-hand side and all constants on the right-hand side, and then follow these three rules:

1. You may work on the rows in whatever order will make the least work.
2. Look for overlap: places where you can cancel terms without adding any new terms.
3. You may work with the equations or with the augmented matrix form; both are equivalent, but one is more verbose and the other might make it too easy to make mistakes when working by hand.
This strategy is essentially Gaussian elimination, but we’ve intermixed the reduction and the
substitution steps to do whichever is more convenient first, because if we’re showing our
work by writing each step on paper it’s often convenient to reduce the dimensionality of our
problem as early as possible.
Here’s an example of how we might solve a system by hand. Let’s start with the same system
we solved using Gaussian elimination earlier:
w + x + y = 11
2w − y = 0
8x + y + z = 12
w + y = 9
$$\left[\begin{array}{cccc|c} 1 & 1 & 1 & 0 & 11 \\ 2 & 0 & -1 & 0 & 0 \\ 0 & 8 & 1 & 1 & 12 \\ 1 & 0 & 1 & 0 & 9 \end{array}\right] \quad \text{for } (w, x, y, z)$$
First, we notice that we can use the 4th row to cancel out the 3rd-column y terms in both the 1st and 2nd equations without introducing any new nonzero terms:

R1 − R4 → R1
R2 + R4 → R2

$$\left[\begin{array}{cccc|c} 0 & 1 & 0 & 0 & 2 \\ 3 & 0 & 0 & 0 & 9 \\ 0 & 8 & 1 & 1 & 12 \\ 1 & 0 & 1 & 0 & 9 \end{array}\right] \quad \text{for } (w, x, y, z)$$
The first and second row now have just one nonzero term each, so we immediately have two
of our results:
x = 2 and w = 3
We can substitute these values into the remaining two equations to produce our remaining system:

$$\left[\begin{array}{cc|c} 1 & 1 & -4 \\ 1 & 0 & 6 \end{array}\right] \quad \text{for } (y, z)$$

The second row gives y = 6 directly, and the first row then gives z = −4 − 6 = −10.
Let’s try another approach. Here’s the original system of equations again:
$$\left[\begin{array}{cccc|c} 1 & 1 & 1 & 0 & 11 \\ 2 & 0 & -1 & 0 & 0 \\ 0 & 8 & 1 & 1 & 12 \\ 1 & 0 & 1 & 0 & 9 \end{array}\right] \quad \text{for } (w, x, y, z)$$
Now we'll notice that the 2nd row has two terms and zero constant value: it says 2w − y = 0, or y = 2w. We're going to choose to keep w and remove y from our set of equations by substituting 2w every time we see a y, and then we can remove the 2nd equation entirely from our system:

(substitute 2w for y, and remove R2)

$$\left[\begin{array}{ccc|c} 3 & 1 & 0 & 11 \\ 2 & 8 & 1 & 12 \\ 3 & 0 & 0 & 9 \end{array}\right] \quad \text{for } (w, x, z)$$
Notice that the right-hand side values are unchanged. While we haven't actually determined the numerical value for y yet, we've now quickly reduced the dimensionality of our problem from 4 to 3. And in doing so, the new third equation is trivially solvable for w = 3.
We substitute w = 3 into the remaining two equations and remove the third equation to find:

$$\left[\begin{array}{cc|c} 1 & 0 & 2 \\ 8 & 1 & 6 \end{array}\right] \quad \text{for } (x, z)$$

The first row gives x = 2 immediately, the second then gives z = 6 − 8(2) = −10, and finally y = 2w = 6.
This is the same example system of equations we solved mechanically in the “Gaussian
Elimination” section above, but by applying these solve-by-hand strategies, we have to do a
lot less work (significantly fewer total operations), avoid ugly fractions, and get the answer
faster!
When solving by hand, we have lots of opportunities to make arithmetic mistakes (for
example, adding rather than subtracting when computing the new right-hand side after
substitution), as well as record-keeping mistakes (for example, not keeping track of which
column maps to which variable as we go). Fortunately, it’s relatively easy to check your
solution after you compute it: just plug it into each of the equations and make sure the left
side equals the right. Get in the habit of quickly checking your work after you solve systems
of equations.
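This check is easy to automate as well. A minimal NumPy sketch that plugs the solution from above back into the matrix form of the system:

```python
import numpy as np

# Check a hand-computed solution by plugging it back in:
# the left-hand side A @ x should reproduce the right-hand side B.
A = np.array([[1, 1, 1, 0], [2, 0, -1, 0], [0, 8, 1, 1], [1, 0, 1, 0]])
B = np.array([11, 0, 12, 9])
x = np.array([3, 2, 6, -10])           # (w, x, y, z) from above

print(A @ x)                           # [11  0 12  9] -- matches B
print(np.allclose(A @ x, B))           # True: the solution checks out
```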
What’s Next
In the next section, Steady State & Transient, we’ll consider how to separate the analysis of a
system’s normal operating conditions from events that may change those conditions
temporarily or permanently.