
ULTIMATE ELECTRONICS: PRACTICAL CIRCUIT DESIGN AND ANALYSIS

1.5 Systems of Equations
How to solve systems of simultaneous linear equations using Gaussian elimination and LU decomposition.

In this section, we’ll discuss how to solve systems of linear equations: perhaps 2, 3, 100, or
even 1000 simultaneous equations, and correspondingly many unknowns.

Simple circuit problems can have tens to hundreds of equations and unknowns. It is not
uncommon for complex circuits to be in the thousands of equations or beyond.

We’re not going to discuss nonlinear equations because their complexity is hard to reason
about. For example, consider the equation:

sin(x) = 0

This nonlinear equation has just 1 equation and 1 unknown, but has infinitely many solutions: x = nπ for all integers n. Additionally, as discussed in Linear & Nonlinear, we can find a solution for a nonlinear system by making a series of locally linear approximations using the Newton-Raphson Method.

A Linear Equation 
To start, we need to define what we mean by a linear equation. A linear equation is one that
can be written as:

r1 x1 + r2 x2 + ⋯ + rn xn = c

This equation has n unknown variables x1, x2, …, xn. Each term has some known constant coefficient (r1 through rn), a number which may be zero, in which case we don't usually write that x term at all. And this equation has a single known constant term c which the equation sums up to, which might be 0, or some other number.

These are a few unrelated linear equations:

5x1 − 3x2 + 6x4 = 10

y − 3x = 0

10k = 30
They are unrelated because they don’t fit into the same space of variables. They do not form
a system.

Sometimes, an equation needs to be algebraically manipulated to fit our desired standard form, but is still a linear equation. For example, from the slope-intercept form of a line:

y = 5(x − 2)

y = 5x − 10

y − 5x = −10

If an equation can’t be manipulated to fit this form, it is nonlinear. (It may be possible to
construct a linear approximation, however: see the Algebraic Approximations and Linear &
Nonlinear sections.)

Here are some examples of nonlinear equations:

y = sin(x) has non-linear function inside

x² = 1 has polynomial term of order 2

x(y + 1) = 3 has product of unknowns

These sorts of equations will not be addressed here, but are still solvable with multiple
 
numerical iterations using the same techniques shown here as a foundation.

A System of Equations 
A system of equations simply means that we have multiple equations, all of which must be
satisfied at the same time, and multiple unknowns, which are shared between the
equations.

An equation with unknowns is a search problem: we are searching for the value of the
unknowns that will make the equation be true. The equation is true when the left side equals
the right side.

For most values of the unknowns, the equation will be false: y + 1 = 3 is a false statement for infinitely many possible choices of y. It is only a true statement at one particular choice. In this case, this equation is a true statement only when y = 2.

When we jump to having multiple equations and multiple unknowns, we have to think about
not just whether our one equation is a true statement, but whether all of the equations in
our system are true at the same time and for the same values of the unknowns.

Consider this system of two linear equations in two unknowns:


x + 2y = 4

x − y = 1

Let’s search for a solution by trying different values for the unknowns. We have to choose
specific values for all of the unknowns in order to evaluate the left-hand sides of each
equation, so we are searching over all possible values of all unknowns.

If we choose x = 0, y = 0, this will make both equations false. Therefore, x = 0, y = 0 is not a solution to this system.

If we choose x = 0, y = 2, this will make the first equation true. But it will make the second equation false. All of the equations must be true for the solution to be valid. Therefore, x = 0, y = 2 is not a solution to this system.

If we choose x = 2, y = 1, this will make both equations true. Therefore, x = 2, y = 1 is a solution to this system.

In fact, it is a unique solution point: it is the only solution to this system. Why? Because if, for example, we were to increase x by a little bit, then one equation would require y to increase a little bit to remain valid, and the other equation would require it to decrease. The variable y can't both increase and decrease at the same time.

A Geometric Interpretation 
If we only have two unknowns, it’s easy to map these to a two-dimensional x-y plane.

Continuing with the same two-equation example above, we could convert both equations to
slope-intercept form:

y = −(1/2)x + 2

y = x − 1

And now we can use the CircuitLab software to plot these two lines:
System of Two Linear Equations: circuitlab.com/c63c25cy62k6t

Exercise: Open the circuit, click "Simulate," then "Run DC Sweep." You'll see the two lines plotted, with an intersection at x = 2, y = 1. This shows how we can quickly use the DC Sweep mode of a circuit simulator as a simple but flexible and powerful graphing tool.

Zero, One, or Infinitely Many Solutions 


A solution is a specified configuration of values for all of the unknowns such that all of the
equations are simultaneously satisfied.

There are only three possible outcomes for any system of equations. Either:

No solution exists.
One unique solution exists.
Infinitely many solutions exist.

Using our two-equation, two-unknown example from earlier, we can consider these three
cases. (If you haven’t plotted it, do so now!)

If our two equations produce two perfectly parallel lines (but not the same line), there is no
intersection. This means that there is no solution. No point in the x-y plane lies on both
lines, so no x-y pair satisfies both equations at once.
If our two equations produce the exact same line overlapping with itself, then all points are
intersecting. There are infinitely many solutions. All points on that line are possible solutions
of the system of equations.

If the lines are not parallel, then they are guaranteed to have a single intersection point.
This point is our one unique solution to the system. This is the normal case, and usually the
one we’re interested in, but it’s important to see what can go wrong and what’s necessary for
our solution to exist at all.

You can quickly modify the plotted equations above to illustrate the two other cases for
yourself. For example, see this one for two equations with no solution:

System of Two Linear Equations - No Solution: circuitlab.com/c9jf9emv678x7

Exercise: Click to plot the no solutions case.

Geometric Interpretations in Higher Dimensions 


In two dimensions, solving a system of linear equations is equivalent to finding an
intersection of the lines described by each equation. Each equation Ax + By = C
describes
some line in 2D space.
In three dimensions, solving a linear system is equivalent to finding an intersection of the
planes described by each equation. Each equation Ax + By + Cz = D
describes some plane
in 3D space.

In 3D, note that the intersection of two planes could be either:

A line (the normal case),


A plane (if both planes are identical), or
No points (if both planes are parallel but not identical).

That's why the intersection of just two planes (two equations) is not enough to uniquely solve a 3D linear system problem.

If we have a third equation (i.e. a third plane), we can take the intersection of that plane with
the line that resulted from the intersection of the first two planes. In 3D, the intersection of
a plane and a line can either be:

A point (the normal case),


The line (if the line happens to be part of the plane), or
No points (if the line is parallel to the plane but not touching).

When we successfully solve a system of 3 equations and 3 unknowns, we’re in the case where
the first two equations intersected to produce a line, and that line then intersected with the
third equation’s plane to produce a single solution point.

In higher dimensions, solving a linear system is equivalent to finding an intersection of the


hyperplanes described by each equation.

Every equation in our system is like a hyperplane in N-1 dimensions where N is the number
of unknowns in the system. Why N-1? Because, in general, if you specify N-1 of the
unknowns, then there’s only 1 specific numerical value the remaining unknown can take for
that equation to be satisfied:

In one dimension (Ax = D), nothing needs to be specified, and there's only one x coordinate that satisfies the equation.
In two dimensions (Ax + By = D), if you specify an x coordinate, there's only one y coordinate value that falls on any given line.
In three dimensions (Ax + By + Cz = D), if you specify x and y coordinates, then there's only one z coordinate value that falls on any given plane.
And so on.
Every time we include another equation, we’re applying a new constraint, and, so long as a
unique solution exists, we’re reducing the dimensionality of the solution space by 1.
Ideally, after we’ve applied all our equations one at a time, we’re left with just a single unique
solution point.

While most people do not have good intuition for geometry above 3D (or good drawing
skills above 2D!), the geometric interpretation allows us to quickly see the possible cases of
 
solving any linear system.

One Equation & One Unknown 


Consider an equation with just one unknown and one equation:

2x = 10

This has one solution: x = 5. But can we modify this equation to have no solution? Actually, yes:

0x = 10

There is no value of x for which this equation is true. Therefore, this system (of 1 equation and 1 unknown) has no solution.

But what if we change the right hand side, too?

0x = 0

In this case, x can take on any value, whether x = 5 or x = −π or x = 6 × 10²³, and still the equation remains true. There are infinitely many possible solutions.

It may appear that we’ve oversimplified, but in fact, there’s nothing special about this
example. These three outcomes are the only three possibilities with systems of any size, but
 
they are illustrated here with the system of minimum size N = 1
equations and unknowns.

One Equation & Two Unknowns 


When we have fewer equations than unknowns, a system definitely does not have a single
solution, but it might have either zero or infinitely many solutions.

We’re familiar with an equation like:

−x + y = 3

This is the equation of a line, with 1 equation and 2 unknowns. It has infinitely many solution
points: all points on that line are solutions to the equation. That is the definition of a line.
With 2 unknowns and 1 equation constraining them, this infinite solution space happens to have 2 − 1 = 1 degree of freedom. For example, we could write the line as a vector equation:

⎡ x ⎤   ⎡ 1 ⎤     ⎡ 0 ⎤
⎣ y ⎦ = ⎣ 1 ⎦ m + ⎣ 3 ⎦

where the real number m projects the slope vector along the one degree of freedom allowed by the equation. (In a problem with 3 unknowns and 1 equation, we would have a vector equation with two different vectors and two independent degrees of freedom.)

How can a 1 equation, 2 unknown system have zero solutions? Well, a simple way is:

0x + 0y = 3

There are no values of (x, y) that would satisfy this equation.

We don't normally think about it, but when we evaluate a line such as −x + y = 3 at a single point, such as x = 10, we're really adding a second equation to the system:

−x + y = 3

x = 10

To evaluate at x = 10, we are actually first adding that second equation to our system, and then solving a system with 2 equations and 2 unknowns. This happens so fast that we don't consider it consciously or write it down, but that simple second equation is vital to producing a single solution point. Geometrically, we are intersecting the original line with a new vertical line defined as x = 10, and then finding the unique intersection point.

Two Equations & One Unknown 


When a system has more equations than unknowns, the system is usually overdetermined
and inconsistent, meaning no solution exists. For example:

2x = 4

4x = 6

There is no value of x
that satisfies both of these equations, so no solution exists.

However, what if our two equations were:

2x = 4

4x = 8

In this special case, the value x = 2 happens to satisfy both equations. What's happened here? We've found two equations that are not linearly independent. The second equation doesn't actually give us any new information compared to the first. Any value of x that satisfies the first equation will also satisfy the second.

Linear Independence 
We’re not just interested in having N equations for N unknowns. We need all of those N
equations to provide new information – new constraints – or else we would have the same
solution space with just N-1 equations.

Our earlier example of non-independent equations was very simple:

2x = 4

4x = 8

It’s clear that one equation is just double the other.

But linear independence is not a pairwise property. That means we can’t figure it out by
only looking at one pair of equations at a time. We have to look at the whole collection of
equations.

Here’s an example of three equations:

x + 2y + 3z = 1

x = −5

2y + 3z = 6

Any two of these equations taken alone are linearly independent, but all three together are
not. We can add the second and third equations together and get the first equation.

If a set of equations is not linearly independent, we can remove one (or possibly more) of them and still have the same space of solutions.

There is no simple rule for inspecting a set of equations and quickly determining whether or not they are linearly independent. If you quickly see any obvious issues, such as one equation being just a nonzero multiple of another, or one equation being a linear combination of two other equations, you can quickly determine that the set is not linearly independent. However, in general, you may have to combine all previous N-1 equations to determine whether the Nth equation is or is not linearly independent. In practice, determining linear independence is done by solving the system using Gaussian elimination or LU decomposition, as will be described below.
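One practical way to check a set of equations for linear independence numerically is to compute the rank of the left-hand-side coefficient matrix and compare it to the number of equations. Here is a minimal sketch using Python and NumPy (our own tool choice for illustration, not something prescribed by this book), applied to the three-equation example above:

import numpy as np

# Left-hand-side coefficients of:
#   x + 2y + 3z = 1
#   x           = -5
#       2y + 3z = 6
A = np.array([[1.0, 2.0, 3.0],
              [1.0, 0.0, 0.0],
              [0.0, 2.0, 3.0]])

rank = np.linalg.matrix_rank(A)
print(rank)                  # prints 2
print(rank == A.shape[0])    # prints False: only 2 of the 3 equations are independent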

Linear Independence and the Right Hand Side 


Note that the linear independence of a set of equations only has to do with the coefficients
of the unknowns on the left-hand-side of the equation. It does not depend on the values of
the constants on the right-hand-side.

In the case where the system is not linearly independent, then the right-hand-side constants
may help determine whether there are zero solutions (if the duplicated equations are
inconsistent with each other) or infinitely many solutions (if the duplicated equations are
consistent with each other).

For example, the non-independent system:

x + 2y + 3z = 1

x = −5

2y + 3z = 6

happens to be consistent with respect to the right-hand-side constants, so it could be reduced to a system of 2 equations in 3 unknowns which represents a line (in 3D space) passing through the (x, y, z) points (−5, 3, 0) and (−5, 0, 2), and infinitely many other points on the same line.

However, if we change just a single right-hand-side constant from 6 to 0:

x + 2y + 3z = 1

x = −5

2y + 3z = 0

Now, while the left-hand-sides of the second and third equations sum to match the first, the
right-hand-side constants do not. This guarantees that this system has no solutions.

While the outcome of no solutions versus infinitely many solutions may seem quite different,
this is analogous to our 0x = 10
versus 0x = 0
in the 1 equation and 1 unknown base case
described earlier. In either case, it is the structure of the left-hand-side alone that guarantees
that we’ve lost our ability to produce a single solution point. That’s why, at least from the
perspective of solving linear systems of equations, linear independence is a property of
 
the left-hand-side coefficients only.

N Linearly-Independent Equations & N Unknowns 


A system having N linearly independent equations and N unknowns, where N is the same for
both quantities, is a prerequisite for having a solvable linear system of equations. Why?

If there are more unknowns than linearly independent equations, our system is
underdetermined: we have unconstrained degrees of freedom, and will have infinitely
many solutions rather than a single point.
If there are more linearly independent equations than unknowns, we have a problem: it’s
actually impossible for this case to happen, because those equations won’t actually be
linearly independent.

Only N linearly independent equations in N unknowns can produce a single solution point,
 
which is what we’re usually looking for.

Matrix Representation 
Let’s consider a system with 3 equations and 3 unknowns:

f + 2g + 3h = 1

f = −5

g + h = 1

This system has 3 linearly independent equations, and has a single valid solution: f = −5, g = −3, h = 4. You should quickly substitute these values into each equation and confirm that it is valid.

We could also write this directly as a matrix equation Ax = B, where A is a 3x3 matrix of left-hand-side coefficients, x is a 3x1 vector of unknowns, and B is a 3x1 vector of right-hand-side constants:

⎡ 1  2  3 ⎤ ⎡ f ⎤   ⎡  1 ⎤
⎢ 1  0  0 ⎥ ⎢ g ⎥ = ⎢ −5 ⎥
⎣ 0  1  1 ⎦ ⎣ h ⎦   ⎣  1 ⎦

By performing the matrix multiplication on the left, we recover one equation from each row
of A
.

You should become comfortable going back and forth between the matrix representation
and the equation representation. They are truly identical. However, we often work in the
matrix representation because it’s more compact and it’s also the way we solve these
systems with computational tools. Learning to approach and solve problems both in the
equation representation and the matrix representation is valuable, and we will flip
back and forth in the rest of this book.

This particular problem is a simple matrix equation, and can easily be solved by any matrix
solver. For example, click here to solve this matrix equation on Wolfram Alpha.

Solving a linear system like this is so common that in many mathematical software packages (including MATLAB, Octave, Sage) one simply uses the backslash ("\") operator to designate solving an equation: x = A∖B.
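If you prefer Python, NumPy's solver plays the same role as the backslash operator. Here is a minimal sketch (the tool choice is ours, not the book's) that solves the 3x3 example above:

import numpy as np

# A x = B for the system: f + 2g + 3h = 1,  f = -5,  g + h = 1
A = np.array([[1.0, 2.0, 3.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])
B = np.array([1.0, -5.0, 1.0])

x = np.linalg.solve(A, B)
print(x)   # [-5. -3.  4.]  i.e. f = -5, g = -3, h = 4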

Augmented Matrices 
To make it even more compact to represent systems of equations, it's common to omit the vector of unknowns, and to simply join the B vector on to the right-hand side of A, with a vertical bar separating them:

⎡ 1  2  3 |  1 ⎤
⎢ 1  0  0 | −5 ⎥
⎣ 0  1  1 |  1 ⎦   for (f, g, h)

This is called an augmented matrix, and it’s just another identical way of writing the system
of equations from above. There’s nothing special about it except as a convenient way of
writing these equations, and it happens to keep the left-hand-side coefficients and right-
 
hand-side constants of each row together nicely.

Ordering of Rows and Columns 


When we consider writing a system of equations as a matrix, we have two considerations:

1. In what order should we place the columns?


2. In what order should we place the rows?

The ordering of the columns maps directly to the ordering of the unknowns (f , g, h)
in the
example above. However, we could have chosen a different column ordering, so long as
we’re careful to keep track as we work.

The ordering of the rows maps directly to the ordering of the equations. But a system of
equations has the same meaning regardless of what order we write the equations. If we
reorder the rows of A
, we must also reorder the corresponding rows of B
. However, the
augmented matrix form [A|B]
does this for us “automatically” by keeping the entire row
together. We can write this augmented matrix in any row order and the system will have the
same solution.

Are there any especially convenient orderings that make a system of equations easier to
 
solve? Yes. Let’s look at a few.

Diagonal Matrices 
Here’s a system of 5 equations in 5 unknowns which takes practically zero effort to solve:
−v = 3

9w = 6

5x = 10

8y = 24

z = 99

It's obvious that the equations don't interact with each other at all, and we can solve each by simply dividing the right-hand-side constant by the left-hand-side coefficient:

v = 3/(−1)

w = 6/9

x = 10/5

y = 24/8

z = 99/1

As an augmented matrix, this system of equations looks like:

⎡ −1  0  0  0  0 |  3 ⎤
⎢  0  9  0  0  0 |  6 ⎥
⎢  0  0  5  0  0 | 10 ⎥
⎢  0  0  0  8  0 | 24 ⎥
⎣  0  0  0  0  1 | 99 ⎦   for (v, w, x, y, z)

The 5x5 coefficient matrix has a special shape: it’s a diagonal matrix, meaning it only has
nonzero entries on the main diagonal, and is strictly zero everywhere else.

Remember that for any system of equations we are free to rearrange the ordering of the rows, and additionally, we are free to rearrange the ordering of the columns so long as we keep track of the corresponding ordering of the unknown variables. That's why the easy-to-solve diagonal structure has more to do with the structure of the equations themselves, rather than the order in which we may first see them written.
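Because the equations don't interact, solving a diagonal system amounts to one division per row. Here is a minimal Python/NumPy sketch (our own illustration) of the 5x5 diagonal example:

import numpy as np

# Diagonal system: -v = 3, 9w = 6, 5x = 10, 8y = 24, z = 99
diag = np.array([-1.0, 9.0, 5.0, 8.0, 1.0])   # main-diagonal coefficients
b = np.array([3.0, 6.0, 10.0, 24.0, 99.0])    # right-hand-side constants

solution = b / diag    # elementwise division solves every equation at once
print(solution)        # [-3.  0.66666667  2.  3.  99.]  for (v, w, x, y, z)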

Triangular Matrices 
Here’s a system of 4 equations in 4 unknowns which was selected to be easily solvable by
hand:

2w = 16

w − 3x = 14

−w + x + 10y = −10

w + x + 6y + 2z = 0
Make sure you can solve this system quickly by hand before you continue. You should find w = 8, x = −2, y = 0, z = −3.

This system is easy to solve because we can work one row at a time. Looking only at the first equation, we have one equation and one unknown, so we'll immediately find a value for w:

w = 16/2 = 8

Next, looking at the second row, we can substitute in the first value we found, w = 8, and then the second row becomes 8 − 3x = 14, or −3x = 6, so x = −2.

This process continues on and on for each subsequent equation because each new
equation contains only variables we’ve already solved, plus one new unknown which is
now easily solved by doing a bit of subtraction and division.

Here's the same system written as an augmented matrix:

⎡  2   0   0  0 |  16 ⎤
⎢  1  −3   0  0 |  14 ⎥
⎢ −1   1  10  0 | −10 ⎥
⎣  1   1   6  2 |   0 ⎦   for (w, x, y, z)

This shape of coefficient matrix is called a lower triangular matrix, meaning that nonzero
values are present only on the main diagonal and any cells below the main diagonal. The
area above the main diagonal is filled with strictly zero values.

A lower triangular matrix problem is easy to solve through forward substitution:

1. Start at the first row, which contains a single nonzero variable term, and solve that
equation directly with division.
2. For all subsequent rows, substitute in the already-known values of the N-1 variables
corresponding to the previous N-1 rows, again leaving a simple algebra problem to find
the value of the Nth variable.
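Here is a short forward-substitution sketch in Python/NumPy (our own illustration of the procedure just described), applied to the 4x4 lower triangular example:

import numpy as np

def forward_substitution(L, b):
    """Solve L x = b one row at a time, where L is lower triangular."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        # Subtract the contributions of the already-solved unknowns,
        # then divide by the diagonal (pivot) coefficient.
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

L = np.array([[ 2.0,  0.0,  0.0, 0.0],
              [ 1.0, -3.0,  0.0, 0.0],
              [-1.0,  1.0, 10.0, 0.0],
              [ 1.0,  1.0,  6.0, 2.0]])
b = np.array([16.0, 14.0, -10.0, 0.0])

print(forward_substitution(L, b))   # [ 8. -2.  0. -3.]  i.e. w, x, y, z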

Similarly, we can mirror the situation. For example, consider this 3 equation and 3 unknown
system:

x − y + z = 2

3y − z = 3

2z = 6

We can again easily solve by hand to find x = 1, y = 2, z = 3. But in this case, our intuition is to solve from the bottom up: first find z = 6/2 = 3, then substitute that into the equation above that to find y, and so on.
This system written as an augmented matrix:

⎡ 1  −1   1 | 2 ⎤
⎢ 0   3  −1 | 3 ⎥
⎣ 0   0   2 | 6 ⎦   for (x, y, z)

The shape of this coefficient matrix is upper triangular, meaning that there are nonzero
values present only on the main diagonal and any cells above the main diagonal. All the cells
below the main diagonal are zero.

An upper triangular matrix problem is easy to solve through back substitution:

1. Start at the last row, solve directly.


2. Move up to the preceding equation and substitute in the already-known values. Repeat until finished.
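Back substitution is the mirror image, working from the last row up. Here is a minimal Python/NumPy sketch (again just an illustration), applied to the 3x3 upper triangular example:

import numpy as np

def back_substitution(U, b):
    """Solve U x = b from the bottom row up, where U is upper triangular."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # Subtract contributions of the unknowns already solved below this row.
        x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

U = np.array([[1.0, -1.0,  1.0],
              [0.0,  3.0, -1.0],
              [0.0,  0.0,  2.0]])
b = np.array([2.0, 3.0, 6.0])

print(back_substitution(U, b))   # [1. 2. 3.]  i.e. x = 1, y = 2, z = 3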

In general, most systems of equations we’re interested in solving will not happen to be one
of the three easy to solve cases (diagonal, lower triangular, upper triangular). However, we’ll
now show that it’s possible to solve any matrix problem by doing a series of simple
operations that convert it into a triangular matrix problem without changing the solution of
 
the system of equations.

Row Operations 
Equations have some nice properties that we take for granted:

a = b

a + m = b + m

ka = kb

These basic rules say that if we have an equation, we can always add a constant to both sides
and it’s still true, or we can multiply by a constant and it’s still true.

If we have two equations a = b and c = d, then we can combine them:

a + c = b + d

Suppose we have a linearly independent system of 2 equations with 2 unknowns x, y:

a11 x + a12 y = b1

a21 x + a22 y = b2

We can, for example, add the two equations together to create a new equation:

(a11 x + a12 y) + (a21 x + a22 y) = b1 + b2


Or, after factoring the terms:

(a11 + a21 )x + (a12 + a22 )y = b1 + b2

This last line is a new equation that is a linear combination of the original two equations. It is not linearly independent of the original two. However, importantly, if we pick either one (and only one) of the original equations, alongside our new equation, those two form a new system of two equations that is linearly independent (assuming the original equations were linearly independent).

That is to say that we can modify our original system of equations in this way and get a new system of equations which is equivalent to (i.e. has the same solution as) the original system.

Because this is usually done with the matrix representation as discussed above, and each
equation is represented by a row in our matrix, these modifications are called “row
operations.” For example, to illustrate the same addition and overwrite to form a new system:

⎡ a11  a12 | b1 ⎤
⎣ a21  a22 | b2 ⎦   for (x, y)

Add row 1 + row 2; store result in row 2:

⎡ a11          a12         | b1        ⎤
⎣ (a11 + a21)  (a12 + a22) | (b1 + b2) ⎦   for (x, y)

Both the original & modified augmented matrices have the same solution. We can apply further row operations to continue to manipulate the system. The goal of these row operations will be to turn our original system into a triangular system, which we know is easy to solve.
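As a quick numerical sanity check, we can verify that such a row operation leaves the solution unchanged. Here is a small Python/NumPy sketch (our own illustration; the variable names are ours) using the earlier two-equation example x + 2y = 4, x − y = 1:

import numpy as np

A = np.array([[1.0,  2.0],
              [1.0, -1.0]])
b = np.array([4.0, 1.0])

# Row operation: add row 1 to row 2 and store the result in row 2.
A2 = A.copy()
b2 = b.copy()
A2[1] = A[0] + A[1]
b2[1] = b[0] + b[1]

print(np.linalg.solve(A, b))    # [2. 1.]
print(np.linalg.solve(A2, b2))  # [2. 1.]  same solution after the row operation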

Here are the three types of allowed row operations:

Row Operation #1: Multiply a Row by a Constant 

An equation can always be scaled by multiplying both sides by a constant factor (as long as
that factor is not zero). This does not change the solution space of that equation.

We have to store the new, scaled equation in place of the original. We might write:

2R3 → R3

 
to indicate that we’re going to double row #3 and then store that new equation back in the
same row.

Row Operation #2: Swap Rows 


A system of equations has the same meaning regardless of what order we list the equations
in. Therefore, we can always swap two rows of our augmented matrix and have a new system
with the same meaning. We might write:

R2 ↔ R3

to indicate that we're going to swap row #2 with row #3.

(All reorderings of rows are possible through a series of pairwise row swaps.)

Row Operation #3: Add a Row to Another Row 

We can always add any nonzero multiple of one row to another row. This produces a linear
combination of the two parent rows.

In order to maintain all the information that is previously present in the original system, we
have to replace one of the two parent rows with our new linear combination.

For example, we might write:

R4 − 3R1 → R4

to indicate that we’re going to take the fourth row, subtract three times the first row, and
then store the result again in the fourth row.

All three of these row operations produce a new system of equations which has the same
solution as the original system. Now, we’ll see how to use these operations to solve a system
 
of equations.

Gaussian Elimination 
Solving a linear system using Gaussian elimination is a two-step process:

1. First, we’ll use row operations to change a matrix into an upper-triangular one.
2. Second, we’ll solve the upper-triangular system (which is easy to do, as we’ve shown
earlier).

Let’s consider the following system of equations:


w + x + y = 11

2w − y = 0

8x + y + z = 12

w + y = 9

or, as an augmented matrix:

⎡ 1  1   1  0 | 11 ⎤
⎢ 2  0  −1  0 |  0 ⎥
⎢ 0  8   1  1 | 12 ⎥
⎣ 1  0   1  0 |  9 ⎦   for (w, x, y, z)

For each row of the matrix, we'll look at a pivot value – the value that happens to be on the main diagonal – and use it to eliminate any nonzero matrix coefficients in the equations below it. For the first row, the pivot is the 1 in the upper-left corner, and the two nonzero values to be eliminated below it in the same column are the 2 (row 2) and the 1 (row 4):

⎡ 1  1   1  0 | 11 ⎤
⎢ 2  0  −1  0 |  0 ⎥
⎢ 0  8   1  1 | 12 ⎥
⎣ 1  0   1  0 |  9 ⎦   for (w, x, y, z)

To remove these values, we'll subtract the correct multiple (the multiple which cancels out the corresponding term) of the pivot row from each of the rows below it:

R2 − (2/1) R1 → R2
R4 − (1/1) R1 → R4

⎡ 1   1   1  0 |  11 ⎤
⎢ 0  −2  −3  0 | −22 ⎥
⎢ 0   8   1  1 |  12 ⎥
⎣ 0  −1   0  0 |  −2 ⎦   for (w, x, y, z)

(No operation was necessary on the 3rd row, because its corresponding cell in the pivot
column was already 0.)

Note that after these two row operations, we’ve made it so that the entire column below the
pivot cell is all zeros.

Now, we move to the 2nd row. The pivot is the −2 on the main diagonal, and the cells below it in the same column (the 8 and the −1) are the ones we'll work on eliminating:

⎡ 1   1   1  0 |  11 ⎤
⎢ 0  −2  −3  0 | −22 ⎥
⎢ 0   8   1  1 |  12 ⎥
⎣ 0  −1   0  0 |  −2 ⎦   for (w, x, y, z)

Again, we need two row operations to eliminate the 8 and the −1 values:

R3 − (8/−2) R2 → R3
R4 − (−1/−2) R2 → R4

⎡ 1   1    1   0 |  11 ⎤
⎢ 0  −2   −3   0 | −22 ⎥
⎢ 0   0  −11   1 | −76 ⎥
⎣ 0   0  3/2   0 |   9 ⎦   for (w, x, y, z)

There are lots of negative signs flying around, so be careful!

Finally, we'll operate on the 3rd row:

⎡ 1   1    1   0 |  11 ⎤
⎢ 0  −2   −3   0 | −22 ⎥
⎢ 0   0  −11   1 | −76 ⎥
⎣ 0   0  3/2   0 |   9 ⎦   for (w, x, y, z)

With just one remaining row, only one row operation is necessary to clear that cell.

R4 − ((3/2)/(−11)) R3 → R4

⎡ 1   1    1     0  |     11 ⎤
⎢ 0  −2   −3     0  |    −22 ⎥
⎢ 0   0  −11     1  |    −76 ⎥
⎣ 0   0    0  3/22  | −30/22 ⎦   for (w, x, y, z)

There are some unpleasant fractions flying around, but the result of this operation is that the
coefficient matrix is now upper triangular!

⎡ 1   1    1     0  |     11 ⎤
⎢ 0  −2   −3     0  |    −22 ⎥
⎢ 0   0  −11     1  |    −76 ⎥
⎣ 0   0    0  3/22  | −30/22 ⎦   for (w, x, y, z)

Now that it's upper triangular, we just solve with back-substitution: start from the last row, and solve (3/22)z = −30/22 to find z = −10.

Next, we move up a row and look at the 3rd row, while substituting in the value of the one
unknown z
we’ve already solved:

−11y + z = −76

−11y − 10 = −76

−11y = −66

y = 6

And now the 2nd row, substituting in the two unknowns y, z we now know:
−2x − 3y = −22

−2x − 18 = −22

−2x = −4

x = 2

And finally the 1st row, substituting in all the other terms:

w + x + y = 11

w + 2 + 6 = 11

w = 3

And that’s it! We’ve solved for all our unknowns. Most of the work was involved in using
each pivot to eliminate all the nonzero values in the column below it. The resulting
upper triangular matrix was easy to solve bottom-up.

Gaussian elimination, as shown here, is a very mechanical way to use row operations to
transform an arbitrary matrix problem into a triangular one by eliminating nonzero terms
 
below the main diagonal, one at a time.
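The whole procedure is mechanical enough to write in a few lines of code. Here is a minimal Python/NumPy sketch of Gaussian elimination followed by back substitution (our own illustration, with no pivoting yet; the next section explains when row swaps are needed), applied to the same 4x4 example:

import numpy as np

def gaussian_elimination(A, b):
    """Reduce the system to upper triangular form, then back-substitute.
    Assumes every pivot encountered is nonzero (no row swaps)."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    # Forward elimination: zero out everything below each pivot.
    for p in range(n):
        for r in range(p + 1, n):
            m = A[r, p] / A[p, p]        # multiple of the pivot row to subtract
            A[r, p:] -= m * A[p, p:]
            b[r] -= m * b[p]
    # Back substitution on the resulting upper triangular system.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[1.0, 1.0,  1.0, 0.0],
              [2.0, 0.0, -1.0, 0.0],
              [0.0, 8.0,  1.0, 1.0],
              [1.0, 0.0,  1.0, 0.0]])
b = np.array([11.0, 0.0, 12.0, 9.0])

print(gaussian_elimination(A, b))   # [  3.   2.   6. -10.]  i.e. w, x, y, z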

Gaussian Elimination with Pivoting 


At each step in the Gaussian elimination process, we looked at row p and our pivot value App, and then used it to cancel out any nonzero terms Anp for rows n > p. The cancellation worked because we multiplied row p by the ratio Anp/App, so that:

Anp − (Anp/App) App = 0

in order to zero out that cell. However, this process breaks if we have to divide by zero, i.e. if App = 0.

As an example, let’s consider the same four equations, but let’s swap row 1 and row 3 before
we begin:

8x + y + z = 12

2w − y = 0

w + x + y = 11

w + y = 9

or, as an augmented matrix:

⎡ 0  8   1  1 | 12 ⎤
⎢ 2  0  −1  0 |  0 ⎥
⎢ 1  1   1  0 | 11 ⎥
⎣ 1  0   1  0 |  9 ⎦   for (w, x, y, z)

There is no way we can use the 0 pivot cell in the first row to eliminate the nonzero values in the column below it. There is no multiple of 0 that will additively cancel out nonzero values.

If you encounter a case where the pivot value is zero, you must swap rows so that the pivot is not zero before you proceed. If this happens at an intermediate step, that's OK: you can pause and swap row p with some later row n > p, and then resume. (Don't swap with an earlier row n < p, because that row will have nonzero values to the left of the pivot cell!)
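In code, the fix is one extra step at the top of each elimination pass: pick the row (at or below the pivot position) with the largest-magnitude entry in the pivot column and swap it up. This is called partial pivoting, and it also improves numerical accuracy. Here is a sketch (our own illustration, extending the earlier elimination code) run on the reordered system above, whose first pivot is zero:

import numpy as np

def gaussian_elimination_pivoting(A, b):
    """Gaussian elimination with partial pivoting, then back substitution."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for p in range(n):
        # Partial pivoting: swap up the row with the largest entry in column p.
        best = p + int(np.argmax(np.abs(A[p:, p])))
        if best != p:
            A[[p, best]] = A[[best, p]]
            b[[p, best]] = b[[best, p]]
        for r in range(p + 1, n):
            m = A[r, p] / A[p, p]
            A[r, p:] -= m * A[p, p:]
            b[r] -= m * b[p]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[0.0, 8.0,  1.0, 1.0],   # zero in the first pivot position
              [2.0, 0.0, -1.0, 0.0],
              [1.0, 1.0,  1.0, 0.0],
              [1.0, 0.0,  1.0, 0.0]])
b = np.array([12.0, 0.0, 11.0, 9.0])

print(gaussian_elimination_pivoting(A, b))   # [  3.   2.   6. -10.]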

LU Decomposition 
Circuit simulators like CircuitLab (and other numerical software) usually approach Gaussian elimination with a small twist. Instead of only working on an NxN original coefficient matrix A to create the NxN upper triangular matrix (which we'll call U for upper), we'll simultaneously create an NxN lower triangular matrix (L for lower), such that A = LU, the matrix product of the lower and upper triangular matrices.

The process of solving A = LU is called LU Decomposition or LU Factorization.

Notice that the Gaussian elimination process to make the matrix be upper triangular did not
depend on the right-hand-side constant values; while they were modified by the row
operations, they did not determine the modifications of the coefficient rows, and were
referenced only in the backsubstitution step. If we wanted to use the same left-hand-side A
but re-solve our equations for a different set of constants B
on the right, we would have to
redo much of our work.

However, if we already have a decomposition A = LU, then we can solve the system Ax = B for a new right-hand-side B using this two-step process:

1. Solve Ly = B to find a vector y using forward substitution, since L is lower triangular.
2. Solve Ux = y to find x using back substitution, since U is upper triangular.

Ax = B

LUx = B

L(Ux) = B

Ly = B

These two solves are very, very fast because they’re each working on a triangular matrix.

The process of creating the L, U matrices is very similar to Gaussian elimination, but uses L to keep track of the multiplications used for each pivot elimination. We won't get into more detail here, but the LU decomposition for our 4x4 example matrix above is:

A = LU

⎡ 1  1   1  0 ⎤   ⎡ 1    0      0    0 ⎤ ⎡ 1   1    1     0  ⎤
⎢ 2  0  −1  0 ⎥ = ⎢ 2    1      0    0 ⎥ ⎢ 0  −2   −3     0  ⎥
⎢ 0  8   1  1 ⎥   ⎢ 0   −4      1    0 ⎥ ⎢ 0   0  −11     1  ⎥
⎣ 1  0   1  0 ⎦   ⎣ 1  1/2  −3/22    1 ⎦ ⎣ 0   0    0  3/22  ⎦

Where do these L, U matrices come from?

1. Notice that the U shown here is identical to the result of our Gaussian elimination process above.
2. If you look closely, you may notice that the cell values in the L matrix come from the coefficients on each of the row operations we performed to turn A into U.

Solving Ly = B
finds the same transformed right-hand side constants that Gaussian
elimination produces on the augmented matrix. Then, solving Ux = y
just solves the same
upper triangular system resulting from Gaussian elimination. The first step may look
unfamiliar, but it’s really just doing the same work that Gaussian elimination would do on the
right-hand side.

In practice, many problems (including many circuit problems) involve re-solving the same
coefficient matrix A
with different constant values B
repeatedly, so performing the relatively
slow LU decomposition just once and then performing the fast triangular solves as many
 
times as needed can be a tremendous speedup.
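SciPy's triangular solvers make the two-step solve explicit. Here is a minimal sketch (our own tool choice for illustration) that reuses the L and U factors shown above to solve Ax = B for the original right-hand side, and then again for a different one, without refactoring A:

import numpy as np
from scipy.linalg import solve_triangular

L = np.array([[1.0,  0.0,   0.0,  0.0],
              [2.0,  1.0,   0.0,  0.0],
              [0.0, -4.0,   1.0,  0.0],
              [1.0,  0.5, -3/22,  1.0]])
U = np.array([[1.0,  1.0,   1.0,  0.0],
              [0.0, -2.0,  -3.0,  0.0],
              [0.0,  0.0, -11.0,  1.0],
              [0.0,  0.0,   0.0, 3/22]])

def solve_with_lu(B):
    y = solve_triangular(L, B, lower=True)   # forward substitution: L y = B
    return solve_triangular(U, y)            # back substitution:    U x = y

print(solve_with_lu(np.array([11.0, 0.0, 12.0, 9.0])))   # [  3.   2.   6. -10.]
print(solve_with_lu(np.array([1.0, 2.0, 3.0, 4.0])))     # a new B, no new factorization needed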

LU Factorization with Pivoting 


Just as Gaussian elimination requires swapping rows to ensure nonzero pivot values, LU decomposition may require reordering the rows too. In practical implementations, LU decomposition actually transforms A by solving PA = LU, where P is a permutation matrix (exactly one 1 in every row and every column; zero otherwise) which simply reorders the rows of A so that a decomposition exists.

From the LU decomposition with row permutation, we can solve:

Ax = B

PAx = PB

LUx = PB

Notice that we apply the row permutation to both the coefficients PA and to the right-hand-side constants PB; this preserves their integrity as equations, just like the augmented matrix form does.

As PB is just a reordering of the vector B, we can then continue with the forward and back substitution solves as before:

1. First solve Ly = PB to find y.
2. Then solve Ux = y.

The algorithms for actually finding P, L, U from a given A are somewhat advanced and we won't elaborate in this book, but the implementation details are important to producing a good result, especially on matrices of large size, high sparsity, and wide dynamic range, as are often present in electronics problems. High quality implementations of LU decomposition like the one built into the CircuitLab software provide high performance and excellent numerical stability over a wide space of possible input matrices.
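In practice you rarely write the factorization yourself. Numerical libraries expose the factor-once, solve-many pattern directly; here is a minimal sketch using SciPy (our own tool choice, not necessarily what CircuitLab uses internally), with row pivoting handled for us:

import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[1.0, 1.0,  1.0, 0.0],
              [2.0, 0.0, -1.0, 0.0],
              [0.0, 8.0,  1.0, 1.0],
              [1.0, 0.0,  1.0, 0.0]])

# Factor once (the permutation from pivoting is returned in piv)...
lu, piv = lu_factor(A)

# ...then solve cheaply for as many right-hand sides as needed.
print(lu_solve((lu, piv), np.array([11.0, 0.0, 12.0, 9.0])))   # [  3.   2.   6. -10.]
print(lu_solve((lu, piv), np.array([1.0, 1.0, 1.0, 1.0])))     # another B, reusing the same factors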

Strategies for Solving Systems of Equations by Hand 


When we’re asked to solve a system of equations by hand, we’ll still use the same tools, but
we can make our task slightly easier by taking a few shortcuts as we see them.

For systems of N linear equations with N unknowns, write the equations with all unknown
terms on the left-hand side and all constants on the right-hand side, and then follow these
three rules:

1. If there are any equations with just one unknown (such as 5x = 3), first solve for that unknown and then substitute its value into all remaining equations, and collect with the constant term on the right-hand side. You can now remove that equation from the system. This reduces the remaining number of equations by 1 and reduces the remaining number of unknowns by 1.
2. If there are any equations with two unknowns and zero constant (such as x − 2y = 0), then these two variables are simply a multiple of each other (x = 2y). Choose one variable to keep, and replace all instances of the removed variable with the correct multiple of the retained variable (such as replacing x with 2y) in all remaining equations. You can now set aside the original equation (such as x − 2y = 0) and deal with it at the very end of your solution process (for example, after you know the value of y). This reduces the remaining number of equations by 1 and reduces the remaining number of unknowns by 1.
3. Use the row operation of adding a multiple of one row to another row in order to cancel out a term of one equation, reducing the number of unknowns in that particular equation by one. Aim toward modifying an equation to look more like rule #1 or #2. (Once there is only one unknown left in that equation, then rule #1 applies.)
Repeat, doing one operation from this list until the system is completely solved. Some tips:

You may work on the rows in whatever order will make the least work.
Look for overlap.
Look for places you can cancel terms without adding any new terms.
You may work with the equations or with the augmented matrix form; both are
equivalent, but one is more verbose and the other might make it too easy to make
mistakes when working by hand.

This strategy is essentially Gaussian elimination, but we’ve intermixed the reduction and the
substitution steps to do whichever is more convenient first, because if we’re showing our
work by writing each step on paper it’s often convenient to reduce the dimensionality of our
problem as early as possible.

Here’s an example of how we might solve a system by hand. Let’s start with the same system
we solved using Gaussian elimination earlier:

w + x + y = 11

2w − y = 0

8x + y + z = 12

w + y = 9

or, as an augmented matrix:

⎡ 1  1   1  0 | 11 ⎤
⎢ 2  0  −1  0 |  0 ⎥
⎢ 0  8   1  1 | 12 ⎥
⎣ 1  0   1  0 |  9 ⎦   for (w, x, y, z)

First, we notice that we can use the 4th row to cancel out 3rd-column y terms in both the 1st and 2nd equations without introducing any new nonzero terms:

R1 − R4 → R1
R2 + R4 → R2

⎡ 0  1  0  0 |  2 ⎤
⎢ 3  0  0  0 |  9 ⎥
⎢ 0  8  1  1 | 12 ⎥
⎣ 1  0  1  0 |  9 ⎦   for (w, x, y, z)

The first and second row now have just one nonzero term each, so we immediately have two
of our results:

x = 2  and  w = 3

We can substitute these values into the remaining two equations to produce our remaining system:

⎡ 1  1 | −4 ⎤
⎣ 1  0 |  6 ⎦   for (y, z)

We now have a simple second equation with y = 6, which we can then plug into the first equation and find z = −10. We now have solved all four unknowns. You can and should quickly check that these values satisfy the original four equations, and they match the Gaussian elimination solution we found previously.

Let’s try another approach. Here’s the original system of equations again:

⎡ 1  1   1  0 | 11 ⎤
⎢ 2  0  −1  0 |  0 ⎥
⎢ 0  8   1  1 | 12 ⎥
⎣ 1  0   1  0 |  9 ⎦   for (w, x, y, z)

Now we'll notice that the 2nd row has two terms and zero constant value: it says 2w − y = 0, or y = 2w. We're going to choose to keep w and remove y from our set of equations by substituting 2w every time we see a y, and then we can remove the 2nd equation entirely from our system:

Substitute 2w for y; remove R2:

⎡ 3  1  0 | 11 ⎤
⎢ 2  8  1 | 12 ⎥
⎣ 3  0  0 |  9 ⎦   for (w, x, z)

Notice that the right-hand-side values are unchanged. While we haven't actually determined the numerical value for y yet, we've now quickly reduced the dimensionality of our problem from 4 to 3. And in doing so, the new third equation is trivially solvable for w = 3.

Now that we know w, we can also remember to solve our simplification, y = 2w, to find y = 6.

We substitute w = 3 into the remaining two equations and remove the third equation to find:

⎡ 1  0 | 2 ⎤
⎣ 8  1 | 6 ⎦   for (x, z)

From here, it's clear from the 1st row that x = 2, and then when substituting that value into the second row, we find z = −10.

This is the same example system of equations we solved mechanically in the “Gaussian
Elimination” section above, but by applying these solve-by-hand strategies, we have to do a
lot less work (significantly fewer total operations), avoid ugly fractions, and get the answer
faster!
When solving by hand, we have lots of opportunities to make arithmetic mistakes (for
example, adding rather than subtracting when computing the new right-hand side after
substitution), as well as record-keeping mistakes (for example, not keeping track of which
column maps to which variable as we go). Fortunately, it’s relatively easy to check your
solution after you compute it: just plug it into each of the equations and make sure the left
side equals the right. Get in the habit of quickly checking your work after you solve systems
 
of equations.

What’s Next 
In the next section, Steady State & Transient, we’ll consider how to separate the analysis of a
system’s normal operating conditions from events that may change those conditions
temporarily or permanently.

Robbins, Michael F. Ultimate Electronics: Practical Circuit Design and Analysis. CircuitLab, Inc., 2021,
ultimateelectronicsbook.com. Accessed 06 Sep 2022. (Copyright © 2021 CircuitLab, Inc.)
