Sudoku Solver

This page describes an algorithm for solving Sudoku puzzles, which have suddenly become
popular in the UK, and provides a Python script to automate the process.
The "Easy" Example

Introduction
Applying the Given Numbers
Looking for Singletons
Pairs
Eliminating Bad Guesses
If All Else Fails
Download the Script
Updates:
Oct 9 2005 - Replaced the Python script with an easier-to-understand version that doesn't
use recursion. Altered this page to reflect the new algorithm.
The "Easy" Example

1
4
4
6
7 5
3
5
3
7 3
6
1
We will start off with an easy example. This one appeared in the Evening Standard on 31st May
2005, where it was described as "Hard". It has 27 squares filled in; the examples they describe
as "Easy" have 32. This doesn't make much difference to a computer algorithm, though if you
were working through it by hand, the more initial numbers you are given the quicker you should
be able to solve it. I call it "easy" because it can be solved after applying only the simpler
parts of the algorithm.
8 4
2 9
Introduction
The algorithm works by writing out all the possible answers and then eliminating those that
are incompatible with the numbers published in the puzzle. The starting position is that every
cell can contain any of the numbers 1-9. We represent this by writing all the possible values
into each cell, which gives us a grid that looks like this.
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567

789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
The overall approach is to eliminate the impossible so that what remains must be the answer.
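One way to hold this starting grid in Python is sketched here. The layout (a dict keyed by (row, column), each value a set of remaining digits) and the names `new_grid` and `show` are assumptions made for illustration; they are not taken from the downloadable script.

```python
# Assumed representation: (row, column) -> set of digits still possible,
# with 1-based coordinates and (1, 1) at the top left.
DIGITS = set(range(1, 10))

def new_grid():
    """Return a 9x9 grid in which every cell can still hold any digit 1-9."""
    return {(r, c): set(DIGITS) for r in range(1, 10) for c in range(1, 10)}

def show(grid):
    """Format the grid as nine lines of space-separated candidate strings."""
    return "\n".join(
        " ".join("".join(str(d) for d in sorted(grid[r, c])) for c in range(1, 10))
        for r in range(1, 10)
    )

grid = new_grid()
print(show(grid).splitlines()[0])
# 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789
```

Using sets makes "crossing out" a value a single `discard` call, and a cell is solved when its set has exactly one member.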
Applying the Given Numbers

In the example above we have been given the values for 27 of the cells. The first thing to do is
to apply these to our grid and eliminate all the values that are inconsistent with these.
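The elimination step might be sketched in Python as follows. The data layout and the names `peers` and `assign` are assumptions for illustration, not the author's code. The sketch also cascades: whenever crossing out a value leaves a cell with a single candidate, that cell is treated as a new known value.

```python
def peers(cell):
    """Cells sharing a row, column or 3x3 block with `cell` (1-based coordinates)."""
    r, c = cell
    br, bc = 3 * ((r - 1) // 3) + 1, 3 * ((c - 1) // 3) + 1
    same_row = {(r, j) for j in range(1, 10)}
    same_col = {(i, c) for i in range(1, 10)}
    same_block = {(i, j) for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return (same_row | same_col | same_block) - {cell}

def assign(grid, cell, digit):
    """Fix `cell` to `digit` and cross the digit out of every peer.

    Peers reduced to one candidate are processed in turn (the cascade).
    Raises ValueError if the value is not allowed or a cell runs out of candidates.
    """
    if digit not in grid[cell]:
        raise ValueError(f"{digit} is not possible in {cell}")
    grid[cell] = {digit}
    pending = [cell]
    while pending:
        current = pending.pop()
        (value,) = grid[current]
        for peer in peers(current):
            if value in grid[peer]:
                grid[peer].discard(value)
                if not grid[peer]:
                    raise ValueError(f"no candidates left for {peer}")
                if len(grid[peer]) == 1:
                    pending.append(peer)

# Start from a full grid and enter a single given: a 1 in the 5th cell of the top row.
grid = {(r, c): set(range(1, 10)) for r in range(1, 10) for c in range(1, 10)}
assign(grid, (1, 5), 1)
print(sorted(grid[1, 1]))   # the 1 has gone from the rest of the top row
```

Raising an exception on an empty cell is a design choice that becomes useful in the later phases, where a guess is tested by seeing whether it leads to a contradiction.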
Working across the top row the first value we have is in the 5th cell. This is a 1. This means
that there can be no other '1's on the top row or in the 5th column, or in the top-middle block
of 9. Crossing out all these '1's leaves
23456789  23456789  23456789  23456789  1         23456789  23456789  23456789  23456789
123456789 123456789 123456789 23456789  23456789  23456789  123456789 123456789 123456789
123456789 123456789 123456789 23456789  23456789  23456789  123456789 123456789 123456789
123456789 123456789 123456789 123456789 23456789  123456789 123456789 123456789 123456789
123456789 123456789 123456789 123456789 23456789  123456789 123456789 123456789 123456789
123456789 123456789 123456789 123456789 23456789  123456789 123456789 123456789 123456789
123456789 123456789 123456789 123456789 23456789  123456789 123456789 123456789 123456789
123456789 123456789 123456789 123456789 23456789  123456789 123456789 123456789 123456789
123456789 123456789 123456789 123456789 23456789  123456789 123456789 123456789 123456789
Continuing this process, after entering the 9 given values on the top three rows we have
2368
2368
2368
23567
3578
3578
1239
1239
2357
2357
13579
13579
13689
346
346
346
1389
1389
12345689 12346789 12356789 12345679 23456789 12345678 12345789 12356789 13456789
12345689 12346789 12356789 12345679 23456789 12345678 12345789 12356789 13456789
12345689 12346789 12356789 12345679 23456789 12345678 12345789 12356789 13456789
12345689 12346789 12356789 12345679 23456789 12345678 12345789 12356789 13456789
12345689 12346789 12356789 12345679 23456789 12345678 12345789 12356789 13456789
12345689 12346789 12356789 12345679 23456789 12345678 12345789 12356789 13456789
This hasn't made much impact on the bottom two thirds of the grid because so far we've only
had one value in each column. But we've got 18 more values to place. When we get to '6' in
the 6th column of the 6th row the grid is becoming a lot sparser and we notice that the only
possible value left for the sixth cell on the third row is '4'. So we have found our first missing
number (shown in green).
2368
2368
2368
23567
3578
3578
1239
123
2357
57
13579
13579
13689
36
346
1389
1389
368
3678
578
3578
35678
12468
124678 12678
579
5789
125789
1256789 156789
12378
789
123789
123789
12378
13789
123468 1234678 12356789 235679 23456789 14578 12345789 12356789 1356789

9
123468 1234678 12356789 235679 23456789 14578 12345789 12356789 1356789
9
123468 1234678 12356789 235679 23456789 14578 12345789 12356789 1356789
9
After we've used up all the given numbers (and the one we found) we are left with the grid.
236
2368
238
3567 1
1239 23
2357
1389
36
36
378
3578
578
57 6
159
1579
36
1389
189
578
578
568
1246 24678 1278
579
578
125789
125689 156789
278
1278
78
12789
1789
24
259
56
456
15 1259
239
237
357
357
1259
159
34
357
57 58
568
568
So we move onto phase 2, "Looking for Singletons".
Looking for Singletons

Phase two involves counting how often the undecided numbers appear in a set of cells (row,
column or 3x3 super-cell). If we start with the first column, the undecided cells are the 1st,
2nd, 4th, 5th, 8th and 9th. The possibilities are '236', '1239', '36', '1246', '239' and '34'. If we
count how often each undecided number appears (the known ones are excluded from this
process) we get
Number    1 2 3 4 5 6 7 8 9
Frequency 2 4 5 2 - 3 - - 2
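The counting can be reproduced with a few lines of Python. This is a minimal sketch in which each undecided cell is written as a string of its candidates; the function names are assumptions for illustration.

```python
from collections import Counter

def candidate_frequencies(unit):
    """Count how often each digit appears among the undecided cells of a unit.

    `unit` is a list of candidate strings, one per cell. Cells with a single
    candidate are treated as already decided and skipped.
    """
    counts = Counter()
    for candidates in unit:
        if len(candidates) > 1:
            counts.update(candidates)
    return counts

def singletons(unit):
    """Digits that appear in exactly one undecided cell of the unit."""
    return sorted(d for d, n in candidate_frequencies(unit).items() if n == 1)

# Undecided cells of the first column, as listed in the text.
column_1 = ["236", "1239", "36", "1246", "239", "34"]
freq = candidate_frequencies(column_1)
print({d: freq[d] for d in sorted(freq)})
# {'1': 2, '2': 4, '3': 5, '4': 2, '6': 3, '9': 2}
print(singletons(column_1))
# []

# A made-up unit in which 9 can only go in one place.
print(singletons(["12", "23", "139"]))
# ['9']
```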
Every undecided number appears at least twice, so this doesn't help us. Repeating this
exercise for columns 2 and 3 doesn't produce any useful information either. When we try
column 4 though, we get
Number
Frequency -
2 3
4 5
6 7
8 9
3 3
Now the '9' only appears once, in the 5th row, so that is where it must be. We have found our
second number. We can now set the '9' and repeat the process described in phase 1 giving.
236
2368
238
3567 1
3578
578
1239 23
2357
57 6
159
1579
1389
36
36
1389
189
36
378
578
578
568
1246 24678 1278
578
12578
12568 15678
278
1278
78
12789
1789
24
259
56
456
15 1259
239
237
357
357
1259
159
34
357
57 58
568
568
That didn't help very much, so we restart this phase at the beginning. The first four columns
offer no additional insights, but now the frequencies in column 5 are
Number    1 2 3 4 5 6 7 8 9
Frequency - 1 3 1 5 2 5 3 -
Which gives us two more discoveries, '2' on the second row and '4' on the 7th. Filling these in
using the phase 1 algorithm sets off a chain reaction...
Action       Consequences
(2,5) = 2    (2,2) = 3
(2,2) = 3    None
(7,5) = 4    (7,2) = 2
(7,2) = 2    (8,2) = 7
(8,2) = 7    (6,2) = 8
(6,2) = 8    (1,2) = 6, (6,5) = 7
(1,2) = 6    (1,1) = 2, (5,2) = 4
(6,5) = 7    None
(1,1) = 2    (1,3) = 8
(5,2) = 4    None
(1,3) = 8    None
Co-ordinates are (row, column) with the top left being (1,1). The grid now looks like this
(with the new discoveries highlighted).
2
6 8
357 1
357
57
19 3 4
57 6
159
1579
36
36
1389
189
36 9 37
58
578
568
16 4 127
58
12578
12568 15678
8 12
129
19
2 59
56
15 159
39 7 6
35
35
1259
159
34 1 35
57 58
568
568
5 19
Restarting this phase and counting down the columns shows us that the bottom cell in the first
column must be '4'. This doesn't lead to any further discoveries. Trying again reveals that the
top cell in the 4th column must be '7'. This second discovery forces both (1,9) and (2,6) to be
'5', which in turn forces three more discoveries, leaving
2
6 8
9 3
19 3 4
5 6
19
179
36 36
4 189
189
36 9 37
58
2 578
568
16 4 127
58
3 12578
12568 1678
8 12
6 129
19
2 59
56 4
1 59
39 7 6
35 35
8 4
1259
19
7 58
568
68
5 19
1 35
Searching the columns again we see that the third cell in the fifth column must be a 6. Setting
this triggers another cascade of discoveries which culminates with the solution.
2 6 8 7 1 9 3 4 5
1 3 4 8 2 5 6 9 7
7 5 9 3 6 4 1 8 2
3 9 7 1 8 2 5 6 4
6 4 2 9 5 3 7 1 8
5 8 1 4 7 6 2 3 9
8 2 5 6 4 1 9 7 3
9 7 6 5 3 8 4 2 1
4 1 3 2 9 7 8 5 6
For this example that is as far as we need to go, which is why I've described it as "Easy". I
don't have any examples, but any puzzle that can be solved using only the first phase
could be described as "Trivial". The next sections discuss some more difficult examples of
the problem.
Pairs
The first two phases are sufficient to solve most examples published in popular newspapers,
but here is a more difficult example someone sent me. It only has 23 given numbers, 4 fewer
than the previous example.
2
5
6
3
1
9
8
7
3 9
7 6
5
7
8
9
8 1
9
After applying phases 1 and 2 we are still left with.
134
379
134
1578
14589
14
679
13469
1234
127
1249
3479
79
1349
279
124
1249
124
479
238
23
29
289
128
128
12
28
28
12345
125
1245
349
259
2349
234
23
2345
2456
2346
1248
28
125
12456
47
2567
246
This still appears to have a long way to go. If we examine the second column we can see two
things of interest.
1. The numbers 7 and 9 only appear twice, and when they do appear
they appear together.
2. There are two cells that can only contain 2 or 8.
The first of these rules allows us to simplify cells (1,2) and (3,2), which currently show the
options 3,7,9 and 2,7,9. Because 7 and 9 only appear in these two cells, one must contain 7
and the other 9, though we don't know which is which. This does, though, allow us to remove
the possibilities that these cells contain a 3 or a 2.
The second of these rules applies to cells (6,2) and (9,2), both of which can only contain a 2
or an 8. Again this means that one of the cells must contain 2 and the other 8. Although we
don't know which is which, we do know that none of the other cells in the column can be a 2 or
an 8. This allows us to remove the twos from (3,2) and (8,2), though in this case (3,2) has
already had its two removed using the previous rule.
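Both rules can be sketched in Python for a single unit (row, column or block) held as a list of candidate sets. The function names are assumptions. The example column is partly invented: cells (1,2), (3,2), (6,2) and (9,2) follow the text, (8,2) is assumed to hold 2 and 3, and the remaining cells are made-up solved values.

```python
from itertools import combinations

def hidden_pairs(unit):
    """If two digits occur only in the same two cells, strip those cells' other candidates."""
    where = {d: [k for k, cell in enumerate(unit) if d in cell] for d in range(1, 10)}
    for a, b in combinations(range(1, 10), 2):
        if len(where[a]) == 2 and where[a] == where[b]:
            for k in where[a]:
                unit[k] &= {a, b}

def naked_pairs(unit):
    """If two cells hold the same two candidates, remove both digits from the other cells."""
    for i, j in combinations(range(len(unit)), 2):
        if len(unit[i]) == 2 and unit[i] == unit[j]:
            for k in range(len(unit)):
                if k not in (i, j):
                    unit[k] -= unit[i]

# Hypothetical second column, listed from row 1 to row 9.
column_2 = [{3, 7, 9}, {1}, {2, 7, 9}, {5}, {6}, {2, 8}, {4}, {2, 3}, {2, 8}]
hidden_pairs(column_2)   # rule 1: 7 and 9 only appear in rows 1 and 3
naked_pairs(column_2)    # rule 2: rows 6 and 9 can only be 2 or 8
print(column_2[0], column_2[2], column_2[7])
```

After both calls rows 1 and 3 hold only 7 and 9, and row 8 is reduced to a single 3.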
The table now looks like
134
79
134
1578
14589
14
679
13469
1234
127
1249
3479
79
1349
79
124
1249
124
479
238
23
29
289
128
128
12
28
28
12345
125
1245
349
259
2349
234
2345
2456
2346
1248
28
125
12456
47
2567
246
With the changed cells in green. The fact that 3 is now in a cell on its own means that we can
eliminate all the other 3's in the same block or row.
134
79
134
1578
14589
14
679
13469
1234
127
1249
3479
79
1349
79
124
1249
124
479
238
23
29
289
128
128
12
28
28
1245
125
1245
349
259
2349
24
245
2456
246
1248
28
125
12456
47
2567
246
In this case there are no more pairs that can be used to simplify the grid, so we move onto
phase 4, Eliminating Bad Guesses.
Eliminating Bad Guesses

This phase is simple to describe but harder to implement. In the previous example, after
exhausting phase 3 we are still left with
134
79
134
1578
14589
14
679
13469
1234
127
1249
3479
79
1349
79
124
1249
124
479
238
23
29
289
128
128
12
28
28
1245
125
1245
349
259
2349
24
245
2456
246
1248
28
125
12456
47
2567
246
The next phase involves making guesses and seeing the consequences. There are three
possible outcomes:
1. the search runs to completion with a solution
2. the search leads to a contradiction
3. the search is inconclusive
The first outcome appears at first glance to be ideal, but at this point we don't know that
there is only one solution, so we ignore it. Instead we look for outcome 2, the contradiction,
because then we know that the guess must be wrong. Looking at the table above there are 141
possible guesses, starting with the top left cell being 1.
This part of the algorithm involves making that guess and then applying the three previous
phases to see where it leads us. In this case setting cell (1,1) to 1 leads to a contradiction so
we can eliminate that guess leaving only 3 or 4 in the top left cell. Trying 3 causes no
problems but 4 leads to a contradiction, so we can eliminate that too.
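A rough Python sketch of this trial-and-discard loop follows. To keep it short it tests each guess with the phase 1 cascade only, whereas the text applies all three earlier phases. All names are assumptions, and the starting position is a made-up one rather than the puzzle above.

```python
def peers(cell):
    """Cells sharing a row, column or 3x3 block with `cell` (1-based coordinates)."""
    r, c = cell
    br, bc = 3 * ((r - 1) // 3) + 1, 3 * ((c - 1) // 3) + 1
    return ({(r, j) for j in range(1, 10)} | {(i, c) for i in range(1, 10)}
            | {(i, j) for i in range(br, br + 3) for j in range(bc, bc + 3)}) - {cell}

def assign(grid, cell, digit):
    """Fix a cell and cascade the eliminations; ValueError signals a contradiction."""
    if digit not in grid[cell]:
        raise ValueError(f"{digit} is not possible in {cell}")
    grid[cell] = {digit}
    pending = [cell]
    while pending:
        current = pending.pop()
        (value,) = grid[current]
        for peer in peers(current):
            if value in grid[peer]:
                grid[peer].discard(value)
                if not grid[peer]:
                    raise ValueError(f"no candidates left for {peer}")
                if len(grid[peer]) == 1:
                    pending.append(peer)

def eliminate_bad_guesses(grid):
    """Try every candidate on a copy of the grid and drop those that contradict.

    Returns the number of candidates removed from the real grid.
    """
    removed = 0
    for cell in sorted(grid):
        for digit in sorted(grid[cell]):
            if len(grid[cell]) == 1:
                break                                        # nothing left to guess here
            trial = {k: set(v) for k, v in grid.items()}     # work on a copy
            try:
                assign(trial, cell, digit)
            except ValueError:
                grid[cell].discard(digit)                    # outcome 2: the guess was wrong
                removed += 1
                if len(grid[cell]) == 1:
                    (only,) = grid[cell]
                    assign(grid, cell, only)                 # fix the value that remains
    return removed

# Hypothetical position: (1,2) and (1,3) can only be 1 or 3, and (1,1) only 1 or 2.
grid = {(r, c): set(range(1, 10)) for r in range(1, 10) for c in range(1, 10)}
grid[1, 1], grid[1, 2], grid[1, 3] = {1, 2}, {1, 3}, {1, 3}
removed = eliminate_bad_guesses(grid)
print(grid[1, 1], removed)
```

Guessing 1 in (1,1) leaves both (1,2) and (1,3) needing a 3, so the guess is discarded and (1,1) is fixed at 2. Guesses that neither complete nor contradict are simply left alone, which matches outcome 3.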
Now we only have 3 left so that must be the value in the top left cell. Fixing this value and
applying the previous phases leads to the solution
3 9
4 5
6 7
8 1
9 4
5 2
7 6
2 3
1 8
If All Else Fails

If we take the example from the start of phase 3 (Pairs) and remove the '6' on the fifth row we
are left with this puzzle.
2
5
6
3
5
7
3 9
7 6
5
7
8
9
8 1
9
After applying the first 4 phases we are left with
134
79 134
1234 5
78
1589
45 2
67
127
129
379 134
3479
1346
126
79 26
129
14 479
25 6
239 39
34 346
168
68
12 5
23
56
56
1234
12
1245
349
359 2349
2456
35
2345 34 2345
2346
145
1256 12456
47
567 246
Phase 5 steps through the possible guesses sequentially, working through the possible values
recursively until it finds a solution or a contradiction. The first cell (1,1) can be 1, 3
or 4. Choosing 1 gives
1
79 34
59
45 2
67
346
34
39
26
79 26
19
14 479
25 6
239 39
34 346
16
12 5
23
56
56
45
349
359 349
2345 34 2345
456
45
156 1456
47
567 2
8
35
349
The next guess is to try 7 in cell (1,2). This leads to

1 7 3
8 9
5 2
4 5 8
7 2
6 39
39 1
2 9 6
3 1
4 7
8 1 5
4 7
2 6
39 39
9 3 4
6 8
1 5
6 2 7
5 3
9 1
7 6 1
2 4
8 39
39
3 4 2
9 5
7 8
5 8 9
1 6
3 4
The third guess, setting (2,7) to 3 leads to a solution

1 7 3 8 9 5 2 6 4
4 5 8 7 2 6 3 9 1
2 9 6 3 1 4 7 8 5
8 1 5 4 7 2 6 3 9
9 3 4 6 8 1 5 2 7
6 2 7 5 3 9 1 4 8
7 6 1 2 4 8 9 5 3
3 4 2 9 5 7 8 1 6
5 8 9 1 6 3 4 7 2
Stepping back one guess and trying (2,7) = 9 also leads to a solution. All in all there are 32
possibilities, see here.
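The enumeration can be sketched as a recursive Python generator. This is an illustration under assumed names, using only the phase 1 cascade between guesses, and the example puzzle is a made-up one (not the puzzle above) chosen because it clearly has more than one solution.

```python
def peers(cell):
    """Cells sharing a row, column or 3x3 block with `cell` (1-based coordinates)."""
    r, c = cell
    br, bc = 3 * ((r - 1) // 3) + 1, 3 * ((c - 1) // 3) + 1
    return ({(r, j) for j in range(1, 10)} | {(i, c) for i in range(1, 10)}
            | {(i, j) for i in range(br, br + 3) for j in range(bc, bc + 3)}) - {cell}

def assign(grid, cell, digit):
    """Fix a cell and cascade the eliminations; ValueError signals a contradiction."""
    if digit not in grid[cell]:
        raise ValueError(f"{digit} is not possible in {cell}")
    grid[cell] = {digit}
    pending = [cell]
    while pending:
        current = pending.pop()
        (value,) = grid[current]
        for peer in peers(current):
            if value in grid[peer]:
                grid[peer].discard(value)
                if not grid[peer]:
                    raise ValueError(f"no candidates left for {peer}")
                if len(grid[peer]) == 1:
                    pending.append(peer)

def from_givens(rows):
    """Build a candidate grid from nine strings, using '.' for an empty square."""
    grid = {(r, c): set(range(1, 10)) for r in range(1, 10) for c in range(1, 10)}
    for r, line in enumerate(rows, start=1):
        for c, ch in enumerate(line, start=1):
            if ch != ".":
                assign(grid, (r, c), int(ch))
    return grid

def search(grid):
    """Yield every completed grid reachable from `grid`, guessing cell by cell."""
    undecided = sorted(cell for cell in grid if len(grid[cell]) > 1)
    if not undecided:
        yield {cell: min(cands) for cell, cands in grid.items()}
        return
    cell = undecided[0]                       # first undecided cell, as in the text
    for digit in sorted(grid[cell]):
        trial = {k: set(v) for k, v in grid.items()}
        try:
            assign(trial, cell, digit)
        except ValueError:
            continue                          # contradiction: step back, try the next value
        yield from search(trial)

# Hypothetical puzzle: a completed grid with every 1 and 2 rubbed out.
puzzle = ["..3456789", "456789..3", "789..3456",
          ".3456789.", "56789..34", "89..34567",
          "3456789..", "6789..345", "9..345678"]
found = list(search(from_givens(puzzle)))
print(len(found))
```

Because `search` is a generator, a caller can stop after the first solution or keep going to count them all.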
Download the Script

There are two scripts to choose from. Click on the link to download the corresponding one.
SuSolver.py
The new script. This implements the 5 phases described above except the first part of
phase 3 (looking for numbers that only appear twice and in pairs). This solves all the
examples I have tried. When it can't solve a puzzle using the first 4 phases it enters
phase 5, which is intended to enumerate the possible solutions (on the assumption that
there is more than one). If there is only one solution and it isn't found after the first
4 phases, then phase 5 will find it.
OldSuSolver.py
This is the previous version I published. It is very similar, but in place of the current
phase 4 it went straight into a recursive algorithm. The problem with this version is that
it will always report a solution even when the puzzle doesn't have a unique one, picking the
first one it finds from the set of possibilities. I'm pretty confident, though, that if there
is a unique solution it will find it.
More information about running Python scripts can be found on the Python Patterns page.