This page describes an algorithm for solving Sudoku puzzles, which have suddenly become
popular in the UK, and provides a Python script to automate the process.
Updates:
Oct 9 2005 - Replaced the Python script with an easier-to-understand version that doesn't use recursion. Altered this page to reflect the new algorithm.
4
6
7 5
3
5
3
7 3
6
1
8 4
2 9
Introduction
The algorithm works by writing out all the possible answers and then eliminating those that
are incompatible with the numbers published in the puzzle. The starting position is that every
cell can contain any of the numbers 1-9. We represent this by writing all the possible values
into each cell, which gives us a grid that looks like this.
123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789
123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789
123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789
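As a rough illustration of this starting position, here is a minimal Python sketch. It is not the script offered at the end of this page; the names new_grid, peers and place are made up for the example, and cells are indexed from 0 rather than 1. Each cell holds a set of candidates, and placing a number strikes it from every cell in the same row, column and block, following up any cell that is left with a single candidate.

```python
def new_grid():
    """Every cell starts out able to hold any of the numbers 1-9."""
    return [[set(range(1, 10)) for _ in range(9)] for _ in range(9)]


def peers(r, c):
    """Cells sharing a row, column or 3x3 block with (r, c), 0-based."""
    br, bc = 3 * (r // 3), 3 * (c // 3)
    cells = {(r, k) for k in range(9)} | {(k, c) for k in range(9)}
    cells |= {(br + i, bc + j) for i in range(3) for j in range(3)}
    cells.discard((r, c))
    return cells


def place(grid, r, c, value):
    """Fix a cell, strike the value from its peers, and follow up new singles."""
    todo = [(r, c, value)]          # a work list, so no recursion is needed
    while todo:
        r, c, value = todo.pop()
        grid[r][c] = {value}
        for pr, pc in peers(r, c):
            cell = grid[pr][pc]
            if value in cell and len(cell) > 1:
                cell.discard(value)
                if len(cell) == 1:  # only one option left: treat it as found
                    todo.append((pr, pc, next(iter(cell))))


grid = new_grid()
place(grid, 0, 0, 4)                # a made-up given: 4 in the top left cell
```

Calling place once for each given number in the published puzzle would produce the kind of reduced grid shown in the rest of this page. The sketch does not check for contradictions, which a full solver would need to do.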
2368
2368
23567
3578
3578
1239
1239
2357
2357
13579
13579
13689
346
346
346
1389
1389
2368
2368
23567
3578
3578
1239
123
2357
57
13579
13579
13689
36
346
1389
1389
368
3678
578
3578
35678
12468
124678 12678
579
5789
125789
1256789 156789
12378
789
123789
123789
12378
13789
9
After we've used up all the given numbers (and the one we found) we are left with the
following grid.
236
2368
238
3567 1
1239 23
2357
1389
36
36
378
3578
578
57 6
159
1579
36
1389
189
578
578
568
579
578
125789
125689 156789
278
1278
78
12789
1789
24
259
56
456
15 1259
239
237
357
357
1259
159
34
357
57 58
568
568
2 3
4 5
6 7
8 9
Frequency 2
4 5
2 -
3 -
Every undecided number appears at least twice, so this doesn't help us. Repeating this
exercise for columns 2 and 3 doesn't produce any useful information either. When we try
column 4 though, we get
Number
Frequency -
2 3
4 5
6 7
8 9
3 3
Now the '9' appears only once, in the 5th row, so that is where it must be. We have found our
second number. We can now set the '9' and repeat the process described in phase 1, giving the following grid.
236
2368
238
3567 1
3578
578
1239 23
2357
57 6
159
1579
1389
36
36
1389
189
36
378
578
578
568
578
12578
12568 15678
278
1278
78
12789
1789
24
259
56
456
15 1259
239
237
357
357
1259
159
34
357
57 58
568
568
That didn't help very much, so we restart this phase at the beginning. The first four columns
offer no additional insights, but now the frequencies in column 5 are
Number     1 2 3 4 5 6 7 8 9
Frequency  - 1 3 1 5 2 5 3 -
Which gives us two more discoveries, '2' on the second row and '4' on the 7th. Filling these in
using the phase 1 algorithm sets off a chain reaction...
Action       Consequences
(2,5) = 2    (2,2) = 3
(2,2) = 3    None
(7,5) = 4    (7,2) = 2
(7,2) = 2    (8,2) = 7
(8,2) = 7    (6,2) = 8
(6,2) = 8    (1,2) = 6, (6,5) = 7
(1,2) = 6    (1,1) = 2, (5,2) = 4
(6,5) = 7    None
(1,1) = 2    (1,3) = 8
(5,2) = 4    None
(1,3) = 8    None
Co-ordinates are (row, column) with the top left being (1,1). The grid now looks like this
(with the new discoveries highlighted).
2
6 8
357 1
357
57
19 3 4
57 6
159
1579
36
36
1389
189
36 9 37
58
578
568
16 4 127
58
12578
12568 15678
8 12
129
19
2 59
56
15 159
39 7 6
35
35
1259
159
34 1 35
57 58
568
568
5 19
Restarting this phase and counting down the columns shows us that the bottom cell in the first
column must be '4'. This doesn't lead to any further discoveries. Trying again reveals that the
top cell in the 4th column must be '7'. This second discovery forces both (1,9) and (2,6) to be
'5', which in turn forces three more discoveries, leaving the following grid.
2
6 8
9 3
19 3 4
5 6
19
179
36 36
4 189
189
36 9 37
58
2 578
568
16 4 127
58
3 12578
12568 1678
8 12
6 129
19
2 59
56 4
1 59
39 7 6
35 35
8 4
1259
19
7 58
568
68
5 19
1 35
Searching the columns again we see that the third cell in the fifth column must be a 6. Setting
this triggers another cascade of discoveries which culminates with the solution.
2 6
8 7
1 9
3 4
1 3
4 8
2 5
6 9
7 5
9 3
6 4
1 8
3 9
7 1
8 2
5 6
6 4
2 9
5 3
7 1
5 8
1 4
7 6
2 3
8 2
5 6
4 1
9 7
9 7
6 5
3 8
4 2
4 1
3 2
9 7
8 5
For this example that is as far as we need to go, which is why I've described it as "Easy". I
don't have any examples, but any patterns that can be solved only by using the first phase
could be described as "Trivial". The next sections discuss some more difficult examples of
the problem.
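The frequency counting used in the second phase above can be sketched in a few lines of Python. This is only an illustration, not the script offered at the end of this page: the function name singles_in_column is made up, and the example column is invented rather than taken from the puzzle. A column is a list of nine candidate sets, top to bottom, and a number that occurs in only one undecided cell must go there.

```python
from collections import Counter


def singles_in_column(column):
    """Return {number: row} for numbers that fit only one undecided cell."""
    # Count how often each number occurs among the undecided cells.
    counts = Counter(n for cell in column if len(cell) > 1 for n in cell)
    found = {}
    for n, freq in counts.items():
        if freq == 1:
            row = next(i for i, cell in enumerate(column)
                       if len(cell) > 1 and n in cell)
            found[n] = row + 1      # rows are numbered from 1, as in the text
    return found


# An invented column: 4, 5 and 7 are already decided.
column = [{1, 2}, {4}, {1, 2, 3}, {5}, {2, 3, 6},
          {3, 6, 8}, {7}, {6, 8}, {1, 8, 9}]
print(singles_in_column(column))    # {9: 9}
```

The same function works for any group of nine cells, so it could be applied to rows or blocks as well as columns.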
Pairs
The first two phases are sufficient to solve most examples published in popular newspapers,
but here is a more difficult example someone sent me. It has only 23 given numbers, 4 fewer
than the previous example.
2
5
6
3
1
9
8
7
3 9
7 6
5
7
8
9
8 1
9
After applying phases 1 and 2 we are still left with.
134
379
134
1578
14589
14
679
13469
1234
127
1249
3479
79
1349
279
124
1249
124
479
238
23
29
289
128
128
12
28
28
12345
125
1245
349
259
2349
234
23
2345
2456
2346
1248
28
125
12456
47
2567
246
This appears to have a long way to go still. If we examine the second column we can see two
things of interest.
1. The numbers 7 and 9 only appear twice, and when they do appear
they appear together.
2. There are two cells that can only contain 2 or 8.
The first of these rules will allow us to simplify cells (1,2) and (3,2) which currently show the
options 3,7,9 and 2,7,9. Because 7 and 9 only appear in these two cells, one must contain 7
and the other 9, though we don't know which is which. This does though allow us to remove
the possibilities that these cells contain a 3 or a 2.
The second of these rules applies to cells (6,2) and (9,2), both of which can only contain a 2
or an 8. Again this means that one of the cells must contain 2 and the other 8. Although we
don't know which is which, we do know that none of the other cells in the column can be a 2 or
an 8. This allows us to remove the twos from (3,2) and (8,2), though in this case (3,2) has
already had its two removed using the previous rule.
The table now looks like
134
79
134
1578
14589
14
679
13469
1234
127
1249
3479
79
1349
79
124
1249
124
479
238
23
29
289
128
128
12
28
28
12345
125
1245
349
259
2349
234
2345
2456
2346
1248
28
125
12456
47
2567
246
With the changed cells in green. The fact that 3 is now in a cell on its own means that we can
eliminate all the other 3's in the same block or row.
134
79
134
1578
14589
14
679
13469
1234
127
1249
3479
79
1349
79
124
1249
124
479
238
23
29
289
128
128
12
28
28
1245
125
1245
349
259
2349
24
245
2456
246
1248
28
125
12456
47
2567
246
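The two pair rules described above can be sketched in Python as follows. This is an illustration only, with made-up function names (the two rules are often called hidden pairs and naked pairs). The example column uses the candidates quoted in the text for cells (1,2), (3,2), (6,2), (8,2) and (9,2); the values shown for the other four cells are assumptions added to fill out the column.

```python
from itertools import combinations


def hidden_pairs(unit):
    """Rule 1: two numbers confined to the same two cells push out the rest."""
    where = {n: [i for i, cell in enumerate(unit) if n in cell]
             for n in range(1, 10)}
    for a, b in combinations(range(1, 10), 2):
        if len(where[a]) == 2 and where[a] == where[b]:
            for i in where[a]:
                unit[i] &= {a, b}       # keep only the pair in these cells


def naked_pairs(unit):
    """Rule 2: two cells holding the same two candidates claim those numbers."""
    for i, j in combinations(range(9), 2):
        if len(unit[i]) == 2 and unit[i] == unit[j]:
            for k in range(9):
                if k not in (i, j):
                    unit[k] -= unit[i]  # no other cell can hold either number


# Column 2, top to bottom; the single-value cells are assumed for the example.
column = [{3, 7, 9}, {5}, {2, 7, 9}, {6}, {1}, {2, 8}, {4}, {2, 3}, {2, 8}]
hidden_pairs(column)
naked_pairs(column)
print(column)
```

After both rules have run, the first and third cells hold only 7 and 9, and the eighth cell is reduced to 3, matching the changes described in the text.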
In this case there are no more pairs that can be used to simplify the grid, so we move onto
phase 4, Eliminating Bad Guesses.
79
134
1578
14589
14
679
13469
1234
127
1249
3479
79
1349
79
124
1249
124
479
238
23
29
289
128
128
12
28
28
1245
125
1245
349
259
2349
24
245
2456
246
1248
28
125
12456
47
2567
246
The next phase involves making guesses and seeing the consequences. There are three
possible outcomes,
1. the search runs to completion with a solution
2. the search leads to a contradiction
3. the search is inconclusive
The first outcome appears at first glance to be ideal, but at this moment we don't know that there
is only one solution, so we ignore it. Instead what we look for is outcome 2, the contradiction,
because we then know that the guess must be wrong. Looking at the table above there are 141
possible guesses, starting with the top left cell being 1.
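Testing a guess can be sketched in Python as shown here. This is a simplified illustration, not the script offered at the end of this page: all the names are made up, cells are indexed from 0, and the consequences of a guess are followed using only the simple elimination of the first phase rather than all three earlier phases. A guess is tried on a copy of the grid, and only a contradiction causes anything to be struck out.

```python
import copy


def peers(r, c):
    """Cells sharing a row, column or 3x3 block with (r, c), 0-based."""
    br, bc = 3 * (r // 3), 3 * (c // 3)
    cells = {(r, k) for k in range(9)} | {(k, c) for k in range(9)}
    cells |= {(br + i, bc + j) for i in range(3) for j in range(3)}
    cells.discard((r, c))
    return cells


def assign(grid, r, c, value):
    """Set a cell and eliminate; return False if some cell runs out of options."""
    todo = [(r, c, value)]
    while todo:
        r, c, value = todo.pop()
        if value not in grid[r][c]:
            return False
        grid[r][c] = {value}
        for pr, pc in peers(r, c):
            cell = grid[pr][pc]
            if value in cell:
                cell.discard(value)
                if not cell:
                    return False
                if len(cell) == 1:
                    todo.append((pr, pc, next(iter(cell))))
    return True


def try_guess(grid, r, c, value):
    """Try a guess on a copy and report which of the three outcomes occurred."""
    trial = copy.deepcopy(grid)
    if not assign(trial, r, c, value):
        return "contradiction"
    if all(len(cell) == 1 for row in trial for cell in row):
        return "solution"
    return "inconclusive"


def eliminate_bad_guesses(grid):
    """Strike every candidate whose trial ends in a contradiction."""
    removed = 0
    for r in range(9):
        for c in range(9):
            for value in sorted(grid[r][c]):
                if (len(grid[r][c]) > 1
                        and try_guess(grid, r, c, value) == "contradiction"):
                    grid[r][c].discard(value)
                    removed += 1
    return removed
```

A cell that is cut down to one candidate by eliminate_bad_guesses would then be fixed with assign so that the earlier phases can continue from there.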
This part of the algorithm involves making that guess and then applying the three previous
phases to see where it leads us. In this case setting cell (1,1) to 1 leads to a contradiction so
we can eliminate that guess leaving only 3 or 4 in the top left cell. Trying 3 causes no
problems but 4 leads to a contradiction, so we can eliminate that too.
Now we only have 3 left so that must be the value in the top left cell. Fixing this value and
applying the previous phases leads to the solution
3 9
4 5
6 7
8 1
9 4
5 2
7 6
2 3
1 8
6
3
5
7
3 9
7 6
5
7
8
9
8 1
9
After applying the first 4 phases we are left with
134
79 134
1234 5
78
1589
45 2
67
127
129
379 134
3479
1346
126
79 26
129
14 479
25 6
239 39
34 346
168
68
12 5
23
56
56
1234
12
1245
349
359 2349
2456
35
2345 34 2345
2346
145
1256 12456
47
567 246
Phase 5 works by working through the possible values recursively until it finds a solution or a
contradiction. It steps through the possible guesses sequentially. The first cell (1,1) can be 1, 3
or 4. Choosing 1 gives
1
79 34
59
45 2
67
346
34
39
26
79 26
19
14 479
25 6
239 39
34 346
16
12 5
23
56
56
45
349
359 349
2345 34 2345
456
45
156 1456
47
567 2
8
35
349
8 9
5 2
4 5 8
7 2
6 39
39 1
2 9 6
3 1
4 7
8 1 5
4 7
2 6
39 39
9 3 4
6 8
1 5
6 2 7
5 3
9 1
7 6 1
2 4
8 39
39
3 4 2
9 5
7 8
5 8 9
1 6
3 4
3 8
9 5
2 6
4 5
8 7
2 6
3 9
2 9
6 3
1 4
7 8
8 1
5 4
7 2
6 3
9 3
4 6
8 1
5 2
346
6 2
7 5
3 9
1 4
7 6
1 2
4 8
9 5
3 4
2 9
5 7
8 1
5 8
9 1
6 3
4 7
Stepping back one guess and trying (2,7) = 9 also leads to a solution. All in all there are 32
possibilities, see here.
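The recursive search of phase 5 can be sketched in Python as a plain backtracking generator. This is an illustration under simplifying assumptions, not the script offered below: the names candidates and solutions are made up, the grid holds decided numbers only (0 marks an empty cell), and no simplification is applied between guesses. It steps through the guesses for the first empty cell in sequence and yields every complete grid it reaches, so counting the results is one way a total such as the one quoted above could be obtained.

```python
def candidates(grid, r, c):
    """Numbers not yet used in the row, column or block of (r, c)."""
    used = set(grid[r]) | {grid[k][c] for k in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[br + i][bc + j] for i in range(3) for j in range(3)}
    return [n for n in range(1, 10) if n not in used]


def solutions(grid):
    """Yield every completed grid reachable from grid (0 = empty cell)."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                # Try each guess in turn; no candidates means a contradiction.
                for n in candidates(grid, r, c):
                    grid[r][c] = n
                    yield from solutions(grid)
                grid[r][c] = 0      # step back one guess
                return
    yield [row[:] for row in grid]  # no empty cells left: a solution
```

For example, len(list(solutions(puzzle))) gives the number of solutions of a 9x9 list of lists called puzzle, and next(solutions(puzzle)) gives just the first one found.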
OldSuSolver.py
More information about running Python scripts can be found on the Python Patterns page.