You are on page 1of 12

Sudoku Solver

This page describes an algorithm for solving Sudoku puzzles, which have suddenly become
popular in the UK, and provides a Python script to automate the process.

The "Easy" Example


Introduction
Applying the Given Numbers
Looking for Singletons
Pairs
Eliminating Bad Guesses
If All Else Fails

Download the Script

Updates:
Oct 9 2005 - Replaced the Python
script with an easier to understand
version that doesn't use recursion.
Altered this page to reflect the new
algorithm..

The "Easy" Example


1
4

4
6

7 5

3
5

3
7 3

6
1

We will start off with an easy example. This one


appeared in the Evening Standard on 31st May
2005, where it was described as "Hard". It has 27
squares filled in, the examples they describe as
"Easy" have 32. This doesn't make much difference
to a computer algorithm, though if you were working
through this by hand the more initial numbers you
are given the quicker you should be able to solve it.
I call it "easy" as it can be solved after applying only
the simpler parts of the algorithm.

8 4
2 9

Introduction
The algorithm works by writing out all the possible answers and then eliminating those that
are incompatible with the numbers published in the puzzle. The starting position is that every
cell can contain any of the numbers 1-9. We represent this by writing all the possible values
into each cell, which gives us a grid that looks like this.
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567

789

89

89

89

89

89

89

89

89

123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567


789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
The overall approach is to eliminate the impossible so that what remains must be the answer.

Applying the Given Numbers


In the example above we have been given the values for 27 of the cells. The first thing to do is
to apply these to our grid and eliminate all the values that are inconsistent with these.
Working across the top row the first value we have is in the 5th cell. This is a 1. This means
that there can be no other '1's on the top row or in the 5th column, or in the top-middle block
of 9. Crossing out all these '1's leaves
234567 2345678 2345678 2345678 1
89
9
9
9

2345678 2345678 2345678 2345678


9
9
9
9

123456 1234567 1234567 2345678 234567 2345678 1234567 1234567 1234567


789
89
89
9
89
9
89
89
89
123456 1234567 1234567 2345678 234567 2345678 1234567 1234567 1234567
789
89
89
9
89
9
89
89
89
123456 1234567 1234567 1234567 234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89
123456 1234567 1234567 1234567 234567 1234567 1234567 1234567 1234567
789
89
89
89
89
89
89
89
89

123456 1234567 1234567 1234567 234567 1234567 1234567 1234567 1234567


789
89
89
89
89
89
89
89
89
Continuing this process; after entering the 9 given values on the top three rows we have
2368

2368

2368

23567

3578

3578

1239

1239

2357

2357

13579

13579

13689

346

346

346

1389

1389

123456 1234678 1235678 1234567 2345678 1234567 1234578 1235678 1345678


89
9
9
9
9
8
9
9
9
123456 1234678 1235678 1234567 2345678 1234567 1234578 1235678 1345678
89
9
9
9
9
8
9
9
9
123456 1234678 1235678 1234567 2345678 1234567 1234578 1235678 1345678
89
9
9
9
9
8
9
9
9
123456 1234678 1235678 1234567 2345678 1234567 1234578 1235678 1345678
89
9
9
9
9
8
9
9
9
123456 1234678 1235678 1234567 2345678 1234567 1234578 1235678 1345678
89
9
9
9
9
8
9
9
9
123456 1234678 1235678 1234567 2345678 1234567 1234578 1235678 1345678
89
9
9
9
9
8
9
9
9
This hasn't made much impact on the bottom two thirds of the grid because so far we've only
had one value in each column. But we've got 18 more values to place. When we get to '6' in
the 6th column of the 6th row the grid is becoming a lot sparser and we notice that the only
possible value left for the sixth cell on the third row is '4'. So we have found our first missing
number (shown in green).
2368

2368

2368

23567

3578

3578

1239

123

2357

57

13579

13579

13689

36

346

1389

1389

368

3678

578

3578

35678

12468

124678 12678

579

5789

125789

1256789 156789

12378

789

123789

123789

12378

13789

123468 1234678 12356789 235679 23456789 14578 12345789 12356789 1356789


9
123468 1234678 12356789 235679 23456789 14578 12345789 12356789 1356789
9
123468 1234678 12356789 235679 23456789 14578 12345789 12356789 1356789

9
After we've used up all the given numbers (and the one we found) we we are left with the
grid.
236

2368

238

3567 1

1239 23

2357

1389

36

36

378

3578

578

57 6

159

1579

36

1389

189

578

578

568

1246 24678 1278

579

578

125789

125689 156789

278

1278

78

12789

1789

24

259

56

456

15 1259

239

237

357

357

1259

159

34

357

57 58

568

568

So we move onto phase 2, "Looking for Singletons".

Looking for Singletons


Phase two involves counting how often the undecided numbers appear in a set of cells (row,
column or 3x3 super-cell). If we start with the first column, the undecided cells are the 1st,
2nd, 4th, 5th, 8th and 9th. The possibilities are '236', '1239', '36', '1246', '239' and '34'. If we
count how often each undecided number appears (the known ones are excluded from this
process) we get
Number

2 3

4 5

6 7

8 9

Frequency 2

4 5

2 -

3 -

Every undecided number appears at least twice, so this doesn't help us. Repeating this
exercise for columns 2 and 3 doesn't produce any useful information either. When we try
column 4 though, we get
Number

Frequency -

2 3

4 5

6 7

8 9

3 3

Now the '9' only appears once, in the 5th row, so that is where it must be. We have found our
second number. We can now set the '9' and repeat the process described in phase 1 giving.
236

2368

238

3567 1

3578

578

1239 23

2357

57 6

159

1579

1389

36

36

1389

189

36

378

578

578

568

1246 24678 1278

578

12578

12568 15678

278

1278

78

12789

1789

24

259

56

456

15 1259

239

237

357

357

1259

159

34

357

57 58

568

568

That didn't help very much, so we restart this phase at the beginning. The first four columns
offer no additional insights, but now the frequencies in column 5 are
Number

2 3

4 5

6 7

8 9

Frequency -

1 3

1 5

2 5

3 -

Which gives us two more discoveries, '2' on the second row and '4' on the 7th. Filling these in
using the phase 1 algorithm sets off a chain reaction...
Action

Consequences

(2,5) = 2 (2,2) = 3
(2,2) = 3 None
(7,5) = 4 (7,2) = 2
(7,2) = 2 (8,2) = 7
(8,2) = 7 (6,2) = 8
(6,2) = 8 (1,2) = 6, (6,5) = 7
(1,2) = 6 (1,1) = 2, (5,2) = 4
(6,5) = 7 None
(1,1) = 2 (1,3) = 8
(5,2) = 4 None
(1,3) = 8 None
Co-ordinates are (row, column) with the top left being (1,1). The grid now looks like this
(with the new discoveries highlighted).
2

6 8

357 1

357

57

19 3 4

57 6

159

1579

36

36

1389

189

36 9 37

58

578

568

16 4 127

58

12578

12568 15678

8 12

129

19

2 59

56

15 159

39 7 6

35

35

1259

159

34 1 35

57 58

568

568

5 19

Restartind g this phase counting down the columns shows us that the bottom cell in the first
column must be '4'. This doesn't lead to any further discoveries. Trying again reveals that the
top cell in the 4th column must be '7'. This second discovery forces both (1,9) and (2,6) to be
'5', which in turn forces three more discoveries, leaving.
2

6 8

9 3

19 3 4

5 6

19

179

36 36

4 189

189

36 9 37

58

2 578

568

16 4 127

58

3 12578

12568 1678

8 12

6 129

19

2 59

56 4

1 59

39 7 6

35 35

8 4

1259

19

7 58

568

68

5 19

1 35

Searching the columns again we see that the third cell in the fifth column must be a 6. Setting
this triggers another cascade of discoveries which culminates with the solution.
2 6

8 7

1 9

3 4

1 3

4 8

2 5

6 9

7 5

9 3

6 4

1 8

3 9

7 1

8 2

5 6

6 4

2 9

5 3

7 1

5 8

1 4

7 6

2 3

8 2

5 6

4 1

9 7

9 7

6 5

3 8

4 2

4 1

3 2

9 7

8 5

For this example that is as far as we need to go, which is why I've described it as "Easy". I
don't have any examples, but any patterns that can be solved only by using the first phase
could be described as "Trivial". The next sections discuss some more difficult examples of
the problem.

Pairs
The first two phases are sufficient to solve most examples published in popular newspapers,
but here is a more difficult example someone sent me. It only has 23 given numbers, 4 less
than the previous example.
2
5

6
3

1
9

8
7

3 9

7 6

5
7

8
9

8 1

9
After applying phases 1 and 2 we are still left with.
134

379

134

1578

14589

14

679

13469

1234

127

1249

3479

79

1349

279

124

1249

124

479

238

23

29

289

128

128

12

28

28

12345

125

1245

349

259

2349

234

23

2345

2456

2346

1248

28

125

12456

47

2567

246

Which appears to have still a long way to go. If we examine the second column we can see
things of interest.
1. The numbers 7 and 9 only appear twice, and when the do appear
they appear together.
2. The are two cells that can only contain 2 or 8.
The first of these rules will allow us to simplify cells (1,2) and (3,2) which currently show the
options 3,7,9 and 2,7,9. Because 7 and 9 only appear in these two cells, one must contain 7
and the other 9, though we don't know which is which. This does though allow us to remove
the possibilities that these cells contain a 3 or a 2.
The second of these rules applies to cells (6,2) and (9,2), both of which can only contain a 2
or an 8. Again this means that one of the cells must contain 2 and the other 8. Although we
don't know which is which we do no that none of the other cells in the column can be a 2 or
an 8. This allows us to remove the twos from (3,2) and (8,2), though in this case (3,2) has
already had its two removed using the previous rule.
The table now looks like
134

79

134

1578

14589

14

679

13469

1234

127

1249

3479

79

1349

79

124

1249

124

479

238

23

29

289

128

128

12

28

28

12345

125

1245

349

259

2349

234

2345

2456

2346

1248

28

125

12456

47

2567

246

With the changed cells in green. The fact that 3 is now in a cell on it's own means that we can
eliminate all the other 3's in the same block or row.
134

79

134

1578

14589

14

679

13469

1234

127

1249

3479

79

1349

79

124

1249

124

479

238

23

29

289

128

128

12

28

28

1245

125

1245

349

259

2349

24

245

2456

246

1248

28

125

12456

47

2567

246

In this case there are no more pairs that can be used to simplify the grid, so we move onto
phase 4, Eliminating Bad Guesses.

Eliminating Bad Guesses


This phase is simple to described but harder to implement. In the previous example, after
exhausting phase 3 we are still left with
134

79

134

1578

14589

14

679

13469

1234

127

1249

3479

79

1349

79

124

1249

124

479

238

23

29

289

128

128

12

28

28

1245

125

1245

349

259

2349

24

245

2456

246

1248

28

125

12456

47

2567

246

The next phase involves making guesses and seeing the consequences. There are three
possible outcomes,
1. the search runs to completion with a solution
2. the search leads to a contradiction
3. the search is inconclusive
This first outcome at first glance appears to be ideal, but at this moment we don't know there
is only one solution so we ignore it. Instead what we look for is outcome 2, the contradiction
becase we now know that that guess must be wrong. Looking at the table above there are 141
possible guesses, starting with the top left cell being 1.
This part of the algorithm involves making that guess and then applying the three previous
phases to see where it leads us. In this case setting cell (1,1) to 1 leads to a contradiction so
we can eliminate that guess leaving only 3 or 4 in the top left cell. Trying 3 causes no
problems but 4 leads to a contradiction, so we can eliminate that too.
Now we only have 3 left so that must be the value in the top left cell. Fixing this value and
applying the previous phases leads to the solution
3 9

4 5

6 7

8 1

9 4

5 2

7 6

2 3

1 8

If All Else Fails


If we take the example from the start of phase 3 (Pairs) and remove the '6' on the fifth row we
are left with this puzzle.
2
5

6
3

5
7

3 9

7 6

5
7

8
9

8 1

9
After applying the first 4 phases we are left with
134

79 134

1234 5

78

1589

45 2

67

127

129

379 134

3479

1346

126

79 26

129

14 479

25 6

239 39

34 346

168

68

12 5

23

56

56

1234

12

1245

349

359 2349

2456

35

2345 34 2345

2346

145

1256 12456

47

567 246

Phase 5 works by working through the possible values recursively until it finds a solution or a
contradiction. It steps through the possible guesses sequentially. The first cell (1,1) can be 1, 3
or 4. Choosing 1 gives
1

79 34

59

45 2

67

346

34

39

26

79 26

19

14 479

25 6

239 39

34 346

16

12 5

23

56

56

45

349

359 349

2345 34 2345

456

45

156 1456

47

567 2

8
35

349

The next guess is to try 7 in cell (1,2). This leads to


1 7 3

8 9

5 2

4 5 8

7 2

6 39

39 1

2 9 6

3 1

4 7

8 1 5

4 7

2 6

39 39

9 3 4

6 8

1 5

6 2 7

5 3

9 1

7 6 1

2 4

8 39

39

3 4 2

9 5

7 8

5 8 9

1 6

3 4

The third guess, setting (2,7) to 3 leads to a solution


1 7

3 8

9 5

2 6

4 5

8 7

2 6

3 9

2 9

6 3

1 4

7 8

8 1

5 4

7 2

6 3

9 3

4 6

8 1

5 2

346

6 2

7 5

3 9

1 4

7 6

1 2

4 8

9 5

3 4

2 9

5 7

8 1

5 8

9 1

6 3

4 7

Stepping back one guess and trying (2,7) = 9 also leads to a solution. All in all there are 32
possibilities, see here.

Download the Script


There are two scripts to choose from. Click on the link to download the corresponding one.
SuSolver.py

The new script. This implements the 5 phases described above


except the first part of phase 3 (looking for numbers that only appear
twice and in pairs). This solves all the examples I have tried. When it
can't solve a puzzle using the first 4 phases it enters phase 5 which
is intended to enumerate the possible solutions (on the assumption
that there are more than one). If there is only 1 solution which isn't
found after the first 4 phases then phase 5 will find it.

OldSuSolver.p
y

This is the previous version I published. It is very similar but in place


of the current phase 4 it went straight into a recursive algorithm. The
problem with this version is that it will always find a solution even
when there aren't any, picking the first one it finds from the set of
possibilities. I'm pretty confident though, that if there is a unique
solution it will find it.

More information about running Python scripts can be found on the Python Patterns page.
Visits since June 2005:

You might also like