History of Sudoku

Sudoku has a fascinating history. "Su" means number in Japanese, and "Doku"
refers to the single place on the puzzle board that each number can fit into. It also
connotes someone who is single; indeed, one way to describe the game is "Solitaire with
numbers." The name is sometimes misspelled as "soduko" or "sudoko." Although its name is
Japanese, its origins are actually European and American, and the game represents the
best in cross-cultural fertilization. Unlike many games which spring from one culture
and are then absorbed by others, Sudoku's development reveals it to be a true hybrid.

The 18th century Swiss mathematician Leonhard Euler apparently developed the
concept of "Latin Squares," in which each number appears only once in every row and
every column of a grid. In the late 1970s, Dell Magazines in the US began publishing what
we now call Sudoku puzzles, using Euler's concept with a 9 by 9 square grid. They called
it Number Place, and it was developed by an independent puzzle maker, Howard Garns.

In the mid-1980s, Maki Kaji, the president of the Japanese puzzle giant Nikoli, Inc.,
urged the company to publish a version of the puzzle that
became a huge hit in that country. Nikoli gave the game its current name, and helped
refine it by restricting the number of revealed or given numbers to 30 and having them
appear symmetrically. Afterwards the game became increasingly popular in Japan and
started becoming a fixture in daily newspapers and magazines. Yet almost two decades
passed before the game was taken up by The Times newspaper in London as a daily
puzzle. This development was due to the efforts of Wayne Gould, a retired Hong Kong
judge originally from New Zealand. He first came across a Sudoku puzzle in a Japanese
bookshop in 1997, and later spent many years developing a computer program to
generate them. In the fall of 2004, he was able to convince The Times to start
publishing daily Sudoku puzzles developed using his software. The first game was
published on November 12, 2004. Within a few months, other British newspapers
began publishing their own Sudoku puzzles.

Once again, Sudoku's popularity crossed the oceans. By the summer of 2005,
major newspapers in the US were also offering Sudoku puzzles, much as they would daily
crossword puzzles. It is interesting to note that while software is critical to
supplying the growing demand for Sudoku puzzles (it can take hours of processing time
to generate one unique puzzle), it was old media in the form of newspapers that did
so much to spread Sudoku around the world. In the US, the New York Post, the
San Francisco Chronicle and USA Today offered Sudoku puzzles to their readers by
September 2005.

Sudoku's future development is unknown. While the 9 by 9 grid is the most
common form of Sudoku, there are many variants of the game. Four by four (4 x 4)
Sudoku puzzles with 2 by 2 subsections are simpler, fun for younger audiences, and easy
to deliver to mobile devices like cellphones (this site offers a 4 by 4 variant). There
are 5 by 5, 6 by 6 and 7 by 7 games. For the truly addicted, there are even 16 by
16 grids, not to mention a 25 by 25 grid apparently offered by Japanese game
developer Nikoli. Sudoku puzzles using letters and symbols, some even spelling words in
their final solutions, are also becoming available. Other variants require computational

Where this rapidly developing fad will lead, no one can tell. What is clear,
though, is that Sudoku is a fun and challenging way for people of any age and culture to
hone their logical and deductive abilities. Who knows—played often enough, Sudoku may
help make the human race a tiny bit smarter.

Inventor of Sudoku

Howard Garns is credited as the inventor of Sudoku; his puzzle was first published by
Dell Magazines under the name Number Place.

Variants of Sudoku

Minor Variations

Grid Size

It turns out there is nothing special about the regular 9x9 puzzle; the grid can
be any size. You can make it smaller and simpler, or larger and harder to solve.
Commonly used alternatives are grids with 2x3 regions rather than 3x3, so there are six
numbers to place in the squares, and grids with 4x4 regions, where there are sixteen
numbers. Our theory page goes into more detail; there is no theoretical limit on how
large the grid can be. All the same strategies apply; you can use them to solve
different sized puzzles.

Figure: variations of region size, with examples of 2 by 3, 2 by 7, and 4 by 5 regions.

Colors, Words, Pictures and Symbols

Sudoku is about placing things in the correct order; it has nothing to do with
arithmetic. The familiar numbers 1 to 9 can be replaced with anything at all, as
long as the nine items are all different from each other. So the puzzle can use a set of
nine different colors, nine letters or nine pictures. If it uses fragments of a completed
picture then one of the regions (usually the central one) will show the picture in full.
However, you really need a program to help you play these picture puzzles, as you can't
easily pencil in sketches for the missing squares!

Word Sudoku

If letters are used rather than numbers then 'hidden' words can be included in
the grid; often this is a nine letter word in a row or column. Here's an example of a
puzzle with its solution alongside; the hidden word games is in the rightmost column.

Jigsaw or Squiggle Sudoku

In regular Sudoku all the regions in the grid have an identical shape - a square
or a rectangle. In Jigsaw Sudoku the regions can instead be any shape, as long as each
region is contiguous and has the same number of squares as all the other regions. The
strategies have to change a bit. The two-out-of-three strategy, which is so useful for
regular puzzles, has to be overhauled, and the shared subgroup exclusion rule needs to
account for the individual overlaps of the particular patterns.

Extra twists

The minor variations do not change or extend the rules one iota; they are still
all essentially the same puzzle. In this next category are puzzles that add some other
rule or hint on top of the normal Sudoku rules.

Neighbor Order

This is a neat idea for adding extra information that is easy to see and use.
The border between two squares is used to give a hint as to which neighbor is larger:
it includes a pointer to the smaller number. It is also known as 'Greater than Sudoku'.
A row of 9, 8, 7, 2, 4, 1, 6, 5, 3 has the ordering 9 > 8 > 7 > 2 < 4 > 1 < 6 > 5 >
3. This extra hint introduces a new way to solve squares. If a 2 has a neighbor that is
smaller, that neighbor must be a 1; similarly, if an 8 has a neighbor that is larger,
that neighbor must be a 9.

Here is an example puzzle with all the orderings shown in each square's border.
In this case the square Bb can immediately be solved by ordering, as it has 6 > x >
4 and so x must be 5.
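The ordering deductions above can be sketched in a few lines of Python. This is only an illustration: the function name and the idea of passing the two bounds as plain integers are assumptions made for the example, not part of any particular solver.

```python
def candidates_between(smaller, larger, n=9):
    """Values x in 1..n that satisfy smaller < x < larger."""
    return [x for x in range(1, n + 1) if smaller < x < larger]

# Square Bb from the example: 6 > x > 4.
print(candidates_between(4, 6))    # [5]
# A neighbor smaller than a 2 (0 stands for "no lower bound known").
print(candidates_between(0, 2))    # [1]
# A neighbor larger than an 8 (10 stands for "no upper bound known").
print(candidates_between(8, 10))   # [9]
```

Whenever the returned list has a single entry, the square is solved by ordering alone.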

To make the solution a little harder, only some of the neighbor borders may be
indicated. Often the initial grid is empty except for the neighbor ordering. Because
this puzzle adds number ordering, it is not strictly Sudoku any more: a picture or
color Sudoku puzzle has no concept of order between neighbors, so to a purist it's not
quite the real deal.

We at Sudoku Dragon have added our own little extension to Sudoku. Optionally, one
group in the grid, the stripe, will contain the numbers in order 1 to 9 or in reverse
order. The stripe can be wrapped up into a region, which sometimes makes it hard to
spot. We have a whole page devoted to describing the stripe.

X Sudoku

Going back to Euler and the origins of Sudoku, the original Magic Numbers and
Latin Squares had constraints on the diagonals as well as the rows and columns. So
there is a variety of Sudoku that always has the numbers 1 to 9 occurring in both the
diagonals as well as the rows and columns. You can use this extra constraint to solve

Killer Sudoku

A popular variation of Sudoku is to add some arithmetic on top of the basic
rules. In Killer Sudoku the standard grid is divided up into collections; in the top left
square of each collection is the sum of the numbers that go in the squares making up
the group. This lets you solve squares in new ways. In the simplest case you could have
a collection of only two squares: if the sum is 3 then you know one square is 1 and the
other is 2, and if the sum is 17 then they must be 8 and 9. So the sum gives extra
information about the numbers in the squares, quite apart from the standard 'number
unique to a group' rule of Sudoku. A key starting point is to look for collections
with a low average (e.g. a total of 6 for 3 squares, average 2, can only be {1,
2, 3}) or a high average (e.g. a total of 30 for 4 squares, average 7.5, can only
be {6, 7, 8, 9}). Usually the extra information from these sums is sufficient to
start off with an empty grid and yet still be able to solve it.
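The sums quoted above can be checked by listing every set of digits that fits a collection. The sketch below assumes, as is usual in Killer Sudoku, that a collection cannot repeat a digit; the function name cage_combinations is made up for this example.

```python
from itertools import combinations

def cage_combinations(total, size, digits=range(1, 10)):
    """All ways to pick `size` distinct digits that add up to `total`."""
    return [c for c in combinations(digits, size) if sum(c) == total]

print(cage_combinations(3, 2))         # [(1, 2)]
print(cage_combinations(17, 2))        # [(8, 9)]
print(cage_combinations(6, 3))         # [(1, 2, 3)]
print(cage_combinations(30, 4))        # [(6, 7, 8, 9)]
print(len(cage_combinations(10, 2)))   # 4
```

A total near the middle of the range, such as 10 for two squares, leaves several combinations open, which is why the very low and very high averages make the best starting points.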

Other Variations

There are a host of other variations on the basic theme and new ones are being
invented all the time. If you have found one that you like, let us know about it.

Algorithmics of Sudoku

The class of Sudoku puzzles consists of a partially completed row-column grid
of cells partitioned into N regions or zones each of size N cells, to be filled in using a
prescribed set of N distinct symbols (typically the numbers {1, ..., N}), so that each
row, column and region contains exactly one of each element of the set. The puzzle
can be solved using a variety of algorithms.

Solving Sudoku by backtracking

The basic backtracking algorithm can be adapted to solve sudokus. This is
straightforward. Say a zone is a subset of N boxes of an N x N grid, which must
contain the numbers from 1 to N. A standard sudoku contains 27 zones, namely 9
rows, 9 columns and 9 squares that are 3 x 3. In a jigsaw sudoku, the square zones
are replaced by zones having irregular boundaries, like a jigsaw piece.

One possible algorithm that uses backtracking to solve such sudokus constructs
a graph on N² vertices, one vertex for each box of the grid. Two vertices are
connected by an edge if there exists a zone containing the two boxes. The problem is
then equivalent to coloring this graph with N colors, where adjacent vertices may not
have the same color. This is done by starting with an empty assignment of colors and
assigning colors to vertices one after another, using some fixed order of the vertices.
Whenever a color is being assigned, we check whether it is compatible with the
existing assignments, i.e. whether the new color occurs among the neighbors of that
vertex. If it doesn't, then we may assign it to the vertex and try to process another
vertex. We backtrack once all N colors have been tried for a given vertex. If all
vertices have been assigned a color, then we have found a solution. There are of
course much more sophisticated algorithms to solve graph coloring. If the sudoku
contains initial data, i.e. some boxes have already been filled, then these go into the
color assignment before backtracking begins and the vertex sequence includes only the
empty boxes.
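The graph coloring procedure described above can be sketched in Python as follows. This is a minimal illustration, not the program the text refers to: the function names, the flat list used for the grid (0 meaning empty) and the helper that builds the zones of a standard sudoku are all assumptions made for the example. Any list of zones, including jigsaw zones, could be passed to build_neighbors instead.

```python
def standard_zones(box=3):
    """Rows, columns and square zones of a standard sudoku, as lists of cell indices."""
    n = box * box
    rows = [[r * n + c for c in range(n)] for r in range(n)]
    cols = [[r * n + c for r in range(n)] for c in range(n)]
    squares = [[(br * box + i) * n + bc * box + j
                for i in range(box) for j in range(box)]
               for br in range(box) for bc in range(box)]
    return rows + cols + squares

def build_neighbors(zones, n_cells):
    """Adjacency sets: two cells are neighbors if some zone contains both."""
    neighbors = [set() for _ in range(n_cells)]
    for zone in zones:
        for a in zone:
            neighbors[a].update(b for b in zone if b != a)
    return neighbors

def color(grid, neighbors, n):
    """Fill the empty cells of grid in place; return True if a solution was found."""
    empty = [v for v, value in enumerate(grid) if value == 0]

    def search(idx):
        if idx == len(empty):
            return True                       # every vertex has a color
        v = empty[idx]
        used = {grid[u] for u in neighbors[v]}
        for c in range(1, n + 1):
            if c not in used:                 # compatible with existing assignments
                grid[v] = c
                if search(idx + 1):
                    return True
        grid[v] = 0                           # all N colors tried: backtrack
        return False

    return search(0)

# A small 4 x 4 example with 2 x 2 square zones.
zones = standard_zones(box=2)
grid = [1, 0, 0, 4,
        0, 0, 1, 0,
        0, 1, 0, 0,
        4, 0, 0, 1]
if color(grid, build_neighbors(zones, 16), 4):
    for r in range(4):
        print(grid[4 * r:4 * r + 4])
```

The pre-filled boxes are simply left out of the vertex sequence, as the text describes, and their colors are seen by the compatibility check through the neighbor sets.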

The above algorithm was used to solve a 10x10 jigsaw sudoku that was
proposed on A link to the proposal may be found in the section
for external links. The first section of the program defines the 10 jigsaw pieces
(zones), the second the row and column zones. Thereafter the graph is constructed as
an adjacency list. The search procedure prints completed solutions (when all 100 boxes
have been assigned). Otherwise it computes the set of colors present among the
neighbors of the next vertex to be processed, and recursively tries those assignments
that do not conflict with this set. The search starts with an empty assignment.

A simple trick that narrows the range of numbers allowed in a box can make
backtracking solutions much more efficient. For a sudoku with some filled boxes, the
numbers allowed in the blank boxes are often restricted to a small subset of {1, ..., N},
because they must not conflict with other boxes in the corresponding zones. For
example, a zone like [1, 2, 3, {4,5}, {4,6}, {4,5,6}, {4,5,6,7}, {7,8}, {8,9}] might exist
after the numbers in the filled boxes are used to clean up a sudoku. Note that there
are three boxes, {4,5}, {4,6} and {4,5,6}, which between them allow only three numbers:
4, 5 and 6. That means 4, 5 and 6 must occupy those three boxes, and the other boxes
cannot contain those numbers. Therefore, instead of backtracking immediately,
we can first narrow the range of the other boxes by removing the numbers 4, 5 and 6.
This turns {4,5,6,7} into 7, which in turn reduces {7,8} to 8 and {8,9} to 9, giving the
zone [1, 2, 3, {4,5}, {4,6}, {4,5,6}, 7, 8, 9]. Using this trick,
common Sudoku puzzles seldom need much backtracking to find the solution.
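The narrowing trick can be written as a short routine that works on one zone at a time. The sketch below is one possible way to do it, under the assumption that each box is represented by the set of numbers it may still hold (a filled box is a one-element set); the function name is invented for the example. It looks for any group of k boxes whose candidates together contain exactly k numbers and removes those numbers from the other boxes, repeating until nothing changes.

```python
from itertools import combinations

def narrow_zone(zone):
    """Return a narrowed copy of zone, a list of candidate sets (one per box)."""
    zone = [set(box) for box in zone]
    changed = True
    while changed:
        changed = False
        for size in range(1, len(zone)):
            for group in combinations(range(len(zone)), size):
                union = set().union(*(zone[i] for i in group))
                if len(union) != size:
                    continue   # this group does not pin down its numbers
                # The numbers in `union` must fill the boxes in `group`,
                # so no other box in the zone may hold them.
                for i, box in enumerate(zone):
                    if i not in group and box & union:
                        box -= union
                        changed = True
    return zone

zone = [{1}, {2}, {3}, {4, 5}, {4, 6}, {4, 5, 6}, {4, 5, 6, 7}, {7, 8}, {8, 9}]
for box in narrow_zone(zone):
    print(sorted(box))
```

On the example zone this prints [1], [2], [3], [4, 5], [4, 6], [4, 5, 6], [7], [8], [9], matching the result in the text. Trying every group of boxes is affordable here because a zone has only nine boxes.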

Exact Cover in solving sudokus

Sudoku can be described as an instance of the exact cover problem. This allows
both for a very elegant description of the problem and an efficient solution using a
backtracking algorithm.

In an exact cover problem, there is given a universe U of elements and a
collection 𝒮 of subsets of U. The task is to find a subcollection 𝒮* of 𝒮 such that
every element in U is an element of exactly one set in 𝒮*.

The challenge in applying the exact cover problem to sudoku is to find a
definition for the elements of U such that every valid sudoku solution must contain
every element of U exactly once, and to find a definition for the elements of the
collection 𝒮 (subsets of U) such that if the union of a disjoint collection of these
elements gives U, then the elements specify a completely filled-in sudoku grid which
satisfies every constraint.

Let S = {s11, s12, …, s19, s21, …, s99} be the set of squares of the sudoku grid.

Let ri = {si1, si2, …, si9} ⊆ S be the set of squares belonging to row i, and let R
= {r1, r2, …, r9} be the set of rows.

Let cj = {s1j, s2j, …, s9j} ⊆ S be the set of squares belonging to column j, and
let C = {c1, c2, …, c9} be the set of columns.

Clearly, for each row i and column j, {sij} = ri ∩ cj.

Let bk be the set of squares belonging to block k, and let B = {b1, b2, …, b9} be
the set of blocks.

Thus, b1 is the intersection of rows 1 to 3 with columns 1 to 3, b2 is the
intersection of rows 1 to 3 with columns 4 to 6, …, and b9 is the intersection of rows
7 to 9 with columns 7 to 9. For example,

b2 = {s14, s15, s16, s24, s25, s26, s34, s35, s36}.

Finally, we are ready to define a universe U and a collection 𝒮 of subsets of U
which exactly mirror the required constraints on the placement of values in the squares
of the sudoku grid. A solution to the exact cover problem is a collection 𝒮* of disjoint
subsets of U whose union is exactly U: each element of U appears in exactly one
element of 𝒮*. How can this be applied to sudoku? We need to find a set of objects
each of which appears exactly once in every finished sudoku puzzle. Every square of
the grid must appear exactly once. Every row, column, and block must contain each
value exactly once. Each of these constraints involves pairs: pairs of rows and columns,
pairs of rows and values, pairs of columns and values, and pairs of blocks and values.
Our universe is going to be made up of pairs.

Let V = {v1, v2, …, v9} be the set of values. Consider the Cartesian products R×C,
R×V, C×V, and B×V. Each contains 81 pairs. For example, R×C = {(r1, c1), (r1, c2), …,
(r9, c9)}. The universe U is the 324 element union of these four Cartesian products. As
it happens, every valid sudoku solution contains exactly these 324 pairs, no more, no
less. But this set of pairs does not represent a specific solution. It represents every
valid solution to the blank sudoku grid.

To represent a specific solution, we need to assign a specific value to each
square. Let Sijkl denote the subset of U containing (ri, cj), (ri, vl), (cj, vl), and (bk, vl).
This subset denotes the assignment of value l to square sij. The collection 𝒮 contains
exactly those subsets Sijkl of U in which square sij is an element of block k, i.e., sij ∈
bk. Since k is completely dependent on i and j, there are 9×9×9 = 729 such subsets in 𝒮.

An incomplete sudoku grid is represented by a disjoint collection of subsets
Sijkl of U which does not specify a value for every square. Since the collection is
disjoint (each element of U appears in at most one subset), we know it does not violate
any of the sudoku constraints.

Now, applying the exact cover problem, we find members of the collection 𝒮 of
subsets Sijkl that are disjoint from each other and from each member of the incomplete
grid's collection. Together with that collection they form the solution 𝒮* of 81 disjoint
four-element subsets of U whose union is exactly the 324 element set U.
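The construction can be made concrete with a short Python sketch that builds the 324 element universe and the 729 four-element subsets, then checks that a finished grid corresponds to an exact cover. The tuple encoding of the pairs, the 0-based indices and the function names are assumptions chosen for the example. The finished grid used for the check is produced by a simple shifting pattern, not taken from any published puzzle.

```python
def subset(i, j, v):
    """The four pairs covered by placing value v in row i, column j (all 0-based)."""
    k = 3 * (i // 3) + j // 3              # the block is determined by i and j
    return frozenset({("rc", i, j), ("rv", i, v), ("cv", j, v), ("bv", k, v)})

universe = ({("rc", i, j) for i in range(9) for j in range(9)}
            | {(kind, a, v) for kind in ("rv", "cv", "bv")
               for a in range(9) for v in range(9)})
subsets = [subset(i, j, v) for i in range(9) for j in range(9) for v in range(9)]
print(len(universe), len(subsets))         # 324 729

def is_exact_cover(chosen):
    """True if the chosen subsets are pairwise disjoint and their union is U."""
    covered = [element for s in chosen for element in s]
    return len(covered) == len(set(covered)) and set(covered) == universe

# A valid finished grid built from a shifting pattern (values 0..8 stand for 1..9).
grid = [[(3 * (i % 3) + i // 3 + j) % 9 for j in range(9)] for i in range(9)]
print(is_exact_cover([subset(i, j, grid[i][j]) for i in range(9) for j in range(9)]))

# Swapping two values in one row breaks the column constraints.
grid[0][0], grid[0][1] = grid[0][1], grid[0][0]
print(is_exact_cover([subset(i, j, grid[i][j]) for i in range(9) for j in range(9)]))
```

The first check prints True and the second prints False, since the swapped grid covers some column and value pairs twice and others not at all.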

This representation of the sudoku problem eliminates all mention of rows,
columns, blocks, and values from the implementation. Instead, we always work with the
same collection of 729 four-element subsets of a 324 element universe. These can be
represented by the integers 0 to 728, together with a function f(i,j) which is true if
the subsets corresponding to i and j are disjoint. f might be implemented using a 729²
= 531441 element (66431 byte) constant bit vector. A specific puzzle is a set of
fewer than 81 integers with f(i,j) true for each pair i,j. Solving the puzzle involves
repeatedly finding a new integer k which passes f(i,k) for each i in the partially
completed puzzle, and backtracking when no such k can be found.

Obviously, once we find that a specific k fails f(i,k) for some i, we need not
consider it again so long as i remains in the proposed solution. So we keep a list of k
values which have not yet failed. The length of this list estimates the amount of work
required to discover that the current proposed solution fails. So at each level of
recursion, we can look ahead one level to estimate how much work each proposed k will
involve, and choose the k which returns failure in the shortest time.

Although the size of the complete search tree is fixed for a given puzzle, this
representation of the problem gives a fast test for failure at each level, and a way of
ordering the search so that the smallest subtrees are searched first.
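A rough Python sketch of this search is given below. It keeps the integers 0 to 728, the disjointness test f and the list of candidates that have not yet failed, as described above. Two things are assumptions of the example rather than the text's method: f is computed from stored sets instead of a precomputed bit vector, and the one-level lookahead is replaced by a simpler, widely used ordering rule that always branches on the element of U with the fewest remaining candidates. The given numbers are assumed to be consistent with each other.

```python
def subset_for(k):
    """Decode integer k (0..728) into its four-element subset of U."""
    i, j, v = k // 81, (k // 9) % 9, k % 9
    block = 3 * (i // 3) + j // 3
    return frozenset({("rc", i, j), ("rv", i, v), ("cv", j, v), ("bv", block, v)})

SETS = [subset_for(k) for k in range(729)]

def f(a, b):
    """True if the subsets numbered a and b are disjoint."""
    return SETS[a].isdisjoint(SETS[b])

def search(chosen, candidates):
    """Extend chosen (a list of integers) to 81 pairwise disjoint subsets."""
    if len(chosen) == 81:
        return chosen
    # Group the surviving candidates by the elements of U they would cover.
    options = {}
    for k in candidates:
        for element in SETS[k]:
            options.setdefault(element, []).append(k)
    # Fail fast: some uncovered element has no candidate left.
    if len(options) < 324 - 4 * len(chosen):
        return None
    # Branch on the element with the fewest candidates (smallest subtree first).
    for k in min(options.values(), key=len):
        remaining = [c for c in candidates if f(c, k)]
        result = search(chosen + [k], remaining)
        if result is not None:
            return result
    return None

def solve(grid):
    """grid: 9 rows of 9 ints, 0 for blank. Returns a solved grid or None."""
    chosen = [81 * i + 9 * j + grid[i][j] - 1
              for i in range(9) for j in range(9) if grid[i][j]]
    candidates = [k for k in range(729) if all(f(k, g) for g in chosen)]
    result = search(chosen, candidates)
    if result is None:
        return None
    solved = [[0] * 9 for _ in range(9)]
    for k in result:
        solved[k // 81][(k // 9) % 9] = k % 9 + 1
    return solved
```

Calling solve with a 9 by 9 list of rows returns a completed grid, or None when the givens cannot be extended to a solution.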

Solving Sudokus by a Brute-Force Algorithm

Some hobbyists have developed computer programs that will solve sudoku puzzles
using a brute force algorithm. Although it has been established that approximately
6.67 x 10²¹ final grids exist, using a brute force algorithm can be a practical method to
solve puzzles using a computer program if the code is well designed.

An advantage of this method is that if the puzzle is valid, a solution is
guaranteed. There is not a strong relation between the solving time and the degree of
difficulty of the puzzle; generating a solution is just a matter of waiting until the
algorithm advances to the set of numbers that satisfies the puzzle. The disadvantage
of this method is that it may be slow compared to computer
solution methods modeled after human-style deductive methods.

Briefly, a brute force program would solve a puzzle by placing the digit "1" in
the first cell and checking if it is allowed to be there. If there are no violations
(checking row, column, and box constraints) then the algorithm advances to the next
cell, and places a "1" in that cell. When checking for violations, it discovers that
the "1" is not allowed, so the value is advanced to a "2". If a cell is discovered where
none of the 9 digits is allowed, then the algorithm leaves that cell blank and moves
back to the previous cell. The value in that cell is then incremented by one. The
algorithm is repeated until an allowed value for the 81st cell is discovered. The
resulting sequence of 81 numbers is then arranged to form the 9 x 9 solution matrix.
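The procedure can be sketched in Python as a loop that moves a pointer forwards and backwards over the blank cells. This is an illustrative version only: the 81-character string format (with 0 or any non-digit for a blank) and the function names are assumptions, and the sketch does not check that the given clues are consistent with each other.

```python
def allowed(cells, pos, digit):
    """True if digit does not already appear in the row, column or box of pos."""
    r, c = divmod(pos, 9)
    for k in range(9):
        if cells[r * 9 + k] == digit or cells[k * 9 + c] == digit:
            return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(cells[(br + i) * 9 + bc + j] != digit
               for i in range(3) for j in range(3))

def brute_force(puzzle):
    """Return the solution as an 81-digit string, or None if there is none."""
    cells = [int(ch) if ch.isdigit() else 0 for ch in puzzle]
    blanks = [p for p, d in enumerate(cells) if d == 0]
    idx = 0
    while 0 <= idx < len(blanks):
        pos = blanks[idx]
        digit = cells[pos] + 1      # start at 1, or one past the previous try
        cells[pos] = 0              # clear the cell so it cannot block itself
        while digit <= 9 and not allowed(cells, pos, digit):
            digit += 1
        if digit <= 9:
            cells[pos] = digit      # place the value and advance
            idx += 1
        else:
            idx -= 1                # leave the cell blank and step back
    return "".join(map(str, cells)) if idx == len(blanks) else None

solution = brute_force("0" * 81)    # an empty grid has many solutions
for r in range(9):
    print(solution[9 * r:9 * r + 9])
```

The pointer idx is the quantity whose progression through the grid the brute force discussion refers to; it only reaches the end of the list once every blank cell holds an allowed value.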

Most Sudoku puzzles will be solved in just a few seconds with this method, but
there are exceptions. The following puzzle was designed to be a near worst case
situation for solution by brute force (although it is not regarded as a difficult puzzle
when solved by other methods).

Solving this puzzle by brute force requires a large number of iterations because
it has a low number of clues (17), the top row has no clues at all, and the solution has
"987654321" as its first row. Thus a brute-force solver will spend an enormous amount
of time "counting" upward before it arrives at the final grid which satisfies the puzzle.
If one iteration is defined as one attempt to place one value in one cell, then this
puzzle requires 641,580,843 iterations to solve. These iterations do not include the
work involved at each step to learn if each digit entered is valid or not (required for
every iteration). Depending on the specific construction of the computer code, programmers
have found the solution time for this puzzle to be between 30 and 45 minutes with a
computer processor running at 3 GHz. Many programmers have developed variations of
the brute force algorithm which will solve this puzzle in a minute or less with a 3 GHz
computer processor.

An extremely simplistic brute force algorithm written in the C language running
on a 2.2 GHz processor solved the above puzzle in just over a minute. Hence, the
choice of programming language can matter a great deal. Further, a trivial modification
to that simplistic algorithm is to rotate the puzzle prior to beginning the iterations,
allowing for the easiest angle of attack. By looking at the puzzle from four different
angles, the program can quickly deduce which angle is likely to result in the fewest
iterations. With that one modification, the same 2.2 GHz processor was able to solve
the above puzzle in about half a second.

One programmer has created charts of the progression of a pointer as it
advances through the 81 positions of a sudoku using a brute force algorithm. An
example is the chart for the solution to a sudoku "Star Burst Leo" shown here.

Solving Sudokus via Stochastic Search or Optimization Methods

Some researchers have also shown how sudoku can be solved using stochastic
(i.e. random-based) search.[3]

Such a method could work as follows: first, start by randomly assigning
numbers to the blank cells in the grid, and calculate the number of errors. Now start
to "shuffle" these inserted numbers around the grid until the number of mistakes has
been reduced to zero. A solution to the puzzle will then have been found. Approaches
for shuffling the numbers include simulated annealing and tabu search.

The advantage of this type of method is that the puzzle does not have to be
"logic-solvable" in order for the algorithm to be able to solve it. In other words, unlike
other methods, the puzzles that are given to this algorithm do not have to be specially
constructed so that they provide sufficient clues for filling the grid using forward
chaining logic only. In fact, the only prerequisite for the stochastic search algorithm
to work is that the puzzle has at least one solution.

Stochastic-based optimisation algorithms are known to be quite fast, though
they are perhaps not as fast as some logic-based techniques on logic-solvable
puzzles. Depending on the type of instance given to the algorithm, 9x9
puzzles will generally be solved in less than 1 second on a typical year-2000 laptop; 16x16
puzzles will take around 10–15 seconds. In the paper by Meir Perez and Tshilidzi
Marwala, the Sudoku puzzle is successfully solved using several stochastic search
techniques: a cultural algorithm, quantum annealing, and a hybrid method that
combines a genetic algorithm with simulated annealing.[4]
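A stochastic search of the kind outlined in this section might be sketched as follows, using simulated annealing. The details are assumptions made for the example rather than a description of any cited paper: each 3 by 3 block is first filled with its missing digits so that blocks are always valid, the error count is the number of repeated digits in rows and columns, and a shuffle swaps two non-given cells inside one block. The parameter values are arbitrary, and the search is not guaranteed to reach zero errors within the given number of steps.

```python
import math
import random

def cost(grid):
    """Number of repeated digits over all rows and columns (0 means solved)."""
    total = 0
    for k in range(9):
        total += 9 - len({grid[k][c] for c in range(9)})
        total += 9 - len({grid[r][k] for r in range(9)})
    return total

def anneal(puzzle, steps=200_000, t0=1.0, cooling=0.9999, seed=0):
    """puzzle: 9 rows of 9 ints, 0 for blank. Returns (grid, remaining errors)."""
    rng = random.Random(seed)
    grid = [row[:] for row in puzzle]
    free = []                                   # swappable cells, grouped by block
    for br in (0, 3, 6):
        for bc in (0, 3, 6):
            cells = [(r, c) for r in range(br, br + 3) for c in range(bc, bc + 3)]
            blanks = [(r, c) for r, c in cells if puzzle[r][c] == 0]
            present = {puzzle[r][c] for r, c in cells}
            missing = [d for d in range(1, 10) if d not in present]
            rng.shuffle(missing)                # random initial assignment
            for (r, c), d in zip(blanks, missing):
                grid[r][c] = d
            if len(blanks) >= 2:
                free.append(blanks)
    current, t = cost(grid), t0
    for _ in range(steps):
        if current == 0 or not free:
            break
        (r1, c1), (r2, c2) = rng.sample(rng.choice(free), 2)
        grid[r1][c1], grid[r2][c2] = grid[r2][c2], grid[r1][c1]
        new = cost(grid)
        # Always accept improvements; accept worse grids with a shrinking probability.
        if new <= current or rng.random() < math.exp((current - new) / t):
            current = new
        else:
            grid[r1][c1], grid[r2][c2] = grid[r2][c2], grid[r1][c1]   # undo
        t *= cooling
    return grid, current
```

If the returned error count is not zero, the usual remedy is to run again with a different seed or a slower cooling rate.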

Finally, note that it is also possible to express a Sudoku as an integer linear
programming problem. Such approaches seem to get close to a solution quite quickly,
and can then use branching towards the end. The Simplex algorithm seems able to
handle situations with no solutions or multiple solutions quite well.

Solving Sudokus by Constraint Programming (CP)

Sudoku is a constraint problem. An article by Helmut Simonis at Imperial College
London [5] describes in great detail many different reasoning algorithms available in the
form of constraints which can be applied to model and solve the problem. It requires
81 finite domain variables and 27 constraints, one for each row, column and block. Most
constraint solvers include an example of how to model and solve a Sudoku problem
(e.g. [6]), and will solve it in milliseconds. Moreover, the constraint program modeling
and solving Sudoku will in most solvers have fewer than 100 lines of code. If the code
employs a strong reasoning algorithm (still of polynomial complexity) then only some of
the hardest instances (diabolic) require further code which employs a search routine.
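To give a feel for the constraint model without relying on any particular solver, here is a small hand-written Python sketch. It sets up one finite domain variable per square and one "all different" constraint per row, column and block, then applies a deliberately weak form of reasoning: a value that is fixed in one variable is removed from the other variables of the same constraint. The function names and data layout are assumptions made for the example; real constraint solvers use much stronger propagation.

```python
def sudoku_model(puzzle):
    """Return (domains, constraints) for a puzzle given as 9 rows of ints, 0 = blank."""
    domains = {(r, c): ({puzzle[r][c]} if puzzle[r][c] else set(range(1, 10)))
               for r in range(9) for c in range(9)}
    rows = [[(r, c) for c in range(9)] for r in range(9)]
    cols = [[(r, c) for r in range(9)] for c in range(9)]
    blocks = [[(br + i, bc + j) for i in range(3) for j in range(3)]
              for br in (0, 3, 6) for bc in (0, 3, 6)]
    return domains, rows + cols + blocks

def propagate(domains, constraints):
    """Prune domains to a fixed point. Returns False if some domain becomes empty."""
    changed = True
    while changed:
        changed = False
        for scope in constraints:
            for var in scope:
                if len(domains[var]) != 1:
                    continue
                (value,) = domains[var]
                for other in scope:
                    if other != var and value in domains[other]:
                        domains[other].discard(value)
                        if not domains[other]:
                            return False
                        changed = True
    return True

# A finished grid from a shifting pattern, with the main diagonal blanked out.
full = [[(3 * (i % 3) + i // 3 + j) % 9 + 1 for j in range(9)] for i in range(9)]
puzzle = [[0 if i == j else full[i][j] for j in range(9)] for i in range(9)]
domains, constraints = sudoku_model(puzzle)
print(len(domains), len(constraints))     # 81 27
print(propagate(domains, constraints))    # True
print(all(domains[(i, j)] == {full[i][j]} for i in range(9) for j in range(9)))  # True
```

Puzzles that this weak rule cannot finish leave some domains with several values, which is the point at which a stronger reasoning algorithm or a search routine takes over.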
Submitted to:
Mrs. Lescano
Frederick Gay Manalo