CIPHERS

Practical Cryptography
http://practicalcryptography.com/ciphers/
Ciphers are arguably the cornerstone of cryptography. In general, a cipher is simply a set
of steps (an algorithm) for performing both an encryption and the corresponding decryption.

Despite what might seem to be a relatively simple concept, ciphers play a crucial role in modern
technology. Technologies involving communication (including the internet, mobile phones, digital
television or even ATMs) rely on ciphers in order to maintain both security and privacy.

Although most people claim they're not familiar with cryptography, they are often familiar with the
concept of ciphers, whether or not they are actually conscious of it. Recent films such as The Da Vinci
Code and National Treasure: Book of Secrets have plots centered around cryptography and ciphers,
bringing these concepts to the general public.

This section (quite appropriately) deals with individual ciphers and algorithms. They have been
divided based on their era and category (i.e. when were they used and how do they work). If you're
looking for a reference guide, refer to the alphabetical list to the right, otherwise continue reading.

In our effort to provide a practical approach to these, we have developed a JavaScript
implementation for each cipher that allows encryption and decryption of arbitrary text (of your choosing)
using the cipher. Some history of each cipher is also included, along with tips on cryptanalysis.

What are the eras of cryptography?


Cryptography has been through numerous phases of evolution. Early ciphers
were designed to allow encryption and decryption to take place by hand, while those
developed and used today are only practical because of the high computational performance of modern
machines (i.e. the computer you are using right now). The major eras which have shaped cryptography
are listed below.

• Classical
• Mechanical
• Modern

Classical

The classical algorithms are those invented before the computer era, up until around the 1950s. The list
below is roughly ordered by complexity, least complex at the top.

Classical ciphers are cryptographic algorithms that have been used in the past (pre-WWII).
Some of them have only ever been used by amateurs (e.g. Bifid), while some of them have been used
by armies to secure their top-level communications (e.g. ADFGVX).

None of these algorithms is secure against an attacker with a modern computer,
so if real data security is needed you should look at modern
algorithms.

• Atbash Cipher
The Atbash cipher is a substitution cipher with a specific key where the letters of the
alphabet are reversed. I.e. all 'A's are replaced with 'Z's, all 'B's are replaced with 'Y's, and so on. It
was originally used for the Hebrew alphabet, but can be used for any alphabet.

The Atbash cipher offers almost no security, and can be broken very easily. Even if an
adversary doesn't know a piece of cipher text has been enciphered with the Atbash cipher, they
can still break it by assuming it is a substitution cipher and determining the key using hill-climbing.
The Atbash cipher is also an Affine cipher with a = 25 and b = 25, so breaking it as an Affine cipher
also works.

The Algorithm

The Atbash cipher is essentially a substitution cipher with a fixed key: if you know the cipher
is Atbash, then no additional information is needed to decrypt the message. The substitution
key is:

ABCDEFGHIJKLMNOPQRSTUVWXYZ
ZYXWVUTSRQPONMLKJIHGFEDCBA

To encipher a message, find the letter you wish to encipher in the top row, and then replace
it with the letter in the bottom row. In the example below, we encipher the message 'ATTACK AT
DAWN'. The first letter we wish to encipher is 'A', which is above 'Z', so the first cipher text letter is
'Z'. The next letter is 'T', which is above 'G', so that comes next. The whole message is enciphered:

ATTACK AT DAWN
ZGGZXP ZG WZDM

To decipher a message, the exact same procedure is followed. Find 'Z' in the top row, which
is 'A' in the bottom row. Continue until the whole message is deciphered.
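
The table lookup described above can be sketched in a few lines of Python. This is only an illustration, not the pycipher implementation; the names `ATBASH_TABLE` and `atbash` are made up for this example, and the sketch assumes a 26 letter Latin alphabet with all other characters passed through unchanged.

```python
import string

ALPHABET = string.ascii_uppercase

# Fixed Atbash key: the alphabet written backwards (hypothetical helper name).
ATBASH_TABLE = str.maketrans(ALPHABET, ALPHABET[::-1])


def atbash(text: str) -> str:
    """Apply Atbash to the letters of text; other characters pass through."""
    return text.upper().translate(ATBASH_TABLE)


ciphertext = atbash("ATTACK AT DAWN")
print(ciphertext)          # ZGGZXP ZG WZDM
print(atbash(ciphertext))  # ATTACK AT DAWN
```

Because the key is its own inverse, the same function both enciphers and deciphers.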

Other Implementations

To encipher your own messages in python, you can use the pycipher module. To install it,
use pip install pycipher. To encipher messages with the Atbash cipher (or another cipher, see here
for documentation):

>>> from pycipher import Atbash
>>> Atbash().encipher('defend the east wall of the castle')
'wvuvmwgsvvzhgdzoolugsvxzhgov'
>>> Atbash().decipher('wvuvmwgsvvzhgdzoolugsvxzhgov')
'defendtheeastwallofthecastle'

Cryptanalysis

The Atbash cipher is trivial to break since there is no key: as soon as you know it is an
Atbash cipher you can simply decrypt it. If you didn't know it was an Atbash cipher, you could break
it by assuming the cipher text is a substitution cipher, which can still be easily broken, see here.
Alternatively, it can be broken if it is assumed to be an Affine cipher.

• ROT13 Cipher
The ROT13 cipher is a substitution cipher with a specific key where the letters of the
alphabet are offset 13 places. I.e. all 'A's are replaced with 'N's, all 'B's are replaced with 'O's, and
so on. It can also be thought of as a Caesar cipher with a shift of 13.

The ROT13 cipher offers almost no security, and can be broken very easily. Even if an
adversary doesn't know a piece of cipher text has been enciphered with the ROT13 cipher, they
can still break it by assuming it is a substitution cipher and determining the key using hill-climbing.
The ROT13 cipher is also a Caesar cipher with a key of 13, so breaking it as a Caesar cipher also
works.

The Algorithm

The ROT13 cipher is essentially a substitution cipher with a fixed key: if you know the cipher
is ROT13, then no additional information is needed to decrypt the message. The substitution
key is:

ABCDEFGHIJKLMNOPQRSTUVWXYZ
NOPQRSTUVWXYZABCDEFGHIJKLM

To encipher a message, find the letter you wish to encipher in the top row, and then replace
it with the letter in the bottom row. In the example below, we encipher the message 'ATTACK AT
DAWN'. The first letter we wish to encipher is 'A', which is above 'N', so the first cipher text letter is
'N'. The next letter is 'T', which is above 'G', so that comes next. The whole message is enciphered:

ATTACK AT DAWN
NGGNPX NG QNJA

To decipher a message, the exact same procedure is followed. Find 'N' in the top row, which
is 'A' in the bottom row. Continue until the whole message is deciphered.
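
As a rough sketch of the same lookup in Python (the names `ROT13_TABLE` and `rot13` are invented for this example, and non-letters are assumed to pass through unchanged):

```python
import codecs
import string

UPPER = string.ascii_uppercase

# ROT13 key: the alphabet rotated by 13 places (hypothetical helper name).
ROT13_TABLE = str.maketrans(UPPER, UPPER[13:] + UPPER[:13])


def rot13(text: str) -> str:
    """Apply ROT13 to the letters of text; other characters pass through."""
    return text.upper().translate(ROT13_TABLE)


ciphertext = rot13("ATTACK AT DAWN")
print(ciphertext)         # NGGNPX NG QNJA
print(rot13(ciphertext))  # ATTACK AT DAWN

# The Python standard library also ships a ROT13 text codec.
print(codecs.encode("ATTACK AT DAWN", "rot_13"))  # NGGNPX NG QNJA
```

Since 13 is half of 26, applying the cipher twice returns the original text.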

Other Implementations

To encipher your own messages in python, you can use the pycipher module. To install it,
use pip install pycipher. To encipher messages with the Rot13 cipher (or another cipher, see here
for documentation):

>>> from pycipher import Rot13
>>> Rot13().encipher('defend the east wall of the castle')
'qrsraqgurrnfgjnyybsgurpnfgyr'
>>> Rot13().decipher('qrsraqgurrnfgjnyybsgurpnfgyr')
'DEFENDTHEEASTWALLOFTHECASTLE'

Cryptanalysis

The ROT13 cipher is trivial to break since there is no key: as soon as you know it is a
ROT13 cipher you can simply decrypt it. If you didn't know it was a ROT13 cipher, you could break
it by assuming the cipher text is a substitution cipher, which can still be easily broken, see here.
Alternatively, it can be broken if it is assumed to be a Caesar cipher, see here for a guide on
breaking them.

• Caesar Cipher
The Caesar cipher (a.k.a. the shift cipher, Caesar's Code or Caesar Shift) is one of the
earliest known and simplest ciphers. It is a type of
substitution cipher in which each letter in the plaintext is 'shifted' a certain number of places down
the alphabet. For example, with a shift of 1, A would be replaced by B, B would become C, and so
on. The method is named after Julius Caesar, who apparently used it to communicate with his
generals.

More complex encryption schemes such as the Vigenère cipher employ the Caesar cipher
as one element of the encryption process. The widely known ROT13 'encryption' is simply a
Caesar cipher with an offset of 13. The Caesar cipher offers essentially no communication security,
and it will be shown that it can be easily broken even by hand.

Example

To pass an encrypted message from one person to another, it is first necessary that both
parties have the 'key' for the cipher, so that the sender may encrypt it and the receiver may decrypt
it. For the Caesar cipher, the key is the number of characters to shift the cipher alphabet.

Here is a quick example of the encryption and decryption steps involved with the Caesar
cipher. The text we will encrypt is 'defend the east wall of the castle', with a shift (key) of 1.

plaintext:  defend the east wall of the castle
ciphertext: efgfoe uif fbtu xbmm pg uif dbtumf

It is easy to see how each character in the plaintext is shifted up the alphabet. Decryption is
just as easy, by using an offset of -1.

plain:  abcdefghijklmnopqrstuvwxyz
cipher: bcdefghijklmnopqrstuvwxyza

Obviously, if a different key is used, the cipher alphabet will be shifted a different amount.

Mathematical Description

First we translate all of our characters to numbers, 'a'=0, 'b'=1, 'c'=2, ... , 'z'=25. We can now
represent the Caesar cipher encryption function, e(x), where x is the character we are encrypting,
as:

$$e(x) = (x + k) \pmod{26}$$

where k is the key (the shift) applied to each letter. After applying this function the result is a
number which must then be translated back into a letter. The decryption function is:

$$d(x) = (x - k) \pmod{26}$$

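
As a rough illustration, the shift-and-wrap arithmetic described here can be sketched in Python. The function name `caesar` is made up for this example; it assumes lowercase output and leaves spaces and punctuation untouched. Decryption is the same function with the shift negated.

```python
def caesar(text: str, k: int) -> str:
    """Shift each letter k places; a negative k shifts back (decryption)."""
    out = []
    for ch in text.lower():
        if "a" <= ch <= "z":
            x = ord(ch) - ord("a")                    # letter -> number, 'a' = 0
            out.append(chr((x + k) % 26 + ord("a")))  # number -> letter
        else:
            out.append(ch)                            # leave spaces etc. untouched
    return "".join(out)


ciphertext = caesar("defend the east wall of the castle", 1)
print(ciphertext)              # efgfoe uif fbtu xbmm pg uif dbtumf
print(caesar(ciphertext, -1))  # defend the east wall of the castle
```
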
Other Implementations

For Caesar cipher code in various programming languages, see the Implementations page.

To encipher your own messages in python, you can use the pycipher module. To install it,
use pip install pycipher. To encipher messages with the Caesar cipher (or another cipher, see here
for documentation):

>>> from pycipher import Caesar
>>> Caesar(key=1).encipher('defend the east wall of the castle')
'EFGFOEUIFFBTUXBMMPGUIFDBTUMF'
>>> Caesar(key=1).decipher('EFGFOEUIFFBTUXBMMPGUIFDBTUMF')
'DEFENDTHEEASTWALLOFTHECASTLE'

Cryptanalysis

See Cryptanalysis of the Caesar Cipher for a way of automatically breaking this cipher.

Cryptanalysis is the art of breaking codes and ciphers. The Caesar cipher is probably the
easiest of all ciphers to break. Since the shift has to be a number between 1 and 25, (0 or 26 would
result in an unchanged plaintext) we can simply try each possibility and see which one results in a
piece of readable text. If you happen to know what a piece of the cipher text is, or you can guess a
piece, then this will allow you to immediately find the key.
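
Trying every shift is easy to automate. The sketch below is one possible way to do it; the helper names (`shift`, `brute_force`, `key_from_crib`) are invented for this example. It lists all 25 candidate decryptions and, if you can guess a fragment of the plaintext (a 'crib'), reports which keys produce it.

```python
def shift(text: str, k: int) -> str:
    """Shift lowercase letters by k places, wrapping around the alphabet."""
    return "".join(
        chr((ord(ch) - ord("a") + k) % 26 + ord("a")) if "a" <= ch <= "z" else ch
        for ch in text.lower()
    )


def brute_force(ciphertext: str) -> dict:
    """Return every candidate decryption, indexed by the key that was tried."""
    return {k: shift(ciphertext, -k) for k in range(1, 26)}


def key_from_crib(ciphertext: str, crib: str) -> list:
    """Keys whose decryption contains the guessed plaintext fragment."""
    return [k for k, plain in brute_force(ciphertext).items() if crib in plain]


ciphertext = "efgfoe uif fbtu xbmm pg uif dbtumf"
for k, plain in brute_force(ciphertext).items():
    print(k, plain)
print(key_from_crib(ciphertext, "castle"))  # [1]
```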

If this is not possible, a more systematic approach is to calculate the frequency distribution
of the letters in the cipher text. This consists of counting how many times each letter appears.
Natural English text has a very distinct distribution that can be used to help crack codes. This
distribution is as follows:

[Figure: English letter frequencies]

This means that the letter e is the most common, and appears almost 13% of the time,
whereas z appears far less than 1 percent of the time. Application of the Caesar cipher does not
change these letter frequencies, it merely shifts them along a bit (for a shift of 1, the most frequent
cipher text letter becomes f). A cryptanalyst just has to find the shift that causes the cipher text
frequencies to match up closely with the natural English frequencies, and then decrypt the text
using that shift. This method can be used to easily break Caesar ciphers by hand.
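
The simplest version of this idea can be sketched as follows: count the letters and assume the most frequent one stands for 'e'. The function name `guess_shift` is made up for this example, and the heuristic is only a first guess; on short or unusual texts the most common letter may not be 'e', in which case the next few candidates should be tried.

```python
from collections import Counter


def guess_shift(ciphertext: str) -> int:
    """Guess the key by assuming the most frequent cipher letter stands for 'e'."""
    letters = [ch for ch in ciphertext.lower() if "a" <= ch <= "z"]
    most_common_letter, _count = Counter(letters).most_common(1)[0]
    return (ord(most_common_letter) - ord("e")) % 26


print(guess_shift("efgfoe uif fbtu xbmm pg uif dbtumf"))  # 1
```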

If you are still having trouble, try the cryptanalysis section of the substitution cipher page. All
strategies that work with the substitution cipher will also work with the Caesar cipher (but methods
that work on the Caesar cipher do not necessarily work on the general substitution cipher).

For a method that works well on computers, we need a way of figuring out which of the 25
possible decryptions looks the most like English text. See Cryptanalysis of the Caesar Cipher for a
walkthrough of how to break it using quadgram statistics. The key (or shift) that results in a
decryption with the highest likelihood of being English text is most probably the correct key. Of
course, the more cipher text you have, the more likely this is to be true (this is the case for all
statistical measures, including the frequency approach above). So the method used is to take the
cipher text, try decrypting it with each key, then see which decryption looks the best. This simplistic
method of cryptanalysis only works on very simple ciphers such as the Caesar cipher and the rail
fence cipher; even slightly more complex ciphers can have far too many keys to check all of them.
• Affine Cipher
A type of simple substitution cipher, very easy to crack.

The Affine cipher is a special case of the more general monoalphabetic substitution cipher.

The cipher is less secure than a substitution cipher as it is vulnerable to all of the attacks
that work against substitution ciphers, in addition to other attacks. The cipher's primary weakness
comes from the fact that if the cryptanalyst can discover (by means of frequency analysis, brute
force, guessing or otherwise) the plaintext of two cipher text characters, then the key can be
obtained by solving a pair of simultaneous equations.

The Algorithm

The 'key' for the Affine cipher consists of 2 numbers, we'll call them a and b. The following
discussion assumes the use of a 26 character alphabet (m = 26). a should be chosen to be
relatively prime to m (i.e. a should have no factors in common with m other than 1). For example 15 and 26
have no factors in common, so 15 is an acceptable value for a, however 12 and 26 have factors in
common (e.g. 2) so 12 cannot be used for a value of a. When encrypting, we first convert all the
letters to numbers ('a'=0, 'b'=1, ..., 'z'=25). The cipher text letter c, for any given letter p is
(remember p is the number representing a letter):

$$c = (a p + b) \pmod{m}$$

The decryption function is:

$$p = a^{-1} (c - b) \pmod{m}$$

where $a^{-1}$ is the multiplicative inverse of a in the group of integers modulo m.

To find a multiplicative inverse, we need to find a number x such that:

$$a x \equiv 1 \pmod{m}$$

If we find the number x such that the equation is true, then x is the inverse of a, and we call
it $a^{-1}$. The easiest way to solve this equation is to search each of the numbers 1 to 25, and see
which one satisfies the equation. If you want a more rigorous solution, you can use MATLAB to find x:

> [g,x,d] = gcd(a,m); % we can ignore g and d, we don't need them
> x = mod(x,m);

If you now multiply x and a and reduce the result (mod 26), you will get the answer 1.
Remember, this is just the definition of an inverse i.e. if a*x = 1 (mod 26), then x is an inverse of a
(and a is an inverse of x).
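
The search described above can also be sketched in Python. The function name `inverse_by_search` is made up for this example; the built-in `pow` with an exponent of -1 (available from Python 3.8) is shown alongside it as a cross-check.

```python
from math import gcd


def inverse_by_search(a: int, m: int = 26) -> int:
    """Try each candidate x in 1..m-1 until a*x = 1 (mod m)."""
    if gcd(a, m) != 1:
        raise ValueError(f"{a} has no inverse modulo {m}")
    for x in range(1, m):
        if (a * x) % m == 1:
            return x
    raise ValueError("no inverse found")  # not reached when gcd(a, m) == 1 and m > 1


print(inverse_by_search(5))  # 21
print(pow(5, -1, 26))        # 21, built-in modular inverse (Python 3.8+)
```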

We now use the value of x we calculated as $a^{-1}$. This allows us to perform the decryption
step.

Note: As stated above, m does not have to be 26, it is simply the number of characters in
the alphabet you choose to use. If upper case characters, lowercase characters and spaces are
used, then m will be 53. Digits and punctuation could also be incorporated (which again would
change the value of m).

Assume we discard all non alphabetical characters including spaces. Let the key be a=5
and b=7. The encryption function is then (5*p + 7) (mod 26). To encode:

'defend the east wall of the castle',

we would take the first letter, 'd', convert it to a number, 3 ('a'=0, 'b'=1, ..., 'z'=25) and plug it
into the equation:

$$c = (5 \cdot 3 + 7) \pmod{26} = 22$$

Since 'w' = 22, 'd' is transformed into 'w' using the values a=5 and b=7. If we continue with
all the other letters we would have:

'wbgbuwyqbbhtynhkkzgyqbrhtykb'

Now to decode, the inverse of 5 modulo 26 is 21, i.e. 5*21 = 1 (mod 26). The decoding
function is

$$p = 21 (c - 7) \pmod{26}$$

For the first cipher text character, 'w' = 22, this gives 21*(22 - 7) = 315 = 3 (mod 26),
so we have recovered d=3 as the first plaintext character. Continuing with the remaining characters gives:

'defendtheeastwallofthecastle'
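
The worked example above can be reproduced with a short Python sketch. The function names are invented for this example and this is not the pycipher implementation; it assumes a 26 letter alphabet, discards non-letters as in the example, and uses `pow(a, -1, 26)` (Python 3.8+) for the inverse.

```python
from math import gcd

M = 26  # alphabet size assumed throughout this sketch


def affine_encipher(text: str, a: int, b: int) -> str:
    if gcd(a, M) != 1:
        raise ValueError("a must be relatively prime to 26")
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]  # drop spaces etc.
    return "".join(chr((a * (ord(ch) - 97) + b) % M + 97) for ch in letters)


def affine_decipher(text: str, a: int, b: int) -> str:
    a_inv = pow(a, -1, M)  # multiplicative inverse of a modulo 26
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    return "".join(chr((a_inv * (ord(ch) - 97 - b)) % M + 97) for ch in letters)


ciphertext = affine_encipher("defend the east wall of the castle", 5, 7)
print(ciphertext)                         # wbgbuwyqbbhtynhkkzgyqbrhtykb
print(affine_decipher(ciphertext, 5, 7))  # defendtheeastwallofthecastle
```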

Other Implementations

To encipher your own messages in python, you can use the pycipher module. To install it,
use pip install pycipher. To encipher messages with the Affine cipher (or another cipher, see here
for documentation):

>>> from pycipher import Affine
>>> Affine(a=5,b=9).encipher('defend the east wall of the castle')
'YDIDWYASDDJVAPJMMBIASDTJVAMD'
>>> Affine(a=5,b=9).decipher('YDIDWYASDDJVAPJMMBIASDTJVAMD')
'DEFENDTHEEASTWALLOFTHECASTLE'

Cryptanalysis

See Cryptanalysis of the Affine Cipher for a guide on how to break this cipher automatically.

The Affine cipher is a very insecure cipher, with the Caesar cipher possibly being the only
easier cipher to crack. The Affine cipher is a monoalphabetic substitution cipher, so all the methods
that are used to cryptanalyse substitution ciphers can be used for the affine cipher. Affine ciphers
can also be cracked if the plaintext of any 2 cipher text characters is known.

As an example, imagine we have a cipher text. If the 2 most common characters in the
cipher text are 'h' and 'q', then we can assume that these correspond to 'e' and 't' in the plaintext.
We can set up a pair of simultaneous equations ('h' -> 'e' and 'q' -> 't'). The following 2 equations are simply
two instances of the affine cipher where we know (or assume we know) the values of the plaintext
character and the corresponding ciphertext character for 2 cases, but do not know a or b (in the
following equations we have converted letters to numbers, 'e'=4, 'h'=7, 'q'=16, 't'=19):

$$4a + b \equiv 7 \pmod{26}$$
$$19a + b \equiv 16 \pmod{26}$$

For the following discussion we will refer to the more general set of equations:

$$a p + b \equiv r \pmod{26}$$
$$a q + b \equiv s \pmod{26}$$

Solving systems of equations modulo 26 is slightly more difficult than solving them normally,
but it is still quite easy. We know the values p, q, r and s, and we wish to find a and b. We must first
find the number D = p - q, and $D^{-1}$ (the inverse of D). $D^{-1}$ is found by looping through the numbers
between 1 and 25 until you find a number, x, such that D*x = 1 (mod 26). We can now find the
value of a and b:

$$a = D^{-1} (r - s) \pmod{26}$$
$$b = D^{-1} (p s - q r) \pmod{26}$$

Using the example we started with, p=4, r=7, q=19, s=16. D = p-q = -15 = 11 (mod 26). This
means $D^{-1}$ = 19. So:

$$a = 19 (7 - 16) \pmod{26} = 11$$
$$b = 19 (4 \cdot 16 - 19 \cdot 7) \pmod{26} = 15$$

From this we would conclude that the a, b pair used to encrypt the plaintext was 11 and 15
(this represents the key), respectively. If we decrypt the cipher text under this assumption, we can
see if these are correct. If they are, that is the end, otherwise we could try other combinations of
common cipher text letters instead of our guess of 'e' and 't'. This method is much easier to perform
if you have a program that performs these steps automatically.
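
A minimal sketch of such a program is given below. The function name `recover_affine_key` is made up for this example. It solves a*p + b = r and a*q + b = s (mod 26) by subtracting one equation from the other, which gives a = D^-1 (r - s) with D = p - q. Note that this only works when D has an inverse modulo 26; if it does not, `pow` raises a `ValueError` and a different pair of letters has to be tried.

```python
def recover_affine_key(p: int, r: int, q: int, s: int, m: int = 26) -> tuple:
    """Solve a*p + b = r and a*q + b = s (mod m) for the key (a, b)."""
    D = (p - q) % m
    D_inv = pow(D, -1, m)  # raises ValueError if D has no inverse modulo m
    a = (D_inv * (r - s)) % m
    b = (D_inv * (p * s - q * r)) % m
    return a, b


# Guess: plaintext 'e' (4) became 'h' (7), and plaintext 't' (19) became 'q' (16).
a, b = recover_affine_key(p=4, r=7, q=19, s=16)
print(a, b)  # 11 15

# Sanity check: the recovered key reproduces both assumed letter pairs.
print((a * 4 + b) % 26, (a * 19 + b) % 26)  # 7 16
```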
• Rail-fence Cipher
The rail fence cipher is a very simple, easy to crack cipher. It is a transposition cipher that
follows a simple rule for mixing up the characters in the plaintext to form the cipher text. The rail
fence cipher offers essentially no communication security, and it will be shown that it can be easily
broken even by hand.

Although weak on its own, it can be combined with other ciphers, such as a substitution
cipher, the combination of which is more difficult to break than either cipher on its own.

Many websites claim that the rail-fence cipher is a simpler "write down the columns, read
along the rows" cipher. This is equivalent to using an un-keyed columnar transposition cipher.

Example

The key for the rail fence cipher is just the number of rails. To encrypt a piece of text, e.g.

defend the east wall of the castle

we write it out in a special way on a number of rails (the key here is 3):

d...n...e...t...l...h...s...
.e.e.d.h.e.s.w.l.o.t.e.a.t.e
..f...t...a...a...f...c...l.

The cipher text is read off along the rows:

dnetlhseedheswloteateftaafcl

With a key of 4:

d.....t.....t.....f.....s...
.e...d.h...s.w...o.t...a.t..
..f.n...e.a...a.l...h.c...l.
...e.....e.....l.....e.....e

The cipher text is again read off along the rows:

dttfsedhswotatfneaalhcleelee
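
The zig-zag layout above can be sketched in Python by computing which rail each character lands on, then reading the rails off in order. The helper names `rail_pattern` and `rail_fence_encipher` are invented for this example, and spaces are assumed to be discarded before writing the text on the rails, as in the diagrams.

```python
def rail_pattern(length: int, rails: int) -> list:
    """Rail index for each position, zig-zagging 0, 1, ..., rails-1, ..., 1, 0, ..."""
    if rails < 2:
        return [0] * length
    period = 2 * (rails - 1)
    return [min(i % period, period - i % period) for i in range(length)]


def rail_fence_encipher(text: str, rails: int) -> str:
    letters = [ch for ch in text if ch.isalpha()]  # drop spaces and punctuation
    pattern = rail_pattern(len(letters), rails)
    # Read off rail 0 first, then rail 1, and so on.
    return "".join(ch for rail in range(rails)
                   for ch, r in zip(letters, pattern) if r == rail)


print(rail_fence_encipher("defend the east wall of the castle", 3))
# dnetlhseedheswloteateftaafcl
print(rail_fence_encipher("defend the east wall of the castle", 4))
# dttfsedhswotatfneaalhcleelee
```
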
Cryptanalysis

Cryptanalysis is the art of breaking codes and ciphers. The rail fence cipher is a very easy
cipher to break. A cryptanalyst (code breaker) simply has to try several keys until the correct one is
found. It is very easy to find a key if you know some of the plaintext, or can guess some of it.
Anagramming, which consists of taking chunks of cipher text and guessing what the plaintext
would be, is another very powerful method that can be used with any transposition cipher.

A peculiarity of transposition ciphers is that the frequency distribution of the characters will
be identical to that of natural text (since no substitutions have been performed, it is just the order
that has been mixed up). In other words it should look just like this:

[Figure: English Letter Frequencies]

For a method that works well on computers, we need a way of figuring out which of the keys
results in the most English like plaintext after decryption. For automated methods of determining
how 'English like' a piece of text is, check out the Classical Cryptanalysis section, in particular
Quad grams as a fitness measure. The key that results in a decryption with the highest likelihood of
being English text is most probably the correct key. Of course, the more cipher text you have, the
more likely this is to be true (this is the case for all statistical measures, including the frequency
approaches above). So the method used is to take the cipher text, try decrypting it with each key,
then see which decryption looks the best. This simplistic method of cryptanalysis (checking every
single possible key) only works on very simple ciphers such as this cipher; even slightly more
complex ciphers can have far too many keys to check all of them.
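
A rough sketch of the "try every key" approach is shown below. To keep it short, it uses a known plaintext fragment (a crib) instead of a quadgram fitness score to pick out the right decryption; that substitution, and the helper names, are choices made for this example only.

```python
def rail_pattern(length: int, rails: int) -> list:
    """Rail index for each plaintext position in the zig-zag (rails >= 2)."""
    period = 2 * (rails - 1)
    return [min(i % period, period - i % period) for i in range(length)]


def rail_fence_decipher(ciphertext: str, rails: int) -> str:
    pattern = rail_pattern(len(ciphertext), rails)
    # Plaintext positions in the order they were read off: rail by rail.
    # sorted() is stable, so positions on the same rail stay left to right.
    order = sorted(range(len(ciphertext)), key=lambda i: pattern[i])
    plain = [""] * len(ciphertext)
    for position, ch in zip(order, ciphertext):
        plain[position] = ch
    return "".join(plain)


def break_with_crib(ciphertext: str, crib: str) -> list:
    """Try every key from 2 rails upward; keep decryptions containing the crib."""
    hits = []
    for rails in range(2, len(ciphertext)):
        candidate = rail_fence_decipher(ciphertext, rails)
        if crib in candidate:
            hits.append((rails, candidate))
    return hits


for rails, plain in break_with_crib("dnetlhseedheswloteateftaafcl", "castle"):
    print(rails, plain)
```

With no crib available, the `crib in candidate` test would be replaced by a score of how English-like each candidate is, keeping the best one.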
• Baconian Cipher
The Baconian cipher is named after its inventor, Sir Francis Bacon. The Baconian cipher is
a substitution cipher in which each letter is replaced by a sequence of 5 characters. In the original
cipher, these were sequences of 'A's and 'B's e.g. the letter 'D' was replaced by 'aaabb', the letter
'O' was replaced by 'abbab' etc.

This cipher offers very little communication security, as it is a substitution cipher. As such all
the methods used to cryptanalyse substitution ciphers can be used to break Baconian ciphers. The
main advantage of the cipher is that it allows hiding the fact that a secret message has been sent at
all.

The Algorithm

Each letter is assigned to a string of five binary digits. These could be the letters 'A' and 'B',
the numbers 0 and 1 or whatever else you may desire. An example Baconian Cipher Encoding
might be:

A = aaaaa    I/J = abaaa    R   = baaaa
B = aaaab    K   = abaab    S   = baaab
C = aaaba    L   = ababa    T   = baaba
D = aaabb    M   = ababb    U/V = baabb
E = aabaa    N   = abbaa    W   = babaa
F = aabab    O   = abbab    X   = babab
G = aabba    P   = abbba    Y   = babba
H = aabbb    Q   = abbbb    Z   = babbb

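To encipher a message, e.g. 'STRIKE NOW', we replace each letter:

S     T     R     I     K     E     N     O     W
baaab baaba baaaa abaaa abaab aabaa abbaa abbab babaa

This sequence can then be hidden in the letter case of an innocent-looking cover message, shown here first with
slashes marking each group of 5 letters, and then without them:

Hold O/Ff uNt/Il you/ hEar f/rOm mE/ agAin/. wE May/ cOMpR/OmIse.
Hold OFf uNtIl you hEar frOm mE agAin. wE May cOMpROmIse.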

The message above has been written so that capital letters are used where the Baconian
cipher has a 'b' and lowercase where there is an 'a'. This scheme is a little bit transparent; however
there are many ways of encoding a Baconian cipher in text. This page discusses several examples.
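
Both steps (encoding letters as groups of five, then hiding the groups in letter case) can be sketched in Python. The names below are invented for this example, and the table is the 24 letter one shown above, with I/J and U/V sharing a code. The cover sentence is assumed to have at least as many letters as there are 'a'/'b' symbols to hide.

```python
# 24-letter Baconian alphabet from the table above: I/J and U/V share codes.
BACON_LETTERS = "ABCDEFGHIKLMNOPQRSTUWXYZ"
BACON = {letter: format(i, "05b").replace("0", "a").replace("1", "b")
         for i, letter in enumerate(BACON_LETTERS)}
BACON["J"] = BACON["I"]
BACON["V"] = BACON["U"]


def bacon_encode(message: str) -> list:
    """One five-symbol group per letter; non-letters are skipped."""
    return [BACON[ch] for ch in message.upper() if ch in BACON]


def hide_in_case(codes: list, cover: str) -> str:
    """Upper-case a cover letter for each 'b', lower-case it for each 'a'."""
    bits = iter("".join(codes))
    out = []
    for ch in cover:
        bit = next(bits, None) if ch.isalpha() else None
        if bit is None:
            out.append(ch)  # punctuation, spaces, or cover text left over
        else:
            out.append(ch.upper() if bit == "b" else ch.lower())
    return "".join(out)


codes = bacon_encode("STRIKE NOW")
print(" ".join(codes))
# baaab baaba baaaa abaaa abaab aabaa abbaa abbab babaa
print(hide_in_case(codes, "Hold off until you hear from me again. We may compromise."))
# Hold OFf uNtIl you hEar frOm mE agAin. wE May cOMpROmIse.
```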

Cryptanalysis

The Baconian cipher is a substitution cipher, which can be easily broken, see here for an
example of quickly breaking substitution ciphers.
• Polybius Square Cipher
The Polybius Square is essentially identical to the simple substitution cipher, except that
each plaintext character is enciphered as 2 cipher text characters. It can usually be detected if
there are only 5 or 6 different characters in the cipher text.

This algorithm offers very little communication security, and can be easily broken even by
hand, especially as the messages become longer (more than several hundred cipher text
characters).

Example

Here is a quick example of the encryption and decryption steps involved with the Polybius
Square. The text we will encrypt is 'defend the east wall of the castle'.

Keys for the Polybius Square usually consist of a 25 letter 'key square', e.g. (the letters
along the top and side can be chosen arbitrarily):

[Figure: example key square]

An example encryption using the above key:

Plaintext:   defend the east wall of the castle
Cipher text: CCBACBBABECC EDABBA BABBDDED EABBBDBD CACB EDABBA DBBBDDEDBDBA

It is easy to see how each character in the plaintext is replaced with 2 characters in the
cipher alphabet. Decryption is just as easy, by using 2 cipher characters as the row and column
into the key square to get the original plaintext character back. When generating keys it is popular
to use a key word, e.g. 'zebra' to generate it, since it is much easier to remember a key word
compared to a random jumble of 25 characters. Using the keyword 'zebra', the key would become
(i/j are combined):

Cipher alphabet: zebracdfghiklmnopqstuvwxy

Here we have written out the key as a single string instead of a square. To create the
square, the first 5 characters make the first row; the second 5 characters make the second row etc.

If your keyword has repeated characters e.g. 'mammoth', be careful not to include the
repeated characters in the cipher alphabet.
It is interesting to note that the ADFGVX cipher uses a 6x6 version of the Polybius square
as the first step in its encryption.
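
Generating a key from a keyword and looking letters up in the square can be sketched as follows. The function names are made up for this example and this is not the pycipher implementation. The sketch assumes a 5x5 square with i/j combined and row/column labels 'ABCDE'; the key string used in the usage lines is the one from the pycipher example in this section, not the key square from the figure.

```python
import string


def key_square(keyword: str) -> str:
    """25-letter key: keyword letters first (repeats dropped), then the rest.

    'j' is merged into 'i' so that the alphabet fits a 5x5 square.
    """
    seen = []
    for ch in keyword.lower().replace("j", "i") + string.ascii_lowercase.replace("j", ""):
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def polybius_encipher(text: str, key: str, labels: str = "ABCDE") -> str:
    """Replace each letter by its row label followed by its column label."""
    out = []
    for ch in text.lower().replace("j", "i"):
        if ch in key:  # spaces and punctuation are dropped
            row, col = divmod(key.index(ch), 5)
            out.append(labels[row] + labels[col])
    return "".join(out)


print(key_square("zebra"))    # zebracdfghiklmnopqstuvwxy
print(key_square("mammoth"))  # maothbcdefgiklnpqrsuvwxyz
print(polybius_encipher("defend the east wall of the castle",
                        "phqgiumeaylnofdxkrcvstzwb"))
```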

Other Implementations

To encipher your own messages in python, you can use the pycipher module. To install it,
use pip install pycipher. To encipher messages with the Polybius square cipher (or another cipher,
see here for documentation):

>>> from pycipher import PolybiusSquare
>>> p = PolybiusSquare('phqgiumeaylnofdxkrcvstzwb',5,'ABCDE')
>>> p.encipher('defend the east wall of the castle')
'CEBCCDBCCBCEEBABBCBCBDEAEBEDBDCACACCCDEBABBCDDBDEAEBCABC'
>>> p.decipher('CEBCCDBCCBCEEBABBCBCBDEAEBEDBDCACACCCDEBABBCDDBDEAEBCABC')
'DEFENDTHEEASTWALLOFTHECASTLE'

Cryptanalysis

The Polybius Square is quite easy to break, since it is just a substitution cipher in disguise.
This means that the whole section on cryptanalysing substitution ciphers is applicable, and will not
be repeated here. The only minor difference is that cryptanalysis must now be done on pairs of
characters instead of single characters.
• Simple Substitution Cipher
The simple substitution cipher is a cipher that has been in use for many hundreds of years
(an excellent history is given in Simon Singh's 'The Code Book'). It basically consists of substituting
every plaintext character for a different cipher text character. It differs from the Caesar cipher in that
the cipher alphabet is not simply the alphabet shifted, it is completely jumbled.

The simple substitution cipher offers very little communication security, and it will be shown
that it can be easily broken even by hand, especially as the messages become longer (more than
several hundred cipher text characters).

Example

Here is a quick example of the encryption and decryption steps involved with the simple
substitution cipher. The text we will encrypt is 'defend the east wall of the castle'.

Keys for the simple substitution cipher usually consist of 26 letters (compared to the Caesar
cipher's single number). An example key is:

Plain alphabet:  abcdefghijklmnopqrstuvwxyz
Cipher alphabet: phqgiumeaylnofdxjkrcvstzwb

An example encryption using the above key:

Plaintext:   defend the east wall of the castle
Cipher text: giuifg cei iprc tpnn du cei qprcni

It is easy to see how each character in the plaintext is replaced with the corresponding letter
in the cipher alphabet. Decryption is just as easy, by going from the cipher alphabet back to the
plain alphabet. When generating keys it is popular to use a key word, e.g. 'zebra' to generate it,
since it is much easier to remember a key word compared to a random jumble of 26 characters.
Using the keyword 'zebra', the key would become:

Cipher alphabet: zebracdfghijklmnopqstuvwxy

This key is then used identically to the example above. If your key word has repeated
characters e.g. 'mammoth', be careful not to include the repeated characters in the cipher alphabet.
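
Building a cipher alphabet from a keyword, and using it, can be sketched in a few lines of Python. The function names are invented for this example, the keyword is assumed to contain letters only, and spaces are kept so the output matches the spaced example above.

```python
import string


def keyed_alphabet(keyword: str) -> str:
    """Keyword letters first (repeats dropped), then the unused letters in order."""
    # dict.fromkeys keeps the first occurrence of each letter, in order.
    return "".join(dict.fromkeys(keyword.lower() + string.ascii_lowercase))


def substitute(text: str, cipher_alphabet: str) -> str:
    """Encipher: map the plain alphabet onto the cipher alphabet."""
    table = str.maketrans(string.ascii_lowercase, cipher_alphabet)
    return text.lower().translate(table)


def unsubstitute(text: str, cipher_alphabet: str) -> str:
    """Decipher: map the cipher alphabet back onto the plain alphabet."""
    table = str.maketrans(cipher_alphabet, string.ascii_lowercase)
    return text.lower().translate(table)


print(keyed_alphabet("zebra"))    # zebracdfghijklmnopqstuvwxy
print(keyed_alphabet("mammoth"))  # maothbcdefgijklnpqrsuvwxyz

key = "phqgiumeaylnofdxjkrcvstzwb"
ciphertext = substitute("defend the east wall of the castle", key)
print(ciphertext)                     # giuifg cei iprc tpnn du cei qprcni
print(unsubstitute(ciphertext, key))  # defend the east wall of the castle
```
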
Other Implementations

To encipher your own messages in python, you can use the pycipher module. To install it,
use pip install pycipher. To encipher messages with the substitution cipher (or another cipher, see
here for documentation):

>>> from pycipher import SimpleSubstitution
>>> ss = SimpleSubstitution('phqgiumeaylnofdxjkrcvstzwb')
>>> ss.encipher('defend the east wall of the castle')
'GIUIFGCEIIPRCTPNNDUCEIQPRCNI'
>>> ss.decipher('GIUIFGCEIIPRCTPNNDUCEIQPRCNI')
'DEFENDTHEEASTWALLOFTHECASTLE'

Cryptanalysis

See Cryptanalysis of the Substitution Cipher for a guide on how to automatically break this
cipher.

The simple substitution cipher is quite easy to break. Even though the number of keys is
around $2^{88.4}$ (a really big number), there is a lot of redundancy and other statistical properties of
English text that make it quite easy to determine a reasonably good key. The first step is to
calculate the frequency distribution of the letters in the cipher text. This consists of counting how
many times each letter appears. Natural English text has a very distinct distribution that can be
used to help crack codes. This distribution is as follows:

[Figure: English Letter Frequencies. Letter frequencies ordered from most frequent to least frequent]


This means that the letter 'e' is the most common, and appears almost 13% of the time,
whereas 'z' appears far less than 1 percent of the time. Application of the simple substitution cipher
does not change these letter frequencies, it merely jumbles them up a bit (in the example above, 'e'
is enciphered as 'i', which means 'i' will be the most common character in the cipher text). A
cryptanalyst has to find the key that was used to encrypt the message, which means finding the
mapping for each character. For reasonably large pieces of text (several hundred characters), it is
possible to just replace the most common cipher text character with 'e', the second most common
cipher text character with 't' etc. for each character (replace according to the order in the image on
the right). This will result in a very good approximation of the original plaintext, but only for pieces of
text with statistical properties close to that for English, which is only guaranteed for long tracts of
text.
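
The "replace by frequency rank" idea can be sketched as below. The function names are made up for this example, and `ENGLISH_ORDER` is an approximate most-to-least-frequent ordering assumed for illustration; published orderings differ slightly among the rarer letters. On short texts the result is only a starting point that needs correcting by hand.

```python
from collections import Counter

# Approximate English letter ordering, most frequent first (an assumption;
# sources disagree slightly on the order of the less common letters).
ENGLISH_ORDER = "etaoinshrdlcumwfgypbvkjxqz"


def frequency_guess(ciphertext: str) -> dict:
    """Map cipher letters to plain letters by matching frequency ranks."""
    letters = [ch for ch in ciphertext.lower() if ch.isalpha()]
    ranked = [ch for ch, _count in Counter(letters).most_common()]
    return dict(zip(ranked, ENGLISH_ORDER))


def apply_guess(ciphertext: str, mapping: dict) -> str:
    """Replace every mapped cipher letter; leave everything else as it is."""
    return "".join(mapping.get(ch, ch) for ch in ciphertext.lower())


mapping = frequency_guess("giuifg cei iprc tpnn du cei qprcni")
print(mapping["i"])  # e
print(apply_guess("giuifg cei iprc tpnn du cei qprcni", mapping))
```

In the short example only the most common letter is certain to be mapped correctly, which is why longer cipher texts are needed for this approach to work well.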

Short pieces of text often need more expertise to crack. If the original punctuation exists in
the message, e.g. 'giuifg cei iprc tpnn du cei qprcni', then it is possible to use the following rules to
guess some of the words, then, using this information, some of the letters in the cipher alphabet are
known.

* the information in the above table was borrowed from Simon Singh's website,
http://www.simonsingh.net/The_Black_Chamber/hintsandtips.htm

Usually, punctuation in cipher text is removed and the cipher text is put into blocks such as
'giuif gceii prctp nnduc eiqpr cnizz', which prevents the previous tricks from working. There are,
however, many other characteristics of English that can be utilized. The table below lists some
other facts that can be used to determine the correct key. Only the few most common examples are
given for each rule.

For information about other languages, see Letter frequencies for various languages.
* the information in the above table was borrowed from Simon Singh's website,
http://www.simonsingh.net/The_Black_Chamber/hintsandtips.htm

There are more tricks that can be used besides the ones listed here, maybe one day they
will be included here. In the meantime use your favorite search engine to find more information.

• Codes and Nomenclators Cipher


Nomenclators use elements of substitution ciphers and of codes. They generally combined
a small codebook with large homophonic substitution tables. Originally the code was restricted to
the names of important people, hence the name of the cipher (a nomenclator was a public official
who announced the titles of visiting dignitaries); however, it gradually expanded to cover many
common words and place names as well. The symbols for whole words (called codewords) and
letters were not distinguished in the cipher text. The Rossignols' Great Cipher used by Louis XIV of
France was an example of a nomenclator. After it went out of use, messages in French archives
remained unreadable for about two centuries.

Nomenclators were used for diplomatic correspondence and espionage from the early
fifteenth century to the late eighteenth century; however they were being routinely broken by the
mid-sixteenth century.

Codes are an extension of nomenclators, in which all words (or entire phrases) are replaced
by symbols (rather than the individual letters replaced in ciphers). Working with whole words
removes any possibility of using letter frequency analysis, but it introduces its own problems, since
it requires a huge number of symbols (one for each word). These symbols are usually recorded in a
code book, which can be as thick as a standard dictionary.
 Columnar Transposition Cipher
The columnar transposition cipher is a fairly simple, easy to implement cipher. It is a
transposition cipher that follows a simple rule for mixing up the characters in the plaintext to form
the cipher text.

Although weak on its own, it can be combined with other ciphers, such as a substitution
cipher, the combination of which can be more difficult to break than either cipher on its own. The
ADFGVX cipher uses a columnar transposition to greatly improve its security.

Example

The key for the columnar transposition cipher is a keyword e.g. GERMAN. The row length
that is used is the same as the length of the keyword. To encrypt a piece of text, e.g.

defend the east wall of the castle

we write it out in rows beneath the keyword, one letter per column (the keyword here is GERMAN):

G E R M A N
d e f e n d
t h e e a s
t w a l l o
f t h e c a
s t l e x x

In the above example, the plaintext has been padded so that it neatly fits in a rectangle. This
is known as a regular columnar transposition. An irregular columnar transposition leaves these
characters blank, though this makes decryption slightly more difficult. The columns are now
reordered such that the letters in the key word are ordered alphabetically.

A E G M N R
n e d e d f
a h t e s e
l w t l o a
c t f e a h
x t s e x l

The cipher text is read off along the columns:

nalcxehwttdttfseeleedsoaxfeahl
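
The steps above can be sketched in a few lines of Python. This is only a minimal illustration of the regular (padded) variant; the function names and the choice of 'x' as the padding letter are our own assumptions, not part of any library.

```python
def columnar_encipher(plaintext, keyword, pad="x"):
    """Regular columnar transposition: pad, write in rows, read columns in key order."""
    text = [c for c in plaintext.lower() if c.isalpha()]
    width = len(keyword)
    # Pad so the text fills a complete rectangle (the "regular" variant).
    while len(text) % width:
        text.append(pad)
    # Column indices sorted by key letter; ties keep their left-to-right order.
    order = sorted(range(width), key=lambda i: keyword[i])
    # Column i holds every width-th letter starting at offset i.
    return "".join("".join(text[i::width]) for i in order)


def columnar_decipher(ciphertext, keyword):
    """Undo a regular columnar transposition (padding is left in place)."""
    width = len(keyword)
    rows = len(ciphertext) // width
    order = sorted(range(width), key=lambda i: keyword[i])
    cols = [None] * width
    # The n-th chunk of cipher text belongs to the n-th column in key order.
    for n, i in enumerate(order):
        cols[i] = ciphertext[n * rows:(n + 1) * rows]
    return "".join(cols[c][r] for r in range(rows) for c in range(width))


print(columnar_encipher("defend the east wall of the castle", "GERMAN"))
# nalcxehwttdttfseeleedsoaxfeahl
```

Deciphering returns the padded text, so the trailing 'xx' has to be removed by the reader.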

Other Implementations

To encipher your own messages in python, you can use the pycipher module. To install it,
use pip install pycipher. To encipher messages with the Columnar transposition cipher (or another
cipher, see here for documentation):

>>> from pycipher import ColTrans
>>> ColTrans("HELLO").encipher('defend the east wall of the castle')
'ETSLELDDALHTFHTOCEEEWFANEATS'
>>> ColTrans("HELLO").decipher('ETSLELDDALHTFHTOCEEEWFANEATS')
'DEFENDTHEEASTWALLOFTHECASTLE'

Cryptanalysis

For a guide on how to automatically break columnar transposition ciphers, see here.

Breaking columnar transposition ciphers by hand is covered in the book by Helen Fouché
Gaines, "Cryptanalysis - a study of ciphers and their solution", and the book by Sinkov, "Elementary
Cryptanalysis". A comprehensive guide is also given in "Military Cryptanalysis - part IV" by
Friedman.

The columnar transposition cipher is not the easiest of transposition ciphers to break, but
there are statistical properties of language that can be exploited to recover the key. To greatly
increase the security, a substitution cipher could be employed as well as the transposition.

A peculiarity of transposition ciphers is that the frequency distribution of the characters will
be identical to that of natural text (since no substitutions have been performed, it is just the order
that has been mixed up). In other words it should look just like this:
English Letter Frequencies
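
A quick way to see this property is to count the letters before and after enciphering. The short check below uses the padded plaintext and the cipher text from the GERMAN example earlier in this section; it is only an illustration.

```python
from collections import Counter

plaintext = "defendtheeastwallofthecastlexx"   # padded plaintext from the example
ciphertext = "nalcxehwttdttfseeleedsoaxfeahl"  # its columnar transposition

# A transposition only reorders letters, so the two counts must be equal.
same_counts = Counter(plaintext) == Counter(ciphertext)
print(same_counts)  # True
```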

Cracking by hand is usually performed by anagramming, or trying to reconstruct the route.
The more complex the route, the more difficult it is to crack.

For a method that works well on computers, we need a way of figuring out how "English like"
a piece of text is; see the Text Characterization cryptanalysis section. The key that results in
a decryption with the highest likelihood of being English text is most probably the correct key. Of
course, the more cipher text you have, the more likely this is to be true (this is the case for all
statistical measures, including the frequency approaches above). So the method used is to take the
cipher text, try decrypting it with each key, then see which decryption looks the best. For this
cipher the number of possible keys grows very quickly with the key length, so instead of trying
every key we can use an optimization technique such as simulated annealing or a genetic algorithm
to search for the key.
 Autokey Cipher
The Autokey cipher is a polyalphabetic substitution cipher that is closely related to the
Vigenere cipher; it differs in how the key material is generated. The Autokey cipher uses a key
word followed by the plaintext itself as its key material, which makes it more secure than
Vigenere.

It was invented by Blaise de Vigenère in 1586.

The Algorithm

The 'key' for the Autokey cipher is a key word, e.g. 'FORTIFICATION'.

The Autokey cipher uses the following tableau (the 'tabula recta') to encipher the plaintext:

To encipher a message, place the keyword above the plaintext. Once all of the key
characters have been written, start writing the plaintext as the key:

FORTIFICATIONDEFENDTHEEASTWA
DEFENDTHEEASTWALLOFTHECASTLE

Now we take the letter we will be encoding, 'D', and find it in the first column of the tableau.
Then, we move along the 'D' row of the tableau until we come to the column with the 'F' at the top
(the 'F' is the key letter for the first 'D'). The intersection is our cipher text character, 'I'.

So, the cipher text for the above plaintext is:

FORTIFICATIONDEFENDTHEEASTWA
DEFENDTHEEASTWALLOFTHECASTLE
ISWXVIBJEXIGGZEQPBIMOIGAKMHE
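
Looking up the tableau is the same as adding the plaintext and key letters modulo 26 (with A=0 ... Z=25), so the procedure can be sketched as follows. The function names here are our own, chosen for illustration; this is not the pycipher implementation.

```python
from string import ascii_uppercase as ALPHABET


def autokey_encipher(plaintext, keyword):
    plain = [c for c in plaintext.upper() if c in ALPHABET]
    # Key stream: the keyword followed by the plaintext itself.
    stream = (list(keyword.upper()) + plain)[:len(plain)]
    return "".join(ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
                   for p, k in zip(plain, stream))


def autokey_decipher(ciphertext, keyword):
    stream = list(keyword.upper())
    plain = []
    for i, c in enumerate(ciphertext.upper()):
        p = ALPHABET[(ALPHABET.index(c) - ALPHABET.index(stream[i])) % 26]
        plain.append(p)
        stream.append(p)  # each recovered letter extends the key stream
    return "".join(plain)


print(autokey_encipher("defend the east wall of the castle", "FORTIFICATION"))
# ISWXVIBJEXIGGZEQPBIMOIGAKMHE
```

Note that deciphering has to proceed left to right, because each key letter beyond the keyword is only known once the corresponding plaintext letter has been recovered.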

Other Implementations

To encipher your own messages in python, you can use the pycipher module. To install it,
use pip install pycipher. To encipher messages with the Autokey cipher (or another cipher, see
here for documentation):

>>> from pycipher import Autokey
>>> Autokey('HELLO').encipher('defend the east wall of the castle')
'KIQPBGXMIRDLAAELDHBTSPQFLAPG'
>>> Autokey('HELLO').decipher('KIQPBGXMIRDLAAELDHBTSPQFLAPG')
'DEFENDTHEEASTWALLOFTHECASTLE'

Cryptanalysis

Despite being more secure than the Vigenere cipher, the Autokey cipher is still very easy to
break using automated methods. The reason Autokey is more difficult to break than Vigenere
ciphers is that the key does not repeat, which means the Kasiski test fails, and the Index of
Coincidence can't be used to determine the key length. Its main weakness is that partial keys can
be tested, i.e. if you have the first key letter of a length 7 key, then the 1st, 8th, 15th, 22nd etc.
characters will be correctly decrypted. This weakness is exploited in the Autokey cracking guide.
 Beaufort Cipher
The Beaufort cipher, created by Sir Francis Beaufort, is a polyalphabetic substitution cipher
that is similar to the Vigenère cipher, except that it enciphers characters in a slightly different
manner.

The Algorithm

The 'key' for a Beaufort cipher is a key word, e.g. 'FORTIFICATION'.

The Beaufort cipher uses the following tableau (the 'tabula recta') to encipher the plaintext:

To encipher a message, repeat the keyword above the plaintext:

FORTIFICATIONFORTIFICATIONFO
DEFENDTHEEASTWALLOFTHECASTLE

The following assumes we are enciphering the plaintext letter 'D' with the key letter 'F'. We
take the letter we will be encoding and find its column on the tableau, in this case the 'D'
column. Then, we move down the 'D' column of the tableau until we come to the key letter, in this
case 'F' (the 'F' is the key letter for the first 'D'). Our cipher text character is then read from the
far left of the row our key character was in, i.e. with 'D' plaintext and 'F' key, our cipher text
character is 'C'.

So, the cipher text for the above plaintext is:

FORTIFICATIONFORTIFICATIONFO
DEFENDTHEEASTWALLOFTHECASTLE
CKMPVCPVWPIWUJOGIUAPVWRIWUUK

Deciphering is performed in an identical fashion, i.e. encryption and decryption with the
Beaufort cipher use exactly the same algorithm.

This process can be compared to the Vigenère cipher, which uses a different algorithm, but
the same tableau, for finding the cipher text characters.
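
In arithmetic terms (A=0 ... Z=25), the Beaufort lookup amounts to subtracting the plaintext letter from the key letter modulo 26, which is why applying it twice gives back the original text. A minimal sketch, with a function name of our own choosing:

```python
def beaufort(text, keyword):
    """Beaufort: cipher = (key - plain) mod 26. The same call also deciphers."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    key = keyword.upper()
    out = []
    for i, p in enumerate(letters):
        k = key[i % len(key)]  # the keyword repeats along the message
        out.append(chr((ord(k) - ord(p)) % 26 + ord("A")))
    return "".join(out)


cipher = beaufort("defend the east wall of the castle", "FORTIFICATION")
print(cipher)                             # CKMPVCPVWPIWUJOGIUAPVWRIWUUK
print(beaufort(cipher, "FORTIFICATION"))  # DEFENDTHEEASTWALLOFTHECASTLE
```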

Other Implementations

To encipher your own messages in python, you can use the pycipher module. To install it,
use pip install pycipher. To encipher messages with the Beaufort cipher (or another cipher, see
here for documentation):

>>> from pycipher import Beaufort
>>> Beaufort('HELLO').encipher('defend the east wall of the castle')
'EAGHBELEHKHMSPOWTXGVAAJLWOTH'
>>> Beaufort('HELLO').decipher('EAGHBELEHKHMSPOWTXGVAAJLWOTH')
'DEFENDTHEEASTWALLOFTHECASTLE'
 Porta Cipher
The Porta Cipher is a polyalphabetic substitution cipher invented by Giovanni Battista della
Porta. Where the Vigenere cipher is a polyalphabetic cipher with 26 alphabets, the Porta is
basically the same except that it only uses 13 alphabets. The 13 cipher alphabets it uses are
reciprocal, so enciphering is the same as deciphering.

The algorithm used here is the same as that used by the American Cryptogram Association.
Another source is Helen Fouché Gaines' book "Cryptanalysis".

The Algorithm

The 'key' for a Porta cipher is a key word, e.g. 'FORTIFICATION'.

The Porta Cipher uses the following tableau to encipher the plaintext:

*There are a few slightly different tableaus floating around the net; I have gone with the one
used by the ACA, also referenced in Helen Fouché Gaines' book "Cryptanalysis".

To encipher a message, repeat the keyword above the plaintext:

FORTIFICATIONFORTIFICATIONFO
DEFENDTHEEASTWALLOFTHECASTLE

Now we take the first key letter 'F', and find it in the first column (the key column containing
two letters) of the tableau. Then, we move along the 'F' row of the tableau until we come to the
column with the 'D' at the top (the 'D' is the first plaintext letter). The intersection is our cipher text
character, 'S'. The same process is repeated for all the characters.

So, the cipher text for the above plaintext is:

FORTIFICATIONFORTIFICATIONFO
DEFENDTHEEASTWALLOFTHECASTLE
SYNNJSCVRNRLAHUTUKUCVRYRLANY

You may notice that it is possible for two different keywords to produce exactly the same
enciphered message, because each row of the tableau is shared by two key letters. The encryption
and decryption process for this cipher is identical, so encrypting a piece of text twice with the
same key will return the original text.
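
The tableau lookup can also be written arithmetically. The sketch below assumes the ACA-style tableau, in which key letters pair up (A/B, C/D, ...) and each pair shifts the second half of the alphabet by one more place; this arrangement is our assumption about the tableau, chosen because it reproduces the example cipher text above. The function name is our own.

```python
def porta(text, keyword):
    """Porta cipher sketch (assumed ACA-style tableau). Enciphers and deciphers."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    key = keyword.upper()
    out = []
    for i, ch in enumerate(letters):
        # Key letters pair up: A/B -> 0, C/D -> 1, ..., Y/Z -> 12.
        shift = (ord(key[i % len(key)]) - ord("A")) // 2
        p = ord(ch) - ord("A")
        if p < 13:
            c = 13 + (p + shift) % 13    # A-M always maps into N-Z
        else:
            c = (p - 13 - shift) % 13    # N-Z always maps into A-M
        out.append(chr(c + ord("A")))
    return "".join(out)


cipher = porta("defend the east wall of the castle", "FORTIFICATION")
print(cipher)                            # SYNNJSCVRNRLAHUTUKUCVRYRLANY
print(porta(cipher, "FORTIFICATION"))    # DEFENDTHEEASTWALLOFTHECASTLE
```

Under this assumption the keyword 'EPQSJEJDBSJPM' gives the same cipher text as 'FORTIFICATION', since every letter falls in the same pair.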

Cryptanalysis

Because of the reciprocal alphabets used, it is impossible for any letter from one half of the
alphabet (A-M or N-Z) to be replaced with a letter from the same half. Let's say we have a
cryptogram sequence HEP; can this decrypt to THE? No, because H cannot be enciphered as E
(both are in the first half of the alphabet). The same logic can rule out any of the common
trigrams THA, AND, ENT, ION, TIO, FOR, and many others. One possible decryption would be STH.
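
This rule is easy to automate when testing candidate words against a piece of cipher text. The helper below is a small sketch with a name of our own choosing; it only checks the half-alphabet rule and says nothing about whether a consistent key exists.

```python
def porta_possible(ciphertext, candidate):
    """True if every cipher/plain letter pair crosses between A-M and N-Z."""
    return all((c.upper() < "N") != (p.upper() < "N")
               for c, p in zip(ciphertext, candidate))


for guess in ["THE", "THA", "AND", "ENT", "ION", "TIO", "FOR", "STH"]:
    print(guess, porta_possible("HEP", guess))
# Only STH passes the test.
```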

The Porta cipher can be broken in the same way as a Vigenere cipher; for a guide on how to
break Vigenere ciphers automatically, see here.

When trying to break the Porta cipher, the first step is to determine the key length. This page
describes how to use the Index of Coincidence to determine the key length for the Vigenere; the
same process can be used for the Porta. Once this is known, we can start trying to determine the
exact key.
 Running Key Cipher
The Running Key cipher has the same internal workings as the Vigenere cipher. The
difference lies in how the key is chosen; the Vigenere cipher uses a short key that repeats,
whereas the running key cipher uses a long key such as an excerpt from a book. This means the
key does not repeat, making cryptanalysis more difficult. The cipher can still be broken though, as
there are statistical patterns in both the key and the plaintext which can be exploited.

If the key for the running key cipher comes from a statistically random source, then it
becomes a 'one time pad' cipher. One time pads are theoretically unbreakable ciphers, because
every possible decryption is equally likely.

The Algorithm

The 'key' for a running key cipher is a long piece of text, e.g. an excerpt from a book. The
running key cipher uses the following tableau (the 'tabula recta') to encipher the plaintext:

To encipher a message, write the key stream above the plaintext; in this case our key is
from a Terry Pratchett book: 'How does the duck know that? Said Victor'. If we needed to encipher
a longer plaintext, we could just continue reading from the book.

HOWDOESTHEDUCKKNOWTHATSAIDVI
DEFENDTHEEASTWALLOFTHECASTLE

Now we take the letter we will be encoding, 'D', and find it in the first column of the tableau.
Then, we move along the 'D' row of the tableau until we come to the column with the 'H' at the top
(the 'H' is the key letter for the first 'D'). The intersection is our cipher text character, 'K'.

So, the cipher text for the above plaintext is:

HOWDOESTHEDUCKKNOWTHATSAIDVI
DEFENDTHEEASTWALLOFTHECASTLE
KSBHBHLALIDMVGKYZKYAHXUAAWGM
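
Since the internal workings are those of the Vigenere cipher, each cipher text letter is the sum of a plaintext letter and a key letter modulo 26 (A=0 ... Z=25). A minimal sketch, with a function name of our own choosing:

```python
def running_key_encipher(plaintext, key_text):
    plain = [c for c in plaintext.upper() if "A" <= c <= "Z"]
    key = [c for c in key_text.upper() if "A" <= c <= "Z"]
    # The key is never repeated, so it must cover the whole message.
    if len(key) < len(plain):
        raise ValueError("key text must be at least as long as the plaintext")
    return "".join(chr((ord(p) + ord(k) - 2 * ord("A")) % 26 + ord("A"))
                   for p, k in zip(plain, key))


print(running_key_encipher("defend the east wall of the castle",
                           "How does the duck know that? Said Victor"))
# KSBHBHLALIDMVGKYZKYAHXUAAWGM
```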

Cryptanalysis

Vigenere-like ciphers were regarded by many as practically unbreakable for 300 years. The
running key cipher is in general more difficult to break than the Vigenere or Autokey ciphers.
Because the key does not repeat, finding repeating blocks is less useful. The easiest way to crack
this cipher is to guess or obtain somehow a piece of the plaintext; this allows you to determine the
key. With some of the key known, you should try and identify the source of the key text.
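
Recovering key letters from a known piece of plaintext is just the encryption step run backwards: subtract each plaintext letter from the cipher text letter above it. A small sketch using the example cipher text (the function name is our own):

```python
def recover_key(ciphertext, known_plaintext):
    """Key letters implied by a stretch of known plaintext: key = cipher - plain (mod 26)."""
    return "".join(chr((ord(c) - ord(p)) % 26 + ord("A"))
                   for c, p in zip(ciphertext.upper(), known_plaintext.upper()))


# Guessing that the message starts with 'DEFEND' exposes the start of the key text.
print(recover_key("KSBHBH", "DEFEND"))  # HOWDOE
```

A fragment such as 'HOWDOE' that reads like English is a good sign the guess was right, and may be enough to start identifying the source text.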

When trying to crack this cipher automatically, high order language models are required.
The 4-grams used to break Vigenere ciphers are not good enough for breaking running key
ciphers. This page (coming soon) describes the use of a second order word level model used to
break running key ciphers. In essence, the key and plaintext are built simultaneously from
sequences of words such that the key sequence and the plaintext sequence create the cipher text.
The restrictions that English words place on allowable characters can be enough to determine the
key. If the key contains very rare words, though, the algorithm may find something that scores
higher but consists of more common words.

 Vigenère and Gronsfeld Cipher


A more complex polyalphabetic substitution cipher. Code is provided for encryption,
decryption and cryptanalysis.

The Vigenère Cipher is a polyalphabetic substitution cipher. The method was originally
described by Giovan Battista Bellaso in his 1553 book La cifra del. Sig. Giovan Battista Bellaso;
however, the scheme was later misattributed to Blaise de Vigenère in the 19th century, and is now
widely known as the Vigenère cipher.

Blaise de Vigenère actually invented the stronger Autokey cipher in 1586.

The Vigenère Cipher was considered le chiffre indéchiffrable (French for 'the indecipherable
cipher') for 300 years, until in 1863 Friedrich Kasiski published a successful attack on the Vigenère
cipher. Charles Babbage had, however, already developed the same test in 1854. Gilbert Vernam
worked on the Vigenère cipher in the early 1900s, and his work eventually led to the one-time pad,
which is a provably unbreakable cipher.

The Algorithm

The 'key' for a Vigenere cipher is a key word, e.g. 'FORTIFICATION'.

The Vigenere Cipher uses the following tableau (the 'tabula recta') to encipher the plaintext:

To encipher a message, repeat the keyword above the plaintext:

FORTIFICATIONFORTIFICATIONFO
DEFENDTHEEASTWALLOFTHECASTLE

Now we take the letter we will be encoding, 'D', and find it in the first column of the tableau.
Then, we move along the 'D' row of the tableau until we come to the column with the 'F' at the top
(the 'F' is the key letter for the first 'D'). The intersection is our cipher text character, 'I'.

So, the cipher text for the above plaintext is:

FORTIFICATIONFORTIFICATIONFO
DEFENDTHEEASTWALLOFTHECASTLE
ISWXVIBJEXIGGBOCEWKBJEVIGGQS
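
The tableau lookup is equivalent to adding the key letter to the plaintext letter modulo 26 (A=0 ... Z=25), and deciphering subtracts it again. A minimal sketch; the function name and the `decipher` flag are our own choices, not the pycipher interface:

```python
def vigenere(text, keyword, decipher=False):
    sign = -1 if decipher else 1
    key = keyword.upper()
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    out = []
    for i, ch in enumerate(letters):
        shift = ord(key[i % len(key)]) - ord("A")  # the keyword repeats
        out.append(chr((ord(ch) - ord("A") + sign * shift) % 26 + ord("A")))
    return "".join(out)


cipher = vigenere("defend the east wall of the castle", "FORTIFICATION")
print(cipher)                                            # ISWXVIBJEXIGGBOCEWKBJEVIGGQS
print(vigenere(cipher, "FORTIFICATION", decipher=True))  # DEFENDTHEEASTWALLOFTHECASTLE
```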

Variants

There are several ciphers that are very similar to the Vigenere cipher.

The Gronsfeld cipher is exactly the same as the Vigenere cipher, except that numbers are used
as the key instead of letters. There is no other difference. The numbers may be picked from a
sequence, e.g. the Fibonacci series, or some other pseudo-random sequence.
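
As a rough illustration, each number in a Gronsfeld key is simply the shift applied to the corresponding plaintext letter, so the digit 4 plays the same role as the Vigenere key letter 'E'. The function name below is our own; the key [4,5,3,2,9] is the one used in the pycipher example in this section.

```python
def gronsfeld_encipher(plaintext, digits):
    letters = [c for c in plaintext.upper() if "A" <= c <= "Z"]
    # Shift each letter by the matching key number; the key repeats.
    return "".join(chr((ord(ch) - ord("A") + digits[i % len(digits)]) % 26 + ord("A"))
                   for i, ch in enumerate(letters))


key = [4, 5, 3, 2, 9]
print(gronsfeld_encipher("defend the east wall of the castle", key))
# HJIGWHYKGNEXWYJPQRHCLJFCBXQH

# The equivalent Vigenere keyword, taking 0 -> A, 1 -> B, ...
print("".join(chr(ord("A") + d) for d in key))  # EFDCJ
```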

The Gronsfeld cipher is cryptanalysed in the same way as the Vigenere cipher. The Autokey
cipher, another close relative, cannot be broken using the Kasiski method, since its key does not
repeat. The best way to break the Autokey cipher is to try and guess portions of the plaintext or
key from the cipher text, knowing they must both follow the frequency distribution of English text.
Guessing how the plaintext begins is the easiest way of cracking the cipher.

Other Implementations

To encipher your own messages in python, you can use the pycipher module. To install it,
use pip install pycipher. To encipher messages with the Vigenere cipher (or another cipher, see
here for documentation):

>>> from pycipher import Vigenere
>>> Vigenere('HELLO').encipher('defend the east wall of the castle')
'KIQPBKXSPSHWEHOSPZQHOINLGAPP'
>>> Vigenere('HELLO').decipher('KIQPBKXSPSHWEHOSPZQHOINLGAPP')
'DEFENDTHEEASTWALLOFTHECASTLE'

For the Gronsfeld cipher:

>>> from pycipher import Gronsfeld
>>> Gronsfeld([4,5,3,2,9]).encipher('defend the east wall of the castle')
'HJIGWHYKGNEXWYJPQRHCLJFCBXQH'
>>> Gronsfeld([4,5,3,2,9]).decipher('HJIGWHYKGNEXWYJPQRHCLJFCBXQH')
'DEFENDTHEEASTWALLOFTHECASTLE'

Cryptanalysis

See Cryptanalysis of the Vigenere Cipher for a guide on how to break this cipher by hand,
and here for how to do it automatically.

When trying to break the Vigenere cipher, the first step is to determine the key length. This
page describes how to use the Index of Coincidence to determine the key length. Once this is
known, we can start trying to determine the exact key. This is usually done using the Chi-squared
test to find the correct offset for each key letter.
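
As a rough sketch of the key length step: the Index of Coincidence (IC) is the probability that two letters picked at random from a text are the same. For each candidate key length we split the cipher text into that many columns and average their ICs; the correct length tends to give a value near that of English (about 0.066) rather than that of random text (about 0.038). The function names are our own, and this is only one simple way to apply the idea.

```python
from collections import Counter


def index_of_coincidence(text):
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    # Chance that two letters drawn without replacement are identical.
    return sum(v * (v - 1) for v in counts.values()) / (n * (n - 1))


def average_ic(ciphertext, key_length):
    """Mean IC of the columns obtained by taking every key_length-th letter."""
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    columns = ["".join(letters[i::key_length]) for i in range(key_length)]
    return sum(index_of_coincidence(col) for col in columns) / key_length


def rank_key_lengths(ciphertext, max_length=15):
    """Candidate key lengths, most English-like average IC first."""
    return sorted(range(1, max_length + 1),
                  key=lambda k: average_ic(ciphertext, k), reverse=True)
```

This needs a reasonable amount of cipher text to be reliable, and multiples of the true key length will usually score well too.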

 Homophonic Substitution Cipher


The Homophonic Substitution cipher is a substitution cipher in which single plaintext letters
can be replaced by any of several different cipher text letters. It is generally much more
difficult to break than a standard substitution cipher.

The number of characters each letter is replaced by is part of the key, e.g. the letter 'E'
might be replaced by any of 5 different symbols, while the letter 'Q' may only be substituted by 1
symbol.

The easiest way to break standard substitution ciphers is to look at the letter frequencies.
The letter 'E' is usually the most common letter in English, so the most common cipher text letter will
probably stand for 'E' (or perhaps 'T'). If we allow the letter 'E' to be replaced by any of 3 different
characters, then we can no longer just take the most common letter, since the letter count of 'E' is
spread over several characters. As we allow more and more possible alternatives for each letter,
the resulting cipher can become very secure.

An Example

Our cipher alphabet is as follows:

To encipher the message DEFEND THE EAST WALL OF THE CASTLE, we find 'D' in the
top row, then replace it with the letter below it, 'F'. The second letter, 'E' provides us with several
choices; we could use any of 'Z', '7', '2' or '1'. We choose one of these at random, say '7'. After
continuing with this, we get the cipher text:

Plaintext: DEFEND THE EAST WALL OF THE CASTLE


Cipher text: F7EZ5F UC2 1DR6 M9PP 0E 6CZ SD4UP1

The number of cipher text letters assigned to each plaintext letter was chosen to flatten the
frequency distribution as much as possible. Since 'E' is normally the most common letter, it is
allowed more possibilities so that the frequency peak from the letter 'E' will not be present in the
cipher text.
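
The procedure can be sketched as below. The table here is only a partial one, reconstructed from the example cipher text above (it covers just the letters that occur in the example), so it should be read as an assumption for illustration rather than the full cipher alphabet. The function names are our own.

```python
import random

# Partial table reconstructed from the example (assumption: only letters
# that appear in the example message are listed).
HOMOPHONES = {
    "A": "D9", "C": "S", "D": "F", "E": "Z721", "F": "E", "H": "C",
    "L": "P", "N": "5", "O": "0", "S": "R4", "T": "U6", "W": "M",
}
# Every cipher symbol stands for exactly one plaintext letter.
INVERSE = {sym: letter for letter, syms in HOMOPHONES.items() for sym in syms}


def homophonic_encipher(plaintext, rng=random):
    # Pick one of the available symbols at random; spaces are kept as-is.
    # Letters missing from the partial table raise KeyError.
    return "".join(c if c == " " else rng.choice(HOMOPHONES[c])
                   for c in plaintext.upper())


def homophonic_decipher(ciphertext):
    return "".join(c if c == " " else INVERSE[c] for c in ciphertext)


print(homophonic_decipher("F7EZ5F UC2 1DR6 M9PP 0E 6CZ SD4UP1"))
# DEFEND THE EAST WALL OF THE CASTLE
```

Because the symbol is chosen at random, enciphering the same message twice will usually give different cipher texts, all of which decipher to the same plaintext.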

Cryptanalysis

Breaking homophonic substitution ciphers can be very difficult if the number of homophones
is high. The usual method is some sort of hill climbing, similar to that used in breaking substitution
ciphers. In addition to finding which letters map to which others, we also need to determine how
many symbols each plaintext letter can become. One way to handle this is to use 2 layers of
nested hill climbing: an outer layer to determine the number of symbols each letter maps to, then
an inner layer to determine the exact mapping.

 Four-Square Cipher
The Four-square cipher encrypts pairs of letters (like Playfair), which makes it significantly
stronger than simple substitution ciphers, since frequency analysis becomes much more difficult.

Felix Delastelle (1840 - 1902) invented the four-square cipher, first published in a book in
1902. Delastelle was most famous for his invention of several systems of polygraphic substitution
ciphers including bifid, trifid, and the four-square cipher.

For a guide on how to break the foursquare cipher using Simulated Annealing, see
Cryptanalysis of the Foursquare Cipher.

The Algorithm

The four-square cipher uses four 5 by 5 matrices arranged in a square. Each of the 5 by 5
matrices contains 25 letters; usually the letter 'j' is merged with 'i' (Wikipedia says 'q' is omitted,
but the choice is not very important since both 'q' and 'j' are rather rare letters). In general, the
upper-left and lower-right matrices are the "plaintext squares" and each contain a standard
alphabet. The upper-right and lower-left squares are the "cipher text squares" and contain a mixed
alphabetic sequence.

The cipher text squares can be generated using a keyword (dropping duplicate letters), then
filling the remaining spaces with the remaining letters of the alphabet in order. Alternatively the
cipher text squares can be generated completely randomly. The four-square algorithm allows for
two separate keys, one for each of the two cipher text matrices.

Steps

1. Break up the plaintext into bigrams, i.e. ATTACK AT DAWN --> AT TA CK AT DA WN. An 'X'
(or some other character) may have to be appended to ensure the plaintext is an even length.

2. Using the four 'squares', two plain alphabet squares and two cipher alphabet squares, locate
the bigram to encrypt in the plain alphabet squares. The example below enciphers the bigram
'AT'. The first letter is located in the top left square, the second letter is located in the bottom
right square.

3. Locate the characters in the cipher text squares at the corners of the rectangle that the
letters 'AT' make:

4. Using the above keys, the bigram 'AT' is encrypted to 'TI'.

The text 'attack at dawn', with the keys 'zgptfoihmuwdrcnykeqaxvsbl' and
'mfnbdcrhsaxyogvituewlqzkp', becomes:

ATTACKATDAWN
TIYBFHTIZBSY
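
The steps above can be sketched in Python as follows. We assume the first key fills the upper-right square and the second key the lower-left square, both row by row, with 'j' merged into 'i'; this arrangement reproduces the example above. The function names are our own.

```python
PLAIN = "abcdefghiklmnopqrstuvwxyz"  # 25 letters, 'j' merged with 'i'


def foursquare_encipher(plaintext, key1, key2):
    text = [("i" if c == "j" else c) for c in plaintext.lower() if "a" <= c <= "z"]
    if len(text) % 2:
        text.append("x")  # pad to an even length
    out = []
    for a, b in zip(text[0::2], text[1::2]):
        ra, ca = divmod(PLAIN.index(a), 5)  # first letter: upper-left square
        rb, cb = divmod(PLAIN.index(b), 5)  # second letter: lower-right square
        out.append(key1[ra * 5 + cb])       # upper-right: row of a, column of b
        out.append(key2[rb * 5 + ca])       # lower-left: row of b, column of a
    return "".join(out).upper()


def foursquare_decipher(ciphertext, key1, key2):
    text = ciphertext.lower()
    out = []
    for c1, c2 in zip(text[0::2], text[1::2]):
        r1, col1 = divmod(key1.index(c1), 5)
        r2, col2 = divmod(key2.index(c2), 5)
        out.append(PLAIN[r1 * 5 + col2])
        out.append(PLAIN[r2 * 5 + col1])
    return "".join(out).upper()


KEY1 = "zgptfoihmuwdrcnykeqaxvsbl"
KEY2 = "mfnbdcrhsaxyogvituewlqzkp"
print(foursquare_encipher("attack at dawn", KEY1, KEY2))  # TIYBFHTIZBSY
```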

Other Implementations

To encipher your own messages in python, you can use the pycipher module. To install it,
use pip install pycipher. To encipher messages with the Foursquare cipher (or another cipher, see
here for documentation):

>>> from pycipher import Foursquare
>>> fs = Foursquare('zgptfoihmuwdrcnykeqaxvsbl','mfnbdcrhsaxyogvituewlqzkp')
>>> fs.encipher('defend the east wall of the castle')
'FBUMCNESFDPIKKZXCXMIUNZNQUNM'
>>> fs.decipher('FBUMCNESFDPIKKZXCXMIUNZNQUNM')
'DEFENDTHEEASTWALLOFTHECASTLE'

Cryptanalysis

The four-square cipher can be easily cracked with enough cipher text. It is quite simple to
determine the key if both plaintext and cipher text are known, and for this reason guessing parts of
the plaintext is a very effective way of cracking this cipher. If a portion of the plaintext is known or
can be guessed this should be exploited first to determine as much of the key as possible, then
more guessing can be applied or other techniques described below.

Compared to the Playfair cipher, a four-square cipher will not show reversed cipher text
digraphs for reversed plaintext digraphs (e.g. the digraphs AB BA would encrypt to some pattern
XY YX in Playfair, but not in four-square). This, of course, is only true if the two keywords are
different. Another difference between four-square and Playfair which makes four-square a stronger
encryption is the fact that double letter digraphs will occur in four-square cipher text. [1]

The four-square cipher is a stronger cipher than Playfair, but it is more cumbersome
because of its use of two keys and preparing the encryption/decryption sheet can be time
consuming. Given that the increase in encryption strength afforded by four-square over Playfair is
marginal and that both schemes are easily defeated if sufficient cipher text is available, Playfair
was much more common.

A good tutorial on reconstructing the key for a four-square cipher can be found in chapter 7,
"Solution to Polygraphic Substitution Systems," of Field Manual 34-40-2, produced by the United
States Army. Another source is lecture 17 of the LANAKI Crypto Course Lessons.
 Hill Cipher
An algorithm based on matrix theory. Very good at diffusion.

Invented by Lester S. Hill in 1929, the Hill cipher is a polygraphic substitution cipher based
on linear algebra. Hill used matrices and matrix multiplication to mix up the plaintext.

To counter charges that his system was too complicated for day to day use, Hill constructed
a cipher machine for his system using a series of geared wheels and chains. However, the
machine never really sold.

Hill's major contribution was the use of mathematics to design and analyse cryptosystems. It
is important to note that the analysis of this algorithm requires a branch of mathematics known as
number theory. Many elementary number theory text books deal with the theory behind the Hill
cipher, with several talking about the cipher in detail (e.g. Elementary Number Theory and its
applications, Rosen, 2000). It is advisable to get access to a book such as this, and to try to learn a
bit if you want to understand this algorithm in depth.

For a guide on how to break Hill ciphers, see Cryptanalysis of the Hill Cipher.

Example

This example will rely on some linear algebra and some number theory. The key for a hill
cipher is a matrix e.g.

In the above case, we have taken the size to be 3×3; however, it can be any size (as long as
it is square). Assume we want to encipher the message ATTACK AT DAWN. To encipher this, we
need to break the message into chunks of 3. We now take the first 3 characters from our plaintext,
ATT, and create a vector that corresponds to the letters (replace A with 0, B with 1, ..., Z with 25)
to get: [0 19 19] (this is ['A' 'T' 'T']).

To get our ciphertext we perform a matrix multiplication (you may need to revise matrix
multiplication if this doesn't make sense):

This process is performed for all 3 letter blocks in the plaintext. The plaintext may have to be
padded with some extra letters to make sure that there is a whole number of blocks.
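
The block-by-block multiplication can be sketched as below. The key matrix itself is not reproduced in the text here, so the matrix used is an assumption: it was chosen because it is consistent with the results quoted in this section ('ATT' enciphers to 'PFO', and the determinant is 489). The function name and the 'X' padding letter are also our own choices.

```python
# Assumed example key, consistent with 'ATT' -> 'PFO' and determinant 489.
KEY = [[2, 4, 5],
       [9, 2, 1],
       [3, 17, 7]]


def hill_encipher(plaintext, key, pad="X"):
    n = len(key)
    nums = [ord(c) - ord("A") for c in plaintext.upper() if "A" <= c <= "Z"]
    # Pad so the message splits into a whole number of n-letter blocks.
    while len(nums) % n:
        nums.append(ord(pad) - ord("A"))
    out = []
    for start in range(0, len(nums), n):
        block = nums[start:start + n]
        # Matrix times column vector, reduced modulo 26.
        for row in key:
            out.append(sum(k * p for k, p in zip(row, block)) % 26)
    return "".join(chr(v + ord("A")) for v in out)


print(hill_encipher("ATT", KEY))  # PFO
```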

Now for the tricky part, the decryption. We need to find an inverse matrix modulo 26 to use
as our 'decryption key', i.e. we want something that will take 'PFO' back to 'ATT'. If our 3 by 3 key
matrix is called K, our decryption key will be the 3 by 3 matrix K^{-1}, which is the inverse of K.

To find K^{-1} we have to use a bit of maths. It turns out that K^{-1} can be calculated from
our key. A lengthy discussion will not be included here, but we will give a short example. The
important things to know are inverses (mod m), determinants of matrices, and matrix adjugates.

Let K be the key matrix. Let d be the determinant of K. We wish to find K^{-1} (the inverse of K),
such that K × K^{-1} = I (mod 26), where I is the identity matrix. The following formula tells us how to
find K^{-1} given K:

K^{-1} = d^{-1} × adj(K) (mod 26)

where d × d^{-1} = 1 (mod 26), and adj(K) is the adjugate matrix of K.

d (the determinant) is calculated normally for K (for the example above, it is 489 = 21 (mod
26)). The inverse, d^{-1}, is the number such that d × d^{-1} = 1 (mod 26) (this is 5 for the
example above, since 5 × 21 = 105 = 1 (mod 26)). The simplest way of finding it is to loop through
the numbers 1..25 and pick the one that satisfies the equation. There is no solution if
gcd(d, 26) ≠ 1: d and 26 then share a factor, K cannot be inverted, and a different key must be
chosen.

That is it. Once K^{-1} is found, decryption can be performed.
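
The calculation can be sketched as below. As before, the key matrix is an assumption chosen to agree with the numbers quoted above (determinant 489, 'ATT' enciphering to 'PFO'); the function names are our own. The determinant and adjugate are computed by plain cofactor expansion, which is fine for the small matrices used here.

```python
# Assumed example key, consistent with determinant 489 and 'ATT' -> 'PFO'.
KEY = [[2, 4, 5],
       [9, 2, 1],
       [3, 17, 7]]


def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [[v for c, v in enumerate(row) if c != j]
            for r, row in enumerate(m) if r != i]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def inverse_mod(m, modulus=26):
    d = det(m) % modulus
    # Loop through 1..modulus-1 looking for the inverse of d.
    d_inv = next((x for x in range(1, modulus) if d * x % modulus == 1), None)
    if d_inv is None:
        raise ValueError("gcd(d, %d) != 1: choose a different key" % modulus)
    n = len(m)
    # adj(K)[i][j] is the cofactor of entry (j, i); scale by d_inv and reduce.
    return [[(-1) ** (i + j) * det(minor(m, j, i)) * d_inv % modulus
             for j in range(n)] for i in range(n)]


K_INV = inverse_mod(KEY)
pfo = [15, 5, 14]  # 'P', 'F', 'O'
att = [sum(k * c for k, c in zip(row, pfo)) % 26 for row in K_INV]
print(det(KEY), att)  # 489 [0, 19, 19]
```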

JavaScript Example of the Hill Cipher

This is a JavaScript implementation of the Hill Cipher. It is restricted to the 2x2 case of the
Hill cipher for now; it may be expanded to 3x3 later.

The 'key' should be input as 4 numbers, e.g. 3 4 19 11. These numbers will form the key
(top row, bottom row).

Cryptanalysis

Cryptanalysis is the art of breaking codes and ciphers. When attempting to crack a Hill
cipher, frequency analysis will be practically useless, especially as the size of the key block
increases. For very long cipher texts, frequency analysis may be useful when applied to bigrams
(for a 2 by 2 Hill cipher), but for short cipher texts this will not be practical.

For a guide on how to break Hill ciphers with a crib, see Cryptanalysis of the Hill Cipher.

The basic Hill cipher is, however, vulnerable to a known-plaintext attack (if you know the
plaintext and corresponding cipher text, the key can be recovered) because it is completely linear.
An opponent who intercepts several plaintext/cipher text character pairs can set up a linear system
which can (usually) be easily solved; if it happens that this system is indeterminate, it is only
necessary to add a few more plaintext/cipher text pairs. The known-plaintext attack is the best
one to try when trying to break the Hill cipher; if no sections of the plaintext are known, guesses can
be made.

For the case of a 2 by 2 Hill cipher, we could attack it by measuring the frequencies of all the
digraphs that occur in the cipher text. In standard English, the most common digraph is 'th',
followed by 'he'. If we know the Hill cipher has been employed and the most common digraph is 'kx',
followed by 'vz' (for example), we would guess that 'kx' and 'vz' correspond to 'th' and 'he',
respectively. This would mean [19, 7] and [7, 4] are sent to [10, 23] and [21, 25] respectively (after
substituting letters for numbers). If K was the encrypting matrix, we would have (writing each
digraph as a column, with P the matrix of plaintext digraphs):

    K × P = K × [19  7]  =  [10 21]   (mod 26)
                [ 7  4]     [23 25]

Since the inverse of P is

    P^{-1} = [ 4 19]   (mod 26)
             [19 19]

we have

    K = [10 21] × P^{-1} = [23 17]   (mod 26)
        [23 25]            [21  2]

which gives us a possible key. After attempting to decrypt the cipher text with

    K^{-1} = [ 2  9]   (mod 26)
             [ 5 23]

we would know whether our guess was correct. If it is not, we could try other combinations
of common cipher text digraphs until we get something that is correct.
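
The arithmetic for this guess can be checked with a few lines of Python. This is a sketch for the 2 by 2 case only; the helper names are our own, and the digraphs 'kx' and 'vz' are the hypothetical ones from the example above, each written as a column of the matrix.

```python
def mat_mul(a, b, m=26):
    """Product of two matrices, reduced modulo m."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) % m
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse_2x2(mat, m=26):
    (a, b), (c, d) = mat
    det = (a * d - b * c) % m
    # Search 1..m-1 for the modular inverse of the determinant.
    det_inv = next((x for x in range(1, m) if det * x % m == 1), None)
    if det_inv is None:
        raise ValueError("matrix is not invertible modulo %d" % m)
    return [[d * det_inv % m, -b * det_inv % m],
            [-c * det_inv % m, a * det_inv % m]]


P = [[19, 7], [7, 4]]      # columns: 'th' and 'he'
C = [[10, 21], [23, 25]]   # columns: 'kx' and 'vz'

K = mat_mul(C, inverse_2x2(P))   # from K x P = C
K_inv = inverse_2x2(K)           # candidate decryption matrix
print(K, K_inv)  # [[23, 17], [21, 2]] [[2, 9], [5, 23]]
```

If the guessed plaintext matrix turns out not to be invertible modulo 26, a different pair of digraphs has to be used.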

In general, the hill cipher will not be used on its own, since it is not all that secure. It is,
however, still a useful step when combined with other non-linear operations, such as S-boxes (in
modern ciphers). It is generally used because matrix multiplication provides good diffusion (it mixes
things up nicely). Some modern ciphers use a matrix multiplication step to provide diffusion e.g.
AES and Twofish use matrix multiplication as a part of their algorithms.

 Playfair Cipher
 ADFGVX Cipher
 ADFGX Cipher
 Bifid Cipher
 Straddle Checkerboard Cipher
 Trifid Cipher
 Base64 Cipher
 Fractionated Morse Cipher
