Maths IA
Antonio Jardim
Candidate Number
The British School of Rio de Janeiro
2015
Word Count:
Antonio Jardim – 000461-0024
Contents
What is Cryptography?
Ancient Cryptography
    The Caesar Cipher
        Modular Arithmetic
        Brute Force
        Frequency Analysis
    The Polyalphabetic Cipher - The Vigenere Cipher
        Kasiski Test
        Index of Coincidence
    The One Time Pad
Modern Cryptography
    Diffie-Hellman key exchange
    RSA Encryption
Conclusion
Bibliography
What is Cryptography?
The development and success of the internet has allowed all kinds of information
to be shared between users, creating a huge demand for network security. A user who
shares information might not want it to become public, so only the sender and the
receiver of the information should be able to access it. That is where cryptography
comes in, allowing private and safe delivery of information between users. In very
basic terms, cryptography can be described as the idea of transforming information
into an unreadable form so that it can only be read by the sender and the receiver. To
transform the information, cryptography uses ciphers (algorithms). Applying a cipher to
a piece of information so that it becomes unreadable is called encrypting, and the
reverse process is called decrypting. There are several different encryption methods
and ciphers, each with its own characteristics. We will therefore explore these
methods with the goal of not only acquiring a deeper understanding of cryptography
but also, if possible, identifying the safest cryptographic method through
mathematical analysis. We will analyze how each cipher can be broken mathematically
in order to see which one is the safest.
Ancient Cryptography
The Caesar Cipher (Monoalphabetic)
Figure 1
Ancient ciphers are all based on the symmetric key cryptosystem. With this in
mind, the first cipher analyzed in this exploration is the famous Caesar cipher. It is
well known because of its association with Julius Caesar, who used it for military
purposes, and it is a monoalphabetic substitution cipher. A substitution cipher
replaces one letter or character with another following an alphabet of choice, and a
monoalphabetic cipher uses the same fixed substitution over the entire message. This
particular cipher uses a private key which needs to be shared between the sender and
the receiver before the message can be delivered. In this case the key is the size of
the shift applied to the message. The cipher applies the same shift to every letter of
the message. A shift is the number of letters skipped, starting from the original
letter, to reach the letter that replaces it. For example, if the selected key is 3
and we encrypt the letter A, the letter in the encrypted message is D. The following
example presents a key of 19:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
TUVWXYZABCDEFGHIJKLMNOPQRS
Using the reference above, you locate your letter in the top row and substitute it with
the letter directly below it. The phrase: ATTACK AT DAWN AT THE NORTHERN BRIDGE
would become TMMTVD TM WTPG TM MAX GHKMAXKG UKBWZX
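The shift described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `caesar` is made up for this sketch, letters are taken from the 26-letter English alphabet, and anything that is not a letter is copied unchanged.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def caesar(text: str, key: int) -> str:
    """Shift every letter of `text` forward by `key` places, wrapping after Z."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)  # spaces and punctuation are left untouched
    return "".join(out)


print(caesar("A", 3))        # D
print(caesar(ALPHABET, 19))  # TUVWXYZABCDEFGHIJKLMNOPQRS
# Shifting back by the same key recovers the plaintext.
print(caesar(caesar("ATTACK AT DAWN", 19), -19))  # ATTACK AT DAWN
```

Shifting the whole alphabet by 19 reproduces the second row of the reference table, which is a quick way to check that a key has been applied correctly.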
Modular Arithmetic
In order to fully understand cryptography one must be able to perform and
understand modular arithmetic. When dividing two integers we have an equation
that looks like the following: $\frac{A}{B} = C$ remainder $D$, where $A$ is the
dividend, $B$ is the divisor, $C$ is the quotient and $D$ is the remainder. When we
are only interested in the remainder of the previous equation we use the modulo
operator (mod). The previous equation written using the modulo operator would be
$A \bmod B = D$.
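As a small illustration of quotient and remainder, the following Python sketch uses the built-in `divmod` function and the `%` operator. The numbers are the sums 32, 20, 39, 27 and 38 that appear in Figure 2; the variable names are chosen only for this example.

```python
# Quotient C and remainder D when dividing A by B.
A, B = 32, 26
C, D = divmod(A, B)
print(C, D)   # 1 6
print(A % B)  # 6, the same remainder written with the modulo operator

# Reducing several sums modulo 26 in one go.
sums = [32, 20, 39, 27, 38]
print([n % 26 for n in sums])  # [6, 20, 13, 1, 12]
```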
extremely hard on the other, it works as a one way function. The following example
illustrates the need for modular arithmetic:
Figure 2
M A T H S plaintext
13 (M) 1 (A) 20 (T) 8 (H) 19 (S) plaintext
+ 19 (S) 19 (S) 19 (S) 19 (S) 19 (S) key
= 32 20 39 27 38 plaintext + key
= 6 (F) 20 (T) 13 (M) 1 (A) 12 (L) plaintext + key (mod 26)
F T M A L → ciphertext
Brute Force
In cryptography, a brute-force attack, or exhaustive key search, can in theory be
used against any type of encrypted data. It works by checking all possible keys until
the correct one is found. For the Caesar cipher this is extremely easy, since there
are only 26 possible keys; however, this is certainly not the case for more modern
ciphers.
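A brute-force search over the 26 Caesar keys could look like the sketch below. The helper names `shift` and `brute_force` are hypothetical, and the ciphertext FTMAL is the one produced in Figure 2; a person (or a dictionary check, not shown here) still has to pick out the candidate that reads as real text.

```python
import string

ALPHABET = string.ascii_uppercase


def shift(text: str, key: int) -> str:
    """Caesar-shift the letters of `text` by `key` places (negative keys shift back)."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + key) % 26] if ch in ALPHABET else ch
        for ch in text.upper()
    )


def brute_force(ciphertext: str) -> list:
    """Return (key, candidate plaintext) for every possible key from 0 to 25."""
    return [(key, shift(ciphertext, -key)) for key in range(26)]


for key, candidate in brute_force("FTMAL"):
    print(key, candidate)  # the line for key 19 reads MATHS
```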
Frequency Analysis
The method used to break this cipher is based on a single weakness: in every
language each letter occurs with a characteristic frequency. In the English language,
for example, E is the most used letter. A frequency table of the English language can
consequently be built and used to decrypt the message, since a fixed shift only
relabels the letters and leaves the shape of the frequency distribution unchanged.
The letter that appears the most in the encrypted message is most likely the letter E.
The name of this process is frequency analysis. If we analyze figure 3 we can clearly
see that the encrypted message has a shift of 3.
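The idea of matching the most frequent ciphertext letter to E can be sketched as follows. The sample sentence is invented for this illustration (it is not the message behind figure 3), and the function names are hypothetical. On short texts E is not always the most frequent letter, so the guess can fail.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def shift(text: str, key: int) -> str:
    """Caesar-shift the letters of `text` by `key` places."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + key) % 26] if ch in ALPHABET else ch
        for ch in text.upper()
    )


def guess_key(ciphertext: str) -> int:
    """Assume the most frequent ciphertext letter stands for E and return the shift."""
    counts = Counter(ch for ch in ciphertext.upper() if ch in ALPHABET)
    most_common_letter = counts.most_common(1)[0][0]
    return (ALPHABET.index(most_common_letter) - ALPHABET.index("E")) % 26


sample = "MEET ME AT THE SECRET TREE"  # invented plaintext with many E's
ciphertext = shift(sample, 3)
key = guess_key(ciphertext)
print(key)                      # 3
print(shift(ciphertext, -key))  # MEET ME AT THE SECRET TREE
```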
Figure 4
     M       A       T       H       I       S       F       U       N      plaintext
  13 (M)   1 (A)  20 (T)   8 (H)   9 (I)  19 (S)   6 (F)  21 (U)  14 (N)    plaintext values
+ 13 (M)   1 (A)  20 (T)   8 (H)  13 (M)   1 (A)  20 (T)   8 (H)  13 (M)    key
= 26       2      40      16      22      20      26      29      27        plaintext + key
= 26 (Z)   2 (B)  14 (N)  16 (P)  22 (V)  20 (T)  26 (Z)   3 (C)   1 (A)    plaintext + key (mod 26)
     Z       B       N       P       V       T       Z       C       A      → ciphertext
The Vigenere cipher can be described by the equation $e_k = (x + k) \bmod 26$,
where $x$ is the value of the plaintext letter, $k$ is the value of the key letter
aligned with it, $e_k$ is the value of the encrypted letter and 26 is the total
number of letters in the alphabet. The key letters repeat every $K$ letters, where
$K$ is the key length.
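The equation can be tried out with a short Python sketch. It assumes the numbering used in the figures of this section (A = 1, ..., Z = 26, so a result of 0 modulo 26 is written as Z), repeats the key over the letters only, and leaves spaces untouched. The function names are hypothetical.

```python
def letter_value(ch: str) -> int:
    """A = 1, B = 2, ..., Z = 26, the numbering used in the figures."""
    return ord(ch) - ord("A") + 1


def value_letter(value: int) -> str:
    """Inverse of letter_value after reducing modulo 26 (0 is written as Z)."""
    value %= 26
    return "Z" if value == 0 else chr(ord("A") + value - 1)


def vigenere(plaintext: str, key: str) -> str:
    """Add the repeating key to the plaintext letter by letter."""
    key = key.upper()
    out = []
    position = 0  # counts letters only, so spaces do not use up key letters
    for ch in plaintext.upper():
        if "A" <= ch <= "Z":
            k = letter_value(key[position % len(key)])
            out.append(value_letter(letter_value(ch) + k))
            position += 1
        else:
            out.append(ch)
    return "".join(out)


print(vigenere("MATHS", "S"))                 # FTMAL
print(vigenere("I LIKE MATHEMATICS", "MATH")) # V MCSR NUBUFGIGJWA
```

With a one-letter key the cipher reduces to the Caesar cipher, which is why the first call reproduces the ciphertext FTMAL of Figure 2.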
Kasiski Test
Friedrich Kasiski, a famous cryptographer, developed a method to discover the
key length ( m ) used for an encrypted message. Kasiski's method works by assuming
that two identical segments of ciphertext of length ≥ 3 most likely correspond to
the same segment of plaintext encrypted with the same part of the key. If this
assumption is correct then m is a divisor of the distance ( d ) between the two
identical segments.
This process of searching for two identical segments is then performed multiple
times (usually by a computer program) and each value of d is recorded. From figure 5
it is clear that a pattern emerges because there is a repetition of the d values. The
numbers in red together with the number 4 however are anomalies and can be
discarded. By collecting all values of d and finding their greatest common factor,
the value of m can be determined. In figure 6 it is possible to observe that m is
12.
Figure 6
36/2    48/2    120/2    252/2    396/2
18/2    24/2     60/2    126/2    198/2
 9/3    12/2     30/2     63/3     99/3
 3/3     6/2     15/3     21/3     33/3
 1       3/3      5/5      7/7     11/11
         1        1        1        1
Common factors: 2x2x3 = 12
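The two steps of the Kasiski test (collecting distances, then taking their greatest common factor) can be sketched in Python. The function name `repeated_segment_distances` is hypothetical, and the list of distances is the set of values factorised in Figure 6; discarding anomalous distances is not handled here and would still need to be done by eye.

```python
from functools import reduce
from math import gcd


def repeated_segment_distances(ciphertext: str, length: int = 3) -> list:
    """Distances between consecutive occurrences of each repeated segment."""
    last_seen = {}
    distances = []
    for start in range(len(ciphertext) - length + 1):
        segment = ciphertext[start:start + length]
        if segment in last_seen:
            distances.append(start - last_seen[segment])
        last_seen[segment] = start
    return distances


# The d values factorised in Figure 6.
distances = [36, 48, 120, 252, 396]
print(reduce(gcd, distances))  # 12

# Tiny made-up string: "ABC" occurs at positions 0 and 6.
print(repeated_segment_distances("ABCXYZABC"))  # [6]
```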
Figure 5
Index of Coincidence
In cryptography, coincidence counting is the technique invented by William
Friedman of putting two texts side-by-side and counting the number of times that
identical letters appear in the same position in both texts. The Index of Coincidence
is the equation that gives the probability that two randomly selected letters are the
same. Each particular language has a different index of coincidence. Ordinary English
has an index of coincidence of 0.067; this probability has been determined through
frequency studies. The formula for finding the index of coincidence (IC) is:
$$IC = \frac{\sum_{i=1}^{26} n_i (n_i - 1)}{N(N-1)/c}$$
where $i \in \mathbb{Z}_{26}$ runs over the letters of the alphabet, $n_i$ is the
number of times letter $i$ appears, $N$ is the total number of letters and $c$ is the
size of the periods into which the ciphertext is divided. A simplified equation would
be:
$$IC = \frac{\#a\text{'s}\,(\#a\text{'s} - 1) + \#b\text{'s}\,(\#b\text{'s} - 1) + \dots + \#z\text{'s}\,(\#z\text{'s} - 1)}{\text{TotalLetters}\,(\text{TotalLetters} - 1)/c}$$
Iteratively divide the text into periods of increasing size and check the index of
coincidence for each period; the first value of c with an IC of 0.067 or greater is
most likely the key length. The following example will illustrate this idea more
clearly:
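Figure 7
If we encrypt the plaintext “I LIKE MATHEMATICS” using the key “MATH”, the
ciphertext will end up being “V MCSR NUBUFGIGJWA”.
> First count how often each letter appears and perform the first part of the
equation, $\sum_{i=1}^{26} n_i (n_i - 1)$ (letters that do not appear contribute 0):
1(1-1)+1(1-1)+1(1-1)+1(1-1)+1(1-1)+1(1-1)+2(2-1)+1(1-1)+1(1-1)
+2(2-1)+1(1-1)+1(1-1)+1(1-1)+1(1-1) =
0+0+0+0+0+0+2+0+0+2+0+0+0+0 = 4
> Then perform the second part of the equation, $N(N-1)/c$, with $N = 16$ letters
and $c = 4$:
$\frac{16 \times 15}{4} = \frac{240}{4} = 60$
> Finally divide the first part by the second:
$IC = \frac{4}{60} \approx 0.067$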
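The calculation in Figure 7 can be reproduced with a short Python sketch. The function name is hypothetical, and the parameter `c` follows the formula as it is written in this section; with `c = 1` the function returns the plain probability that two randomly chosen letters of the text are equal.

```python
from collections import Counter


def index_of_coincidence(text: str, c: int = 1) -> float:
    """Sum of n_i(n_i - 1) over all letters, divided by N(N - 1)/c."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    numerator = sum(n * (n - 1) for n in Counter(letters).values())
    return numerator / (total * (total - 1) / c)


ciphertext = "V MCSR NUBUFGIGJWA"
print(round(index_of_coincidence(ciphertext, c=4), 3))  # 0.067
```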
After acquiring the key length through either one of the processes mentioned,
each shift is recovered in the same way as previously seen for the Caesar cipher, by
comparing the letter frequencies of the language with the letter frequencies of the
message. This has to be done separately for every individual shift (that is, for the
letters encrypted with each key letter), so the longer the key word, the longer it
takes to break a Vigenere cipher.
Figure 8
H E L L O plaintext
7 (H) 4 (E) 11 (L) 11 (L) 14 (O) plaintext
+ 23 (X) 12 (M) 2 (C) 10 (K) 11 (L) key
= 30 16 13 21 25 plaintext + key
= 4 (E) 16 (Q) 13 (N) 21 (V) 25 (Z) plaintext + key (mod 26)
E Q N V Z → ciphertext
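Assuming Figure 8 illustrates the One Time Pad, the letter-by-letter addition it shows can be sketched in Python. Note that this figure numbers the letters from A = 0 to Z = 25. The function names are hypothetical, the messages are assumed to contain only the letters A to Z, and the standard-library `secrets` module is used here as one reasonable way to draw a random key.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase  # A = 0, ..., Z = 25 as in Figure 8


def otp_encrypt(plaintext: str, key: str) -> str:
    """Add key letters to plaintext letters modulo 26 (key at least as long as the message)."""
    if len(key) < len(plaintext):
        raise ValueError("the key must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plaintext, key)
    )


def otp_decrypt(ciphertext: str, key: str) -> str:
    """Subtract key letters from ciphertext letters modulo 26."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext, key)
    )


def random_key(length: int) -> str:
    """Draw `length` letters uniformly at random."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(otp_encrypt("HELLO", "XMCKL"))  # EQNVZ
print(otp_decrypt("EQNVZ", "XMCKL"))  # HELLO
```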
Modern Cryptography
Up until the 1970s all ciphers used the symmetric key cryptosystem; however,
with the development of the internet there was a great need for safe asymmetric key
cryptosystems. Imagine if every website on the internet had to share a private key
with every user on the internet: both the users and the websites would have to safely
store thousands of different keys. In order to solve this problem several asymmetric
cryptosystems were developed.
Diffie-Hellman Cipher
The Diffie-Hellman cipher is one of the most famous asymmetric cryptosystems.
It uses prime numbers and modular arithmetic to create an equation that is extremely
difficult to reverse without the correct key.
This cipher functions by selecting a prime modulus (P) and a primitive root (R)
of this prime modulus. A number is a primitive root of a prime if, when you raise it
to higher and higher powers and take the remainder after dividing by the prime
modulus, the remainder lands on every number from 1 up to, but not including, the
modulus. Most importantly, the remainder lands on all of them with the same
frequency. Figure 9 presents a clear example of this calculation.
Figure 9
Let p be a prime. Then b is a primitive root for p if the powers of b,
1, b, b^2, b^3, ...
include all of the residue classes mod p (except 0).
P=7
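The definition in Figure 9 can be checked directly by listing the powers of a candidate. In the sketch below the bases 3 and 2 are chosen only for illustration (the values used in the original figure are not known), and the function names are hypothetical.

```python
def powers_mod(b: int, p: int) -> list:
    """The remainders of b^1, b^2, ..., b^(p-1) modulo p."""
    return [pow(b, k, p) for k in range(1, p)]


def is_primitive_root_by_listing(b: int, p: int) -> bool:
    """True if the powers of b hit every residue 1, ..., p-1."""
    return set(powers_mod(b, p)) == set(range(1, p))


print(powers_mod(3, 7))  # [3, 2, 6, 4, 5, 1]  every residue appears once
print(powers_mod(2, 7))  # [2, 4, 1, 2, 4, 1]  3, 5 and 6 never appear
print(is_primitive_root_by_listing(3, 7))  # True
print(is_primitive_root_by_listing(2, 7))  # False
```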
tested to see if it is a primitive root of P. The prime factorization of $P-1$ then
needs to be calculated. All of the prime factors (F) found will then be used in the
following equation: $R^{(P-1)/F} \not\equiv 1 \pmod{P}$. If the equation is true for
all values of F then R is a primitive root of P; if not, R is not a primitive root
of P. This is one of the methods for finding a primitive root of a prime number.
Figure 10 presents a clear example of this calculation.
Figure 10
P = 11
P – 1 = 10
10/2
5/5
1/1
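The prime-factor test can be sketched in Python for the P = 11 case of Figure 10. The candidates 2 and 3 are chosen only for illustration, since the values of R tested in the original figure are not shown; the function names are hypothetical and the trial-division factorisation is only suitable for small numbers.

```python
def prime_factors(n: int) -> set:
    """Distinct prime factors of n, found by trial division."""
    factors = set()
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.add(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.add(n)
    return factors


def is_primitive_root(r: int, p: int) -> bool:
    """R is a primitive root of the prime P if R^((P-1)/F) mod P != 1 for every prime factor F of P-1."""
    return all(pow(r, (p - 1) // f, p) != 1 for f in prime_factors(p - 1))


print(sorted(prime_factors(10)))  # [2, 5]
print(is_primitive_root(2, 11))   # True:  2^5 mod 11 = 10 and 2^2 mod 11 = 4
print(is_primitive_root(3, 11))   # False: 3^5 mod 11 = 1
```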
After selecting a prime modulus and finding a primitive root of this prime
modulus the following equation can be made: $R^e \bmod P = x$. This equation has the
important property that for any value of e the solution is equally likely to be any
number $0 < x < P$. The equation is also useful because it is a one way function,
meaning that even if the value of x is given it is extremely difficult to find the
value of e when P is large; no efficient general method is known, so in practice it
comes down to trial and error.
The cryptosystem using the Diffie-Hellman cipher works by having the prime
modulus (P) and primitive root (R) as a public key. Let's suppose that these numbers
are 17 and 3, therefore $3^e \bmod 17 = x$. The sender then selects a private key
value that is used as the exponent of the primitive root; let's suppose that this
number is 15, therefore $3^{15} \bmod 17 = 6$. The result of this calculation is then
sent to the receiver. The receiver then selects his own private key value that is
used as the exponent of the primitive root; let's suppose that this number is 13,
therefore $3^{13} \bmod 17 = 12$. The result of this calculation is then sent back to
the sender. Both the sender and the receiver now have a result that combines the
public key with the other person's private key. The sender now takes the result of
the receiver's calculation and uses it in the equation in place of the primitive
root, giving $12^{15} \bmod 17 = 10$. The receiver also takes the result of the
sender's calculation and uses it in the equation in place of the primitive root,
giving $6^{13} \bmod 17 = 10$. Both the sender and the receiver end up with the same
shift because they did exactly the same calculation,
$3^{15 \times 13} \bmod 17 = 3^{195} \bmod 17 = 10$. By using each other's results
and applying their private keys to the equation it is as if they are doing
$(3^{13})^{15} \bmod 17 = 10$ and $(3^{15})^{13} \bmod 17 = 10$.
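The exchange can be replayed with Python's built-in three-argument `pow`, which computes a power modulo a number. The sketch uses the public values P = 17 and R = 3 and the private keys 15 and 13 from the example above; the variable names are chosen only for this illustration, and numbers this small offer no real security.

```python
P, R = 17, 3  # public prime modulus and primitive root
sender_private, receiver_private = 15, 13

# Each side raises the primitive root to its private key and publishes the result.
sender_public = pow(R, sender_private, P)      # 6
receiver_public = pow(R, receiver_private, P)  # 12

# Each side raises the other's public result to its own private key.
sender_shared = pow(receiver_public, sender_private, P)    # 10
receiver_shared = pow(sender_public, receiver_private, P)  # 10

print(sender_public, receiver_public)  # 6 12
print(sender_shared, receiver_shared)  # 10 10
# Both are the same as raising 3 to the product of the private keys.
print(pow(R, sender_private * receiver_private, P))  # 10
```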
Figure 11
This cipher is an extremely effective asymmetric cipher that allows both the
sender and the receiver to securely agree on the same shift without having to
exchange a secret key beforehand.
RSA Cipher
The RSA cipher can be considered an improvement upon the Diffie-Hellman
cipher and is in common use nowadays due to its security. James Ellis, a British
engineer and mathematician, came up with the idea of a cryptosystem where the
receiver would not have a private key like in the Diffie-Hellman cryptosystem;
instead the receiver would have a trapdoor key that would “unlock” a trapdoor one way
function. A trapdoor one way function is a function that is easy to compute in one
direction and hard to reverse, unless you have access to the trapdoor key. Clifford
Cocks, another British mathematician, developed James Ellis's idea by solving the
problem mathematically.
Clifford used the phi function (φ) defined by a mathematician called Euler. Phi is
used to measure the “breakability” of a number. φ(e) is the number of values less
than or equal to e that do not share a common factor (other than 1) with e. For
example, for φ(8) we check all values that are smaller than or equal to 8
(1,2,3,4,5,6,7,8) to see whether they have a common factor with 8. The numbers that
do not have a common factor with 8 are 1, 3, 5 and 7, therefore φ(8) = 4. The phi
function can be hard to calculate for ordinary numbers; however, for prime numbers
(P) it is incredibly easy, since a prime shares no common factor with any smaller
positive number, therefore φ(P) = P-1. The phi function is also multiplicative,
meaning that φ(A*B) = φ(A) * φ(B) whenever A and B have no common factor.
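The definition of the phi function translates almost word for word into Python. The sketch below counts directly, which is fine for small numbers but far too slow for the sizes used in real cryptography. The examples 15 = 3 × 5 and 8 = 4 × 2 are chosen only for illustration.

```python
from math import gcd


def phi(e: int) -> int:
    """Count the values 1..e whose greatest common factor with e is 1."""
    return sum(1 for value in range(1, e + 1) if gcd(value, e) == 1)


print(phi(8))  # 4  (1, 3, 5 and 7)
print(phi(7))  # 6  (for a prime P, phi(P) = P - 1)

# Multiplicative when the two numbers share no common factor:
print(phi(15), phi(3) * phi(5))  # 8 8
# ...but not in general: 4 and 2 share the factor 2.
print(phi(8), phi(4) * phi(2))   # 4 2
```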
Clifford then used Euler's Totient Theorem to connect the phi function with the
modular exponentiation seen in the Diffie-Hellman cipher. The phi function can be
applied to the $R^{m \times n} \bmod P = x$ equation since φ(m) * φ(n) = φ(e).
However the most important relationship between the phi function and modular
exponentiation is the following: $R^{\varphi(e)} \equiv 1 \pmod{e}$, which holds
whenever R and e share no common factor. Suppose R=5 and e=8, then
$5^{\varphi(8)} \equiv 1 \pmod{8}$, where $5^{\varphi(8)} = 5^4 = 625$ and
$625 \bmod 8 = 1$ (notice that $625 = 624 + 1 = 78 \times 8 + 1$).
Clifford then developed this equation using two simple rules. The first,
$1^k = 1$, allows any exponent k to be added to the equation:
$R^{k \varphi(e)} \equiv 1 \pmod{e}$. The second, $1 \times R = R$, allows both sides
to be multiplied by R: $R \times R^{k \varphi(e)} \equiv R \pmod{e}$, which can be
simplified to this final equation: $R^{k \varphi(e) + 1} \equiv R \pmod{e}$.
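The relationships above can be checked numerically for the example R = 5 and e = 8. This is only a spot check for a handful of exponents k, not a proof, and the `phi` helper is a simple counting version written for this sketch.

```python
from math import gcd


def phi(e: int) -> int:
    """Count the values 1..e that share no common factor with e."""
    return sum(1 for value in range(1, e + 1) if gcd(value, e) == 1)


R, e = 5, 8
print(5 ** phi(e))         # 625
print(pow(R, phi(e), e))   # 1

# The final equation: R^(k*phi(e) + 1) leaves remainder R for every k tried.
for k in range(1, 6):
    print(k, pow(R, k * phi(e) + 1, e))  # the second number is always 5
```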
message hi that the sender then sends to the receiver. The receiver can now easily
decrypt the message by using the trapdoor key m=2011 by doing
$1394^{2011} \bmod 3127 = 89$. In order to decrypt the message one needs the
trapdoor key, and the only known practical way to calculate the value of the
trapdoor key is through the prime factorization of e, which can be so large that it
would take years for a computer to be able to calculate the values of m and n.
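The decryption step can be replayed in Python. The modulus 3127, the trapdoor key 2011, the ciphertext 1394 and the message 89 come from the text above. The primes 53 and 59 and the public exponent 3 are assumptions made for this sketch: they are consistent with those numbers (53 × 59 = 3127), but the part of the example that would confirm them is not shown. Computing a modular inverse with `pow(x, -1, n)` needs Python 3.8 or newer.

```python
# Assumed values, consistent with the modulus 3127 used in the text.
p, q = 53, 59
public_exponent = 3

modulus = p * q                    # 3127
phi_modulus = (p - 1) * (q - 1)    # 3016, using phi(P) = P - 1 for each prime

# The trapdoor key is the inverse of the public exponent modulo phi(modulus).
trapdoor_key = pow(public_exponent, -1, phi_modulus)

message = 89
ciphertext = pow(message, public_exponent, modulus)
recovered = pow(ciphertext, trapdoor_key, modulus)

print(trapdoor_key)  # 2011
print(ciphertext)    # 1394
print(recovered)     # 89
```

Anyone who could factor 3127 back into 53 × 59 could repeat the second half of this sketch, which is why the modulus has to be far too large to factor in practice.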
Conclusion
In view of all the different symmetric and asymmetric ciphers analyzed it is
possible to conclude a number of things. The ancient cryptosystems that use a
symmetric key are useful only to a certain extent, since they rely on the receiver
and the sender agreeing on a key before exchanging information. The Caesar cipher is
the most basic cipher and yet it already presents a few difficulties when one wants
to break it. However, frequency analysis easily exposes the key of the cipher. The
Vigenere cipher is essentially an improvement of the Caesar cipher with the intention
of making the cipher harder to break. The letter frequency table of a Vigenere
ciphertext appears flatter than that of a Caesar ciphertext and therefore it is
harder to break. The One Time Pad is the strongest symmetric key cipher, since its
random key gives a completely even letter frequency distribution. All ciphers
mentioned have different methods of decryption, yet there is one method that is
common to all of them and that can show the difference in strength between ciphers.
This method is the brute-force method
previously described. For the Caesar cipher there are 26 possible keys; for the
Vigenere cipher there are $26^k$ possible keys, where $k$ is the key length; and for
the One Time Pad cipher there are $26^m$ possible keys, where $m$ is the total length
of the message. Let's suppose that the word HELLO is encrypted and that $k = 3$. If
it is encrypted using the Caesar cipher the probability of guessing the key is
$\frac{1}{26}$; if it is encrypted using the Vigenere cipher the probability of
guessing the key is $\frac{1}{26^3} = \frac{1}{17576}$; and finally if it is
encrypted using the One Time Pad cipher the probability of guessing the key would be
$\frac{1}{26^5} = \frac{1}{11881376}$. The increase in strength is extremely
significant.
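These probabilities can be reproduced exactly with Python's `fractions` module. The function name is hypothetical; the key lengths 1, 3 and 5 correspond to the Caesar cipher, the Vigenere cipher with k = 3 and the One Time Pad for the five-letter word HELLO.

```python
from fractions import Fraction

ALPHABET_SIZE = 26


def guess_probability(key_length: int) -> Fraction:
    """Chance of guessing a key of `key_length` letters in a single attempt."""
    return Fraction(1, ALPHABET_SIZE ** key_length)


print(guess_probability(1))  # 1/26        Caesar cipher
print(guess_probability(3))  # 1/17576     Vigenere cipher, k = 3
print(guess_probability(5))  # 1/11881376  One Time Pad for HELLO
```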
randomness while asymmetric with prime numbers. Essentially the strength of both
cryptosystems is based on the time taken to break them. The strongest ciphers
presented in both cryptosystems require an increasingly large amount of time to
break, so they can essentially be considered to be based on the same premise: the
amount of time needed to break the cipher is so large that by the time one is able to
decrypt it, it is no longer the same.
Bibliography
http://www.math.brown.edu/~jhs/MathCryptoHome.html
https://courses.cs.washington.edu/courses/csep521/97au/notes/lect6-html/sld049.htm
https://www.khanacademy.org/computing/computer-science/cryptography
https://www.khanacademy.org/computing/computer-science/cryptography/modarithmetic/a/what-is-modular-arithmetic
http://en.wikipedia.org/wiki/One-time_pad
http://www.antilles.k12.vi.us/math/cryptotut/home.htm
http://www.antilles.k12.vi.us/math/cryptotut/mod_arithmetic.htm
MATHEMATICAL CRYPTOLOG
http://www.thonky.com/kryptos/index-of-coincidence/#interpreting-index-of-coincidence
http://sharkysoft.com/vigenere/
http://en.wikipedia.org/wiki/Index_of_coincidence