# Project: Cryptanalysis using English Language Statistics and Keyword Search

Instructor: Dr. Natarajan Meghanathan

Project Objective: You are given a ciphertext in English, and the objective is to determine the plaintext through cryptanalysis that uses English language statistics and keyword search.

Project Description: The cipher used for the encryption is either the Caesar Cipher (a substitution-based cipher) or Columnar Transposition (a permutation-based cipher). You will first obtain the English language statistics of the ciphertext to find out whether a substitution-based or a permutation-based cipher was used. The integer key (the shift key for the Caesar Cipher, or the number of columns for Columnar Transposition) is between 2 and 25, inclusive. You will then develop an automated, smarter decryption program that finds the correct integer key and the corresponding plaintext without a brute-force approach of trying all possible key values. You can make use of the fact that the plaintext contains words such as “location”, “network”, and “nodes” at several places. As you decrypt, search for these standalone words in the decrypted plaintext; if you find them at several places (you can choose the minimum number of times each keyword must be found), continue the decryption with that key and print the entire plaintext. Note that the logic to decide on the correct key and the appropriate plaintext should be embedded within your decryption code. You need not search through the entire key space once you have found the right integer key.

Use of English Language Statistics: When you obtain the English language statistics of the ciphertext, if you find the frequency distribution of the characters to be similar to that of standard English text (shown below), then the cipher used is likely to be a permutation-based cipher. You can then start writing the decryption code for Columnar Transposition and attempt to find the plaintext.
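
As an illustration of the statistics step, the percentage frequency distribution of the ciphertext could be computed along the following lines. This is only a sketch: the function names `letter_percentages` and `print_distribution` are hypothetical, and it assumes that only the letters a-z are counted, with uppercase and lowercase folded together.

```python
from collections import Counter
import string


def letter_percentages(text):
    """Return {letter: percentage} for a-z, folding upper and lower case together."""
    # Assumption: digits, spaces, and punctuation are ignored in the statistics.
    letters = [ch.lower() for ch in text if ch.isascii() and ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return {
        ch: (100.0 * counts[ch] / total if total else 0.0)
        for ch in string.ascii_lowercase
    }


def print_distribution(text):
    """Print the letters from most to least frequent, one per line."""
    table = letter_percentages(text)
    for ch, pct in sorted(table.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{ch}: {pct:5.2f}%")
```

Calling `print_distribution` on the contents of the ciphertext file gives a table that can be compared by eye against the standard English distribution.
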
If the shape of the percentage frequency distribution of the ciphertext matches that of standard English text, but the corresponding characters are different, the cipher used is likely to be a substitution-based cipher. Try to find the probable shift key by comparing the two distributions. Then develop the decryption code for the Caesar Cipher and attempt to find the plaintext.
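
One possible way to turn this comparison into a short list of probable shift keys is sketched below. It is an assumption-laden illustration, not the required method: it assumes encryption of the form c = (p + key) mod 26, it assumes that the most frequent English letters are e, t, a, o, i, n (a commonly cited ordering), and the name `candidate_shifts` is hypothetical.

```python
from collections import Counter

# Assumption: commonly cited ordering of the most frequent English letters.
COMMON_ENGLISH = "etaoin"


def candidate_shifts(ciphertext, top_n=3):
    """Return probable Caesar shift keys, most plausible first.

    Each of the top_n most frequent ciphertext letters is paired with each
    common English letter; the implied shift is kept if it lies in 2..25.
    """
    letters = [ch.lower() for ch in ciphertext if ch.isascii() and ch.isalpha()]
    top_cipher = [ch for ch, _ in Counter(letters).most_common(top_n)]
    candidates = []
    for c in top_cipher:
        for p in COMMON_ENGLISH:
            shift = (ord(c) - ord(p)) % 26  # assumes c = (p + key) mod 26
            if 2 <= shift <= 25 and shift not in candidates:
                candidates.append(shift)
    return candidates
```

The first entry corresponds to the most frequent ciphertext letter standing for "e", so the correct key often appears near the front of the list when the ciphertext is long enough.
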

Note: (1) If your cipher turns out to be Caesar Cipher, you will decrypt only characters a-z and A-Z; all other characters in the plaintext are retained in the ciphertext. (2) In the case of Columnar Transposition, all characters (including characters other than a-z and A-Z) in the plaintext are subject to diffusion. (3) You need to retain an uppercase character in the ciphertext as an uppercase character in plaintext and similarly, a lowercase character in the ciphertext as a lowercase character in plaintext.
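
The notes above and the keyword test can be combined as in the following sketch. All function names are hypothetical. For Columnar Transposition the sketch assumes that the plaintext was written row by row, that the columns were read left to right, and that the last row was not padded; the actual ciphertext may follow a different convention. The threshold `min_hits` is a choice left to the student.

```python
import re

KEYWORDS = ("location", "network", "nodes")  # keywords named in the handout


def caesar_decrypt(ciphertext, key):
    """Shift a-z and A-Z back by key; keep case and leave other characters alone."""
    out = []
    for ch in ciphertext:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") - key) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") - key) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def columnar_decrypt(ciphertext, cols):
    """Undo a transposition written row by row and read column by column.

    Assumption: columns are read left to right and the last row is not padded.
    """
    full_rows, extra = divmod(len(ciphertext), cols)
    columns, pos = [], 0
    for c in range(cols):
        length = full_rows + (1 if c < extra else 0)  # first `extra` columns are longer
        columns.append(ciphertext[pos:pos + length])
        pos += length
    rows = full_rows + (1 if extra else 0)
    return "".join(
        columns[c][r] for r in range(rows) for c in range(cols) if r < len(columns[c])
    )


def keyword_hits(text, keyword):
    """Count standalone, case-insensitive occurrences of keyword."""
    return len(re.findall(rf"\b{re.escape(keyword)}\b", text, flags=re.IGNORECASE))


def find_key(ciphertext, decrypt, candidate_keys, min_hits=2):
    """Return (key, plaintext) for the first key whose plaintext contains every
    keyword at least min_hits times, or None if no candidate qualifies."""
    for key in candidate_keys:
        plaintext = decrypt(ciphertext, key)
        if all(keyword_hits(plaintext, w) >= min_hits for w in KEYWORDS):
            return key, plaintext  # stop as soon as a key passes the test
    return None
```

For example, `find_key(ciphertext, caesar_decrypt, keys)` could be called with a short list of probable shift keys, and `find_key(ciphertext, columnar_decrypt, keys)` with the column counts to be tried; in both cases the search stops at the first key that passes the keyword test.
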

What to submit: Submit as hardcopy to the instructor as well as e-mail to natarajan.meghanathan@jsums.edu. You need to submit everything together as one comprehensive report. For e-mail submission, put everything in a zip file and attach it.

• The percentage frequency distribution of the characters in the ciphertext
• Your conclusion of which class of cipher (substitution or permutation) is used
• The code to find the percentage frequency distribution of characters in the ciphertext
• The decryption code for the Caesar Cipher or Columnar Transposition, depending on what class of cipher was used
• Report the correct shift key (in case of Caesar Cipher) or the number of columns (in case of Columnar Transposition)
• Briefly explain your strategy for keyword-based cryptanalysis in your decryption code
• A screenshot of the plaintext as the decryption is in progress

Ciphertext File: The project website has the ciphertext file assigned to each student.