CS161 Labs

Posted on October 20, 2014

Introduction

In lecture you have written a couple of basic programs for encrypting text. In this lab we will

break those ciphers and crack the codes.

The message before encoding will be denoted as the plaintext and the encrypted message

will be denoted as the ciphertext. Encryption is simply a function from plaintext messages

to ciphertext messages. In this lab you will invert an unknown encryption function to get the

plaintext given only the ciphertext.

Dictionary-Based Attacks

The first class of attacks we will consider will be dictionary-based.

A Key Assumption

The basic hypothesis underlying such an attack is that the plaintext will mostly only have

words from a known dictionary (in our case, the file /usr/share/dict/words on the Linux

machines). Given that assumption, it is your task to construct attacks which

successfully decrypt the messages written up in this lab.

We exclude special characters other than spaces, and spaces will not be encrypted. The

special characters in some cases will remain in the ciphertext but our encryption algorithm

will ignore them. Not all of the words will be in the dictionary, but most will.

For each ciphertext we give you to decrypt we will let you know the set of ciphers that were

potentially used to encrypt it.

Breaking ROT-n

In class the rot-n code was introduced. In this part of the lab your goal is to break this and

decode the file c1.txt. Given a ciphertext, infer what n is.

A function that will be useful for cracking the code is:

hamming :: [String] -> [String] -> Int

which is a function to measure the distance between two strings based on how many

characters they share. You will also want to make use of the word list mentioned earlier.

Implement this section as a program

http://people.cs.uchicago.edu/~stoehr/cs161/posts/2014-10-20-lab-3.html

1/4

10/22/2014

where derot prints to standard output

n=[rotation factor]

[plaintext]

I will provide a ciphertext to decode, p1.txt among the lab files. It is highly recommended

that you create your own plaintexts, encode them using the rot program from class and then

see if you can decode them.
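
One way to approach this, sketched under a few assumptions: try all 26 rotations and keep the one whose decoding contains the most dictionary words. The names rotChar, score, crackRot and report are hypothetical, and the dictionary is passed in as a plain list of words rather than read from /usr/share/dict/words. This scoring is a simple word-membership count, not the hamming function mentioned above.

```haskell
import Data.Char (chr, isAsciiLower, isAsciiUpper, ord)
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Rotate one letter by n places (n may be negative); other characters,
-- such as spaces, are left alone.
rotChar :: Int -> Char -> Char
rotChar n c
  | isAsciiUpper c = shift 'A'
  | isAsciiLower c = shift 'a'
  | otherwise      = c
  where
    shift base = chr (ord base + (ord c - ord base + n) `mod` 26)

-- Number of words in a candidate plaintext that appear in the dictionary.
-- A list lookup is slow for a large word list; it keeps the sketch short.
score :: [String] -> String -> Int
score dict = length . filter (`elem` dict) . words

-- Undo each of the 26 possible rotations and keep the best-scoring one.
-- Returns the rotation factor n and the corresponding plaintext.
crackRot :: [String] -> String -> (Int, String)
crackRot dict cipher =
  maximumBy (comparing (score dict . snd))
    [ (n, map (rotChar (negate n)) cipher) | n <- [0 .. 25] ]

-- Format the answer as "n=<rotation factor>" followed by the plaintext.
report :: (Int, String) -> String
report (n, plain) = "n=" ++ show n ++ "\n" ++ plain
```

With the full word list, a set or a sorted structure would likely be a better choice than a list for the membership test.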

The Vigenère cipher is essentially a generalization of the Caesar rotation cipher. This cipher is contained in

c2.txt and the password used is of length 16. This means that you have 16 rotation ciphers

running simultaneously so if s is your ciphertext then s !! 0, s !! 16, s !! 32, etc.

all have the same rotation cipher. The same is true for s !! 1, s !! 17, etc.
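
The grouping described above can be sketched as follows. This assumes the spaces have already been stripped from the ciphertext, so that positions count letters only; chunksOf, columns and uncolumns are hypothetical helper names.

```haskell
import Data.List (transpose)

-- Break a list into consecutive chunks of length n (the last may be shorter).
-- Assumes n > 0; with n <= 0 this would not terminate on a non-empty list.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (h, t) = splitAt n xs in h : chunksOf n t

-- Column i holds s !! i, s !! (i + n), s !! (i + 2 * n), and so on,
-- i.e. all the letters encrypted with the same rotation.
columns :: Int -> String -> [String]
columns n = transpose . chunksOf n

-- Inverse of columns: interleave the columns back into one string.
uncolumns :: [String] -> String
uncolumns = concat . transpose
```

Each column can then be treated as an independent rotation cipher, and uncolumns puts the decoded columns back in their original order.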

Write a command-line tool

./vindecoden [ciphertext file] [N]

which takes a ciphertext file and a password length N, then outputs the password and the plaintext:

password=[estimated password]

[plaintext]

For this problem all of the letters are upper case. Spaces are not encoded and are ignored by

the encryption, and there are no special characters: just [A-Z]. Again, write your own

Vigenère encryption and produce a test ciphertext to make sure that your code works before

embarking on the more challenging lab assignment of p2.txt.
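
For producing test ciphertexts, a Vigenère encryption might look like the sketch below. It rests on two assumptions that are not spelled out in the lab text: a key letter A means a shift of 0, and a space passes through without consuming a key letter. The names shiftUpper, vigenere, encrypt and decrypt are hypothetical.

```haskell
import Data.Char (chr, isAsciiUpper, ord)

-- Shift an upper-case letter by k places (k may be negative).
shiftUpper :: Int -> Char -> Char
shiftUpper k c = chr (ord 'A' + (ord c - ord 'A' + k) `mod` 26)

-- Apply the key letter by letter. The first argument is `id` to encrypt
-- and `negate` to decrypt. Non-letters are copied unchanged and do not
-- advance the key (an assumption about the lab's cipher).
vigenere :: (Int -> Int) -> String -> String -> String
vigenere _ [] text = text            -- empty key: nothing to apply
vigenere dir key text = go (cycle key) text
  where
    go (k : ks) (c : cs)
      | isAsciiUpper c = shiftUpper (dir (ord k - ord 'A')) c : go ks cs
      | otherwise      = c : go (k : ks) cs
    go _ _ = []

encrypt, decrypt :: String -> String -> String
encrypt = vigenere id
decrypt = vigenere negate
```

Encrypting a text of your own and checking that decrypt with the same password recovers it is a quick sanity test before attempting the real ciphertext.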

You may also assume that the password comes from words in the dictionary (recalling that

everything is upper case). Use that fact in your code.

In order to crack this code it is very likely you will make use of n-gram counts. A letter

n-gram count is the number of times a particular sequence of characters occurs in a text.

Consider the letter trigram VVV, which does not occur in any English word, whereas ORM does.

One may wish to check how many of the letter n-grams in a candidate decoding of the

ciphertext actually occur in the dictionary: i.e. do the candidate's letter n-gram statistics match

the dictionary letter n-gram statistics? This is generally a very hard question to work on but

it's made easier by the fact that the letter n-grams which occur in English are sparse and

highly concentrated. That is, most letter n-grams never occur in English so this can be a big

tip off that a particular decoding sequence is not possible.
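
To make the idea concrete, here is a deliberately naive sketch that lists the letter n-grams of a word and counts how many n-grams in a candidate decoding never appear inside any dictionary word. The names ngrams and impossibleCount are hypothetical, and the dictionary is assumed to be a small in-memory list; scanning the whole list for every n-gram is far too slow for the real word list.

```haskell
import Data.List (isInfixOf, tails)

-- All contiguous letter n-grams of a word,
-- e.g. ngrams 3 "FORM" gives ["FOR", "ORM"].
ngrams :: Int -> String -> [String]
ngrams n s = [ g | t <- tails s, let g = take n t, length g == n ]

-- Number of n-grams in the candidate (taken word by word) that do not
-- occur inside any dictionary word. A high count suggests a bad decoding.
impossibleCount :: Int -> [String] -> String -> Int
impossibleCount n dict candidate =
  length
    [ g
    | w <- words candidate
    , g <- ngrams n w
    , not (any (g `isInfixOf`) dict)
    ]
```

A candidate decoding with many impossible n-grams can be discarded early, before any more expensive dictionary matching.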

To use this idea for decoding we need to build a data structure to efficiently hold and retrieve

letter n-gram counts. A list could work for this purpose but it would be very slow to use.

Since we are always counting letters A through Z we can encode them (using ord and

some simple arithmetic) as the numbers 0 through 25, which means that they can be

expressed with a 5 bit binary number. Our efficient data-structure will map 5 bit binary

numbers to counts.
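
A minimal sketch of that encoding, assuming upper-case input only. The names letterCode, codeLetter and ngramCode are hypothetical; ngramCode shows one possible way to pack several letters into a single number using 5 bits per letter.

```haskell
import Data.Char (chr, ord)

-- Map 'A'..'Z' to a small number; each result fits in 5 bits.
letterCode :: Char -> Int
letterCode c = ord c - ord 'A'

-- Inverse of letterCode.
codeLetter :: Int -> Char
codeLetter k = chr (k + ord 'A')

-- Pack an n-gram into one number, 5 bits per letter, with the first
-- letter in the lowest 5 bits. An n-gram of length k needs 5 * k bits.
ngramCode :: String -> Int
ngramCode = foldr (\c acc -> letterCode c + 32 * acc) 0
```

Putting the first letter in the lowest bits is a design choice of this sketch; it means the first letter is the first one consumed when a number is read bit by bit with mod and div.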

We create a tree data-structure that is constrained to have a particular height and with leaf

nodes that record counts.

data Tree = Leaf Int | Node Int Tree Tree deriving (Show, Eq)

Some example trees are here:

l0 = Leaf 10
l3 = Leaf 3
l1 = Leaf 4
l2 = Leaf 0
t0 = Node 1 l2 l0
t1 = Node 1 l3 l1
t2 = Node 2 t1 t0

These have been named suggestively to indicate how the tree works. We need to have a

constructor

initTree n

which outputs a tree for a binary representation with n bits. Include a type signature and a

definition for that function. We will also want a getter-function

treeCount t w

which, given a tree t and a number w, returns the tree's count for w. We also want a

setter function which updates the tree

treeInsert t w

which will update the count that tree t has for number w. For both of the functions above

write a type signature and recursive definition. A hint for writing them is that the base case

(where the tree is just a Leaf Int) is obvious and you should just return the count. Next try

to handle the case for a tree formed with one root node and two leaves and think about how

the algorithm should recurse. Then handle the case where you have two levels and hence four

leaves, etc. The definition for each function shouldn't be longer than five lines. mod and div

are your friends here: review them if you don't know what they do.
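
One possible shape for these three functions is sketched below. It makes two assumptions: the lowest bit of w selects the branch (1 goes left, 0 goes right, which is how the example trees appear to be laid out), and treeInsert adds one to the count for w. Treat it as an illustration of the recursion rather than the intended solution.

```haskell
data Tree = Leaf Int | Node Int Tree Tree deriving (Show, Eq)

-- A tree of height n (2^n leaves) with every count set to zero.
initTree :: Int -> Tree
initTree n
  | n <= 0    = Leaf 0
  | otherwise = Node n sub sub
  where
    sub = initTree (n - 1)

-- Look up the count for w: the lowest bit of w picks the branch,
-- then the search continues with the remaining bits (w `div` 2).
treeCount :: Tree -> Int -> Int
treeCount (Leaf c) _ = c
treeCount (Node _ l r) w
  | w `mod` 2 == 1 = treeCount l (w `div` 2)
  | otherwise      = treeCount r (w `div` 2)

-- Add one to the count for w (an assumed meaning of "update"),
-- rebuilding only the nodes on the path to that leaf.
treeInsert :: Tree -> Int -> Tree
treeInsert (Leaf c) _ = Leaf (c + 1)
treeInsert (Node h l r) w
  | w `mod` 2 == 1 = Node h (treeInsert l (w `div` 2)) r
  | otherwise      = Node h l (treeInsert r (w `div` 2))
```

For single letters, initTree 5 gives 32 leaves, enough for the 26 letter codes; inserting the same number twice and then calling treeCount on it should return 2.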

You will want to write an interface for this tree that handles the abstraction from character to

integer. It is up to you to decide how to handle that abstraction. asciiTreeInsert, for

instance, is one way to go. You will also want to generalize to the case where you have

multiple characters since we care about n-gram statistics. It's difficult to make the tree

structure handle any particular length of n-gram, so just define a tree for the shorter n-grams

and use those to generate counts. You will want to think about the underlying binary

representation when doing this.

The most general problem is where you do not know the length of the password. You will write a tool

./vindecode [ciphertext file]

which outputs the password and the plaintext. To get this program to work you will want to

use some trick to reduce the search space of possible password lengths or you can just try to

brute-force it. To cut down the size of the search space you may want to think about what's

going to happen if the same word appears in the plaintext multiple times in the same location

modulo the password length (i.e. same place relative to the password). The double hint is that

you may want to look at the gcd of the text differences between multiple occurrences. There

is noise in calculating the gcd so you'll want to focus on the gcd of subsets of the repetition

periods (and how long the repetition is). The file to decode is here: c3.txt.
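
The gcd hint can be sketched for a single repeated pattern as follows. This assumes the spaces have been removed from the ciphertext so that positions count letters only, and it ignores the noise issue mentioned above by taking the gcd of all gaps for one pattern. The names positions, gaps and periodGuess are hypothetical.

```haskell
import Data.List (isPrefixOf, tails)

-- Start positions of every occurrence of pat in s.
positions :: String -> String -> [Int]
positions pat s = [ i | (i, t) <- zip [0 ..] (tails s), pat `isPrefixOf` t ]

-- Distances between consecutive occurrences of pat.
gaps :: String -> String -> [Int]
gaps pat s = zipWith (-) (drop 1 ps) ps
  where
    ps = positions pat s

-- gcd of all the gaps; 0 when pat occurs fewer than twice.
-- If the repeats come from the same plaintext at the same key offset,
-- the password length divides this number.
periodGuess :: String -> String -> Int
periodGuess pat s = foldr gcd 0 (gaps pat s)
```

Because some repeats are coincidental, a single accidental gap can drag the gcd down to 1. Looking at longer repeated patterns, or at the gcd of subsets of the gaps, is one way to cope with that.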

What to turn in

You should have written two programs: derot.hs and vindecoden.hs which perform

the first two tasks. For the extra-credit task you will turn in vindecode.hs. Grading will

be based on whether your files can decode the ciphertexts and whether the code clearly

demonstrates how you did it. Save your programs into a folder lab2 within your subversion

repository.

Your code should also include some code for your n-gram counting functions. Make sure that

you have the functions treeInsert and treeCount implemented with the appropriate

type signatures.

It is a good idea to try to decode the passages first without restricting yourself to automatic

algorithms. The code you hand in may take a while to decode the passages. If your code

is inefficient at performing the decrypting task then you should submit an example

demonstrating that your code does work for the simpler example. Make a note in your

README file along with your submission to discuss practicality.
