A Cryptography Primer

Published by George C. Bragg
A high-level overview of what cryptography is, how it works, and what the major kinds are. "Crypto for dummies", essentially.

Published by: George C. Bragg on Dec 06, 2011
Copyright: Attribution Non-commercial


07/01/2014


The word cryptography comes from the Greek words kryptos (hidden) and graphein (to write). Thus, even from the word itself we find a strong hint as to its meaning. A strict definition of the word as currently used would be “the science and study of making and using secret writing, such as codes and ciphers”. The intent of cryptography is simple: if you cannot trust that your messages will not be intercepted by a third party, then you mask the messages in such a way that others cannot easily read them.

Early methods focused more on ciphers and codes, such as hidden alphabets. One of the earliest known examples was the Caesar cipher. In order to communicate with his generals, Caesar had his messages encoded using a “shift”: each letter would be shifted down the alphabet by a pre-arranged amount. So, for example, if the “shift” was pre-arranged to be 3, every letter “A” in the message would be replaced by a “D”, every “B” would be replaced by an “E”, and so on. In this sort of code, if you reach the end of the alphabet you simply wrap around to the beginning again. So, in our example above, “X” would become “A”, “Y” would become “B”, and “Z” would become “C”. This technique is still sometimes used today, most commonly as part of a larger encryption system such as the Vigenère cipher. Modern computer systems may also include a “ROT-13” function, which is basically a Caesar cipher with a shift of 13 (half the alphabet).

On consideration, you should be able to spot the weakness in the Caesar cipher. Hint: what would be the maximum number of attempts that would be required to calculate all possible shifts of a given message?

Secret alphabets have also been used in the past. This could either be a mapping of letters and digits to non-standard symbols, or a remapping of the letters and digits to other, random positions in the symbol set. So, for example, A could become X, B could become J, C could become 6, and so on.
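As a rough illustration of the Caesar shift described above, and of the weakness the hint points at, here is a minimal Python sketch. The function names `caesar` and `brute_force` are made up for this example, and the sketch assumes only the 26 letters A to Z are shifted while everything else is left alone.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A' through 'Z'


def caesar(text, shift):
    """Shift each letter by `shift` places, wrapping past Z back to A."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)  # spaces, digits and punctuation are left alone
    return "".join(out)


def brute_force(ciphertext):
    """Try every possible shift; there are only 25 that change the text."""
    return [(s, caesar(ciphertext, -s)) for s in range(1, 26)]


secret = caesar("ATTACK AT DAWN", 3)
print(secret)                            # DWWDFN DW GDZQ
print(caesar(caesar("HELLO", 13), 13))   # ROT-13 applied twice: HELLO

# An eavesdropper only has to read 25 candidate lines to find the message.
for shift, guess in brute_force(secret):
    print(shift, guess)
```

Because ROT-13 shifts by exactly half the alphabet, applying it a second time restores the original text, which is why the same function both scrambles and unscrambles.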
The easiest way to crack these (assuming a standard set of symbols mapping 1-to-1 to English letters and digits) is via frequency analysis. It is well known that certain letters in the English language are relatively common (such as S, T, E, and A), and others are very uncommon in normal written language (such as X, Q, and Z, i.e. the high-value Scrabble tiles). By counting up all occurrences of each symbol, and then using a bit of trial-and-error, it can be possible to crack the code without having to try every possible combination. This can be made even easier if the spaces between words in the original message are preserved: you then know that (for example) a single symbol by itself must be either an I or an A, and a two-symbol pair by itself can be one of only a handful of words (such as “is”, “as”, “or”, and “it”). Modern computers, which can perform millions of operations a second, would be able to crack such a code in minutes at worst.

There is one application of secret alphabets which is still in use today, and which is impossible to break when used correctly: the one-time pad. Basically, the one-time pad is a sheet (or set of sheets) with a random, non-repeating set of symbols on it, and both parties (sender and recipient) must have a copy of the same pad. When someone wants to encode a message, they write each letter of the message under the next available symbol on the pad. After that, a simple combination is performed, where each letter of the original message is “added” to the symbol on the pad using their relative positions in the alphabet (A = 1 through Z = 26), resulting in a new symbol (again, wrapping around where appropriate). For example, consider the following:

WE ATTACK AT DAWN    Original message
QGJSOPEMHZIERNQAL    One-time pad
NLJTIJFPSZJYRRRXZ    Encrypted result

(In this instance, we have valued a space between words as 0, so the one-time pad character is used unchanged as the encrypted result.) The receiving party must use exactly the same sheet as the sender, write the message under the same characters, and then perform an equivalent “subtraction”. Without the code sheet from the one-time pad, the message is unrecoverable.

The main problem with one-time pads is that the pads themselves must be distributed, and thus could be intercepted and copied. Additionally, if you and your message recipient are not using the same pads, then the message is not recoverable. Note also that a given one-time pad should only be used once (hence the name): repeated use of the same pad may allow the bad guys to gain enough information to determine what’s on it.

Most modern encryption is math-based, using keys (secret numbers). A complex mathematical function is performed on the message (or on regular-sized “chunks” of the message, called blocks) using the key. Encryption may be either shared-key (also called symmetric-key, and meaning that both parties know and use the same secret number to encrypt and decrypt messages) or public-key (meaning each party has two different numbers, one a public key that you can tell everyone, and the other a private key that should not be revealed to anyone at all).
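The one-time pad arithmetic from the WE ATTACK AT DAWN example above can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, letters are valued A = 1 through Z = 26 with a space as 0 (matching the example), and the pad is assumed to contain only letters.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase


def value(ch):
    """A=1 ... Z=26; a space counts as 0, as in the worked example."""
    return 0 if ch == " " else ALPHABET.index(ch) + 1


def otp_encrypt(message, pad):
    """'Add' each message character to the pad letter above it, wrapping past Z."""
    if len(pad) < len(message):
        raise ValueError("the pad must be at least as long as the message")
    out = []
    for m, p in zip(message, pad):
        out.append(ALPHABET[(value(m) + value(p) - 1) % 26])
    return "".join(out)


def otp_decrypt(ciphertext, pad):
    """'Subtract' the pad letter; a difference of 0 is read as a space.

    Limitation of this simple scheme: a Z in the message also gives a
    difference of 0, so Z and space cannot be told apart on decryption.
    """
    out = []
    for c, p in zip(ciphertext, pad):
        diff = (value(c) - value(p)) % 26
        out.append(" " if diff == 0 else ALPHABET[diff - 1])
    return "".join(out)


def new_pad(length):
    """Generate a fresh random pad (to be used once and then destroyed)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


pad = "QGJSOPEMHZIERNQAL"
encrypted = otp_encrypt("WE ATTACK AT DAWN", pad)
print(encrypted)                    # NLJTIJFPSZJYRRRXZ
print(otp_decrypt(encrypted, pad))  # WE ATTACK AT DAWN
```

Decrypting with any other pad of the same length produces some other string of letters and spaces, which is why the intercepted text alone tells an eavesdropper nothing.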
Shared-key encryption is easier to implement, but (like one-time pads) requires that the key be sent in a secure way to prevent interception. The basic principle of shared-key encryption is represented in the diagram below:

Original message (“plaintext”) --[Math + Key]--> Encrypted message (“ciphertext”) --[Math + Key]--> Original message

Since the same key is used for both encryption and decryption, it should be obvious that if the bad guys have your key, they can read your mail. There’s another weakness in this form of system, which may not be immediately obvious: as computers get faster, “brute force” attacks (in which every possible key is tried until one works) become easier. And, once your key is compromised by a brute-force attack, you have to go and send a replacement key to everyone who’s supposed to have a copy, while making sure the new key is not intercepted by the bad guys.

Longer keys help against brute force. Key length is usually expressed as the number of “bits” in a binary representation of the key. Every bit doubles the number of possible keys, so that a 56-bit key (for example) has 2^8 (or 256) times as many possible values as a 48-bit key (since 56 is 8 more than 48).

One well-known example of this form of encryption is DES, more formally known as the Data Encryption Standard, which was introduced as a US Government standard in 1976. DES only had a 56-bit key length, so as time went on (and computers became more powerful), it became easier and easier to break DES via brute force. A 56-bit key may have 72,057,594,037,927,936 (72 quadrillion) possible combinations, but when you can process billions of keys a second (as with modern parallel processing techniques), it doesn’t take long to break even that sort of number.

In response, it was strengthened to “Triple DES” (or 3DES), which applies the encryption algorithm three times to each block in an encryption/decryption/encryption cycle. Since DES is a symmetric-key encryption technique, using the same key for all three steps of the 3DES cycle would mean that it would be essentially the same as DES (first encrypting, then decrypting, and then finally re-encrypting the message all with the same key). Instead, the strongest form of 3DES uses 3 different 56-bit keys, one to encrypt the message, the second to “decrypt” (which really just encrypts it more), and then finally a third key to re-encrypt the message again.
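The key-length arithmetic above is easy to check for yourself. The short Python sketch below computes the number of possible keys for a given bit length and a worst-case brute-force time. The function names and the guessing rates are assumptions chosen purely for illustration, not measurements of any real hardware.

```python
def keyspace(bits):
    """Number of possible keys: every extra bit doubles it."""
    return 2 ** bits


def worst_case_days(bits, keys_per_second):
    """Days needed to try every key at a given (hypothetical) guessing rate."""
    seconds = keyspace(bits) / keys_per_second
    return seconds / 86_400  # seconds in a day


print(keyspace(56))                   # 72057594037927936
print(keyspace(56) // keyspace(48))   # 256, i.e. 2^8

# Hypothetical rates: one billion and one hundred billion keys per second.
for rate in (1e9, 1e11):
    print(f"56-bit key at {rate:.0e} keys/s: {worst_case_days(56, rate):,.1f} days")
```

On average an attacker finds the key after searching about half of the keyspace, so typical times would be roughly half the worst-case figures.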

Most modern encryption research focuses instead on public-key encryption. In this technique, each party generates a “key pair”. One key is your public key: this can be distributed to anyone, and is used (together with someone else’s private key) to encrypt messages to you. The other key is your private key, used together with another person’s public key to decrypt messages from them, and should never be disclosed to anyone at all (no, not even your mother). One of the big advantages of this form of cryptography is that transmitting the public key is trivial: even if the bad guys have it, it’s only half of the equation (so to speak), so you can send it by any channel, even one that you know they’re listening on.

Original message (“plaintext”) --[Math + Your Public Key + Sender’s Private Key]--> Encrypted message (“ciphertext”) --[Math + Your Private Key + Sender’s Public Key]--> Original message

Without getting deeply into the mathematics involved, this technique relies on a useful property of prime numbers: the product of two prime numbers cannot be produced by multiplying any other pair of whole numbers (apart from the product itself and the number one, obviously). As a trivial example, the number 21 is the product of 3 and 7, and there is no other pair of numbers greater than one which, when multiplied together, also equals 21. Multiplying two very large primes together is quick, but working backwards from the product to the two primes is extremely slow. This trick is used to generate a key pair from two very large prime numbers: the public and private keys are both derived from the primes, and working out the private key from the public one would require factoring their product.

In practice, if Alice wants to send a message to Bob without it being intercepted by Eve, she encrypts the message using a combination of Bob’s public key and her private key. When Bob receives the message, he decrypts it using a combination of Alice’s public key and his private key. Eve may know both public keys, but she still can’t combine them to help her decrypt the message, so she’s stuck having to try to brute-force the key. Modern keys vary from 256 to over 1000 bits, so it can take a LONG time to try all possible combinations, even with modern supercomputers.

Note that quantum computing (if it ever becomes economically viable) would mean that methods of breaking codes that are currently impractical due to time limitations would become much more viable. For example, factoring the product of large prime numbers (as discussed above) is currently very computationally intensive and thus considered impractical. What good is it if you’ve cracked my credit card number a hundred years after I’m dead and gone?

However, an algorithm exists (Shor’s algorithm) which, when implemented on a quantum computer, would cut the time required to solve such a problem considerably.

An additional advantage to this form of encryption is that you can confirm the identity of the message sender (assuming their private key hasn’t been compromised by the bad guys), even if the message isn’t otherwise encrypted. If you want to send an unencrypted message, but still be able to verify the sender, then the message sender appends a signature to the bottom of their message that is generated with their private key. The receiver applies the sender’s public key to the signature, which should decrypt back to the original message. If it doesn’t, then the sender’s private key was not used, and the message should not be trusted.

The chief problem in public-key encryption is, how do you know you’ve got the “right” public key, and not one from an imposter? This is especially important in eCommerce: you don’t want to be sending your credit card information to the bad guys, thinking that you’re on your favourite store’s website! Here, assurance is gained via public-key infrastructure (PKI): a third party (usually called a “certificate authority” or “trusted third party”) is used as a registry for public keys, and they take the responsibility for confirming the identity of the other party. PKI includes mechanisms for creating, destroying, validating, and revoking keys. A well-known example of a certificate authority for eCommerce is Verisign.

One advantage to shared-key encryption is that it is generally less “computationally intensive” than public/private key pairs, meaning it’s well-suited for applications where speed is at least as important as the strength of the encryption (e.g. real-time encryption of voice traffic over a phone line).
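To make the prime-number trick and the signature check described above a little more concrete, here is a toy sketch using textbook RSA, one well-known scheme of this kind. Several things here are assumptions made for illustration: the tiny primes (61, 53, 89, 97), the exponent 17, and the function names `make_keypair` and `apply_key`. The sketch is also simplified compared with the description above: it encrypts with the recipient's public key alone and uses the sender's private key only for the signature. Real systems use primes hundreds of digits long plus padding, and this code should not be used to protect anything.

```python
from math import gcd


def make_keypair(p, q, e=17):
    """Derive a (public, private) key pair from two primes (textbook RSA)."""
    n = p * q                    # the product is public; p and q stay secret
    phi = (p - 1) * (q - 1)      # easy to compute only if you know p and q
    if gcd(e, phi) != 1:
        raise ValueError("e must share no factor with (p-1)*(q-1)")
    d = pow(e, -1, phi)          # modular inverse; needs Python 3.8 or later
    return (n, e), (n, d)


def apply_key(key, number):
    """Raise `number` to the key's exponent, modulo n."""
    n, exponent = key
    return pow(number, exponent, n)


bob_public, bob_private = make_keypair(61, 53)
alice_public, alice_private = make_keypair(89, 97)

message = 65  # a message encoded as a number smaller than both values of n

# Confidentiality: anyone can encrypt to Bob, only Bob can decrypt.
ciphertext = apply_key(bob_public, message)
print(ciphertext)                          # 2790
print(apply_key(bob_private, ciphertext))  # 65

# Signature: Alice signs with her private key, anyone can check it
# with her public key.
signature = apply_key(alice_private, message)
print(apply_key(alice_public, signature) == message)  # True
```

Eve can see `n` in each public key, but to rebuild the private exponent she would need `(p-1)*(q-1)`, and hence the two primes. With the toy numbers here factoring is instant; with very large primes it is not.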
There are various ways of breaking encryption beyond brute force. Some of them rely on particular quirks of the specific encryption algorithm in use, and others rely on having additional information beyond that readily available. For example, suppose I can intercept your coded transmission (probably fairly easily, since you’ve encoded it you likely assume I can’t read it and thus are transmitting it openly), and I can also get a copy of the unencoded message (say because someone forgets to burn the message form after transmission and I raid your garbage, or I have a spy on the inside). By comparing the two, I may be able to determine not only what method of encryption you’re using, but also what your encryption keys are. If so, any future messages you send with the same keys will be easily readable by me! This is called a “known-plaintext attack”.

Sometimes, even though the message itself is unreadable by an interceptor, the very fact of transmission can give the bad guys information they shouldn’t have. For example, “radio chatter” often increases right before a military operation or police raid is launched. Even if I can’t decrypt the transmissions, a sudden spike in transmissions probably means you’re up to something.
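As a small illustration of the known-plaintext attack described above, the Python sketch below recovers the key of a simple Caesar shift from one matching pair of plaintext and ciphertext, and then reads a later message. The messages, the key value of 7, and the function names are all invented for this example. Modern ciphers are designed so that a matching pair does not give away the key this easily.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar(text, shift):
    """Shift each letter by `shift` places, wrapping around the alphabet."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


def recover_shift(known_plaintext, ciphertext):
    """Compare the first pair of letters to work out the secret shift."""
    for p, c in zip(known_plaintext.upper(), ciphertext.upper()):
        if p in ALPHABET and c in ALPHABET:
            return (ALPHABET.index(c) - ALPHABET.index(p)) % 26
    raise ValueError("no letters to compare")


secret_key = 7  # hypothetical key, known only to sender and recipient

# The attacker intercepts this ciphertext and finds the original in the garbage.
old_plaintext = "SUPPLIES ARRIVE FRIDAY"
old_ciphertext = caesar(old_plaintext, secret_key)
shift = recover_shift(old_plaintext, old_ciphertext)
print(shift)  # 7

# Any later message sent with the same key can now be read.
new_ciphertext = caesar("ATTACK AT DAWN", secret_key)
print(caesar(new_ciphertext, -shift))  # ATTACK AT DAWN
```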
