Table of Contents

Theory of Cryptography............................................................................................................................. 13
A 10-paragraph Introduction to Ciphers ....................................................................................................... 13
A Brief History of Cryptography ............................................................................................................... 13
Cipher Construction ............................................................................................................................... 14
Kerckhoffs's principle ................................................................................................................................. 15
Kerckhoffs's research publications ........................................................................................................... 15
Kerckhoffs's principle today .................................................................................................................... 15
Steganography .......................................................................................................................................... 16
A Brief History of Steganography ............................................................................................................. 16
Modern steganography .......................................................................................................................... 16
Protocols................................................................................................................................................... 17
Arbitration Protocols .............................................................................................................................. 17
Dispute Protocols................................................................................................................................... 17
Self-Enforcing Protocols.......................................................................................................................... 18
Attacking Protocols ................................................................................................................................ 18
TCP/IP Protocols ....................................................................................................................................... 18
Application Layer................................................................................................................................... 21
Transport Layer..................................................................................................................................... 22
Internet Layer........................................................................................................................................ 25
Network Interface Layer ......................................................................................................................... 26
Internet Relay Chat................................................................................................................................. 29
Notation of Numbers.................................................................................................................................. 31
Base conversion algorithm...................................................................................................................... 32
Number of digits .................................................................................................................................... 32
Binary numbers......................................................................................................................................... 32
Notation of numbers in NBS .................................................................................................................... 33
Notation of numbers in C2....................................................................................................................... 35
Time Estimation of Mathematical Operations................................................................................................ 38
Binary addition ...................................................................................................................................... 38
Binary multiplication.............................................................................................................................. 38
Factorial................................................................................................................................................ 39
Polynomial time..................................................................................................................................... 39
Modular Arithmetic (Clock Arithmetic)......................................................................................................... 40
Modular inversion.................................................................................................................................. 40
Calculating inverse numbers in ZN ............................................................................................................ 40
Information-theoretic security of ciphers ...................................................................................................... 41
Perfect Security...................................................................................................................................... 41
Semantic Security................................................................................................................................... 42
Padding Mechanisms ................................................................................................................................. 42
Bit Padding............................................................................................................................................ 42
TBC (Trailing Bit Complement) Padding ................................................................................................... 42
PKCS#5 and PKCS#7 Padding ................................................................................................................. 43
ISO 7816-4 Padding ............................................................................................................................... 43
ISO 10126-2 Padding.............................................................................................................................. 43
ANSI X9.23 Padding................................................................................................................................ 43
Zero Byte Padding.................................................................................................................................. 43
Block Ciphers Modes of Operation ............................................................................................................... 44
ECB (Electronic Codebook) Mode ............................................................................................................ 44
CBC (Cipher-Block Chaining) Mode.......................................................................................................... 46
Security of the CBC mode ........................................................................................................................ 47
PCBC (Propagating or Plaintext Cipher-Block Chaining) Mode .................................................................... 48
CFB (Cipher Feedback) Mode .................................................................................................................. 49
OFB (Output Feedback) Mode ................................................................................................................. 50
CTR (Counter) Mode .............................................................................................................................. 50
Security of the CTR mode ........................................................................................................................ 51
Pseudorandom Generator (PRG)................................................................................................................. 52
PRG Implementation.............................................................................................................................. 52
PRG Output Quality ................................................................................................................................ 52
Pseudorandom Functions and Permutations ................................................................................................ 53
Creating PRF from PRG........................................................................................................................... 53
One-way Function...................................................................................................................................... 55
Trapdoor one-way function .................................................................................................................... 55
One-way hash function ........................................................................................................................... 55
Hash function expanding ........................................................................................................................ 55
Hash functions based on block ciphers ..................................................................................................... 56
Message Authentication Code (MAC) ........................................................................................................... 57
MAC algorithms based on PRF................................................................................................................. 57
CBC MAC............................................................................................................................................... 58
NMAC................................................................................................................................................... 59
CMAC ................................................................................................................................................... 60
PMAC ................................................................................................................................................... 60
One-time MAC ....................................................................................................................................... 61
Carter-Wegman MAC ............................................................................................................................. 62
HMAC................................................................................................................................................... 62
Password-Based Encryption (PBE).............................................................................................................. 63
Salt ....................................................................................................................................................... 65
Iteration Count....................................................................................................................................... 65
Software Signing and Authorisation ............................................................................................................. 65
Secure device architecture ...................................................................................................................... 65
Software signing .................................................................................................................................... 66
Secure software flashing ......................................................................................................................... 68
Software signing usage today .................................................................................................................. 69
Index of Coincidence................................................................................................................................... 70
Using IC in cryptography......................................................................................................................... 70
Expected values for some languages......................................................................................................... 71
What Is Encryption? ......................................................................................................................... 71
Encryption Types / Methods ........................................................................................................... 72
Encryption Algorithms ...................................................................................................................... 72
Encryption Standards ...................................................................................................................... 73
File Encryption Overview ................................................................................................................ 74
Disk Encryption Overview ............................................................................................................... 74
Email Encryption Overview ............................................................................................................. 74
Encryption Best Practices ............................................................................................................... 74
All Simple Ciphers..................................................................................................................................... 75
Simple Substitution Ciphers ........................................................................................................................ 75
Usage.................................................................................................................................................... 75
Description............................................................................................................................................ 75
Security of simple substitution ciphers...................................................................................................... 76
Simple Substitution Ciphers: ............................................................................................................ 76
Caesar Cipher ............................................................................................................................................ 76
Usage.................................................................................................................................................... 76
Algorithm.............................................................................................................................................. 77
Security of the Caesar cipher.................................................................................................................... 77
Implementation..................................................................................................................................... 77
ROT13 ...................................................................................................................................................... 78
Usage.................................................................................................................................................... 78
Algorithm.............................................................................................................................................. 78
Security of the ROT13 cipher ................................................................................................................... 78
Implementation..................................................................................................................................... 78
Homophonic Substitution Ciphers ............................................................................................................... 79
Usage.................................................................................................................................................... 79
Description............................................................................................................................................ 79
Homophonic Substitution Cipher:.................................................................................................... 79
Book Cipher .............................................................................................................................................. 79
Usage.................................................................................................................................................... 79
Algorithm.............................................................................................................................................. 79
Security of the book cipher ...................................................................................................................... 80
Polygraphic Substitution Ciphers ................................................................................................................. 80
Usage.................................................................................................................................................... 80
Description............................................................................................................................................ 80
Polygraphic Substitution Ciphers: ................................................................................................... 81
Playfair Cipher ........................................................................................................................................... 81
Usage.................................................................................................................................................... 81
Algorithm.............................................................................................................................................. 81
Security of the Playfair cipher................................................................................................................... 82
Implementation..................................................................................................................................... 83
Two-Square Cipher (Double Playfair)........................................................................................................... 84
Usage.................................................................................................................................................... 84
Algorithm.............................................................................................................................................. 84
Security of the two-square cipher............................................................................................................. 86
Implementation..................................................................................................................................... 86
Four-Square Cipher.................................................................................................................................... 87
Usage.................................................................................................................................................... 87
Algorithm.............................................................................................................................................. 87
Security of the four-square cipher ............................................................................................................ 89
Implementation..................................................................................................................................... 90
Hill Cipher ................................................................................................................................................. 90
Usage.................................................................................................................................................... 91
Algorithm.............................................................................................................................................. 91
Security of the Hill cipher......................................................................................................................... 94
Number of possible keys ......................................................................................................................... 94
Polyalphabetic Substitution Ciphers ............................................................................................................. 95
Usage.................................................................................................................................................... 95
Description............................................................................................................................................ 95
Security of polyalphabetic substitution ciphers .......................................................................................... 95
Polyalphabetic Substitution Ciphers:............................................................................................... 95
Trithemius Cipher...................................................................................................................................... 96
Usage.................................................................................................................................................... 96
Algorithm.............................................................................................................................................. 96
Security of the Trithemius cipher ............................................................................................................. 97
Vigenère Cipher ......................................................................................................................................... 97
Usage.................................................................................................................................................... 97
Algorithm.............................................................................................................................................. 97
Security of the Vigenère cipher................................................................................................................. 98
Beaufort Cipher ......................................................................................................................................... 99
Usage.................................................................................................................................................... 99
Algorithm............................................................................................................................................ 100
Security of the Beaufort Cipher .............................................................................................................. 100
Implementation................................................................................................................................... 101
Running Key Cipher ................................................................................................................................. 101
Usage.................................................................................................................................................. 101
Algorithm............................................................................................................................................ 101
Security of the running key cipher .......................................................................................................... 102
Autokey Cipher........................................................................................................................................ 103
Usage.................................................................................................................................................. 103
Algorithm............................................................................................................................................ 103
Security of the autokey cipher................................................................................................................ 104
Nihilist Cipher.......................................................................................................................................... 105
Usage.................................................................................................................................................. 105
Algorithm............................................................................................................................................ 105
Security of the Nihilist cipher ................................................................................................................. 106
VIC Cipher ............................................................................................................................................... 107
Usage.................................................................................................................................................. 107
Algorithm............................................................................................................................................ 107
Security of VIC...................................................................................................................................... 109
Transposition Ciphers .............................................................................................................................. 109
Usage.................................................................................................................................................. 109
Description.......................................................................................................................................... 109
Transposition Ciphers:.................................................................................................................... 109
Rail Fence Cipher ..................................................................................................................................... 110
Usage.................................................................................................................................................. 110
Algorithm............................................................................................................................................ 110
Security of the Rail Fence Cipher ............................................................................................................ 111
Implementation................................................................................................................................... 111
Route Cipher ........................................................................................................................................... 112
Usage.................................................................................................................................................. 112
Algorithm............................................................................................................................................ 112
Security of the Route Cipher .................................................................................................................. 113
Implementation................................................................................................................................... 113
Columnar Transposition........................................................................................................................... 114
Usage.................................................................................................................................................. 114
Algorithm............................................................................................................................................ 114
Security of the Columnar Transposition.................................................................................................. 115
Implementation................................................................................................................................... 116
Double Columnar Transposition................................................................................................................ 118
Usage.................................................................................................................................................. 118
Algorithm............................................................................................................................................ 118
Security of the Double Columnar Transposition....................................................................................... 119
Myszkowski Transposition ....................................................................................................................... 119
Usage.................................................................................................................................................. 120
Algorithm............................................................................................................................................ 120
Security of the Myszkowski Transposition .............................................................................................. 121
Cryptographic Rotor Machines .................................................................................................................. 122
Usage.................................................................................................................................................. 122
Description.......................................................................................................................................... 122
Cryptographic Rotor Machines: ..................................................................................................... 123
Hebern Cryptographic Rotor Machine........................................................................................................ 123
Usage.................................................................................................................................................. 123
Algorithm............................................................................................................................................ 123
Security of the Hebern rotor machine ..................................................................................................... 124
Images: ............................................................................................................................................... 124
Lorenz Cryptographic Rotor Machine......................................................................................................... 126
Usage.................................................................................................................................................. 126
Algorithm............................................................................................................................................ 126
Security of the Lorenz rotor machine...................................................................................................... 128
Image:................................................................................................................................................. 128
Enigma Cryptographic Rotor Machine........................................................................................................ 129
Usage.................................................................................................................................................. 129
Algorithm............................................................................................................................................ 130
Security of Enigma ............................................................................................................................... 131
Images: ............................................................................................................................................... 132
Simple XOR Cipher ................................................................................................................................... 134
Usage.................................................................................................................................................. 134
Algorithm............................................................................................................................................ 134
Security of the simple XOR cipher........................................................................................................... 135
Implementation................................................................................................................................... 135
Symmetric Ciphers................................................................................................................................... 136
Stream Symmetric Ciphers........................................................................................................................ 136
All Stream Ciphers: .......................................................................................................................... 136
One-Time Pad (OTP) ................................................................................................................................ 136
Usage.................................................................................................................................................. 136
Algorithm............................................................................................................................................ 137
Block Diagram of OTP Algorithm ........................................................................................................... 137
Maths:................................................................................................................................................. 138
Implementation................................................................................................................................... 138
RC4 ........................................................................................................................................................ 139
Usage.................................................................................................................................................. 139
Algorithm............................................................................................................................................ 139
Creating the Table ................................................................................................................................ 139
Encryption and Decryption ................................................................................................................... 140
Speed of RC4 ....................................................................................................................................... 140
Security of RC4..................................................................................................................................... 140
Block Diagram of RC4 ........................................................................................................................... 140
Maths:................................................................................................................................................. 141
Implementation: .................................................................................................................................. 141
Keystream Initialisation ........................................................................................................................ 142
Keystream Generation.......................................................................................................................... 142
Salsa20 ................................................................................................................................................... 142
Usage.................................................................................................................................................. 142
Algorithm............................................................................................................................................ 142
Block Diagram of Salsa20 Algorithm ...................................................................................................... 143
Maths:................................................................................................................................................. 144
CSS (Content Scramble System)................................................................................................................. 149
Usage.................................................................................................................................................. 149
Algorithm............................................................................................................................................ 149
CSS Modes........................................................................................................................................... 149
CSS Keys ............................................................................................................................................. 150
CSS System.......................................................................................................................................... 150
CSS Protocol ........................................................................................................................................ 151
Block Diagram of CSS Algorithm for Audiovisual Data .............................................................................. 152
Block Diagram of CSS Algorithm for Key Bytes......................................................................................... 152
Block Diagram of CSS Additional Encryption of Keys ................................................................................ 153
Block Diagram of CSS LFSR Registers ..................................................................................................... 153
Maths:................................................................................................................................................. 154
Block Symmetric Ciphers .......................................................................................................................... 157
DES (Data Encryption Standard)................................................................................................................ 157
Usage.................................................................................................................................................. 157
Algorithm............................................................................................................................................ 157
Security of DES..................................................................................................................................... 160
Block Diagram of DES Algorithm............................................................................................................ 160
Block Diagram of DES Feistel Function ................................................................................................... 161
Block Diagram of DES Key Schedule ....................................................................................................... 162
Maths:................................................................................................................................................. 163
RC2 ........................................................................................................................................................ 173
Usage.................................................................................................................................................. 174
Algorithm............................................................................................................................................ 174
Encryption .......................................................................................................................................... 175
Decryption .......................................................................................................................................... 176
Block Diagram of RC2 Encryption .......................................................................................................... 176
Block Diagram of RC2 Decryption .......................................................................................................... 178
Maths:................................................................................................................................................. 180
Triple DES (3DES).................................................................................................................................... 182
Usage.................................................................................................................................................. 182
Algorithm............................................................................................................................................ 182
Block Diagram of 3DES Encryption ........................................................................................................ 183
Block Diagram of 3DES Decryption ........................................................................................................ 183
Maths:................................................................................................................................................. 184
AES (Advanced Encryption Standard) ........................................................................................................ 185
Usage.................................................................................................................................................. 185
Algorithm............................................................................................................................................ 185
Block Diagram of AES Encryption:.......................................................................................................... 187
Block Diagram of AES Key Expansion: .................................................................................................... 188
Maths:................................................................................................................................................. 190
Blowfish.................................................................................................................................................. 193
Usage.................................................................................................................................................. 193
Camellia .................................................................................................................................................. 193
Usage.................................................................................................................................................. 194
Algorithm............................................................................................................................................ 194
Block Diagram of Camellia Encryption for 128-bit Key ............................................................................. 195
Block Diagram of Camellia Encryption for 192 or 256-bit Key ................................................................... 197
Block Diagram of Camellia Decryption for 128-bit Key ............................................................................. 199
Block Diagram of Camellia Decryption for 192 or 256-bit Key ................................................................... 201
Block Diagram of Camellia 6-round Block ............................................................................................... 203
Block Diagram of Camellia - Creating Helper Variables of Key.................................................................... 205
Maths:................................................................................................................................................. 206
Implementation................................................................................................................................... 214
Serpent ................................................................................................................................................... 214
Usage.................................................................................................................................................. 214
Twofish................................................................................................................................................... 215
Asymmetric Ciphers................................................................................................................................. 215
Asymmetric Ciphers: ....................................................................................................................... 215
Merkle's Puzzles ...................................................................................................................................... 215
Usage.................................................................................................................................................. 216
Algorithm............................................................................................................................................ 216
Security of Merkle's Puzzles................................................................................................................... 216
Block Diagram of Merkle's Puzzles Protocol ............................................................................................ 217
Maths:................................................................................................................................................. 217
Diffie–Hellman Protocol............................................................................................................................ 217
Usage.................................................................................................................................................. 217
Algorithm............................................................................................................................................ 218
Public key encryption ........................................................................................................................... 218
Security of the Diffie-Hellman protocol.................................................................................................... 218
Block Diagram of Diffie-Hellman protocol ............................................................................................... 219
Maths:................................................................................................................................................. 219
RSA ........................................................................................................................................................ 219
Usage.................................................................................................................................................. 220
Algorithm............................................................................................................................................ 220
Key Generation .................................................................................................................................... 220
Encryption .......................................................................................................................................... 220
Decryption .......................................................................................................................................... 221
Message Authentication........................................................................................................................ 221
Security of RSA..................................................................................................................................... 221
Block Diagram of RSA encryption and decryption.................................................................................... 221
Maths:................................................................................................................................................. 222
Attack Models for Cryptanalysis................................................................................................................. 222
Theoretical Attack Models: ............................................................................................................. 223
Known-Plaintext Attack............................................................................................................................ 223
Known-Plaintext Attack Efficiency ......................................................................................................... 223
Chosen-Plaintext Attack............................................................................................................................ 224
Adaptive-Chosen-Plaintext Attack.......................................................................................................... 224
Ciphertext-Only (Known Ciphertext) Attack................................................................................................ 224
Chosen-Ciphertext Attack ......................................................................................................................... 225
Adaptive-Chosen-Ciphertext Attack ....................................................................................................... 225
Chosen-Key Attack ................................................................................................................................... 225
Cryptographic Attacks: .................................................................................................................. 225
Brute-Force Attack ................................................................................................................................... 226
Dictionary Attack ................................................................................................................................. 226
Reverse Brute-Force Attack................................................................................................................... 226
Denial-of-Service Attack............................................................................................................................ 226
DoS Techniques ................................................................................................................................... 227
Targeting Layers .................................................................................................................................. 227
Attacker's Goal..................................................................................................................................... 228
DDoS (Distributed Denial-of-Service) Attack ........................................................................................... 228
Degradation-of-Service ......................................................................................................................... 228
Reflected (Spoofed) Attack .................................................................................................................... 228
Slowloris Attacks.................................................................................................................................. 229
Zombie Computers .............................................................................................................................. 229
Tools .................................................................................................................................................. 230
Man-in-the-Middle Attack ......................................................................................................................... 230
Attack on Two-Time Pad .......................................................................................................................... 230
Venona Project .................................................................................................................................... 231
MS-PPTP............................................................................................................................................. 231
802.11 WEP ........................................................................................................................................ 232
Key Reinstallation Attack ...................................................................................................................... 233
KRACK.................................................................................................................................................... 233
WPA2 Secret Key ................................................................................................................................. 233
Performing the Attack........................................................................................................................... 234
Protection against KRACK..................................................................................................................... 234
Conclusion .......................................................................................................................................... 234
Frequency Analysis .................................................................................................................................. 235
Frequency Analysis of Substitution Ciphers............................................................................................. 235
Meet-in-the-middle Attack ........................................................................................................................ 235
Meet-in-the-middle Complexity ............................................................................................................. 236
Meet-in-the-middle 2D ......................................................................................................................... 237
Meet-in-the-middle nD ......................................................................................................................... 238
Replay Attack .......................................................................................................................................... 239
Cut-and-Paste Attack ............................................................................................................................ 240
Homograph Attack................................................................................................................................... 240
Simple homograph attacks.................................................................................................................... 240
Non-ASCII URLs................................................................................................................................... 240
Homograph attacks using non-ASCII characters ...................................................................................... 241
Security............................................................................................................................................... 241
Cryptographic Tools................................................................................................................................ 242
Cryptography in Java ................................................................................................................................ 242
Providers ............................................................................................................................................ 242
Policy Files........................................................................................................................................... 243
Security Tokens ....................................................................................................................................... 243
Static password tokens ......................................................................................................................... 244
Time-synchronized tokens.................................................................................................................... 244
Asynchronous tokens........................................................................................................................... 244
Tokens with public and private keys ...................................................................................................... 244
Key-Based Authentication (Public Key Authentication) ................................................................................ 244
Public key authentication on Linux......................................................................................................... 245
Public key authentication on Windows................................................................................................... 246
Docker.................................................................................................................................................... 247
Sandboxes........................................................................................................................................... 248
Docker API .......................................................................................................................................... 248
Creating Docker images ........................................................................................................................ 248
Docker networks ................................................................................................................................. 249

Theory of Cryptography
A 10-paragraph Introduction to Ciphers
It is difficult to say with certainty, but it seems probable that soon after mastering the art of writing,
people started to feel the need to hide and mask what was written. That need probably grew stronger
over time, as written messages became more important. The first states were created, and more and
more important information, which had to remain undisclosed, had to be sent in writing over long
distances.

A Brief History of Cryptography


Initially, people tried to hide the very existence of messages, and not just their content.
This practice is called steganography. Over time, the first simple substitution ciphers were
invented. The idea was to replace letters with other letters, in a way known only to the
parties privy to the secret.
It should be noted here that the alphabets created in Europe (Latin, Greek) turned out to be a great
help. They contain relatively few letters, which are easy to manipulate. Applying substitution
ciphers to texts in Chinese, with its thousands of characters, is far less straightforward.

Once methods of breaking simple substitution ciphers appeared, the ordinary exchange
of letters was no longer strong enough. New ciphers were invented. They mixed letters more
thoroughly, obscuring messages and disrupting typical language characteristics (letter
frequencies, common pairs of characters). Besides substitution of letters, ciphers started to use
transposition of letters.
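
The weakness of simple substitution mentioned above can be illustrated with a short sketch. It assumes a 26-letter upper-case alphabet and uses a made-up substitution table (the names PLAIN, CIPHER, TABLE and frequency_profile are illustrative only). It shows that replacing letters one-for-one leaves the pattern of letter counts untouched, which is exactly what frequency-based attacks rely on.

```python
from collections import Counter

# Hypothetical substitution table, chosen only for illustration:
# every plaintext letter is always replaced by the same ciphertext letter.
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "QWERTYUIOPASDFGHJKLZXCVBNM"
TABLE = str.maketrans(PLAIN, CIPHER)


def frequency_profile(text):
    """Return the letter counts sorted from most to least common,
    ignoring which letter each count belongs to."""
    counts = Counter(ch for ch in text if ch.isalpha())
    return sorted(counts.values(), reverse=True)


message = "MEET ME AT THE SAME PLACE AS LAST TIME"
encrypted = message.translate(TABLE)

print(encrypted)
print(frequency_profile(message))
print(frequency_profile(encrypted))  # same list of counts as for the plaintext
```

The letters change, but the most common plaintext letter is still the most common ciphertext letter under a new name. Transposition and more elaborate ciphers aim to blur such regularities.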

Cipher Construction
At this stage a full model of building ciphers was developed. To encrypt a long message,
one needs a recipe (an algorithm): a list of steps that turn plaintext letters into
ciphertext characters. It is also necessary to choose a secret key,
which is used together with the selected algorithm. For example, the algorithm may be
the sentence "move each letter right" and the secret key may be the phrase "by three positions".
This distinction reduces the amount of information that has to be
exchanged in secret between the interested parties. The number of positions by which the letters
are moved is the most important information in this situation, while the fact that
the shift is performed to the right is not crucial for the message's security and can be
transmitted at the beginning of the message without encryption. Moreover, the distinction allows
the same algorithm to be used many times with different keys, for example when communicating with
different people.
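
The split between algorithm and key described above can be sketched in a few lines. This is a minimal illustration, assuming a 26-letter upper-case Latin alphabet; the function names shift_right and shift_left are chosen here and do not come from the text. The function body plays the role of the public algorithm ("move each letter right"), and the integer argument plays the role of the secret key ("by three positions").

```python
import string

# Assumption: only the 26 upper-case Latin letters are encrypted.
ALPHABET = string.ascii_uppercase


def shift_right(text, key):
    """Public algorithm: move each letter `key` positions to the right,
    wrapping around after Z. Other characters pass through unchanged."""
    result = []
    for ch in text:
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + key) % len(ALPHABET)])
        else:
            result.append(ch)
    return "".join(result)


def shift_left(text, key):
    """Decryption: undo the shift using the same secret key."""
    return shift_right(text, -key)


ciphertext = shift_right("ATTACK AT DAWN", 3)
print(ciphertext)                 # DWWDFN DW GDZQ
print(shift_left(ciphertext, 3))  # ATTACK AT DAWN
```

The same two functions can be reused with a different key for every correspondent. Only the number has to be agreed in secret.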

Current encryption algorithms operate on computers or electronic devices. Secret keys consist of
tens of characters and during encryption and decryption millions of operations are performed.
Encryption algorithms are part of larger algorithms, communication protocols and various
standards. People have to deal with them quite often in many areas of their lives.

Finally, let's notice two things. First, the main benefit of using ciphers is that secret keys
are much shorter than the transmitted information. This makes it possible to replace
a difficult problem (secretly delivering long messages) with an easier one (secretly delivering
a shorter key, which can then be used many times).

Second, any algorithm may be publicly known or it may be secret. In theory, the latter option
provides additional security. The general rule, however, says that every cipher should remain
reliable and secure even after the full publication of its algorithm. Thus, one should always assume
that an intruder knows everything about the attacked system except its secret keys.
Most modern ciphers are based on publicly known algorithms. In practice, there are two arguments for
this approach. First, after some time every algorithm usually becomes known, whether through
coincidence, bribery, betrayal, or analysis of the equipment or software that uses the cipher.
Second, a publicly known algorithm can be tested and analysed by thousands of honest and
knowledgeable people. If they find errors, they can publicly disclose them, which makes it possible
to improve the cipher and correct the algorithm.
Kerckhoffs's principle

Auguste Kerckhoffs

Kerckhoffs's principle is one of the basic principles of modern cryptography. It was formulated at
the end of the nineteenth century by Dutch cryptographer Auguste Kerckhoffs. The principle goes
as follows: A cryptographic system should be secure even if everything about the system,
except the key, is public knowledge.

Kerckhoffs's research publications


Kerckhoffs’s best known publications are two journal articles published in 1883 in the
French "Le Journal des Sciences Militaires" under the common title "La Cryptographie
Militaire" (Military cryptography). The articles covered the solutions of military cryptography that
were most up-to-date at that time. They gave a practical, experience-based approach, including
six design principles for military ciphers:
1. The system must be practically, if not mathematically, indecipherable.
2. It must not be required to be secret, and it must be able to fall into the hands of the
enemy without inconvenience.
3. Its key must be communicable and retainable without the help of written notes, and
changeable or modifiable at the will of the correspondents.
4. It must be applicable to telegraphic correspondence.
5. Apparatus and documents must be portable, and its usage and function must not require
the concourse of several people.
6. Finally, it is necessary, given the circumstances that command its application, that
the system be easy to use, requiring neither mental strain nor the knowledge of a long
series of rules to observe.
The second of these principles is currently known as Kerckhoffs's principle.

Kerckhoffs's principle today


Kerckhoffs's principle is applied in virtually all contemporary encryption algorithms (DES, AES,
etc.). These algorithms have been published and thoroughly investigated, so the security of
the encrypted message depends solely on the secret encryption key (its secrecy and quality).

Keeping algorithms secret may act as a significant barrier to cryptanalysis, but only if such
algorithms are used in a strictly limited circle, which protects the algorithm from being revealed.
Most government ciphers are kept secret. Secret commercial encryption algorithms, once released to
the market, have mostly been broken quite swiftly.

Kerckhoffs's principle was reformulated (perhaps independently) by Claude Shannon
as "The enemy knows the system". In that form it is called Shannon's maxim.

Steganography
Steganography is a way of sending hidden data so that nobody (apart from the sender and
the intended recipients) knows that a secret message was sent at all. Unlike cryptography, it
does not rely on ciphers or other kinds of encryption.

A Brief History of Steganography


The earliest known accounts of steganography come from the 5th century BC. Herodotus described
how, to send secret data, a message was not written in the wax covering a wooden board (which was
a common way to store information at that time); instead, the letters were written directly on
the board, which was then covered with wax.

During World War II, many agents used so-called microdots. An entire A4 document was
reduced to the size of a dot and placed within some other ordinary text.

Steganography also includes all kinds of invisible inks. They have been used since ancient times
around the world. Over time new technologies have been invented and better recipes for invisible
inks have been developed. Mixtures have become more difficult to detect: odorless, invisible
under ultraviolet light, easily soluble in water, etc. In 1999 the CIA refused to disclose
the recipes of invisible inks that had been used during World War I, arguing that they were still
important for national security.

Modern steganography
With the development of technology, the possibilities for hiding data have increased. For example,
a variant of the microdot technique is used in many modern colour laser printers. It marks every
printout in a way that is invisible to users.

Currently, a very popular kind of steganography is hiding information in digital pictures. There is
some redundancy in the way images are stored. Every pixel in a digital picture is coded using
a fixed number of bits, and a change in the least significant bits is usually impossible to notice.
The least significant bits can therefore be used for storing secret information. A similar
situation occurs when storing digital sound.
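
A minimal sketch of this least-significant-bit technique is shown below. It assumes the cover
data is a flat sequence of 8-bit samples (for example greyscale pixel values); the function names
embed_bits and extract_bits and the sample data are hypothetical.

```python
def embed_bits(pixels: bytes, message: bytes) -> bytes:
    """Store the message bits (most significant bit first) in the lowest bit of successive samples."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover data too small for the message")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the least significant bit
    return bytes(stego)

def extract_bits(pixels: bytes, length: int) -> bytes:
    """Recover `length` message bytes from the lowest bits of the samples."""
    out = bytearray()
    for i in range(length):
        value = 0
        for sample in pixels[i * 8:(i + 1) * 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)

cover = bytes(range(100, 164))       # 64 made-up 8-bit samples
stego = embed_bits(cover, b"hi")
print(extract_bits(stego, 2))        # b'hi'
```

Each sample changes by at most one brightness level, which is why the modification is hard to see.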

Other clever steganography methods may include:

o delays between network packets,
o modifying printed letters - their size, spacing, position,
o using invisible and zero-width Unicode characters (for instance Zero-Width
Joiner and Zero-Width Non-Joiner, which affect rendering only in certain scripts, such as Arabic),
o changing the order of elements in sets,
o adding data in ignored sections of files, for example after the end of file
character (EOF).
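
As an illustration of the zero-width character method from the list above, the sketch below
appends a hidden bit string to a visible text. The mapping (Zero-Width Non-Joiner for 0,
Zero-Width Joiner for 1) and the function names hide and reveal are assumptions made for
this example only.

```python
ZWNJ, ZWJ = "\u200c", "\u200d"  # zero-width non-joiner = bit 0, zero-width joiner = bit 1

def hide(cover: str, secret: bytes) -> str:
    """Append the secret, bit by bit, as invisible characters after the cover text."""
    bits = "".join(f"{byte:08b}" for byte in secret)
    return cover + "".join(ZWJ if b == "1" else ZWNJ for b in bits)

def reveal(text: str) -> bytes:
    """Collect the zero-width characters and turn them back into bytes."""
    bits = "".join("1" if ch == ZWJ else "0" for ch in text if ch in (ZWNJ, ZWJ))
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

marked = hide("Hello", b"key")
print(reveal(marked))   # b'key'
```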

Protocols
A protocol is a set of actions that two or more entities need to perform in order to accomplish
a task. All users take the actions step by step and successfully carry out the agreed procedure to
the end.

Computers and other electronic devices use communications protocols to establish a connection
and exchange data. Nowadays there are many protocols and communications standards which
are recognized globally. Thanks to that, various different devices located in different places in the
world may communicate with each other quite easily.

Cryptographic protocols are protocols that use cryptography. They have to guarantee that no
entity can gain more knowledge or more privileges than the protocol's design allows.
Cryptographic protocols include various types of encryption, message
authentication or key agreement algorithms.

Arbitration Protocols
An additional entity, apart from the communicating sides, takes part in arbitration protocols.
The new entity is called an arbiter, and by definition, the arbiter is impartial, not interested in
the communication and trusted by all the other sides. The arbiter acts like a bank officer
mediating in financial services.

Arbitration protocols simplify a lot of tasks which are performed by computers. The arbiter makes
it easier to resolve disputes and exchange secret data safely. On the other hand, using arbitration
protocols may sometimes be inconvenient:

o There is a need to find an arbiter, which may be located far from the other
sides, and which would be trusted by all the other entities.
o The servers serving as arbiters must be financed and maintained.
o An arbiter is an obvious bottleneck of the transaction. A damaged, attacked or
faulty arbiter is a serious problem for the communicating parties.

Most modern systems for transferring money, like credit cards and PayPal, require trusted
intermediaries, like banks and credit card companies, to facilitate the transfer.

Dispute Protocols
A dispute protocol is a kind of arbitration protocol, in which the arbiter is involved only when it is
really required. If there are not any problems, then the communicating parties perform the whole
task and exchange information without the participation of the arbiter. On the other hand, if a
problem occurs - an error, an unexpected circumstance or fraud - an arbiter is called for help.
The arbiter has information and power to fix the situation.

Dispute protocols are cheaper and easier to use than arbitration protocols. Usually the mere
existence of the arbiter prevents fraud. Because the arbiter does not have to be involved in
most communications, the major disadvantage of arbitration protocols is overcome.

Self-Enforcing Protocols
In self-enforcing protocols the whole communication doesn't require trusted third parties.
The algorithms are designed in a way that assures that any fraud attempt made by one side is
immediately visible to the others, who are able to prevent it without suffering any losses.

Undoubtedly, self-enforcing protocols have the largest number of advantages and they eliminate
the need to involve additional entities. Unfortunately, not all operations can be carried out
using protocols of this kind.

Attacking Protocols
In general, there are two types of attacks on protocols: active and passive.

o Passive attacks: the intruder may eavesdrop on the communication but is not
able to interfere with the exchange of messages.
o Active attacks: the attacker tries to change the protocol - by sending new
messages, modifying or removing the existing ones, or even altering the whole
communication channel.

The only goal of a passive attack is to overhear the communication. On the other hand,
the goals of an active attack may vary, and the effects are usually much more dangerous for
the victims. In the most complex active attacks many intruders take part, attacking various points
of the targeted system.

TCP/IP Protocols
TCP/IP (Transmission Control Protocol/Internet Protocol) is a set of protocols that are used for
data transmission over computer networks. The TCP/IP model covers the main functions
of the theoretical OSI model. The image below presents the corresponding layers of both TCP/IP
and OSI models.

Every message sent by an application has to pass through all the TCP/IP layers, from
the application layer to the lowest network interface layer. Then, it is transmitted over
the network to another computer. Finally, it moves all the way up to the application layer and
then to the target application.

While data is passed down from the application to the network, each layer adds its own header
to every message. Each header is then handled by a corresponding layer on the receiving
computer (where, as we said earlier, messages are passed from the network up to the application
layer and beyond). Both the content and the size of each header depend on the protocol that has
been used in the layer.
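
The idea of adding headers on the way down and handling them on the way up can be sketched
as follows. The textual headers such as [transport] are invented placeholders; real headers
are binary structures defined by each protocol.

```python
LAYERS = ["transport", "internet", "network-interface"]  # layers below the application

def send(payload: bytes) -> bytes:
    """Pass data down the stack: each layer prepends its own (made-up) header."""
    for layer in LAYERS:
        payload = f"[{layer}]".encode() + payload
    return payload

def receive(frame: bytes) -> bytes:
    """Pass data up the stack: each layer strips the header added by its peer."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame

frame = send(b"GET / HTTP/1.1")
print(frame)            # b'[network-interface][internet][transport]GET / HTTP/1.1'
print(receive(frame))   # b'GET / HTTP/1.1'
```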

Sending a message in TCP/IP


Receiving a message in TCP/IP
Application Layer
The first of the four TCP/IP layers mediates between computer programs and the lower
layer protocols, thus allowing applications to use networks. The programs can use one of many
application layer protocols to request different kinds of actions.

There are a lot of application layer protocols that use TCP/IP data transmission. Some of the
popular protocols are:

o HTTP, HTTPS - for web browsing,
o FTP, TFTP, NFS - for file transfer,
o SMTP - for sending email messages,
o POP3 - for receiving email messages,
o IMAP - for managing email messages on the server,
o Telnet, rlogin - for accessing remote computers,
o SNMP - for network management,
o DNS - for finding IP addresses assigned to Web addresses,
o IRC - for online chats.

Application layer messages vary depending on the protocol that has been used. Each protocol
requires different input data and produces different queries that are to be sent to the transport
layer. Irrespective of what was produced by the application layer, the transport layer treats every
received message as data and doesn't care about its content.
Internet sockets
Internet sockets are structures that are used for communication between the application and
transport layers. Every process or application trying to connect to the network has to associate
its input and output channels by defining the corresponding internet socket objects.

An internet socket contains an IP address, a port number and a transport layer protocol name.
A unique combination of those three values determines the process that should deal with
the message.

The port number can be assigned automatically by the operating system, manually by the user or
is set as a default for some popular applications. The port number is a 16-bit integer (0 - 65535).

Some popular application layer protocols use predefined, well-known port numbers by default.
For example, HTTP uses port 80, HTTPS uses port 443, SMTP port 25, Telnet port 23, and FTP
uses two ports: 20 for data transmission and 21 for transmission control. The list of such default
port numbers is managed by the Internet Assigned Numbers Authority (IANA).

The process of associating an application with a socket is called binding. After successful binding
the application doesn't need to care about network management because all further operations
are handled by protocols of the lower layers of TCP/IP.

In some operating systems special privileges are required for applications to bind to port
numbers less than 1024. Therefore, a lot of processes prefer using higher port numbers allocated
for short term use. Such ports are called ephemeral ports.
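
Binding can be tried out with Python's standard socket module. In this sketch the loopback
address 127.0.0.1 is chosen for illustration, and port 0 is used to ask the operating system to
pick a free port on the application's behalf.

```python
import socket

# A socket is described by an address, a port number and a transport protocol
# (SOCK_STREAM selects TCP). Port 0 means "let the operating system choose".
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.bind(("127.0.0.1", 0))
    address, port = sock.getsockname()

print(address, port)   # the port number differs from run to run
```
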
A user can specify a port number in a URL. For instance the following URL forces the browser to
try to reach the website using port 8080, instead of default HTTP port 80:

http://www.example.com:8080/path
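
How a client might determine the port from such a URL can be sketched with the standard
urllib.parse module. The helper name effective_port and the small table of default ports are
assumptions made for this example.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}  # well-known defaults mentioned in the text

def effective_port(url: str) -> int:
    """Return the explicit port from the URL, or the default port for its scheme."""
    parts = urlsplit(url)
    return parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]

print(effective_port("http://www.example.com:8080/path"))   # 8080
print(effective_port("http://www.example.com/path"))        # 80
```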

Transport Layer
The transport layer receives messages from the application layer. It divides them into smaller
packets, adds a header to each, and sends them down to the internet layer. The header contains
several pieces of control information, most importantly the source and target port numbers.

Port numbers are used by the transport layer while handling incoming packets from the internet
layer (that is, while receiving data). Thanks to the port number, it is possible to determine
which application layer protocol should receive the incoming message. For example, a packet with
the target port number equal to 25 will be delivered to the protocol connected to this port,
usually SMTP. In this case, SMTP will provide the data to the email application that requested it.

TCP
The most common protocol used in the transport layer is TCP (Transmission Control Protocol).
This is a connection-oriented protocol. TCP offers reliable, peer-acknowledged, ordered,
session-based connectivity between two hosts.

All the features mentioned above are provided by the TCP layer itself. This means that it may
operate on top of other, unreliable protocols in the lower layers without affecting
the communication from the application layer's perspective.

TCP Reliability
When sending data, TCP makes sure that the data has been delivered to the recipient. The receiver
checks whether the received packet arrived intact (by verifying the checksum of the
data) and, if so, confirms it by sending an acknowledgement to the sender. If
the sender doesn't receive the acknowledgement for a message within some time period, it will
resend the packet, assuming that it was lost.

After several unsuccessful attempts, TCP assumes that the receiver is unreachable and informs
the application layer that the transmission has failed.
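
The retransmission logic described above can be modelled very roughly as a retry loop. This is
not how a real TCP stack is implemented; the channel is a hypothetical function that returns
True when an acknowledgement arrives in time, and the limit of five attempts is arbitrary.

```python
def send_reliably(packet: bytes, channel, max_attempts: int = 5) -> int:
    """Resend the packet until it is acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if channel(packet):  # True models an acknowledgement arriving in time
            return attempt
    # After several unsuccessful attempts the failure is reported to the caller.
    raise ConnectionError("receiver unreachable")

def lossy_channel(drop_first: int):
    """Hypothetical channel that loses the first `drop_first` transmissions."""
    state = {"sent": 0}
    def deliver(packet: bytes) -> bool:
        state["sent"] += 1
        return state["sent"] > drop_first
    return deliver

print(send_reliably(b"data", lossy_channel(2)))   # 3
```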

TCP Ordering
The TCP header contains a field with the sequence number. The sequence number identifies
the position of the packet's first data byte in the stream, so it grows by the number of bytes
sent. When receiving data, TCP rearranges incoming packets and puts them in the right order.
Thanks to that, the application layer doesn't need to care about the ordering of network packets.
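
Reordering by sequence number can be sketched as below. The example assumes, as in real TCP,
that sequence numbers count bytes of the stream; the function name reassemble and the sample
segments are made up.

```python
def reassemble(segments):
    """Rebuild the byte stream from (sequence_number, data) pairs received in any order."""
    stream = bytearray()
    expected = min(seq for seq, _ in segments)
    for seq, data in sorted(segments):
        if seq != expected:
            raise ValueError(f"gap or overlap at sequence number {seq}")
        stream += data
        expected = seq + len(data)  # the next segment starts right after this one
    return bytes(stream)

received = [(1006, b"world"), (1000, b"hello "), (1011, b"!")]
print(reassemble(received))   # b'hello world!'
```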

TCP Header
The TCP header consists of 20 or more bytes. The size depends on whether
the optional options field is used. The maximum size of the options field is 40 bytes, thus the
maximum size of the whole header is 60 bytes.
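
The fixed 20-byte part of the header can be unpacked with Python's struct module. The sketch
below reads only a few fields; the dictionary keys and the sample values are chosen for
illustration.

```python
import struct

# Ports, sequence and acknowledgement numbers, data offset, flags, window,
# checksum, urgent pointer: 20 bytes in network byte order.
TCP_FIXED_HEADER = struct.Struct("!HHIIBBHHH")

def parse_tcp_header(segment: bytes) -> dict:
    (src_port, dst_port, seq, ack, offset_byte, flags,
     window, checksum, urgent) = TCP_FIXED_HEADER.unpack_from(segment)
    header_length = (offset_byte >> 4) * 4  # data offset is given in 32-bit words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_length": header_length,
        "options_length": header_length - TCP_FIXED_HEADER.size,
        "syn": bool(flags & 0x02),
        "ack_flag": bool(flags & 0x10),
    }

# A made-up SYN segment without options (data offset = 5 words = 20 bytes).
sample = TCP_FIXED_HEADER.pack(49152, 80, 1000, 0, 5 << 4, 0x02, 65535, 0, 0)
print(parse_tcp_header(sample)["header_length"])   # 20
```

The data offset field is 4 bits wide, so the largest header it can describe is 15 * 4 = 60 bytes.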

The TCP header structure

TCP Session
Two applications need to establish a session to exchange data. TCP requires three messages to
create the session:
1. SYN - the first application (the client) sends a synchronize packet to the host.
The message contains a random sequence number, which has been set by the client.
2. SYN-ACK - the host responds to the client. It increases the sequence number from
the client by one and sends it back in the message as an acknowledgement number.
Also, the response message contains another sequence number chosen randomly by
the host.
3. ACK - the client sends an acknowledgement message to the host. As its acknowledgement
number the message contains the host's sequence number increased by one.

When the transmission is completed, the session should be terminated. Either side can terminate
the session, and the other side is expected to acknowledge that.
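
The numbers exchanged in the three messages can be traced in a small simulation. No packets are
sent here; the dictionaries only stand in for the relevant header fields, and their keys are
chosen for this example.

```python
import secrets

MOD = 2 ** 32  # TCP sequence numbers are 32-bit values and wrap around

# 1. SYN: the client picks a random initial sequence number.
client_isn = secrets.randbelow(MOD)
syn = {"flags": "SYN", "seq": client_isn}

# 2. SYN-ACK: the host acknowledges client_isn + 1 and picks its own number.
server_isn = secrets.randbelow(MOD)
syn_ack = {"flags": "SYN-ACK", "seq": server_isn, "ack": (syn["seq"] + 1) % MOD}

# 3. ACK: the client acknowledges server_isn + 1.
ack = {"flags": "ACK", "seq": syn_ack["ack"], "ack": (syn_ack["seq"] + 1) % MOD}

print(syn, syn_ack, ack, sep="\n")
```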

TCP Usage
TCP is widely used by protocols and applications that require high reliability. It is not as fast
as UDP but, if configured properly, it still provides quite good speed together with high quality of
transmitted data.

There are a lot of application layer protocols that are mostly used together with TCP.
Some of the most popular ones are:

o HTTP, HTTPS
o FTP
o SMTP
o Telnet
UDP
The second popular protocol used in the transport layer is UDP (User Datagram Protocol),
a simpler, connectionless protocol. One program just sends
some packets to another, without creating any kind of relation between them.

Due to its simplicity UDP is faster than TCP. On the other hand, it doesn't provide the reliability
of TCP. There is no guarantee that the messages will reach the receiver, and UDP doesn't guarantee
that packets are delivered in the order in which they were sent. It is up to the application to
check that the received messages are intact and to process the data in the correct order.

UDP Header
The UDP header is 8 bytes long. It is much shorter and simpler than the corresponding TCP
header.
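
The four 16-bit fields of the UDP header (source port, destination port, length and checksum)
can be packed with the struct module. The function name build_udp_datagram and the sample
values are assumptions; the checksum is left at zero instead of being computed.

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")  # source port, destination port, length, checksum

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    length = UDP_HEADER.size + len(payload)  # the length field covers header and data
    checksum = 0  # 0 means "no checksum" over IPv4; a real stack would compute it
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

datagram = build_udp_datagram(5353, 53, b"query")
print(UDP_HEADER.size, len(datagram))   # 8 13
```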

The UDP header structure


UDP Usage
UDP is preferred if the transmitted data can tolerate some loss or the communication has to be
really fast. For example, UDP is used for DNS requests (because of a huge number of clients
sending many short messages to relatively few DNS servers). Similarly, during audio and video
transmission the loss of some packets is not so damaging to the receiver.

There are a lot of application layer protocols that use UDP, for example:

o DNS
o DHCP
o TFTP
o SNMP
o RIP
o VoIP
DCCP
Datagram Congestion Control Protocol is a protocol that allows applications to use congestion
control mechanisms and provides reliable connection setup and teardown. It doesn't provide
reliable in-order delivery of data.

DCCP is used by applications which work with quickly changing data (streaming media, online
games, VoIP). In such situations it is often better to use a new piece of available data than to
ask for retransmission of the old damaged packet.

RSVP
Resource Reservation Protocol allows for reservation of resources across a network. It is mainly
used by routers and hosts to assure delivering specific levels of quality of service (QoS) for clients.

RSVP can reserve bandwidth for one-to-one and one-to-many transmissions. The protocol is
initiated by the client (receiver), which asks the router to reserve some resources.

SCTP
Stream Control Transmission Protocol allows sending multiple streams of data through one
connection. It ensures reliable and in-order transmission with congestion control, similarly to
TCP, but allows sending related data streams together in the same messages.

In general, SCTP is quite a powerful protocol. However, due to poor support in routers and
operating systems, at present it is neither popular nor widely used.

Internet Layer
The internet layer adds another header to the messages received from the transport layer.
The most important fields in the new header are the IP addresses of both the source and target
machines. The IP address is a unique virtual number that makes it possible to find the device in
the network.

Each network device also has another special number assigned to it, called a MAC address. This
number is assigned by the manufacturer (traditionally stored in ROM) and is intended to identify
the device uniquely throughout the world. However, locating a device in a global network based on
its MAC address is practically impossible, because this number is strictly hardware related and
tells us nothing about the position of the device. IP addresses, on the other hand, make it
possible to find a computer with the help of DNS servers: every computer can query a DNS server
for a host name and obtain the IP address that locates the target device in the network.

In general, messages travel through several routers before reaching the target server (identified
by the target IP address). To see the route between the computer and the server, one could
use the Windows command:

tracert www.google.co.uk

There are a few protocols that work in the internet layer. The most important and most
popular of them is IP (Internet Protocol). Other internet layer protocols include:

o ARP (Address Resolution Protocol)
o RARP (Reverse Address Resolution Protocol)
o ICMP (Internet Control Message Protocol)

IP
IP is used for transmitting data packets over the network. At present two versions of this protocol
are in use: IPv4 (IP version 4) and IPv6 (IP version 6).

IP doesn't provide any acknowledgement mechanism, which means that it is unreliable. It is up to
TCP, operating in the transport layer, to make sure that all the requested data has been
delivered. Thanks to that, a TCP/IP connection as a whole is reliable.

IP Datagrams
The data packets are taken from the transport layer and divided into datagrams. Every datagram
consists of the IP header and the bytes received from the transport layer. The maximum size of
a datagram depends on the IP version: 2^16 - 1 bytes (65,535) for IPv4, and 2^32 - 1 bytes for
IPv6 (when jumbograms are used). If the transport layer packet is too large, it will be divided
into several smaller datagrams.

In practice the data is usually divided into even smaller datagrams because of the limited
capabilities of physical networks. For example, an Ethernet frame can carry at most 1,500 bytes of
payload, so datagrams created in the internet layer for Ethernet networks are at most 1,500 bytes
long (the Ethernet header and trailer are added on top of that by the lower layers). The maximum
datagram size for a network is called the MTU (Maximum Transmission Unit).

IP allows a datagram to be divided into smaller datagrams (fragments) if it has to pass through
a network with a smaller MTU. In IPv4 the fragments are re-assembled into the original datagram
by the destination host. A special field in the IP header, called Fragment Offset, supports
this operation.
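
How the Fragment Offset values come about can be illustrated with a small calculation. In IPv4
the offset is expressed in units of 8 bytes, so every fragment except the last carries a multiple
of 8 data bytes. The function name fragment, the 20-byte header and the sample sizes are
assumptions made for this sketch.

```python
def fragment(payload_length: int, mtu: int, header_length: int = 20):
    """Return (fragment_offset, data_length, more_fragments) for each fragment."""
    max_data = (mtu - header_length) // 8 * 8  # largest multiple of 8 that fits
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    position = 0
    while position < payload_length:
        size = min(max_data, payload_length - position)
        more = position + size < payload_length
        fragments.append((position // 8, size, more))  # offset in 8-byte units
        position += size
    return fragments

print(fragment(4000, 1500))
# [(0, 1480, True), (185, 1480, True), (370, 1040, False)]
```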

Network Interface Layer


The network interface layer allows datagrams from the internet layer to be sent via a physical
network to another computer, where they are passed up through the corresponding network interface
layer to the internet layer and beyond. At present, most computers are connected to Ethernet
networks, which may be either wired or wireless. Therefore, usually the TCP/IP protocols of
the upper layers are used together with the set of Ethernet protocols.

There are three Ethernet layers. The first two, Logical Link Control (LLC) and Media Access
Control (MAC), correspond to the data link layer of the OSI reference model. The lowest layer is
called the physical layer, as in the OSI reference model.

Logical Link Control Layer


The main functionality of the first Ethernet layer is to inform the target machine which internet
layer protocol ought to be used to properly deal with the incoming message.

The layer simply adds the information about the protocol used in the internet layer, and about
the protocol that is intended to receive the message. This allows the LLC layer on the target
computer to deliver datagrams correctly.

The layer is defined by IEEE 802.2 standard.


Media Access Control Layer
The media access control layer creates the final message (Ethernet frame) that will be sent over
the network.

The layer creates its own header, like the other layers. It contains the source MAC address
and the target MAC address, that is, the physical addresses of both machines which want to
exchange information. If the target machine is located beyond a router, in a different network,
the target MAC address will be the router's MAC address (and it will be changed to another one
when the router processes the message).

The media access control layer also adds 4 CRC bytes, which are used for detecting transmission
errors.
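
The role of the four CRC bytes can be demonstrated with the CRC-32 function from Python's zlib
module. This is a simplified sketch: the helper names and the sample frame are invented, and
the byte order used for appending the value is chosen for illustration only.

```python
import zlib

def append_fcs(frame: bytes) -> bytes:
    """Append a 4-byte CRC-32 value computed over the frame contents."""
    return frame + zlib.crc32(frame).to_bytes(4, "little")

def fcs_is_valid(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC on the receiving side and compare it with the received one."""
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return zlib.crc32(body).to_bytes(4, "little") == fcs

sent = append_fcs(b"example frame contents")
damaged = bytearray(sent)
damaged[0] ^= 0x01                       # flip a single bit in transit

print(fcs_is_valid(sent))                # True
print(fcs_is_valid(bytes(damaged)))      # False
```

The check tells the receiver that a frame was damaged, but not how to repair it.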

The layer is defined by IEEE 802.3 standard, if a cabled network is being used. For wireless
networks IEEE 802.11 is used.
Physical Layer
The physical layer is responsible for converting messages into electrical signals or
electromagnetic waves (depending on the type of the network) and for transmitting them over
the physical network between communicating machines.

It is described by the same specifications as the MAC layer, IEEE 802.3 and IEEE 802.11.

Before establishing the connection, both sides negotiate the encryption parameters during
the so-called TLS handshake protocol. They must agree which encryption algorithm will be used and
create proper cryptographic keys. The encryption used later for securing all messages is
symmetric and usually the negotiated symmetric key is valid only for the time of one session.
The process of establishing the shared secret key is secure and an eavesdropper cannot obtain it
even after intercepting all the messages exchanged between the client and the server. What is
more, the handshake protocol guarantees that the negotiated secret key was not altered by
an intruder during transmission, that is, that the communication is reliable.

The whole process of establishing the secure connection is protected against man-in-the-middle
attacks.

Authentication
Both sides may authenticate themselves before creating the session. The authentication is
performed by using digital certificates signed by trusted third parties and asymmetric
encryption with public and private keys.

The authentication step is optional and one or both sides may not require it. Usually, for
convenience reasons, only the server authenticates itself.

The client may authenticate the other side by using the other side's public key (available from
the certificate issued by a trusted Certificate Authority) to decrypt some information encrypted
earlier by the other side with the corresponding private key. If the information can be properly
decrypted, then the client should assume that the other side can be trusted.

Message Integrity
The whole communication protected by TLS/SSL is reliable and the protocol itself checks the
integrity of all received messages.

The integrity checks are based on message authentication codes attached to all messages. They
are supposed to secure the messages against damage and alteration.

Similarly to other TLS/SSL functionalities, message integrity may be provided by various
cryptographic algorithms, depending on the client and server capabilities.
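
The principle of attaching a message authentication code can be shown with Python's hmac module.
This is not the TLS record format; HMAC-SHA-256, the function names protect and verify, and
the sample key are assumptions made for the example.

```python
import hashlib
import hmac

TAG_LENGTH = 32  # size of an HMAC-SHA-256 tag in bytes

def protect(key: bytes, message: bytes) -> bytes:
    """Attach a message authentication code to the message."""
    return message + hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, record: bytes) -> bytes:
    """Return the message if its authentication code is valid, otherwise raise."""
    message, tag = record[:-TAG_LENGTH], record[-TAG_LENGTH:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    return message

key = b"shared session key (example)"
record = protect(key, b"transfer 10 EUR")
print(verify(key, record))   # b'transfer 10 EUR'
```

If even one byte of the record is changed on the way, verify raises an error instead of returning
the message.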

Handshake Protocol
The handshake procedure begins just after the sides agreed to use TLS. The client and the server
choose all the parameters of the secure connection they are going to create.

1. First, the client sends a list of supported ciphers and hash functions.
2. Then, the server selects the ones that it supports as well, and notifies the client of
the decision.
3. Usually (although this step is optional), the server identifies itself by presenting a valid
digital certificate, which contains information such as the name of the server and its public
key. The public key is used by the client to check the server's validity.
4. The client may use the server's public key to encrypt a random number and send it to the
server, thus establishing the secret key which only the server will be able to decrypt.
5. Alternatively, an even better approach is to use a more secure asymmetric algorithm to
establish a stronger symmetric key. There exist two asymmetric key exchange
algorithms, Diffie-Hellman and Elliptic curve Diffie-Hellman, which (when fresh key pairs
are generated for every session) provide an additional level of security by having
the property of perfect forward secrecy. It means that secret
symmetric keys established for each session will remain secure even if the long-term
public and private keys used during the handshake protocol are compromised.

If any of the steps described above fails (on either side), the connection is cancelled. The second
phase of communication, the record protocol, will not be started.
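
The Diffie-Hellman exchange mentioned in step 5 can be sketched with plain modular arithmetic.
The parameters below (the prime 2^127 - 1 and the base 3) are toy values chosen for this
example and are not secure; real TLS uses standardised groups or elliptic curves, and the way
the session key is derived here is also a simplification.

```python
import hashlib
import secrets

# Toy public parameters (NOT secure): a Mersenne prime modulus and a small base.
P = 2 ** 127 - 1
G = 3

def keypair():
    """Generate a fresh (ephemeral) private exponent and the matching public value."""
    private = secrets.randbelow(P - 3) + 2   # random exponent in [2, P - 2]
    return private, pow(G, private, P)

client_private, client_public = keypair()
server_private, server_public = keypair()

# Only the public values travel over the network; each side combines the other
# side's public value with its own private exponent.
client_secret = pow(server_public, client_private, P)
server_secret = pow(client_public, server_private, P)

# Simplified derivation of a symmetric session key from the shared secret.
session_key = hashlib.sha256(client_secret.to_bytes(16, "big")).digest()
print(client_secret == server_secret)   # True
```

Because the private exponents are generated for one session and then discarded, a later leak of
long-term keys does not reveal this session key.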

Because negotiating a session with an asymmetric encryption algorithm is a rather
expensive procedure, either side may try to resume a previously used session instead of creating
a new symmetric key. If the other side accepts that, they will use the secret keys created
for the previous session.

Security of TLS/SSL
The secure TLS/SSL connection may be configured to use various underlying symmetric and
asymmetric encryption algorithms. The strength of the protection depends strongly on
the selected cipher and its implementation.

The first two SSL protocol versions are generally considered to be unsafe, whilst the third SSL
version is close to TLS 1.0 and is itself now deprecated. In contrast, the newer TLS versions are
much more refined and provide much better security. Although there exist several attacks targeting
various TLS algorithm implementations, TLS is considered to be a strong and efficient tool for
providing security during communication over computer networks.

It is recommended to create secret keys with algorithms which provide perfect forward secrecy.
That guarantees that the compromise of long-term private keys (that belong for example to trusted
Certificate Authorities) will not compromise the privacy of all communications protected by
the derived session keys. Certificate Authority organisations were recently targeted by many
attacks which led to the disclosure of many long-term private keys and compromised many digital
certificates.

Internet Relay Chat


IRC is an application layer protocol which allows users to exchange text messages.

The protocol was created in 1988 by a Finnish software engineer, Jarkko Oikarinen. It was
designed mainly for group communication via various discussion forums called channels, but
the protocol also allows sending and receiving private messages or data.

IRC Overview
IRC works in a client/server model. At first, every user has to install a client application. Using
the client application, it is possible to send text messages to the IRC server, which transfers
messages to other clients. The servers are connected together and form larger groups, so they
can exchange messages between themselves.

There are several IRC services that provide some additional functionalities, like bots (sending
messages generated by computer programs to channels) or bouncers (daemon processes that
provide IRC communication to offline users or to computers without any IRC client installed).

The image below presents an example of the IRC network:


IRC Protocol Design
Usually IRC runs over the TCP protocol. The official TCP port assigned to IRC is 194. However,
to avoid having to run the server application with root privileges, the most common port for IRC
is 6667/TCP, together with a few other ports nearby (6660-6669 and 7000).

IRC specification is covered by several documents, RFC 1459 and a couple of later ones:
RFC 2811, RFC 2812, and RFC 2813. However, most client and server applications don't follow
the design strictly.

IRC was used originally only for sending text messages. Each character was encoded
using 8 bits, without specifying the type of encoding. This could cause problems when conversing
users were using different encodings. At present, UTF-8 is the most popular encoding used in IRC
messages and it is supported by most IRC applications.

IRC users communicate with the server and other users by sending simple text commands. Every
command specifies the recipient (a server, a channel or another user) and additional
parameters like the text of the message.
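
The structure of such a text command can be seen by splitting a raw line into its parts. The
sketch below follows the general message layout from RFC 1459 (an optional prefix, a command,
parameters, and a trailing parameter introduced by a colon) but ignores many details; the
function name and the sample lines are made up.

```python
def parse_irc_message(line: str):
    """Split a raw IRC line into (prefix, command, params)."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # An optional prefix names the origin of the message.
        prefix, line = line[1:].split(" ", 1)
    if " :" in line:
        # Everything after " :" is one trailing parameter that may contain spaces.
        head, trailing = line.split(" :", 1)
        params = head.split() + [trailing]
    else:
        params = line.split()
    command = params.pop(0)
    return prefix, command, params

print(parse_irc_message(":alice!a@host PRIVMSG #python :Hello, world!\r\n"))
# ('alice!a@host', 'PRIVMSG', ['#python', 'Hello, world!'])
```

Here the recipient is the channel #python and the last parameter is the text of the message.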

Security of IRC
The original design of IRC is insecure. Most servers don't require users to register an account
and usually people can choose nicknames just before connecting to the channels.

Every change to the network structure is usually problematic and may cause various
issues (for example, because of several users having the same nickname, not necessarily with
the same privileges). Also, it is assumed that servers trust one another when exchanging
messages. A server that behaves incorrectly can cause problems for the whole network.

In the early 2000s some IRC networks were often attacked using DDoS and other more
sophisticated attacks. This caused many users to migrate to different IRC networks or to abandon
that way of communication completely.

The limitations of the protocol are well known, and therefore improvements are often introduced
in modern implementations. A lot of IRC servers have already started to support
secure SSL/TLS connections.

IRC Today
IRC was most popular in 2003. It is estimated that it was used by over one million people on
hundreds of thousands of channels. By 2014, the number of users had decreased to less than
half a million. The reasons why people use IRC applications have also changed.

At the beginning, IRC networks were used for social networking; now websites like
Facebook or Twitter have taken over these functions. People used to use IRC networks to broadcast
unofficial or illegal news and information. At present, there are much better ways to do it
(like Tor). IRC channels were used to exchange information about pirated software and warez.
Nowadays, such information is more often sought in other places, like P2P networks.
Due to the commercialization of the Internet, a lot of companies have decided to invest money in their
own products and to create their own ways of communication instead of using publicly available
IRC. On the other hand, there are several IRC-based commercial or open source projects that
are widely used by development teams and various firms and organizations for internal and
external communication.
IRC is a very old protocol and it has been in use for many years. The way the protocol is used
has changed over that time. One may predict that IRC technology will still be used in various
applications and services, at least over the next several years.

Notation of Numbers
Notation of numbers is a way in which all numbers are represented using a limited set of
different digits. The notation currently used for representing numbers is called positional
notation (or place-value notation), in contrast to some ancient notations, such as Roman
numerals.

Definition Positional notation is a method of representing numbers in which each digit
in a sequence is multiplied by the appropriate power of the base of the numeral system;
the exponent is equal to the digit's position (counted from the right) minus one.

• For example, a four-digit number $a_3 a_2 a_1 a_0$ in the numeral system of
base $b$ means: $a_3 b^3 + a_2 b^2 + a_1 b^1 + a_0 b^0$
Nowadays, most people use the decimal numeral system, although, for example, the Celts (and
the Elves in Tolkien's Middle-earth) used the duodecimal system. All computers use the binary
numeral system. Because conversion between the binary and hexadecimal systems is easy
(one character in the hexadecimal system corresponds to exactly four
characters in the binary numeral system), the hexadecimal system is often used for presenting
numbers stored by computers (using the binary system).

A non-negative integer in the numeral system of base $b$ may be presented as:

$(d_{k-1} d_{k-2} \ldots d_1 d_0)_b$

where:
$d_{k-1}, d_{k-2}, \ldots, d_1, d_0$ are digits (in the range 0 to $b-1$)
Base conversion algorithm
To write a number n in the numeral system of base b one can use the following algorithm:
1. Divide n by b, producing a quotient and a remainder (this is called Euclidean
division).
2. Put the remainder as the last digit of the outcome.
3. Put the quotient in place of n.
4. Repeat the steps above to obtain the last-but-one digit of the outcome and then the
earlier digits.
5. Stop the algorithm when the quotient is equal to 0.
For example, to present the number 19 in the binary numeral system, the following steps should
be executed:
19 / 2 = 9, remainder 1
9 / 2 = 4, remainder 1
4 / 2 = 2, remainder 0
2 / 2 = 1, remainder 0
1 / 2 = 0, remainder 1
The remainders, read from the last one to the first, give the digits:
$19 = (10011)_2$
Whereas to present the number 19 in the numeral system of base seven one should perform
the steps:
19 / 7 = 2, remainder 5
2 / 7 = 0, remainder 2
$19 = (25)_7$
The algorithm can also be used to change numbers that are written in the numeral system of
smaller base into numbers in the numeral system of bigger base.
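
The steps of the base conversion algorithm can be sketched in a few lines of Python. This is only an illustration; the function name to_base and the choice to return the digits as a list are assumptions made for the example.

```python
def to_base(n: int, b: int) -> list[int]:
    """Return the digits of a non-negative integer n in base b, most significant first."""
    if n < 0 or b < 2:
        raise ValueError("n must be non-negative and b must be at least 2")
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, b)  # Euclidean division: quotient replaces n
        digits.append(remainder)     # remainders are produced last digit first
    return digits[::-1]              # reverse to get the usual digit order


print(to_base(19, 2))  # [1, 0, 0, 1, 1]
print(to_base(19, 7))  # [2, 5]
```

Returning a list of digits rather than a string keeps the sketch valid for bases above 10, where a single character per digit would need an extra alphabet.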

Number of digits
An integer n that satisfies the inequality $b^{k-1} \le n < b^k$ has k digits in a numeral system
of base b. This relationship can be expressed using the logarithm:
number of digits = $\lfloor \log_b n \rfloor + 1$
where:
the symbol $\lfloor \cdot \rfloor$ denotes the integer part of the number.

Binary numbers
Binary numbers are stored as sequences of zeros and ones. At present there are a couple of
binary numeral systems in use. The most popular are:
o the Natural Binary System (NBS for short), which can store only
non-negative numbers,
o the Two's Complement System (C2 for short), which can store both
positive and negative numbers.

Notation of numbers in NBS


A binary n-digit number stored in NBS can have non-negative values ranging from 0 to $2^n - 1$.
Definition The decimal value of an n-digit binary number stored using the natural binary system
as:
$a_{n-1} \ldots a_2 a_1 a_0$
is equal to:
$2^{n-1} a_{n-1} + \ldots + 2^2 a_2 + 2^1 a_1 + 2^0 a_0$

where:

• $a_i$ can be equal to either 0 or 1.


Notation of fractions in NBS
Using the natural binary system, one can present binary fixed point numbers. A certain part of
the bits is used for storage of the fractional part of the number.

Definition The decimal value of an (n+m)-digit binary number stored using the natural binary
system as:
$a_{n-1} \ldots a_2 a_1 a_0 \, , \, a_{-1} a_{-2} \ldots a_{-m}$
is equal to:
$2^{n-1} a_{n-1} + \ldots + 2^2 a_2 + 2^1 a_1 + 2^0 a_0 + 2^{-1} a_{-1} + 2^{-2} a_{-2} + \ldots + 2^{-m} a_{-m}$

where:

• n is the number of integer bits,
• m is the number of fractional bits,
• $a_i$ can be equal to either 0 or 1.
Binary addition in NBS
Binary addition of two numbers in NBS may be presented as simple columnar addition:

  1111        (carries)
   111101
+  000110
  -------
  1000011

Binary subtraction in NBS


Binary subtraction of two numbers in NBS may be presented as simple columnar subtraction. One
should subtract corresponding bits and, if necessary, borrow from the columns on the left.
A borrowed unit is always worth the base of the system in the column that receives it, so in this
case 10 in binary (or 2 in the decimal notation; for convenience the example below uses the latter
number).
For example, after subtracting 6 from 19 in NBS, one will receive:
1 2

10011

- 110

1101

This is a correct result (13) in the decimal numeral system as well.


Binary multiplication in NBS
Binary multiplication of two numbers in NBS may be presented in the same way as multiplication
of decimal numbers. For example, the result of multiplication of 22 and 5 (equal to 110) can be
calculated in the following way (all the numbers are written in NBS):
    10110
  x   101
  -------
    10110
   00000
+ 10110
  -------
  1101110

Binary division in NBS


Binary division of two numbers in NBS may be performed using shifting and subtraction.

One should write the two numbers under a horizontal line, the dividend over the divisor.
The divisor should be moved left until its most significant non-zero bit is located right under
the most significant non-zero bit of the dividend.

Then the dividend should be compared to the divisor. If the dividend (the number written higher)
is greater than or equal to the divisor, one ought to subtract the divisor from the dividend and write 1 above
the horizontal line in the last position. During the next comparison, one should use the result of
the subtraction instead of the dividend.
If the divisor is bigger than the dividend, then no subtraction is performed and one ought
to write 0 above the horizontal line in the last position. In either case, the divisor (the number
written lower) is then moved to the right by one position and the comparison is repeated.
If the divisor can't be moved to the right any further, the algorithm stops. The current result of
subtraction is the remainder of the division.
For example, after dividing 11 by 3 one will receive the result 3 and the remainder 2:
011

1011
11

1011

- 11

101

- 11

10

After obtaining the remainder, it is possible to continue dividing and receive fractional digits (as
during dividing of decimal numbers).
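
The shift-and-subtract procedure can be sketched as follows. The sketch works on Python integers instead of written columns, and the function name shift_subtract_divide is an assumption made for the illustration; it is not taken from any library.

```python
def shift_subtract_divide(dividend: int, divisor: int) -> tuple[int, int]:
    """Divide two non-negative integers by shifting and subtracting.

    Returns (quotient, remainder).
    """
    if dividend < 0 or divisor <= 0:
        raise ValueError("dividend must be non-negative and divisor positive")
    # Move the divisor left until its top bit is under the dividend's top bit.
    shift = max(dividend.bit_length() - divisor.bit_length(), 0)
    quotient = 0
    remainder = dividend
    for s in range(shift, -1, -1):
        shifted = divisor << s
        quotient <<= 1               # make room for the next quotient bit
        if remainder >= shifted:     # subtraction is possible: write 1
            remainder -= shifted
            quotient |= 1
        # otherwise the new quotient bit stays 0; the divisor moves right
    return quotient, remainder


print(shift_subtract_divide(11, 3))  # (3, 2)
```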

Notation of numbers in C2
A binary n-digit number stored in C2 can have positive and negative values ranging from $-2^{n-1}$ to $2^{n-1} - 1$.

Definition The decimal value of an n-digit binary number stored using the C2 system as:
$a_{n-1} \ldots a_2 a_1 a_0$
is equal to:
$-2^{n-1} a_{n-1} + 2^{n-2} a_{n-2} + \ldots + 2^2 a_2 + 2^1 a_1 + 2^0 a_0$

where:

$a_i$ can be equal to either 0 or 1.



The most-significant bit of a binary number in C2 is referred to as the sign bit and it determines if
the number is positive or negative (the number is negative if the sign bit is equal to 1).
Sign extension in C2
Numbers in C2 can be stored using various numbers of bits (numbers in the decimal system can
also be stored using various numbers of digits, by adding zeros to the left side). In order
to increase the number of bits of a binary number, one must add a new bit to the left side of the binary
sequence and set it to the same value as the sign bit. This guarantees that the new number
will have exactly the same value as the original one.

For example, the 4-digit binary number -7 stored in C2 ($1001_2$) can be extended to 5 bits by
copying the sign bit ($11001_2$):
$1001_2 = -2^3 + 2^0 = -8 + 1 = -7$
$11001_2 = -2^4 + 2^3 + 2^0 = -16 + 8 + 1 = -7$
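
The definition of the C2 value and the sign extension rule can be checked with a short sketch. Bit strings are used here only for readability; the helper names c2_value and sign_extend are assumptions made for the example.

```python
def c2_value(bits: str) -> int:
    """Decimal value of a bit string interpreted in two's complement (C2)."""
    n = len(bits)
    value = -int(bits[0]) * 2 ** (n - 1)      # the sign bit has a negative weight
    for i, bit in enumerate(bits[1:], start=1):
        value += int(bit) * 2 ** (n - 1 - i)  # remaining bits have positive weights
    return value


def sign_extend(bits: str, width: int) -> str:
    """Extend a C2 bit string to the given width by copying the sign bit."""
    if width < len(bits):
        raise ValueError("width must not be smaller than the current length")
    return bits[0] * (width - len(bits)) + bits


print(c2_value("1001"))                  # -7
print(sign_extend("1001", 5))            # 11001
print(c2_value(sign_extend("1001", 5)))  # -7
```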
The inverse of a number in C2
Computing the inverse (negation) of a number in C2 is performed by negating all the bits in the number
(replacing zeros by ones, and ones by zeros) and then adding 1 to the result.
For example, to calculate the representation of the number 11 in C2, knowing the binary representation
of -11 in C2 (it is equal to $110101_2$), one should first negate all the bits:
~110101 = 001010,
and then add 1 to the result:
  001010
+ 000001
  ------
  001011

$2^3 + 2^1 + 2^0 = 8 + 2 + 1 = 11$
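
The two steps (negating the bits and adding 1) can be sketched as below. The function name c2_negate is an assumption made for the illustration, and the result is kept at the same width as the input.

```python
def c2_negate(bits: str) -> str:
    """Negate a C2 bit string: flip every bit, then add 1 (same width)."""
    n = len(bits)
    inverted = int(bits, 2) ^ ((1 << n) - 1)  # replace zeros by ones and ones by zeros
    negated = (inverted + 1) % (1 << n)       # add 1 and drop any carry out of n bits
    return format(negated, f"0{n}b")


print(c2_negate("110101"))  # 001011
print(c2_negate("001011"))  # 110101
```

Note that the most negative value of a given width (for example 100000 for 6 bits) is mapped to itself, because its positive counterpart does not fit within the same number of bits.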
Binary addition in C2
Binary addition of two numbers in C2 may be presented as simple columnar addition, just like
in NBS. For example, after adding 6 to -11 in C2, one will receive:
1

110101

+ 000110

111011

The result is equal to -5 in the decimal numeral system:
$-2^5 + 2^4 + 2^3 + 2^1 + 2^0 = -32 + 16 + 8 + 2 + 1 = -5$
While performing the computations, one should remember that the result must fit within the same
range in which the operands are defined. Otherwise, the received value will be
incorrect.

Binary subtraction in C2
Binary subtraction of two numbers in C2 may be presented as simple columnar subtraction, just
like in NBS. A borrowed unit is (like in NBS) worth the base of the system in the column
that receives it, so in this case 10 in binary (or 2 in the decimal notation; to simplify the notation,
the examples below use the latter number). For example, after subtracting 6 from -11 in C2, one
will receive:
1 2 2

110101

-000110

101111

This is a correct result in the decimal numeral system as well:
$-2^5 + 2^3 + 2^2 + 2^1 + 2^0 = -32 + 8 + 4 + 2 + 1 = -17$

Similarly, to subtract 22 from -11 in C2 one should:


2 1 2 2

1110101

-0010110

1011111
The result is equal to the decimal value:
$-2^6 + 2^4 + 2^3 + 2^2 + 2^1 + 2^0 = -64 + 16 + 8 + 4 + 2 + 1 = -33$

Binary multiplication in C2
Binary multiplication in C2 is slightly more complicated than in NBS. One of the efficient
algorithms is called Booth's multiplication algorithm. In order to multiply two
numbers: X of lenX bits and Y of lenY bits, both stored in C2, one should perform the following
Initialization steps:
1. If any of the given numbers (factors) is equal to the most negative number which can
be stored using as many bits as this factor has, then it should be expanded and a new
bit (a copy of the sign bit) should be added to its left side,
2. Calculate the negation of X in C2 (-X),
3. Initialize the helper variable A of size of lenX+lenY+1 bits: fill the most
significant lenX bits of A with all the bits of X, and the rest lenY+1 bits of A with zeros
(A stands for addition, which is performed later during the algorithm),
4. Initialize the helper variable S of size of lenX+lenY+1 bits: fill the most
significant lenX bits of S with all the bits of -X, and the rest lenY+1 bits of S with zeros
(S stands for subtraction: adding S subtracts X, which is performed later during the algorithm),
5. Initialize the helper variable P of size of lenX+lenY+1 bits: fill the most
significant lenX bits of P with zeros, then the next lenY bits of P with all the bits of Y, and
the last (the least significant) bit of P set to 0 (P stands for product, the result of
multiplication).
After initialization of variables, one should repeat the two steps below lenY times:
1. If the two last bits of P are equal to 01, then one should compute P+A (ignoring any
overflow) and assign the result to P. Otherwise, if the two last bits of P are equal to 10,
then one should compute P+S (ignoring any overflow) and assign the result to P.
Otherwise, if they are equal to either 00 or 11, then P should remain unchanged.
2. All the bits of the current number P should be arithmetically shifted right by one position
(abandoning the least significant bit and leaving unchanged the value of the most
significant bit) and the result ought to be assigned to P.
Finally, after completing the steps above, one should remove the least significant bit from
the received number P. The result (the new value of P) is the product of multiplication of the two
numbers X and Y.
Example of binary multiplication in C2
Let's consider the multiplication of -8 and 3:
X = -8 = $1000_2$
Y = 3 = $011_2$
In the first step, the number X should be extended:
X = 11000
Then, it is necessary to calculate the value of -X:
-X = ~11000 + 01 = 00111 + 01 = 01000
Calculating the values of helper variables A, S and P:
A = 1 1000 0000
S = 0 1000 0000
P = 0 0000 0110
Then the next two steps of the algorithm should be repeated three times (because
the number Y is 3-bit long):
Iteration 1:
P = P + S = 0 1000 0110
P >> 1 = 0 0100 0011
Iteration 2:
P >> 1 = 0 0010 0001
Iteration 3:
P = P + A = 1 1010 0001
P >> 1 = 1 1101 0000
In the end, it is necessary to remove the least significant bit of P:
P = 1110 1000
The result is equal to:
$-2^7 + 2^6 + 2^5 + 2^3 = -128 + 64 + 32 + 8 = -24$

This is a correct result of multiplication of -8 and 3.
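
The whole procedure can be traced with a small sketch that keeps A, S and P as Python integers masked to lenX+lenY+1 bits. The function name booth_multiply and its parameters are assumptions made for the illustration. As a simplification, the sketch widens only X when it is the most negative value, since that is the factor whose negation could otherwise overflow.

```python
def booth_multiply(x: int, y: int, len_x: int, len_y: int) -> int:
    """Multiply x (len_x bits) by y (len_y bits), both in C2, using Booth's algorithm."""
    # Step 1 (simplified): widen X by one bit if it is the most negative len_x-bit value.
    if x == -(1 << (len_x - 1)):
        len_x += 1
    total = len_x + len_y + 1
    mask = (1 << total) - 1

    # A holds X and S holds -X in the most significant len_x bits; P holds Y and a final 0.
    a = ((x & ((1 << len_x) - 1)) << (len_y + 1)) & mask
    s = ((-x & ((1 << len_x) - 1)) << (len_y + 1)) & mask
    p = (y & ((1 << len_y) - 1)) << 1

    for _ in range(len_y):
        last_two = p & 0b11
        if last_two == 0b01:
            p = (p + a) & mask           # P + A, ignoring overflow
        elif last_two == 0b10:
            p = (p + s) & mask           # P + S, ignoring overflow
        sign = p & (1 << (total - 1))    # arithmetic shift keeps the top bit
        p = (p >> 1) | sign

    p >>= 1                              # remove the least significant bit
    width = total - 1
    if p & (1 << (width - 1)):           # interpret the remaining bits as C2
        p -= 1 << width
    return p


print(booth_multiply(-8, 3, 4, 3))  # -24
```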

Time Estimation of Mathematical Operations
The time needed by a computer to perform a task can be evaluated by estimating the number of
required operations on bits. One must analyse how many bit changes are necessary to perform
the calculations on a number that is stored in computer memory in binary notation.

Binary addition
Binary addition of two numbers (stored in Natural Binary System - NBS) may be presented as
a simple columnar addition:

1 1 1

11100

+ 01110

101010

First, the vacant spaces in the shorter number are filled with zeros. As a result, both numbers are
of the same length of k bits. After that, simple summation should be performed for each pair of
digits. If the sum in a column is bigger than 1, then 2 is subtracted from it and
the number 1 is carried to the next position.
The result of addition of two k-digit numbers contains k or k+1 digits. The whole operation of
summation requires k operations on bits.

Binary multiplication
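Analogously to addition one can perform multiplication of two binary numbers (here 29 and 13,
which give the three rows and the product 377 shown below):

      11101
  x    1101
  ---------
      11101
    1110100
+  11101000
  ---------
  101111001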

During multiplication of two binary numbers n and m of lengths k and j bits respectively, one
obtains up to j rows (the number of rows is equal to the number of ones in the number m),
each containing a copy of the number n shifted to the left by the appropriate number of positions.
Multiplication can therefore be presented as up to j-1 additions. Each addition requires k operations
on bits.
The number of operations on bits during multiplication of two binary numbers is smaller than
the product of their lengths. The result of multiplication has k+j or k+j-1 digits.
It should be added that considerably faster algorithms for multiplying big numbers are known.
Some of them multiply two k-digit numbers using only about k(ln k)(ln (ln k)) operations
on bits.

Factorial
The factorial of a non-negative integer n (n!) is the product of all positive integers less than or
equal to n.
The product of n numbers of length k is up to nk bits long. Therefore, the number n! is also
up to nk bits long, where k is the number of bits of n. To calculate n! one must perform n-2
multiplications, each of a number of length up to k bits by a number of length up to nk bits.
Therefore, the total number of operations is at most:
$(n-2) \, n \, k^2$
The value can be expressed using only the number n:
$(n-2) \, n \, (\lfloor \log_2 n \rfloor + 1)^2$

Approximately:
$n^2 (\log_2 n)^2$
Polynomial time
According to Cobham's thesis, algorithms are considered to be fast and effective, if they run in
polynomial time.
Definition An algorithm that performs operations on numbers $n_1, n_2, \ldots, n_r$ of lengths
$k_1, k_2, \ldots, k_r$ digits respectively runs in polynomial time if there are integers $d_1, d_2, \ldots, d_r$ such
that the number of operations on bits necessary to perform this algorithm can be presented
as $O(k_1^{d_1} k_2^{d_2} \cdots k_r^{d_r})$.

• Addition, subtraction, multiplication and division are algorithms running in polynomial
time
• The base conversion algorithm runs in polynomial time
• The algorithm which calculates n! does not run in polynomial time
Modular Arithmetic (Clock Arithmetic)
Modular arithmetic is a system of arithmetic for integers, where values reset to zero and begin
to increase again after reaching a certain predefined value, called the modulus. Modular
arithmetic is widely used in computer science and cryptography.
Definition Let $Z_N$ be the set of all non-negative integers that are smaller than N:
$Z_N = \{0, 1, 2, \ldots, N-1\}$

where:

• N is a positive integer,
• if N is a prime, it will be denoted p (and the whole set as $Z_p$).
To determine the value of an integer for a modulus N, one should divide this number by N. Its
value in $Z_N$ is equal to the remainder of the division. In modular arithmetic, it is possible to define
all typical operations, as in normal arithmetic. They work as one may expect. It is possible to use
the same commutative, associative, and distributive laws.

Modular inversion
Integers in modular arithmetic may, but need not, have inverses.

Definition The inverse of x in $Z_N$ is a number y in $Z_N$ that satisfies the equation:

$x \cdot y = 1$ (in $Z_N$)

where:

y is denoted $x^{-1}$

For example, if N is an odd number, then the inverse of 2 in $Z_N$ is (N+1)/2:
$2 \cdot (N+1)/2 = N + 1 = 1$ (in $Z_N$)
Theorem A number x is invertible in $Z_N$ if and only if the numbers x and N are relatively prime.

• The theorem can be proved using the fact that the greatest common
divisor of two integers x and y can be presented as a sum of two products, each of one of the
numbers and a properly selected integer:
$a \cdot x + b \cdot y = \gcd(x, y)$
Determining inverse numbers in $Z_N$ allows solving linear equations in modular arithmetic:
the equation: $a \cdot x + b = 0$ (in $Z_N$), where a is invertible in $Z_N$,
has the solution: $x = -b \cdot a^{-1}$ (in $Z_N$)
Definition The symbol $Z_N^*$ denotes the set of all elements of $Z_N$ that are invertible in $Z_N$; that
means the set of numbers x that belong to $Z_N$ such that x and N are relatively prime.

• for example for a prime number p:
$Z_p^* = Z_p \setminus \{0\} = \{1, 2, \ldots, p - 1\}$

Calculating inverse numbers in ZN


Inverse numbers in $Z_N$ can be determined in time $O(\log^2 N)$ using the Euclidean algorithm, which
computes the greatest common divisor of two integers.
The extended Euclidean algorithm finds the coefficients of Bézout's identity, that is, integers
a and b such that:
$a \cdot x + b \cdot y = \gcd(x, y)$
In order to obtain the inverse number when x and y are relatively prime, one should perform the
following transformations:
$a \cdot x + b \cdot y = 1$
$b \cdot y = 1$ (in $Z_x$)
Thus, b is the inverse number of y in $Z_x$.
The numbers a and b should be calculated during each step of the Euclidean algorithm, using
the values of those numbers obtained in previous steps and the quotients:
$a_i = a_{i-2} - q_{i-1} \cdot a_{i-1}$
$b_i = b_{i-2} - q_{i-1} \cdot b_{i-1}$
If the algorithm's steps are numbered from 1, one should assume the following initial values of the
coefficients:
$a_{-1} = 1$
$a_0 = 0$
$b_{-1} = 0$
$b_0 = 1$
The final values of the coefficients a and b are received in the same step when the greatest
common divisor of x and y is calculated (thus, in the step with the last non-zero remainder).
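
A minimal sketch of the extended Euclidean algorithm and of modular inversion built on it is shown below. The function names extended_gcd and mod_inverse are assumptions made for the illustration; the coefficient updates follow the recurrences above, with the initial values (1, 0) for a and (0, 1) for b.

```python
def extended_gcd(x: int, y: int) -> tuple[int, int, int]:
    """Return (g, a, b) such that a*x + b*y == g == gcd(x, y)."""
    a_prev, a_cur = 1, 0   # initial coefficients of x
    b_prev, b_cur = 0, 1   # initial coefficients of y
    r_prev, r_cur = x, y   # remainders of the Euclidean algorithm
    while r_cur != 0:
        q = r_prev // r_cur
        r_prev, r_cur = r_cur, r_prev - q * r_cur
        a_prev, a_cur = a_cur, a_prev - q * a_cur
        b_prev, b_cur = b_cur, b_prev - q * b_cur
    # r_prev is the last non-zero remainder, i.e. the gcd
    return r_prev, a_prev, b_prev


def mod_inverse(y: int, n: int) -> int:
    """Return the inverse of y in Z_n, or raise ValueError if it does not exist."""
    g, _, b = extended_gcd(n, y % n)   # a*n + b*y == g
    if g != 1:
        raise ValueError("y is not invertible modulo n")
    return b % n                       # reduce b into the range 0..n-1


print(mod_inverse(3, 7))  # 5, because 3 * 5 = 15 = 1 (in Z_7)
print(mod_inverse(2, 9))  # 5, which equals (9 + 1) / 2
```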

Information-theoretic security of ciphers
The quality of a cipher can be described by checking its resistance to strictly technical attacks
(that is, attacks that bypass the human factor). The highest defined level of security is referred to as
perfect security. In practice, to describe the security of ciphers, the semantic security
property is usually used.

Perfect Security
Definition A cipher (E, D) defined over (K, M, C) has perfect secrecy if for every two
messages $m_1$ and $m_2$ (of the same size) belonging to M and for every c belonging to C, the following
equality holds:
$P[E(k, m_1) = c] = P[E(k, m_2) = c]$
where:

• the key k is chosen uniformly at random from the set K of all possible keys,
• M is the set of all possible messages,
• C is the set of all possible ciphertexts.

Consequences of perfect secrecy:

• Given a ciphertext, it is not possible to distinguish between two possible sent messages
• Given a ciphertext, it is not possible to find out anything about the sent text
• It is not possible to break the cipher using ciphertext-only attacks
• |K| >= |M|
The one-time pad is the only cipher known to have perfect secrecy.
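
For a very small message space the definition can be checked by brute force. The sketch below enumerates all keys for a 3-bit one-time pad and compares the ciphertext distributions of all messages; the helper names and the deliberately weakened key space short_keys are assumptions made for the illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def ciphertext_distribution(message, keys, encrypt):
    """Probability of each ciphertext when the key is drawn uniformly from keys."""
    counts = Counter(encrypt(k, message) for k in keys)
    return {c: Fraction(n, len(keys)) for c, n in counts.items()}


def has_perfect_secrecy(messages, keys, encrypt) -> bool:
    """True if every message induces the same ciphertext distribution."""
    distributions = [ciphertext_distribution(m, keys, encrypt) for m in messages]
    return all(d == distributions[0] for d in distributions)


def xor_bits(key, message):
    """One-time pad encryption of a tuple of bits."""
    return tuple(k ^ m for k, m in zip(key, message))


BITS = 3
messages = list(product((0, 1), repeat=BITS))
full_keys = messages                               # |K| = |M|
short_keys = [k for k in full_keys if k[0] == 0]   # hypothetical weakened key space, |K| < |M|

print(has_perfect_secrecy(messages, full_keys, xor_bits))   # True
print(has_perfect_secrecy(messages, short_keys, xor_bits))  # False
```

The second call illustrates the condition |K| >= |M|: once the key space is smaller than the message space, the first ciphertext bit reveals the first message bit.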

Semantic Security
Definition A cipher is semantically secure if knowledge of the ciphertext and the length of
the original message does not reveal any additional information about the original message that
can be feasibly extracted.

In other words, any probabilistic polynomial-time algorithm (PPTA) that receives the ciphertext of
a given message, created by a semantically secure cipher, together with the message length cannot
determine any partial information about the message with probability non-negligibly higher than
all other PPTAs that only have access to the message length and don't have access to the ciphertext.
One can prove that the OTP cipher is semantically secure if it uses a random encryption key.
Similarly, a stream cipher has this property if the pseudorandom generator used in
the cipher is secure.

Padding Mechanisms
Padding standards are mechanisms for appending some predefined values to messages. They
are used with algorithms which deal with blocks of data. Typical examples of such algorithms
are block symmetric ciphers and MAC algorithms. These algorithms work on whole data
blocks. Therefore, if a message length is not a multiple of the block size, a standard for adding
some number of bytes to the end of the message is required.
The information about which padding standard has been used must be provided to the receiver. This
allows them to determine (after decrypting the ciphertext) where the original message ends and
the unimportant pad bytes start.

All the padding standards defined below work in a similar way. They describe which values should
be appended to the message, to fill up the last block.

Using padding is a convenient way of making sure that encrypted data is of the correct size. The
only drawback is the fact that even if the original message contains the correct number of bytes
(a multiple of the block size), some padding must be added to fulfil the process and make sure
that the receiver would be able to understand the message. Usually, a new dummy block must
be added which will contain only the padding bytes.

There are a few padding types described below. The first two paddings are based on bits,
whereas the others are based on bytes.

Bit Padding
A single 1 bit is appended to the data. Then, all other bits of the padding (if any are required) are
zeros.
1 0 1 0 0 0 0 1 1 0 100000

This padding scheme is defined in ISO/IEC 9797-1 documentation.

TBC (Trailing Bit Complement) Padding


If the data ends in a 0 bit, all the padding bits will be ones. If the data ends in a 1 bit, all the
padding bits will be zeros.
1 0 1 0 0 0 0 1 1 0 111111
1 0 1 0 0 0 0 1 1 1 000000

PKCS#5 and PKCS#7 Padding


The value of each pad byte is the total number of bytes that are added. Of course, the total number
of pad bytes depends on the block size.

For example, if the message is 3 bytes shorter than an integer multiple of the block size, then
3 pad bytes should be added, each of them of value 3. If 5 bytes should be added, then each of
them should be 5.

0x10 0x11 0x36 0x67 0x38 0xBC 0x03 0x21 0xEF 0x03 0x03 0x03
0x10 0x11 0x36 0x67 0x38 0xBC 0x06 0x06 0x06 0x06 0x06 0x06
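
A minimal sketch of this padding rule and of its removal is shown below. The function names pkcs7_pad and pkcs7_unpad are assumptions made for the illustration, and the 12-byte block size is chosen only so that the output matches the rows shown above; real block ciphers typically use 8 or 16 bytes.

```python
def pkcs7_pad(data: bytes, block_size: int) -> bytes:
    """Append N bytes of value N so that the length becomes a multiple of block_size."""
    if not 1 <= block_size <= 255:
        raise ValueError("block size must be between 1 and 255 bytes")
    pad_len = block_size - len(data) % block_size  # always between 1 and block_size
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes, block_size: int) -> bytes:
    """Remove the padding, checking that it is well formed."""
    if not padded or len(padded) % block_size:
        raise ValueError("padded data must be a non-empty multiple of the block size")
    pad_len = padded[-1]
    if not 1 <= pad_len <= block_size or padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]


message = bytes.fromhex("1011366738bc0321ef")
padded = pkcs7_pad(message, 12)
print(padded.hex(" "))  # 10 11 36 67 38 bc 03 21 ef 03 03 03
print(pkcs7_unpad(padded, 12) == message)  # True
```

When the message already fills whole blocks, pad_len equals the block size, so a full block of padding is added.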

ISO 7816-4 Padding


The first byte of the padding is 0x80. All other bytes of the padding are zeros. Such construction
allows to create paddings of any size.

0x10 0x11 0x36 0x67 0x38 0xBC 0x03 0x21 0xEF 0x80 0x00 0x00

The padding mechanism is defined in ISO/IEC 7816-4 documentation.
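
The rule can be sketched as follows. The function names are assumptions made for the illustration, and the 12-byte block size is used only to reproduce the row shown above.

```python
def iso7816_pad(data: bytes, block_size: int) -> bytes:
    """Append 0x80 followed by zero bytes up to the next block boundary."""
    pad_len = block_size - len(data) % block_size  # at least one byte is always added
    return data + b"\x80" + b"\x00" * (pad_len - 1)


def iso7816_unpad(padded: bytes) -> bytes:
    """Strip trailing zero bytes and the 0x80 marker that precedes them."""
    stripped = padded.rstrip(b"\x00")
    if not stripped.endswith(b"\x80"):
        raise ValueError("invalid padding")
    return stripped[:-1]


message = bytes.fromhex("1011366738bc0321ef")
padded = iso7816_pad(message, 12)
print(padded.hex(" "))  # 10 11 36 67 38 bc 03 21 ef 80 00 00
print(iso7816_unpad(padded) == message)  # True
```

Because the marker byte is always present, data that itself ends with zero bytes is still recovered unambiguously.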

ISO 10126-2 Padding


The last byte of the padding (thus, the last byte of the block) is the number of pad bytes. All other
bytes of the padding are some random data.

0x10 0x11 0x36 0x67 0x38 0xBC 0x03 0x21 0xEF 0x23 0x86 0x03

The padding mechanism is defined in ISO 10126-2 documentation.

ANSI X9.23 Padding


The last byte of the padding (thus, the last byte of the block) is the number of pad bytes. All other
bytes of the padding are zeros.

0x10 0x11 0x36 0x67 0x38 0xBC 0x03 0x21 0xEF 0x00 0x00 0x03

The padding mechanism is defined in the ANSI X9.23 standard.

Zero Byte Padding


All padding bytes are zeros. This type of padding is rather unreliable (what if the data ends with
zeros?) and should be used only if necessary in legacy applications.

0x10 0x11 0x36 0x67 0x38 0xBC 0x03 0x21 0xEF 0x00 0x00 0x00
0x10 0x11 0x36 0x67 0x38 0xBC 0x03 0x21 0x00 0x00 0x00 0x00

Date: 2020-03-09
Block Ciphers Modes of Operation
The modes of operation of block ciphers are configuration methods that allow those ciphers to
work with large data streams, without the risk of compromising the provided security.

It is possible, although not recommended, to encrypt identical plaintext parts with the same
secret key bits while working with block ciphers. Applying one deterministic algorithm to
a number of identical input blocks results in the same number of identical ciphertext blocks.

This is a very dangerous situation for the cipher's users. An intruder would be able to get much
information by knowing the distribution of identical message parts, even if he would not be able
to break the cipher and discover the original messages.

Luckily, there exist ways to blur the cipher output. The idea is to mix the plaintext blocks (which
are known) with the ciphertext blocks (which have just been created), and to use the result as the
cipher input for the next blocks. As a result, the user avoids creating identical output ciphertext
blocks from identical plaintext data. These modifications are called the block cipher modes
of operation.

ECB (Electronic Codebook) Mode


It is the simplest mode of encryption. Each plaintext block is encrypted separately. Similarly, each
ciphertext block is decrypted separately. Thus, it is possible to encrypt and decrypt by using many
threads simultaneously. However, in this mode the created ciphertext is not blurred.

Encryption in the ECB mode


Decryption in the ECB mode

A typical example of the weakness of encryption using ECB mode is encrypting a bitmap image
(for example a .bmp file). Even a strong encryption algorithm used in ECB mode cannot
effectively blur the plaintext.

The bitmap image encrypted using DES and the same secret key. The ECB mode was used for
the middle image and the more complicated CBC mode was used for the bottom image.
A message that is encrypted using the ECB mode should be extended to a size that is equal to
an integer multiple of the single block length. A popular method of aligning the length of the last
block is to append an additional bit equal to 1 and then fill the rest of the block with bits
equal to 0. This makes it possible to determine precisely the end of the original message. There exist
other methods of aligning the message size.
Apart from revealing hints about the content of the plaintext, ciphers used in ECB
mode are also more vulnerable to replay attacks.
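
The leak of repeated blocks can be demonstrated with a short sketch. It does not use DES or any real cipher: the block cipher below is a toy 4-round Feistel network built from SHA-256, introduced here purely as an assumption so that the example is self-contained, and it must not be used for real encryption.

```python
import hashlib

BLOCK = 8  # toy block size in bytes (assumption for the sketch)


def _f(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel network.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _f(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _f(key, i, left)), left
    return left + right


def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be padded to a multiple of the block size")
    # Every block is encrypted separately and independently of the others.
    return b"".join(toy_encrypt_block(key, plaintext[i:i + BLOCK])
                    for i in range(0, len(plaintext), BLOCK))


def ecb_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    return b"".join(toy_decrypt_block(key, ciphertext[i:i + BLOCK])
                    for i in range(0, len(ciphertext), BLOCK))


key = b"demo key"
plaintext = b"SAMEBLOK" + b"SAMEBLOK" + b"OTHERBLK"
ciphertext = ecb_encrypt(key, plaintext)
blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
print(blocks[0] == blocks[1])  # True: identical plaintext blocks stay identical
print(ecb_decrypt(key, ciphertext) == plaintext)  # True
```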

CBC (Cipher-Block Chaining) Mode


The CBC encryption mode was invented at IBM in 1976. In this mode, each
plaintext block is XORed with the previously produced ciphertext block. The result is then encrypted
using the cipher algorithm in the usual way. As a result, every subsequent ciphertext block
depends on the previous one. The first plaintext block is XORed with a random initialization
vector (commonly referred to as IV). The vector has the same size as a plaintext block.
Encryption in CBC mode can only be performed by using one thread. Despite this disadvantage,
this is a very popular way of using block ciphers. CBC mode is used in many applications.

During decryption of a ciphertext block, one should XOR the output data received from the
decryption algorithm with the previous ciphertext block. Because the receiver knows all the
ciphertext blocks just after obtaining the encrypted message, they can decrypt the message using
many threads simultaneously.

Encryption in the CBC mode


Decryption in the CBC mode

If one bit of a plaintext message is damaged (for example because of some earlier transmission
error), the corresponding ciphertext block and all subsequent ciphertext blocks will change, and
decrypting them will return the damaged plaintext rather than the original one. In contrast, if one
ciphertext bit is damaged, only two received plaintext blocks will be damaged. It might be possible
to recover the data.

A message that is to be encrypted using the CBC mode should be extended to a size that is
equal to an integer multiple of a single block length (as in the case of the
ECB mode).
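
The chaining and the limited error propagation during decryption can be traced with the sketch below. As before, the block cipher is a toy 4-round Feistel network built from SHA-256, which is an assumption made only to keep the example self-contained; the function names are hypothetical as well.

```python
import hashlib
import os

BLOCK = 8  # toy block size in bytes (assumption for the sketch)


def _f(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel network.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _f(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _f(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be padded to a multiple of the block size")
    previous, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        # XOR with the previous ciphertext block (or the IV), then encrypt.
        previous = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], previous))
        out.append(previous)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    previous, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        # Decrypt, then XOR with the previous ciphertext block (or the IV).
        out.append(_xor(toy_decrypt_block(key, block), previous))
        previous = block
    return b"".join(out)


key = b"demo key"
iv = os.urandom(BLOCK)  # a fresh random IV, sent along with the ciphertext
plaintext = b"SAMEBLOK" * 3 + b"lastblok"
ciphertext = cbc_encrypt(key, iv, plaintext)
print(cbc_decrypt(key, iv, ciphertext) == plaintext)  # True
```

Each decrypted block depends only on two ciphertext blocks, which is why decryption can be parallelized and why a damaged ciphertext bit affects only two plaintext blocks.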

Security of the CBC mode


The initialization vector IV should be created randomly by the sender. During transmission it
should be concatenated with ciphertext blocks, to allow decryption of the message by
the receiver. If an intruder could predict what vector would be used, then the encryption would not
be resistant to chosen-plaintext attacks:

In the example presented above, if the intruder is able to predict that the vector IV1 will be used
by the attacked system to produce the response c1, they can guess which one of the two
encrypted messages m0 or m1 is carried by the response c1. This situation breaks the rule that
the intruder shouldn't be able to distinguish between two ciphertexts even if they have chosen
both plaintexts. Therefore, the attacked system is vulnerable to chosen-plaintext attacks.
If the vector IV is generated based on non-random data, for example the user password, it should
be encrypted before use. One should use a separate secret key for this activity.
The secret key should be changed after it has been used to encrypt a certain number of blocks. It can
be shown that, even with properly created IVs, a key used for too many blocks makes the system
vulnerable to chosen-plaintext attacks. For the AES cipher the limit is estimated to be about
$2^{48}$ blocks, while for 3DES it is about $2^{16}$ plaintext blocks.

PCBC (Propagating or Plaintext Cipher-Block Chaining) Mode
The PCBC mode is similar to the previously described CBC mode. It also mixes bits from the
previous and current plaintext blocks, before encrypting them. In contrast to the CBC mode, if one
ciphertext bit is damaged, the next plaintext block and all subsequent blocks will be damaged and
unable to be decrypted correctly.

In the PCBC mode both encryption and decryption can be performed using only one thread at
a time.

Encryption in the PCBC mode


Decryption in the PCBC mode

CFB (Cipher Feedback) Mode


The CFB mode is similar to the CBC mode described above. The main difference is that one
should encrypt the ciphertext block from the previous round (so not the plaintext block) and then XOR
the output with the plaintext bits. This does not affect the cipher security, but it means that
the same encryption algorithm (as was used for encrypting plaintext data) should be used during
the decryption process.

Encryption in the CFB mode

Decryption in the CFB mode

If one bit of a plaintext message is damaged, the corresponding ciphertext block and all
subsequent ciphertext blocks will be damaged. Encryption in CFB mode can be performed only
by using one thread.

On the other hand, as in CBC mode, one can decrypt ciphertext blocks using many threads
simultaneously. Similarly, if one ciphertext bit is damaged, only two received plaintext blocks will
be damaged.

As opposed to the previous block cipher modes, the encrypted message doesn't need to be
padded to a size that is an integer multiple of the block length.
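The CFB data flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function E is a stand-in for block cipher encryption built from truncated HMAC-SHA256 (chosen so that the example runs with the Python standard library alone, a real system would use a cipher such as AES), and the names E, cfb_encrypt and cfb_decrypt are hypothetical.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes


def E(key: bytes, block: bytes) -> bytes:
    """Stand-in for block cipher encryption (assumption): truncated HMAC-SHA256."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    # zip stops at the shorter input, so a short final block needs no padding
    return bytes(x ^ y for x, y in zip(a, b))


def cfb_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    out, prev = b"", iv
    for i in range(0, len(plaintext), BLOCK):
        c = xor(plaintext[i:i + BLOCK], E(key, prev))  # encrypt previous ciphertext
        out += c
        prev = c
    return out


def cfb_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = b"", iv
    for i in range(0, len(ciphertext), BLOCK):
        c = ciphertext[i:i + BLOCK]
        out += xor(c, E(key, prev))  # decryption also uses E, never its inverse
        prev = c
    return out
```

Note that decryption only needs the ciphertext blocks, which are all known in advance, so each loop iteration of cfb_decrypt could run in a separate thread.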
OFB (Output Feedback) Mode
Algorithms that work in the OFB mode create keystream bits that are used to encrypt
subsequent data blocks. In this respect, the block cipher works similarly to a typical
stream cipher.

Encryption in the OFB mode

Decryption in the OFB mode

Because of the continuous creation of keystream bits, both encryption and decryption can
be performed using only one thread at a time. As in the CFB mode, both data encryption
and decryption use the same cipher encryption algorithm.

If one bit of a plaintext or ciphertext message is damaged (for example because of a transmission
error), only one corresponding ciphertext or respectively plaintext bit is damaged as well. It is
possible to use various correction algorithms to restore the previous value of damaged parts of
the received message.

The biggest drawback of OFB is that the repetition of encrypting the initialization vector may
produce the same state that has occurred before. It is an unlikely situation but in such a case the
plaintext will start to be encrypted by the same data as previously.

CTR (Counter) Mode


The CTR mode makes a block cipher work similarly to a stream cipher. As in the OFB
mode, keystream bits are created regardless of the content of the encrypted data blocks. In this
mode, subsequent values of an increasing counter are added to a nonce value (a nonce is a
number that is unique: a number used once) and the results are encrypted as usual. The nonce
plays the same role as the initialization vectors in the previous modes.

Encryption in the CTR mode

Decryption in the CTR mode

It is one of the most popular block cipher modes of operation. Both encryption and decryption
can be performed using many threads at the same time.

If one bit of a plaintext or ciphertext message is damaged, only one corresponding output bit is
damaged as well. Thus, it is possible to use various correction algorithms to restore the previous
value of damaged parts of received messages.

The CTR mode is also known as the SIC mode (Segmented Integer Counter).
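A minimal sketch of the counter mode described above. It assumes a stand-in encryption function E built from truncated HMAC-SHA256 (an assumption made so that the example runs with the Python standard library, not a real block cipher), a 16-byte nonce, and counter values added to the nonce modulo 2^128. The name ctr_crypt is hypothetical.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes


def E(key: bytes, block: bytes) -> bytes:
    """Stand-in for block cipher encryption (assumption): truncated HMAC-SHA256."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ctr_crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts: the same keystream is XORed in both directions."""
    start = int.from_bytes(nonce, "big")
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        # counter value added to the nonce, wrapped to the block size
        value = (start + i // BLOCK) % 2 ** (8 * BLOCK)
        keystream = E(key, value.to_bytes(BLOCK, "big"))
        out += bytes(a ^ b for a, b in zip(data[i:i + BLOCK], keystream))
    return bytes(out)
```

Each keystream block depends only on the key, the nonce and the block index, which is why the loop iterations are independent and could be distributed over many threads.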

Security of the CTR mode


As in the case of the CBC mode, one should change the secret key after using it for encrypting
a number of sent messages. It can be proved that the CTR mode generally provides quite good
security and that the secret key needs to be changed less often than in the CBC mode.

For example, for the AES cipher the secret key should be changed after about 2^64 plaintext
blocks.
Pseudorandom Generator (PRG)
Pseudorandom generators (PRG) are used to create random-looking sequences of numbers in
deterministic devices. All computer algorithms are strictly deterministic. PRGs allow encryption of
many data blocks using data generated from secret keys which have only a few bits.
Definition A pseudorandom generator (PRG) is an efficient and deterministic function which
returns a longer pseudorandom output sequence based on the received shorter input:
G: {0,1}^s -> {0,1}^n
where:

• n >> s
A pseudorandom generator has to be unpredictable. There must not be any efficient algorithm that,
after receiving the previous output bits from the PRG, would be able to predict the next output bit
with probability non-negligibly higher than 0.5.
Pseudorandom generators are used for creating pseudorandom functions and permutations,
which are widely used in cryptography (for example, for implementation of block ciphers).
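As a rough illustration of the definition, the sketch below expands a short seed into a longer output and uses it as a keystream. It assumes the SHAKE-256 extendable-output function from the Python standard library as a stand-in for G. This is an illustrative choice, not a construction taken from the text, and the names G and stream_encrypt are hypothetical.

```python
import hashlib


def G(seed: bytes, n: int) -> bytes:
    """Expand a short seed into n pseudorandom bytes (SHAKE-256 as a stand-in PRG)."""
    return hashlib.shake_256(seed).digest(n)


def stream_encrypt(seed: bytes, data: bytes) -> bytes:
    """XOR the data with the generator output; applying it twice restores the data."""
    keystream = G(seed, len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))
```

The function is deterministic: the same seed always yields the same output, so the seed plays the role of the short secret key.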

PRG Implementation
Nowadays, pseudorandom generators are implemented in most operating systems
(for example /dev/random in Linux) and in many libraries for various programming languages.
In general, their behaviour is similar.
First, the algorithm initializes the internal state of the generator based on some external
information (for example, the current time or temperature). Then, the bytes of the state are mixed
continuously for as long as the generator works. The changes are based on various external and
random input data: the frequency and manner of the user's keyboard and mouse activity,
network traffic, hardware interrupts and other kinds of information from outside the deterministic
environment where the algorithm works.

The pseudorandom generator algorithm continuously changes its internal state. The internal state
is then used to generate output sequences of numbers, which should be as random as possible.
All the modifications of the state are performed in a way that is supposed to provide the best
possible protection against sequence analysis of the produced output data.

PRG Output Quality


There are many standards that describe the requirements which pseudorandom generators
should fulfil. For example, the American National Institute of Standards and Technology is the
author of several standards, such as NIST SP 800-90.
There exist many different statistical tests that can be used to assess the quality
of pseudorandom generators. They check whether the received sequences look random and
unpredictable. Some statistical test examples include:

o the number of 1 bits in the produced sequence is similar to the number of 0 bits
o the number of 00 pairs in the produced sequence is more or less equal to one-fourth
of all pairs
o the length of the longest sequence of zeros or ones is similar to its mathematical
estimation
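The three checks listed above can be sketched as simple counting functions. This is a simplified illustration, not an implementation of any standardized test suite. The helper names are hypothetical, and pairs are counted without overlapping.

```python
import os


def bits_of(data: bytes) -> str:
    """Return the data as a string of '0' and '1' characters."""
    return "".join(f"{byte:08b}" for byte in data)


def ones_fraction(bits: str) -> float:
    """Fraction of 1 bits; close to 0.5 for random-looking data."""
    return bits.count("1") / len(bits)


def pair_fraction(bits: str, pattern: str = "00") -> float:
    """Fraction of non-overlapping bit pairs equal to pattern; close to 0.25."""
    pairs = [bits[i:i + 2] for i in range(0, len(bits) - 1, 2)]
    return pairs.count(pattern) / len(pairs)


def longest_run(bits: str) -> int:
    """Length of the longest run of identical bits."""
    best = current = 1
    for previous, following in zip(bits, bits[1:]):
        current = current + 1 if previous == following else 1
        best = max(best, current)
    return best


sample = bits_of(os.urandom(4096))
print(ones_fraction(sample), pair_fraction(sample), longest_run(sample))
```

Passing such checks is necessary but far from sufficient: a sequence can have perfect statistics and still be fully predictable.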

Pseudorandom Functions and Permutations
Pseudorandom functions are efficient and deterministic functions which return pseudorandom
output indistinguishable from random sequences. They are built from pseudorandom
generators but, unlike them, in addition to the internal state they can accept any input data.
The input may be arbitrary but the output must always look completely random.
A pseudorandom function whose output is indistinguishable from random sequences is called
a secure one.

Definition The pseudorandom function (PRF) defined over (K, X, Y) is an efficient
and deterministic function which returns a pseudorandom output sequence:
F: K x X -> Y

Pseudorandom permutations can be defined in a similar way. They create output data
indistinguishable from random sequences.

Definition The pseudorandom permutation (PRP) defined over (K, X) is an efficient
and deterministic function which returns a pseudorandom output sequence:
E: K x X -> X
such that:

• the function E(k, .) is one-to-one
• there exists an efficient inversion algorithm D(k, x)
Pseudorandom permutations which produce output sequences that are indistinguishable from
random sequences are called secure PRPs. It can be proved that a secure PRP defined over a large
enough X is also a secure PRF (according to the PRF Switching Lemma).

Creating PRF from PRG


Let G be a secure pseudorandom generator. Having G, it is possible to create a secure
pseudorandom function F.
1-bit pseudorandom function
Let G return twice as many bits as it has in its internal state:
G: K -> K^2
It is possible to define a 1-bit pseudorandom function:
F: K x {0, 1} -> K
as:
F(k, b) = G(k)[b]
where:
b can be equal to either 0 or 1.
The function F returns the first or the second half of generator's output data, based on
the received input bit.
If the generator G is secure, then the function F is also secure.
2-bit pseudorandom function

Having the generator G, one can expand it and define a new generator G1:
G1: K -> K^4
as:
G1(k) = G(G(k)[0]) || G(G(k)[1])
Analogously, it is possible to define an expanded pseudorandom function F1, which takes 2 bits
as its input:
F1: K x {0, 1}^2 -> K
as:
F1(k, c) = G1(k)[c]
where:
c is any two bits.
N-bit pseudorandom function
The presented procedure can be repeated any number of times, giving a pseudorandom function
that accepts an input of any chosen size. This method of creating pseudorandom
functions is known as the Goldreich-Goldwasser-Micali construction, named after the people
who invented it.
It can be proved that if the generator G is secure, then a pseudorandom function working on input
data of length of n bits and defined in a way described above is also secure.
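The repeated expansion can be sketched as a walk down a binary tree, where each input bit selects one half of the generator output. The sketch assumes a 16-byte key and uses SHA-256 as a stand-in for a length-doubling generator G. This is an illustrative assumption, not a proven PRG, and the names G and ggm_prf are hypothetical.

```python
import hashlib

KEY_LEN = 16  # size of an element of K, in bytes


def G(k: bytes):
    """Length-doubling generator stand-in: returns the pair (G(k)[0], G(k)[1])."""
    out = hashlib.sha256(k).digest()  # 32 bytes = 2 * KEY_LEN
    return out[:KEY_LEN], out[KEY_LEN:]


def ggm_prf(k: bytes, bits: str) -> bytes:
    """Evaluate the n-bit function on a string of '0'/'1' characters."""
    for b in bits:
        k = G(k)[int(b)]  # keep the first or the second half, then expand again
    return k


key = b"\x00" * KEY_LEN
# The 1-bit and 2-bit cases match the formulas F(k, b) and F1(k, c) above
assert ggm_prf(key, "1") == G(key)[1]
assert ggm_prf(key, "10") == G(G(key)[1])[0]
```

Only n calls to G are needed for an n-bit input, even though the full tree has 2^n leaves.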
One-way Function
One-way functions are easy to compute but it is very difficult to compute their inverse functions.
Thus, having data x it is easy to calculate f(x) but, on the other hand, knowing the value
of f(x) it is quite difficult to calculate the value of x.
There is no mathematical proof that one-way functions exist. In practical applications, functions
that behave similarly to real one-way functions are used.
One-way functions are key elements of various tools useful in modern cryptography. They are
used in pseudorandom generators, authentication of messages and digital signatures.

Trapdoor one-way function


Trapdoor one-way functions are one-way functions that contain a kind of "back door"
(trapdoor). As in the case of ordinary one-way functions, it is easy to compute their values for given
data but it is very difficult to compute their inverse functions. However, if one has some additional
secret information, one can easily compute the inverse function as well.

A related example is the multiplication of large prime numbers. Computing the product is easy,
but finding the prime factors of a large number is nowadays practically infeasible. On the other
hand, knowing one of the factors, it is easy to compute the other ones.

One-way hash function


One-way hash functions (there are a lot of other names of functions of this type) transform input
messages of various length into output sequences of fixed length (usually shorter). The output
sequence is often called a hash value. Hash values are often used to mark input sequences, that
is to assign to them some unique values that characterize them.
One-way hash functions fulfil all conditions of one-way functions. It is easy to compute their values
based on input data but having only a hash value one can't determine the original input sequence.

A one-way hash function should be collision-free. This means that it should be very difficult to find
two different sequences that produce the same hash value.
Algorithms of one-way hash functions are often known to the public. They provide security thanks
to their properties as one-way functions. Usually, a change of one bit of input data causes
about half of the output bits to change. Data generated by hash functions should be
pseudorandom (it should not be possible to distinguish output data from ordinary random data).

One-way hash functions are used to protect data against intentional or unintentional
modifications. Having some data, one can calculate a checksum that may be attached to
the message and checked by the recipients (by computing the same checksum and comparing it
with the received checksum value). They are used, for example, in the message authentication
algorithm HMAC.
Moreover, hash functions are used for storing data efficiently, in so-called hash tables. Data can
be accessed by finding hash values, which are stored in computer memory.

Hash function expanding


Having a hash function that operates on small blocks of data, one can expand this function and
create a new function that operates on larger input data of various sizes. In such a way it is
possible to compute a hash value for any message. To achieve this, one can use the so-called
Merkle-Damgard construction.
A scheme below presents a way of using multiple hash functions (h: T x X -> T, where T is the set
of all possible hash values) to compute a hash value for the whole message
(H: X^L -> T). Each hash function operates on one data block.

The first h function is initialized by a previously determined vector IV. The last data block should
be padded to the full length using previously agreed bits. A popular method of aligning the length
of the last block is to append an additional bit equal to 1 and then fill the rest of
the block with bits equal to 0. This makes it possible to determine precisely the end of the real message.
It may be proved that if the function h is collision-free, then the function H (which is created based
on functions h) is also collision-free. Every collision in the function H would automatically result in
a collision in the function h.
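The chaining can be sketched as follows. The compression function h here is a toy stand-in (truncated SHA-256 over the chaining value and the block), the block and tag sizes are arbitrary assumptions, and the names h, pad_message and H are hypothetical. Practical designs additionally encode the message length in the padding, which is omitted here to follow the description above.

```python
import hashlib

BLOCK = 16        # size of one message block (an element of X), in bytes
TAG = 16          # size of a chaining value (an element of T), in bytes
IV = bytes(TAG)   # previously determined initial vector (all zeros here)


def h(chain: bytes, block: bytes) -> bytes:
    """Toy compression function h: T x X -> T (assumption: truncated SHA-256)."""
    return hashlib.sha256(chain + block).digest()[:TAG]


def pad_message(message: bytes) -> bytes:
    """Append a 1 bit (byte 0x80), then 0 bits up to a block boundary."""
    padded = message + b"\x80"
    return padded + bytes(-len(padded) % BLOCK)


def H(message: bytes) -> bytes:
    """Hash a message of any length by chaining h over its blocks."""
    chain = IV
    data = pad_message(message)
    for i in range(0, len(data), BLOCK):
        chain = h(chain, data[i:i + BLOCK])
    return chain
```

Each call to h consumes one block and passes its result to the next call, so the final chaining value depends on every block of the message.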

Hash functions based on block ciphers


There are a few popular ways of creating one-way hash functions, that operate on input data of
various lengths, using algorithms of block ciphers.

The Davies-Meyer hash function (denoted h) uses the encryption algorithm E that operates
on subsequent data blocks:
h(H, m) = E(m, H) XOR H
A scheme of Davies-Meyer function is presented below:

It can be proved that if E is a secure block cipher algorithm, then an intruder would have to
perform about O(2^(n/2)) encryption operations to find a collision (that is, to find two messages
with the same hash value).
Another kind of hash functions based on block ciphers are Miyaguchi-Preneel functions. There
are 12 kinds of those functions, for example:
h(H, m) = E(m, H) XOR H XOR m
or:
h(H, m) = E(H XOR m, m) XOR m

Message Authentication Code (MAC)
A message authentication code (often called MAC) is a block of a few bytes that is used
to authenticate a message. The receiver can check this block and be sure that the message
hasn't been modified by a third party.

The abbreviation MAC can also be used for describing algorithms that can create
an authentication code and verify its correctness.

Definition MAC defined over (K, M, T) is a pair of algorithms (S, V):

• S(k, m): returns a message authentication code t which belongs to a set T
• V(k, m, t): returns a value true or false depending on the correctness of
the received authentication code
where:

• M is a set of all possible messages m,
• K is a set of all possible keys k,
• T is a set of all possible authentication codes t
The simplest way to mark the authenticity of the message is to compute its checksum, for
example using the CRC algorithm. One can attach the result to the transmitted message.

The primary disadvantage of this method is the lack of protection against intentional modifications
of the message content. The intruder can change the message, then calculate a new checksum,
and finally replace the original checksum with the new value. An ordinary CRC algorithm only
makes it possible to detect randomly damaged parts of messages (but not intentional changes
made by the attacker).

The following paragraphs present several MAC algorithms that provide security against
intentional changes of authentication codes.

MAC algorithms based on PRF


It is possible to create secure MAC algorithms using a secure pseudorandom function (PRF).
Definition Having a pseudorandom function F: (K x X) -> Y one can define a pair of secure
MAC algorithms (S, V) as:

• S(k, m) := F(k, m)
• V(k, m, t): returns a value true if t = F(k, m) or false otherwise
The set Y should be large enough that the probability of guessing the
result of the F function is negligible.
For example, to authenticate a 16-byte long message one can use the AES encryption algorithm or
any other similar symmetric cipher that operates on data blocks of size of 16 bytes.
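The pair (S, V) from the definition can be sketched directly. The sketch assumes HMAC-SHA256 from the Python standard library as a stand-in for the pseudorandom function F (an illustrative choice, since the standard library does not include AES). The comparison uses hmac.compare_digest so that verification time does not depend on where the codes differ.

```python
import hashlib
import hmac


def F(k: bytes, m: bytes) -> bytes:
    """Stand-in pseudorandom function (assumption): HMAC-SHA256."""
    return hmac.new(k, m, hashlib.sha256).digest()


def S(k: bytes, m: bytes) -> bytes:
    """Signing algorithm: the authentication code is simply F(k, m)."""
    return F(k, m)


def V(k: bytes, m: bytes, t: bytes) -> bool:
    """Verification algorithm: recompute the code and compare in constant time."""
    return hmac.compare_digest(t, F(k, m))


key = b"shared secret key"
tag = S(key, b"16-byte message!")
assert V(key, b"16-byte message!", tag)
assert not V(key, b"16-byte massage!", tag)
```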
The following paragraphs present some MAC algorithms that allow protecting longer messages.

CBC MAC
CBC MAC is based on a pseudorandom function (for convenience called F). It works similarly to
encryption performed in the CBC mode, with a difference that intermediate values are
not returned. Moreover, after encryption of the last data block, one additional encryption of
the current result is performed using the second secret key.
The additional encryption is performed to protect the calculated code. The whole process,
including the last additional step, is often referred to as ECBC MAC (Encrypted MAC), in contrast
to the previous algorithm steps called Raw CBC MAC.

Without the last algorithm step (that is, without encryption using the second key), an intruder could
attack CBC MAC security using a chosen-plaintext attack:
1. The intruder chooses a message m of size of one block.
2. The intruder obtains a value of authentication code of the message from the attacked
system: t = F(k, m).
3. At this moment, the attacker can determine a value of authentication code of
the message m1 of the size of two blocks m1 = (m, t XOR m):
rawCBC(k, m1) = rawCBC(k, (m, t XOR m)) = F(k, F(k, m) XOR (t XOR m))
= F(k, t XOR (t XOR m)) = F(k, m) = t
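The three steps above can be replayed in code. The sketch assumes a stand-in function F built from truncated HMAC-SHA256 (an assumption made so that the example runs with the Python standard library) and a zero initial chaining value. The names raw_cbc and ecbc are hypothetical.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes


def F(k: bytes, x: bytes) -> bytes:
    """Stand-in pseudorandom function on one block (assumption)."""
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def raw_cbc(k: bytes, blocks) -> bytes:
    state = bytes(BLOCK)
    for block in blocks:
        state = F(k, xor(state, block))
    return state


def ecbc(k1: bytes, k2: bytes, blocks) -> bytes:
    """Raw CBC MAC followed by one more encryption under a second key."""
    return F(k2, raw_cbc(k1, blocks))


k = b"secret key"
m = b"pay Bob 10 coins"        # exactly one 16-byte block
t = raw_cbc(k, [m])            # step 2: code obtained from the attacked system
forged = [m, xor(t, m)]        # step 3: two-block message built by the intruder
assert raw_cbc(k, forged) == t  # the old code is valid for the new message
```

With the additional encryption step the same trick no longer works, because the intruder sees only F(k2, t) and never the intermediate value t.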
CBC MAC can protect a message of any length, from one to many blocks. To ensure security
while using CBC MAC, one should change the secret key from time to time. It can be proved that
after sending a number of messages roughly equal to the square root of the number of all
possible values of data blocks, the key is no longer safe.

The last data block should be filled up to the full length using previously agreed bits.
The additional bits should clearly determine the end of the original message to prevent attackers
from exploiting a potential ambiguity. A popular method of aligning the length of the last block is to
append an additional bit equal to 1 and then fill the rest of the block with bits equal to 0. If
there is not enough free space in the last block, one should add one more extra block and fill it
with the additional padding bits.
For comparison, adding only zeros would make it unclear where the last bit of the transmitted
message is (because the original message may have zeros as its last bits of data). Furthermore,
many messages with different contents that differ only in the number of zeros at the end would have
the same authentication codes. Such a scheme would not be a secure MAC.
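A byte-oriented sketch of this padding rule, assuming 16-byte blocks and messages that consist of whole bytes (so the 1 bit followed by seven 0 bits becomes the byte 0x80). The function names are hypothetical.

```python
BLOCK = 16  # block size in bytes


def pad(message: bytes) -> bytes:
    """Append a 1 bit (as byte 0x80), then 0 bits up to a block boundary."""
    padded = message + b"\x80"
    return padded + bytes(-len(padded) % BLOCK)


def unpad(padded: bytes) -> bytes:
    """Remove the padding: everything from the last 0x80 byte onwards."""
    end = padded.rindex(b"\x80")
    if any(padded[end + 1:]):
        raise ValueError("invalid padding")
    return padded[:end]


# A full last block gets one extra block that contains only padding
assert len(pad(b"a" * 16)) == 32
# Messages that differ only in trailing zeros stay different after padding
assert pad(b"abc") != pad(b"abc\x00")
assert unpad(pad(b"abc\x00")) == b"abc\x00"
```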

ECBC MAC is used in various applications, for example in banking systems (ANSI X9.9, X9.19
and FIPS 186-3 standards). It is often based on the AES algorithm, which is used as the F function.

NMAC
The NMAC algorithm (Nested MAC) is similar to the CBC MAC algorithm described earlier. It uses
a slightly different pseudorandom function F. The function F returns values that are valid
secret keys (thus, not values of data blocks).
As in the case of CBC MAC, after encryption of the last data block, one additional encryption of
the result is performed, using the second secret encryption key. Because the previous result of
encryption of the last data block consists of the same number of bits as the secret key, an
additional sequence of bits (a fixed pad) should be appended to ensure that the result has the same
size as data blocks. NMAC is usually used in systems where the length of data blocks is much
bigger than the size of secret keys.

The last additional encryption is performed to protect the calculated code, as in the case of
CBC MAC. The part of the algorithm that encrypts the subsequent blocks, without the last step
of NMAC, is commonly referred to as a Cascade.
Without the last step of the algorithm (that is, without encryption using the second key),
an intruder would be able to append any number of blocks to the intercepted message with
the correctly calculated authentication code. Then, he could calculate a new authentication code
and attach it to the modified message. As input to the first new added function F, the attacker
would use the original authentication code of the original message.

To ensure NMAC security, one should change the secret key from time to time. It can be proved
that after sending a number of messages roughly equal to the square root of the number of all
possible values of secret keys, the key is no longer safe.

The NMAC algorithm uses the same methods for adding padding bits to the end of the last
incomplete message block, as the CBC MAC algorithm.
CMAC
The CMAC algorithm is similar to the previously described CBC MAC algorithm. It uses the same
pseudorandom function F, which returns numbers that are elements of the set of all possible
values of data blocks.

Instead of the last additional encryption that uses a second key, CMAC uses two additional keys,
one of which is added (XOR) to the input of the last F function. Which of the two keys is used
depends on whether the last message block is completely filled with data bits or must be filled
up with a previously determined sequence of padding bits.

Adding the additional key in the last encryption step protects against appending new blocks of
modified messages by a potential intruder. It is not necessary to use an additional encryption by
the F function, unlike in other MAC algorithms. Thanks to this solution, there is no need to add
an additional block to make room for padding (it is enough to choose the correct additional key).

CMAC is considered to be secure. It provides a safe way for message authentication. It is
recommended, for example, by the American institute NIST (in Special Publication 800-38B).

PMAC
The PMAC algorithm (Parallel MAC) can be performed using many threads at a time, unlike the other
MAC algorithms described above (which require sequential processing of data blocks).
PMAC uses two secret encryption keys. The first secret key is used in P functions. All P functions
also receive the subsequent values of an additional counter. Output bits of P functions are
XORed with data blocks. The result is encrypted by a pseudorandom function F, which uses
the second secret key. The P function should be simple and it should work much faster
than the F function.

Output bits from all F functions and output bits from the last data block (which is not encrypted by
the F function) are XORed together, then the result is encrypted using the F function
algorithm, with the second secret encryption key.

As usual, it can be proved that PMAC is secure if the secret key is changed from time to time. A
new key should be created after sending a number of messages that is roughly equal to
the square root of the number of all possible values of data blocks.

PMAC makes it possible to update authentication codes easily and quickly when one of the
message blocks is replaced by a new one. For example, if a block m[x] is replaced by m'[x],
then the following calculation should be performed:
tag' = F(k2, F^-1(k2, tag) XOR F(k2, m[x] XOR P(k1, x)) XOR F(k2, m'[x] XOR P(k1, x)))
One-time MAC
Similar to one-time encryption, one can define a one-time MAC algorithm, which provides security
against potential attacks and is generally faster than other message authentication algorithms
based on PRF functions.
Definition One-time MAC is a pair of algorithms (S, V):

• S(m, k1, k2) := P(m, k1) + k2 (mod q): returns an authentication code t
• V(m, k1, k2, t): returns a value true or false depending on the correctness of
the examined authentication code t
where:

• q is a large prime (about 2^128),
• m is a message that contains L blocks of size of 128 bits,
• k1, k2 are two secret keys; each of them has value from the interval [1, q],
• P(m, x) = m[L]⋅x^L + ... + m[1]⋅x is a polynomial of degree L
It can be proved that two messages secured by using the same keys are indistinguishable for
potential observers.
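The definition translates almost directly into integer arithmetic. The sketch below assumes q = 2^130 - 5, a well-known prime slightly larger than any 128-bit block (the text only says that q is about 2^128, so this specific value is an assumption), and evaluates the polynomial with Horner's rule. The helper blocks_of is hypothetical.

```python
import secrets

Q = 2 ** 130 - 5  # assumption: a known prime larger than any 128-bit block


def blocks_of(message: bytes):
    """Split the message into 128-bit blocks m[1..L], each read as an integer."""
    return [int.from_bytes(message[i:i + 16], "big")
            for i in range(0, len(message), 16)]


def P(m, x: int) -> int:
    """P(m, x) = m[L]*x^L + ... + m[1]*x (mod Q), computed by Horner's rule."""
    acc = 0
    for coefficient in reversed(m):
        acc = (acc + coefficient) * x % Q
    return acc


def S(m, k1: int, k2: int) -> int:
    return (P(m, k1) + k2) % Q


def V(m, k1: int, k2: int, t: int) -> bool:
    return S(m, k1, k2) == t


# Small check by hand: m[1] = 2, m[2] = 3, x = 5 gives 3*25 + 2*5 = 85
assert P([2, 3], 5) == 85

# The key pair must be used for one message only
k1, k2 = secrets.randbelow(Q - 1) + 1, secrets.randbelow(Q - 1) + 1
message = blocks_of(b"a message made of several 128-bit blocks")
assert V(message, k1, k2, S(message, k1, k2))
```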
Carter-Wegman MAC
The construction of Carter-Wegman MAC is based on the idea of one-time MAC. It is extended
by a pseudorandom function to allow using one secret key many times for subsequent messages.

Definition Having a secure one-time MAC (S, V) defined over sets (M, K_J, T) and a secure
pseudorandom function F: K_F x {0,1}^n -> {0,1}^n, one can define a pair of Carter-Wegman MAC
algorithms:

• S_C-W(m, k_F, k_J) := (r, F(k_F, r) XOR S(m, k_J)): returns an authentication code that is
a pair (r, t_C-W)
• V_C-W(m, k_F, k_J, (r, t_C-W)) := V(m, k_J, F(k_F, r) XOR t_C-W): returns true or false
depending on the correctness of the examined authentication code (r, t_C-W)
where:

• k_J, k_F are two secret keys from the sets K_J and K_F,
• T is the set of all possible values of authentication codes of the one-time MAC algorithm (of
length of n bits),
• r is a random number of length of n bits,
• (r, t_C-W) is the value of an authentication code of the Carter-Wegman algorithm (of length
of 2n bits)

HMAC
HMAC is a popular system of checking message integrity, which is quite similar to NMAC.

The HMAC algorithm uses one-way hash functions to produce unique MAC values.

The input parameters ipad and opad are used to modify the secret key. They may have various
values assigned. It is recommended to choose values that make both inputs to the hash
functions look as dissimilar as possible (that is, values that modify the secret key in two
different ways).
Using a secure hash function (that is, a function for which it is infeasible to find different
inputs that produce the same output) guarantees the security of the HMAC algorithm.
Nowadays, the HMAC algorithm is used in many systems, including some popular Internet
protocols (SSL, IPsec, SSH).
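The role of ipad and opad can be seen in a short sketch of HMAC with SHA-256, written out by hand and compared against the hmac module from the Python standard library. The byte values 0x36 and 0x5C are the constants fixed in the HMAC specification (RFC 2104). The function name hmac_sha256 is hypothetical.

```python
import hashlib
import hmac

BLOCK = 64  # input block size of SHA-256, in bytes


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    # Keys longer than one block are hashed first, then padded with zeros
    if len(key) > BLOCK:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK, b"\x00")
    inner_key = bytes(b ^ 0x36 for b in key)  # key modified by ipad
    outer_key = bytes(b ^ 0x5C for b in key)  # key modified by opad
    inner = hashlib.sha256(inner_key + message).digest()
    return hashlib.sha256(outer_key + inner).digest()


key, message = b"secret key", b"message to protect"
assert hmac_sha256(key, message) == hmac.new(key, message, hashlib.sha256).digest()
```

In practice one would call the hmac module directly and compare codes with hmac.compare_digest. The manual version is shown only to make the two nested hash calls visible.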

Password-Based Encryption (PBE)
Password-based encryption is a popular method of creating strong cryptographic keys.

The strength of the cipher depends on the strength of the secret key. A strong secret key must
contain characters that are not easily predictable, thus the secret key cannot be simply taken
directly from the user's password (because passwords are usually memorable subsets of ASCII or
UTF-8 characters).

Password-based encryption allows creating strong secret keys based on passwords provided by
the users. The produced key bytes are supposed to be as random and unpredictable as possible.

PBE algorithms use a user's password together with some additional input parameters:

o salt
o iteration count
There are two popular PBE standards that describe how to convert password bytes into the secret
key: PKCS #5 (supports ASCII characters) and PKCS #12 (which supports 16-bit characters).
In essence, they use a mixing function based around a secure hash function which is applied a
number of times (specified by an iteration count). After the mixing, the output bytes are used to
create the key for the cipher (together with the initialization vector if needed).
A diagram of PBE algorithms
Salt
The salt is a random number. It is supposed to prevent dictionary attacks. Without the salt,
an intruder could use the same PBE algorithms and create in advance a lot of keys for some popular
phrases, often used as passwords. Adding a random value makes such precomputed keys useless,
because the combined input to the PBE algorithm differs for every salt value. The attacker would
have to repeat the whole computation for each salt separately.

Due to the fact that the salt is random, it is highly unlikely that the same salt would be reused
twice, for multiple encryptions. The salt is not a secret value. It may be transmitted along with
the ciphertext to the receiver.

Salt values are created by pseudorandom number generators. Ideally, the length of the salt
should be the same as the output size of the hash function that was used to create it.

Iteration Count
The key derivation procedure may be made more complicated by running PBE algorithm many
times. This would make the process of creating the secret key much more time consuming. Such
a situation is certainly acceptable for the user, who has to perform the authentication procedure
rarely and doesn't mind short delays. On the other hand, the attacker using brute force
attacks and checking thousands of combinations would suffer significantly due to the increased
time complexity.
Similarly to the salt, the iteration count may be transmitted to the receiver in the clear, along with
the ciphertext.

It is recommended to use at least 1000 iterations to achieve a sufficiently good security level;
newer recommendations generally call for considerably higher counts.
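A short sketch of key derivation with a salt and an iteration count, using PBKDF2 (the key derivation function defined in PKCS #5 version 2) as exposed by hashlib.pbkdf2_hmac in the Python standard library. The choice of SHA-256, the 16-byte salt, the default of 100,000 iterations and the 32-byte key length are assumptions made for illustration, and the name derive_key is hypothetical.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes,
               iterations: int = 100_000, length: int = 32) -> bytes:
    """Derive `length` key bytes from a password, a salt and an iteration count."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt, iterations, dklen=length)


salt = os.urandom(16)                      # random, not secret
key = derive_key("correct horse battery", salt)

# The receiver repeats the derivation with the same salt and iteration count
assert key == derive_key("correct horse battery", salt)
# A different salt yields an unrelated key for the same password
assert key != derive_key("correct horse battery", os.urandom(16))
```

The salt and the iteration count would be stored or sent in the clear next to the ciphertext, as described above. Only the password has to remain secret.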

Software Signing and Authorisation
An important aspect of software security is protection against unauthorized software modifications
and alterations during application updating and loading new versions. In order to achieve that,
various signing and authorisation mechanisms are available.

Nowadays, most embedded devices allow downloading updated software versions and data. The
new software can be loaded into the device by using an application called a boot loader. Boot
loaders are programs that are stored in the device memory and are able to download data from
outside and flash the device.

Secure device architecture


Usually, the software stored in the device consists of a boot loader, which is kept in the read-only
memory, and the operating system or another main application, which is stored in the
reprogrammable (often flash) memory. Such an approach makes it possible to modify the
executable code and data, and to update or patch the applications. The protection must be
provided against unauthorised modification of sensitive code.
Whereas users will be able to modify most of the memory and software functionalities, a boot
loader application and a software digest must be kept in a separate, secure memory area, that
may be read but cannot be overwritten. Usually Read-Only Memory (ROM), Write-Once Read-
Many (WORM) memory, or internal microcontroller flash memory (EEPROM) are used for that
purpose. They are initialized by the manufacturer during the production process. At the same
time, most of the device functionality and its applications can be modified and debugged freely,
by using normal development tools.

Signing the software prevents it from being further modified without detection. Also, a secure
digest makes it possible to authenticate the producer (or, at least, the approver) of the
application version, thus making sure that no malicious third-party application is executed.

Software signing
The software stored in the device should be executed only if it was successfully validated during
the boot process. Digital signatures are used most often to confirm the software authenticity.

In order to sign the software, the first step is to use a hash function (for example, one of the SHA
algorithms) to calculate a short and unique software digest. The digest has a fixed, relatively short
(compared to the size of the full firmware) length.
The digest should then be encrypted using an asymmetric cipher and a private key, which is
known only to the manufacturer. RSA is one of the most popular ciphers used for digest
encryption.
A corresponding public key should be saved in the read-only memory area, next to the bootloader.
Public keys are often contained in signed digital certificates.

The encrypted digest is stored in the device's normal memory. It changes every time a new
software version is loaded, for example during a software update. During device start-up,
the boot loader should check whether the stored digest value is correct. First, it should decrypt
the digest using the public key. Then, it has to calculate a second digest value by itself, using the
same hash function that was used originally. Finally, the two values should be compared. If they
are not the same, then one should assume that the code was modified after being signed. The
program execution should stop, and the device should turn off.
It is crucial to highlight that this check should take place at the beginning of the software flow,
when the processor still operates in the read-only memory and control has not yet reached the larger,
writeable memory area.
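The sign-then-verify flow can be sketched with textbook RSA on small numbers. Everything about the parameters is an assumption made for illustration: the modulus is the product of two known Mersenne primes and is far too small to be secure, the public exponent 65537 is an arbitrary common choice, and no signature padding scheme is applied, although real systems use one. The function names are hypothetical, and computing the private exponent with pow requires Python 3.8 or newer.

```python
import hashlib

# Toy RSA parameters (NOT secure): two known Mersenne primes
P_PRIME, Q_PRIME = 2 ** 61 - 1, 2 ** 89 - 1
N = P_PRIME * Q_PRIME
E = 65537                                          # public exponent
D = pow(E, -1, (P_PRIME - 1) * (Q_PRIME - 1))      # private exponent


def digest_int(firmware: bytes) -> int:
    """SHA-256 digest of the firmware, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(firmware).digest(), "big") % N


def sign(firmware: bytes) -> int:
    """Manufacturer side: encrypt the digest with the private key."""
    return pow(digest_int(firmware), D, N)


def verify(firmware: bytes, signature: int) -> bool:
    """Boot loader side: decrypt with the public key and compare the digests."""
    return pow(signature, E, N) == digest_int(firmware)


firmware = b"application image v1.0"
signature = sign(firmware)
assert verify(firmware, signature)
assert not verify(firmware + b" (patched)", signature)
```

Only verify, together with the values N and E, needs to live in the read-only memory. The private exponent never leaves the manufacturer.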

In general, every secure microcontroller should start executing software from an internal,
immutable memory. All the software verification procedures should be stored in that read-only,
initially executed memory. Such a location is often called trusted, and the device starting point of
execution is called the root of trust.
Some more complex schemes of signing could be used to protect the code. The digest may be
signed by multiple entities, for example by the producer and the external validation authority. It is
also possible to add intermediate keys to introduce more sophisticated security mechanisms.

Secure software flashing


When a new software version is being loaded into the device, it should be tested and checked, to
make sure that no unauthorised modifications were applied.

Generally, during manufacturing, the code that was developed and prepared for a device is
signed using the proper private key and stored (together with the digest) in a secure database.
The asymmetric cipher algorithms mentioned above are also the ones most often used in this procedure.

While the device is being programmed, the flashing application downloads the software and the
corresponding digest. After boot loader authentication, the secured software (obtained earlier
from the database) is sent to the device.

The boot loader decrypts the received signature, and calculates the software digest. If both digest
values are the same, then the boot loader program should accept the new application version.
Software signing usage today
Secure software flashing is implemented in numerous applications, operating in many areas,
especially in telecommunication, aeronautical, and automotive industries. It is particularly useful
if the software systems are distributed and consist of many, semi-independent devices.

Because many small embedded devices have relatively little computational power,
special care should be taken when using complex hash and encryption algorithms. It is
recommended to configure the asymmetric ciphers so that the public-key
decryption procedure is as cheap as possible. For example, in the case of the RSA cipher, the public
exponent should be as small as possible (say, 3) to make the computation faster.
Index of Coincidence
The index of coincidence measures how likely it is that, when two texts are compared
letter by letter, the two letters at the same position are identical.

The value of the index of coincidence is calculated from the probability of occurrence of
a specified letter and the probability of comparing it to the same letter from the second text (which
is of course determined by the probability of occurrence of the letter in the second text). For
a text of length N and an alphabet with c different letters (for example, for the English
alphabet c = 26), the value of the index of coincidence IC obtained by comparing this text to the same
text shifted relative to the first one by a random number of letters may be presented as:
IC = (n_1(n_1 - 1) + ... + n_c(n_c - 1)) / (N(N - 1) / c),
where n_i is the number of occurrences of the i-th letter in the whole text.
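
The formula can be evaluated directly. The following is a minimal sketch, assuming a 26-letter Latin alphabet and ignoring all other characters; the function name index_of_coincidence is chosen here for illustration.

```python
from collections import Counter
import string


def index_of_coincidence(text: str, alphabet: str = string.ascii_uppercase) -> float:
    """Normalized IC: sum of n_i(n_i - 1) divided by N(N - 1) / c."""
    letters = [ch for ch in text.upper() if ch in alphabet]
    total = len(letters)
    if total < 2:
        raise ValueError("at least two letters are needed")
    counts = Counter(letters)
    numerator = sum(n * (n - 1) for n in counts.values())
    return numerator / (total * (total - 1) / len(alphabet))


print(index_of_coincidence("It was the best of times, it was the worst of times"))
```

A text in which every letter occurs exactly once gives 0, and a text consisting of a single repeated letter gives c, so the value always lies between these two extremes.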
In particular, knowing the letter frequencies in the specified language (f_i), it is possible to
calculate the expected value of the index of coincidence for this language (that means
the expected value of the index of coincidence while comparing texts written in the same
language):
IC_expected = (f_1^2 + ... + f_c^2) / (1/c)
It is easy to notice that if all letters in a specified language were equally frequent, then the expected
value would be equal to 1. Of course, in all the existing languages different letters occur with
different frequencies, so indexes of coincidence for different languages differ from each other. For
English the expected value is equal to 1.73.
One will notice that the index of coincidence calculated for two texts written in two different
languages is usually noticeably smaller than expected indexes of coincidence calculated for these
languages. It is caused by the fact that the letters which are popular in the first text (in the first
language), may be less popular in the second text (written in the second language). Thus,
the probability of meeting the same letters in the compared texts is smaller.

Using IC in cryptography
The index of coincidence is used in cryptography for breaking substitution ciphers that use
a periodically repeated key, and simple XOR ciphers.

IC can be used to determine the length of the secret key if a secret message is encrypted using
one of those ciphers. This may be achieved by comparing (letter by letter or byte by byte)
the encrypted text with the same text shifted by a number of characters equal to
the currently tested key size. For each tested possibility (that is, for each key size, from 1 until
the solution is found) one must calculate the value of IC and record it.
When one tests the correct text offset, which is equal to the length of the secret key, the confusion
introduced by the secret key will disappear:

o in the case of a substitution cipher, the letters in both texts at corresponding
positions are shifted by the same number of characters, or
o in the case of a XOR cipher, changes of all bits in corresponding bytes are
the same.
After finding a correct shift, all compared characters in the first and the second text (although they
are not known) belong to the same language, so after calculating their index of coincidence,
the result will be similar to the expected value of the index of coincidence for the specified
language and it will be much different from the other, previously tested, values of the index of
coincidence (which were calculated for wrong shifts).

When two texts are compared with a wrong offset, letters (bytes) in the first text will be changed
differently than in the second text. Therefore, it is possible to consider the letters as belonging to
other languages, with different frequencies of letter occurrences in the first and the second text.

A significantly larger value of IC will be calculated for all shifts equal to the key length or to
a multiple of it (because the same key is repeated periodically).
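
The key-length search can be sketched for a repeating-key XOR cipher as follows. The code computes the unnormalized coincidence rate (the fraction of matching positions) for each shift. The plaintext, the key and the function names are assumptions made for this example; the key bytes were picked so that wrong shifts produce no coincidences at all in this short lowercase sample, whereas real data is noisier and needs much longer ciphertexts.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Encrypt or decrypt by XOR-ing with a periodically repeated key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def coincidence_rate(data: bytes, shift: int) -> float:
    """Fraction of positions where the text equals its shifted copy."""
    matches = sum(a == b for a, b in zip(data, data[shift:]))
    return matches / (len(data) - shift)


def guess_key_length(data: bytes, max_length: int) -> int:
    """Return the shift with the highest coincidence rate."""
    return max(range(1, max_length + 1), key=lambda s: coincidence_rate(data, s))


PLAINTEXT = (b"the quick brown fox jumps over the lazy dog and "
             b"the index of coincidence reveals the key length")
KEY = bytes([0x13, 0x37, 0x9A, 0xBC])   # hypothetical 4-byte key
CIPHERTEXT = xor_cipher(PLAINTEXT, KEY)

for shift in range(1, 9):
    print(shift, round(coincidence_rate(CIPHERTEXT, shift), 3))
print("guessed key length:", guess_key_length(CIPHERTEXT, 10))
```

The guess may come out as the key length itself or as one of its multiples, so in practice one would usually take the smallest shift with a clearly elevated rate.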

Expected values for some languages


Indexes of coincidence can be calculated for different languages. They depend on the average
frequencies of letters. Of course, the frequencies can be determined only approximately because
in different kinds of texts (scientific, historical, fiction) the frequencies are slightly different.

o English - 1.73
o Russian - 1.76
o Spanish - 1.94
o Portuguese - 1.94
o Italian - 1.94
o French - 2.02
o German - 2.05
Sometimes, the values of indexes of coincidence are presented without normalization (the
normalized value depends on the number of letters in the alphabet). For example, for the English
language, the expected IC value without normalization is equal to:
1.73 / 26 ≈ 0.067
What Is Encryption?

In cryptography, encryption is the process of encoding a message or information in a way
that only authorized parties can access it and those who are not authorized cannot.
Encryption Types / Methods
Asymmetric Encryption

In public-key encryption schemes, the encryption key is published so that anyone can use it
for encrypting messages. Only the receiving party has access to the decryption key that
enables messages to be read. Public-key encryption was first described in a secret
document in 1973. Before that, all encryption schemes were symmetric-key (also called
private-key).

Symmetric Encryption

In symmetric-key schemes, the encryption and decryption keys are the same.
Communicating parties must have the same key in order to achieve secure communication.

Encryption Algorithms
Triple DES Encryption

Triple DES was designed to replace the original Data Encryption Standard (DES) algorithm,
which hackers learned to defeat with ease. At one time, Triple DES was the recommended
standard and the most widely used symmetric algorithm in the industry.

Triple DES uses three individual keys of 56 bits each. The total key length adds up to 168
bits, but because of meet-in-the-middle attacks the effective key strength is about 112 bits.
Though it is slowly being phased out, Triple DES is still a dependable hardware encryption
solution for financial services and other industries.

RSA Encryption

RSA is a public-key encryption algorithm and the standard for encrypting data sent over
the internet. It also happens to be one of the methods used in PGP and GPG programs.

Unlike Triple DES, RSA is considered an asymmetric encryption algorithm because it uses a
pair of keys. The public key is used to encrypt a message and a private key to decrypt it. It
takes attackers quite a bit of time and processing power to break this encryption code.

Advanced Encryption Standard (AES)

The Advanced Encryption Standard (AES) is the algorithm trusted as the standard by the
U.S. government and many other organizations.

Although it is extremely efficient in 128-bit form, AES encryption also uses keys of 192 and
256 bits for heavy-duty encryption.

AES is considered resistant to all attacks, with the exception of brute-force attacks, which
attempt to decipher messages using all possible combinations in the 128-, 192- or 256-bit
cipher. Still, security experts believe that AES will eventually become the standard for
encrypting data in the private sector.

Twofish encryption algorithm

Blowfish encryption algorithm

IDEA encryption algorithm

MD5 encryption algorithm

HMAC encryption algorithm

Encryption Standards
There are a number of standards related to cryptography. The following standards apply
to encryption:

• Data Encryption Standard (now obsolete)
• Advanced Encryption Standard
• RSA (the original public-key algorithm)
• Open PGP
File Encryption Overview
File system-level encryption, often called file and folder encryption, is a form of disk
encryption where individual files or directories are encrypted by the file system itself.

Disk Encryption Overview


Disk encryption is a technology that protects information by converting it into unreadable
code that cannot be deciphered easily by unauthorized users. Disk encryption uses disk
encryption software or hardware to encrypt every bit of data that goes on a disk or disk
volume.

Email Encryption Overview


Email encryption is encryption of email messages designed to protect the content from
being read by entities other than the intended recipients. Email encryption may also
include authentication. Unencrypted email is not secure and may disclose sensitive information. Most
emails are currently transmitted in the clear (unencrypted). With readily
available tools, people other than the designated recipients can read the email content. Email
encryption traditionally uses one of two protocols, either TLS or end-to-end encryption.
Within end-to-end encryption, there are several options, including PGP and S/MIME
protocols.

Encryption Best Practices


1. Know the laws: When it comes to safeguarding personally identifiable
information, organizations must adhere to many overlapping, privacy-related
regulations. The top six regulations that impact many organizations include: FERPA,
HIPAA, HITECH, COPPA, PCI DSS and state-specific data breach notification laws.
2. Assess the data: A security rule under HIPAA does not explicitly require
encryption, but it does state that entities should perform a data risk assessment and
implement encryption if the evaluation indicates that encryption would be a
“reasonable and appropriate” safeguard. If an organization decides not to encrypt
electronic protected health information (ePHI), the institution must document and
justify that decision and then implement an “equivalent alternative measure.”
3. Determine the required or needed level of encryption: The U.S. Department of
Health and Human Services (HHS) turns to the National Institute of Standards and
Technology (NIST) for recommended encryption-level practices. HHS and NIST have
both produced robust documentation for adhering to HIPAA’s Security Rule.
NIST Special Publication 800-111 takes a broad approach to encryption on user
devices. In a nutshell, it states that when there is even a remote possibility of risk,
encryption needs to be in place. FIPS 140-2, which incorporates AES into its
protocols, is an ideal choice. FIPS 140-2 helps education entities ensure that PII is
“rendered unusable, unreadable or indecipherable to unauthorized individuals.” A
device that meets FIPS 140-2 requirements has a cryptographic erase function that
“leverages the encryption of target data by enabling sanitization of the target data’s
encryption key, leaving only the cipher text remaining on the media, effectively
sanitizing the data.”
4. Be mindful of sensitive data transfers and remote access: Encryption must
extend beyond laptops and backup drives. Communicating or sending data over the
internet needs Transport Layer Security (TLS), a protocol for transmitting data over
a network, and AES encryption. When an employee accesses an institution’s local
network, a secure VPN connection is essential when ePHI is involved. Also, before
putting a handful of student files on a physical external device for transfer between
systems or offices, the device must be encrypted and meet FIPS 140-2 requirements
to avoid potential violations.
5. Note the fine print details: Unfortunately, many schools fail to engage in proper
due diligence in reviewing third-party services’ privacy and data-security policies,
and inadvertently authorize data collection and data-mining practices that
parents/students find unacceptable or violate FERPA. Regulatory compliance entails
much more than simply password-protecting an office’s workstations. It requires
using encryption to protect data at rest when stored on school systems or
removable media devices. Remember that data at rest that is outside the school’s
firewall (or “in the wild”) is the top source of security breaches.

All Simple Ciphers


Simple Substitution Ciphers
Simple substitution ciphers replace each plaintext letter with another character. The
transformation is unambiguous and reversible.

Usage
Simple substitution ciphers were already in use in ancient times. They were one of the first
ways (after steganography) to secure messages.
Description
Simple substitution ciphers work by replacing each plaintext character with another character.
To decode ciphertext letters, one should use the reverse substitution and change the letters back.

Before using a substitution cipher, one should choose substitutions that will be used for changing
all alphabet letters. This can be performed by writing all alphabet letters in the alphabetical order
in the first row, and then in the second row the same letters but in any other random order. Letters
from upper and lower rows form pairs that should be used during encryption.

For example, let us consider the following two sequences of letters which define the substitution
cipher:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
TIMEODANSFRBCGHJKLPQUVWXYZ
One can notice that the plaintext letter F is encoded as the letter D, while the ciphertext
letter F corresponds to the plaintext letter J. On the other hand, the letter Z corresponds to Z (it is
not changed during encryption).
The alphabet can be mixed by choosing a keyword (or a few keywords), writing it
down (skipping repeated letters) under the original alphabet letters and filling the remaining
empty positions at the end with the remaining alphabet letters. This makes it easier to remember the
substitutions and to exchange the secret key between all parties.

This method has been used to create the substitution in the example above. A Latin
sentence, timeo Danaos et dona ferentes, was used as the keywords.
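
The keyword method can be sketched in a few lines of Python. The function names are chosen for this illustration, and the sketch assumes the 26-letter Latin alphabet and converts everything to upper case.

```python
import string


def keyword_alphabet(keyword: str) -> str:
    """Keyword letters first (repeats skipped), then the unused letters in order."""
    mixed = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch in string.ascii_uppercase and ch not in mixed:
            mixed.append(ch)
    return "".join(mixed)


def encrypt(plaintext: str, keyword: str) -> str:
    table = str.maketrans(string.ascii_uppercase, keyword_alphabet(keyword))
    return plaintext.upper().translate(table)


def decrypt(ciphertext: str, keyword: str) -> str:
    table = str.maketrans(keyword_alphabet(keyword), string.ascii_uppercase)
    return ciphertext.upper().translate(table)


KEYWORDS = "timeo Danaos et dona ferentes"
print(keyword_alphabet(KEYWORDS))
print(encrypt("SECRET MESSAGE", KEYWORDS))
```

Characters outside the alphabet (spaces, digits, punctuation) pass through unchanged in this sketch, which is a design choice of the example rather than a property of the cipher.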
One of the characteristics of simple substitution ciphers is that different plaintext alphabet letters
would always produce different ciphertext letters. It is not possible that both of them would be
encrypted by the same alphabet letter.

Security of simple substitution ciphers


Because each letter of the alphabet can be mapped to any letter, there
are 26! possible substitution alphabets for the Latin alphabet (which is 26 letters long). This results in
a very large number of combinations. The provided security is approximately equal to the strength of
a cipher with a secret key of 88 bits.
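
The 88-bit estimate can be checked with a short calculation, shown here as a small Python sketch:

```python
import math

# Number of ways to permute the 26 letters of the Latin alphabet
permutations = math.factorial(26)

# Equivalent key size in bits: log2 of the number of possible keys
key_bits = math.log2(permutations)

print(f"{permutations:.3e} possible substitution alphabets")
print(f"equivalent to a key of about {key_bits:.1f} bits")
```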
However, a much more effective approach than a brute-force attack is to use frequency
analysis of ciphertext letters in order to break a simple substitution cipher. The letters in all
languages appear in average texts with different frequencies. Thus, it is much faster to discover
the substitutions used in the cipher by comparing frequent ciphertext letters with the letters that
are frequently used in the language of the message. With this approach, it is assumed
that analysing an encoded message of about 50 letters is sufficient for discovering
the substitutions.
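
The first step of such an analysis, counting the ciphertext letters, can be sketched as follows. The function name and the sample ciphertext are assumptions made for this illustration; matching the most frequent ciphertext letters against the most frequent letters of the language is left to the analyst.

```python
from collections import Counter


def letter_frequencies(ciphertext: str) -> list:
    """Return (letter, count) pairs, most frequent first; non-letters are ignored."""
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    return Counter(letters).most_common()


SAMPLE = "QNO MLJPQHALTJNY SP ODPY QH ILOTR"   # hypothetical ciphertext
for letter, count in letter_frequencies(SAMPLE)[:5]:
    print(letter, count)
```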
Simple Substitution Ciphers:
Caesar Cipher
SIMPLE SUBSTITUTION CIPHER

The Caesar cipher is a simple substitution cipher, which replaces each plaintext letter by a
different letter of the alphabet. The cipher is named after Gaius Julius Caesar (100 BC – 44 BC),
who used it for communication with his friends and allies.

Usage
Julius Caesar encrypted his correspondence in many ways, for example by writing texts
in reverse order or writing Latin texts using Greek letters. Some ancient authors (for example,
the Roman historian Gaius Suetonius Tranquillus, who lived in the first century of our era) wrote
that he was using the cipher with various shifts, of one or three characters.

Algorithm
The Caesar cipher is one of the simplest substitution ciphers.

Each plaintext letter is replaced by another one, which is offset by a certain amount of alphabet
positions (always in the same direction). If the algorithm points to the position after the last letter
in the alphabet, one should move to the beginning of the alphabet.

The cipher can be presented using mathematical formulas for encrypting and decrypting
characters:
E_n(x) = (x + n) mod 26
D_n(x) = (x - n) mod 26
where:
n is the offset (the secret key) and 26 is the total number of letters in the Latin alphabet
(of course, for other languages one should use a different number).

Security of the Caesar cipher


The Caesar cipher can be easily broken using brute force attacks. To discover the plaintext, one
only needs to check all 26 possible offsets (for the Latin alphabet).
Like other substitution ciphers, the Caesar cipher can also be attacked using ciphertext-only
attacks, such as letter frequency analysis of the ciphertext.
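
A brute force attack on the Caesar cipher can be sketched as below. The helper names are chosen for this illustration, and picking the readable candidate is left to a human (or to a frequency test).

```python
def shift_text(text: str, offset: int) -> str:
    """Shift Latin letters by offset positions; other characters are kept."""
    result = []
    for ch in text:
        if "a" <= ch <= "z":
            result.append(chr((ord(ch) - ord("a") + offset) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            result.append(chr((ord(ch) - ord("A") + offset) % 26 + ord("A")))
        else:
            result.append(ch)
    return "".join(result)


def brute_force(ciphertext: str) -> list:
    """candidates[n] is the plaintext under the assumption that the key was n."""
    return [shift_text(ciphertext, -n) for n in range(26)]


for key, candidate in enumerate(brute_force("Khoor, Zruog!")):
    print(key, candidate)
```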

Implementation
Simple encryption and decryption functions implemented in Python:

KEY = 3

def encrypt(text):
    # Shift every Latin letter KEY positions forward, wrapping around after z/Z.
    # Characters that are not Latin letters are skipped.
    encrypted = ""
    for ch in text:
        if ord(ch) >= ord('a') and ord(ch) <= ord('z'):
            newCode = ord(ch) + KEY
            if (newCode > ord('z')):
                newCode -= 26
            encrypted += chr(newCode)
        if ord(ch) >= ord('A') and ord(ch) <= ord('Z'):
            newCode = ord(ch) + KEY
            if (newCode > ord('Z')):
                newCode -= 26
            encrypted += chr(newCode)
    return encrypted

def decrypt(text):
    # Shift every Latin letter KEY positions backward, wrapping around before a/A.
    decrypted = ""
    for ch in text:
        if ord(ch) >= ord('a') and ord(ch) <= ord('z'):
            newCode = ord(ch) - KEY
            if (newCode < ord('a')):
                newCode += 26
            decrypted += chr(newCode)
        if ord(ch) >= ord('A') and ord(ch) <= ord('Z'):
            newCode = ord(ch) - KEY
            if (newCode < ord('A')):
                newCode += 26
            decrypted += chr(newCode)
    return decrypted

ROT13
SIMPLE SUBSTITUTION CIPHER

ROT13 is a simple substitution cipher. It is a special case of the Caesar cipher. It came
into use in the early 1980s.

Usage
The cipher is often used for hiding the content of transmitted information and for evading automatic
algorithms that check the words used in messages. It is used for hiding e-mail addresses from
spambots or prohibited content in posts on internet forums.

Algorithm
The ROT13 algorithm shifts characters by 13 positions in the Latin alphabet (which
contains 26 letters in total).

For example, the letter A is encrypted as N, and the letter B is encrypted as O.


ROT13 is simple to use because applying it twice to a letter produces exactly the same letter:
ROT13(ROT13(x)) = x
Therefore, the ROT13 decryption algorithm simply applies the same encryption function to
the ciphertext characters.

Security of the ROT13 cipher


The cipher doesn't have a secret key and to read a message encrypted using ROT13, one must
only know that this particular cipher has been used.

Interestingly (and sadly), there are known cases of using the cipher in respectable and
popular applications (Netscape Communicator, ebooks of New Paradigm Research Group)
in order to protect the stored data.

Implementation
ROT13 can be implemented using a Linux command tr:
alias rot13="tr a-zA-Z n-za-mN-ZA-M"
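
The same mapping can be sketched in Python with a translation table that mirrors the tr command; the function name rot13 is chosen for this illustration.

```python
import string

LOWER = string.ascii_lowercase
UPPER = string.ascii_uppercase

# Map a-z to n-za-m and A-Z to N-ZA-M, exactly like the tr arguments
ROT13_TABLE = str.maketrans(
    LOWER + UPPER,
    LOWER[13:] + LOWER[:13] + UPPER[13:] + UPPER[:13],
)


def rot13(text: str) -> str:
    """Apply ROT13; the same call both encrypts and decrypts."""
    return text.translate(ROT13_TABLE)


print(rot13("Hello, World!"))
print(rot13(rot13("Hello, World!")))
```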
Homophonic Substitution Ciphers
Homophonic substitution ciphers convert each plaintext character to one of the previously
determined letters or graphic symbols.

Usage
Homophonic substitution ciphers were invented as an improvement of simple substitution ciphers.
They were very popular during the Renaissance and they were used by diplomats in Europe for
many centuries.
Description
Homophonic substitution ciphers work by replacing each plaintext character by another character,
number, word or even graphic symbol. To decode ciphertext letters, one should use the reversed
substitution and change the characters back.

The main motivation of introducing such types of ciphers was a possibility to obscure frequencies
of ciphertext characters. Usually popular letters are replaced by one of several characters,
numbers or phrases. Different replacements are used randomly thus frequency analysis is much
more difficult.

Because of the fact that all 26 letters of the Latin alphabet should be replaced by many
corresponding phrases, the most popular technique is to assign a few numbers to each letter.
One can also expand the alphabet and add a few new characters, for example by assigning
different meanings to small and large letters, writing letters upside down or inventing new graphic
symbols.
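
The technique of assigning a few numbers to each letter can be sketched as follows. The homophone counts in HOMOPHONES are hypothetical values chosen for this example (a real table would follow the letter frequencies of the language), and the function names are illustrative.

```python
import random
import string

# Hypothetical numbers of homophones for frequent letters; all others get one
HOMOPHONES = {"E": 4, "T": 3, "A": 3, "O": 3, "I": 2, "N": 2}


def build_table() -> dict:
    """Assign consecutive numbers to letters, several to the frequent ones."""
    table = {}
    next_code = 0
    for letter in string.ascii_uppercase:
        count = HOMOPHONES.get(letter, 1)
        table[letter] = list(range(next_code, next_code + count))
        next_code += count
    return table


def encrypt(plaintext: str, table: dict) -> list:
    """Replace each letter with one of its numbers, chosen at random."""
    return [random.choice(table[ch]) for ch in plaintext.upper() if ch in table]


def decrypt(codes: list, table: dict) -> str:
    """Map every number back to the single letter it stands for."""
    reverse = {code: letter for letter, group in table.items() for code in group}
    return "".join(reverse[code] for code in codes)


TABLE = build_table()
ciphertext = encrypt("attack at dawn", TABLE)
print(ciphertext)
print(decrypt(ciphertext, TABLE))
```

Because the replacement is chosen at random, encrypting the same message twice usually gives different ciphertexts, while decryption stays unambiguous.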

Homophonic Substitution Cipher:


Book Cipher
HOMOPHONIC SUBSTITUTION CIPHER

The first mention of book ciphers appeared in 1526 in the works of Jacobus Silvestri.

Usage
The first book ciphers were invented around seventy years after the first efficient methods of
printing books were developed in the 15th century. Thanks to their simplicity, they were used for
the next few hundred years.

Algorithm
There are several variants of the book cipher. The most popular method consists of
replacing each letter of the plaintext by three numbers: the number of a page, the number of
a line and the number of a character in the line. The numbers are chosen in such a way
as to indicate the same letter as in the plaintext. Therefore the ciphertext consists of long
sequences of numbers. During decryption one must find each letter of the message pointed to by
a triple of numbers.

The other type of the book cipher consists in replacing each letter of the plaintext by two
numbers - the number of a page and the number of a word on the page. Both sides must agree
in advance, which letter of the pointed words will be used for encryption (for example the first one
or - because usually the first letters of words are generally less diverse - the second one or
the third one).

In order to use the book cipher, both sides must agree in advance to use exactly the same
book (including exactly the same edition) during their communication. Because of its popularity
and because all its verses are numbered, a common practice is to use the Bible as the key.
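
The page, line and character variant can be sketched as follows. The tiny two-page BOOK is a made-up stand-in for a real, agreed-upon edition, and the function names are chosen for this illustration.

```python
import random

# A hypothetical two-page "book"; each page is a list of lines
BOOK = [
    ["the quick brown fox", "jumps over the lazy dog"],
    ["pack my box with", "five dozen liquor jugs"],
]


def locations(book: list) -> dict:
    """Index every letter by its (page, line, character) positions, counted from 1."""
    index = {}
    for p, page in enumerate(book, start=1):
        for l, line in enumerate(page, start=1):
            for c, ch in enumerate(line, start=1):
                if ch.isalpha():
                    index.setdefault(ch, []).append((p, l, c))
    return index


def encrypt(plaintext: str, book: list) -> list:
    """Replace each letter with a randomly chosen position of that letter."""
    index = locations(book)
    return [random.choice(index[ch]) for ch in plaintext.lower() if ch.isalpha()]


def decrypt(triples: list, book: list) -> str:
    """Look up each (page, line, character) triple in the book."""
    return "".join(book[p - 1][l - 1][c - 1] for p, l, c in triples)


ciphertext = encrypt("Attack at dawn", BOOK)
print(ciphertext)
print(decrypt(ciphertext, BOOK))
```

Since a letter usually occurs at many positions in a book, the cipher behaves like a homophonic substitution: the same plaintext letter can be encoded by many different triples.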

Security of the book cipher


The book cipher has two main weaknesses. First, its use is time-consuming. Each character must
be encoded by a few numbers and must be separately found in the given book during decryption.

Second, there is a real danger that the intruder will discover or guess which book is used by both sides
for encrypting their communication. In the era of computers, it is not a big problem to quickly
check a lot of books potentially used as secret keys by means of brute force attacks.

Polygraphic Substitution Ciphers


Polygraphic substitution ciphers divide the plaintext into groups of letters. Then, they replace each group
of letters by one of the predefined letters, numbers, graphic symbols, or by another group
of characters.

Usage
Polygraphic substitution ciphers were invented as an improvement of simple substitution ciphers.
They were very popular during the Renaissance and they were used in Europe for many
centuries.
Description
Polygraphic substitution ciphers work by dividing the plaintext into many parts, and replacing each
group by a word, a single character or number, or anything else. To decrypt ciphertext letters, one
should use the reversed substitution and change phrases in the opposite direction.

The main motivation for introducing ciphers of this type was the possibility of obscuring frequencies
of ciphertext characters. To achieve that, popular plaintext phrases should be replaced by one of
a few characters, numbers, or other phrases previously assigned to that phrase. Different
replacements should be used randomly, thus making frequency analysis much more difficult.
Polygraphic substitution ciphers provide more randomness and flexibility than homophonic
substitution ciphers thanks to the possibility of encrypting whole groups of characters at once.
A popular technique used in polygraphic substitution ciphers is to assign several predefined words
or numbers to each popular plaintext word. European diplomats used codenames to encode
important institutions, places, and names of important people.
Polygraphic Substitution Ciphers:
Playfair Cipher
POLYGRAPHIC SUBSTITUTION CIPHER

The cipher was invented by the British inventor Charles Wheatstone, who lived in the 19th
century. Its first description was presented in 1854. The cipher is named after the Scottish
scientist and politician, Lyon Playfair, who heavily popularized its use.

Usage
With the support of Baron Playfair, the cipher was adopted for use by the British Army. It was
used during the Second Boer War, and then in World War I and World War II (also by other
countries). Like all other ciphers of that period, it was withdrawn from use when the first computers
appeared. Nowadays, it can be broken relatively quickly by using brute force attacks.

Algorithm
The Playfair cipher is a kind of polygraphic substitution cipher. The plaintext is divided into groups
of characters and each group is replaced by a predefined group of characters. The Playfair
algorithm operates on groups of two letters.

Before encryption, one should prepare a table based on a secret keyword. The table has
dimensions of 5 by 5 letters and contains 25 letters of the Latin alphabet (the Latin alphabet has
26 letters, so one should either skip one of the rare letters, for example x or q, or
count i and j as one letter).
While filling the table cells, one should use the secret keyword (or a few secret words). First, all
duplicated letters in the secret word should be skipped (only the first occurrences should be used). Then,
the remaining letters should be entered into the table, without changing their original order
in the keyword. Before doing that, the parties should agree on the order in which the table ought
to be filled (for example, row by row from left to right and from top to bottom). The remaining cells of
the table should be filled with the remaining alphabet letters in ordinary alphabetical order.

For example, if one uses a Latin sentence as a keyword: pecunia non olet (it is believed that its
author was the Roman Emperor Vespasian), counting i and j as one letter and filling the table row
by row, from top to bottom, one will receive the following table:
p e c u n
i a o l t
b d f g h
k m q r s
v w x y z
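
The table construction can be sketched in Python. The function name playfair_table is chosen for this illustration, and the sketch assumes that i and j are counted as one letter and that the table is filled row by row.

```python
import string


def playfair_table(keyword: str) -> list:
    """Build the 5 by 5 table as a list of five 5-letter rows."""
    letters = []
    for ch in keyword.lower() + string.ascii_lowercase:
        if ch == "j":
            ch = "i"          # i and j share one cell
        if ch in string.ascii_lowercase and ch not in letters:
            letters.append(ch)
    return ["".join(letters[r:r + 5]) for r in range(0, 25, 5)]


for row in playfair_table("pecunia non olet"):
    print(" ".join(row))
```

Every letter except j appears exactly once in the result, which is a quick sanity check for any table produced this way.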
The next step of encryption is to divide the plaintext into parts that are two letters long. If
necessary, one can append a rare letter (for example X or Q) to the original text.
The algorithm finds both letters of each pair in the table and designates a rectangle that has two
corners pointed to by the letters. Then, these two letters should be replaced by another two letters,
determined by the two other corners of the rectangle. This procedure should be performed for all
plaintext pairs, so that all of them are replaced by letters taken from the table.

Both parties should agree in which order the new letters ought to be appended to the ciphertext
(for example, the first letter would be a letter in a corner that is determined by a row that contains
the first of both encoding plaintext letters).

The case in which both letters of the currently encrypted pair are located in the same row should
be handled differently. In such a case, one should usually change them into the two letters lying
directly to the right of them. If the original letter is located in the last position of the row, one should
take the first letter of the row.

If both letters of the currently encrypted pair are in the same column, one should perform
similar operations. Usually, one ought to change them into the two letters lying directly below them.
If the original letter is in the last position of the column, one should take the first letter of
the column.

The last case for consideration is the situation when the current pair consists of two identical
letters. One should insert an additional rare letter (for example X or Q) after the first letter of
the pair. Then, after encrypting the new pair, one should continue the whole procedure for the rest
of the characters (starting from the second letter of the original pair).
Ciphertext decryption is performed in a similar way. First, the recipient must create (knowing
the secret keyword) the same table as the sender. Then, the recipient decodes pairs of letters, using
analogous operations (determining rectangle corners in the reverse order and skipping
unnecessary added letters like X or Q).
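
The pair rules and their reversal can be sketched with one Python function, where a step of +1 encrypts and a step of -1 decrypts. The function name transform is hypothetical. The sketch assumes the table is given as a 25-letter string written row by row, that the text has an even length, and that no pair consists of two identical letters (dummy letters are assumed to have been inserted already).

```python
def transform(text: str, table: str, step: int) -> str:
    """Apply the Playfair pair rules; step=+1 encrypts, step=-1 decrypts."""
    out = []
    for i in range(0, len(text), 2):
        row1, col1 = divmod(table.index(text[i]), 5)
        row2, col2 = divmod(table.index(text[i + 1]), 5)
        if row1 == row2:
            # Same row: move right when encrypting, left when decrypting
            col1, col2 = (col1 + step) % 5, (col2 + step) % 5
        elif col1 == col2:
            # Same column: move down when encrypting, up when decrypting
            row1, row2 = (row1 + step) % 5, (row2 + step) % 5
        else:
            # Rectangle: keep the rows and swap the columns (its own inverse)
            col1, col2 = col2, col1
        out.append(table[5 * row1 + col1] + table[5 * row2 + col2])
    return "".join(out)


# Table built from the keyword "pecunia non olet", with i and j merged
TABLE = "pecuniaoltbdfghkmqrsvwxyz"
ciphertext = transform("hide", TABLE, +1)
print(ciphertext)
print(transform(ciphertext, TABLE, -1))
```

Here the first output letter of a rectangle pair is taken from the row of the first plaintext letter, which is one possible convention; both parties have to agree on it.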

Security of the Playfair cipher


The general method of breaking the Playfair cipher is to perform frequency analysis
of pairs of letters. Knowing estimated frequencies for the language that was used in the message,
one can try to match frequent ciphertext pairs to frequent pairs of letters in the language.

Because of its simplicity, the cipher is characterized by features that make it easier to break. First,
one can notice that a pair of letters and its inverse pair (that means pairs like AC and CA)
produce ciphertext pairs that are also inverses of each other. This can be exploited by creating
databases of popular words and phrases that contain such combinations. Also, in the Playfair
ciphertext no pair ever consists of two identical letters.
Another method of attacking the cipher is to fill the table randomly and try to decode
the ciphertext based on its current values. Then, the attacker can slightly modify the table and try
to decode the ciphertext again. The attacker should continue modifying the table, accepting changes that
improve the quality of the currently proposed plaintext. It is a relatively simple method, quite easy
to implement.

The third, very effective method of breaking the Playfair cipher is to guess plaintext
fragments, for example salutations, or dates and places of sending the message.
Knowing the ciphertext and probable plaintext parts, one can very easily recreate the table that
was used for encryption. This was a very common method of attacking German ciphers similar to
the Playfair cipher during the Second World War.
Implementation
Implementing the Playfair Cipher is a relatively simple task. The main challenge is to deal properly
with row and column numbers, for each pair of characters.

Below is a JavaScript function that encrypts the input message and returns
the result. Note that the keyword parameter holds the whole 5 by 5 table, passed as
a one-dimensional string of 25 letters written row by row:

function encrypt(messageInput, keyword) {
    var messageOutput = '';

    var pos = 0;
    while (pos < messageInput.length) {
        // Take the next pair; insert a dummy letter for doubled or trailing letters
        var m1 = messageInput[pos];
        var m2 = '';

        if (pos + 1 < messageInput.length) {
            if (messageInput[pos + 1] != m1) {
                m2 = messageInput[pos + 1];
                pos += 2;
            } else {
                m2 = 'Q'; // some dummy letter
                pos += 1;
            }
        } else {
            m2 = 'Q'; // some dummy letter
            pos += 1;
        }

        var c1 = m1;
        var c2 = m2;

        // Locate both letters in the table (row and column, counted from 0)
        var idx1 = keyword.indexOf(m1);
        var idx2 = keyword.indexOf(m2);
        var row1 = Math.floor(idx1 / 5);
        var col1 = idx1 % 5;
        var row2 = Math.floor(idx2 / 5);
        var col2 = idx2 % 5;

        if ((row1 !== row2) && (col1 !== col2)) {
            // rectangle: keep the rows, swap the columns
            c1 = keyword[(5 * row1) + col2];
            c2 = keyword[(5 * row2) + col1];
        } else if ((row1 !== row2) && (col1 === col2)) {
            // same column: take the letters directly below
            c1 = keyword[(5 * ((5 + row1 + 1) % 5)) + col1];
            c2 = keyword[(5 * ((5 + row2 + 1) % 5)) + col1];
        } else if ((row1 === row2) && (col1 !== col2)) {
            // same row: take the letters directly to the right
            c1 = keyword[(5 * row1) + ((5 + col1 + 1) % 5)];
            c2 = keyword[(5 * row1) + ((5 + col2 + 1) % 5)];
        } else {
            // error: both letters are identical, the pair is left unchanged
        }

        messageOutput = messageOutput.concat(c1);
        messageOutput = messageOutput.concat(c2);
    }

    return messageOutput;
}

You may try out the Playfair Cipher online on Crypto-Online website.

Two-Square Cipher (Double Playfair)
POLYGRAPHIC SUBSTITUTION CIPHER

A two-square cipher is a modification of the Playfair cipher and provides slightly better protection
of exchanged messages.

Usage
The cipher was widely used by diplomats and armies until World War II. Nowadays, it is
considered to be easily breakable by using brute force attacks.

Algorithm
The two-square cipher is a polygraphic substitution cipher. The original plaintext is divided into
groups of a few letters. Then, each group is replaced by another previously determined group
of characters. The two-square cipher operates on groups of the size of two letters.

Before encryption, one should prepare two tables, using words (or longer phrases) that serve
as secret keys. Both tables measure 5 by 5 letters and contain 25 letters of the Latin
alphabet. The Latin alphabet has 26 letters, so one of the rare letters should be skipped
(for example, x or q), or the letters i and j should be treated as one letter.
The tables are filled using the secret keywords. First, all duplicated letters in the keywords
are skipped (only the first occurrences remain). Then, the remaining letters are entered (in their
original order) into the tables (letters from the first keyword into the first table, and letters
from the second keyword into the second table). Of course, the parties should agree beforehand
on the order in which the table cells are filled (for example, row by row, from left to right and
from top to bottom). The remaining cells of each table are filled with the unused letters of
the alphabet, usually in alphabetical order.
For example, if one used names of both parents of the Roman Emperor Vespasian as secret
keywords: Titus Flavius Sabinus and Vespasia Polla, treating the letters i and j as one letter, and
filling the tables row by row, from left to right and from top to bottom, one would receive
the following two tables:
t i u s f     v e s p a
l a v b n     i o l b c
c d e g h     d f g h k
k m o p q     m n q r t
r w x y z     u w x y z

The tables should be placed side by side in such a way that lines of rows (or lines of columns)
are aligned.
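
The table-filling procedure described above can be sketched in Python. This is only a minimal
illustration under stated assumptions: the helper names build_table and print_side_by_side are
hypothetical, the letter j is merged into i, and the cells are filled row by row.

```python
import string

def build_table(keyword):
    """Return the 25 table letters as one string, read row by row.

    Assumptions: j is merged into i, and the cells are filled from left
    to right and from top to bottom.
    """
    alphabet = [c for c in string.ascii_lowercase if c != "j"]
    table = []
    for ch in keyword.lower():
        if ch == "j":
            ch = "i"
        # Skip spaces, punctuation and letters that were already entered.
        if ch in alphabet and ch not in table:
            table.append(ch)
    # Fill the remaining cells with the unused letters in alphabetical order.
    table.extend(c for c in alphabet if c not in table)
    return "".join(table)

def print_side_by_side(left, right):
    """Print two 5 x 5 tables next to each other with aligned rows."""
    for row in range(5):
        left_row = " ".join(left[5 * row:5 * row + 5])
        right_row = " ".join(right[5 * row:5 * row + 5])
        print(left_row + "     " + right_row)

left = build_table("Titus Flavius Sabinus")
right = build_table("Vespasia Polla")
print_side_by_side(left, right)
```

Storing each table as a 25-letter string keeps the row and column arithmetic simple: the letter at
index i sits in row i // 5 and column i % 5.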

To perform encryption, as a next step, one should divide the plaintext into pairs. Each pair should
consist of two consecutive letters. If necessary, a rare letter may be appended to the original text
(for example X or Q).
Then, one should find the first letter of each pair in the first table and the second letter in
the second table, and create a rectangle that spans the two tables and has corners in the cells
determined by the two letters. To encrypt the two letters, one replaces them with the two letters
found in the other two corners of the rectangle.
The same steps are repeated for all plaintext pairs, so that every letter is replaced by a letter
chosen from the tables.

For example, encrypting the pair of letters AS with the tables defined above produces one
of the two pairs: IL or LI.
Note that both parties should agree on the order in which the new letters are appended to
the ciphertext (for example, the first letter taken from the left table and the second letter
taken from the right table).

If both letters of the pair being encrypted are located in the same row (or, respectively, in
the same column), then the new pair is the same as the original one. It must be highlighted
that one of the weaknesses of the two-square cipher is that this happens for twenty
percent of all possible pairs of letters.

Ciphertext decryption is performed in a similar way. Firstly, the recipient should create (knowing
the secret keywords) the same two tables as the sender. Then, he should decode pairs of letters,
by using analogous operations. He should find the two ciphertext letters in the two tables, then he
should determine the rectangle corners (which will locate the original plaintext letters), and finally
the plaintext letters should be appended in the correct order to the plaintext.
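
The rectangle rule, the same-row case and the decryption step can be illustrated with a short
Python sketch. It assumes the two example tables placed side by side, output letters taken first
from the left table and then from the right one, and input that already consists of an even number
of lowercase letters without j. The names transform_pair and transform are hypothetical.

```python
LEFT = "tiusflavbncdeghkmopqrwxyz"   # table built from Titus Flavius Sabinus
RIGHT = "vespaiolbcdfghkmnqrtuwxyz"  # table built from Vespasia Polla

def transform_pair(a, b):
    """Encrypt or decrypt one pair (a is looked up in LEFT, b in RIGHT)."""
    row_a, col_a = divmod(LEFT.index(a), 5)
    row_b, col_b = divmod(RIGHT.index(b), 5)
    if row_a == row_b:
        return a + b  # same row: the pair is left unchanged
    # Other two corners of the rectangle: swap the rows, keep the columns.
    return LEFT[5 * row_b + col_a] + RIGHT[5 * row_a + col_b]

def transform(text):
    """Apply the pair rule to a whole text of even length."""
    return "".join(transform_pair(text[i], text[i + 1])
                   for i in range(0, len(text), 2))

ciphertext = transform("venividivici")
recovered = transform(ciphertext)  # applying the rule again undoes it

# Share of the 625 possible pairs that are left unchanged.
unchanged = sum(transform_pair(a, b) == a + b for a in LEFT for b in RIGHT)
print(unchanged / 625)
```

With this convention the same function serves for both directions, because swapping the rows
twice returns each letter to its original cell.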

Security of the two-square cipher


The biggest weakness of the cipher is that plaintext fragments are quite often visible
in the corresponding ciphertext. Based on that feature, and on two methods of attacking the cipher
(frequency analysis and guessing plaintext parts, both described below), an intruder can predict
and discover increasingly longer fragments of the original message.

Ciphertext frequency analysis looks for frequent repetitions of the same pairs of letters.
Knowing the approximate frequencies of digraphs in a given language, one can try to match popular
ciphertext pairs to popular digraphs occurring in the language. Because two secret keys are used,
this is a more difficult task than the same operation performed against the Playfair cipher.

Guessing plaintext fragments means finding ciphertext letters that correspond to popular phrases
expected in the communication, for example greetings, or the date and place of creating the
message. Knowing the ciphertext and probable plaintext fragments, one can recreate the tables
used for encryption. This was a very common method of attacking German ciphers similar to
the two-square cipher during the Second World War.

Implementation
Implementing the two-square cipher is a relatively simple task. After preparing the input text and
the key tables, the main challenge is to handle the row and column numbers correctly for each
character.

Below is a JavaScript function that encrypts the input message and returns the result. Note that
both key tables are passed as one-dimensional strings (keyword[0] and keyword[1]):

function encrypt(messageInput, keyword) {
    var messageOutput = '';

    var pos = 0;
    while (pos < messageInput.length) {
        var m1 = messageInput[pos];
        var m2 = '';

        if (pos + 1 < messageInput.length) {
            m2 = messageInput[pos + 1];
            pos += 2;
        } else {
            m2 = 'Q'; // some dummy letter
            pos += 1;
        }

        var c1 = m1;
        var c2 = m2;

        var idx1 = keyword[0].indexOf(m1);
        var idx2 = keyword[1].indexOf(m2);

        // Letters in different rows: take the other two corners of the rectangle.
        // Letters in the same row are left unchanged.
        if (Math.floor(idx1 / 5) !== Math.floor(idx2 / 5)) {
            c1 = keyword[0][(5 * Math.floor(idx2 / 5)) + idx1 % 5];
            c2 = keyword[1][(5 * Math.floor(idx1 / 5)) + idx2 % 5];
        }

        messageOutput = messageOutput.concat(c1);
        messageOutput = messageOutput.concat(c2);
    }

    return messageOutput;
}

You may try out the Two-Square Cipher online on Crypto-Online website.

Four-Square Cipher
POLYGRAPHIC SUBSTITUTION CIPHER

The four-square cipher is a modified version of the Playfair cipher. It provides better security
of protected data. It was invented by the French cryptanalyst Félix Delastelle in the 19th century.

Usage
It was used by all armies during World War II. Nowadays, it is considered to be easily breakable
by using brute force attacks.

Algorithm
The four-square cipher is a polygraphic substitution cipher. The whole plaintext is divided into
groups of letters. Then, each group is replaced by another previously determined group of
characters. The four-square cipher operates on groups of the size of two letters.

Before encryption, it is necessary to prepare four tables. All the tables measure 5 by
5 letters and contain 25 letters of the Latin alphabet. Because the Latin alphabet
contains 26 letters, one of the rare letters (for example x or q) should be skipped, or the
letters i and j should be treated as one letter. The tables should be placed side by side in such
a way that they create a bigger square with a side length of two tables. All lines of rows and
columns should be retained.
Two tables (the upper left and lower right ones) contain the letters in alphabetical order.
The two other tables are filled using two secret keywords (which protect the data). First, all
duplicated letters in the keywords are skipped (only the first occurrences are used). Then, the
remaining keyword letters are entered (in their original order) into the tables (the letters from
the first keyword into the first remaining table, for example the upper right one, and the letters
from the second keyword into the second table).
The communicating parties must agree beforehand on the order in which each table is filled (for
example row by row, from left to right and from top to bottom). The remaining cells of both tables
are filled with the unused letters of the alphabet, usually in alphabetical order.

For example, if one used the names of both parents of the Roman Emperor Vespasian as secret
keywords (Titus Flavius Sabinus and Vespasia Polla), treated the letters i and j as one letter,
and filled the tables row by row, from left to right and from top to bottom, one would obtain
the following arrangement of four tables (the keyword tables are the upper right and lower left
ones):

a b c d e     t i u s f
f g h i k     l a v b n
l m n o p     c d e g h
q r s t u     k m o p q
v w x y z     r w x y z

v e s p a     a b c d e
i o l b c     f g h i k
d f g h k     l m n o p
m n q r t     q r s t u
u w x y z     v w x y z

In the next step of encryption, the whole plaintext is split into pairs. Each pair
consists of two consecutive letters. If required, a rare letter is appended to the original text
(for example X or Q).
During encryption, two consecutive letters are encoded at a time. One should find the first letter
of each pair in the upper left table and the second letter in the lower right table. Then, one
should create a rectangle that spans the four tables and has corners in the cells determined by
the two plaintext letters. The letters are encrypted by replacing them with the two letters found
in the other two corners of the rectangle. The same steps are repeated for all plaintext
pairs, so that every letter is replaced by a letter determined by the four encryption tables.

For example, encrypting the pair AS with the tables defined above produces the
new pair UM or MU.
Both parties should agree on the order in which the new letters are appended to the ciphertext
(for example, the first letter taken from the upper right table and the second letter
taken from the lower left table).

Ciphertext decryption is performed in a similar way. Firstly, the recipient should create (knowing
the secret keywords) the same four tables as the sender. Then, he should decode all letters pair
by pair, using analogous operations. He needs to find two ciphertext letters in the upper right and
lower left tables. After that, the rectangle corners should be found, and they should point to the
two new plaintext letters (taken from the upper left and lower right tables).
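
Since only encryption is usually shown in code, the following Python sketch illustrates both
directions for the four example tables. It is a minimal sketch under stated assumptions: the
first output letter comes from the upper right table and the second from the lower left one,
j is merged into i, and q is used as padding. The function names are hypothetical.

```python
import string

PLAIN = "abcdefghiklmnopqrstuvwxyz"        # upper-left and lower-right tables
UPPER_RIGHT = "tiusflavbncdeghkmopqrwxyz"  # keyword Titus Flavius Sabinus
LOWER_LEFT = "vespaiolbcdfghkmnqrtuwxyz"   # keyword Vespasia Polla

def encrypt_pair(p1, p2):
    row1, col1 = divmod(PLAIN.index(p1), 5)  # position in the upper-left table
    row2, col2 = divmod(PLAIN.index(p2), 5)  # position in the lower-right table
    # Other two corners: same rows, exchanged columns.
    return UPPER_RIGHT[5 * row1 + col2] + LOWER_LEFT[5 * row2 + col1]

def decrypt_pair(c1, c2):
    row1, col2 = divmod(UPPER_RIGHT.index(c1), 5)
    row2, col1 = divmod(LOWER_LEFT.index(c2), 5)
    return PLAIN[5 * row1 + col1] + PLAIN[5 * row2 + col2]

def prepare(message):
    """Keep letters only, merge j into i, pad to an even length with q."""
    letters = ["i" if ch == "j" else ch
               for ch in message.lower() if ch in string.ascii_lowercase]
    if len(letters) % 2:
        letters.append("q")
    return "".join(letters)

def encrypt(message):
    text = prepare(message)
    return "".join(encrypt_pair(text[i], text[i + 1])
                   for i in range(0, len(text), 2))

def decrypt(ciphertext):
    return "".join(decrypt_pair(ciphertext[i], ciphertext[i + 1])
                   for i in range(0, len(ciphertext), 2))

ciphertext = encrypt("Veni, vidi, vici")
print(ciphertext, decrypt(ciphertext))
```

Unlike the two-square cipher, decryption here is not the same operation as encryption, because
the plaintext and ciphertext letters are looked up in different tables.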

Security of the four-square cipher


The four-square cipher provides better security than the Playfair cipher and
the two-square cipher. Thanks to the use of four tables, several typical weaknesses of those
ciphers are avoided (for example, plaintext letters do not appear in the ciphertext, and
reversed plaintext pairs are not encrypted with the same letters).
A small disadvantage of the four-square cipher is that it is slightly slower and more
difficult to use than the two ciphers mentioned above, because more tables and secret keys
are involved and need to be remembered.
The most effective method of breaking the four-square cipher is frequency analysis
of pairs of ciphertext letters, combined with comparing ciphertext fragments with probable
plaintext fragments. Thanks to the use of computers, the four-square cipher can be broken
relatively quickly.

Implementation
Implementing the four-square cipher is a relatively simple task. After preparing the input text
and the key tables, the main task is to handle the row and column numbers correctly for each pair
of characters.

Below is a JavaScript function that encrypts the input message and returns the result. Note that
both key tables (keyword[0] and keyword[1]) and the alphabetical table (alphabet) are passed as
one-dimensional strings:

function encrypt(messageInput, keyword, alphabet) {
    var messageOutput = "";

    var pos = 0;
    while (pos < messageInput.length) {
        var m1 = messageInput[pos];
        var m2 = '';

        if (pos + 1 < messageInput.length) {
            m2 = messageInput[pos + 1];
            pos += 2;
        } else {
            m2 = 'Q'; // some dummy letter
            pos += 1;
        }

        var idx1 = alphabet.indexOf(m1); // upper-left table
        var idx2 = alphabet.indexOf(m2); // lower-right table

        // Other two corners of the rectangle: same rows, exchanged columns.
        var c1 = keyword[0][(5 * Math.floor(idx1 / 5)) + idx2 % 5];
        var c2 = keyword[1][(5 * Math.floor(idx2 / 5)) + idx1 % 5];

        messageOutput = messageOutput.concat(c1);
        messageOutput = messageOutput.concat(c2);
    }

    return messageOutput;
}

You may try out the Four-Square Cipher online on Crypto-Online website.

Hill Cipher
POLYGRAPHIC SUBSTITUTION CIPHER
The Hill cipher is considered to be the first polygraphic cipher in which it is practical to work on
more than three symbols at once.

Usage
The Hill cipher was created in 1929 by Lester S. Hill, an American mathematician.

Algorithm
In the Hill cipher each letter corresponds to one unique number, from 0 to 25. The simplest
scheme (A = 0, B = 1, ..., Z = 25) is used most often, but one can choose another assignment
as well.

Messages are divided into n-letter blocks. Encryption is performed by multiplying each block
by one secret n x n matrix, which also contains numbers from 0 to 25. All results are reduced
modulo 26. The matrix can be defined based on a secret keyword that contains $n^2$ letters (any
additional letters are simply ignored).
The decryption algorithm is similar to the encryption process. One should divide the ciphertext
into blocks (each with n letters) and multiply them by the inverse, modulo 26, of the matrix used
for encryption.
Encryption and Decryption
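To encrypt the message vino using a 2 x 2 matrix, one should divide the message into two blocks
of two letters, vi and no. Then one should change the letters into numbers (A = 0, B = 1, ...,
Z = 25):

\[
\begin{pmatrix} v \\ i \end{pmatrix} \rightarrow \begin{pmatrix} 21 \\ 8 \end{pmatrix},
\qquad
\begin{pmatrix} n \\ o \end{pmatrix} \rightarrow \begin{pmatrix} 13 \\ 14 \end{pmatrix}
\]

The matrix below will be used as the secret key:

\[
K = \begin{pmatrix} 3 & 3 \\ 2 & 5 \end{pmatrix}
\]

Its inverse modulo 26, which is used for decryption, is easy to calculate:

\[
K^{-1} = \begin{pmatrix} 15 & 17 \\ 20 & 9 \end{pmatrix}
\]

Each plaintext block is encrypted by multiplying it by the key matrix:

\[
\begin{pmatrix} 3 & 3 \\ 2 & 5 \end{pmatrix}
\begin{pmatrix} 21 \\ 8 \end{pmatrix}
= \begin{pmatrix} 87 \bmod 26 \\ 82 \bmod 26 \end{pmatrix}
= \begin{pmatrix} 9 \\ 4 \end{pmatrix}
\]

\[
\begin{pmatrix} 3 & 3 \\ 2 & 5 \end{pmatrix}
\begin{pmatrix} 13 \\ 14 \end{pmatrix}
= \begin{pmatrix} 81 \bmod 26 \\ 96 \bmod 26 \end{pmatrix}
= \begin{pmatrix} 3 \\ 18 \end{pmatrix}
\]

The result is the four-number sequence 9 4 3 18, that is jeds. Decryption is similar to
encryption. The ciphertext letters are changed into numbers, divided into two blocks of two
numbers, and multiplied by the inverse matrix $K^{-1}$:

\[
\begin{pmatrix} 15 & 17 \\ 20 & 9 \end{pmatrix}
\begin{pmatrix} 9 \\ 4 \end{pmatrix}
= \begin{pmatrix} 203 \bmod 26 \\ 216 \bmod 26 \end{pmatrix}
= \begin{pmatrix} 21 \\ 8 \end{pmatrix}
\]

\[
\begin{pmatrix} 15 & 17 \\ 20 & 9 \end{pmatrix}
\begin{pmatrix} 3 \\ 18 \end{pmatrix}
= \begin{pmatrix} 351 \bmod 26 \\ 222 \bmod 26 \end{pmatrix}
= \begin{pmatrix} 13 \\ 14 \end{pmatrix}
\]

The four numbers obtained can be changed back into the original plaintext letters vino.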
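
The block arithmetic can be checked with a short Python sketch. It assumes the scheme A = 0,
B = 1, ..., Z = 25, a 2 x 2 key matrix and lowercase input of even length. The function names
inverse_2x2 and apply_matrix are hypothetical, and pow with a negative exponent requires
Python 3.8 or later.

```python
from math import gcd

def inverse_2x2(key, modulus=26):
    """Invert a 2 x 2 matrix modulo `modulus` using the adjugate formula."""
    (a, b), (c, d) = key
    det = (a * d - b * c) % modulus
    if gcd(det, modulus) != 1:
        raise ValueError("the key matrix is not invertible for this modulus")
    det_inv = pow(det, -1, modulus)  # modular inverse of the determinant
    return [[(d * det_inv) % modulus, (-b * det_inv) % modulus],
            [(-c * det_inv) % modulus, (a * det_inv) % modulus]]

def apply_matrix(matrix, text, modulus=26):
    """Multiply every 2-letter block of `text` by `matrix` modulo `modulus`."""
    numbers = [ord(ch) - ord("a") for ch in text]
    out = []
    for k in range(0, len(numbers), 2):
        x, y = numbers[k], numbers[k + 1]
        out.append((matrix[0][0] * x + matrix[0][1] * y) % modulus)
        out.append((matrix[1][0] * x + matrix[1][1] * y) % modulus)
    return "".join(chr(v + ord("a")) for v in out)

KEY = [[3, 3], [2, 5]]
KEY_INV = inverse_2x2(KEY)

ciphertext = apply_matrix(KEY, "vino")          # encryption
plaintext = apply_matrix(KEY_INV, ciphertext)   # decryption
print(KEY_INV, ciphertext, plaintext)
```

Encryption and decryption share one routine here; only the matrix differs, which mirrors
the symmetry of the two procedures.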

Security of the Hill cipher


The basic version of the Hill cipher is vulnerable to known-plaintext attacks because the whole
encryption process is linear. An intruder who intercepts $n^2$ pairs of corresponding plaintext
and ciphertext letters can set up a linear system that can be solved quite easily. Adding a few
more pairs allows the attacker to choose the correct solution if more than one was found. Solving
linear equations usually doesn't take much time.
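
A minimal Python sketch of such a known-plaintext attack for the 2 x 2 case follows. It assumes
the scheme A = 0, ..., Z = 25 and that the two known plaintext blocks form a matrix that is
invertible modulo 26 (otherwise more pairs are needed). The sample pair hill / tcoz was produced
with the key matrix with rows (3, 3) and (2, 5), and all function names are hypothetical.

```python
from math import gcd

M = 26

def to_columns(text):
    """Turn 4 letters into a 2 x 2 matrix whose columns are the two blocks."""
    n = [ord(ch) - ord("a") for ch in text]
    return [[n[0], n[2]], [n[1], n[3]]]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) % M for j in range(2)]
            for i in range(2)]

def mat_inv(m):
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % M
    if gcd(det, M) != 1:
        raise ValueError("plaintext blocks are not invertible modulo 26")
    d = pow(det, -1, M)  # Python 3.8+
    return [[(m[1][1] * d) % M, (-m[0][1] * d) % M],
            [(-m[1][0] * d) % M, (m[0][0] * d) % M]]

def recover_key(plaintext, ciphertext):
    """Solve C = K * P (mod 26) for K, given two plaintext/ciphertext blocks."""
    return mat_mul(to_columns(ciphertext), mat_inv(to_columns(plaintext)))

print(recover_key("hill", "tcoz"))
```

Because the cipher is linear, four known letters are enough here; no search over the key space
is involved.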

Number of possible keys


There are $26^{n^2}$ possible matrices of dimension n x n whose entries are numbers from 0
to 25. However, to serve as a secret key, a matrix must be invertible modulo 26.
The number of invertible matrices can be computed using the Chinese Remainder Theorem.
A matrix is invertible modulo 26 if and only if it is invertible both modulo 13 and modulo 2.
The number of invertible n x n matrices modulo 2 is given by the order of the general
linear group $GL(n, \mathbb{Z}_2)$:

\[
2^{n^2} \left(1 - \frac{1}{2}\right) \left(1 - \frac{1}{2^2}\right) \cdots \left(1 - \frac{1}{2^n}\right)
\]

Similarly, the number of invertible n x n matrices modulo 13 is:

\[
13^{n^2} \left(1 - \frac{1}{13}\right) \left(1 - \frac{1}{13^2}\right) \cdots \left(1 - \frac{1}{13^n}\right)
\]

Finally, the number of invertible matrices modulo 26 is equal to the product of the two results
above:

\[
26^{n^2} \left(1 - \frac{1}{2}\right) \left(1 - \frac{1}{2^2}\right) \cdots \left(1 - \frac{1}{2^n}\right)
\left(1 - \frac{1}{13}\right) \left(1 - \frac{1}{13^2}\right) \cdots \left(1 - \frac{1}{13^n}\right)
\]
Expressed in bits (as a base-2 logarithm), this number may be approximated as $4.7 n^2 - 1.7$,
which is about 116 bits for a 5 x 5 matrix.
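
The counting formula can be evaluated exactly with integer arithmetic. The Python sketch below
uses the equivalent product form of the order of GL(n, Z_p), namely (p^n - 1)(p^n - p) ...
(p^n - p^(n-1)), and cross-checks the 2 x 2 case by brute force. The function names are
hypothetical.

```python
from itertools import product
from math import gcd, log2

def invertible_count(n, primes=(2, 13)):
    """Number of invertible n x n matrices modulo the product of `primes`."""
    total = 1
    for p in primes:
        for k in range(n):
            total *= p**n - p**k  # one factor of |GL(n, Z_p)|
    return total

def brute_force_count_2x2(modulus=26):
    """Count 2 x 2 matrices whose determinant is coprime to the modulus."""
    count = 0
    for a, b, c, d in product(range(modulus), repeat=4):
        if gcd(a * d - b * c, modulus) == 1:
            count += 1
    return count

print(invertible_count(2))         # number of usable 2 x 2 keys
print(log2(invertible_count(5)))   # key size in bits for 5 x 5 matrices
```

For 5 x 5 matrices the exact value lands between 115 and 116 bits.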

Polyalphabetic Substitution Ciphers
Each plaintext character is replaced by another letter. The substitution rule changes cyclically
and depends on the current position of the letter being replaced.

Usage
Polyalphabetic substitution ciphers were invented by an artist, philosopher and scientist Leon
Battista Alberti. In 1467 he presented a device called the cipher disk. It provides polyalphabetic
substitutions with mixed alphabets.

Description
In polyalphabetic substitution ciphers one should define a few possible combinations of
substitutions of all alphabet letters by other letters. Then, one should use the substitutions
cyclically, one after the other, changing the replacement after each new letter.

To use this cipher, one should choose, remember and deliver to all parties several substitutions
of all alphabet letters. The substitutions are then applied in a specific order. To decrypt
the message, one should use the corresponding substitutions in the same order, but the letters
should be replaced in the opposite direction.
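
The cyclic use of several substitution alphabets can be sketched in Python as follows. This is
an illustration under stated assumptions: the mixed alphabets are generated with a seeded
pseudo-random generator (suitable for a demonstration only, not for real secrecy), the text
consists of the letters A-Z, and the function names are hypothetical.

```python
import random
import string

LETTERS = string.ascii_uppercase

def make_alphabets(count, seed):
    """Build `count` mixed alphabets (illustration only, not a secure generator)."""
    rng = random.Random(seed)
    return ["".join(rng.sample(LETTERS, len(LETTERS))) for _ in range(count)]

def encrypt(plaintext, alphabets):
    out = []
    for position, letter in enumerate(plaintext):
        alphabet = alphabets[position % len(alphabets)]  # cycle through them
        out.append(alphabet[LETTERS.index(letter)])
    return "".join(out)

def decrypt(ciphertext, alphabets):
    out = []
    for position, letter in enumerate(ciphertext):
        alphabet = alphabets[position % len(alphabets)]
        out.append(LETTERS[alphabet.index(letter)])  # same table, reverse lookup
    return "".join(out)

alphabets = make_alphabets(3, seed=1)
secret = encrypt("ATTACKATDAWN", alphabets)
print(secret, decrypt(secret, alphabets))
```

Decryption uses the same alphabets in the same order and only reverses the direction of
each lookup.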

The strongest version of a polyalphabetic substitution cipher is to define all its transformations
randomly. Such a method was preferred by Alberti himself.

On the other hand, because random substitutions require a large amount of data to remember,
substitutions that are easy to remember and easy to hand over to another person were invented
and widely used. The Vigenère cipher is an example of such an approach.
Security of polyalphabetic substitution ciphers
A properly implemented polyalphabetic substitution cipher is quite difficult to break. Its
strength is based on the many possible combinations of changing alphabet letters. Some effective
methods of attacking such ciphers were discovered in the nineteenth century. Their first step is
to guess the length of the secret key. After that, one can examine the ciphertext using frequency
analysis methods.
Trithemius Cipher
POLYALPHABETIC SUBSTITUTION CIPHER

The cipher was invented by the German monk Johannes Trithemius, who lived at the turn
of the fifteenth and sixteenth centuries. He described it in his book Polygraphia, written in 1508
and published in 1518. This is considered to be one of the first books dedicated entirely to
cryptography.

Usage
The cipher is very simple and it doesn't provide good security of transmitted messages. However,
the Trithemius cipher was an important step in the development of polyalphabetic ciphers
in sixteenth-century Europe.

Algorithm
The Trithemius cipher was one of many polyalphabetic ciphers designed to be easy to use
frequently. Instead of using random combinations of alphabet letters, Trithemius proposed using
a special table. The table later became known as the tabula recta.

Tabula Recta

The first row contains all alphabet letters in the original order. Next rows also contain all letters
but in each row they are shifted to the left by one position. The table has 26 rows and 26 columns
(there are 26 letters in the Latin alphabet).

During encryption, successive plaintext letters are replaced by the corresponding letters from
successive rows of the table. After the last row has been used, one moves back to the first row.
This means that every plaintext letter is shifted forward by the number of positions determined
by the current row. Therefore the first letter is encrypted without a shift, the second letter
with the shift determined by the second row (one position), the third letter with the shift
determined by the third row (two positions), and so on.

For example, the word MACHINE encoded using the cipher produces the ciphertext MBEKMSK.
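
The progressive shift can be expressed without drawing the table. The following Python sketch
assumes text made only of the letters A-Z; the function names are hypothetical.

```python
def trithemius_encrypt(plaintext):
    """Shift the letter at position k forward by k places (mod 26)."""
    out = []
    for position, letter in enumerate(plaintext.upper()):
        shift = position % 26  # row of the tabula recta, counted from 0
        out.append(chr((ord(letter) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)

def trithemius_decrypt(ciphertext):
    """Undo the progressive shift by moving each letter backward."""
    out = []
    for position, letter in enumerate(ciphertext.upper()):
        shift = position % 26
        out.append(chr((ord(letter) - ord("A") - shift) % 26 + ord("A")))
    return "".join(out)

print(trithemius_encrypt("MACHINE"))  # MBEKMSK
```

No key appears anywhere in the code, which makes the lack of secrecy of this cipher visible
at a glance.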

Security of the Trithemius cipher


The Trithemius cipher does not have a secret keyword that protects the ciphertext. To recover
the original message, it is enough to know that this particular cipher has been used. Furthermore,
the Trithemius cipher can be regarded as a particular case of Vigenère encryption that uses
the following key: ABCDEFGHIJKLMNOPQRSTUVWXYZ.

Vigenère Cipher
POLYALPHABETIC SUBSTITUTION CIPHER

The cipher was invented by the Italian Giovan Battista Bellaso, who described it in 1553 in his
book "La cifra del. Sig. Giovan Battista Bellaso". However, because of a mistaken belief that was
widespread in the nineteenth century, it is named after the French diplomat and alchemist Blaise
de Vigenère, who lived in the sixteenth century.

Usage
The Vigenère cipher is quite easy to use and provides relatively good security. It was widely used
for a long time, until the twentieth century.

Algorithm
The Vigenère cipher is a kind of polyalphabetic substitution cipher. It replaces plaintext
letters with other letters. The parties have to agree on a shared keyword (which may also be
a sentence) that is used by the encryption algorithm. They don't have to specify all
26 substitutions for all possible letters of the alphabet.

During encrypting and decrypting, one should use a table which contains all alphabet letters in
the correct order in the first row and then, in subsequent rows, letters shifted to the left by one
subsequent position. The table has a Latin name tabula recta and it was used the first time in
cryptography by a German monk Johannes Trithemius.
Tabula Recta

In order to encrypt a message, one should use a secret keyword (or a few words). The keyword
is used to choose rows with variously shifted alphabet letters. Subsequent plaintext letters are
replaced by subsequent corresponding letters in rows, which are pointed by keyword letters.

For example, if one chooses the word rex as a secret keyword, the first message letter is
encrypted using the row r, the second letter using the row e, and the third letter using the row x.
Therefore, the first letter is shifted by 17 positions, the second plaintext letter is
shifted by 4 positions and the third letter is shifted by 23 alphabet positions. Then, one
uses the keyword letters again from the beginning: the fourth plaintext letter is encrypted using
the row r.
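
The keyword-driven shifts can be sketched in Python as below. The sketch assumes text and
keyword made only of letters (converted to upper case); the function names are hypothetical.

```python
def shifts_from_keyword(keyword):
    """Translate keyword letters into shifts: A = 0, B = 1, ..., Z = 25."""
    return [ord(ch) - ord("A") for ch in keyword.upper()]

def vigenere(text, keyword, decrypt=False):
    """Shift each letter by the current keyword letter, cycling the keyword."""
    shifts = shifts_from_keyword(keyword)
    sign = -1 if decrypt else 1
    out = []
    for position, letter in enumerate(text.upper()):
        shift = sign * shifts[position % len(shifts)]
        out.append(chr((ord(letter) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)

print(shifts_from_keyword("rex"))  # [17, 4, 23]
secret = vigenere("ALIUDEST", "rex")
print(secret, vigenere(secret, "rex", decrypt=True))
```

Decryption only flips the sign of every shift, so one function can cover both directions.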
There is a simple variant of the Vigenère cipher, referred to as Variant Beaufort. In this
variant, one encrypts the message using the Vigenère decryption method and decrypts
the ciphertext using the Vigenère encryption algorithm. The letters are simply moved in
the opposite direction to the original algorithm. This method has nothing in common with
the Beaufort cipher, so they shouldn't be confused.

Security of the Vigenère cipher


In order to break the Vigenère cipher, one should first determine the secret key size (the length
of the keyword or sentence that was used for encrypting the message). Two methods of guessing
this length are presented below. Both methods are based on analyzing the ciphertext. They were
discovered by Kasiski and Friedman.

After the length of the key has been determined, further cryptanalysis is based on frequency
analysis of ciphertext letters. Ciphertext letters encrypted with different secret key letters
should be analyzed separately: the letters encrypted with the first secret key letter form one
group, the letters encrypted with the second secret key letter form another group, and so on.
Knowing the key size, the main task is thus to break a few texts, each encrypted separately
with the Caesar cipher.
Kasiski examination
This method of determining the secret key length was created by the German soldier,
archaeologist and cryptographer Friedrich Kasiski in the nineteenth century. One should search
through the ciphertext looking for repeated sequences of characters. Finding such sequences may
mean that they were created by encoding the same parts of the plaintext with the same parts of
the secret key.

For example, the principle can be seen when the following plaintext is encrypted with
the following secret key letters:

Key:        NATURAENATURAENATURAENATURAE
Plaintext:  ALIUDESTFACEREALIUDESTDICERE
Ciphertext: NLBOUEWGFTWVRINLBOUEWGDBWVRI

By measuring the distances between the beginnings of the repeated sequences (14 characters in
the example), one can assume that the encrypting key has one of the following lengths: 1, 2, 7
or 14 characters.

If more repeated sequences are found during ciphertext analysis, one can assume that
the length of the secret key is one of the numbers suggested by all of the repeated
sequences.

Repeated sequences of ciphertext characters may also be caused by a chance combination of
different plaintext and secret key letters. The more repeated sequences are found in
the ciphertext, the more likely it is that they were caused by encrypting the same parts of
the plaintext with the same secret key letters (and are not just a random coincidence).
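
A minimal Python sketch of the Kasiski examination is shown below. It assumes that repeated
sequences of three letters are searched for and that all measured distances are genuine (no
accidental repetitions); the function names are hypothetical.

```python
from collections import defaultdict
from functools import reduce
from math import gcd

def repeated_sequences(ciphertext, length=3):
    """Map every sequence that occurs more than once to its start positions."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {seq: pos for seq, pos in positions.items() if len(pos) > 1}

def kasiski_distances(ciphertext, length=3):
    """Distances between consecutive occurrences of each repeated sequence."""
    distances = []
    for pos in repeated_sequences(ciphertext, length).values():
        distances.extend(b - a for a, b in zip(pos, pos[1:]))
    return distances

def candidate_key_lengths(distances):
    """All divisors of the greatest common divisor of the distances."""
    if not distances:
        return []
    g = reduce(gcd, distances)
    return [d for d in range(1, g + 1) if g % d == 0]

ciphertext = "NLBOUEWGFTWVRINLBOUEWGDBWVRI"
print(candidate_key_lengths(kasiski_distances(ciphertext)))  # [1, 2, 7, 14]
```

On real ciphertexts some repetitions are accidental, so in practice one would look for
the divisors shared by most distances rather than by all of them.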

Friedman test
William Friedman was a cryptographer in the US Army. In the 1920s he developed a method of
guessing the keyword length for the Vigenère cipher. It is based on calculating an index of
coincidence: one compares the ciphertext letters with the same letters shifted by various
numbers of positions.
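
The comparison of a ciphertext with shifted copies of itself can be sketched in Python. This is
a simplified illustration, not Friedman's full procedure: it counts coincidences for each shift
and ranks the shifts, and it also shows the usual index of coincidence formula. The function
names are hypothetical, and the sample ciphertext is far shorter than what a reliable test needs.

```python
from collections import Counter

def index_of_coincidence(text):
    """Probability that two letters drawn from the text are identical."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def coincidences(ciphertext, shift):
    """Count positions where the text and its shifted copy show the same letter."""
    return sum(a == b for a, b in zip(ciphertext, ciphertext[shift:]))

def coincidence_rate(ciphertext, shift):
    compared = len(ciphertext) - shift
    return coincidences(ciphertext, shift) / compared if compared > 0 else 0.0

def rank_shifts(ciphertext, max_shift):
    """Shifts ordered from the highest to the lowest coincidence rate."""
    rates = {s: coincidence_rate(ciphertext, s) for s in range(1, max_shift + 1)}
    return sorted(rates, key=rates.get, reverse=True)

ciphertext = "NLBOUEWGFTWVRINLBOUEWGDBWVRI"
print(rank_shifts(ciphertext, 20)[:3])
```

Shifts that are multiples of the key length align letters encrypted with the same key letter,
so they tend to show more coincidences than other shifts.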

Beaufort Cipher
POLYALPHABETIC SUBSTITUTION CIPHER

The cipher is named after British admiral Francis Beaufort, who lived at the turn of the 18th and
19th centuries.

Usage
The Beaufort cipher is a simple polyalphabetic cipher. It uses a table called tabula recta, which
was first introduced in the Trithemius cipher. It shouldn't be confused with a special variant of
the Vigenère cipher, named Variant Beaufort.
The Beaufort cipher was used in the rotor-based Hagelin M-209 cipher machine in the middle of
the 20th century.

Algorithm
The Beaufort cipher's algorithm is based on the table called tabula recta:

Tabula Recta

During encryption, all plaintext letters are replaced by other letters, based on the tabula
recta table. Both sides share one secret key, which consists of one or more words. Each plaintext
letter is encrypted by using one key letter. After the last key character has been used, the
algorithm goes back to the first key letter and starts taking key characters again from the
beginning.
The encryption process is presented below. For every plaintext letter, one should perform the
following operations, using one key character:

1. Find the plaintext letter in the topmost horizontal row.
2. Travel down the column until you find the current key letter.
3. The leftmost letter in the current row is the new ciphertext letter.

Decryption of the ciphertext uses exactly the same algorithm. This means that after the second
encryption of ciphertext, one receives the original plaintext. Ciphers that have this property are
referred to as reciprocal ciphers.

Security of the Beaufort Cipher


Ciphertexts created by the Beaufort cipher can be very easily changed into ciphertexts of
the Vigenère cipher. This can be done by replacing every letter of the ciphertext by its opposite
letter (a becomes z, b becomes y and so on). After this modification, the ciphertext can be
analyzed by using all the existing attacks against the Vigenère cipher.
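
The relation between the two ciphers can be checked with a short Python sketch. It assumes text
and key made only of the letters A-Z and models the Beaufort cipher as "key letter minus text
letter" modulo 26. Under these assumptions the converted ciphertext equals a Vigenère encryption
under the opposite of the original key, which is all an attacker needs. The function names are
hypothetical.

```python
A = ord("A")

def beaufort(text, key):
    """Beaufort: each output letter is key letter minus text letter (mod 26)."""
    return "".join(chr((ord(key[i % len(key)]) - ord(ch)) % 26 + A)
                   for i, ch in enumerate(text))

def vigenere_encrypt(text, key):
    """Vigenere: each output letter is text letter plus key letter (mod 26)."""
    return "".join(chr((ord(ch) - A + ord(key[i % len(key)]) - A) % 26 + A)
                   for i, ch in enumerate(text))

def opposite(text):
    """Replace every letter by its opposite: A <-> Z, B <-> Y, and so on."""
    return "".join(chr(A + 25 - (ord(ch) - A)) for ch in text)

plaintext, key = "ALIUDESTFACERE", "NATURAE"
converted = opposite(beaufort(plaintext, key))
print(converted == vigenere_encrypt(plaintext, opposite(key)))
```

The same helper also confirms the reciprocal property: applying beaufort twice with the same key
returns the original text.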

Implementation
A simple encryption/decryption function implemented in Python:

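def encrypt_or_decrypt(key, input_txt):
    """Beaufort cipher; key and input_txt must contain only the letters A-Z."""
    output_txt = []
    for pos in range(0, len(input_txt)):
        # Start in the column of the text letter (top row of the tabula recta)
        # and move down row by row until the current key letter is found.
        letter_row = 'A'
        letter_txt = input_txt[pos]
        while letter_txt != key[pos % len(key)]:
            letter_txt = chr((ord(letter_txt) - ord('A') + 1) % 26
                             + ord('A'))
            letter_row = chr((ord(letter_row) - ord('A') + 1) % 26
                             + ord('A'))
        # The leftmost letter of the row reached is the output letter.
        output_txt.append(letter_row)
    return ''.join(output_txt)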

Running Key Cipher


POLYALPHABETIC SUBSTITUTION CIPHER

The running key cipher is a variation of the Vigenère cipher. Each letter of the plaintext is shifted
along some number of alphabet positions in one specified direction.

Usage
Like other polyalphabetic ciphers, the running key cipher was quite widely used until the first
part of the twentieth century, when effective attacks on this kind of cipher were discovered.

Algorithm
Encrypting with a running key means shifting plaintext letters by some number of alphabet
positions. The numbers are determined by the letters of a secret key (as in other substitution
ciphers). To look up the proper letters during encryption and decryption, one can use the tabula
recta, as in the Trithemius cipher or the Vigenère cipher, which are both based on the same
idea.
Tabula Recta

Instead of choosing a secret keyword and then using it repeatedly to encrypt all
messages, the running key cipher uses long, publicly available sequences of letters: books and
other similar long texts. The parties should agree on exactly which book (and which edition) they
will use during the communication. They must also determine the number of the first page used for
encryption, the first row and the position of the first letter in the row.

All letters of the message are encrypted using successive letters found in the book. After
encrypting some characters, one may jump to another, arbitrarily selected position in the book
and continue taking key letters from the new position. It is possible to encode the number of
the new page, the number of the new row and the position of the first letter in the row as
a sequence of letters. These letters can be appended to the plaintext and encrypted together with
it. The second party, after finding the letters and decoding them, jumps to the new position of
the secret key letters. One may also pass information about changing the book used for encryption.

Security of the running key cipher


The running key cipher differs from other polyalphabetic substitution ciphers. Instead of
a relatively short keyword used over and over again, it uses a secret key of the same length as
the plaintext. If the key characters were completely random, the cipher would provide perfect
security, like the OTP cipher. However, in this case both the plaintext and the secret key consist
of existing words and sentences, which makes ciphertext analysis much easier.
The intruder can try to guess parts of the plaintext and match them in such a way that
the resulting secret key characters form meaningful sequences that make up words and sentences.

In order to increase the cipher's security, the parties can take key letters not from one sequence
but from several different sequences (in different parts of the text) at the same time.
The attacker would have to guess the rules used for switching between the sequences. In this case,
the analysis is much more difficult because the secret key letters don't form correct words.
Another idea to make cryptanalysis more difficult is to assign a few words to each alphabet
letter and to use those words instead of keyword letters. The method is intended to make it
difficult to distinguish ciphertext letters from plaintext letters. Usually the ciphertext doesn't
consist of words, unlike the plaintext and secret key sequences.

Effective and popular methods of improving the cipher and creating better secret key characters
are to use texts that contain unusual expressions (this was often done, for example,
by the KGB) or to avoid the tabula recta and replace it with random combinations.

Autokey Cipher
POLYALPHABETIC SUBSTITUTION CIPHER

The autokey cipher was presented in 1586 by a French diplomat and alchemist Blaise de
Vigenère.

Usage
The autokey cipher was used in Europe until the 20th century. Currently it is considered to be
easy to break. However, the idea to create key letters based on plaintext letters is used in many
modern ciphers.

Algorithm
As in other polyalphabetic substitution ciphers, the autokey cipher algorithm changes
plaintext letters based on secret key letters. Each letter of the message is shifted by
a number of alphabet positions. The number of positions is equal to the place in the alphabet of
the current key letter.

To simplify calculations, one can use a table whose successive rows contain alphabets with
letters shifted by an increasingly larger number of positions. The table is called tabula recta
and looks like the one below:
Tabula Recta

Unlike in other similar ciphers, after using all of the secret key letters, the algorithm doesn't go
back to the first key letter but starts to take plaintext letters as new key letters.

For example, encrypting the two words Opinio communis with the secret key Ab ovo gives:
Plaintext:  OPINIOCOMMUNIS
Key:        ABOVOOPINIOCOM
Ciphertext: OQWIWCRWZUIPWE
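
The example above can be reproduced with a short Python sketch. The function names autokey_encrypt and autokey_decrypt are illustrative and do not come from the original text; the sketch assumes a 26-letter Latin alphabet with A = 0, a non-empty keyword, and it simply drops every character that is not a letter.

```python
import string

ALPHABET = string.ascii_uppercase


def autokey_encrypt(plaintext, keyword):
    """Encrypt with a key stream made of the keyword followed by the plaintext itself."""
    letters = [ch for ch in plaintext.upper() if ch in ALPHABET]
    primer = [ch for ch in keyword.upper() if ch in ALPHABET]
    # Keyword letters come first; afterwards the plaintext letters act as the key.
    key_stream = (primer + letters)[:len(letters)]
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(letters, key_stream)
    )


def autokey_decrypt(ciphertext, keyword):
    """Decrypt letter by letter, extending the key with each recovered plaintext letter."""
    key_stream = [ch for ch in keyword.upper() if ch in ALPHABET]
    plaintext = []
    for i, c in enumerate(ch for ch in ciphertext.upper() if ch in ALPHABET):
        p = ALPHABET[(ALPHABET.index(c) - ALPHABET.index(key_stream[i])) % 26]
        plaintext.append(p)
        key_stream.append(p)
    return "".join(plaintext)


print(autokey_encrypt("Opinio communis", "Ab ovo"))  # OQWIWCRWZUIPWE
print(autokey_decrypt("OQWIWCRWZUIPWE", "Ab ovo"))   # OPINIOCOMMUNIS
```

Decryption has to proceed strictly from left to right, because each key letter after the keyword becomes known only once the corresponding plaintext letter has been recovered.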

Security of the autokey cipher


Because it avoids repeating the same secret key letters, the cipher is resistant to attacks based
on dividing the ciphertext into parts corresponding to successive secret key characters. However, its
weakness is that the key characters form words and sentences which, in addition, are the same
as in the plaintext.

To break the cipher, the intruder should try to guess some parts of the plaintext (for example by
trying some common sequences of letters). Subtracting a guessed fragment from the ciphertext yields
candidate characters of the secret key. One should look for guesses which reveal correct
words among the secret key characters.
Nihilist Cipher
POLYALPHABETIC SUBSTITUTION CIPHER

First used in the 1880s in Russia by Nihilist organizations.

Usage
The cipher is named after the Nihilist movement, which fought against tsarism in Russia
and attacked tsarist officials in the nineteenth century. Its members assassinated Tsar Alexander II
in 1881.

The original algorithm was not very strong but there are some modifications which provide much
better security. One of the ciphers which belongs to the Nihilist family of ciphers is the VIC cipher.

Algorithm
The Nihilist cipher uses a matrix called a Polybius square. It has 5 rows and
5 columns and it is filled with the letters of the Latin alphabet (there are 26 Latin letters, so
usually the letters i and j are treated as one character).
The order of letters in the table depends on a secret word which is shared by the two
communicating parties. To determine the order of letters, one should remove duplicate letters from
the secret word and then enter the remaining letters into the table. Usually the letters are written
starting from the top leftmost cell and going to the right, row by row. However, the parties can agree
on a different order. The remaining empty cells are filled with the letters which aren't contained
in the secret word, usually in alphabetical order.

Both rows and columns of the table are numbered from 1 to 5.

For example, if the full name of Tsar Alexander II, who was killed by the Nihilists (Aleksandr II
Nikolaevich), is used as the secret word, then the table can be written as below:

    1 2 3 4 5
1   a l e k s
2   n d r i o
3   v c h b f
4   g m p q t
5   u w x y z
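
The construction of the table can be sketched in Python as follows. The function name build_polybius_square is hypothetical; the sketch assumes the conventions described above (i and j share one cell, the table is filled row by row, and the remaining letters follow in alphabetical order).

```python
def build_polybius_square(secret_word):
    """Return the 5x5 table as a list of five rows, each a list of five letters."""
    ordered = []
    # Walk through the secret word first, then the whole alphabet, keeping first occurrences.
    for ch in secret_word.lower() + "abcdefghijklmnopqrstuvwxyz":
        if ch == "j":
            ch = "i"  # i and j are treated as one character
        if "a" <= ch <= "z" and ch not in ordered:
            ordered.append(ch)
    return [ordered[r * 5:(r + 1) * 5] for r in range(5)]


for row in build_polybius_square("Aleksandr II Nikolaevich"):
    print(" ".join(row))
```

For the secret word used above, the printed rows match the table: a l e k s, n d r i o, v c h b f, g m p q t, u w x y z.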

Each letter of the secret key and each letter of a given message is changed into a two-digit
number, determined by the digits of its row and column. The secret key is usually different from
the secret word used to create the table in the previous step. The secret key is used during
encryption of the whole communication. Both the secret word (used for creating the table) and
the secret key (used for encrypting all messages) must be shared between the communicating
parties.

During encryption one should add, one by one, the numbers created from the plaintext letters to
the numbers created from the secret key's letters; if the key is shorter than the message, it is
repeated. The results can be two-digit or three-digit numbers. The ciphertext can be sent as
a sequence of numbers, or the numbers can be changed into letters using the same table and
the inverse transform.

For example, the steps for encrypting the sentence Acta est fabula using the secret
keyword Vivere and the Polybius square defined above are presented below.
The plaintext and the secret encrypting key should be written in two rows, one under the other:
a c t a e s t f a b u l a

v i v e r e v i v e r e v

After replacing letters by numbers, the rows have the following form:

11 32 45 11 13 15 45 35 11 34 51 12 11

31 24 31 13 23 13 31 24 31 13 23 13 31

The ciphertext is created by adding the plaintext numbers to the secret key numbers:

42 56 76 24 36 28 76 59 42 47 74 25 42

The recipient, who knows the secret key, subtracts the secret key numbers from the ciphertext
numbers. The results are the plaintext numbers, which can be changed into letters using the same
table as the sender used for encryption.
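
The encryption and decryption steps can be sketched in Python as below. The names SQUARE, to_numbers, nihilist_encrypt and nihilist_decrypt are illustrative and not part of the original description; the square is the one from the example, the ciphertext is kept as a list of numbers, and ordinary addition without modulo is assumed.

```python
from itertools import cycle

# Rows of the Polybius square from the example (i and j share one cell).
SQUARE = ["aleks", "ndrio", "vchbf", "gmpqt", "uwxyz"]


def to_numbers(text):
    """Change letters into two-digit numbers: row digit first, then column digit."""
    numbers = []
    for ch in text.lower().replace("j", "i"):
        for r, row in enumerate(SQUARE, start=1):
            if ch in row:
                numbers.append(10 * r + row.index(ch) + 1)
    return numbers


def nihilist_encrypt(plaintext, key):
    """Add the (repeated) key numbers to the plaintext numbers."""
    return [p + k for p, k in zip(to_numbers(plaintext), cycle(to_numbers(key)))]


def nihilist_decrypt(numbers, key):
    """Subtract the key numbers and look the results up in the square."""
    letters = []
    for c, k in zip(numbers, cycle(to_numbers(key))):
        row, col = divmod(c - k, 10)
        letters.append(SQUARE[row - 1][col - 1])
    return "".join(letters)


ciphertext = nihilist_encrypt("Acta est fabula", "Vivere")
print(ciphertext)
print(nihilist_decrypt(ciphertext, "Vivere"))  # actaestfabula
```

Characters that are not in the square (spaces, punctuation) are silently skipped in this sketch.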

Security of the Nihilist cipher


The Nihilist cipher is quite similar to the Vigenère cipher, but it uses numbers instead of letters.
Therefore, one can use similar methods for its analysis and breaking. When analyzing
frequencies of characters in the ciphertext, one should look at two-digit numbers.
The Nihilist algorithm uses ordinary addition (without a modulo operation), so the ciphertext may
contain three-digit numbers. This happens when the corresponding letters of the plaintext and
the secret key are both located in the last (fifth) row of the table (so both addends are greater
than 50); the only other case is 45 + 55 = 100. It makes the analysis of the ciphertext much easier.

Breaking the Nihilist cipher


The first step in breaking the Nihilist cipher is discovering the length of the secret key. This may
be achieved by trying probable key lengths (for example from 4 to 15 characters). For each candidate
length, one should write out the ciphertext, breaking lines after that many numbers. The resulting
tables have as many columns as the key length currently being analyzed.

Then, for each table, one should check the numbers in all its columns. The numbers in each
column should have tens digits which differ from each other by no more than 5, and likewise ones
digits which differ from each other by no more than 5. This is because all the numbers in a column
have been encoded using the same letter of the secret key, so each plaintext number in the column
has been added to the same secret key number. The value of the secret key number does not affect
the differences between the plaintext numbers which were encoded using this number.

When different numbers from the Polybius square are added to the same number which also belongs to
the Polybius square, the results always have tens digits and ones digits which don't differ from
each other by more than 5 (so the difference between the biggest and the smallest tens digits in
the column can't be bigger than 5; the same applies to the ones digits). In the next steps, one
should use only the tables (that is, only the potential lengths of the secret key) which
satisfy this condition.

The second step is to determine the possible key numbers which could have been used for encryption.
For each column, one should find all numbers which, when subtracted from every ciphertext number
in the column, leave valid Polybius square numbers (values from 11 to 55 with both digits between
1 and 5). Every number which doesn't satisfy this condition should be discarded. In practice, it is
possible to eliminate a lot of potential secret key numbers in that way.

Further analysis of the ciphertext can rely on changing numbers into letters by trial and error,
looking for solutions which disclose fragments of the original plaintext. For each consistent
solution, one should create all possible encrypting keys and subtract them from the ciphertext,
receiving potential plaintext letters.
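
The elimination of key numbers in the second step can be sketched in Python. The helper names is_square_number and candidate_key_numbers are hypothetical; the sketch assumes a 5x5 square with rows and columns numbered 1 to 5 and a column of ciphertext numbers that were all encoded with the same key letter.

```python
def is_square_number(n):
    """True if n is a valid Polybius square number: both digits between 1 and 5."""
    return 1 <= n // 10 <= 5 and 1 <= n % 10 <= 5


def candidate_key_numbers(column):
    """Key numbers that turn every ciphertext number in the column into a valid plaintext number."""
    candidates = []
    for key in range(11, 56):
        if is_square_number(key) and all(is_square_number(c - key) for c in column):
            candidates.append(key)
    return candidates


# First column of the example ciphertext for a key of length 6 (numbers 1, 7 and 13).
print(candidate_key_numbers([42, 76, 42]))  # [21, 31]
```

In this small example only two of the 25 possible key numbers survive, and one of them (31, the letter v) is the key number that was actually used.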

VIC Cipher
POLYALPHABETIC SUBSTITUTION CIPHER

Used by Soviet spies all over the world in the middle of the twentieth century. Its name is based
on the nickname VICTOR of Reino Häyhänen, a Soviet agent spying in the USA.
In 1957 he surrendered to American intelligence and disclosed details of the cipher.

Usage
The VIC cipher is regarded as the most complex modification of the Nihilist cipher family. It is
considered to be one of the strongest ciphers which can be used manually without computers.
Until it was disclosed as a result of betrayal, American counterintelligence had not managed
to break the cipher.

Algorithm
The VIC cipher uses a table which allows changing letters of plaintext into numbers. It is called
a straddling checkerboard.
It differs from tables used in other substitution ciphers because it produces shorter sequences of
numbers (which is much more convenient when sending them to the other party).

The straddling checkerboard can be created in the following form:

    0 1 2 3 4 5 6 7 8 9
    E T   A O N   R I S
2   B C D F G H J K L M
6   P Q / U V W X Y Z .

The top row is populated with the ten digits from 0 to 9. The second row is typically filled with
the most frequent letters in any order. In English, the mnemonic ESTONIA-R can be used to remember
the most frequent letters. Free cells should be left under two of the digits and in the leftmost
column.

Each of the two lower rows receives one of the two digits whose cells were left free in the second
row. Then, the two rows should be filled with the remaining letters in alphabetical order. Because
two cells remain empty, two additional special characters may be entered into the table. They can be
used for special purposes or shortcuts agreed previously between the two parties.

During encryption using VIC one should replace the letters of the message with numbers created
from the digits of rows and columns. The most frequent letters are replaced by only one
digit, that of the column (which results in a shorter ciphertext).

For example, one can encrypt the name of the famous Scottish queen using the table presented
above:

M A R Y Q U E E N O F S C O T S

29 3 7 67 61 63 0 0 5 4 23 9 21 4 1 9

Note that many numbers in the resulting sequence have only one digit.
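
The substitution step can be sketched in Python as follows. The names TOP_ROW, LOWER_ROWS, build_encoding and checkerboard_encode are illustrative only; the table contents are those of the checkerboard shown above, and characters that are not in the table are skipped.

```python
# Second row of the checkerboard, columns 0-9; spaces mark the two free cells (2 and 6).
TOP_ROW = "ET AON RIS"
# The two lower rows, keyed by the digit that labels them.
LOWER_ROWS = {
    "2": "BCDFGHJKLM",
    "6": "PQ/UVWXYZ.",
}


def build_encoding():
    """Map each character to its one-digit or two-digit code."""
    table = {}
    for col, ch in enumerate(TOP_ROW):
        if ch != " ":
            table[ch] = str(col)
    for prefix, row in LOWER_ROWS.items():
        for col, ch in enumerate(row):
            table[ch] = prefix + str(col)
    return table


ENCODE = build_encoding()


def checkerboard_encode(message):
    """Return the list of codes for the characters of the message that are in the table."""
    return [ENCODE[ch] for ch in message.upper() if ch in ENCODE]


print(" ".join(checkerboard_encode("Mary Queen of Scots")))
# 29 3 7 67 61 63 0 0 5 4 23 9 21 4 1 9
```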

The next step is to add a secret sequence of digits to the digits of the created sequence. One
should add, one by one, the digits of the message to the digits of the secret sequence.
After the last digit of the secret sequence, the algorithm goes back to the first digit of the
sequence and continues its work. The addition is done modulo 10, so if the result is 10 or more then
the tens digit should be discarded.

Continuing the example, one could add the received numbers to the secret sequence of four
digits, the year of Mary's birth (1542):

2 9 3 7 6 7 6 1 6 3 0 0 5 4 2 3 9 2 1 4 1 9

+ 1 5 4 2 1 5 4 2 1 5 4 2 1 5 4 2 1 5 4 2 1 5

= 3 4 7 9 7 2 0 3 7 8 4 2 6 9 6 5 0 7 5 6 2 4
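
The digit-by-digit addition can be checked with a few lines of Python. The function name add_key_digits is hypothetical; the sketch assumes both arguments are strings of decimal digits and that the secret sequence is repeated as often as needed.

```python
from itertools import cycle


def add_key_digits(digits, key):
    """Add the repeated key digits to the message digits, modulo 10 (no carries)."""
    return "".join(str((int(d) + int(k)) % 10) for d, k in zip(digits, cycle(key)))


print(add_key_digits("2937676163005423921419", "1542"))
# 3479720378426965075624
```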

The resulting digits can be used as the ciphertext and sent to the second party. Sometimes, it is
a good idea to change the digits back into letters, using the same table as during encryption.
Changing numbers into letters is straightforward. Whenever one encounters one of the two digits
which are assigned to the two lower rows, it should be read together with the next digit as
a two-digit number.
The sequence of digits received previously can be changed into a sequence of letters as below:

3 4 7 9 7 20 3 7 8 4 26 9 65 0 7 5 62 4

A O R S R B A R I O J S W E R N / O

Decryption can be performed using the same straddling checkerboard, the same secret number
and the steps performed in reverse order. The secret number's digits should be subtracted from
the ciphertext's digits. If any of the results is smaller than 0, then one should add 10 to
that result.
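
The reverse steps can be sketched in Python. The names subtract_key_digits and checkerboard_decode are illustrative; the checkerboard is the one defined earlier in this section, and the input is assumed to be a string of decimal digits.

```python
from itertools import cycle

TOP_ROW = "ET AON RIS"  # columns 0-9; spaces mark the free cells
LOWER_ROWS = {"2": "BCDFGHJKLM", "6": "PQ/UVWXYZ."}


def subtract_key_digits(digits, key):
    """Subtract the repeated key digits modulo 10 (a negative result gets 10 added)."""
    return "".join(str((int(d) - int(k)) % 10) for d, k in zip(digits, cycle(key)))


def checkerboard_decode(digits):
    """Read the digits left to right; a row digit (2 or 6) is combined with the next digit."""
    letters = []
    i = 0
    while i < len(digits):
        d = digits[i]
        if d in LOWER_ROWS:
            if i + 1 >= len(digits):
                raise ValueError("row digit at the end of the input has no column digit")
            letters.append(LOWER_ROWS[d][int(digits[i + 1])])
            i += 2
        else:
            letters.append(TOP_ROW[int(d)])
            i += 1
    return "".join(letters)


digits = subtract_key_digits("3479720378426965075624", "1542")
print(digits)                       # 2937676163005423921419
print(checkerboard_decode(digits))  # MARYQUEENOFSCOTS
```

The same checkerboard_decode function also performs the optional change of ciphertext digits into letters described above.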

Security of VIC
The VIC cipher is well designed and provides quite good security. It makes ciphertext analysis
very time-consuming by breaking the original frequency distribution.

There are many modifications of the VIC cipher. Changes can be introduced in the straddling
checkerboard by changing the order of letters. Some cells may be left empty, which makes
cryptanalysis more difficult.

The ciphertext's characters can be modified at the end of encryption using one of
the transposition ciphers' algorithms.

Transposition Ciphers
To encrypt data, transposition ciphers rearrange the original message letters. The same letters
will appear in both plaintext and ciphertext, but the idea is that the permutation used to protect
data should be difficult to break without the knowledge of the secret key.

Usage
Transposition ciphers have been used since ancient times. They are perhaps as old as the oldest
substitution ciphers and steganography methods. At present, in modern ciphers, various
transpositions are used together with substitutions to make cryptanalysis more difficult.
Description
There is no common algorithm that is used in all transposition ciphers. The main
idea is to change the order of the letters in a way that prevents attackers from reading the
message while still allowing the receiver to decrypt it easily and effectively.

Both sender and receiver should share a common secret, usually a keyword, that determines the
exact transpositions that should be applied to the text.

Transposition ciphers usually require more memory and more complex operations than
substitution ciphers. That is why modern ciphers implemented programmatically and electronically
are usually based on substitutions, and less often on transpositions.

Transposition Ciphers:
Rail Fence Cipher
TRANSPOSITION CIPHER

The Rail Fence Cipher is a transposition cipher which rearranges the plaintext letters by writing
them so that they form the shape of the rails of an imaginary fence.

Usage
The Rail Fence Cipher was invented in ancient times. It was used by the Greeks, who created a
special tool, called scytale, to make message encryption and decryption easier. Currently, it is
usually used with a piece of paper. The letters are arranged in a way which is similar to the shape
of the top edge of the rail fence.

Algorithm
To encrypt the message, the letters should be written in a zigzag pattern, going downwards and
upwards between the levels of the top and bottom imaginary rails. The shape that is formed by
the letters is similar to the shape of the top edge of the rail fence.

Next, all the letters should be read off and concatenated, to produce one line of ciphertext. The
letters should be read in rows, usually from the top row down to the bottom one.

The secret key is the number of levels in the rail. It is also the number of rows of letters that
are created during encryption. This number cannot be very big, so the number of possible keys is
quite limited.

For example, let us encrypt the name of one of the countries in Europe: The United Kingdom.
Let's assume that the secret key is 3, so three levels of rails will be produced.
First, we will remove the spaces and capitalize all the letters:

THEUNITEDKINGDOM

Next, the plaintext letters will form the shape of the fence:

T . . . N . . . D . . . G . . .

. H . U . I . E . K . N . D . M

. . E . . . T . . . I . . . O .

Then, the letters should be read row by row, starting from the top one. Finally, they ought to be
concatenated to form one ciphertext message. In our example, the calculated ciphertext
sequence would be:

TNDGHUIEKNDMETIO

To decrypt the message, the receiver should know the secret key, that is, the number of levels of
the rail. Based on the number of rows and the ciphertext length, it is possible to reconstruct the
grid and fill it with letters in the right order (that is, in the same way as used by the sender
during encryption).
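
One possible way to carry out this reconstruction is sketched below in Python. The names rail_pattern and rail_fence_decrypt are hypothetical; the sketch assumes the rows were read off from top to bottom, as in the example above.

```python
def rail_pattern(length, rows):
    """Row index visited by each plaintext position in the zigzag."""
    if rows < 2:
        return [0] * length
    pattern, r, direction = [], 0, 1
    for _ in range(length):
        pattern.append(r)
        if r == rows - 1:
            direction = -1  # bounce off the bottom rail
        elif r == 0:
            direction = 1   # bounce off the top rail
        r += direction
    return pattern


def rail_fence_decrypt(ciphertext, rows):
    pattern = rail_pattern(len(ciphertext), rows)
    # A stable sort of the positions by rail gives the order in which the sender read them off.
    read_order = sorted(range(len(ciphertext)), key=lambda i: pattern[i])
    plaintext = [""] * len(ciphertext)
    for ch, position in zip(ciphertext, read_order):
        plaintext[position] = ch
    return "".join(plaintext)


print(rail_fence_decrypt("TNDGHUIEKNDMETIO", 3))  # THEUNITEDKINGDOM
```

Instead of drawing the grid, the sketch only records which rail each position belongs to, which is all the information the grid carries.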

Security of the Rail Fence Cipher


Ciphertexts produced by transposition ciphers are relatively easy to recognize, because
the frequency distribution of all ciphertext alphabet letters is the same as in plain messages
written in the same language.
Due to the small number of possible keys, the Rail Fence Cipher can be broken quite easily by
using brute force attacks. The attacker should check all the practicable numbers of rail levels that
might have been used during encryption.

Implementation
The encryption function in the Rail Fence Cipher performs two major steps. First, the letters are
entered into a table that represents the imaginary fence. Then, the letters are read off in
rows.

Below, there is a JavaScript function which performs encryption of the input message and returns
the result. Note that the rowNumber input parameter is determined by the cipher's secret key:
function encrypt(messageInput, rowNumber) {
    // With fewer than two rails the zigzag degenerates and the text stays unchanged.
    if (rowNumber < 2) {
        return messageInput;
    }

    var messageOutput = '';
    var fenceTable = [];

    for (var pos = 0; pos < rowNumber; ++pos) {
        fenceTable[pos] = [];
    }

    // First, enter the letters into the fence table:
    var r = 0;
    var direction = 1;

    for (var c = 0; c < messageInput.length; ++c) {
        fenceTable[r].push(messageInput[c]);

        // Reverse the direction on reaching the bottom or the top rail.
        if (((r == rowNumber - 1) && (direction == 1)) ||
            ((r == 0) && (direction == -1))) {
            direction = -direction;
        }

        r = r + direction;
    }

    // Then, read off the ciphertext row by row:
    for (var row = 0; row < rowNumber; ++row) {
        messageOutput += fenceTable[row].join('');
    }

    return messageOutput;
}

You may try out the Rail Fence Cipher online on the Crypto-Online website.

Route Cipher
TRANSPOSITION CIPHER

The Route Cipher is a transposition cipher. It rearranges the plaintext letters based on the shape
of an imaginary path drawn on a grid.

Usage
The Route Cipher is a simple transposition cipher that can be performed manually, without the
need for additional equipment. It was quite popular for centuries, and was used to protect
information when more sophisticated ways were not available.

Currently, the Route Cipher is usually used with a piece of paper. The letters fill a grid whose
dimensions are defined by the secret key.

Algorithm
To encrypt the message, the first step is to create a grid with one dimension determined by the
secret key and the other dependent on the data size. The parties must also agree
which dimension (width or height) is described by the secret key, and in what way the grid will be
filled with plaintext letters (row by row, or column by column). If some cells in the grid remain
empty, one of two possible approaches should be taken:

1. The cells may be left empty, and just ignored during all further operations.
2. The sender may enter some rare letters there, and treat them as a part of the plaintext.
After decryption, the receiver should be able to determine that the letters make no
sense, and that they should be ignored.

One more thing must be agreed by the sender and receiver in order to encrypt the message: the
order in which the letters in the grid should be appended to the ciphertext. The order should not be
too simple, to prevent parts of the plaintext from appearing in the produced ciphertext. On the other
hand, the order should not be too complicated, so that the communicating parties do not have to
remember overly difficult configurations.

The users often choose rules that form some kind of path on the grid, which should be followed
during encryption, for example 'spiral clockwise inwards, starting from the top left corner'. The
name of the cipher is derived from these paths, used for every data encryption and decryption.
The way in which the path is defined is also a part of the secret key of this cipher.

As an example, let's encrypt the name of a city in Great Britain, Brighton and Hove. The secret
key will be 3, and it will determine the width of the grid. We will fill the grid row by row, from left
to right. Finally, we will read the grid clockwise, going inwards, and starting from the top right
corner.
As usual, the encryption starts by removing the non-letter characters, and capitalizing all the
letters:

BRIGHTONANDHOVE

The letters are then entered into the grid, which is 3 columns wide:

BRI

GHT

ONA

NDH

OVE

Luckily, in our case, there is no need to add any additional characters at the bottom of the grid.

The letters are then read, and appended to the ciphertext. The reading starts from the top right
corner and spirals clockwise inwards. The produced encrypted text will be:

ITAHEVONOGBRHND

As we can see, the original text was hidden, and the ciphertext doesn't reveal any plaintext parts.

Knowing the length of the ciphertext and the secret key, the receiver is able to recreate a grid of
the same size as the one used for encryption. Then, knowing the path directions, the receiver
can simply enter the letters into the correct cells. Finally, the plaintext is revealed by reading the
grid in the same way as the sender used to enter the letters into the table.
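
The example route can be sketched in Python as follows. The function name route_encrypt is illustrative, and the sketch covers only this one route (fill row by row, read spiralling clockwise inwards from the top right corner); as a simplifying assumption it also requires the message to fill the grid completely.

```python
def route_encrypt(plaintext, width):
    letters = [ch for ch in plaintext.upper() if ch.isalpha()]
    if len(letters) % width != 0:
        raise ValueError("this sketch needs a message that fills the grid completely")
    # Fill the grid row by row, from left to right.
    grid = [letters[i:i + width] for i in range(0, len(letters), width)]

    top, bottom = 0, len(grid) - 1
    left, right = 0, width - 1
    out = []
    while top <= bottom and left <= right:
        for r in range(top, bottom + 1):          # down the right column
            out.append(grid[r][right])
        right -= 1
        for c in range(right, left - 1, -1):      # left along the bottom row
            out.append(grid[bottom][c])
        bottom -= 1
        if left <= right:
            for r in range(bottom, top - 1, -1):  # up the left column
                out.append(grid[r][left])
            left += 1
        if top <= bottom:
            for c in range(left, right + 1):      # right along the top row
                out.append(grid[top][c])
            top += 1
    return "".join(out)


print(route_encrypt("Brighton and Hove", 3))  # ITAHEVONOGBRHND
```

A different route needs a different traversal loop, which is why implementations of this cipher vary with the paths the parties agree on.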

Security of the Route Cipher


The Route Cipher provides better security than the previously described Rail Fence Cipher, due
to the fact that the secret key defines not only the size of the grid but also the path. There are
many possible paths, so there are a lot of available secret keys. Longer messages protected by
this cipher are considered to be too difficult to break by brute force attacks, even by modern
computers.
On the other hand, not all configurations provide satisfactory protection. Badly chosen keys may
reveal plaintext fragments, thus making the cryptanalysis much easier.

Implementation
The main functionality of the Route Cipher is reading and following the shape which forms the
path. Implementations will differ depending on the types of paths used.
You may try out the Route Cipher online on the Crypto-Online website.

Columnar Transposition
TRANSPOSITION CIPHER

The Columnar Transposition rearranges the plaintext letters, based on a matrix filled with letters
in the order determined by the secret keyword.

Usage
The Columnar Transposition is a simple transposition cipher that can be performed manually,
without the need for additional equipment. It was very popular for centuries, and it
was used in various situations by diplomats, soldiers, and spies.

The encryption and decryption can be performed by hand, using a piece of paper and a simple
matrix, into which the user enters the letters of the message.

Algorithm
The name of the cipher comes from the operations on the columns of a matrix that are performed
during both encryption and decryption. The number of columns of the matrix is determined by the
secret key.

The secret key is usually a word (or just a sequence of letters). It has to be converted into a
sequence of numbers. The numbers are defined by the alphabetical order of the letters in the
keyword. The letter which is first in the alphabet will be the number 1, the second letter in the
alphabetical order will be 2, and so on.
If there are multiple identical letters in the keyword, each next occurrence of the same letter
should be converted into a number that is equal to the number for the previous occurrence increased
by one.

For example, the keyword:

SWINDON

would produce the following sequence of numbers:

6723154

Note that the two letters N were converted into the numbers 3 and 4.

To encrypt a message, all the letters should be entered into the matrix, row by row, from left to
right. The size of the matrix depends on the length of the message. The only known dimension is
the width, which is determined by the length of the secret keyword (which is the same as the length
of the corresponding sequence of numbers), and is known to both sides of the communication.

If, after entering the whole message, there are some empty cells in the bottom row of the matrix,
one of two approaches can be taken:

1. The cells may be left empty, and just ignored during all further operations (this is the
so-called irregular columnar transposition cipher).
2. The sender may enter some rare letters there, and treat them as a part of the plaintext.
After decryption, the receiver should be able to determine that the letters make no
sense, and that they should be ignored (in this case, the cipher is called a regular
columnar transposition cipher).

Next, the letters should be read off in a specific way, and written down to form the ciphertext. The
order of reading the letters is determined by the sequence of numbers produced from the
keyword. They should be read column by column, from top to bottom, starting from the column
whose position is the same as the position of the number 1 in the key sequence. The next column
to read off is determined by the number 2 in the key sequence, and so on, until all the columns
are read off. To make this step easier, it is recommended to write the sequence numbers above
the corresponding columns.

As an example, let's encrypt the message A Midsummer Night's Dream, which is the title of a comedy
written by Shakespeare. We will use the secret key mentioned above. The number sequence derived
from this keyword is 6723154, so the matrix created for the encryption will have seven columns.
After removing all non-letter characters, and changing the letters to upper case, the message
should be entered into the table:

6 7 2 3 1 5 4

A M I D S U M

M E R N I G H

T S D R E A M

Above the message, there are the numbers derived from the keyword. These numbers determine the
order in which the columns should be read (top to bottom) and appended to the produced
ciphertext. In our example, the first column will be SIE, the second will be IRD, and so on. The
produced ciphertext is:

SIE IRD DNR MHM UGA AMT MES

Finally, after removing the spaces, which were added to indicate separate columns, we receive
the encrypted message:

SIEIRDDNRMHMUGAAMTMES

To decrypt a received ciphertext, the receiver has to perform the following steps:

1. Knowing the secret keyword and the length of the received message, the receiver should create
a table of the same size as the one used for encryption.
2. The ciphertext should be entered into columns, from the leftmost column to the
rightmost column, from top to bottom.
3. The columns should be rearranged, and put into the order defined by the keyword.
4. The decrypted message should be read out, row by row, starting from the top row, and
from left to right.

Security of the Columnar Transposition


The Columnar Transposition was used for serious purposes all over the world, until the beginning
of the second half of the 20th century.

To break the ciphertext, an attacker should try to create tables of different sizes, enter the
encrypted message down into the columns, and for each table look for anagrams appearing in
rows.

Implementation
Encryption
Below, there are encryption functions written in Python. The input parameters are the message
and the secret keyword. The main function, encrypt, uses two helper functions to create the
matrix and the keyword sequence of numbers.
def encrypt(message, keyword):
    matrix = createEncMatrix(len(keyword), message)
    keywordSequence = getKeywordSequence(keyword)

    ciphertext = ""
    # Read the columns in the order 1, 2, 3, ... given by the keyword sequence.
    for num in range(len(keywordSequence)):
        pos = keywordSequence.index(num + 1)
        for row in range(len(matrix)):
            # The last row may be shorter (irregular variant), so skip missing cells.
            if len(matrix[row]) > pos:
                ciphertext += matrix[row][pos]
    return ciphertext

def createEncMatrix(width, message):
    # Fill the matrix row by row, starting a new row after every `width` letters.
    r = 0
    c = 0
    matrix = [[]]
    for ch in message:
        matrix[r].append(ch)
        c += 1
        if c >= width:
            c = 0
            r += 1
            matrix.append([])

    return matrix

def getKeywordSequence(keyword):
    # Number the keyword letters alphabetically; repeated letters are numbered left to right.
    sequence = []
    for pos, ch in enumerate(keyword):
        previousLetters = keyword[:pos]
        newNumber = 1
        for previousPos, previousCh in enumerate(previousLetters):
            if previousCh > ch:
                sequence[previousPos] += 1
            else:
                newNumber += 1
        sequence.append(newNumber)
    return sequence

Decryption
The Python functions written below decrypt a Columnar Transposition ciphertext. The input
parameters are the message and the secret keyword. The main function, decrypt, uses helper
functions to create the matrix and the keyword sequence of numbers.
def decrypt(message, keyword):
    matrix = createDecrMatrix(getKeywordSequence(keyword), message)

    plaintext = ""
    # Read the filled matrix row by row.
    for r in range(len(matrix)):
        for c in range(len(matrix[r])):
            plaintext += matrix[r][c]
    return plaintext

def createDecrMatrix(keywordSequence, message):
    width = len(keywordSequence)
    # Integer division, rounded up when the last row is incomplete.
    height = len(message) // width
    if height * width < len(message):
        height += 1

    matrix = createEmptyMatrix(width, height, len(message))

    pos = 0
    for num in range(len(keywordSequence)):
        column = keywordSequence.index(num + 1)

        # Fill the column from top to bottom, stopping where the shorter last row ends.
        r = 0
        while (r < len(matrix)) and (len(matrix[r]) > column):
            matrix[r][column] = message[pos]
            r += 1
            pos += 1

    return matrix

def createEmptyMatrix(width, height, length):
    # Create rows of empty cells; the last row holds only the leftover cells.
    matrix = []
    totalAdded = 0
    for r in range(height):
        matrix.append([])
        for c in range(width):
            if totalAdded >= length:
                return matrix
            matrix[r].append('')
            totalAdded += 1
    return matrix

def getKeywordSequence(keyword):
    # Same helper as in the encryption listing.
    sequence = []
    for pos, ch in enumerate(keyword):
        previousLetters = keyword[:pos]
        newNumber = 1
        for previousPos, previousCh in enumerate(previousLetters):
            if previousCh > ch:
                sequence[previousPos] += 1
            else:
                newNumber += 1
        sequence.append(newNumber)
    return sequence

Double Columnar Transposition
TRANSPOSITION CIPHER

The Double Columnar Transposition rearranges the plaintext letters, based on matrices filled with
letters in the order determined by the secret keyword.

Usage
The Double Columnar Transposition was introduced as a modification of the Columnar
Transposition. It is quite similar to its predecessor, and it has been used in similar situations.
The encryption and decryption can be performed by hand, using a piece of paper and a simple
matrix, in a similar way as it is done for the Columnar Transposition.

Algorithm
The Double Columnar Transposition was introduced to make cryptanalysis of messages
encrypted by the Columnar Transposition more difficult. It was supposed to prevent anagrams of
the plaintext words from appearing in the analysed ciphertext.
The main idea behind the Double Columnar Transposition is to encrypt the message twice, by
using the original Columnar Transposition, with identical or different secret keys. The output from
the first encryption is the input to the second encryption.
The matrices used in the two steps may have different sizes if two keywords of different lengths
have been used.

All the operations performed during encryption and decryption, and all the parameters that have to
be defined, remain the same as in the Columnar Transposition.
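
The two passes can be sketched in Python as below. The names keyword_order, columnar_encrypt and double_columnar_encrypt are hypothetical, and the sketch is a compact alternative to a matrix-based implementation: it assumes the irregular variant (no padding) and a keyword and message that have already been converted to upper-case letters.

```python
def keyword_order(keyword):
    """Column indices in the order in which they are read off."""
    # sorted() is stable, so repeated letters are taken from left to right.
    return sorted(range(len(keyword)), key=lambda i: keyword[i])


def columnar_encrypt(message, keyword):
    width = len(keyword)
    # Column c of the matrix holds every width-th letter starting at position c.
    return "".join(message[c::width] for c in keyword_order(keyword))


def double_columnar_encrypt(message, first_keyword, second_keyword):
    # The output of the first transposition is the input to the second one.
    return columnar_encrypt(columnar_encrypt(message, first_keyword), second_keyword)


print(columnar_encrypt("AMIDSUMMERNIGHTSDREAM", "SWINDON"))
# SIEIRDDNRMHMUGAAMTMES
print(double_columnar_encrypt("AMIDSUMMERNIGHTSDREAM", "SWINDON", "SWINDON"))
```

The first call reproduces the single Columnar Transposition example; the second call uses the same keyword for both passes, which the description above allows, although different keywords can be used as well.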

Security of the Double Columnar Transposition


Breaking the Double Columnar Transposition is more difficult than breaking its simpler version,
due to the fact that anagrams will not appear when trying to apply different sizes of matrices to
the intercepted ciphertext.

An attacker has to try many different combinations of keywords in order to find patterns in the
ciphertext. The cipher is more likely to be broken if multiple messages of the same length,
encrypted with the same keys, were intercepted. They can be anagrammed simultaneously, which
makes the cryptanalysis much more effective.

It may be estimated that having a few messages of the same length, encrypted with identical
keys, would allow the attacker to determine both the plaintexts and the secret keys. This technique
was widely used by the French for breaking German messages at the beginning of World War I,
until the Germans improved their system.

The Double Columnar Transposition remains one of the strongest ciphers that can be used
manually, without the need for electronic equipment. Another cipher that is considered to
be equally strong is the VIC cipher.

Myszkowski Transposition
TRANSPOSITION CIPHER

The Myszkowski Transposition rearranges the plaintext letters, based on a matrix filled with letters
in the order determined by the secret keyword.
Usage
The Myszkowski Transposition is a very similar cipher to the Columnar Transposition. It was
proposed by Émile Victor Théodore Myszkowski in 1902.
The encryption and decryption can be performed by hand, using a piece of paper and a simple
matrix, into which the user enters the letters of the message.

Algorithm
Similarly to the Columnar Transposition, both encryption and decryption are performed by using
a matrix. The number of columns of the matrix is determined by the secret key.
The Myszkowski Transposition requires the secret key that is usually a word (or just a sequence
of letters). It should be converted into a sequence of numbers. The numbers are defined by an
alphabetical order of the letters in the keyword. The letter which is first in the alphabet will be the
number 1, the second letter in the alphabetical order will be 2, and so on.
Unlike in the Columnar Transposition, the keyword has to contain some repeated letters.
Identical letters are assigned the same number (which, again, differs from the
Columnar Transposition).

For example, the keyword:


SWINDON
would produce the following sequence of numbers:
5623143
We can see that both letters N were converted into threes.
To encrypt a message, all the letters should be entered into the matrix, row by row, from left to
right. The size of the matrix depends on the length of the message. The only known dimension is
width, which is determined by the length of the secret keyword (which is the same as the length
of the corresponding sequence of numbers), and known to both sides of the communication.

If, after entering the whole message, there are some empty cells in the bottom row of the matrix,
one of two approaches can be taken:

1. The cells may be left empty, and just ignored during all further operations.
2. The sender may fill them with some rare letters and treat them as a part of the plaintext.
After decryption, the receiver should be able to recognize that these letters make no
sense and that they should be ignored.
Next, the letters should be read off in a specific way and written down to form the ciphertext. The
order of reading the letters is determined by the sequence of numbers produced from the
keyword. They should be read column by column, from top to bottom, starting from the column
whose position is the same as the position of the number 1 in the key sequence. The next column
to read off is determined by the number 2 in the key sequence, and so on, until all the columns
are read off.
If several columns have the same number assigned to them (and as we know, this is a
requirement for this cipher), the letters from those columns should be read off together, row by
row. This procedure is different from the one performed in the Columnar Transposition,
so the two ciphers produce different ciphertexts even if all other
parameters are identical.

To make this step easier, and to allow the user to quickly locate each column, it is recommended
to write the sequence numbers above the corresponding columns.
For example, let's encrypt a message A Midsummer Night's Dream, which is a comedy written by
Shakespeare (this is the same message we encrypted with the Columnar Transposition). We will
use the secret key mentioned above. The number sequence derived from this keyword
is 5623143, so the matrix created for the encryption will have seven columns.
After removing all non-letter characters, and changing the letters to upper case, the message
should be entered into the table:

5 6 2 3 1 4 3

A M I D S U M

M E R N I G H

T S D R E A M

The numbers derived from the keyword are written above the message. These numbers
determine the order in which the columns should be read (top to bottom) and appended to the
produced ciphertext. In our example, the first column will be SIE, and the second will be IRD.
Then, there are two columns which have threes assigned: DNR and MHM. The letters from them
should be read off row by row, starting from the top. All the remaining columns should be dealt
with in the usual way. The produced ciphertext is:
SIE IRD DMNHRM UGA AMT MES
Finally, after removing the spaces, which were added to indicate separate columns, we receive
the encrypted message:
SIEIRDDMNHRMUGAAMTMES
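The encryption procedure described above can be sketched in Python. This is only a minimal illustration under stated assumptions: the function names key_numbers and myszkowski_encrypt are made up for this example, and empty cells in the last row are simply skipped (the first of the two approaches listed earlier).

```python
def key_numbers(keyword):
    """Convert a keyword into its number sequence; equal letters share a number."""
    key = keyword.upper()
    rank = {letter: i + 1 for i, letter in enumerate(sorted(set(key)))}
    return [rank[letter] for letter in key]


def myszkowski_encrypt(plaintext, keyword):
    """Encrypt with the Myszkowski Transposition (empty cells are ignored)."""
    # Keep letters only and convert them to upper case.
    text = [ch for ch in plaintext.upper() if ch.isalpha()]
    numbers = key_numbers(keyword)
    width = len(numbers)
    # Fill the matrix row by row, from left to right.
    rows = [text[i:i + width] for i in range(0, len(text), width)]

    out = []
    for number in sorted(set(numbers)):
        # All columns that carry this key number (one or several).
        cols = [c for c, k in enumerate(numbers) if k == number]
        # A single column is read top to bottom; columns sharing a number
        # are read together, row by row.
        for row in rows:
            for c in cols:
                if c < len(row):
                    out.append(row[c])
    return "".join(out)
```

With the keyword SWINDON, key_numbers returns 5 6 2 3 1 4 3, and encrypting A Midsummer Night's Dream gives SIEIRDDMNHRMUGAAMTMES, as in the example above.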
To decrypt a received ciphertext, the receiver has to perform the following steps:

1. Knowing the secret keyword and the length of the received message, the receiver creates
a table of the same size as the one used for encryption.
2. The ciphertext should be entered into columns, from the leftmost column to the
rightmost column, from top to bottom (at this stage the columns are in ascending order of
their key numbers). The columns with the same numbers should be filled together, row by row.
3. The columns should be rearranged, and put into the order defined by the keyword.
4. The decrypted message should be read out, row by row, starting from the top row, and
from left to right.
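A possible Python sketch of decryption is shown below. It is a simplification rather than a literal transcription of the steps: instead of filling sorted columns and rearranging them afterwards, it writes each ciphertext letter directly into the column that the key number points to, which gives the same table. The function name myszkowski_decrypt is hypothetical, and the sketch assumes that empty cells of the last row were left out during encryption.

```python
def myszkowski_decrypt(ciphertext, keyword):
    """Decrypt a Myszkowski ciphertext (empty cells in the last row were skipped)."""
    key = keyword.upper()
    rank = {letter: i + 1 for i, letter in enumerate(sorted(set(key)))}
    numbers = [rank[letter] for letter in key]
    width = len(numbers)

    # Rebuild the shape of the table; the last row may be shorter.
    n_rows = -(-len(ciphertext) // width)  # ceiling division
    grid = [[None] * min(width, len(ciphertext) - r * width) for r in range(n_rows)]

    letters = iter(ciphertext)
    for number in sorted(set(numbers)):
        cols = [c for c, k in enumerate(numbers) if k == number]
        # Columns sharing a key number are filled together, row by row.
        for row in grid:
            for c in cols:
                if c < len(row):
                    row[c] = next(letters)

    # Read the plaintext row by row, from left to right.
    return "".join("".join(row) for row in grid)
```

For the ciphertext SIEIRDDMNHRMUGAAMTMES and the keyword SWINDON this returns AMIDSUMMERNIGHTSDREAM.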

Security of the Myszkowski Transposition


The Myszkowski Transposition provides a level of protection comparable to that of the Columnar
Transposition.

To break the ciphertext, an attacker should act in a similar way. That is, the attacker should create
tables of different sizes, enter the encrypted message down into the columns, and for each table
look for anagrams appearing in rows.
Cryptographic Rotor Machines
Electric rotor machines were electro-mechanical devices that made it possible to use encryption
algorithms much more complex than the ciphers that were used manually. They were developed in the
middle of the second decade of the 20th century. They became one of the most important
cryptographic solutions in the world for the following decades.

Usage
The concept of using rotor machines in cryptography occurred to a number of inventors
independently. At present, two Dutch naval officers, Theo A. van Hengel (1875–1939) and R. P.
C. Spengler (1875–1955), are credited with inventing the first rotor cipher machine in 1915.
Four more people created (more or less independently) their own cryptographic rotor
machines shortly afterwards: Edward Hebern, Arvid Damm, Hugo Koch and Arthur Scherbius.
Electro-mechanical machines fitted with movable rotors were able to produce long, seemingly random
keystreams, thus making it possible to encrypt messages with complicated polyalphabetic substitution
ciphers.
Description
The main idea behind rotor machines is relatively simple. One can imagine a simple
device, similar to a typewriter, with a number of keys used to input text. The number of keys may
differ, but usually there are 26 to 32 characters.

Simple substitution cipher


Each keystroke produces an output character, depending on the internal construction of the
machine. In the simplest case, if only wires are used (without any rotors), each input key will be
mapped to one specific output character.

For example, if someone pressed K on the keyboard, the machine would always produce C.
As a result, the machine would encrypt the messages by using a simple substitution cipher.
Adding a rotor
Having a simple substitution machine, one can imagine adding an additional internal rotor with
an internal wiring. The rotor will rotate with a gear, each time after a keystroke. As a result, after
pressing the same letter twice, it will be encoded differently due to different internal wiring.

For example, if someone pressed KK on the keyboard, the machine would produce CB (because
the wiring changed after the first keystroke, due to the rotor movement).
The internal wiring of the rotor should be kept secret, but we may expect that over time the
enemy will discover its design. This will make it easier for them to break the cipher, but it won't
compromise the security altogether.

To decode a ciphertext, the receiver would need a machine with the same rotor. Adding the rotor
turns the encryption into a stronger, polyalphabetic substitution cipher.
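The effect of a single rotating rotor can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the wiring string is an arbitrary permutation of the alphabet, the rotor starts at position 0, and the names rotor_encrypt and rotor_decrypt are hypothetical. The input is expected to contain upper-case letters A-Z only.

```python
import string

ALPHABET = string.ascii_uppercase

# Arbitrary example wiring (an assumption for this sketch): the letter at
# index i is the output for input ALPHABET[i] while the rotor is at position 0.
WIRING = "EKMFLGDQVZNTOWYHXUSPAIBRCJ"


def rotor_encrypt(text, wiring=WIRING):
    """Encrypt with one rotor that advances by one position after each keystroke."""
    out = []
    for keystroke, ch in enumerate(text):
        shift = keystroke % 26                      # current rotor position
        contact = (ALPHABET.index(ch) + shift) % 26  # contact reached on the rotor
        wired_to = ALPHABET.index(wiring[contact])   # follow the internal wire
        out.append(ALPHABET[(wired_to - shift) % 26])
    return "".join(out)


def rotor_decrypt(text, wiring=WIRING):
    """Invert rotor_encrypt by following the same wires in the opposite direction."""
    out = []
    for keystroke, ch in enumerate(text):
        shift = keystroke % 26
        contact = (ALPHABET.index(ch) + shift) % 26
        wired_from = wiring.index(ALPHABET[contact])
        out.append(ALPHABET[(wired_from - shift) % 26])
    return "".join(out)
```

With this particular wiring, KK is encrypted as NS: the same key pressed twice yields two different letters, which is the polyalphabetic behaviour described above (the letters differ from C and B in the text because the wiring is different).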
Make it difficult
To improve the security, one could add more rotors. The output of one rotor would be connected
to the input of the second rotor. Similarly, the second rotor output would be connected to the third
one, and so on. The strength of the encryption depends on several factors:
o the number of rotors inside the machine.
o the size of each rotor.
o the number of rotor types (with different internal wirings).
Each rotor would contain a different internal wiring. The substitution performed by each rotor
should be unknown to the enemy. To make cryptanalysis more difficult and to ensure that the
wiring inside each rotor changes with a different frequency, the discs should rotate at different
speeds.

Additionally, depending on the design of the machine, some additional features may be added
to ensure that the produced substitution is as random as possible (for example, an
additional fixed substitution that does not depend on the rotors).

Some rotor machines (most notably Enigma) were designed to be symmetrical. That means that
encrypting the same message twice (with the same settings), would produce the original
message.
Hebern Cryptographic Rotor Machine
The Hebern rotor machine was one of the first cryptographic rotor machines that made it possible
to encrypt messages automatically and efficiently. It was intended to provide more complex
cryptographic algorithms than the ciphers used manually.

Usage
Edward Hugh Hebern created his first rotor machine in 1917 and patented it one year later. By
that time, he had already invented several electric machines intended for
message encryption and decryption. His cryptographic rotor machines never became
popular because of design weaknesses, and the US Army purchased no more than a few
copies.

Algorithm
Similar to other cryptographic rotor machines, the Hebern machine used a disc with electrical
wires to encode and decode characters. Each rotor contained 26 electrical contacts on either
side. While rotating, the contacts changed the connections to the wires on both sides of the disc.
The wires on both sides of the disc were connected to input and output characters, so the rotating
disc acted as a simple substitution cipher.

The rotor installed in the Hebern machine was advanced by a gear each time a key was pressed. The
secret key in this case can be thought of as the internal wiring of the rotor. Because the rotor had
26 connections, the key settings would be reused after 26 characters. Attacking such a cipher
would be a relatively easy task, and the amount of work would be comparable to attacking
old polyalphabetic substitution ciphers.

Over time, to make the key longer, Hebern added more rotors to the machine. All input
letters passed through all rotors, which means that every letter was changed several times
before it was delivered as the output ciphertext character.

The first rotor moved after each keystroke, while each subsequent rotor turned once after the
previous one had rotated a full turn.
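The stepping rule described above works like a car odometer. The sketch below only models the rotor positions, not the wiring, and the name odometer_positions is hypothetical. It shows that with n rotors of 26 contacts the positions repeat after 26 to the power n keystrokes, and that every rotor except the first stays still for stretches of 26 keystrokes.

```python
def odometer_positions(n_rotors, n_keystrokes, size=26):
    """Yield the tuple of rotor positions used for each successive keystroke."""
    positions = [0] * n_rotors
    for _ in range(n_keystrokes):
        yield tuple(positions)
        # Advance the first rotor; a rotor that completes a full turn
        # (returns to 0) also advances the next rotor.
        for i in range(n_rotors):
            positions[i] = (positions[i] + 1) % size
            if positions[i] != 0:
                break


# Two rotors: the second one moves only after the first has made a full turn.
two_rotor_run = list(odometer_positions(2, 26 * 26 + 1))
```

In two_rotor_run, the first 26 entries all have the second rotor at position 0, entry 26 is (0, 1), and entry 676 is (0, 0) again, so the cycle length is 676 keystrokes.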

Security of the Hebern rotor machine


The machine was shown to be insecure by an American cryptographer, William Frederick
Friedman. He demonstrated that, because each rotor moved only when the previous disc had
rotated a full turn, the whole algorithm could be divided into a number of single substitution
ciphers, each one applied to a 26-letter stretch of text. This means that the encryption was
easy to break by using common frequency analysis methods.


Images:
The Hebern single-rotor machine

The Hebern multiple-rotor machine

Date: 2020-03-09
Lorenz Cryptographic Rotor Machine
The Lorenz rotor machines were used by the Germans during World War II for strategic
communication between major cities in German–occupied Europe.

Three versions of the Lorenz machine were created during the 1940s: SZ40 (started being used
in 1941), SZ42A (1943) and SZ42B (1944). The letters SZ, which form the model names,
originated from the German word Schlüsselzusatz, which means cipher attachment. And indeed,
these machines were constructed as an attachment to a standard Lorenz teleprinter. Thus, the
cryptographic extension could be attached to a teleprinter and extend its functionality.

Usage
The Lorenz rotor machine was developed in 1940 by the German company Lorenz, which was
a major manufacturer of telecommunication equipment in Germany at that time.

Algorithm
The Lorenz cryptographic machine was intended to implement OTP encryption. The idea
was to use an electro-mechanical machine to overcome the problem of distributing keystream
characters (a task that would be even more difficult during the war). The rotors would turn at
different speeds, thus generating a seemingly random sequence. The receiver could
regenerate the sequence if they used the same rotors and the same initial parameters.
All characters in the Lorenz teleprinter were encoded by using 5-bit Baudot codes. Both input and
output letters were encoded on a paper tape. Each plaintext letter was XOR-ed with a secret-key
character, which was also encoded by using 5 bits. A pseudorandom keystream was generated
character-by-character by the internal rotor mechanism.
A diagram of the Lorenz machine

Pseudorandom key bits were generated by 10 rotors. Five rotors turned after every
character, whereas the other five turned only after some characters, depending on the
output of two additional discs, called the motor wheels.

The main rotors were connected in pairs. Each of the five plaintext bits (which encoded one
letter) passed first through an always-rotating wheel and then through the corresponding
sometimes-rotating wheel. The signal value could be changed by either of them, depending
on the rotor positions.

The two motor rotors were connected one after another. The movement of the second motor rotor
was triggered by the first one. The five sometimes-rotating wheels moved together whenever the
position of the second motor rotor triggered that.

Each wheel was fitted with a different number of cams, so the wheel patterns repeated with
different periods. Also, the numbers were all co-prime with each other, to provide the longest
possible time before the combined pattern repeated.
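The benefit of co-prime wheel sizes can be checked with a short calculation. The sketch below assumes wheels that all step together on every character; the wheel sizes are small made-up numbers (not the real Lorenz cam counts) and the function name combined_period is hypothetical.

```python
from functools import reduce
from math import gcd


def combined_period(wheel_sizes):
    """Steps until all wheels are back at their starting positions together.

    This is the least common multiple of the wheel sizes.
    """
    return reduce(lambda a, b: a * b // gcd(a, b), wheel_sizes, 1)


coprime_wheels = [5, 7, 9]   # hypothetical sizes, pairwise co-prime
shared_factors = [6, 8, 9]   # hypothetical sizes with common factors
```

For the co-prime sizes the period equals the full product, 5 * 7 * 9 = 315, while the wheels with shared factors repeat after only 72 steps although their product is 432.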

The key sequence generated by the Lorenz rotor machine depended on its initial configuration:
o The patterns of cams on the wheels.
o The starting rotor positions.
From the second half of 1944, the cam settings were changed daily (before that, much less
frequently). They were distributed in secret codebooks.

The initial wheel positions (12-letter indicator) were chosen by the operator before each
transmission and sent without encryption at the beginning of the message. Later, the procedure
changed and the operators sent 2-digit codes, which could be found in a codebook called the QEP
book. The codes corresponded to the initial wheel positions.

Security of the Lorenz rotor machine


The Allies were able to break the Lorenz cipher relatively quickly.

One of the typical dangers related to the use of one-time encryption by radio operators was
sending the same message twice, encrypted with the same secret key (the same initial
settings). This situation might occur when the receiver had some problems with recording
the message.

If the sender used exactly the same secret key to encrypt exactly the same message, intercepting
the communication would not provide any information to the eavesdropper. Unfortunately, when
sending the second message, the sender could make some small changes in the text, like adding
abbreviations or changing single words.

The Allies were lucky to intercept the message that allowed them to break the cipher in the middle
of 1941. It was broadcast twice, by using the same secret key. The second message had some
abbreviations made at the beginning of it. Also, it was long enough to allow the British to break
the code and to recover both plaintexts and the keystream characters.

After discovering the secret keystream, the Allies managed to determine the internal structure of
the machine, in spite of the fact that almost until the end of the war they hadn't seen any Lorenz
machine. The rotor mechanism design was not optimal and the true security provided by the
machine turned out to be much weaker than predicted by the Germans.

According to the German inventors, the number of possible combinations of internal rotor
positions was impressive, too large to make it possible to break the security by using brute force
attacks. However, due to the fact that each bit of the letter was encoded separately (each bit was
passed through only two rotors), the actual number of possible combinations was much smaller.
Also, because all the sometimes-rotating wheels turned at the same moment, the machine
produced relatively long parts of ciphertext that were not affected by those wheels (between their
turns). This turned out to be the crucial drawback of the cipher.

As a result, the team of British code breakers managed to build their own machines for decrypting
intercepted messages. The Allies were able to read much of the German communication encrypted by
the Lorenz machine.

Image:
The Lorenz SZ42 machine

Enigma Cryptographic Rotor Machine
Enigma was one of the most popular and most important cryptographic electro-mechanical rotor
machines. It was used in the middle of the 20th century by the Germans and other nations for
encryption of thousands of radio messages. The Enigma machine was invented by a German
engineer Arthur Scherbius at the end of World War I.

Usage
The Enigma machines started to be used commercially in the early 1920s but gained most
popularity for their military applications. During World War II they were used by the German
armed forces and became their main cryptographic tool for encryption and decryption of tactical and
strategic communication.
Even after the war, the Enigma machines were still used by various agencies in some countries,
until it was revealed, in the second half of the 20th century, that their security had been
compromised by the Allies as early as the beginning of World War II.

Algorithm
The Enigma machines were produced in many versions and they were continuously improved
over time. As in other cryptographic rotor machines, the strength of the encryption depended mainly
on several rotors, each with 26 electrical contacts corresponding to the 26 letters of the alphabet.

To make the cipher even stronger, the Germans added some more elements, like the plugboard.
Also, to allow encryption and decryption by using the same machines with the same parameters,
an additional important device was added, called the reflector.

In the most popular Enigma version, a signal from the key pressed by the user was passing first
through the plugboard, then the entry wheel, then through the three moving rotors until it reached
the reflector. After being processed by the reflector, the signal went all the way back through
the three rotors, the entry wheel and the plugboard up to the output panel of lamps.

A diagram of the Enigma machine

Rotors
The number of rotors in different Enigma versions varied. At the beginning, the German army
used a version with three different available rotors. All three rotors were installed in the
machines each day and used for encryption. The rotor order (Walzenlage) and the ring
settings (Ringstellung) were a part of the secret daily configuration, distributed in
codebooks. The ring settings were the positions of the alphabet rings relative to the rotor wiring.
The initial position of each rotor (Grundstellung) was supposed to be set randomly by the operator
before sending a message.
In December 1938, two new rotors were added, but still only three rotors were
installed in the machines each day. The number of possible initial combinations increased
significantly.
The Naval version of Enigma was distributed with more rotors than the ordinary Army version.
Starting from six, the number of available rotors gradually increased up to eight. Also, the Naval
Enigma discs rotated more frequently due to a different number of notches. The later
versions of Naval Enigma used four rotors at once, instead of three.

The plugboard
The military versions of Enigma contained an additional element called
the plugboard (Steckerbrett in German). If installed, it would connect the keyboard output to the
rest of the machine.
The plugboard allowed the operator to create pairs of letters, by using cables plugged into their
corresponding connectors. The signals from the letters connected by a cable were swapped
twice: the first time before they entered the main rotor mechanism, and the second time at the end of
the encryption process, just before producing the output.

It was possible to create up to 13 connections but usually only 10 were used. The connections of
the plugs in the plugboard (Steckerverbindungen) were a part of the initial everyday configuration,
available from the secret codebook.
The plugboard turned out to be a useful feature, which significantly increased the strength of the
cipher.
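The contribution of the plugboard can be estimated by counting the ways of connecting a given number of cables. The sketch below uses the standard counting formula for choosing disjoint pairs of letters; the function name plugboard_settings is hypothetical.

```python
from math import factorial


def plugboard_settings(cables, letters=26):
    """Number of ways to connect `cables` disjoint pairs among `letters` sockets."""
    unused = letters - 2 * cables
    if cables < 0 or unused < 0:
        raise ValueError("too many cables for the number of letters")
    # letters! orderings, divided by the orderings of unused sockets, the
    # orderings of the cables, and the two orientations of every cable.
    return factorial(letters) // (factorial(unused) * factorial(cables) * 2 ** cables)
```

With 10 cables there are 150,738,274,937,250 possible plugboard settings. Perhaps surprisingly, using all 13 cables gives fewer combinations (7,905,853,580,625), so the usual choice of 10 did not weaken the machine.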

The entry wheel


All Enigmas were fitted with an element called the entry wheel or the entry stator (Eintrittswalze in
German). This was a simple static wiring which connected the keyboard or the plugboard to the
rotor mechanism.
The commercial Enigma versions had the keys connected in the order of their sequence on
a QWERTZ keyboard (Q-A, W-B, E-C, R-D, etc.), while the military versions connected the letters
in alphabetical order: A-A, B-B, and so on.
The reflector
A reflector (Umkehrwalze in German) was a device which made Enigma machines symmetrical.
This means that encrypting the same message twice, would produce the original message.
The reflector connected the outputs of the last rotor in pairs, thus sending the signals back, again
through the whole encryption mechanism.

There were several versions of this device. It made using the machine much easier and was one
of the reasons for the popularity of Enigma. On the other hand, due to some mathematical properties
(in particular, no letter could ever be encrypted to itself), it significantly reduced the cipher strength.
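Both properties of the reflector, the symmetry and the fact that a letter never encrypts to itself, can be observed in a toy model. The sketch below is not an Enigma simulator: it uses a single stepping rotor with a hypothetical wiring (a random permutation from a fixed seed), a hypothetical reflector that pairs each letter with the one 13 places further, and no plugboard. The input is expected to contain upper-case letters A-Z only.

```python
import random
import string

ALPHABET = string.ascii_uppercase
N = len(ALPHABET)

# Hypothetical components, chosen only for illustration.
ROTOR = list(range(N))
random.Random(2024).shuffle(ROTOR)               # some fixed permutation
ROTOR_INV = [ROTOR.index(i) for i in range(N)]   # the same wires, reversed
REFLECTOR = [(i + 13) % N for i in range(N)]     # swaps letters in 13 pairs


def encipher(text):
    """Pass each letter through the rotor, the reflector, and back through the rotor."""
    out = []
    for keystroke, ch in enumerate(text):
        shift = keystroke % N                             # rotor steps once per key
        x = ALPHABET.index(ch)
        x = (ROTOR[(x + shift) % N] - shift) % N          # forward through the rotor
        x = REFLECTOR[x]                                  # turned around in pairs
        x = (ROTOR_INV[(x + shift) % N] - shift) % N      # back through the rotor
        out.append(ALPHABET[x])
    return "".join(out)


message = "ATTACKATDAWN"
ciphertext = encipher(message)
```

Applying encipher to the ciphertext with the same starting position returns the original message, and no position of the ciphertext carries the same letter as the plaintext. The second property is exactly the kind of regularity that helps an attacker rule out candidate plaintexts.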

Security of Enigma
Due to its universal applications by the Germans during World War II, breaking the Enigma was
an extremely important success for the Allies. It allowed them to intercept all kinds of
communication, almost throughout the whole war.

The history of Enigma cryptanalysis is undoubtedly fascinating but due to many versions of the
machine and many stories describing the attempts from different perspectives, this website is
simply not large enough to accommodate the topic.
One should refer to the books, lectures and simulations that deal with the Enigma cryptanalysis
history.

A short story of breaking Enigma


The history of attacking Enigma is a history of great Allied mathematical analysis, shameful
German overconfidence, and unfortunate errors of the Enigma operators. It eventually led to
the building of the first large computer-like machines in the UK and later in the US.

The first three-rotor military Enigma machines were broken by the Polish Cipher Bureau (Biuro
Szyfrów) in 1932, long before the outbreak of World War II. From then on, Polish intelligence was
able to read the messages encoded by Enigma almost in real time. Three cryptologists had a
particularly great impact on breaking the cipher: Marian Rejewski (1905–1980), Jerzy Różycki
(1909–1942) and Henryk Zygalski (1908–1978).
Just before the outbreak of World War II, when the Germans added more rotor designs to Enigma,
thus making the decoding impossible for the Poles, Polish intelligence handed over the complete
documentation to the French and the British.

Over the next years, the burden of breaking new Enigma versions was taken on chiefly by
British intelligence. The teams located at Bletchley Park were able to successfully break many
Enigma versions. What is more important, the British commanders were able to use the received
information to their advantage. Alan Turing (1912–1954) was a remarkable British cryptologist and
mathematician of that time.
Towards the end of the war, the Americans built huge and powerful machines that were able
to break the latest and most complex versions of Naval Enigmas (fitted with four rotors).

Images:
The Enigma machine with three rotors
The Enigma plugboard (with two cables used)

Simple XOR Cipher


POLYALPHABETIC SUBSTITUTION CIPHER

Despite its simplicity and susceptibility to attacks, the simple XOR cipher was used in many
commercial applications, thanks to its speed and uncomplicated implementation.

Usage
The simple XOR cipher was quite popular in the early days of computers, in the MS-DOS and
Macintosh operating systems.

Algorithm
The simple XOR cipher is a variation of the Vigenère cipher. It differs from the original version
because it operates on bytes, which are stored in computer memory, instead of letters.
Instead of adding two alphabet letters, as in the original version of the Vigenère cipher, the XOR
algorithm combines subsequent plaintext bytes with secret key bytes using the XOR operation. After
using the last secret key byte, one should return to the first byte (as in the Vigenère encryption).
In order to decrypt ciphertext bytes, one should take the same steps as during encryption.
Subsequent ciphertext bytes should be combined with subsequent secret key bytes using the XOR
operation.

Both encryption and decryption can be presented using the following equations:
M XOR K = C
C XOR K = M
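The two equations can be illustrated with a short Python function. It is a minimal sketch with a hypothetical name, xor_cipher, that works on bytes objects and repeats the key as often as needed.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR every data byte with the next key byte, wrapping around the key."""
    if not key:
        raise ValueError("the key must not be empty")
    return bytes(d ^ k for d, k in zip(data, cycle(key)))
```

Because XOR is its own inverse, the same function is used in both directions: xor_cipher(xor_cipher(message, key), key) returns the original message.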
Security of the simple XOR cipher
The simple XOR cipher is quite easy to break. It doesn't offer better protection than other
classical polyalphabetic substitution ciphers. Using a computer, it is possible to break the cipher
in a relatively short time.
Almost always, the first step in breaking the cipher is guessing the length of the secret key. This
can be easily achieved by calculating the index of coincidence of the ciphertext.
After determining the length of the key, one should write down the same ciphertext in two lines,
one under the other. Bytes in the lower line should be offset by the secret key size with respect to
the same bytes in the upper line. Then, after XORing both texts (each pair of bytes
in the same column), one receives a sequence of bytes from which the secret key has been eliminated.

Thanks to the redundancy of information in languages stored in binary form as bytes, it
is possible to guess the original message letters based on the received bytes.
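The two steps of the attack can be sketched as follows. This is a simplified illustration, not a complete cryptanalysis tool: instead of a full index of coincidence it just counts equal bytes between the ciphertext and a shifted copy of itself, and all function names are hypothetical.

```python
from itertools import cycle


def xor_cipher(data, key):
    """Repeating-key XOR, used here only to produce a sample ciphertext."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


def coincidences(data, shift):
    """Count positions where the data equals a copy of itself shifted by `shift`."""
    return sum(a == b for a, b in zip(data, data[shift:]))


def xor_with_shift(data, shift):
    """XOR the data with a copy of itself shifted by `shift` bytes."""
    return bytes(a ^ b for a, b in zip(data, data[shift:]))


# Deliberately extreme sample: a very redundant plaintext and a 3-byte key.
plaintext = b"A" * 40
ciphertext = xor_cipher(plaintext, b"xyz")
```

For this sample, a shift of 3 (the key length) gives 37 coincidences while a shift of 1 gives none. With real text the difference is less dramatic, but shifts that are multiples of the key length still tend to stand out. Once the shift is right, xor_with_shift(ciphertext, 3) equals xor_with_shift(plaintext, 3): the key bytes cancel out and only plaintext XORed with plaintext remains.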

Implementation
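An application written in C that encrypts a given file using the simple XOR cipher:

#include <stdio.h>

/* Encrypts (or decrypts) a file with a repeating-key XOR. */
int main(int argc, char *argv[])
{
    FILE *fi, *fo;
    const char *cp;
    int c;

    /* Three arguments are required and the key must not be empty. */
    if (argc != 4 || argv[1][0] == '\0') {
        fprintf(stderr, "Usage: %s key input_file output_file\n", argv[0]);
        return 1;
    }
    cp = argv[1];

    if ((fi = fopen(argv[2], "rb")) == NULL) {
        perror(argv[2]);
        return 1;
    }
    if ((fo = fopen(argv[3], "wb")) == NULL) {
        perror(argv[3]);
        fclose(fi);
        return 1;
    }

    while ((c = getc(fi)) != EOF) {
        if (!*cp) cp = argv[1];        /* end of the key: start again */
        c ^= (unsigned char)*(cp++);   /* XOR the data byte with the key byte */
        putc(c, fo);
    }

    fclose(fo);
    fclose(fi);
    return 0;
}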
Usage:
program_name key input_file output_file

Symmetric Ciphers
Symmetric ciphers use the same cryptographic keys for both encryption of plaintext and
decryption of ciphertext. They are faster than asymmetric ciphers and allow encrypting large sets
of data. However, they require sophisticated mechanisms to securely distribute the secret keys
to both parties.
Definition: a symmetric cipher defined over (K, M, C), where:

• K - a set of all possible keys,
• M - a set of all possible messages,
• C - a set of all possible ciphertexts,

is a pair of efficient algorithms (E, D), where:

• E: K × M -> C
• D: K × C -> M

such that for every m belonging to M and every k belonging to K the following equality holds:

• D(k, E(k, m)) = m (the consistency rule)

Additionally:

• Function E is often randomized
• Function D is always deterministic

There are two kinds of symmetric ciphers: stream ciphers and block ciphers.

Stream Symmetric Ciphers


Stream ciphers are based on generating a potentially infinite cryptographic keystream of random
data. They take one keystream bit (or byte) at a time, and use it to encrypt the corresponding bit
(or byte) of input data.
Stream ciphers work on a continuous stream of plaintext data and they do not divide it into smaller
blocks.

All Stream Ciphers:


One-Time Pad (OTP)
STREAM CIPHER WITH SYMMETRIC SECRET KEY

o Key length = message length


Invented in 1917 by Gilbert Vernam, an engineer at AT&T Corporation in the USA.

Usage
It has been proven that OTP is impossible to crack if it is used correctly. It has the perfect secrecy
property and allows very fast encryption and decryption. However, the secret key must be at least
as long as the message, which makes it quite inconvenient to use when sending large amounts of
electronic information.
Algorithm
Both data encryption and decryption using OTP take place in the same way. All bytes of
the message (or of the ciphertext) are XORed with the bytes of the secret key.

The bytes are added one by one, and each addition produces one output byte:

m_i XOR k_i = c_i
c_i XOR k_i = m_i
Using the same key repeatedly
Each part of the secret key can be used only once, for encrypting exactly one part of the message
(of course, of the same length). Using the same key bytes more than once allows the attacker
to discover the two original messages combined by XOR:

M1 XOR K = C1
M2 XOR K = C2
C1 XOR C2 = M1 XOR K XOR M2 XOR K = M1 XOR M2
Having two original messages combined by XOR, the intruder can try to break the cipher by using
attacks based on language and encoding features.
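The cancellation of the key can be demonstrated directly. The sketch below uses made-up messages and a random key from Python's secrets module; the helper name xor_bytes is hypothetical.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


# Two example messages of the same length, encrypted with the SAME key.
m1 = b"ATTACK AT DAWN"
m2 = b"RETREAT AT TEN"
key = secrets.token_bytes(len(m1))

c1 = xor_bytes(m1, key)
c2 = xor_bytes(m2, key)

# The attacker sees only c1 and c2, yet their XOR no longer depends on the key.
mixed = xor_bytes(c1, c2)

# If one message is known or guessed, the other one follows immediately.
recovered_m2 = xor_bytes(mixed, m1)
```

Here mixed equals m1 XOR m2 whatever the key was, and recovered_m2 equals m2. In practice the attacker guesses likely words in one message and checks whether the corresponding fragment of the other message looks like readable text.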
Providing no integrity
It is possible to modify the ciphertext in such a way, that the receiver would not be able to detect
that. What is worse, the changes have a predictable impact on the message. If the attackers know
the structure of the message, they are able to change only the desired parts of the message.

m -> enc(m, k) -> m XOR k
(m XOR k) XOR p = m XOR k XOR p
m XOR k XOR p -> dec(m XOR k XOR p, k) -> m XOR p

where p is the modification added by the attacker.
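A small Python sketch shows how predictable such a modification is. The message format, the position of the amount and all names are assumptions made for this example; the attacker never learns the key.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


message = b"PAY 100 USD"                      # hypothetical message format
key = secrets.token_bytes(len(message))
ciphertext = xor_bytes(message, key)

# The attacker knows that byte 4 holds the digit "1" and wants it to be "9".
p = bytearray(len(message))                   # all zero bytes: no change
p[4] = ord("1") ^ ord("9")                    # difference between old and new digit
tampered = xor_bytes(ciphertext, bytes(p))

# The receiver decrypts as usual and has no way to notice the change.
decrypted = xor_bytes(tampered, key)
```

After decryption the receiver obtains PAY 900 USD. This is why OTP, like any plain stream cipher, has to be combined with a separate integrity mechanism when modifications matter.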


Secret Sharing
OTP makes it possible to share the secret key among a number of people. The encrypted text can
then be decoded only when all those parties use their parts of the key. Each person knows only
one subkey.

To encrypt a text of size n using a secret key shared by m people, it is necessary to
prepare m*n key characters. As a result, each subkey contains n characters and allows
encrypting a text up to n characters long.
For example, if a secret key is shared among three parties, all three
subkeys must be XORed with the ciphertext in order to recover the original message.

K = K1 XOR K2 XOR K3
C = M XOR K1 XOR K2 XOR K3
M = C XOR K1 XOR K2 XOR K3
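The three-party example can be sketched in Python as follows. The message is made up, the subkeys are generated with the secrets module, and the helper names are hypothetical.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def combine(blocks):
    """XOR any number of equally long byte strings together."""
    return reduce(xor_bytes, blocks)


message = b"LAUNCH CODE 1234"                 # hypothetical secret

# One random subkey per party, each as long as the message (m*n key bytes in total).
subkeys = [secrets.token_bytes(len(message)) for _ in range(3)]

ciphertext = combine([message] + subkeys)     # C = M XOR K1 XOR K2 XOR K3
recovered = combine([ciphertext] + subkeys)   # M = C XOR K1 XOR K2 XOR K3
```

Only the XOR of the ciphertext with all three subkeys gives back the message. If even one subkey is missing, the result is still masked by a random key and reveals nothing.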
Block Diagram of OTP Algorithm
Maths:
XOR
The only operation used during OTP encryption and decryption is Exclusive Or (XOR). The key
bytes are XORed with the data bytes, one after another.
Each time, all the 8 bits in the first byte are XORed with the corresponding 8 bits in the second byte.

b1   b2   b1 XOR b2
0    0    0
0    1    1
1    0    1
1    1    0

A truth table for XOR

Implementation
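OTP encryption implemented in C++:

#include <algorithm>
#include <cstddef>
#include <string>

using std::string;

// XORs the plaintext with the key, byte by byte. If the key is shorter than
// the plaintext, only the first key.size() bytes are processed.
string otp(const string & plaintext, const string & key) {
    std::size_t len = std::min(plaintext.size(), key.size());
    string ciphertext;
    ciphertext.resize(len);

    for (std::size_t i = 0; i < len; ++i) {
        ciphertext[i] = static_cast<char>(plaintext[i] ^ key[i]);
    }

    return ciphertext;
}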

RC4
STREAM CIPHER WITH SYMMETRIC SECRET KEY

o Key length: up to 2048 bits


RC4 is a symmetric stream cipher, known and praised for its speed and simplicity.

Usage
RC4 was designed by Ron Rivest of RSA Security in 1987. Its implementation was not publicly known
until September 1994, when it was anonymously posted to the Cypherpunks mailing list. RC4 is
often referred to as ARCFOUR or ARC4 to avoid problems with the trademarked name RC4.
The name officially stands for "Rivest Cipher 4", but the acronym RC is alternatively
understood to stand for "Ron's Code".

RC4 was for many years one of the most popular ciphers. It was widely used in popular protocols,
for example to protect Internet traffic - TLS (Transport Layer Security) - or to protect wireless
networks - WEP (Wired Equivalent Privacy).

Algorithm
RC4 is a symmetric stream cipher. It operates by creating long keystream sequences and combining
them with data bytes.

RC4 encrypts data by XORing it, byte by byte, with keystream bytes. The
whole RC4 algorithm is based on creating keystream bytes. The keystream is derived from a
one-dimensional table called the T table.

Creating the Table


The T table is 256 bytes long and is created based on the secret key. It is created as the first step
of both encryption and decryption. The following operations must be performed in order to create
the table:
1. Every cell in the table is filled with a number equal to its position. The positions of the
table are numbered from 0 to 255.
2. A new temporary helper variable is created and set to 0.
3. For each element in the array, the two following operations are performed (note that the
values are from 0 to 255):
   1. The value of the temporary variable is updated: the number in the array at the current
   position and the key byte at the current position (taken modulo the key length) are added
   to it, and the result is reduced modulo 256.
   2. The number in the array at the current position is swapped with the number in
   the array at the position determined by the temporary variable.

Encryption and Decryption


During encryption and decryption the keystream bytes are generated continuously. They are
XORed with the message bytes. The keystream bytes are produced based on the T table. The following
steps are performed:
1. Two helper variables p1 and p2 are created and set to 0.
2. The variable p1 is increased by 1 and the result is reduced modulo 256.
3. The variable p2 is increased by the value in the array T at the position determined by
the variable p1 (T[p1]). Then, the result is reduced modulo 256.
4. The value in the array at the position p1 is swapped with the value in the array at
the position p2.
5. The value in the array at the position p1 is added to the value in the array at
the position p2. Then, the result is reduced modulo 256 and assigned to a new
helper variable p3.
6. The value in the array at the position p3 is a new keystream byte.
7. If more keystream bytes are needed, all the steps from step 2 onwards should be
repeated.

Speed of RC4
The RC4 algorithm is designed especially for software implementations because it only
manipulates single bytes. Unlike many other stream ciphers, it doesn't use linear feedback shift
registers (LFSRs), which can be implemented efficiently in hardware but are not as fast in software.

Security of RC4
The cipher was created quite a long time ago and has some weaknesses which have been
addressed in modern stream ciphers. It is possible to find keystream byte values that are slightly
more likely to occur than others. In fact, over the last 20 years, several such biased bytes
have been found, and attacks based on this weakness have been published.

Probably the most important weakness of the RC4 cipher is its weak key schedule. Because
of that issue, it is possible to obtain some information about the secret key based on the first bytes
of the keystream. It is recommended to simply discard a number of initial keystream bytes. This
improvement is known as RC4-dropN, where N is usually a multiple of 256.

RC4 does not take a separate nonce alongside the key for every encryption. Therefore, the
cryptosystem must ensure that keystreams are unique and specify how to combine
the nonce with the original secret key. The best idea would be to hash the nonce and the key
together to generate the base for creating the RC4 keystream. Unfortunately, many applications
simply concatenate the key and the nonce, which makes them vulnerable to so-called related-key attacks.
This weakness of RC4 was used in the Fluhrer, Mantin and Shamir (FMS) attack against WEP,
published in 2001.
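
The RC4-dropN idea can be sketched as follows. This is only an illustration, not a reference implementation: the function name rc4_drop_keystream and its parameters are assumptions made for this example. It builds the T table, then generates keystream bytes and throws away the first `drop` of them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: returns `count` RC4 keystream bytes produced after
// the first `drop` keystream bytes have been discarded (RC4-dropN with N = drop).
std::vector<std::uint8_t> rc4_drop_keystream(const std::vector<std::uint8_t>& key,
                                             std::size_t drop, std::size_t count) {
    assert(!key.empty());
    // Table initialisation, as in the key schedule of plain RC4.
    std::uint8_t T[256];
    for (std::size_t i = 0; i < 256; ++i) T[i] = static_cast<std::uint8_t>(i);
    std::uint8_t x = 0;
    for (std::size_t i = 0; i < 256; ++i) {
        // uint8_t arithmetic wraps at 256, which gives the "mod 256".
        x = static_cast<std::uint8_t>(x + T[i] + key[i % key.size()]);
        std::swap(T[i], T[x]);
    }
    // Keystream generation; bytes with index below `drop` are not returned.
    std::uint8_t p1 = 0, p2 = 0;
    std::vector<std::uint8_t> out;
    out.reserve(count);
    for (std::size_t n = 0; n < drop + count; ++n) {
        p1 = static_cast<std::uint8_t>(p1 + 1);
        p2 = static_cast<std::uint8_t>(p2 + T[p1]);
        std::swap(T[p1], T[p2]);
        const std::uint8_t k = T[static_cast<std::uint8_t>(T[p1] + T[p2])];
        if (n >= drop) out.push_back(k);
    }
    return out;
}
```

Calling rc4_drop_keystream(key, 768, n) yields the same bytes as plain RC4 from keystream position 768 onwards; the discarded prefix is simply never XORed with any data.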

Block Diagram of RC4


Maths:
Modification of the temporary variable during creation of the keystream table
During initialisation of the 256-byte T table used for generating the keystream, the value of
the temporary variable is updated for every element in the table. The updated temporary variable is
then used for modifying other numbers in the table.
$x_{help} = (x_{help} + w_{current} + k_{current \bmod len(K)}) \bmod 256$,
where:

o $x_{help}$ - the temporary helper variable,
o $w_{current}$ - the value in the table T at the current position,
o $k_{current \bmod len(K)}$ - the value in the key array at the current
position taken modulo the length of the key (because the key may be
shorter than 256 bytes),
o $\bmod 256$ - modulo division by 256 (that is, the remainder of division).
After the operations above, the current value in the T table is swapped with the value at
the position determined by the temporary variable. All positions in the table are numbered from 0.

Implementation:
Keystream Initialisation
Initialisation of the T table, used for generation of keystream bytes. K is the secret key, that is, an array
of length k_len.
for i from 0 to 255
    T[i] := i
endfor
x_temp := 0
for i from 0 to 255
    x_temp := (x_temp + T[i] + K[i mod k_len]) mod 256
    swap(T[i], T[x_temp])
endfor

Keystream Generation
For keystream bytes generation, the loop below is executed as long as new bytes are needed.

p1 := 0
p2 := 0
while GeneratingOutput
    p1 := (p1 + 1) mod 256
    p2 := (p2 + T[p1]) mod 256
    swap(T[p1], T[p2])
    send(T[(T[p1] + T[p2]) mod 256])
endwhile
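
The two pseudocode fragments above can be combined into a small C++ sketch. The names Rc4, next and rc4_apply are chosen for this example only and do not come from any particular library. Since RC4 only XORs data with the keystream, the same function encrypts and decrypts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Illustrative RC4 state: the T table and the two positions p1 and p2.
struct Rc4 {
    std::uint8_t T[256];
    std::uint8_t p1 = 0, p2 = 0;

    // Keystream initialisation: fill T with 0..255, then shuffle it using the key.
    explicit Rc4(const std::string& key) {
        assert(!key.empty() && key.size() <= 256);
        for (std::size_t i = 0; i < 256; ++i) T[i] = static_cast<std::uint8_t>(i);
        std::uint8_t x_temp = 0;
        for (std::size_t i = 0; i < 256; ++i) {
            // uint8_t arithmetic wraps at 256, which gives the "mod 256".
            x_temp = static_cast<std::uint8_t>(
                x_temp + T[i] + static_cast<std::uint8_t>(key[i % key.size()]));
            std::swap(T[i], T[x_temp]);
        }
    }

    // Keystream generation: each call produces one keystream byte.
    std::uint8_t next() {
        p1 = static_cast<std::uint8_t>(p1 + 1);
        p2 = static_cast<std::uint8_t>(p2 + T[p1]);
        std::swap(T[p1], T[p2]);
        return T[static_cast<std::uint8_t>(T[p1] + T[p2])];
    }
};

// XOR the data with a fresh keystream derived from the key.
std::vector<std::uint8_t> rc4_apply(const std::string& key,
                                    const std::vector<std::uint8_t>& data) {
    Rc4 rc4(key);
    std::vector<std::uint8_t> out(data.size());
    for (std::size_t i = 0; i < data.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(data[i] ^ rc4.next());
    }
    return out;
}
```

With the key "Key" and the ASCII plaintext "Plaintext", this sketch gives the ciphertext BB F3 16 E8 D9 40 AF 0A D3; applying rc4_apply again with the same key restores the plaintext.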

Salsa20
STREAM CIPHER WITH SYMMETRIC SECRET KEY

o Key length = 32 bytes


Salsa20 is a modern and efficient stream symmetric cipher. It was designed in 2005 by Daniel
Bernstein, research professor of Computer Science at the University of Illinois at Chicago.

Usage
Salsa20 was submitted to the eSTREAM project, which ran from 2004 to 2008 and
was intended to promote the development of stream ciphers. It is considered to be a well-designed
and efficient algorithm. No effective attacks on the full-round ciphers of the Salsa20 family
are publicly known.

Algorithm
Salsa20 is a stream cipher that works on data blocks of size of 64 bytes.
Encryption
For each 64-byte data block, the algorithm uses the Salsa20 expansion function. The input to
the function is the secret key (which can have either 32 or 16 bytes) and an 8-byte
nonce concatenated with an additional block number, whose value ranges from 0 to $2^{64}-1$
(it is also stored in 8 bytes). Every call to the expansion function increases the block number
by one.
The core of Salsa20 encryption algorithm is a hash function which receives the 64-byte long input
data from the Salsa20 expansion function, mixes it, and eventually returns the 64-byte long
output. The Salsa20 hash function works on the received sequence of bytes, which consists of:
o the secret key.
o the nonce with the block number.
o four constant vectors received from the expansion function, whose values
depend on the size of the secret key.
The hash function operates on data divided into words. Every word contains 4 bytes and can
have values from 0 to $2^{32}-1$. Therefore, the input data is 16 words long: the key contributes
8 words (a 4-word key is used twice), the nonce has 2 words, the block number has 2 words,
and the constant vectors take the remaining 4 words.
The output from the Salsa20 expansion function is added XOR to the 64-byte block of data.
The result is a 64-byte block of ciphertext.

Decryption
The same algorithm should be used during decryption. The data should be divided into parts of
the same size.

The output from the Salsa20 expansion function should be added XOR to the 64-byte block of
ciphertext. The result is a 64-byte block of plaintext.

Other Salsa20 ciphers


There are also some other ciphers, which are based on the Salsa20 algorithm but differ in details.

o Salsa20/8 and Salsa20/12 - they work exactly like the original Salsa20
algorithm, but instead of 10 doublerounds inside the hash function they
perform 4 or 6 doublerounds respectively.
o ChaCha family - published by Bernstein in 2008. These ciphers are designed to provide
better security than the original Salsa20 cipher by using a slightly modified hash function. The
hash function input data was also rearranged to allow a more efficient
implementation.

Block Diagram of Salsa20 Algorithm


Scheme of Salsa20 algorithm

Maths:
The operations for Salsa20 algorithm are presented in the order from the low-level functions, to
the more complex functions, which use the functions described above them.

Sum of two words


The addition of two 4-byte words is denoted in the algorithm description as a + b. The result is
reduced modulo $2^{32}$ (that is, one more than the maximum value that can be stored in one word).
The sum of two words a and b is equal to $(a + b) \bmod 2^{32}$. The result is a valid 4-byte word.
Exclusive-Or of two words
The Exclusive-Or of two 4-byte words is denoted in the algorithm description as a XOR b.
To perform the XOR, two words are compared bit by bit, and the XOR operation is performed for
each pair of bits. The result is also a valid 4-byte word.

Binary Left Rotation


The a-bit left rotation of a 4-byte word w is denoted in the algorithm as w <<< a.
The leftmost a bits move to the rightmost positions.
A rotation of bits to the left w <<< a of a 4-byte word w may be represented as the multiplication:
$2^{a} \cdot w \bmod (2^{32}-1)$
The result is a valid 4-byte long word.
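
The three word operations above map directly onto unsigned 32-bit arithmetic. The sketch below is a minimal illustration; the function names add32, xor32 and rotl32 are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Sum of two words: unsigned 32-bit addition wraps around modulo 2^32.
std::uint32_t add32(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(a + b);
}

// Exclusive-or of two words, computed bit by bit.
std::uint32_t xor32(std::uint32_t a, std::uint32_t b) {
    return a ^ b;
}

// Left rotation of w by a bits (0 < a < 32): the leftmost a bits
// move to the rightmost positions.
std::uint32_t rotl32(std::uint32_t w, unsigned a) {
    assert(a > 0 && a < 32);
    return (w << a) | (w >> (32 - a));
}
```

For example, rotl32(0xc0a8787e, 5) gives 0x150f0fd8, which is also the remainder of $2^{5} \cdot \mathrm{0xc0a8787e}$ divided by $2^{32}-1$.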
Quarterround Function
The Quarterround Function takes 4 words as input and returns another 4-word sequence.

If x is a 4-word input:
x= (x0, x1, x2, x3)
then the function can be defined as follows:
quarterround(x) = (y0, y1, y2, y3)
where:
y1 = x1 XOR ((x0 + x3) <<< 7)
y2 = x2 XOR ((y1 + x0) <<< 9)
y3 = x3 XOR ((y2 + y1) <<< 13)
y0 = x0 XOR ((y3 + y2) <<< 18)
The Quarterround Function can be performed in place, without the need of allocating any
additional memory. First, x1 changes to y1, then x2 changes to y2, next x3 changes to y3, then
x0 changes to y0. The Quarterround Function is invertible because all the modifications above are
invertible.
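
A minimal in-place sketch of the Quarterround Function and its inverse is shown below. The names Word, rotl32, quarterround and quarterround_inverse are chosen for this example. The inverse simply undoes the four XOR steps in the opposite order.

```cpp
#include <cassert>
#include <cstdint>

using Word = std::uint32_t;

// Left rotation of a 32-bit word by a bits (0 < a < 32).
static Word rotl32(Word w, unsigned a) { return (w << a) | (w >> (32 - a)); }

// Quarterround performed in place: x1 changes first, then x2, x3 and x0.
void quarterround(Word& x0, Word& x1, Word& x2, Word& x3) {
    x1 ^= rotl32(x0 + x3, 7);
    x2 ^= rotl32(x1 + x0, 9);
    x3 ^= rotl32(x2 + x1, 13);
    x0 ^= rotl32(x3 + x2, 18);
}

// Inverse: each step is an XOR with a value that can be recomputed,
// so the steps are undone in reverse order.
void quarterround_inverse(Word& y0, Word& y1, Word& y2, Word& y3) {
    y0 ^= rotl32(y3 + y2, 18);
    y3 ^= rotl32(y2 + y1, 13);
    y2 ^= rotl32(y1 + y0, 9);
    y1 ^= rotl32(y0 + y3, 7);
}
```

For the input (1, 0, 0, 0) the function returns (0x08008145, 0x00000080, 0x00010200, 0x20500000), and quarterround_inverse maps this result back to (1, 0, 0, 0).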
Rowround Function
The Rowround Function takes 16 words as input, transforms them, and returns 16-word
sequence.

This function is very similar to the Columnround Function but it operates on the words in
a different order.

If x is a 16-word input:
x= (x0, x1, x2, ..., x15)
then the function can be defined as follows:
rowround(x) = (y0, y1, y2, ..., y15)
where:
(y0, y1, y2, y3) = quarterround(x0, x1, x2, x3)
(y5, y6, y7, y4) = quarterround(x5, x6, x7, x4)
(y10, y11, y8, y9) = quarterround(x10, x11, x8, x9)
(y15, y12, y13, y14) = quarterround(x15, x12, x13, x14)
The 16-word input can be presented as a square matrix:

x0   x1   x2   x3
x4   x5   x6   x7
x8   x9   x10  x11
x12  x13  x14  x15

The rows in the matrix can be changed in parallel. Each of them is modified by the Quarterround
Function.

In the first row, the words are changed in the following order:

1. x1
2. x2
3. x3
4. x0
In the second row, the words are changed in the following order:

1. x6
2. x7
3. x4
4. x5
In the third row, the words are modified in the order:

1. x11
2. x8
3. x9
4. x10
Finally, in the last, fourth row, the words are changed in the order:

1. x12
2. x13
3. x14
4. x15
Columnround Function
The Columnround Function takes 16 words as input and returns 16-word sequence.

This function is very similar to the Rowround Function but operates on the words in a different order.

If x is a 16-word input:
x= (x0, x1, x2, ..., x15)
then the function can be defined as follows:
columnround(x) = (y0, y1, y2, ..., y15)
where:
(y0, y4, y8, y12) = quarterround(x0, x4, x8, x12)
(y5, y9, y13, y1) = quarterround(x5, x9, x13, x1)
(y10, y14, y2, y6) = quarterround(x10, x14, x2, x6)
(y15, y3, y7, y11) = quarterround(x15, x3, x7, x11)

The 16-word input can be presented as a square matrix:

x0   x1   x2   x3
x4   x5   x6   x7
x8   x9   x10  x11
x12  x13  x14  x15

The columns in the matrix can be changed in parallel. Each of them is modified by
the Quarterround Function.

In the first column, the words are changed in the following order:

1. x4
2. x8
3. x12
4. x0
In the second column, the words are modified in the following order:

1. x9
2. x13
3. x1
4. x5
In the third column, the words are changed in the order:

1. x14
2. x2
3. x6
4. x10
In the last fourth column, the words are modified in the order:

1. x3
2. x7
3. x11
4. x15
Doubleround Function
The Doubleround Function takes 16 words as input and returns 16-word sequence.
If x is a 16-word input, then the Doubleround Function can be defined as follows:
doubleround(x) = rowround(columnround(x))
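
The Rowround, Columnround and Doubleround Functions can be sketched on top of the Quarterround Function as below. The type name State and the helper names are assumptions made for this example; the index patterns are copied from the definitions above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Word = std::uint32_t;
using State = std::array<Word, 16>;  // the 4x4 matrix stored row by row

static Word rotl32(Word w, unsigned a) { return (w << a) | (w >> (32 - a)); }

static void quarterround(Word& x0, Word& x1, Word& x2, Word& x3) {
    x1 ^= rotl32(x0 + x3, 7);
    x2 ^= rotl32(x1 + x0, 9);
    x3 ^= rotl32(x2 + x1, 13);
    x0 ^= rotl32(x3 + x2, 18);
}

// Each row is processed by quarterround, starting from its diagonal element.
State rowround(State x) {
    quarterround(x[0], x[1], x[2], x[3]);
    quarterround(x[5], x[6], x[7], x[4]);
    quarterround(x[10], x[11], x[8], x[9]);
    quarterround(x[15], x[12], x[13], x[14]);
    return x;
}

// Each column is processed by quarterround, starting from its diagonal element.
State columnround(State x) {
    quarterround(x[0], x[4], x[8], x[12]);
    quarterround(x[5], x[9], x[13], x[1]);
    quarterround(x[10], x[14], x[2], x[6]);
    quarterround(x[15], x[3], x[7], x[11]);
    return x;
}

// A doubleround is a columnround followed by a rowround.
State doubleround(const State& x) { return rowround(columnround(x)); }
```

Because the four quarterround calls inside each function touch disjoint words, they could also be executed in parallel, as noted above.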
Littleendian Function
The Littleendian Function converts a 4-byte sequence into a word.

If b is a sequence of four bytes:
b = (b0, b1, b2, b3)
then the function is defined as:
$\mathrm{littleendian}(b) = b_0 + 2^{8} b_1 + 2^{16} b_2 + 2^{24} b_3$

The Littleendian Function is invertible. It interprets the four bytes as one word, with the least significant byte first.
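
A minimal sketch of the Littleendian Function and its inverse follows; the function names are chosen for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Interpret four bytes (b0 first) as one 32-bit word, least significant byte first.
std::uint32_t littleendian(const std::array<std::uint8_t, 4>& b) {
    return static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}

// Inverse: split a word back into its four bytes, least significant byte first.
std::array<std::uint8_t, 4> littleendian_inverse(std::uint32_t w) {
    return {{static_cast<std::uint8_t>(w),
             static_cast<std::uint8_t>(w >> 8),
             static_cast<std::uint8_t>(w >> 16),
             static_cast<std::uint8_t>(w >> 24)}};
}
```

For example, the bytes (86, 75, 30, 9) give the word 0x091e4b56.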

Salsa20 Hash Function


The Salsa20 Hash Function takes 64 bytes as input and returns a 64-byte sequence.

First, the Hash Function creates 16 words from the received 64-byte input. If input is a sequence
of 64 bytes:
input =(b0, b1, b2, ..., b63)
then 16 words are created as below:
w0 = littleendian(b0, b1, b2, b3)
w1 = littleendian(b4, b5, b6, b7)
[...]
w15 = littleendian(b60, b61, b62, b63)
Then, all 16 words are modified by 10 iterations of the Doubleround Function:
$(x_0, x_1, \ldots, x_{15}) = \mathrm{doubleround}^{10}(w_0, w_1, \ldots, w_{15})$
Finally, the 16 words received as input are added (as described above) to the modified 16 words
and changed to 64 new bytes using the inverse of the Littleendian Function. These bytes,
concatenated in order, are the output of the Salsa20 Hash Function:
$\mathrm{output} = \mathrm{littleendian}^{-1}(x_0 + w_0) \,\|\, \mathrm{littleendian}^{-1}(x_1 + w_1) \,\|\, \ldots \,\|\, \mathrm{littleendian}^{-1}(x_{15} + w_{15})$
where $\|$ denotes concatenation of byte sequences.
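
Putting the pieces together, the Salsa20 Hash Function can be sketched as below. This is an illustration under the definitions given in this section, not a vetted implementation; the names Block and salsa20_hash are assumptions made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Word = std::uint32_t;
using Block = std::array<std::uint8_t, 64>;

static Word rotl32(Word w, unsigned a) { return (w << a) | (w >> (32 - a)); }

static void quarterround(Word& a, Word& b, Word& c, Word& d) {
    b ^= rotl32(a + d, 7);
    c ^= rotl32(b + a, 9);
    d ^= rotl32(c + b, 13);
    a ^= rotl32(d + c, 18);
}

// One doubleround: a columnround followed by a rowround.
static void doubleround(std::array<Word, 16>& x) {
    quarterround(x[0], x[4], x[8], x[12]);
    quarterround(x[5], x[9], x[13], x[1]);
    quarterround(x[10], x[14], x[2], x[6]);
    quarterround(x[15], x[3], x[7], x[11]);
    quarterround(x[0], x[1], x[2], x[3]);
    quarterround(x[5], x[6], x[7], x[4]);
    quarterround(x[10], x[11], x[8], x[9]);
    quarterround(x[15], x[12], x[13], x[14]);
}

Block salsa20_hash(const Block& input) {
    // Create 16 words from the 64 input bytes with the Littleendian Function.
    std::array<Word, 16> w{};
    for (std::size_t i = 0; i < 16; ++i) {
        w[i] = Word(input[4 * i]) | Word(input[4 * i + 1]) << 8 |
               Word(input[4 * i + 2]) << 16 | Word(input[4 * i + 3]) << 24;
    }
    // Apply 10 doublerounds to a copy of the words.
    std::array<Word, 16> x = w;
    for (int round = 0; round < 10; ++round) doubleround(x);
    // Add the original words and convert each sum back to four bytes.
    Block out{};
    for (std::size_t i = 0; i < 16; ++i) {
        const Word z = x[i] + w[i];
        out[4 * i]     = static_cast<std::uint8_t>(z);
        out[4 * i + 1] = static_cast<std::uint8_t>(z >> 8);
        out[4 * i + 2] = static_cast<std::uint8_t>(z >> 16);
        out[4 * i + 3] = static_cast<std::uint8_t>(z >> 24);
    }
    return out;
}
```

A simple sanity check is that a block of 64 zero bytes is mapped to 64 zero bytes, because every quarterround of zeros gives zeros. The final addition of the input words is what prevents the hash from being inverted by running the doublerounds backwards.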
Salsa20 Expansion Function
The Salsa20 Expansion Function takes two sequences of bytes. The first sequence can have
either 16 or 32 bytes and the second sequence (n) is always 16 bytes long. The function returns
another sequence of 64 bytes.
If the first sequence is 32 bytes long, then it is divided into two shorter sequences of 16 bytes
(k0 and k1). The Salsa20 Expansion Function is defined by using the Salsa20 Hash Function,
as shown below:
$\mathrm{Salsa20Expansion}_{k_0, k_1}(n) = \mathrm{Salsa20Hash}(a_0, k_0, a_1, n, a_2, k_1, a_3)$
where:
a0 = (101, 120, 112, 97)
a1 = (110, 100, 32, 51)
a2 = (50, 45, 98, 121)
a3 = (116, 101, 32, 107)
If the first sequence is 16 bytes long (k), then the Salsa20 Expansion Function is defined by using
the Salsa20 Hash Function as below:
$\mathrm{Salsa20Expansion}_{k}(n) = \mathrm{Salsa20Hash}(b_0, k, b_1, n, b_2, k, b_3)$
where:
b0 = (101, 120, 112, 97)
b1 = (110, 100, 32, 49)
b2 = (54, 45, 98, 121)
b3 = (116, 101, 32, 107)
The constant values of the vector (a0, a1, a2, a3) spell 'expand 32-byte k' in ASCII code. Similarly,
the constant values of the second vector (b0, b1, b2, b3) spell 'expand 16-byte k' in ASCII.
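
The layout of the 64-byte input for a 32-byte key can be sketched as follows. The function name salsa20_expansion_input is hypothetical, and the code assumes that the 16-byte sequence n is the 8-byte nonce followed by the block number stored least significant byte first, consistent with the Littleendian Function. The result is the block that would be passed to the Salsa20 Hash Function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

using Block = std::array<std::uint8_t, 64>;

// Build (a0, k0, a1, n, a2, k1, a3) for a 32-byte key.
// Assumption of this sketch: n = nonce (8 bytes) followed by the block
// number (8 bytes, least significant byte first).
Block salsa20_expansion_input(const std::array<std::uint8_t, 32>& key,
                              const std::array<std::uint8_t, 8>& nonce,
                              std::uint64_t block_number) {
    static const std::uint8_t a[4][4] = {{101, 120, 112, 97},
                                         {110, 100, 32, 51},
                                         {50, 45, 98, 121},
                                         {116, 101, 32, 107}};
    Block in{};
    for (std::size_t i = 0; i < 4; ++i) {
        in[i]      = a[0][i];   // a0 at bytes 0-3
        in[20 + i] = a[1][i];   // a1 at bytes 20-23
        in[40 + i] = a[2][i];   // a2 at bytes 40-43
        in[60 + i] = a[3][i];   // a3 at bytes 60-63
    }
    for (std::size_t i = 0; i < 16; ++i) {
        in[4 + i]  = key[i];        // k0 at bytes 4-19
        in[44 + i] = key[16 + i];   // k1 at bytes 44-59
    }
    for (std::size_t i = 0; i < 8; ++i) {
        in[24 + i] = nonce[i];      // nonce at bytes 24-31
        in[32 + i] = static_cast<std::uint8_t>(block_number >> (8 * i));  // bytes 32-39
    }
    return in;
}
```

Reading the four constant vectors back out of the block in order gives the ASCII text 'expand 32-byte k'.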

CSS (Content Scramble System)


STREAM CIPHER WITH SYMMETRIC SECRET KEY

o Key length = 40 bits


CSS was introduced by the DVD Forum in 1996 as a means of encrypting DVD
content.

Usage
Because of its poor design, the effective key size is only about 16 bits.

It was compromised in 1999 by a brute force attack. In that year the DeCSS application was
published, which was able to break the CSS protection quite quickly.

Algorithm
CSS is a stream cipher. The internal state machine is initialized using a 5-byte secret key.
The state machine has 42 bits and contains two linear feedback shift registers (LFSRs). The stream
of bytes generated by the registers is XORed with the stream of input data. Before the XOR, data
bytes are substituted using a lookup table and keystream bytes pass through optional inverters. There
are several lookup tables defined, and they contain different coefficients.
The CSS cipher was created to protect audiovisual data on DVDs. There are a few different keys
in the whole CSS system. They are used for mutual authentication and for encryption of sectors
and whole files. Some keys are stored encrypted and must be decrypted before use (using
the CSS cipher, where the encrypted data are the bytes of the wanted secret key).

Because CSS performs many different tasks - working with audiovisual data and with different
kinds of keys - a few modes of the CSS algorithm exist. All of them are generally similar, but there are
some differences in detail (for example, the coefficients in the tables).

When the CSS algorithm decrypts one of the secret keys, the 5 encoded bytes of this key are mixed with
the bytes received from the registers in a more complicated way. Instead of simple lookup tables and
an XOR with the keystream, there is a key mangling operation. Each byte goes through two
lookup tables and is XORed twice with one byte from the keystream.

CSS Modes
o authentication and establishing connection - the host establishes
communication with DVD and both sides create a bus key,
o disc key decryption - the host receives and decrypts a disc key using one of its
player keys,
o title key decryption - the host receives and decrypts a title key using
the obtained disc key,
o audiovisual data decryption - using the obtained title key and a sector key read
from a DVD, the host decrypts audiovisual data stored in one sector of
the DVD.

CSS Keys
o player keys - stored in a DVD drive; they are used for decryption of a disc key
(which is stored on the DVD),
o disc key - encrypted on the DVD and decrypted by the DVD drive using its player
keys; it is used for decryption of a title key,
o sector key - stored unencrypted in each sector of the DVD and read by the DVD
drive; it is used together with a title key for decryption of audiovisual data
in one DVD sector,
o title keys - encrypted on the DVD and decrypted by the DVD drive using the disc
key obtained earlier; they are used for decryption of audiovisual data,
o session key or bus key - a random key created during authentication between
the host and the DVD drive; it is used for encryption of future communication
between them.

CSS System
The whole CSS system consists of three elements: a DVD, a DVD drive and a host (a computer,
an application for playing DVDs).

Every DVD contains an encoded unique disc key. Similarly, each DVD drive has a few player
keys. Each DVD has a hidden sector, which contains the disc key encrypted in many copies, one for
each of the 409 existing player keys. On writeable DVDs, the hidden sector is cleared and can't
be changed. A DVD drive tries to read a DVD and uses its player keys to decrypt one of the copies
of the encrypted disc key on the DVD.

After each attempt, the DVD drive obtains 5 bytes which may be the correct disc key, and it performs
the following test: using these 5 bytes as a key, it tries to decrypt a test
sequence (stored on the DVD), which is the real disc key encrypted using the real disc key. If
the DVD drive receives the same 5 bytes as the 5 bytes it used as a key, then it is certain that
those 5 bytes are the real and correct disc key.

A DVD usually contains a few encoded title keys. Each of them protects one part of the movie,
called a VTS (Video Title Set). Each VTS contains a set of files named VTS_AA_B.CCC, where
every A and B stands for one digit. CCC may be one of three possible file extensions
(.VOB, .BUP or .IFO). All the files which have the same number AA belong to the same VTS.
Title keys are decrypted using the disc key.
Each data sector on a DVD is 2048 bytes long (so it has the same size as sectors on CD-ROMs).
A sector starts with an MPEG-2 PACK header, 128 bytes long. After the header there are either
audiovisual data (called stream data; for example MPEG-2 data or AC-3 data) or other information
(PCI or DSI). A sector key is stored unencrypted in bytes 80-84 of the header.

If a sector contains audiovisual data, then the MPEG-2 PACK header is followed by a header
of the audiovisual data (stream header), which contains 2 bits determining the encryption type.

o 00 - no encryption
o 01 - CSS encryption
o 10 - reserved/not used
o 11 - CPRM encryption
If a sector doesn't contain audiovisual data, those bits are not stored in this sector (because
non-stream data are not encrypted).

Two keys are used for decryption of data stored on a DVD - a sector key (a different value for
every sector) and a title key (each DVD usually contains a few title keys, one for each VTS). The first
two bytes of the title key are XORed with the first two bytes of the sector key (bytes 80-81 of
the sector header) and then passed into the LFSR-17 register. The last three bytes of the title key are
XORed with the last three bytes of the sector key (bytes 82-84 of the sector header) and passed
into the LFSR-25 register. The host obtains the title key using the disc key decrypted earlier.
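
The mixing of the title key with the sector key can be sketched as below. The names LfsrSeeds, read_sector_key and mix_title_and_sector_key are hypothetical and exist only for this example; the sketch covers only the byte-wise XOR and the byte offsets given above, not the loading of the bits into the registers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Key5 = std::array<std::uint8_t, 5>;

// Hypothetical container for the bytes passed to the two registers.
struct LfsrSeeds {
    std::array<std::uint8_t, 2> lfsr17;  // from key bytes 0-1
    std::array<std::uint8_t, 3> lfsr25;  // from key bytes 2-4
};

// Read the unencrypted sector key from bytes 80-84 of a 2048-byte sector.
Key5 read_sector_key(const std::array<std::uint8_t, 2048>& sector) {
    Key5 k{};
    for (std::size_t i = 0; i < 5; ++i) k[i] = sector[80 + i];
    return k;
}

// XOR the title key with the sector key and split the result between the registers.
LfsrSeeds mix_title_and_sector_key(const Key5& title_key, const Key5& sector_key) {
    LfsrSeeds s{};
    for (std::size_t i = 0; i < 2; ++i) {
        s.lfsr17[i] = static_cast<std::uint8_t>(title_key[i] ^ sector_key[i]);
    }
    for (std::size_t i = 0; i < 3; ++i) {
        s.lfsr25[i] = static_cast<std::uint8_t>(title_key[2 + i] ^ sector_key[2 + i]);
    }
    return s;
}
```

Because the sector key differs from sector to sector, every sector is decrypted with a different initial register state even though the title key stays the same.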
Each DVD also stores a region code, which determines the part of the world where the DVD can
be played.

CSS Protocol
1. Mutual authentication
To decrypt a DVD in a DVD drive, a host must authenticate itself to the DVD using
a challenge-response protocol and CSS encryption. Theoretically, the DVD must also
authenticate itself to the host; however, a host application usually skips this check.
During the authentication both sides use a predefined authentication key - F4 10 45 A3
E2.
The authentication requires the following steps:

1. The host receives an AGID (Authentication Grant ID) number from the DVD
drive. The AGID is used as a session ID for the current communication.
2. The host generates 10 random bytes and sends them to the DVD drive. The
drive encrypts them and sends a 5-byte sequence back to the host.
3. The host decrypts the drive's answer and checks whether the result is the same as
the challenge previously sent to the drive. The DVD drive can answer using one of
32 variants, so the host must make 32 tests to check which of them has been
chosen by the drive.
4. The DVD drive generates a random 10-byte sequence and sends it to the
host. The host encrypts it.
5. The host sends the encrypted sequence, the new key KEY2, back to the drive.
At this point, both the host and the drive know all 10 bytes - the two keys created by
the host and the drive. The bus key is created by encrypting the 10-byte
sequence using CSS in mode 3. It is 5 bytes long and prevents eavesdropping on
future communication (particularly the sending of title and disc keys).
6. The host reads a disc key from a hidden sector on the DVD.
If the last step succeeds, the host is able to read the DVD using ordinary
SCSI read commands. Otherwise, the authentication fails.
2. Decoding a disc key
A DVD drive decrypts a disc key from a DVD using all its player keys. Each manufacturer
of DVD drives usually possesses one or a few player keys and uses them in its products.
3. Reading and decoding title keys
Encrypted title keys are sent from a DVD to a host. The transmission of title keys, as well as
the entire transmission in general, is encrypted using a bus key.
4. Sending encrypted data
A host reads a whole sector of encrypted data from a DVD.
5. Decoding encrypted data
A host decrypts the whole sector using an obtained title key and the sector key stored in
the header of each DVD sector.

Block Diagram of CSS Algorithm for Audiovisual Data

Block Diagram of CSS Algorithm for Key Bytes


Block Diagram of CSS Additional Encryption of Keys

Block Diagram of CSS LFSR Registers


Maths:
Initialisation of Registers
Based on the secret 40-bit key, two linear feedback shift registers (LFSRs), LFSR-17 (17 bits
long) and LFSR-25 (25 bits long), and one additional bit CC are created.
All 40 key bits are divided between the two LFSR registers. By assigning one letter to each bit of the key,
the initial order of bits may be presented as below:

KEY = jklmnopq abcdefgh QRSTUWXY IJKLMNOP ABCDEFGH


LFSR-17 = q ponmlkji hgfedcba
LFSR-25 = Y XWVUTSRQ PONMLKJI HGFEDCBA
Bits in the LFSR registers are filled in the reverse order compared to the key.

Bits V and i, which do not come from the key, are set to 1 (to prevent the LFSR registers from being
initialised with all zeros), and bit CC is set to 0.
LFSR-17 Operations
Bits are shifted right by one position. The new bit, which appears in the leftmost position of
the register and at the output, is the sum modulo 2 (XOR) of the first and fifteenth bits.
LFSR-17 register operations
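
A single LFSR-17 step can be sketched as below. The bit numbering is an assumption of this example, since the original diagram is not reproduced here: bits are counted from the right, so the "first" bit is bit 0 and the "fifteenth" bit is bit 14 of the state. The function name lfsr17_step is also chosen for this example.

```cpp
#include <cassert>
#include <cstdint>

// One step of a 17-bit LFSR held in the low 17 bits of `state`.
// Assumption: bits are numbered 1..17 from the right, so the taps are
// bit 0 ("first") and bit 14 ("fifteenth").
// Returns the new bit, which is also the output bit of the step.
unsigned lfsr17_step(std::uint32_t& state) {
    assert(state != 0 && state < (1u << 17));
    const unsigned new_bit = (state ^ (state >> 14)) & 1u;  // sum modulo 2 of the taps
    // Shift right by one and place the new bit in the leftmost position (bit 16).
    state = (state >> 1) | (static_cast<std::uint32_t>(new_bit) << 16);
    return new_bit;
}
```

The assertion on a nonzero state reflects the reason one register bit is forced to 1 during initialisation: an all-zero register would stay at zero forever.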

LFSR-25 Operations
Bits are shifted right by one position. The new bit, which appears in the leftmost position of
the register and at the output, is the sum modulo 2 (XOR) of four bits from the register.

LFSR-25 register operations

Inverter Modes
Depending on the current CSS mode, the inverters invert or do not invert the output bits
from the LFSR registers.
The following table shows for which registers and in which CSS modes the output bits of
each byte are inverted.

Mode              LFSR-17   LFSR-25
Authentication    yes       no
Disc key          no        no
Title key         no        yes
Audiovisual data  yes       no

Table of inverters operations

Lookup Table
Each byte to be encrypted or decrypted is replaced by another byte, based on one of the five lookup
tables used in CSS. There are different tables for encryption and decryption and different tables
for different CSS modes.

33 73 3B 26 63 23 6B 76 3E 7E 36 2B 6E 2E 66 7B

D3 93 DB 06 43 03 4B 96 DE 9E D6 0B 4E 0E 46 9B

57 17 5F 82 C7 87 CF 12 5A 1A 52 8F CA 8A C2 1F

D9 99 D1 00 49 09 41 90 D8 98 D0 01 48 08 40 91

3D 7D 35 24 6D 2D 65 74 3C 7C 34 25 6C 2C 64 75

DD 9D D5 04 4D 0D 45 94 DC 9C D4 05 4C 0C 44 95

59 19 51 80 C9 89 C1 10 58 18 50 81 C8 88 C0 11

D7 97 DF 02 47 07 4F 92 DA 9A D2 0F 4A 0A 42 9F

53 13 5B 86 C3 83 CB 16 5E 1E 56 8B CE 8E C6 1B

B3 F3 BB A6 E3 A3 EB F6 BE FE B6 AB EE AE E6 FB

37 77 3F 22 67 27 6F 72 3A 7A 32 2F 6A 2A 62 7F

B9 F9 B1 A0 E9 A9 E1 F0 B8 F8 B0 A1 E8 A8 E0 F1

5D 1D 55 84 CD 8D C5 14 5C 1C 54 85 CC 8C C4 15
BD FD B5 A4 ED AD E5 F4 BC FC B4 A5 EC AC E4 F5

39 79 31 20 69 29 61 70 38 78 30 21 68 28 60 71

B7 F7 BF A2 E7 A7 EF F2 BA FA B2 AF EA AA E2 FF

Lookup table for decrypting of audiovisual data from a DVD

Block Symmetric Ciphers


Block ciphers work on larger fragments of data (called blocks) at a time, encrypting data blocks
one by one. During encryption, input data are divided into blocks of fixed length and each of them
is processed by several functions together with the secret key. The lengths of the data block and the key, as well as the
functions used in the process, are determined by the algorithm. The inverse functions are used
for decryption.

Block cipher algorithms are often able to combine data from different blocks in order to provide
additional security (e.g. AES in CBC mode).
Block ciphers may be described as efficient and deterministic functions which permute the contents
of each data block. They simply mix all the bits in each block. The permutation functions must be
pseudorandom and the output should be indistinguishable from truly random data. To allow
decryption, the inverse permutations must be used. The inverse permutations also need to be
quite efficient.

DES (Data Encryption Standard)
BLOCK CIPHER WITH SYMMETRIC SECRET KEY

o Block length = 64 bits
o Key length = 56 bits

DES was one of the most popular block symmetric ciphers. It was created in the early 1970s at
IBM and adopted as a federal standard by NBS in 1976.

Usage
DES is one of the most thoroughly examined encryption algorithms. In 1981 it was included
in ANSI standards as the Data Encryption Algorithm for the private sector.

At the beginning of the 21st century, DES started to be considered insecure, mainly due to
its relatively short secret key, which makes it vulnerable to brute force attacks. In 2001 the DES
cipher was replaced by AES. DES is still one of the most popular ciphers.

Algorithm
DES uses a 64-bit key; however, only 56 bits are actually used by the algorithm.
Every 8th bit of the key is a control bit and can be used for parity checking.
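
The role of the control bits can be illustrated with a short sketch. It assumes the usual convention that each key byte carries odd parity and that the parity bit is the last (least significant) bit of each byte; the function names are chosen for this example. Note that it only strips the parity bits and does not perform the PC-1 permutation described below.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// True if every byte of the 64-bit key contains an odd number of 1 bits
// (assumed odd-parity convention).
bool des_key_has_odd_parity(const std::array<std::uint8_t, 8>& key) {
    for (std::uint8_t byte : key) {
        unsigned ones = 0;
        for (unsigned bit = 0; bit < 8; ++bit) ones += (byte >> bit) & 1u;
        if (ones % 2 == 0) return false;
    }
    return true;
}

// Drop the 8th bit of every byte and pack the remaining 7 bits of each
// byte into a 56-bit value (no permutation is applied here).
std::uint64_t des_effective_key_bits(const std::array<std::uint8_t, 8>& key) {
    std::uint64_t k = 0;
    for (std::uint8_t byte : key) k = (k << 7) | static_cast<std::uint64_t>(byte >> 1);
    return k;
}
```

For example, the key 01 01 01 01 01 01 01 01 has valid odd parity and all of its 56 effective bits are zero.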

In the encryption process, the data is first divided into 64-bit long blocks. Then, each block
undergoes the following operations:

1. Initial permutation rearranges bits in a certain, predefined way. This step does not
enhance the security of algorithm. It was introduced to make passing data into
encryption machines easier, at the times when the cipher was invented.
2. The input data is divided into two 32-bit parts: the left one and the right one.
3. 56 bits are selected from the 64-bit key (Permutation PC-1). They are then divided into
two 28-bit parts.
4. Sixteen rounds of the following operations are then performed (steps 2 to 5 of each
round form the so-called Feistel function):
1. Both halves of key are rotated left by one or two bits (specified for each round).
Then 48 subkey bits are selected by Permutation PC-2.
2. The right half of data is expanded to 48 bits using the Expansion Permutation.
3. The expanded half of data is combined using XOR operation with the 48-bit
subkey chosen earlier.
4. The combined data is divided into eight 6-bit pieces. Each part is then an input to
one of the S-Boxes (the first 6-bit part is the input to the first S-Box, the second
6-bit part enters the second S-Box, and so on). The first and the last bits stand
for the row, and the rest of bits define the column of an S-Box table. After
determining the location in the table, the value is read and converted to binary
format. The output from each S-Box is 4-bit long, so the output from all S-Boxes
is 32-bit long. Each S-box has a different structure.
5. The output bits from S-Boxes are combined, and they undergo P-Box
Permutation.
6. Then, the bits of the changed right side are XORed with the bits of the left side.
7. The modified left half of data becomes a new right half, and the previous right
half becomes a new left side.
5. After all sixteen rounds, the left and the right halves of data are joined back into
one 64-bit block (the swap made in the last round is undone).
6. The Final Permutation is performed.
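
The way a 6-bit piece selects a cell of an S-Box (round step 4) can be sketched as below. The struct and function names are chosen for this example, and the S-Box contents themselves are not included.

```cpp
#include <cassert>
#include <cstdint>

// Row and column of the S-Box cell selected by a 6-bit input.
struct SBoxIndex {
    unsigned row;     // 0..3, from the first and the last bit
    unsigned column;  // 0..15, from the four middle bits
};

SBoxIndex sbox_index(std::uint8_t six_bits) {
    assert(six_bits < 64);
    // The first bit (bit 5) becomes the high row bit, the last bit (bit 0) the low row bit.
    const unsigned row = ((six_bits >> 4) & 0x2u) | (six_bits & 0x1u);
    // The four middle bits (bits 4..1) form the column number.
    const unsigned column = (six_bits >> 1) & 0xFu;
    return {row, column};
}
```

For the input 011011 the first and last bits give row 01 (1) and the middle bits give column 1101 (13).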
During decryption, the same set of operations is performed, but the subkeys are
applied in reverse order (compared to encryption).
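
The round structure and the reversed subkey order can be illustrated with a generic Feistel network. This sketch is not DES: the permutations, the key schedule and the real round function are left out, and the round function is passed in as a parameter. The names RoundFunction and feistel are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Round function: takes the right half and a subkey, returns 32 bits.
using RoundFunction = std::function<std::uint32_t(std::uint32_t, std::uint64_t)>;

// Generic Feistel network on a 64-bit block, one round per subkey.
std::uint64_t feistel(std::uint64_t block,
                      const std::vector<std::uint64_t>& subkeys,
                      const RoundFunction& f) {
    std::uint32_t left = static_cast<std::uint32_t>(block >> 32);
    std::uint32_t right = static_cast<std::uint32_t>(block);
    for (std::uint64_t k : subkeys) {
        // The left half is XORed with f(right, k) and the halves swap places.
        const std::uint32_t new_right = left ^ f(right, k);
        left = right;
        right = new_right;
    }
    // The last swap is undone, so the output is (right, left).
    return (static_cast<std::uint64_t>(right) << 32) | left;
}
```

Decryption calls the same function with the subkeys in reverse order. This works for any round function, which is why the Feistel function itself does not need to be invertible.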

Weak keys in DES


A weak key in the DES cipher is a key which generates the same subkey in every
round of the encryption algorithm. There are four known weak keys in DES (expressed in
hexadecimal):

o 0x00000000000000 (only zeros)
o 0xFFFFFFFFFFFFFF (only ones)
o 0x0000000FFFFFFF (zeros followed by ones)
o 0xFFFFFFF0000000 (ones followed by zeros)
Semiweak keys in DES
A semiweak key in the DES cipher is one of a pair of keys for which encryption with one key
acts as decryption with the other: encrypting a plaintext with the first key and then with the second
returns the original plaintext. There are twelve known semiweak
keys in DES, forming six pairs (expressed in hexadecimal, along with parity bits):

o 0x01E001E001F101F1 and 0xE001E001F101F101
o 0xFE01FE01FE01FE01 and 0x01FE01FE01FE01FE
o 0x1FE01FE00EF10EF1 and 0xE01FE01FF10EF10E
o 0xE0FEE0FEF1FEF1FE and 0xFEE0FEE0FEF1FEF1
o 0x1F011F010E010E01 and 0x011F011F010E010E
o 0xFE1FFE1FFE0EFE0E and 0x1FFE1FFE0EFE0EFE
Initial and Final Permutations in DES
The Initial and Final Permutations have no influence on security. They don't use a secret key and
can be undone by anybody. They were introduced to make hardware implementation easier in
some contexts. A hardware circuit which receives data over an 8-bit bus can accumulate the bits
into eight shift registers, which is more efficient (in terms of circuit area) than a single 64-bit
register. This process naturally performs the Initial Permutation of DES.

Let's assume that somebody is designing a hardware circuit which should do some encryption
with DES. The data will be received in blocks of 8 bits. This means that there are 8 lines, each
yielding one bit at each clock. A common device for accumulating data is a shift register: the input
line plugs into a one-bit register, which itself plugs into another, which plugs into a third register,
and so on. At each clock, each register receives the contents from the previous register, and
the first register accepts the new bit. Therefore, the contents are shifted.

With an 8-bit bus, 8 shift registers are needed, each receiving 8 bits for every input block. The first
register receives bits 1, 9, 17, 25, 33, 41, 49 and 57. The second register receives bits 2, 10, 18,
..., and so on. After eight clocks, eight registers received the complete 64-bit block and it is time
to proceed with the DES algorithm itself.

If initial permutation was not used, then the first step of the first round would extract the 'left half'
(32 bits) which, at that point, would consist of the leftmost 4 bits of each of the 8 shift registers.
The 'right half' would also get bits from all the 8 shift registers. If you think of it as wires from
the shift registers to the units which use the bits, then you end up with a bunch of wires which
heavily cross each other. Crossing is doable but requires some circuit area, which is
the expensive resource in hardware designs.

On the other hand, if you consider that the wires must extract the input bits and permute them as
per the DES specification, you will find out that there is no crossing anymore. In other words,
the accumulation of bits into the shift registers inherently performs a permutation of the bits, which
is exactly the initial permutation of DES. By defining that initial permutation, the DES standard
says: 'well, now that you have accumulated the bits in eight shift registers, just use them in that
order, that's fine'.
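
The correspondence between the shift registers and the Initial Permutation table can be checked with a short sketch. It assumes (as an interpretation of the description above, not as part of the DES standard) that bus line i carries bits i, i+8, i+16, ..., and that each register is read with the most recently received bit first. The function name fill_shift_registers is illustrative only.

```python
# Initial Permutation table of DES, read row by row.
IP = [
    58, 50, 42, 34, 26, 18, 10, 2,
    60, 52, 44, 36, 28, 20, 12, 4,
    62, 54, 46, 38, 30, 22, 14, 6,
    64, 56, 48, 40, 32, 24, 16, 8,
    57, 49, 41, 33, 25, 17, 9, 1,
    59, 51, 43, 35, 27, 19, 11, 3,
    61, 53, 45, 37, 29, 21, 13, 5,
    63, 55, 47, 39, 31, 23, 15, 7,
]


def fill_shift_registers():
    """Simulate eight clocks of an 8-bit bus; the bits are labelled 1..64."""
    registers = [[] for _ in range(8)]
    for clock in range(8):
        for line in range(8):
            bit_label = 8 * clock + line + 1
            # The new bit enters at the front; older bits shift along.
            registers[line].insert(0, bit_label)
    return registers


registers = fill_shift_registers()
ip_rows = [IP[8 * r: 8 * r + 8] for r in range(8)]

# The rows of the table correspond to bus lines 2, 4, 6, 8, 1, 3, 5, 7.
line_order = [2, 4, 6, 8, 1, 3, 5, 7]
rows_from_registers = [registers[line - 1] for line in line_order]
print(rows_from_registers == ip_rows)  # True
```

Under these assumptions, every row of the table is simply the content of one shift register, which is why no wire crossing is needed.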

The same thing is done again at the end of the algorithm during the Final Permutation.

DES was designed at a time when 8-bit buses were the state of the art and one thousand
transistors were an awfully expensive amount of logic.
Security of DES
DES is considered to be a well-designed and effective algorithm. However, soon after
its publication, many cryptographers believed that its key was too small. At present,
the 56-bit key can be broken relatively cheaply by a brute-force attack within a few
days.
DES is quite easy to attack when some of the plaintext is known. The intruder can try all 2^56 possible
keys, looking for a key which, when used to decrypt an encrypted block of the known plaintext,
produces exactly that plaintext. In practice, it is enough to know two or three blocks
of plaintext to determine whether a candidate key that works for them will also
work for other blocks. The probability that a found key is incorrect and correctly converts
only the known plaintext blocks is negligibly small.
The fastest known attacks on DES use linear cryptanalysis. They require 2^43 known blocks of
plaintext and their time complexity is around 2^39 to 2^43.
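
The claim that two or three known blocks are enough can be illustrated with a rough estimate. The sketch below assumes, as a simplifying model that is not stated in the text, that decryption under a wrong key behaves like a random mapping, so a wrong key reproduces one known 64-bit block with probability 2^-64. The function name expected_false_keys is illustrative.

```python
from fractions import Fraction

KEY_BITS = 56     # effective DES key size
BLOCK_BITS = 64   # DES block size


def expected_false_keys(known_blocks):
    """Expected number of wrong keys that match all known blocks.

    Assumption: each wrong key matches one block with probability 2**-64,
    independently for every block.
    """
    wrong_keys = 2 ** KEY_BITS - 1
    return wrong_keys * Fraction(1, 2 ** (BLOCK_BITS * known_blocks))


for blocks in (1, 2, 3):
    print(blocks, float(expected_false_keys(blocks)))
```

Under this model, a single known block already leaves fewer than one expected false key, and every additional block reduces the expectation by a further factor of 2^64.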

Block Diagram of DES Algorithm


Block Diagram of DES Feistel Function
Block Diagram of DES Key Schedule
Maths:
Permutations are presented in the form of tables only to make them easier to comprehend; one
should bear in mind that the input data are vectors, not matrices.
Initial Permutation
Initial Permutation is processed on each block of input data in the beginning of encryption.

58 50 42 34 26 18 10 2

60 52 44 36 28 20 12 4

62 54 46 38 30 22 14 6

64 56 48 40 32 24 16 8

57 49 41 33 25 17 9 1

59 51 43 35 27 19 11 3

61 53 45 37 29 21 13 5

63 55 47 39 31 23 15 7

Initial Permutation Table

Scheme of DES Initial Permutation

Key Permutation PC-1


56 bits are selected from the 64-bit key; every eighth bit is excluded from encryption and can be
used for parity control. The selected bits are divided into two 28-bit halves, which are then
rotated in each round.
Left half

57 49 41 33 25 17 9

1 58 50 42 34 26 18

10 2 59 51 43 35 27

19 11 3 60 52 44 36

Right half

63 55 47 39 31 23 15

7 62 54 46 38 30 22

14 6 61 53 45 37 29

21 13 5 28 20 12 4

PC-1 Permutation Table

Scheme of DES PC-1 permutation

Expansion Permutation
The Expansion Permutation is the first step of the Feistel function in each round. It expands the
right half of the data from 32 bits to 48 bits.

32 1 2 3 4 5 4 5
6 7 8 9 8 9 10 11

12 13 12 13 14 15 16 17

16 17 18 19 20 21 20 21

22 23 24 25 24 25 26 27

28 29 28 29 30 31 32 1

Expansion Permutation Table
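
A minimal sketch of how such a table is applied is shown below. The table is the one listed above, regrouped six entries per row; the helper name permute and the idea of labelling the input bits 1..32 to trace them are illustrative choices, not part of the DES specification.

```python
from collections import Counter

# Expansion table E, six entries per row.
E = [
    32, 1, 2, 3, 4, 5,
    4, 5, 6, 7, 8, 9,
    8, 9, 10, 11, 12, 13,
    12, 13, 14, 15, 16, 17,
    16, 17, 18, 19, 20, 21,
    20, 21, 22, 23, 24, 25,
    24, 25, 26, 27, 28, 29,
    28, 29, 30, 31, 32, 1,
]


def permute(bits, table):
    """Output element i is the input element at 1-based position table[i]."""
    return [bits[pos - 1] for pos in table]


right_half = list(range(1, 33))   # label the 32 input bits 1..32
expanded = permute(right_half, E)

usage = Counter(expanded)
duplicated = sorted(bit for bit, count in usage.items() if count == 2)
print(len(expanded), duplicated)
```

Half of the input bits appear twice in the output, which is how 32 bits become 48.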

Scheme of DES expansion permutation

Binary Rotation
In each round of Feistel functions, the 28-bit halves of key are rotated left by one or two bits.

No. of cycle / Amount of bits

1 1

2 1

3 2

4 2
5 2

6 2

7 2

8 2

9 1

10 2

11 2

12 2

13 2

14 2

15 2

16 1

Binary Rotation Table
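
The rotation schedule can be sketched as follows. The starting values c0 and d0 are arbitrary 28-bit numbers chosen for illustration (they are not derived from a real key), and the function names are assumptions of this example.

```python
# Rotation amounts for cycles 1..16, as in the table above.
SHIFTS = [1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1]

MASK28 = (1 << 28) - 1


def rotate_left_28(value, amount):
    """Rotate a 28-bit value left by the given number of bits."""
    return ((value << amount) | (value >> (28 - amount))) & MASK28


def rotated_halves(c0, d0):
    """Yield the pair of key halves after each of the 16 cycles."""
    c, d = c0, d0
    for amount in SHIFTS:
        c = rotate_left_28(c, amount)
        d = rotate_left_28(d, amount)
        yield c, d


c0, d0 = 0x0F0F0F0, 0xABCDEF1   # arbitrary 28-bit example values
states = list(rotated_halves(c0, d0))
print(sum(SHIFTS), states[-1] == (c0, d0))  # 28 True
```

Because the rotation amounts add up to 28, both halves return to their starting values after the sixteenth cycle.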

Key Permutation PC-2


During Key Permutation PC-2 48 bits are selected from the 56-bit subkey obtained for a given
round of Feistel functions.

14 17 11 24 1 5 3 28

15 6 21 10 23 19 12 4

26 8 16 7 27 20 13 2

41 52 31 37 47 55 30 40

51 45 33 48 44 49 39 56

34 53 46 42 50 36 29 32
PC-2 Permutation Table

Scheme of PC-2 permutation

S-Boxes
In the S-Boxes, each 6-bit input block is replaced by a 4-bit output.

If the input bits are marked as a1, a2, a3, a4, a5 and a6, then a1 and a6 form a 2-bit number
that selects a row, and a2, a3, a4 and a5 form a 4-bit number that selects a column (both
columns and rows are numbered from zero). The output of the block is the value found where
the row and the column cross. For example, using S1, if the input string of bits is 101010, the
row is 10 (2), the column is 0101 (5), and the output is 0110 (6).
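
The lookup rule can be sketched in a few lines. The table is S1 as listed below; the function name sbox_lookup and the use of bit strings are illustrative choices.

```python
# S-Box S1 of DES: four rows of sixteen 4-bit values.
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox_lookup(box, six_bits):
    """Look up a 6-bit string such as '101010' (a1 first, a6 last)."""
    row = int(six_bits[0] + six_bits[5], 2)   # a1 and a6 select the row
    column = int(six_bits[1:5], 2)            # a2..a5 select the column
    return format(box[row][column], "04b")


print(sbox_lookup(S1, "101010"))  # 0110
```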

       x0000x x0001x x0010x x0011x x0100x x0101x x0110x x0111x x1000x x1001x x1010x x1011x x1100x x1101x x1110x x1111x
0yyyy0     14      4     13      1      2     15     11      8      3     10      6     12      5      9      0      7
0yyyy1      0     15      7      4     14      2     13      1     10      6     12     11      9      5      3      8
1yyyy0      4      1     14      8     13      6      2     11     15     12      9      7      3     10      5      0
1yyyy1     15     12      8      2      4      9      1      7      5     11      3     14     10      0      6     13

S1

       x0000x x0001x x0010x x0011x x0100x x0101x x0110x x0111x x1000x x1001x x1010x x1011x x1100x x1101x x1110x x1111x
0yyyy0     15      1      8     14      6     11      3      4      9      7      2     13     12      0      5     10
0yyyy1      3     13      4      7     15      2      8     14     12      0      1     10      6      9     11      5
1yyyy0      0     14      7     11     10      4     13      1      5      8     12      6      9      3      2     15
1yyyy1     13      8     10      1      3     15      4      2     11      6      7     12      0      5     14      9

S2

       x0000x x0001x x0010x x0011x x0100x x0101x x0110x x0111x x1000x x1001x x1010x x1011x x1100x x1101x x1110x x1111x
0yyyy0     10      0      9     14      6      3     15      5      1     13     12      7     11      4      2      8
0yyyy1     13      7      0      9      3      4      6     10      2      8      5     14     12     11     15      1
1yyyy0     13      6      4      9      8     15      3      0     11      1      2     12      5     10     14      7
1yyyy1      1     10     13      0      6      9      8      7      4     15     14      3     11      5      2     12

S3

       x0000x x0001x x0010x x0011x x0100x x0101x x0110x x0111x x1000x x1001x x1010x x1011x x1100x x1101x x1110x x1111x
0yyyy0      7     13     14      3      0      6      9     10      1      2      8      5     11     12      4     15
0yyyy1     13      8     11      5      6     15      0      3      4      7      2     12      1     10     14      9
1yyyy0     10      6      9      0     12     11      7     13     15      1      3     14      5      2      8      4
1yyyy1      3     15      0      6     10      1     13      8      9      4      5     11     12      7      2     14

S4

       x0000x x0001x x0010x x0011x x0100x x0101x x0110x x0111x x1000x x1001x x1010x x1011x x1100x x1101x x1110x x1111x
0yyyy0      2     12      4      1      7     10     11      6      8      5      3     15     13      0     14      9
0yyyy1     14     11      2     12      4      7     13      1      5      0     15     10      3      9      8      6
1yyyy0      4      2      1     11     10     13      7      8     15      9     12      5      6      3      0     14
1yyyy1     11      8     12      7      1     14      2     13      6     15      0      9     10      4      5      3

S5

       x0000x x0001x x0010x x0011x x0100x x0101x x0110x x0111x x1000x x1001x x1010x x1011x x1100x x1101x x1110x x1111x
0yyyy0     12      1     10     15      9      2      6      8      0     13      3      4     14      7      5     11
0yyyy1     10     15      4      2      7     12      9      5      6      1     13     14      0     11      3      8
1yyyy0      9     14     15      5      2      8     12      3      7      0      4     10      1     13     11      6
1yyyy1      4      3      2     12      9      5     15     10     11     14      1      7      6      0      8     13

S6

       x0000x x0001x x0010x x0011x x0100x x0101x x0110x x0111x x1000x x1001x x1010x x1011x x1100x x1101x x1110x x1111x
0yyyy0      4     11      2     14     15      0      8     13      3     12      9      7      5     10      6      1
0yyyy1     13      0     11      7      4      9      1     10     14      3      5     12      2     15      8      6
1yyyy0      1      4     11     13     12      3      7     14     10     15      6      8      0      5      9      2
1yyyy1      6     11     13      8      1      4     10      7      9      5      0     15     14      2      3     12

S7

       x0000x x0001x x0010x x0011x x0100x x0101x x0110x x0111x x1000x x1001x x1010x x1011x x1100x x1101x x1110x x1111x
0yyyy0     13      2      8      4      6     15     11      1     10      9      3     14      5      0     12      7
0yyyy1      1     15     13      8     10      3      7      4     12      5      6     11      0     14      9      2
1yyyy0      7     11      4      1      9     12     14      2      0      6     10     13     15      3      5      8
1yyyy1      2      1     14      7      4     10      8     13     15     12      9      0      3      5      6     11

S8

Permutation P
Permutation P is processed on the 32-bit output block from eight S-Boxes.

16 7 20 21 29 12 28 17

1 15 23 26 5 18 31 10

2 8 24 14 32 27 3 9

19 13 30 6 22 11 4 25

Permutation P Table

Scheme of permutation P
Final Permutation
Final Permutation is processed for each block of data after the last round of Feistel functions. It
is the inverse of the Initial Permutation.

40 8 48 16 56 24 64 32

39 7 47 15 55 23 63 31

38 6 46 14 54 22 62 30

37 5 45 13 53 21 61 29

36 4 44 12 52 20 60 28

35 3 43 11 51 19 59 27

34 2 42 10 50 18 58 26

33 1 41 9 49 17 57 25

Final Permutation Table
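
That the Final Permutation undoes the Initial Permutation can be checked directly from the two tables. In the sketch below the helper name permute and the trick of labelling the bits 1..64 are illustrative choices.

```python
IP = [
    58, 50, 42, 34, 26, 18, 10, 2,
    60, 52, 44, 36, 28, 20, 12, 4,
    62, 54, 46, 38, 30, 22, 14, 6,
    64, 56, 48, 40, 32, 24, 16, 8,
    57, 49, 41, 33, 25, 17, 9, 1,
    59, 51, 43, 35, 27, 19, 11, 3,
    61, 53, 45, 37, 29, 21, 13, 5,
    63, 55, 47, 39, 31, 23, 15, 7,
]

FP = [
    40, 8, 48, 16, 56, 24, 64, 32,
    39, 7, 47, 15, 55, 23, 63, 31,
    38, 6, 46, 14, 54, 22, 62, 30,
    37, 5, 45, 13, 53, 21, 61, 29,
    36, 4, 44, 12, 52, 20, 60, 28,
    35, 3, 43, 11, 51, 19, 59, 27,
    34, 2, 42, 10, 50, 18, 58, 26,
    33, 1, 41, 9, 49, 17, 57, 25,
]


def permute(bits, table):
    """Output element i is the input element at 1-based position table[i]."""
    return [bits[pos - 1] for pos in table]


block = list(range(1, 65))            # label the 64 bits 1..64
round_trip = permute(permute(block, IP), FP)
print(round_trip == block)  # True
```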

Scheme of DES final permutation

RC2
BLOCK CIPHER WITH SYMMETRIC SECRET KEY

o Block length = 8 bytes


o Key length = from one byte up to 128 bytes
RC2 is a symmetric block cipher which was popular in the first half of the 1990s.
It was strongly promoted by US government agencies.

Usage
RC2 was designed in 1987 by Ron Rivest of RSA Security, who also created several other ciphers.
The RC2 algorithm was kept secret until 1996, when it was anonymously posted to the sci.crypt
newsgroup.
RC2 is also known as ARC2. The acronym RC is understood as "Rivest Cipher" or "Ron's Code".

Algorithm
RC2 is a block cipher, and the block size is 8 bytes (64 bits). This means that the input data is
first divided into blocks of 8 bytes and then each of them is processed separately.

Each data block is treated as four words, each word has 16 bits (2 bytes). The array of four words
is presented as R[0] R[1] R[2] R[3]. Both encryption and decryption take this array as input
and modify the four words. The output is returned in the same array.
Key Expansion
Apart from the data, the RC2 cipher takes a secret user key as input. The key provided by the
user may be from one byte up to 128 bytes long. Let us denote the key size (in bytes) as Keysize.
The first operation which RC2 performs is to expand the key into 128 key bytes,
which will be used for encryption or decryption of all data blocks.
The user also provides a second value, denoted Keybit-limit, which determines the maximum
effective key size in bits. This means that no matter how many bytes of the secret key
the user has provided, the cipher will operate on a key whose effective size is at most
Keybyte-limit bytes, where (using integer division):
Keybyte-limit = (Keybit-limit + 7) / 8

Of course, the real strength of the key will be even smaller if the user provides fewer key bytes
than Keybyte-limit.
In order to ensure that only that many bits are used to secure the data, the following bit mask
is defined:
Keymask = 255 mod 2^(8 + Keybit-limit - 8·Keybyte-limit)

The 8 - (8·Keybyte-limit - Keybit-limit) least significant bits of Keymask are set; the rest are zeros.
Key Words and Bytes
The mathematical operations that are performed on the key operate on both single bytes and
whole words. So, for convenience, the numbers in the key may be presented as either 64 words:
K[0], K[1], ..., K[63]

or 128 separate bytes:


L[0], L[1], ..., L[127].
Note that each part of the key array may be addressed using either bytes or words. The two
views are related by the following equation:
K[i] = L[2·i] + 256·L[2·i+1]
That is, the low-order byte of each word is stored before its high-order byte.
Key Expansion Algorithm
The first step of the key expansion algorithm is to fill the first Keysize bytes of key array with the
bytes received from the user:
L[0], L[1], ..., L[Keysize-1].
Then, the following steps are performed on the key data. TPI mentioned below is a fixed array of
size 256, filled randomly with values from 0 to 255:
1. For i = Keysize, Keysize+1, and up until 127:
L[i] = TPI[(L[i-1] + L[i-Keysize]) mod 256]
2. L[128-Keybyte-limit] = TPI[L[128-Keybyte-limit] & Keymask]
3. For i = 127-Keybyte-limit, down until 0:
L[i] = TPI[L[i+1] XOR L[i+Keybyte-limit]]
The effective encryption key is determined by the bytes:
L[128-Keybyte-limit], L[128-Keybyte-limit+1]..., L[127]

The operation & performed in the second step above limits the effective size of the key down to
just Keybit-limit bits.
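
The three steps above can be sketched as follows. The table is Table PI as listed in the Maths section of this chapter; the names expand_key, user_key and keybit_limit are illustrative, and the accepted range of 1 to 1024 for Keybit-limit is taken from RFC 2268 rather than from the text above.

```python
# Table PI (TPI), 256 bytes, as listed in this chapter.
TPI = bytes.fromhex(
    "d978f9c419ddb5ed28e9fd794aa0d89d"
    "c67e37832b76538e624c6488448bfba2"
    "179a59f587b34f1361456d8d09817d32"
    "bd8f40eb86b77b0bf09521225c6b4e82"
    "54d66593ce60b21c7356c014a78cf1dc"
    "1275ca1f3bbee4d1423dd430a33cb626"
    "6fbf0eda4669075727f21d9bbc944303"
    "f811c7f690ef3ee706c3d52fc8661ed7"
    "08e8eade8052eef784aa72ac354d6a2a"
    "961ad2715a1549744b9fd05e0418a4ec"
    "c2e0416e0f51cbcc2491af50a1f47039"
    "997c3a8523b8b47afc02365b25559731"
    "2d5dfa98e38a92ae05df2910676cbac9"
    "d300e6cfe19ea82c6316013f58e289a9"
    "0d38341bab33ffb0bb480c5fb9b1cd2e"
    "c5f3db47e5a59c770aa62068fe7fc1ad"
)


def expand_key(user_key, keybit_limit):
    """Return the expanded key as 128 bytes L and as 64 words K."""
    keysize = len(user_key)
    if not 1 <= keysize <= 128:
        raise ValueError("the key must be 1 to 128 bytes long")
    if not 1 <= keybit_limit <= 1024:
        raise ValueError("Keybit-limit must be between 1 and 1024")

    keybyte_limit = (keybit_limit + 7) // 8
    keymask = 255 % (2 ** (8 + keybit_limit - 8 * keybyte_limit))

    L = bytearray(128)
    L[:keysize] = user_key
    # Step 1: fill the rest of the array.
    for i in range(keysize, 128):
        L[i] = TPI[(L[i - 1] + L[i - keysize]) % 256]
    # Step 2: limit the effective key size.
    L[128 - keybyte_limit] = TPI[L[128 - keybyte_limit] & keymask]
    # Step 3: propagate the limited key back to the start of the array.
    for i in range(127 - keybyte_limit, -1, -1):
        L[i] = TPI[L[i + 1] ^ L[i + keybyte_limit]]

    K = [L[2 * i] + 256 * L[2 * i + 1] for i in range(64)]
    return bytes(L), K


L, K = expand_key(bytes(8), 63)
print(len(L), len(K))  # 128 64
```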

Encryption
The encryption procedure takes as input four words: R[0] R[1] R[2] R[3], which form one
block of data. Every block of data will be encrypted by using the same 64 words of expanded
secret key: K[0] K[1] ... K[63].
The following encryption steps are performed on every data block:

1. Initialize the counter j to 0.


2. Perform five Mixing Rounds.
3. Perform one Mashing Round.
4. Perform six Mixing Rounds.
5. Perform one Mashing Round.
6. Perform five Mixing Rounds.
Because every Mixing Round uses 4 key words and there are 16 Mixing Rounds, all 64 key words
(128 key bytes) are used during encryption of one data block. The Mashing Rounds select key
words in a data-dependent manner.

Each Mixing Round consists of four Mixing operations, performed on four words of the data block:

1. Mixing R[0]
2. Mixing R[1]
3. Mixing R[2]
4. Mixing R[3]
Each Mashing Round consists of four Mashing operations, performed on four words of the data
block:

1. Mashing R[0]
2. Mashing R[1]
3. Mashing R[2]
4. Mashing R[3]

Decryption
The decryption procedure takes as input four ciphertext words: R[0] R[1] R[2] R[3], which
form one block of encrypted data. Every block of data will be decrypted by using the same 64
words of expanded secret key: K[0] K[1] ... K[63].
The following decryption steps are performed on every ciphertext block:

1. Initialize the counter j to 63.


2. Perform five R-Mixing Rounds.
3. Perform one R-Mashing Round.
4. Perform six R-Mixing Rounds.
5. Perform one R-Mashing Round.
6. Perform five R-Mixing Rounds.
Each R-Mixing Round consists of four R-Mixing operations, performed on four words of the
encrypted data block:

1. R-Mixing R[3]
2. R-Mixing R[2]
3. R-Mixing R[1]
4. R-Mixing R[0]
Each R-Mashing Round consists of four R-Mashing operations, performed on four words of the
encrypted data block:

1. R-Mashing R[3]
2. R-Mashing R[2]
3. R-Mashing R[1]
4. R-Mashing R[0]

Block Diagram of RC2 Encryption


Block Diagram of RC2 Decryption
Maths:
The operations of the RC2 algorithm are presented in order from the low-level functions to the
more complex operations that use the functions described before them.

Table PI (TPI)
Table PI is a fixed array of 256 elements, used during key expansion.

Table PI contains a random permutation of all possible byte values from
0 to 255. The order of the numbers is derived from the digits of PI: 3.1415...

d9 78 f9 c4 19 dd b5 ed 28 e9 fd 79 4a a0 d8 9d

c6 7e 37 83 2b 76 53 8e 62 4c 64 88 44 8b fb a2

17 9a 59 f5 87 b3 4f 13 61 45 6d 8d 09 81 7d 32

bd 8f 40 eb 86 b7 7b 0b f0 95 21 22 5c 6b 4e 82

54 d6 65 93 ce 60 b2 1c 73 56 c0 14 a7 8c f1 dc

12 75 ca 1f 3b be e4 d1 42 3d d4 30 a3 3c b6 26

6f bf 0e da 46 69 07 57 27 f2 1d 9b bc 94 43 03

f8 11 c7 f6 90 ef 3e e7 06 c3 d5 2f c8 66 1e d7

08 e8 ea de 80 52 ee f7 84 aa 72 ac 35 4d 6a 2a

96 1a d2 71 5a 15 49 74 4b 9f d0 5e 04 18 a4 ec

c2 e0 41 6e 0f 51 cb cc 24 91 af 50 a1 f4 70 39

99 7c 3a 85 23 b8 b4 7a fc 02 36 5b 25 55 97 31

2d 5d fa 98 e3 8a 92 ae 05 df 29 10 67 6c ba c9

d3 00 e6 cf e1 9e a8 2c 63 16 01 3f 58 e2 89 a9

0d 38 34 1b ab 33 ff b0 bb 48 0c 5f b9 b1 cd 2e

c5 f3 db 47 e5 a5 9c 77 0a a6 20 68 fe 7f c1 ad
Table PI in hexadecimal notation

Binary Left Rotation: <<<


The a-bit left rotation of a 2-byte word w is denoted in the algorithm as w <<< a. The leftmost a bits
move to the rightmost positions.
Binary Right Rotation: >>>
The a-bit right rotation of a 2-byte word w is denoted in the algorithm as w >>> a. The
rightmost a bits move to the leftmost positions.
Mixing Operation
Mixing Operation modifies one data word and increments the counter j during encryption process.
The following three operations are performed during Mixing:
R[i] = R[i] + K[j]
       + (R[(i+4-1) mod 4] & R[(i+4-2) mod 4])
       + ((~R[(i+4-1) mod 4]) & R[(i+4-3) mod 4])
j = j + 1
R[i] = R[i] <<< s[i]

where j is the counter variable and the vector s contains the following
values: s[0] = 1, s[1] = 2, s[2] = 3, s[3] = 5.
Mashing Operation
Mashing Operation modifies one data word during encryption process.

The following operation is performed during Mashing:


R[i] = R[i] + K[R[(i+4-1) mod 4] & 63]
R-Mixing Operation
R-Mixing Operation modifies one ciphertext word and decrements the counter j during decryption
process.
The following three operations are performed during R-Mixing:
R[i] = R[i] >>> s[i]
R[i] = R[i] - K[j]
- (R[(i+4-1) mod 4] & R[(i+4-2) mod 4])
- ((~R[(i+4-1) mod 4]) & R[(i+4-3) mod 4])
j = j - 1

where j is the counter variable and the vector s contains the following
values: s[0] = 1, s[1] = 2, s[2] = 3, s[3] = 5.
R-Mashing Operation
R-Mashing Operation modifies one data word during decryption process.

The following operation is performed during R-Mashing:


R[i] = R[i] - K[R[(i+4-1) mod 4] & 63]
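
The four operations and the round structure can be combined into a short sketch that checks that decryption undoes encryption. All additions and subtractions are taken modulo 2^16 because the words are 16 bits long. The key words K used here are arbitrary placeholder values (they are not produced by the RC2 key expansion), and the function names are illustrative.

```python
MASK16 = 0xFFFF
S = [1, 2, 3, 5]            # rotation amounts s[0]..s[3]
SCHEDULE = ((5, True), (6, True), (5, False))   # (mixing rounds, mash afterwards?)


def rol16(w, a):
    return ((w << a) | (w >> (16 - a))) & MASK16


def ror16(w, a):
    return ((w >> a) | (w << (16 - a))) & MASK16


def encrypt_block(block, K):
    R, j = list(block), 0
    for mixing_rounds, mash in SCHEDULE:
        for _ in range(mixing_rounds):
            for i in range(4):                           # Mixing R[0]..R[3]
                a, b, c = R[(i + 3) % 4], R[(i + 2) % 4], R[(i + 1) % 4]
                R[i] = (R[i] + K[j] + (a & b) + (~a & c)) & MASK16
                j += 1
                R[i] = rol16(R[i], S[i])
        if mash:
            for i in range(4):                           # Mashing R[0]..R[3]
                R[i] = (R[i] + K[R[(i + 3) % 4] & 63]) & MASK16
    return R


def decrypt_block(block, K):
    R, j = list(block), 63
    for mixing_rounds, mash in SCHEDULE:
        for _ in range(mixing_rounds):
            for i in (3, 2, 1, 0):                       # R-Mixing R[3]..R[0]
                R[i] = ror16(R[i], S[i])
                a, b, c = R[(i + 3) % 4], R[(i + 2) % 4], R[(i + 1) % 4]
                R[i] = (R[i] - K[j] - (a & b) - (~a & c)) & MASK16
                j -= 1
        if mash:
            for i in (3, 2, 1, 0):                       # R-Mashing R[3]..R[0]
                R[i] = (R[i] - K[R[(i + 3) % 4] & 63]) & MASK16
    return R


# Placeholder key words for the demonstration only.
K = [(i * 40503 + 12345) & MASK16 for i in range(64)]
plain = [0x1234, 0x5678, 0x9ABC, 0xDEF0]
cipher = encrypt_block(plain, K)
print(decrypt_block(cipher, K) == plain)  # True
```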
Triple DES (3DES)
BLOCK CIPHER WITH SYMMETRIC SECRET KEY

o Block length = 64 bits


o Key length = 56, 112, or 168 bits
The 3DES cipher is a quite popular symmetric block cipher based on the DES cipher. It was
presented in 1998 and described in the ANSI X9.52 standard. It is also called the Triple Data
Encryption Algorithm (TDEA).

Usage
The 3DES cipher was developed because DES encryption, invented in the early 1970s and protected
by a 56-bit key, turned out to be too weak and easy to break using the computers of the time.
The effective security which 3DES provides is 112 bits, because of meet-in-the-middle
attacks.
For several years, Triple DES was often used for electronic payments (for example, in the EMV
standard). New protocols based on the cipher were still being created and maintained as of 2016.
It was also used in several Microsoft products (for example, in Microsoft Outlook 2007, Microsoft
OneNote, Microsoft System Center Configuration Manager 2012) for protecting user configuration
and user data.

Algorithm
Triple DES algorithm performs three iterations of a typical DES algorithm. In its strongest version,
it uses a secret key which consists of 168 bits. The key is then divided into three 56-bit keys.
3DES Encryption
1. Encryption using the first secret key
2. Decryption using the second secret key
3. Encryption using the third secret key
The encryption and decryption operations may be presented as mathematical equations.

Encryption:
c = E3(D2(E1(m)))
Decryption:
m = D1(E2(D3(c)))
3DES with shorter keys
Using DES decryption operation in the second step of 3DES encryption provides backward
compatibility with the original DES algorithm. In this case, the first and second secret keys, or the
second and third secret keys should be identical, and their value is not important.

c = E3(D1(E1(m))) = E3(m)
c = E3(D3(E1(m))) = E1(m)
It is also possible to use the 3DES cipher with a secret key of size of 112 bits. In this case, the
first and third secret keys should be identical. Such an approach is stronger than simple DES
encryption used twice (with two separate 56-bit keys) because it provides better protection
against meet-in-the-middle attacks.
c = E1(D2(E1(m)))
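
The keying options can be illustrated with a sketch of the encrypt-decrypt-encrypt composition. To stay short, the example does not implement DES: toy_encrypt and toy_decrypt are a deliberately trivial, insecure stand-in for single DES, assumed here only so that the identities can be checked.

```python
MASK64 = (1 << 64) - 1


def toy_encrypt(key, block):
    """Stand-in for single DES (NOT DES, not secure): add key, rotate left 13."""
    x = (block + key) & MASK64
    return ((x << 13) | (x >> 51)) & MASK64


def toy_decrypt(key, block):
    """Inverse of toy_encrypt: rotate right 13, subtract key."""
    x = ((block >> 13) | (block << 51)) & MASK64
    return (x - key) & MASK64


def ede_encrypt(k1, k2, k3, m):
    # c = E3(D2(E1(m)))
    return toy_encrypt(k3, toy_decrypt(k2, toy_encrypt(k1, m)))


def ede_decrypt(k1, k2, k3, c):
    # m = D1(E2(D3(c)))
    return toy_decrypt(k1, toy_encrypt(k2, toy_decrypt(k3, c)))


k1, k2, k3 = 0x0123456789ABCDEF, 0x1122334455667788, 0xCAFEBABEDEADBEEF
m = 0x0011223344556677

print(ede_decrypt(k1, k2, k3, ede_encrypt(k1, k2, k3, m)) == m)   # True
print(ede_encrypt(k1, k1, k3, m) == toy_encrypt(k3, m))           # True
print(ede_encrypt(k1, k3, k3, m) == toy_encrypt(k1, m))           # True
```

The last two lines show the backward-compatible modes: when two neighbouring keys are equal, the corresponding encryption and decryption cancel out and only a single encryption remains.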
Block Diagram of 3DES Encryption

Block Diagram of 3DES Decryption


Maths:
Transformations in 3DES
3DES uses exactly the same operations for encryption and decryption as the DES algorithm.

Each iteration of DES algorithm executes the following operations for all input data blocks:
the initial permutation, 16 iterations of Feistel functions, and the final permutation.

During key manipulation, the following operations are executed: binary rotation, PC-1
permutation, and PC-2 permutation.

For more details, please visit the description of DES encryption.


AES (Advanced Encryption
Standard)
BLOCK CIPHER WITH SYMMETRIC SECRET KEY

o Block length = 128 bits


o Key length = 128 or 192 or 256 bits
AES is a modern symmetric block cipher, one of the most popular ciphers in the world. It was
developed in the late 1990s by Vincent Rijmen and Joan Daemen under the name Rijndael, and later
approved as a federal encryption standard in the United States (FIPS 197, published in 2001 and
effective from 2002).

Usage
AES is considered a strong and secure cipher. Over the years (mostly 2005-2010), several
attacks against particular AES implementations were described, but generally speaking they
concern special cases and are not considered to be a threat to the AES algorithm itself.

Algorithm
A secret key in AES, for both data encryption and decryption, may contain 128 or 192 or 256 bits.
Based on the length of the key, a different number of encrypting cycles is performed.

Encryption
During encryption, the input data (plaintext) is divided into 128-bit blocks. The blocks of data are
presented as column-major matrices of size 4 bytes × 4 bytes, called states. The following
operations are performed for all blocks:
1. Preparing Subkeys: one starting subkey is created first, and later one more subkey for
every subsequent cycle of encryption (see below).
2. Initial Round: all bytes of data block are added to corresponding bytes of the starting
subkey using XOR operation.
3. A number of encryption cycles take place. The number of repetitions depends on the
length of the secret key (the Final Round described below is counted separately):
- 9 cycles of repetition for a 128-bit key,
- 11 cycles of repetition for a 192-bit key,
- 13 cycles of repetition for a 256-bit key.

The following operations are performed during each encryption round:


1. Each byte of the state matrix is replaced with another byte, based on a lookup
table, called Rijndael's S-Box. The operation is called the SB (Substitute Bytes)
Operation. The construction of the lookup table guarantees that this substitution
is non-linear.
2. The bytes stored in the last three rows of the state matrix are shifted to the left.
Note that the bytes in the first row are not shifted at all. The bytes in the second
row are shifted by one position, in the third row by two positions, and in
the fourth row by three positions to the left. The leftmost bytes in each
row move to the right side of the same row. This step is called the SR (Shift Rows)
Operation.
3. MC (Mix Columns) Operation (the multiplication of columns): each column is
multiplied by a constant matrix of size 4 bytes × 4 bytes.
4. AR (Add Round Key) Operation: all state bytes are XORed with the corresponding subkey
bytes. A new subkey is used for every encryption round. Subkeys, like states, are
16 bytes long.
4. Final Round: the same operations are performed as in normal encryption rounds
(described above), besides the multiplication of columns, which in the Final Round is
omitted.
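
As an illustration of the Shift Rows step on a column-major state, consider the sketch below. The flat 16-element list with index = row + 4·column is an assumed representation chosen for this example, and the function names are illustrative.

```python
def shift_rows(state):
    """Shift row r of a column-major 4x4 state left by r positions."""
    out = [0] * 16
    for row in range(4):
        for col in range(4):
            out[row + 4 * col] = state[row + 4 * ((col + row) % 4)]
    return out


def inv_shift_rows(state):
    """Shift row r right by r positions (the inverse of shift_rows)."""
    out = [0] * 16
    for row in range(4):
        for col in range(4):
            out[row + 4 * ((col + row) % 4)] = state[row + 4 * col]
    return out


state = list(range(16))   # byte i sits in row i % 4, column i // 4
print(shift_rows(state))
# [0, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12, 1, 6, 11]
```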
AES Key Expansion
AES uses a secret symmetric key, which contains 128, 192, or 256 bits (that is 16, 24, or 32 bytes
respectively). In order to encrypt all data blocks, the key must be expanded. The new bytes are
appended to the original bytes of the key:

o A 128-bit key (16 bytes) is expanded to 176 bytes.


o A 192-bit key (24 bytes) is expanded to 208 bytes.
o A 256-bit key (32 bytes) is expanded to 240 bytes.
The first bytes of the expanded key are all bytes of the original secret key. In order to create
succeeding bytes of the expanded key, the following steps must be performed, with iterations
numbered from 1. Steps below should be repeated until receiving a desirable number of bytes.
To simplify the notation, the length (in bytes) of the original secret key (before expansion) will be
denoted as n.
1. Creating the next 4 bytes of the key:
   1. Copying the last 4 bytes of the current key to a temporary 4-byte vector.
   2. Rotating those four bytes to the left by one position. The leftmost byte
      moves to the rightmost position.
   3. Replacing each byte in the vector by another one, based on Rijndael's S-Box.
   4. Rcon Operation: XORing the leftmost byte in the vector with the number 2 raised
      to the power (number of the current iteration - 1), computed in Rijndael's finite field.
   5. XORing the resulting 4-byte vector with the 4-byte block starting n bytes before
      the current end of the expanded key and appending the result to the end of the
      expanded key. At this point, four new key bytes have been created.
2. Creating the next 12 bytes of the key by performing the following steps three times:
   1. Copying the last 4 bytes of the current key to a temporary 4-byte vector.
   2. XORing the 4-byte vector with the 4-byte block starting n bytes before the
      current end of the expanded key and appending the result to the end of the
      expanded key.
3. If the original key is 256 bits long, the following steps should be performed once in order
   to create 4 new key bytes:
   1. Copying the last 4 bytes of the current key to a temporary 4-byte vector.
   2. Replacing each byte in the vector by another one, based on Rijndael's S-Box.
   3. XORing the 4-byte vector with the 4-byte block starting n bytes before the
      current end of the expanded key and appending the result to the end of the
      expanded key.
4. If the original key is 128 bits long, the following steps should be omitted. If the original
   key is 192 bits long, the following steps should be performed twice. If the original key
   is 256 bits long, the following steps should be repeated three times:
   1. Copying the last 4 bytes of the current key to a temporary 4-byte vector.
   2. XORing the 4-byte vector with the 4-byte block starting n bytes before the
      current end of the expanded key and appending the result to the end of the
      expanded key.
5. Increasing the number of iterations by 1.
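
The procedure can be sketched compactly by generating four bytes at a time and deciding, from the current position, which of the cases above applies. This is one possible arrangement, not the only one. To keep the example self-contained, the S-Box is computed (multiplicative inverse in Rijndael's finite field followed by the standard affine transformation) instead of being copied from the table; the names gmul, build_sbox and expand_key are illustrative.

```python
def gmul(a, b):
    """Multiply two bytes in Rijndael's finite field GF(2^8)."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return product


def build_sbox():
    """Compute Rijndael's S-Box: field inverse, then affine transformation."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()
EXPANDED_SIZE = {16: 176, 24: 208, 32: 240}


def expand_key(key):
    n = len(key)                       # original key length in bytes
    out = bytearray(key)
    rcon = 1                           # 2 ** (iteration - 1) in GF(2^8)
    for i in range(n, EXPANDED_SIZE[n], 4):
        t = list(out[i - 4:i])         # last 4 bytes of the current key
        if i % n == 0:                 # step 1: rotate, substitute, Rcon
            t = t[1:] + t[:1]
            t = [SBOX[b] for b in t]
            t[0] ^= rcon
            rcon = gmul(rcon, 2)
        elif n == 32 and i % n == 16:  # step 3: extra substitution (256-bit keys)
            t = [SBOX[b] for b in t]
        # XOR with the block starting n bytes before the current end.
        out += bytes(a ^ b for a, b in zip(t, out[i - n:i - n + 4]))
    return bytes(out)


key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")   # example key from FIPS 197
expanded = expand_key(key)
print(len(expanded), expanded[16:20].hex())  # 176 a0fafe17
```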

Decryption
During decryption, the encrypted text is used as the input to the algorithm. The inverse
operations, corresponding to those used during encryption, are performed:

1. Inverse bytes substitution (ISB).


2. Bytes shifting to the right (ISR).
3. Adding XOR to a subkey (IAR).
4. Inverse multiplication of columns (IMC).
Subkeys for each iteration should be taken in the reverse order compared with encryption.

AES Performance
In order to accelerate the application, one can decide to pre-compute the functions in different
rounds and replace them by simple byte substitution based on the calculated tables.

The disadvantage of this approach is that the size of the application will be much larger. It may
increase from several to tens of kilobytes, depending on the size of the secret key that is used.

Block Diagram of AES Encryption:


Block Diagram of AES Key Expansion:
Maths:
In all AES operations presented below, the bytes are written in a hexadecimal notation. Each
character represents four bits.

Substitution in Rijndael S-Box


In the Rijndael S-Box, every input byte is replaced by another byte. The values in the S-Box were
chosen in a way that makes this transformation highly non-linear. Thanks to that, the whole
AES encryption is non-linear.

The byte substitutions are presented in the table below. The rows are labelled with the more
significant half of the input byte, and the columns with the less significant half of the input
byte. The value of the output byte can be found inside the table, at the intersection of
the specified row and column.

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF

0x 63 7c 77 7b f2 6b 6f c5 30 01 67 2b fe d7 ab 76

1x ca 82 c9 7d fa 59 47 f0 ad d4 a2 af 9c a4 72 c0

2x b7 fd 93 26 36 3f f7 cc 34 a5 e5 f1 71 d8 31 15

3x 04 c7 23 c3 18 96 05 9a 07 12 80 e2 eb 27 b2 75

4x 09 83 2c 1a 1b 6e 5a a0 52 3b d6 b3 29 e3 2f 84

5x 53 d1 00 ed 20 fc b1 5b 6a cb be 39 4a 4c 58 cf

6x d0 ef aa fb 43 4d 33 85 45 f9 02 7f 50 3c 9f a8

7x 51 a3 40 8f 92 9d 38 f5 bc b6 da 21 10 ff f3 d2

8x cd 0c 13 ec 5f 97 44 17 c4 a7 7e 3d 64 5d 19 73

9x 60 81 4f dc 22 2a 90 88 46 ee b8 14 de 5e 0b db

Ax e0 32 3a 0a 49 06 24 5c c2 d3 ac 62 91 95 e4 79

Bx e7 c8 37 6d 8d d5 4e a9 6c 56 f4 ea 65 7a ae 08

Cx ba 78 25 2e 1c a6 b4 c6 e8 dd 74 1f 4b bd 8b 8a
Dx 70 3e b5 66 48 03 f6 0e 61 35 57 b9 86 c1 1d 9e

Ex e1 f8 98 11 69 d9 8e 94 9b 1e 87 e9 ce 55 28 df

Fx 8c a1 89 0d bf e6 42 68 41 99 2d 0f b0 54 bb 16

Rijndael S-Box

For example, for an input byte 3F, the new output byte is 75.
For decryption, the Inverse Rijndael S-Boxes are used. They can be obtained from the original
Rijndael S-Boxes.

x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF

0x 52 09 6a d5 30 36 a5 38 bf 40 a3 9e 81 f3 d7 fb

1x 7c e3 39 82 9b 2f ff 87 34 8e 43 44 c4 de e9 cb

2x 54 7b 94 32 a6 c2 23 3d ee 4c 95 0b 42 fa c3 4e

3x 08 2e a1 66 28 d9 24 b2 76 5b a2 49 6d 8b d1 25

4x 72 f8 f6 64 86 68 98 16 d4 a4 5c cc 5d 65 b6 92

5x 6c 70 48 50 fd ed b9 da 5e 15 46 57 a7 8d 9d 84

6x 90 d8 ab 00 8c bc d3 0a f7 e4 58 05 b8 b3 45 06

7x d0 2c 1e 8f ca 3f 0f 02 c1 af bd 03 01 13 8a 6b

8x 3a 91 11 41 4f 67 dc ea 97 f2 cf ce f0 b4 e6 73

9x 96 ac 74 22 e7 ad 35 85 e2 f9 37 e8 1c 75 df 6e

Ax 47 f1 1a 71 1d 29 c5 89 6f b7 62 0e aa 18 be 1b

Bx fc 56 3e 4b c6 d2 79 20 9a db c0 fe 78 cd 5a f4

Cx 1f dd a8 33 88 07 c7 31 b1 12 10 59 27 80 ec 5f

Dx 60 51 7f a9 19 b5 4a 0d 2d e5 7a 9f 93 c9 9c ef
Ex a0 e0 3b 4d ae 2a f5 b0 c8 eb bb 3c 83 53 99 61

Fx 17 2b 04 7e ba 77 d6 26 e1 69 14 63 55 21 0c 7d

Inverse Rijndael S-Box

Multiplication of columns
Each column of the state matrix is multiplied by a predefined matrix of size 4 bytes × 4 bytes.
The result of each multiplication is a new column which contains 4 different bytes.

Multiplication of the square matrix by the column c results in a new column r, with new
values. The multiplications and additions are performed in Rijndael's finite field:

r0     2 3 1 1     c0
r1  =  1 2 3 1  ×  c1
r2     1 1 2 3     c2
r3     3 1 1 2     c3

Decryption
During decryption, an inverted matrix is used for multiplication:
e b d 9

9 e b d

d 9 e b

b d 9 e

Inverted decryption matrix
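
The column multiplication can be sketched as follows, with byte multiplication done in Rijndael's finite field and addition replaced by XOR. The function names gmul and mul_column are illustrative; the sample column db 13 53 45 is a commonly quoted test value.

```python
def gmul(a, b):
    """Multiply two bytes in Rijndael's finite field GF(2^8)."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return product


MIX = [
    [2, 3, 1, 1],
    [1, 2, 3, 1],
    [1, 1, 2, 3],
    [3, 1, 1, 2],
]

INV_MIX = [
    [0x0E, 0x0B, 0x0D, 0x09],
    [0x09, 0x0E, 0x0B, 0x0D],
    [0x0D, 0x09, 0x0E, 0x0B],
    [0x0B, 0x0D, 0x09, 0x0E],
]


def mul_column(matrix, column):
    """Multiply a 4x4 matrix by a 4-byte column; addition is XOR."""
    result = []
    for row in matrix:
        acc = 0
        for coefficient, byte in zip(row, column):
            acc ^= gmul(coefficient, byte)
        result.append(acc)
    return result


c = [0xDB, 0x13, 0x53, 0x45]
r = mul_column(MIX, c)
print([format(b, "02x") for b in r])   # ['8e', '4d', 'a1', 'bc']
print(mul_column(INV_MIX, r) == c)     # True
```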

Rcon Operation
In each iteration of the key generation process, the first byte of the current 4-byte temporary
vector is XORed with 2 raised to the power one less than the current iteration
number. The exponentiation is performed in Rijndael's finite field.
These values can be calculated at runtime or stored in a table in the application memory:

i 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

xi 01 02 04 08 10 20 40 80 1b 36 6c d8 ab 4d 9a

Powers of x = 0x02
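
The table can be reproduced at runtime with repeated multiplication by 2 in Rijndael's finite field, that is, a left shift followed by a reduction with 0x11b whenever the result overflows one byte. The function names xtime and rcon_values are illustrative.

```python
def xtime(b):
    """Multiply a byte by 2 in Rijndael's finite field."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B      # reduce modulo x^8 + x^4 + x^3 + x + 1
    return b


def rcon_values(count):
    """Return the first `count` powers of 2, starting with 2^0."""
    values = []
    value = 1
    for _ in range(count):
        values.append(value)
        value = xtime(value)
    return values


print(" ".join(format(v, "02x") for v in rcon_values(15)))
# 01 02 04 08 10 20 40 80 1b 36 6c d8 ab 4d 9a
```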

Blowfish
BLOCK CIPHER WITH SYMMETRIC SECRET KEY

o Block length = 64 bits


o Key length: from 32 bits up to 448 bits
Designed in 1993 by Bruce Schneier, an American cryptographer.

Usage
It is used in plenty of encryption products and provides good encryption quality. No effective
cryptanalysis of this cipher is known.

Camellia
BLOCK CIPHER WITH SYMMETRIC SECRET KEY

o Block length = 128 bits


o Key length = 128, 192 or 256 bits
The cipher was developed in Japan by Mitsubishi and NTT companies in 2000. It was later
approved by the International Organization for Standardization (ISO), the European Union's
NESSIE project and the Japanese CRYPTREC project.
Usage
Camellia was designed to be efficient for both software and hardware implementations and it is
used in various devices from low-cost smart cards to high-speed network protocols.

Algorithm
Camellia is a symmetric block cipher with secret key of size of 128, 192 or 256 bits. The length
of plaintext and ciphertext blocks is always equal to 128 bits.

In the following description, the original names of variables and functions from the Camellia
documentation are used to describe its algorithm.

The most important elements of the algorithm are F-functions. They are used during encryption,
decryption and creation of the helper variables of the key. The F-function takes 64 input bits, mixes
them with the bits of a 64-bit subkey ki and returns 64 new bits. Modification of bits in the
F-function is referred to as one round of the algorithm. F-function calls are gathered in blocks.
Each block contains six rounds.
Six-round blocks (that is, blocks of six calls of the F-function) are separated by calls of
FL-functions and FL^-1-functions. They manipulate 64-bit parts of data and mix them with
subkeys kli.
Both the encryption and decryption algorithms perform several repetitions of the 6-round
blocks described above. The number of repetitions depends on the length of the currently used
secret key:

o 3 repetitions of the 6-round blocks - for the 128-bit secret key,


o 4 repetitions of the 6-round blocks - for the 192-bit or 256-bit secret keys.
Furthermore, at the beginning and at the end of both encryption and decryption algorithms,
additional operations are performed: data bits are added to bits of subkeys kwi.
Subkeys, which are used to encrypt each data block (or to decrypt each ciphertext block), are
created in a separate process. Tens of subkeys are calculated from the secret key.
They are used for various operations in the main algorithm.

Key schedule
The secret key used in the Camellia cipher can consist of 128, 192 or 256 bits. In order to encrypt
data blocks, one has to create a few helper variables and then subkeys, based on the secret key
bits. Each subkey is 64 bits long.

At first, one should calculate two variables of size of 128 bits (KL and KR) and four variables of size
of 64 bits (KLL, KLR, KRL and KRR). The following equations describe connections between those
variables:
o KLL = 64 left bits of KL
o KLR = 64 right bits of KL
o KRL = 64 left bits of KR
o KRR = 64 right bits of KR
The remaining relations depend on the length of the secret key K:
o for the 128-bit long key:
    o KL = K
    o KR = 0
o for the 192-bit long key:
    o KL = 128 left bits of K
    o KRL = 64 right bits of K
    o KRR = ~KRL (negation of bits)
o for the 256-bit long key:
    o KL = 128 left bits of K
    o KR = 128 right bits of K
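
The derivation of KL and KR from the secret key can be sketched as follows. The sketch assumes that "left" bits are the most significant ones when the key bytes are read in big-endian order; the function name split_key is illustrative.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def split_key(key):
    """Return (KL, KR) as 128-bit integers for a 16, 24 or 32-byte key."""
    k = int.from_bytes(key, "big")
    bits = 8 * len(key)
    if bits == 128:
        kl, kr = k, 0
    elif bits == 192:
        kl = k >> 64
        krl = k & MASK64              # 64 right bits of K
        krr = ~krl & MASK64           # bitwise negation of KRL
        kr = (krl << 64) | krr
    elif bits == 256:
        kl = k >> 128
        kr = k & MASK128
    else:
        raise ValueError("the key must be 128, 192 or 256 bits long")
    return kl, kr


kl, kr = split_key(bytes(range(24)))
kll, klr = kl >> 64, kl & MASK64      # 64 left and right bits of KL
krl, krr = kr >> 64, kr & MASK64      # 64 left and right bits of KR
print(hex(krl), hex(krr))
```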
Then, it is possible to calculate two new helper variables, KA and KB, based on the previous ones.
They are both 128 bits long. KB is used only if the secret key consists of 192 or 256 bits.
While creating KA and KB, one should use six constant values, which are referred to as Σi.
Finally, based on the four 128-bit variables KL, KR, KA and KB, one should compute
all secret subkeys of size 64 bits: ki, kwi and kli. Subkeys are used in all the steps during
encryption and decryption in the Camellia algorithm.

Block Diagram of Camellia Encryption for 128-bit Key
Block Diagram of Camellia Encryption for 192 or 256-bit Key
Block Diagram of Camellia Decryption for 128-bit Key
Block Diagram of Camellia Decryption for 192 or 256-bit Key
Block Diagram of Camellia 6-round Block
Block Diagram of Camellia - Creating Helper Variables of Key
Maths:
Camellia uses a few basic bitwise operations: bitwise AND, bitwise OR, exclusive OR (XOR), logical
negation on all bits and left circular rotation of the operand by n bits: <<< n (leftmost bits move
to rightmost positions of the variable).
Beyond those operations, several more complex functions are defined. They are used during encryption and decryption, and while creating the subkeys.

S-function
The S-function is used inside the F-function. The 64 bits of input data are substituted by 8 other bytes, which are returned for further processing.
The function uses four substitution tables. They are referred to as s-boxes.
The input data is divided into eight separate bytes x1,...,x8. x1 contains the eight leftmost bits, x2 contains the next eight bits and x8 contains the eight rightmost bits.
Each of the s-boxes changes the eight received bits into eight other bits indicated by the table.
The four s-boxes are referred to as s1,...,s4. If y1,...,y8 are the eight subsequent output bytes (the byte y1 contains the leftmost output bits, y8 contains the rightmost output bits), the modifications performed by the S-function can be defined as:
y1 = s1(x1)
y2 = s2(x2)
y3 = s3(x3)
y4 = s4(x4)
y5 = s2(x5)
y6 = s3(x6)
y7 = s4(x7)
y8 = s1(x8)
s-box s1 contains 256 following numbers:
112 130 44 236 179 39 192 229 228 133 87 53 234 12 174 65

35 239 107 147 69 25 165 33 237 14 79 78 29 101 146 189

134 184 175 143 124 235 31 206 62 48 220 95 94 197 11 26

166 225 57 202 213 71 93 61 217 1 90 214 81 86 108 77

139 13 154 102 251 204 176 45 116 18 43 32 240 177 132 153

223 76 203 194 52 126 118 5 109 183 169 49 209 23 4 215

20 88 58 97 222 27 17 28 50 15 156 22 83 24 242 34

254 68 207 178 195 181 122 145 36 8 232 168 96 252 105 80

170 208 160 125 161 137 98 151 84 91 30 149 224 255 100 210
16 196 0 72 163 247 117 219 138 3 230 218 9 63 221 148

135 92 131 2 205 74 144 51 115 103 246 243 157 127 191 226

82 155 216 38 200 55 198 59 129 150 111 75 19 190 99 46

233 121 167 140 159 110 188 142 41 245 249 182 47 253 180 89

120 152 6 106 231 70 113 186 212 37 171 66 136 162 141 250

114 7 185 85 248 238 172 10 54 73 42 104 60 56 241 164

64 40 211 123 187 201 67 193 21 227 173 244 119 199 128 158

The s-boxes return values determined by the one-byte input number. All cells in the table are numbered from 0 to 255, from left to right and from top to bottom. For example, s1[0] is equal to 112, s1[1] is equal to 130, and s1[255] is equal to 158.
The remaining three s-boxes can also be defined as tables of 256 numbers. Alternatively, their substitutions can be defined as operations on s1, with the rotations applied to 8-bit values:
s2(x) := (s1(x) <<< 1)
s3(x) := (s1(x) >>> 1)
s4(x) := s1(x<<<1)
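A minimal Python sketch of the four s-boxes and of the S-function, assuming that >>> denotes a right circular rotation on 8 bits. The table S1 repeats the values listed above; the helper names rotl8, rotr8 and s_function are chosen here for illustration only.

```python
S1 = [
    112, 130, 44, 236, 179, 39, 192, 229, 228, 133, 87, 53, 234, 12, 174, 65,
    35, 239, 107, 147, 69, 25, 165, 33, 237, 14, 79, 78, 29, 101, 146, 189,
    134, 184, 175, 143, 124, 235, 31, 206, 62, 48, 220, 95, 94, 197, 11, 26,
    166, 225, 57, 202, 213, 71, 93, 61, 217, 1, 90, 214, 81, 86, 108, 77,
    139, 13, 154, 102, 251, 204, 176, 45, 116, 18, 43, 32, 240, 177, 132, 153,
    223, 76, 203, 194, 52, 126, 118, 5, 109, 183, 169, 49, 209, 23, 4, 215,
    20, 88, 58, 97, 222, 27, 17, 28, 50, 15, 156, 22, 83, 24, 242, 34,
    254, 68, 207, 178, 195, 181, 122, 145, 36, 8, 232, 168, 96, 252, 105, 80,
    170, 208, 160, 125, 161, 137, 98, 151, 84, 91, 30, 149, 224, 255, 100, 210,
    16, 196, 0, 72, 163, 247, 117, 219, 138, 3, 230, 218, 9, 63, 221, 148,
    135, 92, 131, 2, 205, 74, 144, 51, 115, 103, 246, 243, 157, 127, 191, 226,
    82, 155, 216, 38, 200, 55, 198, 59, 129, 150, 111, 75, 19, 190, 99, 46,
    233, 121, 167, 140, 159, 110, 188, 142, 41, 245, 249, 182, 47, 253, 180, 89,
    120, 152, 6, 106, 231, 70, 113, 186, 212, 37, 171, 66, 136, 162, 141, 250,
    114, 7, 185, 85, 248, 238, 172, 10, 54, 73, 42, 104, 60, 56, 241, 164,
    64, 40, 211, 123, 187, 201, 67, 193, 21, 227, 173, 244, 119, 199, 128, 158,
]


def rotl8(x, n):
    """Rotate an 8-bit value left by n bits (0 < n < 8)."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def rotr8(x, n):
    """Rotate an 8-bit value right by n bits (0 < n < 8)."""
    return ((x >> n) | (x << (8 - n))) & 0xFF


def s1(x):
    return S1[x]


def s2(x):
    return rotl8(S1[x], 1)      # s1(x) <<< 1


def s3(x):
    return rotr8(S1[x], 1)      # s1(x) >>> 1


def s4(x):
    return S1[rotl8(x, 1)]      # s1(x <<< 1)


def s_function(block):
    """Apply s1, s2, s3, s4, s2, s3, s4, s1 to the eight bytes of a 64-bit block."""
    boxes = (s1, s2, s3, s4, s2, s3, s4, s1)
    xs = block.to_bytes(8, "big")
    return int.from_bytes(bytes(box(x) for box, x in zip(boxes, xs)), "big")
```

Deriving s2, s3 and s4 on the fly saves table space; an implementation that favours speed would more likely precompute all four tables.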
P-function
P-function is used inside the F-function. P-function takes input data of size of 8 bytes (which are
output bytes of the S-function), modifies them and returns the vector, which is also 8-byte long.
Output data of the P-function is also output data of the F-function.
The input data is divided into eight separate bytes x1,...,x8. The byte x1 contains leftmost eight
bits, x2 next eight bits, and x8 contains rightmost eight bits.
If y1,...,y8 are the eight subsequent output bytes (y1 contains eight leftmost bits, y2 next eight
bits, and y8 contains rightmost eight output bits), modifications performed by the P-function can
be defined as:
y1 = x1 XOR x3 XOR x4 XOR x6 XOR x7 XOR x8
y2 = x1 XOR x2 XOR x4 XOR x5 XOR x7 XOR x8
y3 = x1 XOR x2 XOR x3 XOR x5 XOR x6 XOR x8
y4 = x2 XOR x3 XOR x4 XOR x5 XOR x6 XOR x7
y5 = x1 XOR x2 XOR x6 XOR x7 XOR x8
y6 = x2 XOR x3 XOR x5 XOR x7 XOR x8
y7 = x3 XOR x4 XOR x5 XOR x6 XOR x8
y8 = x1 XOR x4 XOR x5 XOR x6 XOR x7
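The eight equations above translate directly into Python. The sketch below assumes the 64-bit block is passed as an integer with x1 as the most significant byte; the name p_function is illustrative.

```python
def p_function(block):
    """Mix the eight bytes of a 64-bit block with the XOR pattern of the P-function."""
    x1, x2, x3, x4, x5, x6, x7, x8 = block.to_bytes(8, "big")
    y1 = x1 ^ x3 ^ x4 ^ x6 ^ x7 ^ x8
    y2 = x1 ^ x2 ^ x4 ^ x5 ^ x7 ^ x8
    y3 = x1 ^ x2 ^ x3 ^ x5 ^ x6 ^ x8
    y4 = x2 ^ x3 ^ x4 ^ x5 ^ x6 ^ x7
    y5 = x1 ^ x2 ^ x6 ^ x7 ^ x8
    y6 = x2 ^ x3 ^ x5 ^ x7 ^ x8
    y7 = x3 ^ x4 ^ x5 ^ x6 ^ x8
    y8 = x1 ^ x4 ^ x5 ^ x6 ^ x7
    return int.from_bytes(bytes([y1, y2, y3, y4, y5, y6, y7, y8]), "big")
```

Because the function consists only of XORs, it is linear: p_function(a ^ b) equals p_function(a) ^ p_function(b) for any two blocks.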
F-function
The F-function is one of the main functions. It is used during encryption and decryption, and while creating the subkeys. The 64-bit input data X is mixed with one of the subkeys k (also 64 bits long). The function returns a 64-bit output block Y.
The data bits are XORed with the key bits and the result is modified by the two functions S and P:
Y = F(X, k) = P(S(X XOR k))
FL-function
FL-function is used during both encryption and decryption processes. It takes 64-bit long input
data and one of the subkeys, then it performs some modifications and finally it returns a block of
data, which contains also 64 bits.
An input data block is referred to as X, while Y is a 64-bit long output block. kl is one of
the subkeys created before:
(X, kl) -> Y => (XL || XR, klL || klR) -> YL || YR
At the beginning X is divided into two 32-bit long parts: XL contains 32 left bits of X,
and XR contains 32 right bits of X. Then, two new blocks (each 32-bit long) are calculated:
YR = ((XL AND klL) <<< 1) XOR XR
YL = (YR OR klR) XOR XL
YL contains 32 left output bits of the FL-function, and YR contains 32 right output bits.
FL-1-function
FL-1-function is used during both encryption and decryption processes. It takes 64-bit long input
data and one of the subkeys, then it performs some modifications and finally it returns a block of
data, which contains also 64 bits.
Here the input data block is referred to as Y, while X is the 64-bit long output block. kl is one of
the subkeys created before:
(Y, kl) -> X => (YL || YR, klL || klR) -> XL || XR
At the beginning Y is divided into two 32-bit long parts: YL contains 32 left bits of Y,
and YR contains 32 right bits of Y. Then, two new blocks (each 32-bit long) are calculated:
XL = (YR OR klR) XOR YL
XR = ((XL AND klL) <<< 1) XOR YR
XL contains 32 left output bits of the FL-1-function, and XR contains 32 right output bits.
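The two functions can be sketched in Python as shown below, which also makes it easy to check that FL-1 undoes FL for the same subkey. The names fl, fl_inv and rotl32 are illustrative, and blocks are represented as 64-bit integers.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits (0 < n < 32)."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def fl(x, kl):
    """FL-function: 64-bit input X and 64-bit subkey kl, 64-bit output Y."""
    xl, xr = x >> 32, x & MASK32
    kll, klr = kl >> 32, kl & MASK32
    yr = rotl32(xl & kll, 1) ^ xr
    yl = (yr | klr) ^ xl
    return (yl << 32) | yr


def fl_inv(y, kl):
    """FL-1-function: recovers X from Y = fl(X, kl)."""
    yl, yr = y >> 32, y & MASK32
    kll, klr = kl >> 32, kl & MASK32
    xl = (yr | klr) ^ yl
    xr = rotl32(xl & kll, 1) ^ yr
    return (xl << 32) | xr
```

For any 64-bit x and kl, fl_inv(fl(x, kl), kl) returns x again.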
The key schedule constants
While creating the subkeys for encryption and decryption, one uses six predefined constants, commonly referred to as ∑i.
The values below are 64 bits long and are presented in hexadecimal.

∑1 =0xA09E667F3BCC908B

∑2 =0xB67AE8584CAA73B2

∑3 =0xC6EF372FE94F82BE

∑4 =0x54FF53A5F1D36F1C

∑5 =0x10E527FADE682D1D

∑6 =0xB05688C2B3E6C1FD

Creating Subkeys
Using four 128-bit long variables KL, KR, KA and KB one should calculate
subkeys ki, kwi and kli (all subkeys have 64 bits).
The table for creating subkeys for the secret key of size of 128 bits:

subkey   received value                   where used
kw1      64 left bits of KL               at the beginning
kw2      64 right bits of KL              at the beginning
k1       64 left bits of KA               F (round 1)
k2       64 right bits of KA              F (round 2)
k3       64 left bits of (KL <<< 15)      F (round 3)
k4       64 right bits of (KL <<< 15)     F (round 4)
k5       64 left bits of (KA <<< 15)      F (round 5)
k6       64 right bits of (KA <<< 15)     F (round 6)
kl1      64 left bits of (KA <<< 30)      FL
kl2      64 right bits of (KA <<< 30)     FL-1
k7       64 left bits of (KL <<< 45)      F (round 7)
k8       64 right bits of (KL <<< 45)     F (round 8)
k9       64 left bits of (KA <<< 45)      F (round 9)
k10      64 right bits of (KL <<< 60)     F (round 10)
k11      64 left bits of (KA <<< 60)      F (round 11)
k12      64 right bits of (KA <<< 60)     F (round 12)
kl3      64 left bits of (KL <<< 77)      FL
kl4      64 right bits of (KL <<< 77)     FL-1
k13      64 left bits of (KL <<< 94)      F (round 13)
k14      64 right bits of (KL <<< 94)     F (round 14)
k15      64 left bits of (KA <<< 94)      F (round 15)
k16      64 right bits of (KA <<< 94)     F (round 16)
k17      64 left bits of (KL <<< 111)     F (round 17)
k18      64 right bits of (KL <<< 111)    F (round 18)
kw3      64 left bits of (KA <<< 111)     at the end
kw4      64 right bits of (KA <<< 111)    at the end

The table for creating subkeys for the secret keys of size of 192 bits and 256 bits:

subkey   received value                   where used
kw1      64 left bits of KL               at the beginning
kw2      64 right bits of KL              at the beginning
k1       64 left bits of KB               F (round 1)
k2       64 right bits of KB              F (round 2)
k3       64 left bits of (KR <<< 15)      F (round 3)
k4       64 right bits of (KR <<< 15)     F (round 4)
k5       64 left bits of (KA <<< 15)      F (round 5)
k6       64 right bits of (KA <<< 15)     F (round 6)
kl1      64 left bits of (KR <<< 30)      FL
kl2      64 right bits of (KR <<< 30)     FL-1
k7       64 left bits of (KB <<< 30)      F (round 7)
k8       64 right bits of (KB <<< 30)     F (round 8)
k9       64 left bits of (KL <<< 45)      F (round 9)
k10      64 right bits of (KL <<< 45)     F (round 10)
k11      64 left bits of (KA <<< 45)      F (round 11)
k12      64 right bits of (KA <<< 45)     F (round 12)
kl3      64 left bits of (KL <<< 60)      FL
kl4      64 right bits of (KL <<< 60)     FL-1
k13      64 left bits of (KR <<< 60)      F (round 13)
k14      64 right bits of (KR <<< 60)     F (round 14)
k15      64 left bits of (KB <<< 60)      F (round 15)
k16      64 right bits of (KB <<< 60)     F (round 16)
k17      64 left bits of (KL <<< 77)      F (round 17)
k18      64 right bits of (KL <<< 77)     F (round 18)
kl5      64 left bits of (KA <<< 77)      FL
kl6      64 right bits of (KA <<< 77)     FL-1
k19      64 left bits of (KR <<< 94)      F (round 19)
k20      64 right bits of (KR <<< 94)     F (round 20)
k21      64 left bits of (KA <<< 94)      F (round 21)
k22      64 right bits of (KA <<< 94)     F (round 22)
k23      64 left bits of (KL <<< 111)     F (round 23)
k24      64 right bits of (KL <<< 111)    F (round 24)
kw3      64 left bits of (KB <<< 111)     at the end
kw4      64 right bits of (KB <<< 111)    at the end

Implementation
Source code and detailed descriptions of the Camellia cipher can be found on the website of
NTT Corporation:
info.isl.ntt.co.jp/crypt/eng/camellia

Serpent
BLOCK CIPHER WITH SYMMETRIC SECRET KEY

o Block length = 128 bits


o Key length = 128, 192 or 256 bits
Created in 1998 by Ross Anderson, Eli Biham and Lars Knudsen. It was a finalist in the competition to select the AES cipher, where it placed second.

Usage
The Serpent cipher is considered to be stronger but also slower than AES. It has not been patented and is in the public domain. Anyone can use it in their software without any limitations.

Twofish
BLOCK CIPHER WITH SYMMETRIC SECRET KEY

o Block length = 128 bits


o Key length = 128, 192 or 256 bits

Asymmetric Ciphers
Asymmetric ciphers are also referred to as ciphers with public and private keys. They use two
keys, one for encryption of messages and the other one during decryption.

Definition The system of asymmetric encryption consists of three algorithms (G, E, D):

• G( ) - the nondeterministic algorithm which returns a pair of keys (pk, sk),


• E(pk, m) - the nondeterministic algorithm which encrypts plaintext m and returns
ciphertext c,
• D(sk, c) - the deterministic algorithm which decrypts c and returns plaintext m.
All the three algorithms must provide consistency. For each pair of keys (pk, sk) created by G and
for every plaintext message m the following condition must be fulfilled:

• D(sk, E(pk, m)) = m


The public key is widely known and everybody can use it to encrypt any messages. The idea
of asymmetric encryption is that only the owner of the second key (the private key, which is not
known to anybody else), can decrypt the message. Similarly, data encrypted with the private key
can only be decrypted with the corresponding public key.

An intruder can encrypt any messages using the known public key. Asymmetric ciphers are therefore always exposed to chosen-plaintext attacks, and ciphers with public key encryption must provide security against such attacks. After encrypting two messages using the same public key, the intruder must not be able to tell which ciphertext corresponds to which plaintext. Likewise, an observer who analyses two messages encrypted using the same algorithm and the same public key must not be able to distinguish their ciphertexts.
Asymmetric ciphers are much slower than symmetric ciphers (usually about a thousand times slower). It is common practice to use public key encryption only to establish a secure connection and negotiate a new secret key, which is then used to protect further communication with symmetric encryption.

Asymmetric Ciphers:
Merkle's Puzzles
KEY-AGREEMENT PROTOCOL
o without authentication
Merkle's Puzzles protocol allows any two parties to negotiate a shared secret key. The secret key
can be later used to protect their further communication.

Usage
Merkle's Puzzles is one of the first algorithms for a public-key cryptosystem. It was invented
by Ralph Merkle in 1974 and published in 1978.

Algorithm
The Merkle's Puzzles algorithm describes a communication between two parties which allows them to create a shared secret key. A potential eavesdropper cannot feasibly deduce the key. The key may later be used during further communication, protected by symmetric encryption.
The protocol of the exchange of information between two parties (commonly referred to as Alice
and Bob) is presented below. To determine the shared secret key, the following steps should be
performed:

1. One of the parties (Alice) prepares a large number of messages, each consisting of a unique random id and a unique random key.
2. Alice encrypts each message using a weak cipher (for example, a symmetric cipher with a secret 20-bit key) and sends all the encrypted messages to the second party (Bob).
3. Bob randomly chooses one of the received encrypted messages and breaks it using a brute force attack.
4. Bob tells Alice the id value found in the message he selected.
5. At this point, both Alice and Bob possess the shared secret key that was contained in the message chosen and broken by Bob. They can use it in further communication.
The main idea of Merkle's Puzzles is that a large number of messages is encrypted using a cipher weak enough for the receiver to break one of them by brute force. The shared secret key is identified by the message id, which is transmitted without encryption.
An attacker who wants to capture the key can overhear all the messages. He would have to break many of them by brute force, hoping to find the one that Bob chose at random; after decrypting it, he would know the secret key. Therefore, the key used to encrypt the messages must be long enough that decrypting a large number of messages is practically infeasible.
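The exchange can be simulated with a short Python sketch. Everything specific in it is an assumption made for illustration: the weak cipher is a SHA-256 keystream XORed with the message, the weak key has only 12 bits so that the example runs quickly, and the marker MAGIC lets Bob recognise a correct decryption.

```python
import hashlib
import secrets

MAGIC = b"PUZZLE__"   # 8-byte marker that identifies a correctly decrypted puzzle
WEAK_BITS = 12        # deliberately tiny key space (hypothetical choice)


def weak_crypt(weak_key, data):
    """Toy weak cipher: XOR with a keystream derived from the short key."""
    stream = hashlib.sha256(weak_key.to_bytes(4, "big")).digest()
    return bytes(d ^ s for d, s in zip(data, stream))


def make_puzzles(count):
    """Alice: build her id -> key table and the list of encrypted puzzles."""
    table, puzzles = {}, []
    while len(table) < count:
        puzzle_id = secrets.token_bytes(4)
        if puzzle_id in table:
            continue                      # ids must be unique
        key = secrets.token_bytes(16)
        table[puzzle_id] = key
        weak_key = secrets.randbelow(1 << WEAK_BITS)
        puzzles.append(weak_crypt(weak_key, MAGIC + puzzle_id + key))
    return table, puzzles


def solve_puzzle(puzzle):
    """Bob: brute-force the weak key of one puzzle, return (id, key)."""
    for weak_key in range(1 << WEAK_BITS):
        plain = weak_crypt(weak_key, puzzle)
        if plain.startswith(MAGIC):
            return plain[8:12], plain[12:28]
    raise ValueError("puzzle could not be solved")


table, puzzles = make_puzzles(200)
puzzle_id, bob_key = solve_puzzle(secrets.choice(puzzles))
alice_key = table[puzzle_id]              # Alice looks up the id sent by Bob
```

Bob solves a single puzzle, while an eavesdropper who sees only puzzle_id has to solve puzzles until one of them reveals that id.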

Security of Merkle's Puzzles


Assuming that decrypting one message by brute force requires n steps and that there are m messages, establishing the secret key takes Alice and Bob O(n+m) time. An eavesdropping intruder, on the other hand, must perform calculations of complexity O(n*m) to discover the selected key.
When n and m are of similar size, the attacker's work is only quadratic in the work of the legitimate parties, which is typically not considered a sufficient security margin. Making the keys longer (that is, increasing n) makes the protocol inconvenient for its users. At present, there are more efficient and convenient ways to use encryption with public and private keys.
Because the protocol does not provide authentication of the communicating parties, it is
vulnerable to man-in-the-middle attacks. The attacker may establish two sessions of the protocol
(one with Alice and one with Bob) and pretend to be the opposite side to each of them. He creates
two symmetric keys and uses them in the communication with them (still pretending to be
the opposite side to each of them).

Block Diagram of Merkle's Puzzles Protocol

Maths:
Encryption used in Merkle's Puzzles
The many messages sent at the beginning of the protocol must be encrypted by the first party using a symmetric cipher.

To achieve this, one of the common symmetric encryption algorithms is used. The only difference lies in the intentional weakening of the cipher by using shorter keys. Usually, the original random key (say, 128 bits long) is replaced by a key consisting of 98 bits set to 0 followed by 30 random bits of the real key. The receiver must find those 30 random bits by brute force to read the message and obtain the symmetric key.

Diffie–Hellman Protocol
KEY-AGREEMENT PROTOCOL
MESSAGE ENCRYPTION

o without authentication
The Diffie-Hellman Algorithm is one of the most popular key-agreement algorithms. It is used by
many protocols, including SSL/TLS. It is considered to be slightly faster than RSA, which can be
used for the same purpose.
The Diffie-Hellman Protocol may be also used for message encryption using the public key.

Usage
The algorithm was first published by the American cryptographers Whitfield Diffie and Martin Hellman in 1976. However, it was later revealed that the scheme had been discovered even earlier within the British intelligence agency (by James H. Ellis, Clifford Cocks, and Malcolm J. Williamson) but had remained undisclosed.

Algorithm
The Diffie-Hellman algorithm describes a way of communication between two parties which allows them to create a shared secret key. The key may later be used during further communication, protected by symmetric encryption. A potential eavesdropper cannot feasibly deduce the key.
The algorithm relies on the mathematical theory of discrete logarithms in given groups.

The protocol of the exchange of information between two parties (commonly referred to as Alice
and Bob) is presented below. To determine the shared secret key using the Diffie-Hellman
protocol, the following steps should be performed:

1. Alice and Bob agree on a common prime number p and a generating element g.
2. Alice picks a random and secret natural number a and then sends A = g^a mod p to Bob.
3. Bob picks a random and secret natural number b and then sends B = g^b mod p to Alice.
4. Alice computes the number k = B^a mod p.
5. Bob computes the same number k = A^b mod p.
6. At this point, both Alice and Bob possess the secret number k.
The number k computed by both parties is very big: the numbers a and b used in the algorithm have at least 100 digits and the prime number p has at least 300 digits. The value k can be turned into a symmetric key using a hash function. The symmetric key is then used to encrypt further communication.
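The steps of the exchange can be sketched in Python using the built-in three-argument pow. The parameters p = 23 and g = 5 are toy values assumed here for readability and are far too small for real use; SHA-256 is an assumed choice of hash function.

```python
import hashlib
import secrets

p, g = 23, 5   # toy public parameters (real ones have hundreds of digits)


def dh_public(secret):
    """Value sent over the wire: g^secret mod p."""
    return pow(g, secret, p)


def dh_shared(other_public, secret):
    """Shared number k computed from the other party's public value."""
    return pow(other_public, secret, p)


a = secrets.randbelow(p - 2) + 1      # Alice's secret, 1 <= a <= p - 2
b = secrets.randbelow(p - 2) + 1      # Bob's secret
A = dh_public(a)                      # sent to Bob
B = dh_public(b)                      # sent to Alice
k_alice = dh_shared(B, a)
k_bob = dh_shared(A, b)

# Turn the shared number into a symmetric key with a hash function.
symmetric_key = hashlib.sha256(k_alice.to_bytes(2, "big")).digest()
```

With the fixed secrets a = 6 and b = 15, the exchanged values are A = 8 and B = 19, and both parties arrive at k = 2.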

Public key encryption


It is possible to use the Diffie-Hellman protocol for message encryption using public and private keys.

In such a case, Bob can send a message to Alice by encrypting it with Alice's public key, which consists of three numbers: g, p and g^a mod p. To send a secret message to Alice, Bob chooses a random number b and sends the unencrypted number g^b mod p to Alice. The message itself is then sent encrypted with a symmetric key derived from (g^a)^b mod p.
Only Alice, who knows the secret number a, can compute the same value (g^b)^a mod p and decrypt the message. Using a public key that is known to belong to Alice protects the parties from man-in-the-middle attacks.
The algorithm thus allows asymmetric encryption of data. At present, however, RSA is a much more
popular algorithm that allows encryption using public and private keys.
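A rough Python sketch of this use of Diffie-Hellman is given below. The toy parameters, the function names and the way the symmetric key is applied (a repeating SHA-256 digest XORed with the message) are all assumptions made for illustration; a real system would use large parameters and a proper symmetric cipher.

```python
import hashlib
import secrets

p, g = 23, 5   # toy parameters, far too small for real use


def derive_key(shared):
    """Derive a byte string from the shared number (assumed construction)."""
    return hashlib.sha256(str(shared).encode()).digest()


def xor_bytes(data, key):
    """Stand-in symmetric cipher: XOR with the repeated key."""
    return bytes(d ^ key[i % len(key)] for i, d in enumerate(data))


# Alice publishes (g, p, g^a mod p) and keeps a secret.
a = secrets.randbelow(p - 2) + 1
alice_public = (g, p, pow(g, a, p))


def encrypt_for(public, message):
    """Bob: pick a fresh b, return (g^b mod p, ciphertext)."""
    g_, p_, g_a = public
    b = secrets.randbelow(p_ - 2) + 1
    key = derive_key(pow(g_a, b, p_))          # (g^a)^b mod p
    return pow(g_, b, p_), xor_bytes(message, key)


def decrypt(g_b, ciphertext, secret, modulus):
    """Alice: recompute the key as (g^b)^a mod p and decrypt."""
    key = derive_key(pow(g_b, secret, modulus))
    return xor_bytes(ciphertext, key)


g_b, ciphertext = encrypt_for(alice_public, b"hello Alice")
recovered = decrypt(g_b, ciphertext, a, p)
```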

Security of the Diffie-Hellman protocol


The protocol is considered secure if the initial numbers are chosen properly. The numbers have
to be random and very big. The attacker would have to solve the discrete logarithm problem in
a given group (it means that all mathematical operations are followed by reduction modulo
another big number). However, no efficient general algorithm for computing discrete logarithms
on conventional computers is known.
After establishing the symmetric key, both parties destroy the values a and b. This prevents
the initial numbers from being stolen by the intruder.
Because the protocol does not provide authentication of the communicating parties, it is
vulnerable to man-in-the-middle attacks. The attacker may establish two sessions of the protocol
(one with Alice and one with Bob) and pretend to be the opposite side to each of them. He sets
two symmetric keys and uses them in the communication with them (still pretending to be
the opposite side to each of them).

Block Diagram of Diffie-Hellman protocol

Maths:
Powers modulo a prime number
Raising a number to the power modulo a prime number is one of the most important operations
in modern cryptography. It is described by the general laws of numbers manipulation in modulo
arithmetic. Both the base and the exponent are positive integers. The result is also a natural
number.

Raising a number to a power modulo a prime number can be performed as a sequence of squarings, each followed by reduction modulo the prime number; the intermediate results that correspond to the binary digits of the exponent are then multiplied together.

For example, raising 4 to the power of 9 modulo 7 can be calculated in the following way:
4^9 mod 7:
4^2 mod 7 = 16 mod 7 = 2
4^4 mod 7 = 2^2 mod 7 = 4 mod 7 = 4
4^8 mod 7 = 4^2 mod 7 = 16 mod 7 = 2
4^9 mod 7 = (4^1 mod 7 * 4^8 mod 7) mod 7 = 4 * 2 mod 7 = 1
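The procedure used in the example is commonly called square-and-multiply. A minimal Python sketch follows; the name mod_pow is illustrative, and Python's built-in pow(base, exponent, modulus) computes the same result.

```python
def mod_pow(base, exponent, modulus):
    """Compute base^exponent mod modulus by repeated squaring."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                       # this binary digit of the exponent is set
            result = (result * base) % modulus
        base = (base * base) % modulus         # square and reduce
        exponent >>= 1
    return result
```

For the example above, mod_pow(4, 9, 7) multiplies the intermediate values for 4^1 and 4^8 and returns 1.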

RSA
MESSAGE ENCRYPTION OR AUTHENTICATION

o Key length is usually about 1000 to 4000 bits


RSA is one of the most popular algorithms with public key encryption. It can be used either for
encryption of messages or for digital signatures.
Usage
RSA was designed by Ron Rivest, Adi Shamir and Leonard Adleman in 1977.

Algorithm
The RSA algorithm makes it possible to create a pair of keys: a public key and a private key. Everyone can obtain the public key and use it to encrypt a message. Only the owner of the private key is able to decrypt that message.

Similarly, the owner of the private key can encrypt some data with it, thus allowing everyone else to use the corresponding public key to decrypt the data.

RSA security is based on the practical difficulty of factoring the product of two large prime numbers (the so-called factoring problem).

Key Generation
Both RSA keys are generated using the following algorithm:

1. Choose two different prime numbers, usually they are denoted by p and q. The numbers
should be chosen at random and they should be of similar bit-length.
2. Calculate: n = p·q
The number n is used as the modulus for both private and public keys. Its length is
the length of the RSA key.
3. Calculate a value of Euler's totient function for n:
φ(n) = φ(p)·φ(q) = (p − 1)·(q − 1)
4. Choose an integer e that is larger than 1 and smaller than previously computed
value φ(n). The numbers e and φ(n) should be coprime. The number e is used as the
public key exponent.
5. Compute a number d such that: d·e = 1 (mod φ(n))
The number d is used as the private key exponent.
The public key consists of the modulus n and the public exponent e. The private key consists of the modulus n and the private exponent d. The exponent d must be kept secret, together with the three numbers p, q and φ(n), which can be used to compute d. The modulus n itself is public.
Many users can use the same value of e. It should be relatively short, because the time complexity of encryption depends significantly on the number of bits of e. The prime number 2^16+1 (that is, 65537) is often used as the value of e. One can also use much smaller numbers (for example 3), but they are considered less secure in some circumstances.
Each user should have their own number n (computed from two prime numbers of their own).
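The key generation steps can be sketched in Python as follows. The primes 61 and 53 and the exponent 17 are toy values assumed for illustration; pow(e, -1, phi) computes the modular inverse and requires Python 3.8 or later.

```python
from math import gcd


def generate_keys(p, q, e):
    """Return ((n, e), (n, d)) for two distinct primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)                # Euler's totient of n
    if not 1 < e < phi or gcd(e, phi) != 1:
        raise ValueError("e must satisfy 1 < e < phi(n) and be coprime to phi(n)")
    d = pow(e, -1, phi)                    # d * e = 1 (mod phi(n))
    return (n, e), (n, d)


public_key, private_key = generate_keys(61, 53, 17)
```

With these toy values the modulus is n = 3233 and the private exponent is d = 2753. The sketch does not check that p and q are actually prime.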

Encryption
During encryption one uses the public key (n, e). The message is divided into a number of parts, and each part is converted to a number (larger than 0 and smaller than n). In practice, the message is divided into fragments of a fixed number of bits.
Then, every number m_i of the message is raised to the power e modulo n:
c_i = m_i^e (mod n)
RSA can be used multiple times (with different keys) to encrypt a message. The received
ciphertext can be decrypted in any order. The result is always the same. It does not matter in
which order the operations have been performed. However, one shouldn't encrypt a message in
this way more than twice, because of attacks based on the Chinese remainder theorem.
Encryption can be performed by using a private key as well. The procedure is the same, as
described above, but the private key (n, d) should be used instead. The receiver will have to use
the public key to decrypt the message.

Decryption
During decryption one should use a private key (n, d).
The received ciphertext consists of numbers that are smaller than n. Each ciphertext number c_i is raised to the power d modulo n:
m_i = c_i^d (mod n)
The received plaintext numbers should be combined in the correct order into the original plaintext
message.
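Encryption and decryption of individual numbers can be sketched in Python as below. The key values n = 3233, e = 17 and d = 2753 are toy numbers assumed for illustration, and splitting the message into single bytes is a simplification; no padding is applied.

```python
n, e, d = 3233, 17, 2753   # toy key: n = 61 * 53


def encrypt_number(m, public_key):
    """c = m^e mod n"""
    modulus, exponent = public_key
    return pow(m, exponent, modulus)


def decrypt_number(c, private_key):
    """m = c^d mod n"""
    modulus, exponent = private_key
    return pow(c, exponent, modulus)


# Each byte of the message is treated as one plaintext number (all are below n).
message = b"RSA"
ciphertext = [encrypt_number(m, (n, e)) for m in message]
plaintext = bytes(decrypt_number(c, (n, d)) for c in ciphertext)
```

For instance, the plaintext number 65 is encrypted to 2790, and decrypting 2790 gives 65 again.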

If the message was encrypted by a private key, decryption should be performed by using
the corresponding public key. The procedure is the same as the one presented above, but for
decryption the public key (n, e) should be used instead.

Message Authentication
RSA can be used to sign messages. A sender should produce a hash value of the message
content and then raise it to the power of d (modulo n). Therefore, he should perform the same
operations as during ordinary encryption procedure. The encoded hash value should be attached
to the message.
The recipient of the message can raise the received encrypted hash value to the power
of e (modulo n) and compare the result with a hash value calculated by him. If both values are
the same, then the recipient is assured that the message hasn't been changed.
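A toy Python sketch of signing and verification is shown below. The key values are illustrative, and reducing the SHA-256 digest modulo the tiny n is a simplification assumed only so that the numbers fit; real signatures use a large modulus and a padding scheme.

```python
import hashlib

n, e, d = 3233, 17, 2753   # toy key: n = 61 * 53


def digest_number(message):
    """Hash the message and map the digest into the range 0..n-1 (toy simplification)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message):
    """Sender: raise the hash value to the power d modulo n."""
    return pow(digest_number(message), d, n)


def verify(message, signature):
    """Recipient: raise the signature to the power e and compare with the own hash."""
    return pow(signature, e, n) == digest_number(message)


message = b"transfer 100 to Bob"
signature = sign(message)
```

verify(message, signature) returns True, while any other signature value for the same message is rejected.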

Security of RSA
If one used a small exponent e (for example 3) to encrypt a small value m (smaller than n^(1/e)), then m^e would be smaller than the modulus n and no modular reduction would take place. In such a case the value of m can be determined by taking an ordinary e-th root of the ciphertext, which is fast and effective.
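The weakness can be demonstrated with a short Python sketch. The modulus is built from two small, well-known primes chosen here only for illustration, and integer_root is a hypothetical helper that finds an exact integer root by binary search.

```python
P, Q = 1_000_000_007, 998_244_353   # illustrative primes, far too small for real use
N = P * Q
E = 3


def integer_root(value, k):
    """Largest integer r with r**k <= value, found by binary search."""
    low, high = 0, 1
    while high ** k <= value:
        high *= 2
    # Invariant: low**k <= value < high**k
    while high - low > 1:
        mid = (low + high) // 2
        if mid ** k <= value:
            low = mid
        else:
            high = mid
    return low


m = 4242                      # small plaintext: m**3 is below N
c = pow(m, E, N)              # no modular reduction actually happens
recovered = integer_root(c, E)
```

Here recovered equals m, although the attacker used neither the private key nor the factors of N.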
To avoid encrypting plaintext numbers that are too small, one should add random padding, which increases the values of the numbers. Random padding also ensures that the same plaintext number is encoded by different ciphertext numbers each time. There are a number of popular padding schemes, for example OAEP, PKCS#1 or RSA-PSS.

The RSA algorithm without padding is deterministic, thus the cipher is vulnerable to chosen-plaintext attacks. It is possible to encrypt a lot of messages using a known public key. Therefore, an attacker can guess the content of captured encrypted messages by comparing them with messages he has encrypted himself.
Another feature of this cipher is that a ciphertext of the product of two plaintext numbers is
the same as the product of ciphertexts that correspond to those plaintexts.
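This multiplicative property is easy to check in Python. The toy key n = 3233, e = 17 and the two plaintext numbers are values assumed for illustration.

```python
n, e = 3233, 17   # toy public key

m1, m2 = 7, 11
c1 = pow(m1, e, n)
c2 = pow(m2, e, n)

product_of_ciphertexts = (c1 * c2) % n
ciphertext_of_product = pow((m1 * m2) % n, e, n)
```

Both values are equal, because (m1^e * m2^e) mod n = (m1 * m2)^e mod n.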

RSA is generally considered to be a secure cryptosystem. It is used in various applications, protocols and kinds of communication.

Block Diagram of RSA encryption and decryption


Maths:
Euler's totient function (phi function)
The value of Euler's totient function for a positive integer n, denoted φ(n), is the number of positive integers less than or equal to n that are relatively prime to n.
In other words, if n is a positive integer, then φ(n) is equal to the number of integers k such that 1 ≤ k ≤ n and the numbers n and k are coprime.
The totient function is a multiplicative function. This means that if two numbers a and b are relatively prime, then:
φ(a·b) = φ(a)·φ(b).
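The definition can be turned into a direct (and deliberately naive) Python function, which is enough to check the multiplicative property on small numbers. The name phi is illustrative; counting is far too slow for numbers of cryptographic size.

```python
from math import gcd


def phi(n):
    """Count the integers k with 1 <= k <= n that are coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(n, k) == 1)
```

For example, phi(8) is 4 and phi(9) is 6, and since 8 and 9 are relatively prime, phi(72) is 4 * 6 = 24. For a product of two distinct primes, phi(61 * 53) is 60 * 52 = 3120.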
Modular exponentiation
Raising a number to the power modulo a positive number is one of the most important operations
in modern cryptography. It is described by the general laws of numbers manipulation in modulo
arithmetic. Both the base and the exponent are positive integers. The result is also a natural
number.
Raising a number to a power modulo another number can be performed as a sequence of squarings, each followed by reduction modulo that number; the intermediate results that correspond to the binary digits of the exponent are then multiplied together.

For example, raising 4 to the power of 9 modulo 7 can be calculated in the following way:
4^9 mod 7:
4^2 mod 7 = 16 mod 7 = 2
4^4 mod 7 = 2^2 mod 7 = 4 mod 7 = 4
4^8 mod 7 = 4^2 mod 7 = 16 mod 7 = 2
4^9 mod 7 = (4^1 mod 7 * 4^8 mod 7) mod 7 = 4 * 2 mod 7 = 1
Unlike ordinary exponentiation, exponentiation in modular arithmetic is difficult to invert: no efficient algorithm is known for the reverse operation.

Attack Models for Cryptanalysis


Attacking a cipher or a cryptographic system may lead to breaking it fully or only partially. After
compromising the security, the attacker may obtain various amounts and kinds of information.
Lars Knudsen, a Danish researcher, proposed the following division for determining the scale of
attacker's success:
o Total break: deducing and obtaining a secret key.
o Global deduction: discovering an algorithm, which allows to decrypt many
messages, without knowing the actual secret key.
o Local deduction: discovering an original plaintext of the specific given
ciphertext.
o Information deduction: obtaining some information about the secret key
or original message (for example, a few bits of the key or information about
a plaintext format).
The best ciphers should protect against all the levels of failure mentioned above. No attack should be able to reveal any information related to the secret key or the plaintext messages.

Theoretical Attack Models:


Known-Plaintext Attack
During known-plaintext attacks, the attacker has an access to the ciphertext and its corresponding
plaintext. His goal is to guess the secret key (or a number of secret keys) or to develop an
algorithm which would allow him to decrypt any further messages.

This gives the attacker much better chances of breaking the cipher than ciphertext-only attacks do. However, he is not able to actively provide customized data or secret keys to be processed by the cipher.
Known-Plaintext Attack Efficiency
Known-plaintext attacks are most effective when they are used against the simplest kinds of
ciphers. For example, applying them against simple substitution ciphers allows the attacker to
break them almost immediately.
Known-plaintext attacks were commonly used against the ciphers of the Second World War. The most notable example is perhaps the British attack on the German Enigma ciphers. British intelligence targeted phrases that commonly appeared in encrypted German messages, such as weather forecasts or geographical names.
The simple XOR cipher, used in the early days of computers, can be also broken easily by
knowing only some parts of plaintext and corresponding encrypted messages.
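A short Python sketch shows why a repeating-key XOR cipher falls to a known-plaintext attack. The key, the message and the assumption that the attacker knows the key length are all made up for illustration.

```python
def xor_cipher(data, key):
    """Encrypt or decrypt by XORing the data with the repeated key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


secret_key = b"K3Y!"
plaintext = b"WEATHER REPORT: CLEAR SKIES"
ciphertext = xor_cipher(plaintext, secret_key)

# The attacker knows the first bytes of the plaintext (one key length is enough).
known_part = plaintext[:len(secret_key)]
recovered_key = bytes(c ^ p for c, p in zip(ciphertext, known_part))
recovered_text = xor_cipher(ciphertext, recovered_key)
```

XORing a known plaintext fragment with the matching ciphertext fragment yields the key bytes directly, so recovered_key equals secret_key and the whole message can be read.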
Modern ciphers are generally resistant to purely known-plaintext attacks. One of the unfortunate exceptions was the old encryption method used in the PKZIP application. Given just one encrypted file together with its original version, it was possible to recover the secret key completely.

In most cases however, the attacker should use more sophisticated types of cryptographic attacks
in order to break a well-designed modern cipher.

Date: 2020-03-09
Chosen-Plaintext Attack
During the chosen-plaintext attack, a cryptanalyst can choose arbitrary plaintext data to be
encrypted and then he receives the corresponding ciphertext. He tries to acquire the secret
encryption key or alternatively to create an algorithm which would allow him to decrypt any
ciphertext messages encrypted using this key (but without actually knowing the secret key).

This is a rather comfortable situation for the attacker. He can obtain more information about
the secret key and about the whole attacked system, because he is able to choose any text to be
processed by the cipher. He can analyse the system behaviour and output ciphertext, based on
any kind of input data.

During breaking deterministic ciphers with the public key, the intruder can easily create
a database with popular ciphertexts, for example with popular queries to the server. After that he
will be able to find the meaning of many intercepted encrypted messages, by simply comparing
them with his own database entries.

The best-known chosen-plaintext attacks were performed by Allied cryptanalysts during
World War II against the German Enigma ciphers.

Adaptive-Chosen-Plaintext Attack
In this kind of chosen-plaintext attack, the intruder is able to choose plaintext for encryption many times. Instead of submitting one big block of text, he can choose a smaller one, receive its ciphertext and then, based on the answer, choose another one, and so on. This allows him to investigate the attacked system in much more detail.

Ciphertext-Only (Known Ciphertext) Attack
During ciphertext-only attacks, the attacker has access only to a number of encrypted messages.
He has no idea what the plaintext data or the secret key may be. The goal is to recover as many
plaintext messages as possible or (preferably) to guess the secret key. After discovering the
encryption key, it will be possible to break all the other messages which have been encrypted with
this key.

While designing encryption algorithms, it is particularly important to secure them against
ciphertext-only attacks, as they are the most obvious starting point for every cryptanalysis. That
is why well prepared and reviewed ciphers are usually not very vulnerable to these kinds of
attacks. However, one may still find examples of protocols that have been broken by various
attacks based on the ciphertext-only approach.

There are a few techniques which proved to be very effective even when targeting modern ciphers
and which are based only on the knowledge of the ciphertext messages. The most important
methods are:
o Attack on Two-Time Pad
o Frequency Analysis

Chosen-Ciphertext Attack
During the chosen-ciphertext attack, a cryptanalyst can analyse any chosen ciphertexts together
with their corresponding plaintexts. His goal is to acquire a secret key or to get as much
information about the attacked system as possible.

The attacker has the capability to make the victim (who obviously knows the secret key) decrypt any
ciphertext and send him back the result. By analysing the chosen ciphertext and the
corresponding received plaintext, the intruder tries to guess the secret key which has been used
by the victim.

Chosen-ciphertext attacks are usually used for breaking systems with public key encryption. For
example, early versions of the RSA cipher were vulnerable to such attacks. They are used less
often for attacking systems protected by symmetric ciphers. Some self-synchronizing stream
ciphers have also been attacked successfully in that way.

Adaptive-Chosen-Ciphertext Attack
The adaptive-chosen-ciphertext attack is a kind of chosen-ciphertext attack, during which
an attacker can make the attacked system decrypt many different ciphertexts, one after another.
Each new ciphertext is created based on the responses (plaintexts) received previously.

There exist rather few practical adaptive-chosen-ciphertext attacks. This model is used mainly for
analysing the security of a given system. Proving that this attack doesn't break the security
confirms that any realistic chosen-ciphertext attack will not succeed.

Chosen-Key Attack
Chosen-key attacks are a bit different than other kinds of cryptographic attacks. Usually, they are
intended to not just break a cipher but to break the larger system which relies on that cipher.

The attacker should have some knowledge regarding the relationship between various keys that
can be used in the cipher. Usually, he knows exactly what keys have been used or he himself can
choose the secret key.

An example of a chosen-key attack is a situation in which an intruder tries to compromise a
hash function based on a block cipher. If the attacker were able to find two different keys which
produce two block cipher outputs that are somehow related to each other, this would mean
that the main property of hash functions (never produce predictable output!) had been broken.

Cryptographic Attacks:
Brute-Force Attack
During the brute-force attack, the intruder tries all possible keys (or passwords), and checks which
one of them returns the correct plaintext. A brute-force attack is also called an exhaustive key
search.

The amount of time that is necessary to break a cipher grows exponentially with the size of the
secret key. The maximum number of attempts is equal to 2^(key size), where key size is the number
of bits in the key. Nowadays, it is possible to break a cipher with a key around 60 bits long by
using the brute-force attack in less than one day.
Using brute-force attacks may be beneficial against all ciphers in which the number of all possible
key values is smaller than the number of all possible different messages. Therefore, all ciphers
may be targeted, with the exception of ciphers providing perfect security.
For breaking ciphers using brute-force attacks, very fast specially designed supercomputers are
often used. They are owned by big research laboratories or government agencies, and they
contain tens or hundreds of processors. Alternatively, large networks of thousands of regular
computers working together may be used to break the same cipher. Cryptographic brute-force
attacks are very scalable processes.
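A minimal sketch of an exhaustive key search is shown below. It assumes a hypothetical toy stream cipher with a 16-bit key (the keystream is derived with SHAKE-256 only for convenience) and an attacker who can recognise the correct plaintext by a known prefix.

```python
import hashlib

KEY_BITS = 16  # toy key size; real keys are far longer


def toy_cipher(key: int, data: bytes) -> bytes:
    """Hypothetical toy stream cipher: XOR with a keystream derived from the key."""
    stream = hashlib.shake_256(key.to_bytes(2, "big")).digest(len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


def brute_force(ciphertext: bytes, known_prefix: bytes):
    """Try all 2^KEY_BITS keys and return the first one giving the expected prefix."""
    for key in range(2**KEY_BITS):
        if toy_cipher(key, ciphertext).startswith(known_prefix):
            return key
    return None


secret_key = 0xBEEF
ciphertext = toy_cipher(secret_key, b"PIN code: 4821")
found_key = brute_force(ciphertext, b"PIN code: ")
plaintext = toy_cipher(found_key, ciphertext)
```

Every additional key bit doubles the number of iterations of the loop, which is why the search quickly becomes infeasible for longer keys.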

Dictionary Attack
Dictionary attacks are a kind of brute-force attack, in which the intruder attempts to guess
a password by trying existing words or popular expressions.

Such an approach significantly reduces the number of possible passwords that have to be tested.
On the other hand, users often choose (or are required) to add some additional characters, like
numbers, to their passwords, so that the passwords cannot be found in dictionaries.
For that reason, the applications that perform dictionary attacks often apply some common
modifications to the tested words, for example they may append recent years.
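The idea can be sketched as follows. The example assumes that the attacker holds an unsalted SHA-256 hash of the password (the text above does not say how passwords are stored); the word list and the modification rules are hypothetical.

```python
import hashlib


def sha256_hex(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def candidates(words, years):
    """Yield each word, its capitalised form, and both forms with a year appended."""
    for word in words:
        for base in (word, word.capitalize()):
            yield base
            for year in years:
                yield f"{base}{year}"


def dictionary_attack(target_hash, words, years):
    for candidate in candidates(words, years):
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


words = ["dragon", "password", "sunshine"]  # hypothetical tiny dictionary
years = range(2018, 2021)

found = dictionary_attack(sha256_hex("Sunshine2020"), words, years)
not_found = dictionary_attack(sha256_hex("x9#Tq!"), words, years)
```

A password that is not derived from a dictionary word, like the second one, is not found by this search.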

Reverse Brute-Force Attack


In a reverse brute-force attack, the intruder tests a single (usually popular) password against
multiple victims. Usually a popular expression, like the word 'password', is tried against a huge
number of users. The attacker does not target a specific user but rather the whole system which
is used by them.

To prevent such attacks, administrators can ban the use of popular and overly predictable
passwords.

Denial-of-Service Attack
A Denial-of-Service attack (DoS attack) is an attack where an attacker attempts to disrupt the
services provided by a host, by not allowing its intended users to access the host from the Internet.
If the attack succeeds, the targeted computer will become unresponsive and nobody will be able
to connect with it.
DoS Techniques
There are a lot of methods that can be used to disable a server.

Reducing Performance
The most popular techniques are based on flooding the attacked system with thousands of fake
messages, thus forcing it to deal with them and making it unable to react to genuine requests
from the real users or clients.

By preparing the messages carefully and targeting the correct parts of the system, it is possible
to prepare requests that cause the most difficulty to the victim's computer. Their
processing should be as time-consuming as possible, and as much of the server's computational
power as possible should be used up.

Exhausting Resources
Instead of just sending random messages, the attacks may be designed to use up all of the
host's available resources of some particular type. For example, the attacker may prepare messages
that lead to allocating all of the host's network connections, thus making it unable to accept any
other network requests.

An example of this attack type is a SYN flood. During this attack a victim's computer receives
thousands of fake TCP/SYN packets, which force it to open a separate TCP connection for each
of them.
Similarly, the messages may be designed in a way that will cause the server to fill up the whole
available memory or disk space (for example, with log messages or core dump files).

Crashing
Finally, other methods of DoS attacks are supposed to completely crash the attacked host, by
using some known vulnerabilities of its software.

This may be achieved, for example, by sending malformed messages which cause trouble for the
handlers on the server side. A lot of operating systems were vulnerable to attacks of this type
that targeted the Internet Layer (and therefore dealt with IP addresses).

Targeting Layers
By choosing the way of constructing the messages, the attacker can target different network
layers of the attacked system. Usually attacks are performed against the application layer or
the functionalities that handle the popular lower-level protocols TCP and UDP. The complexity of high-level
algorithms allows the intruders to construct a lot of complicated messages, targeting various
vulnerabilities of the attacked systems.
More sophisticated attacks target lower layers of the TCP/IP network model. For example, there
exist a lot of tools working similarly to popular ping programs. They create large numbers of IP
packets, which are supposed to flood the network and use up the network's bandwidth. The
packets may be either valid (in this case we could call the attack ICMP flooding) or invalid
(which may lead to the so-called Nuke attacks).
A popular lower layer attack is called a ping of death. This is basically a malformed ping packet,
which may lead to a system crash on unprepared systems.
Attacker's Goal
Disabling the attacked computer may be a goal by itself. This is often the case in various political
attacks, when intruders want mainly to manifest their slogans. Also, it is often enough to disable
a targeted system or even just to pose the threat of doing that if the attackers want only to demand
a ransom for stopping the attack.

Other DoS attacks are more sophisticated. In such situations, disabling online services is just the
first step, and the attack will be continued in order to exploit the vulnerability of the system. Quite
often, after removing the original server, the attacker creates his own identical service which is
supposed to imitate the original one. Having a fake copy of the attacked system, the attackers
may take advantage of its users, and use the controlled fake server to steal their data.

The most dangerous DoS attacks are perhaps the attacks which result in damaging the actual
hardware. They are called permanent denial-of-service attacks or phlashing. A well-designed
attack may disable the components of the targeted system which are crucial for the actual
mechanical devices, thus breaking them and forcing the administrators to reinstall or even replace
damaged hardware.

DDoS (Distributed Denial-of-Service) Attack


A distributed denial-of-service attack is an attack where the targeted system is attacked by a large
number of other machines, often located in different places, sometimes all around the world. The
complexity of such action is much higher, due to the necessity of configuring and coordinating a
large number of machines. On the other hand, the computational power of all connected devices
is also much bigger, which makes such attacks much more dangerous.

Thanks to the usage of thousands of computers, the number of generated messages that have
to be handled by the attacked system is really huge. Nowadays, the largest DDoS attacks can
generate traffic measured in terabits per second.

Degradation-of-Service
Degradation-of-service attacks are similar to denial-of-service attacks but they are intended not to
block the server completely but rather to disturb it and reduce its performance. The number of
messages sent is much smaller, and the server should be able to cope with the increased
traffic.

Therefore, these attacks are not as dangerous as the normal DoS attacks. They are intended to
reduce the performance of the attacked host, discouraging its clients, and to force the
administrators to take additional actions to improve the server's performance. All that results in
increased costs and financial damages which will affect the attacked system.

Well-performed degradation-of-service attacks are designed in a way that makes it unclear
to the administrators whether any attack is taking place at all or whether they are just facing
increased traffic.

Reflected (Spoofed) Attack


The spoofed attack is similar to DDoS attacks because it also involves flooding the attacked
system with messages from many different sources and locations. However, the attacker, instead
of sending the messages directly to the victim, first sends them to other computers, which reflect
them and resend them to the targeted system.
This attack is also called the DRDoS (which means distributed reflected denial-of-service).
The attackers have to prepare specially constructed messages, with the fields indicating that they
were sent from the victim's computer (that is, with the fake source IP address). Then, those
messages are sent to many different machines, often genuine and located all over the world. The
servers that received the messages, which were surely unexpected but more-or-less valid, can
do only one sensible thing: send an error message to the victim's computer. As a result, the
targeted system will receive thousands of unexpected messages, which have to be dealt with.

Slowloris Attacks
During Slowloris attacks, the attacker sends a large number of requests, each of them very slowly.
The targeted system has to keep all the connections open, because they are perfectly valid
and therefore there is no reason to discard them.

This will result in using up the whole available pool of network connections on the server side.

These attacks require less sophisticated hardware to be used by the intruders, and make both
the detection and protection against them more difficult.

HTTP Post DoS Attack


An example of this type of attack is the HTTP Post DoS attack. The HTTP message sent by the
intruder contains the HTTP header Content-Length with a large value.
The beginning of the message (the part containing the header) is received promptly by the
attacked host but the rest of the request is sent to the server at an extremely slow rate. Because
the message is valid, the server cannot discard it, and the allocated process
keeps waiting for incoming bytes.

Of course, the attacker creates hundreds or thousands of such connections, depleting the
resources of the targeted system.

Shrew Attack
The shrew attack is another example of a DoS attack which is based on sending messages to the
attacked system at a slow rate. It targets the TCP protocol, so it operates on a lower level than
the HTTP Post DoS attack described above.

An attacker sends the messages at a carefully chosen rate, exploiting the TCP retransmission
mechanism. The TCP connections are not allowed to be closed, due to the ongoing
communication, and soon the whole TCP traffic may be disrupted.

Zombie Computers
DoS attacks are often performed indirectly. This means that the messages are not sent from the
intruder's computer but from other machines, which are controlled by the attacker. Those
machines are called zombies because their genuine users don't have full control over them.
Quite often, zombie computers are infected by specialized malware. After receiving the order from
the attackers, the hidden applications will start sending packets to the targeted system. The users
working on those machines often won't be aware of their participation in the attack.
Using zombie computers has at least two advantages. Firstly, it allows the attacker to create large
networks of computers, which will attempt to break the target system. The computational power of
many (hundreds or thousands) connected zombies working together is much larger than the power of
any other possible network that could be built by the intruders.

Secondly, as in the case of any other attack, an additional separation
between the attacked system and the attackers always increases the difficulty of any
countermeasures that can be taken by the system administrators. For example, blocking the IP
addresses of zombie computers located all over the world and belonging to different operators is
much more difficult than just blocking the access from one organisation or location.

Tools
There exist a lot of tools and applications available on the Internet that can perform various types
of DoS attacks. In fact, the underground market offers a variety of products, with different features
and prices. One could name programs like LOIC, HOIC, or MyDoom.
One could also mention two tools for DDoS attacks which were created in the UK by GCHQ. They are
called Predators Face and Rolling Thunder.
There are also tools which can be used for Slowloris attacks, like PyLoris, QSlowloris, and Goloris.

Man-in-the-Middle Attack
During the man-in-the-middle attack, the hidden intruder joins the communication and intercepts
all messages.

First, the attacker creates two secret keys. Then, he uses the first key to start the communication
with the first side. The received answer is encrypted but the intruder can decrypt it easily, as he
knows the key. He encrypts the message again, this time with the second key. The encrypted
message is then sent to the second side. Then, after receiving the answer from the second
side, he decrypts the message, reads it, encrypts it with the first key and sends it back to the
first side. In this way, the whole communication passes through the attacker. He can obtain a lot
of information about the whole system and even successfully impersonate authorized persons
and gain access to hidden data.

To defend against this attack, a strong mutual authentication method must be used before starting
transmission of secret data. Another way of protection is to use known public keys, which can
be obtained, for example, from trusted databases, instead of using any encryption key obtained from
one of the sides of the communication (so in this case, from the attacker).

This attack is often used for eavesdropping on the communication with Wi-Fi access points or with
base stations in GSM networks. As an example, you can refer to the KRACK attack against
WPA2.

Attack on Two-Time Pad


The general rule of cryptography says that one should never use the same keystream characters
more than once. Otherwise, the cipher starts to be vulnerable to ciphertext-only attacks.
The following example shows how the security of the OTP cipher is affected by using the same
keystream bytes twice:
c1 <- m1 XOR PRG(k)
c2 <- m2 XOR PRG(k)
Having the two ciphertexts, an eavesdropper is able to break the cipher just by XORing them
together:

c1 XOR c2 = m1 XOR PRG(k) XOR m2 XOR PRG(k) = m1 XOR m2


The resulting byte sequence does not depend on the secret key. Because there is
enough redundancy in languages and in ASCII encoding, the attacker is able to extract the original
messages:

m1 XOR m2 -> m1, m2
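The equations above can be checked with a short sketch. SHAKE-256 is used here only as a stand-in for the generator PRG(k), which is an assumption of this example, and the key and messages are hypothetical.

```python
import hashlib


def prg(key: bytes, length: int) -> bytes:
    """Stand-in pseudorandom generator: expands the key into a keystream."""
    return hashlib.shake_256(key).digest(length)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


key = b"secret key"
m1 = b"attack at dawn"
m2 = b"retreat at ten"

# The same keystream is used twice.
c1 = xor(m1, prg(key, len(m1)))
c2 = xor(m2, prg(key, len(m2)))

# The eavesdropper sees only c1 and c2; the keystream cancels out.
mixed = xor(c1, c2)

# If one message is known or guessed, the other one follows immediately.
recovered_m2 = xor(mixed, m1)
```

In practice neither message is known in full, so the attacker guesses likely words and checks whether the other message then looks like readable text.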


Nowadays, well-designed symmetric cipher algorithms combine the secret key bits with a value that
is unique for every piece of data. In the simplest case, a regular counter could be used. It
may be stored in a few bytes and it should increase with every iteration of the encryption algorithm.
This guarantees effective encryption, without the risk of repeating keystream bits.
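A minimal sketch of the counter idea, assuming that the keystream is derived from the key concatenated with an 8-byte counter (the derivation with SHAKE-256 is a hypothetical choice made for this example, not a real cipher mode):

```python
import hashlib


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def keystream(key: bytes, counter: int, length: int) -> bytes:
    """Derive a keystream that is different for every counter value."""
    return hashlib.shake_256(key + counter.to_bytes(8, "big")).digest(length)


def encrypt(key: bytes, counter: int, message: bytes) -> bytes:
    """XOR with the keystream; the same call also decrypts."""
    return xor(message, keystream(key, counter, len(message)))


key = b"shared secret"
m1 = b"attack at dawn"
m2 = b"retreat at ten"

c1 = encrypt(key, 1, m1)  # first message, counter = 1
c2 = encrypt(key, 2, m2)  # second message, counter = 2
decrypted = encrypt(key, 1, c1)
```

Because the two keystreams differ, XORing c1 with c2 no longer cancels the key-dependent part.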

Venona Project
During and after the Second World War, hundreds of cryptanalysts of intelligence agencies of
the United States and the United Kingdom were collaborating against intelligence agencies of
the Soviet Union. All the messages sent by Soviet spies and diplomats were constantly stored
and analysed. The most important messages were encrypted with a One-Time Pad system.

Many secret Soviet messages were revealed, due to a serious blunder on the part of the Soviets.
Because of shortages of code books, the operators reused some parts of the secret OTP keys
for encryption of multiple messages. Every page of the code book should have been used exactly
once, and then it should have been destroyed.

This mistake broke the security of the One-Time Pad cipher. It allowed the Allies to decrypt many
secret messages and gain an advantage over their communist opponents.

MS-PPTP
PPTP (Point to Point Tunnelling Protocol) is one of the communication protocols which allow
virtual private networks (VPN) to be created using tunnelling. The implementation of this protocol
created by Microsoft was one of the most popular (used in Windows 98 and Windows NT), and also one
of the most faulty. MS-PPTP has been considered cryptographically broken by Microsoft since
2012, and it is no longer recommended.

One of the MS-PPTP weaknesses is the lack of proper synchronisation between the client and
the server. They use the same secret key (usually created from the user's password) in exactly
the same way, for sending their messages. Both parties fail to create unique keystreams by
adding some unique numbers.
Encryption in the MS-PPTP protocol

The client groups its messages together, and then encrypts them using the shared secret key.
In the meantime, the same operations are performed by the server. It also groups the messages,
encrypts them using the same shared secret key, and sends them to the client.

Because the secret key bytes used are the same, the attacker may eavesdrop on messages from
the client and from the server which are encrypted using exactly the same keystream bytes.
Having such data, the attacker has a good chance of breaking the cipher and recovering the
original data.

802.11 WEP
802.11 is a group of IEEE (Institute of Electrical and Electronics Engineers) standards of wireless
network protocols. In their older versions, it was recommended to use the WEP (Wired Equivalent
Privacy) standard (created in 1997) for encryption of wireless transmission.

WEP encryption
The messages exchanged between client and host are encrypted using the RC4 symmetric stream
cipher. Both sides use the same 5-byte secret key for generating the keystream, so each side
generates the same keystream. To ensure that every message is encrypted using different bytes,
they add three additional bytes of an initialisation vector (IV) to every key sequence. The IV is
attached unencrypted to each encrypted message. This allows the receiver to decrypt all messages.
However, because the IV has only 24 bits, its values begin to repeat after a relatively short time.
This happens after around 16 million frames, so (if network traffic is high) after
around 5 hours. Moreover, some devices reset the IV during a restart, which makes it possible to
observe the same byte sequences even sooner.
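The estimate above can be reproduced with a short calculation. The rate of 1000 frames per second is an assumption made for this sketch; the text only says that the traffic is high.

```python
IV_BITS = 24


def hours_until_iv_repeats(frames_per_second: float) -> float:
    """Time needed to use up all IV values when a counter is used for the IV."""
    iv_values = 2**IV_BITS
    return iv_values / frames_per_second / 3600


iv_space = 2**IV_BITS                 # number of different IV values
hours = hours_until_iv_repeats(1000)  # assumed rate: 1000 frames per second
```

With these assumptions the whole IV space is used up in a little under five hours, which agrees with the figure quoted above.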
There exist a few other flaws that make WEP even less secure. Usually, regular counters are
used for creating the IV. This makes the key bytes used for encrypting messages in both
directions relatively similar. Moreover, some values of the IV are considered weak because
they make it possible to attack specific bytes of the secret key. Because of all these weaknesses,
WEP encryption can be broken in a very short time (within a few minutes).
In newer versions of IEEE standards, newer security protocols are recommended: WPA (Wi-Fi
Protected Access) and WPA2.

Key Reinstallation Attack


Another example of an attack based on the two-time-pad vulnerability is the KRACK attack, presented
in October 2017. Due to its complexity, it is described in a separate section.

KRACK
Key Reinstallation Attack (KRACK) is a complex attack against the WPA2 protocol. It is a
combination of a known-ciphertext attack and a man-in-the-middle attack. The intruder performs
the attack during the WPA2 handshake, that is during the initialisation of WPA2 connection. The
attack is based on flaws in the standard and its implementations.
At the moment when the details regarding this attack were published (on 16th October 2017) most
existing WPA2 implementations were vulnerable to the KRACK attack. The authors carefully prepared
their publication, by creating a dedicated website, creating videos, and even preparing a special
logo (available in many resolutions). They proposed how to fix the standard and prepared patches
for all major WPA2 implementations, which should protect them from being vulnerable to attacks
based on KRACK.

One may predict that the lifetime of the KRACK attack will not be particularly long. Most producers
released high-priority patches, which fix the issue, shortly after the publication. However, it
seems reasonable to briefly present the way of performing this attack. This is a practical example
of a cryptographic attack on the two-time pad.

WPA2 Secret Key


To authenticate itself to the WPA2 wireless network, every device exchanges a number of messages
with the router. This allows them to generate a common secret key, which will be used to
encrypt all their further communication. In order to maintain security, it is required that the key is
unique. This means, in particular, that this key has not been used to protect any earlier
communication. See the section on the attack on the two-time pad to find out why this situation
should be avoided.
The secret key is created based on a shared network password. In order to make sure that the
key is unique, both sides (the client accessing the network and the wireless router) generate the
key based also on some random numbers, called nonces, and their MAC addresses.
In the WPA2 protocol, to create the secret key, the client and the router exchange four messages.
The first two messages carry the random numbers chosen by both of them and transmitted to the
other device. After exchanging those two messages, the router sends a third message to the
client. It contains a group key (which is not important for us now) and makes the client generate
the secret key. In the last message the client just sends the confirmation to the router.

Performing the Attack


The KRACK attack is about forcing the communicating sides to use the same secret keys
multiple times.

To achieve that, the first step of the intruder is to perform the man-in-the-middle attack. After
faking his MAC address, he will locate himself between both communicating sides. He will pretend
that he is one of them. The attacker must be able to intercept and block the messages sent
between the router and the client.
In the second step, the attacker should intercept and save the third handshake message. He must
also block the client response (the fourth handshake message) and prevent the router from
receiving it.

The faulty WPA2 specification recommends that the client generate the secret key every time it
receives the third handshake message. Thanks to that, after some time the attacker can send
the third message again, which sets the key to the same value as previously (because it is created
based on the same data as before: the password, the two nonces and the MAC addresses).
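The effect of the replayed message can be illustrated with a simplified model. This is only a sketch: the key derivation and the per-packet keystream below are hypothetical stand-ins built from hashlib and do not follow the real WPA2 algorithms. The model assumes that installing the key also resets the packet counter.

```python
import hashlib


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


class VulnerableClient:
    def __init__(self, password, anonce, snonce, ap_mac, client_mac):
        self._material = b"|".join([password, anonce, snonce, ap_mac, client_mac])
        self.key = None
        self.packet_counter = 0

    def on_handshake_message_3(self):
        # Vulnerable behaviour: the key is (re)installed and the counter is
        # reset every time the third handshake message arrives.
        self.key = hashlib.sha256(self._material).digest()
        self.packet_counter = 0

    def send(self, payload: bytes) -> bytes:
        self.packet_counter += 1
        seed = self.key + self.packet_counter.to_bytes(8, "big")
        return xor(payload, hashlib.shake_256(seed).digest(len(payload)))


client = VulnerableClient(b"wifi-password", b"router-nonce", b"client-nonce",
                          b"\x02\x00\x00\x00\x00\x01", b"\x02\x00\x00\x00\x00\x02")
client.on_handshake_message_3()
m1 = b"GET /inbox HTTP/1.1"
c1 = client.send(m1)

client.on_handshake_message_3()  # the attacker replays the third message
m2 = b"POST /login HTTP/1."
c2 = client.send(m2)             # encrypted with the same key and counter as c1
```

After the replay, both packets are protected by the same keystream, so c1 XOR c2 equals m1 XOR m2, exactly as in the attack on the two-time pad.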

Depending on the particular algorithm used for encryption of further communication between the
router and the client, the attacker may compromise the security to a different extent: starting from
discovering the secret key protecting the communication in one direction, through the two-
direction communication key (CCMP and GCMP protocols), up to breaking the key completely
(because some Linux and Android versions reset the key to zeros as a result of this attack).

It is worth mentioning that the attacker cannot discover the network password stored on both
devices. Only the secret key is stolen, which is used only for message encryption during the
current session.

Protection against KRACK


KRACK attacks can be easily prevented, by changing the parts of code responsible for WPA2
handshake.

First of all, after receiving the third handshake message and before actually resetting the key, it
would be a good idea to check whether the key has already been generated. If the key exists, it
shouldn't be calculated again.
Also, the values of the nonces and the counters (incremented after sending every message)
shouldn't be reset if the key already exists and is used for encryption of ongoing communication.
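The check described above can be sketched in a simplified model. The class, the key derivation and the keystream are hypothetical stand-ins built from hashlib; they are not taken from any real WPA2 implementation or patch.

```python
import hashlib


class PatchedClient:
    def __init__(self, key_material: bytes):
        self._material = key_material
        self.key = None
        self.packet_counter = 0

    def on_handshake_message_3(self) -> bool:
        """Install the key only once; return True if it was installed now."""
        if self.key is not None:
            # Key already in use: keep it and do NOT reset the counter.
            return False
        self.key = hashlib.sha256(self._material).digest()
        self.packet_counter = 0
        return True

    def send(self, payload: bytes) -> bytes:
        self.packet_counter += 1
        seed = self.key + self.packet_counter.to_bytes(8, "big")
        stream = hashlib.shake_256(seed).digest(len(payload))
        return bytes(p ^ k for p, k in zip(payload, stream))


client = PatchedClient(b"password|nonce-a|nonce-b|mac-a|mac-b")
first_install = client.on_handshake_message_3()
c1 = client.send(b"same payload")
replayed_install = client.on_handshake_message_3()  # replayed third message
c2 = client.send(b"same payload")
```

Because the counter keeps increasing after the replay, even identical payloads are encrypted with different keystreams.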

Conclusion
The KRACK attack presented here is an interesting example of a man-in-the-middle attack,
performed in order to break two-time-pad encryption.
Above, I presented only the most popular version of this attack. The authors presented a number
of attacks on similar protocols (Fast BSS Transition, TDLS, PeerKey), which are based on similar
handshake algorithms. They also presented a way of stealing the group key and various modifications
of this attack for different protocol versions and operating systems. To find out more, you can visit
the website which is devoted to the KRACK attack: www.krackattacks.com.

Frequency Analysis
Frequency analysis is one of the ciphertext-only attacks. It is based on the study of
the frequency of letters or groups of letters in a ciphertext.
In all languages, different letters are used with different frequencies. The proportions in which
the characters appear differ slightly from language to language, so texts written in a given
language have certain common properties, which make it possible to distinguish them from texts
written in other languages.

For example, in English the vowels e, o and a and the consonant t are used often. On the other
hand, there are some very rare letters, for example z or x. Tables of letter frequencies
are available for different languages. The frequencies can be determined only approximately because
they differ slightly between kinds of texts (scientific, historical, fiction).
Each language has some typical and popular sequences of letters. In English, there are some
common bigrams, like tr, er, on, an, ss, tt and ee. Based on that, one can distinguish
an English text from texts written in other languages. It is also possible to determine the correct
order of letters in scrambled words.
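Counting letters and bigrams is straightforward. The sketch below uses a short hypothetical sample sentence; real frequency tables are built from much larger texts.

```python
from collections import Counter


def letter_frequencies(text: str) -> Counter:
    """Count how often each letter occurs (case-insensitive)."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def bigram_frequencies(text: str) -> Counter:
    """Count pairs of adjacent letters inside words."""
    counts = Counter()
    for word in text.lower().split():
        letters = [ch for ch in word if ch.isalpha()]
        counts.update(a + b for a, b in zip(letters, letters[1:]))
    return counts


sample = "meet me here where the trees never sleep"
letters = letter_frequencies(sample)
bigrams = bigram_frequencies(sample)
most_common_letter = letters.most_common(1)[0][0]
```

In this sample the letter e clearly dominates and the bigram ee occurs three times.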

Frequency Analysis of Substitution Ciphers


Frequency analysis is used for breaking substitution ciphers. The general idea is to find
the popular letters in the ciphertext and try to replace them by the common letters in the used
language.

The attacker usually checks some possibilities and makes some substitutions of letters
in the ciphertext. He looks for words that start to appear and based on that makes more substitutions.
Using computers, it is possible to try a lot of combinations in a relatively short time.

For example, if in the analyzed ciphertext the most popular letter is x, one may predict
that x replaced e or o (one of the most popular letters in English) from the plaintext.
It is useful to look for popular pairs of letters or even try to predict some frequent longer
sequences of letters or whole words. The intruder always tries to find sequences of letters which
are often used in the selected language.
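As a minimal sketch, the example below attacks the simplest substitution cipher, a shift (Caesar) cipher, which is an assumption made to keep the example short. It guesses that the most frequent ciphertext letter stands for e; the sample text is hypothetical.

```python
from collections import Counter
import string

ALPHABET = string.ascii_lowercase


def shift_text(text: str, shift: int) -> str:
    """Shift every lowercase letter by the given number of positions."""
    result = []
    for ch in text:
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)
    return "".join(result)


def guess_shift(ciphertext: str, assumed_plain_letter: str = "e") -> int:
    """Assume the most frequent ciphertext letter replaced the given plain letter."""
    counts = Counter(ch for ch in ciphertext if ch in ALPHABET)
    most_common = counts.most_common(1)[0][0]
    return (ALPHABET.index(most_common) - ALPHABET.index(assumed_plain_letter)) % 26


plaintext = "meet me here where the trees never sleep"
ciphertext = shift_text(plaintext, 3)
shift = guess_shift(ciphertext)
recovered = shift_text(ciphertext, -shift)
```

For a general substitution cipher the same counting gives only the first few guesses, and the remaining letters are found by looking for likely bigrams and words.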

Meet-in-the-middle Attack
The meet-in-the-middle attack is one of the types of known-plaintext attacks. The intruder has
to know some parts of the plaintext and their ciphertexts. Using meet-in-the-middle attacks, it is
possible to break ciphers which use two or more secret keys for multiple encryption using the
same algorithm. For example, the 3DES cipher works in this way. The meet-in-the-middle attack was
first presented by Diffie and Hellman for the cryptanalysis of the DES algorithm.
A cipher which is to be broken using the meet-in-the-middle attack can be defined as two
algorithms, one for encryption and one for decryption. Each of them contains two simpler
algorithms:
C = Eb(kb, Ea(ka, P))
P = Da(ka, Db(kb, C))

where:

• C is a ciphertext,
• P is a plaintext,
• E is an algorithm for encryption,
• D is an algorithm for decryption,
• ka and kb are two secret keys
The following equation can be written for the cipher defined above:
Db(kb, C) = Ea(ka, P)
where C is the ciphertext, known to the intruder, which corresponds to the message P, also
known to the intruder.
The first step of the attack is to create a table with all possible values for one side of the equation.
One should calculate all possible ciphertexts of the known plaintext P created using the first secret
key, that is Ea(ka,P). The number of rows in the table is equal to the number of possible secret keys.
It is a good idea to sort the table by the computed ciphertexts Ea(ka,P), in order to simplify
searching it later.
The second step of the attack is to calculate the values of Db(kb,C) for the other side of
the equation. One should compare them with the values of the first side of the equation, computed
earlier and stored in the table. The intruder searches for a pair of secret keys ka and kb for which
the value Ea(ka,P) found in the table and the just calculated value Db(kb,C) are the same.
The scheme of meet-in-the-middle attack

It is also possible to attack encryption systems in which the two encryption algorithms E are
different (and use keys that do not necessarily have the same length). In that case, in the first
step the table is created for the weaker of the two algorithms.
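The two steps of the attack can be sketched on a toy cipher. Everything in the block below is an assumption made for illustration: the 16-bit block, the 8-bit keys and the encrypt/decrypt functions are hypothetical and insecure, and a Python dictionary is used in place of the sorted table (it serves the same purpose of fast lookups).

```python
MASK = 0xFFFF  # toy 16-bit block size

def rotl(x, r):
    return ((x << r) | (x >> (16 - r))) & MASK

def rotr(x, r):
    return ((x >> r) | (x << (16 - r))) & MASK

def encrypt(key, block):
    """Hypothetical toy cipher with an 8-bit key (insecure)."""
    x = (block + key * 0x0101) & MASK
    x = rotl(x, 5)
    return x ^ (key * 0x3B)

def decrypt(key, block):
    x = block ^ (key * 0x3B)
    x = rotr(x, 5)
    return (x - key * 0x0101) & MASK

def double_encrypt(ka, kb, plaintext):
    # C = Eb(kb, Ea(ka, P))
    return encrypt(kb, encrypt(ka, plaintext))

def meet_in_the_middle(pairs):
    """Return all key pairs (ka, kb) consistent with the known (P, C) pairs."""
    p0, c0 = pairs[0]
    # Step 1: table of Ea(ka, P) for every possible ka.
    table = {}
    for ka in range(256):
        table.setdefault(encrypt(ka, p0), []).append(ka)
    # Step 2: compute Db(kb, C) for every kb and look it up in the table.
    candidates = []
    for kb in range(256):
        middle = decrypt(kb, c0)
        for ka in table.get(middle, []):
            # Filter false positives with the remaining known pairs.
            if all(double_encrypt(ka, kb, p) == c for p, c in pairs[1:]):
                candidates.append((ka, kb))
    return candidates

secret = (0x3A, 0xC5)
known = [(p, double_encrypt(secret[0], secret[1], p)) for p in (0x1234, 0xBEEF, 0x0F0F)]
print(secret in meet_in_the_middle(known))
```

The sketch performs about 2 * 256 cipher operations instead of the 256 * 256 needed to try every pair of keys.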

Meet-in-the-middle Complexity
The meet-in-the-middle attack breaks the cipher much faster than the
ordinary brute force attack. The time complexity depends on the
lengths of the two encryption keys ka and kb. It may be presented as a sum of two products:
2^len(ka) * log(2^len(ka)) + 2^len(kb) * log(2^len(ka))
where:
o 2^len(ka) - creating the table with all possible values of Ea(ka,P),
o log(2^len(ka)) - sorting the table with all possible values of Ea(ka,P),
o 2^len(kb) - calculating all possible values of Db(kb,C),
o log(2^len(ka)) - searching the sorted table with values of Ea(ka,P).
If the lengths of both keys ka and kb are the same and equal to Lk, then, ignoring the logarithmic
factors, the time complexity of the meet-in-the-middle attack can be presented as
O(2^Lk + 2^Lk) = O(2^(Lk+1)). Memory usage can be approximated as O(2^Lk). The time complexity
of the brute force attack is much greater and equals approximately O(2^(Lk+Lk)). However, the
brute force attack uses only O(1) memory.
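To get a feeling for the numbers, the formulas can be evaluated for a concrete key length. The sketch below assumes two 56-bit keys (the DES key length); the function names are hypothetical.

```python
import math

def mitm_operations(len_ka, len_kb):
    """2^len(ka) * log2(2^len(ka)) + 2^len(kb) * log2(2^len(ka))."""
    return (2 ** len_ka + 2 ** len_kb) * len_ka

def brute_force_operations(len_ka, len_kb):
    """Every combination of the two keys is tried."""
    return 2 ** (len_ka + len_kb)

lk = 56  # assumed key length in bits
print(math.log2(mitm_operations(lk, lk)))         # exponent of the meet-in-the-middle cost
print(math.log2(brute_force_operations(lk, lk)))  # exponent of the brute force cost
```

Even with the logarithmic factors included, the exponent of the meet-in-the-middle cost stays close to Lk+1, while the brute force exponent is 2*Lk.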

Meet-in-the-middle 2D
If an analyzed algorithm can be divided into two simpler algorithms with one intermediate state,
and if the state is smaller than the secret key, then it is possible to perform the two-dimensional
meet-in-the-middle attack. Modern block ciphers often operate on small data blocks
while using a fairly long secret key.

If it is possible to find the intermediate state S, then the analyzed cipher may be presented as:
C = E1(k1, E2(k2, P))
P = D2(k2, D1(k1, C))
where the values E2(k2,P) can be found in the set which contains all possible values of
the intermediate state S.
Both encryption algorithms E1 and E2 can be broken using a two-dimensional meet-in-the-middle
attack. A scheme of the two-dimensional attack is presented below:

In order to break a cipher using the two-dimensional meet-in-the-middle attack, one should take
the following steps:

1. Calculate all possible values of Ea1(ka1,P) (for known P and all possible values of
   the key ka1), then insert them into a table together with the corresponding keys ka1.
   The table should be sorted by the calculated values of Ea1(ka1,P).
2. Calculate all possible values of Db2(kb2,C) (for known C corresponding to P and all
   possible values of the key kb2), then insert them into a table together with the
   corresponding keys kb2. The table should be sorted by the calculated values of Db2(kb2,C).
3. For all possible values of the intermediate state S:
   1. Calculate all possible values of Db1(kb1,S) (for all possible values of the key kb1),
      then insert them into a table together with the corresponding keys kb1. The
      table should be sorted by the calculated values of Db1(kb1,S).
   2. Calculate all possible values of Ea2(ka2,S) (for all possible values of the key ka2),
      then insert them into a table together with the corresponding keys ka2.
      The table should be sorted by the calculated values of Ea2(ka2,S).
   3. Compare the values in the four created tables, searching for the
      equality Ea1(ka1,P) = Db1(kb1,S) (which yields a pair of keys (ka1,kb1))
      and Ea2(ka2,S) = Db2(kb2,C) (which yields a pair of keys (ka2,kb2)). Every combination
      of the two pairs is a potential secret key for the whole cipher. One should check all
      obtained combinations against other known plaintext and ciphertext blocks.
Usually the four extracted keys ka1, kb1, ka2 and kb2 share some bits. One should assign
an independent variable to each bit in the keys and treat them separately.

Meet-in-the-middle nD
It is possible that the attacked cipher can be divided into more than two simpler ciphers. In
the general case one could find n intermediate states and n+1 encryption algorithms which can
be broken using the meet-in-the-middle method. A scheme of the multidimensional meet-in-the-
middle attack is presented below:

In order to break a cipher using the multidimensional meet-in-the-middle attack, one should take
the following steps:

1. Calculate all possible values of Ea1(ka1,P) (for known P and all possible values of
   the key ka1), then insert them into a table together with the corresponding keys ka1.
   The table should be sorted by the calculated values of Ea1(ka1,P).
2. Calculate all possible values of Db_{n+1}(kb_{n+1},C) (for known C corresponding to P and all
   possible values of the key kb_{n+1}), then insert them into a table together with the
   corresponding keys kb_{n+1}. The table should be sorted by the calculated values
   of Db_{n+1}(kb_{n+1},C).
3. For all possible values of the intermediate state S1:
   1. Calculate all possible values of Db1(kb1,S1) (for all possible values of the key kb1),
      then insert them into a table together with the corresponding keys kb1.
      The table should be sorted by the calculated values of Db1(kb1,S1).
   2. Calculate all possible values of Ea2(ka2,S1) (for all possible values of the key ka2),
      then insert them into a table together with the corresponding keys ka2.
      The table should be sorted by the calculated values of Ea2(ka2,S1).
   3. For all possible values of the intermediate state S2:
      1. Calculate all possible values of Db2(kb2,S2) (for all possible values of
         the key kb2), then insert them into a table together with the
         corresponding keys kb2. The table should be sorted by the calculated
         values of Db2(kb2,S2).
      2. ...repeat analogous operations up to the intermediate state Sn...
      3. For all possible values of the intermediate state Sn:
         1. Calculate all possible values of Dbn(kbn,Sn) (for all possible values
            of the key kbn), then insert them into a table together with the
            corresponding keys kbn. The table should be sorted by the
            calculated values of Dbn(kbn,Sn).
         2. Calculate all possible values of Ea_{n+1}(ka_{n+1},Sn) (for all possible
            values of the key ka_{n+1}), then insert them into a table together with
            the corresponding keys ka_{n+1}. The table should be sorted by the
            calculated values of Ea_{n+1}(ka_{n+1},Sn).
         3. Analyze the values in all created tables, comparing the corresponding
            values of Eai with Dbi. One should obtain a few combinations of
            corresponding pairs of keys (kai,kbi). The pairs should
            be checked against other known parts of plaintext and ciphertext;
            one combination of pairs should encrypt and decrypt
            all data properly.
One can choose an arbitrary order of analyzed intermediate states. The states are smaller than
encryption keys, so tables created in the multidimensional meet-in-the-middle attack have
approximately the same size as tables created during the regular meet-in-the-middle attack.

Replay Attack
During a replay attack the intruder sends to the victim the same message as was already used in
the victim's communication. The message is correctly encrypted, so its receiver may treat it as
a correct request and take actions desired by the intruder.

The attacker might either have eavesdropped on a message exchanged between the two sides, or he may
know the message format from his own previous communication with one of the sides. This message
may contain some kind of secret key and be used for authentication.

For example, when someone orders the bank to transfer money to a specified account,
the attacker may eavesdrop on the frames. If that happens, the attacker can send the same (correct)
messages to the bank one more time, hoping that the bank will transfer the money again to the same
account (probably connected with the intruder).

There are some methods to avoid replay attacks. First of all, before starting the communication
both sides may negotiate and create a random session key, valid only for a specified time and
during a specified process. Instead of session keys, it is also reasonable to use timestamps in all
messages and accept only messages that were sent recently enough. Another popular
technique is to use one-time passwords for each request. This method of prevention is very often
used for banking operations.
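A minimal sketch of these countermeasures is shown below. It assumes that both sides already share a session key; the message layout, the 30-second acceptance window and the random per-message value (playing the role of a one-time value) are assumptions made for this example, not a description of any particular banking protocol.

```python
import hashlib
import hmac
import os
import time

MAX_AGE_SECONDS = 30  # hypothetical acceptance window

def sign_request(session_key, payload, now=None):
    """Attach a timestamp, a one-time value and an HMAC tag to the payload."""
    timestamp = int(time.time() if now is None else now)
    nonce = os.urandom(8).hex()
    message = f"{timestamp}|{nonce}|{payload}".encode()
    tag = hmac.new(session_key, message, hashlib.sha256).hexdigest()
    return {"timestamp": timestamp, "nonce": nonce, "payload": payload, "tag": tag}

class Receiver:
    def __init__(self, session_key):
        self.session_key = session_key
        self.seen_nonces = set()

    def accept(self, request, now=None):
        now = time.time() if now is None else now
        message = f"{request['timestamp']}|{request['nonce']}|{request['payload']}".encode()
        expected = hmac.new(self.session_key, message, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, request["tag"]):
            return False  # forged or modified message
        if abs(now - request["timestamp"]) > MAX_AGE_SECONDS:
            return False  # message is too old
        if request["nonce"] in self.seen_nonces:
            return False  # the same message was already accepted once
        self.seen_nonces.add(request["nonce"])
        return True

key = os.urandom(32)
bank = Receiver(key)
order = sign_request(key, "transfer 100 to account 42")
print(bank.accept(order))  # first delivery
print(bank.accept(order))  # replayed copy
```

A replayed copy fails either the timestamp check or the one-time value check, even though its authentication tag is valid.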

Cut-and-Paste Attack
In this variation of the replay attack, an attacker mixes parts of different ciphertexts and sends them
to the victim. Most likely, the newly created message will be incorrect, but the receiver may react
in a way that allows the intruder to obtain more information about the attacked system.

Date: 2020-03-09

Homograph Attack
A homograph attack exploits the standards of the modern Internet that allow creating (and displaying
in web browsers) URLs with characters from various language sets (including non-ASCII letters).
Different languages may contain different but very similar characters. Attackers can register their
own domain names that look similar to existing web addresses. Then they can create their own
websites that are, again, the same as or very similar to the original sites (which usually belong
to banks, corporations, email or news services). The phony websites are used for stealing data
from users who happen to visit them.

Simple homograph attacks


In the simplest version of such attacks, a fake URL may consist only of simple ASCII alphanumeric
characters. The intruder uses symbols that are similar to each other. Often the letter q may be
confused with g, or o with 0.
Such URLs may fool some less experienced users:

http://www.g00gle.co.uk
http://bl00mberg.com

Non-ASCII URLs
The ability to use non-English characters in URL addresses was added in 2003, due to
the increasing number of non-English-speaking people using the Internet. The change
made it possible to register and use domain names that could be understood by a much larger
number of interested people. Thus it became possible to create web addresses that were
combinations of ASCII and non-ASCII characters, or addresses that consisted only of national
symbols:
http://россия.net
http://газета.ру
http://budyń.pl

All non-ASCII addresses need to be encoded in a special way to be handled by DNS servers. This
format is known as Punycode and all browsers translate non-ASCII URLs into Punycode in
the background before performing a DNS lookup. A Punycode-encoded domain label always starts
with xn-- and then contains the ASCII characters of the original label followed by the encoded
Unicode data. For instance, the last address from the example above will be encoded in
the following form:
http://xn--budy-e2a.pl
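The conversion can be reproduced with the codecs built into Python. This is a small sketch: the helper names are hypothetical, and the built-in "idna" codec follows the older IDNA 2003 rules, which may differ from what current browsers apply to some characters.

```python
def to_ascii(hostname):
    """Encode an internationalized host name label by label (adds the xn-- prefix)."""
    return hostname.encode("idna").decode("ascii")

def to_unicode(ascii_hostname):
    """Decode the ASCII form back to the Unicode host name."""
    return ascii_hostname.encode("ascii").decode("idna")

print("budyń".encode("punycode"))     # raw Punycode of a single label, without the prefix
print(to_ascii("budyń.pl"))
print(to_unicode("xn--budy-e2a.pl"))
```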
Domain names that contain non-ASCII characters are called Internationalized
Domain Names (IDNs). They are handled in various ways by different web browsers. Usually every
vendor implements its own algorithm for determining the display format of requested URLs,
and usually one of two solutions (with some minor modifications) is preferred:
o Display all URL characters using Unicode, or
o Display all URL characters using Unicode if and only if all the characters
belong to the same language that is chosen in the user settings; display
the Punycode URL otherwise.

Homograph attacks using non-ASCII characters


Different languages, whose characters are encoded in different ways, may contain some letters that
look the same or at least very similar. Therefore it is possible to create URLs that consist of
different characters but are indistinguishable to the human eye.

For example, Latin and Cyrillic alphabets contain a couple of letters that look the same but have
completely different meaning and are encoded in a different way:

o (in Latin) a: U+0061, (in Cyrillic) а: U+0430
o (in Latin) c: U+0063, (in Cyrillic) с: U+0441
o (in Latin) p: U+0070, (in Cyrillic) р: U+0440
Given the more than 100 000 characters supported by Unicode (many of which look alike), the intruder
has great potential for creating various fake URLs, and even the most careful users may be confused.
At present, neither DNS registrars nor web browser vendors have managed to fully prevent such
attacks.
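One simple heuristic against such addresses is to flag domain labels that mix letters from different scripts. The sketch below derives the script from the first word of the official Unicode character name (for example LATIN or CYRILLIC); this shortcut and the function names are assumptions of the example, and the check does not detect labels written entirely in one look-alike script.

```python
import unicodedata

def scripts_in(label):
    """Return the set of script names used by the letters of a label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # e.g. "LATIN SMALL LETTER A" -> "LATIN"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def looks_suspicious(label):
    """A label that mixes scripts is a candidate homograph."""
    return len(scripts_in(label)) > 1

print(scripts_in("p\u0430ypal"))        # the second letter is Cyrillic U+0430
print(looks_suspicious("p\u0430ypal"))
print(looks_suspicious("paypal"))
```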

Finally, an attacker can use a character that looks like the actual
ASCII slash / (U+002F): the division slash ∕ (U+2215). This allows the attacker to set up
a subdomain that looks like another real domain, using his own name server and top-level
domain. The fake URL address could look like the one below:
http://example.com∕a-top-level-domain.com/

The character located after .com is a mathematical division operator. In a web browser's address
bar the character would look like a common slash and the whole URL could be easily confused
with the directory a-top-level-domain.com located in the root directory under
the domain example.com:
http://example.com/a-top-level-domain.com/

Security
The best protection against homograph attacks seems to be provided by web browsers that warn about
such phony addresses or handle them properly. Unfortunately this is not always the case and,
what is more, the behaviour may differ depending on the browser vendor.
Cryptographic Tools
Cryptography in Java
The cryptographic functionality in Java is provided mainly by two libraries, Java Cryptography
Architecture (JCA) and Java Cryptography Extension (JCE). The first one, JCA, is tightly
integrated with the core Java API, and delivers the most basic cryptographic features. The latter
one, JCE, provides various advanced cryptographic operations.

In the past, both JCA and JCE libraries used to be treated differently by US export policies. Over
time however the regulations were relaxed, and at present they both are delivered as part of
Java SE and the division is no longer important (one should keep in mind that it does not mean
that the law won't change in the future).

The API functions and classes defined in JCA and JCE allow cryptographic operations to be
performed in Java applications. In addition to operations, the classes describe various objects
and security concepts. All classes belonging to JCA and JCE are called engines.
All JCA engines are located in the java.security package, whereas the JCE classes are
located in the javax.crypto package.
Among others, JCA delivers engines for random number generation (SecureRandom), key
generation and management (KeyPairGenerator, KeyStore), message authentication
(MessageDigest, Signature), and for certificate management
(CertificateFactory, CertPathBuilder, CertStore).
JCE contains engines that allow actual encryption and decryption (Cipher), secret key
generation and agreement (KeyGenerator, SecretKeyFactory, KeyAgreement), and
message authentication operations (Mac).

Providers
Whilst JCA and JCE define all cryptographic operations and objects, the actual implementations
of functionalities are located in separate classes, called providers. The providers implement the
API defined in JCA and JCE, and they are responsible for providing the actual cryptographic
algorithms.

Thanks to that, the whole cryptographic architecture is relatively flexible. It separates the
interfaces and generic classes from their implementations. For most of the time, after the
initialization, the programmers need to deal only with abstract terms, like 'cipher' or 'secret key'.

In order to be used in Java applications, all providers must be signed by using a certificate
from Oracle. Detailed instructions can be found in the JDK documentation.
The providers can be installed by configuring the Java Runtime: installing the JAR containing the
provider, and then enabling it by adding its name to the java.security file. Alternatively, the
providers may be installed during execution by the application itself (by calling
the Security.addProvider(..) function).
Each functionality, for example the AES cipher algorithm, may be defined by several providers.
The application, when calling the JCA and JCE API functions, can specify which provider should
be used. Alternatively, the Java engine will choose an available provider based on the preference
order specified in the java.security file.
Popular Providers
A default set of SUN providers (nowadays owned by Oracle) is installed together with the main
Java cryptographic functionality. There are many different kinds of SUN providers (SUN, SunJCE,
SunPKCS11, and so on), and they are used by both JCA and JCE libraries. They define most (if
not all) cryptographic functionalities and can be used straight away in Java applications.
An example of a different set of providers is the collection of classes called Bouncy Castle. It was
developed by an Australian charitable organization, so the US law restrictions do not apply to it.
Bouncy Castle provides a large number of classes implementing various cryptographic
operations. The full description of the project may be found on the website: www.bouncycastle.org.
Another set of providers was created by the Cryptix organization; however, the project has not been
actively developed since 2005. The Cryptix website is located at: www.cryptix.org.

Policy Files
In older Java releases, the default configuration limits the size of various
types of secret keys. The restrictions are related to US export law and they are supposed to prevent
applications from using ciphers that are too strong. Since Java 9 (and Java 8 update 161), the
unlimited strength policy is enabled by default and no additional steps are required.

On older releases, one can overcome the limitations by downloading and applying the unlimited
strength policy files. They can usually be acquired from the original Java download web page. The
download link is usually located somewhere at the bottom of the page. After getting the packed
archive, the user should follow the instructions that can be found in the README file.

Security Tokens
Security tokens are tools that allow users to prove their identity electronically. They are usually
used as an additional means of authentication, typically together with passwords.

The tokens may be either physical devices or pure software applications, operating on computers
or mobile devices. Depending on their implementation, security tokens may be referred to as
authentication tokens, cryptographic tokens, hardware or software tokens, USB tokens, or key
fobs.

Irrespective of the type, the main functionality of all security tokens is basically the same. Every
token provides some kind of authentication code for the users, which allows them to access
a particular service (for example, an online bank account).

Another typical application of tokens is hardware dongles. They are required by some
applications to prove ownership of the software. During startup, the program queries the token
connected to the USB port and checks the authentication code.
Usually a security token requires a password to release the internal authentication code. The
password usually has the form of a short PIN. Sometimes more sophisticated ways of
authentication are implemented, for example fingerprint readers.

The way the authentication code is delivered may also vary between tokens. The simplest (and
the most popular) method is to display the code on the device display, so that the user may use
it later when required. Other tokens use NFC or Bluetooth technologies for transmitting the
code, or need to be connected in another way, for example to the computer USB port or
a smart card reader.
Tokens may use different means for generating authentication codes.

Static password tokens

The tokens with a static password are the simplest type of security tokens. The secret code is
stored inside the token and it is released when the user asks for it.

Such tokens do not provide good security: the code never changes, so anyone who intercepts it
once can reuse it (for example, in a replay attack).

Time-synchronized tokens
The time-synchronized tokens generate a password based on the current time. They must contain
a timer which is synchronized with another timer, operating on the authentication server side. The
passwords generated by time-synchronized tokens change constantly at a set time interval, for
example every minute.

The time-synchronized tokens may, over time, become unsynchronized. In such a case, the
passwords generated by them cannot be used to access the protected service, until
a resynchronization is performed.
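The idea can be sketched with the TOTP scheme from RFC 6238, which is one common way of building time-synchronized codes (the text above does not name a specific algorithm, so this choice is an assumption). The 30-second step and the one-step tolerance for clock drift are also assumptions of this example.

```python
import hashlib
import hmac
import struct

def hotp(secret, counter, digits=6):
    """HMAC-based one-time code for a given counter value (RFC 4226)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret, unix_time, step=30, digits=6):
    """The counter is the number of time steps since the Unix epoch."""
    return hotp(secret, int(unix_time) // step, digits)

def verify(secret, code, unix_time, step=30, window=1):
    """Accept codes from neighbouring time steps to tolerate small clock drift."""
    counter = int(unix_time) // step
    for c in range(max(0, counter - window), counter + window + 1):
        if hmac.compare_digest(hotp(secret, c, len(code)), code):
            return True
    return False

secret = b"12345678901234567890"  # test secret used in the RFC examples
code = totp(secret, 59)
print(verify(secret, code, 59 + 30))   # token clock one step behind the server
print(verify(secret, code, 59 + 150))  # token clock several steps behind
```

A token whose clock has drifted by more than the tolerated window produces codes that are rejected until it is resynchronized.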

Asynchronous tokens
The passwords generated by asynchronous tokens change every time they are generated. The
algorithms may be based on hash functions that generate a series of one-time codes from a
shared secret symmetric key.
Each created password must be unpredictable, even if all the previously generated
passwords are known. One of the popular algorithms used in asynchronous tokens is HOTP
(RFC 4226), published by the OATH initiative.
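A counter-based generator of this kind can be sketched with HOTP from RFC 4226, the counter-based one-time password algorithm published by the OATH initiative. The Token and Server classes and the look-ahead of three codes are hypothetical choices made for this example.

```python
import hashlib
import hmac
import struct

def hotp(secret, counter, digits=6):
    """One-time code derived from the shared secret and a counter (RFC 4226)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

class Token:
    def __init__(self, secret):
        self.secret = secret
        self.counter = 0

    def next_code(self):
        code = hotp(self.secret, self.counter)
        self.counter += 1  # every request yields a new code
        return code

class Server:
    def __init__(self, secret, look_ahead=3):
        self.secret = secret
        self.counter = 0
        self.look_ahead = look_ahead  # tolerate codes generated but never submitted

    def verify(self, code):
        for c in range(self.counter, self.counter + self.look_ahead + 1):
            if hmac.compare_digest(hotp(self.secret, c), code):
                self.counter = c + 1  # codes up to c can never be accepted again
                return True
        return False

secret = b"12345678901234567890"  # test secret used in the RFC examples
token, server = Token(secret), Server(secret)
first = token.next_code()
print(server.verify(first))  # fresh code
print(server.verify(first))  # the same code submitted again
```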
Tokens with public and private keys
If the token contains a private key, the server may use the corresponding public key to
authenticate it, without the need of transmitting the private key outside the token environment.

Usually the server sends the data encrypted with the public key. After decrypting the message,
the token sends it back to the server, allowing it to confirm the token identity. In such a case,
a direct communication between the token and the server must be established.
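The exchange can be illustrated with textbook RSA on tiny numbers. The key below (n = 3233, e = 17, d = 2753, built from the primes 61 and 53) is far too small to be secure and there is no padding; the class names are hypothetical. The sketch only shows the flow of the challenge and the response.

```python
import secrets

# Toy RSA key: n = 61 * 53, e * d = 1 (mod 3120). Insecure, for illustration only.
N, E, D = 3233, 17, 2753

class Token:
    """Holds the private exponent; it never leaves the token."""
    def __init__(self, n, d):
        self._n, self._d = n, d

    def respond(self, encrypted_challenge):
        return pow(encrypted_challenge, self._d, self._n)

class Server:
    """Knows only the public key (n, e)."""
    def __init__(self, n, e):
        self.n, self.e = n, e
        self._expected = None

    def new_challenge(self):
        self._expected = secrets.randbelow(self.n - 2) + 2
        return pow(self._expected, self.e, self.n)

    def check(self, response):
        ok = self._expected is not None and response == self._expected
        self._expected = None  # each challenge can be answered only once
        return ok

server, token = Server(N, E), Token(N, D)
challenge = server.new_challenge()
print(server.check(token.respond(challenge)))
```

Because the challenge is random and used only once, recording an old response does not help an eavesdropper.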


Key-Based Authentication
(Public Key Authentication)
Key-based authentication is a kind of authentication that may be used as an alternative to
password authentication. Instead of requiring a user's password, it is possible to confirm the
client's identity by using asymmetric cryptography algorithms, with public and private keys.
Nowadays, password authentication is more popular than public key authentication. It does not
require much preparation (at least, from the client's point of view), and perhaps is generally more
intuitive. In order to log into a server, users have to provide their secret passwords, which are
verified by the server. A disadvantage of this method is that, when a server is publicly available,
it may be targeted by various types of brute force and dictionary attacks, and the password may
eventually be broken and revealed. Moreover, this method requires the users to remember their
(ideally difficult and complex) passwords.
Public key authentication offers a solution to these problems. The idea is to assign a pair of
asymmetric keys to every user. Users store their public keys in each system they want to
use, while at the same time their private keys are kept secure on the computers from which the
users connect to those secured systems. While establishing the connection, the server
uses the public key to authenticate the client, for example by encrypting some number and
asking the client to decrypt it using the corresponding private key.

Of course, brute force attacks may still be performed by an attacker, but the complexity of
long and unreadable keys is much larger, and such attacks have a significantly smaller
chance of success. The asymmetric keys used at present consist of thousands of bits (as of
2016, the recommended lengths are 2048 and 4096 bits).

The most popular algorithm used for key-based authentication is RSA.

Public key authentication on Linux


Most Linux distributions provide native support for the SSH (Secure Shell) cryptographic network
protocol (most commonly through OpenSSH). Users are able to easily create and configure pairs of
asymmetric keys. After that, they can automatically log into servers that support key-based
authentication. While establishing the connection, SSH processes running on the server use the
proper public keys to challenge and authenticate the client processes running on the client machine.
Generating RSA keys
In order to benefit from all the advantages of key-based authentication that are described above,
the user first has to create a pair of asymmetric keys. On Linux this can be achieved by using the
ssh-keygen command:

ssh-keygen -t rsa -b 4096 -C "email@example.com"

The user will then be asked for a few configuration parameters, like the desired location to save
the keys and, optionally, a passphrase which will be used to protect the private key. Usually the
keys are created in the ~/.ssh directory, in the home folder. The public key is stored
in the id_rsa.pub file, whereas the private key can be found in id_rsa.
The private key can be stored on an external memory stick. The users will then be able to
authenticate themselves on any computer they happen to use, as long as they are able to present
the proper memory stick.

Private key passphrase


If a passphrase was entered by the user during key pair generation, then the private key is
stored in an encrypted file. The user will be asked for the passphrase before the private key
is released and ready to use for establishing a connection. Usually, operating systems cache the
decrypted key in memory and provide it automatically until the end of the local login session. On
Linux, the process responsible for that is called ssh-agent.
Because the private key is stored locally, on a trusted computer, it is not possible for an intruder
to attack it online. The private-key file permissions should be restricted, so that only
the owner is able to read it.
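On POSIX systems the permissions can be checked and tightened programmatically. The sketch below uses hypothetical helper names and treats a key file as safe when neither the group nor other users have any access to it (mode 600 or stricter).

```python
import os
import stat

def is_owner_only(path):
    """True if group and others have no permissions on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0

def restrict_to_owner(path):
    """Allow reading and writing for the owner only (rw-------)."""
    os.chmod(path, 0o600)

key_path = os.path.expanduser("~/.ssh/id_rsa")  # usual default location
if os.path.exists(key_path) and not is_owner_only(key_path):
    restrict_to_owner(key_path)
```

OpenSSH itself refuses to use a private key file that is accessible to other users, so this check mirrors what the client already enforces.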

The encrypted private key can be attacked only when the attacker has gained full access to the
computer (for example, by stealing it). Before the cipher is broken, the user has time to remove
the old public key from the servers that he was using, and to create a new secure pair of
asymmetric keys.

Such an approach is much more secure than normal, password-based authentication, where an
attacker can start attacking the user's password online.

Transfer a public key


In order to be able to log into a server, the user has to send the public key there. The
ssh-copy-id tool can be used for that:

ssh-copy-id <username>@<host>

The user may be warned that the host is unknown, and will be asked for the proper password,
but apart from that the command automatically transfers the public key to the specified server.

An alternative method is to manually append the contents of the id_rsa.pub file
to ~/.ssh/authorized_keys located on the server (in the home directory of the account that the
user wants to log into).
The results of both methods are the same: the public key is added to the authorized_keys file,
which is then used by the server when someone tries to connect by using the SSH protocol.
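The manual method amounts to appending one line to a text file. A minimal sketch of that operation, run on the server side, is shown below; the function name is hypothetical, and the directory and file permissions it sets are the restrictive ones normally expected by OpenSSH.

```python
import os

def authorize_key(public_key_line, ssh_dir):
    """Append a public key to authorized_keys unless it is already present."""
    os.makedirs(ssh_dir, mode=0o700, exist_ok=True)
    path = os.path.join(ssh_dir, "authorized_keys")
    key = public_key_line.strip()

    content = ""
    if os.path.exists(path):
        with open(path) as f:
            content = f.read()
    if key in (line.strip() for line in content.splitlines()):
        return False  # nothing to do, the key is already authorized

    with open(path, "a") as f:
        if content and not content.endswith("\n"):
            f.write("\n")  # keep one key per line
        f.write(key + "\n")
    os.chmod(path, 0o600)
    return True

# Example call (the key text is a placeholder, not a real key):
# authorize_key("ssh-rsa AAAA... email@example.com", os.path.expanduser("~/.ssh"))
```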
Using SSH connection
The SSH connection to the server can be tested by using the command:

ssh <username>@<host>

If it succeeds (that is, if key-based authentication is correctly configured), this command
will establish a secure connection to the server without the need to provide the remote account's
password. If the private key is encrypted, then the user will be asked for the passphrase which
protects it.

After enabling key-based authentication on the server, password authentication can be
disabled to prevent brute-force attacks. This is done by setting the
option PasswordAuthentication to no in /etc/ssh/sshd_config, and restarting the SSH service.
It is also possible to specify the IP addresses which are allowed to use password
authentication, and to block the functionality for all others.
The configuration file mentioned above is also the place where key-based authentication can be
enabled or disabled on the server altogether. The options are
called PubkeyAuthentication and RSAAuthentication. The latter applies only to the legacy SSH
protocol version 1 and is deprecated in current OpenSSH releases, so on modern systems it is
enough to enable PubkeyAuthentication.

Public key authentication on Windows


Using key-based authentication on Windows machines is also a relatively easy task, but it
requires an application that provides such capabilities.

The programs described below can be used to generate the pair of keys. After that, the public key
needs to be manually added on the Linux server, to the file .ssh/authorized_keys mentioned above.
Git Bash
Git Bash is a terminal client which (as one may guess) provides bash functionality on Windows
machines. It is quite similar to Cygwin but faster and less complex. For the purposes of
generating asymmetric keys, it is sufficient.
The web address of the Git project is git-scm.com, and its Windows version can be downloaded
from there.
The pair of keys can be generated using the same commands as on Linux (ssh-keygen, and
so on).
PuTTY
PuTTY is an SSH and telnet client, an open source application available for Windows platforms.
It consists of several components: PuTTY, PSCP, PSFTP, PuTTYtel, Plink, Pageant, PuTTYgen and
pterm. The project home page is www.putty.org.
To generate the keys, one should use the PuTTYgen tool, which can be found in the PuTTY
installation directory. The user should just press the 'Generate' button. The application needs
enough random data to generate the random byte streams used for key generation, so the user will
be asked to make some random mouse movements.
Bitvise
Bitvise is a company that specializes in the development of SSH server and client applications. Its
products are not as freely available as the two tools described above; however, they
can be used free of charge for personal purposes. The website of the project is www.bitvise.com.
To generate the pair of keys, one should use the User Keypair Manager tool, which can be found on
the front page. After clicking the 'Generate New' button, the user will be asked for the type of the
key and for an optional passphrase, just as when generating the keys on Linux.
User Keypair Manager allows exporting the created keys. One has to select the output format
(which is, usually, OpenSSH).

Docker
Docker is an application that allows deploying programs inside sandbox packages called
containers, which are far more efficient than commonly used virtual machines. The Docker
application was created in France in 2013. The official website of the project is www.docker.com.
Docker allows a user to create a sandbox container that contains the application with all the
required dependencies. The container may be later used for running the prepared software
multiple times, or for future software development.

Docker may be used for creating and managing distributed software systems, due to the fact that
the user is able to relatively quickly modify the application by changing the containers that form
its services and processes. By adding new containers to the network, the user can easily improve
performance and effectiveness of the produced system. Docker containers may work on many
physical or virtual machines, and their internal environment is not affected by their hosts
configurations.

Software containers are not purely cryptographic tools. However, due to portability, efficiency and
flexibility of sandboxes, they are a great way for deploying and testing security solutions, and for
performing all types of software operations.
At present, there exist a few more container solutions but Docker is definitely one of the most
popular ones. It is worth mentioning that a lot of popular cloud service providers, like Amazon or
Microsoft, added support for Docker images.

Sandboxes
The overhead caused by adding additional layers by Docker is much smaller than the cost of
running the whole virtual machine. Instead of creating another fully operational operating system,
Docker containers use the low level functionalities of the host, modifying only the necessary
functionalities located in the upper layers of the host system.

At first, Docker was available only for Linux operating systems, but over time support for
Windows was added. Docker uses modern functionalities available in operating systems that
provide various types of resource isolation. For example, when running on Linux, Docker takes
advantage of kernel features like namespaces, cgroups and the aufs file system, and of
virtualization interfaces (libcontainer, libvirt, and LXC).
The application that runs inside Docker is isolated from the host operating system in terms of the
file system, other processes and users, CPU and memory, network interfaces, and input/output
devices.

Docker API
One of the reasons for Docker's popularity is the simplicity of its usage. After creating a Docker
image, the user can carry out the work by using a few simple commands.

A Docker image is a package containing the application to be used, together with
all its dependencies and configuration parameters. Each time an image is started, Docker creates
a new container and initializes it with fresh parameters. Any number of separate containers
based on the same image can be created and run at the same time. Every container receives
its own unique id.
The most frequently used commands are:
o docker pull image_name: downloads the specified image from a Docker
registry.
o docker images: lists all locally available images.
o docker run image_name: creates a new container from the specified image,
starts it, and performs the predefined operations. Each time the run command
is used, a new container is created.
o docker ps: lists all currently running containers.
o docker stop container_id: stops the specified container.
o docker rm container_id: removes the specified (stopped) container.
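
The commands above can be strung together into one container life cycle. The following is a sketch under stated assumptions: nginx is used only as an example image, demo-container is a hypothetical name, and the run helper is not part of Docker. The helper prints each command by default and executes it when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper for this sketch: print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else printf '%s\n' "$*"; fi
}

IMAGE="nginx"           # example image; any image name can be used
NAME="demo-container"   # hypothetical container name

run docker pull "$IMAGE"                    # download the image
run docker images                           # list local images
run docker run -d --name "$NAME" "$IMAGE"   # create and start a container in the background
run docker ps                               # list running containers
run docker stop "$NAME"                     # stop the container
run docker rm "$NAME"                       # remove the stopped container
```

Naming the container with --name is a convenience: docker stop and docker rm accept the name in place of the generated container id.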

Creating Docker images


Public repositories offer a large number of ready-made Docker images. They can be downloaded
with the docker pull command and used as the basis for new containers.
It is also possible to create custom Docker images by modifying existing ones. This is done by
writing a customized Dockerfile.
A Dockerfile usually consists of just a few lines, each containing a keyword and
a corresponding value. The first line should specify the parent image to be modified
(keyword: FROM). The following lines list the files and additional dependencies to be copied into
the image (keyword: ADD), the commands to be run while the image is built (keyword: RUN),
the command to be executed when a container starts (keyword: CMD), the ports to be exposed
(keyword: EXPOSE), and so on.
Once the file is ready, a new Docker image is created with the docker build command, which
takes as parameters a target image name (optionally combined with a user name and a tag) and
the directory containing the Dockerfile.
The created Docker image can be uploaded to the specified registry by the command:
docker push image_name
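
The Dockerfile keywords and the build and push steps described above can be sketched as follows. Everything specific here is an assumption made for illustration: the parent image python:3.12-slim, the file index.html, port 8080, and the image name exampleuser/crypto-demo:1.0. The run helper prints the docker commands by default and executes them when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper for this sketch: print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else printf '%s\n' "$*"; fi
}

workdir=$(mktemp -d)   # temporary build directory, left in place for inspection
echo 'hello from the container' > "$workdir/index.html"

# Write an example Dockerfile; each line is a keyword followed by its value.
cat > "$workdir/Dockerfile" <<'EOF'
# Parent image to be modified.
FROM python:3.12-slim
# Copy a file from the build directory into the image.
ADD index.html /srv/index.html
# Command executed while the image is being built.
RUN chmod 644 /srv/index.html
# Port on which the service listens.
EXPOSE 8080
# Command executed when a container starts.
CMD ["python", "-m", "http.server", "8080", "--directory", "/srv"]
EOF

IMAGE="exampleuser/crypto-demo:1.0"   # hypothetical user/name:tag

run docker build -t "$IMAGE" "$workdir"   # build the image from the Dockerfile
run docker push "$IMAGE"                  # upload it to the registry
```

Pushing normally requires a prior docker login and an account matching the user part of the image name.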

Docker networks
It is possible to run multiple Docker containers that are connected via a network and
able to exchange data. Generally, it is recommended to run many containers, each one of
them providing just one service, and to let them all work together within one network.

A Docker network can be created with the command:
docker network create network_name
After that, the user can specify which network a Docker container should use by adding
the --net parameter (spelled --network in newer Docker versions) to the run command. To make
several Docker containers cooperate, they should be run in the same Docker network.
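
A short sketch of two containers sharing one network is given below. The network name demo-net, the container name web and the nginx and alpine images are assumptions made for illustration, and the run helper prints the commands unless DRY_RUN=0 is set. The sketch uses the --network spelling of the parameter.

```shell
#!/bin/sh
# Hypothetical helper for this sketch: print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else printf '%s\n' "$*"; fi
}

NET="demo-net"   # hypothetical network name

run docker network create "$NET"

# Start a web server attached to the network.
run docker run -d --name web --network "$NET" nginx

# From a second container in the same network, fetch a page from the first one.
# On a user-defined network, containers can reach each other by container name.
run docker run --rm --network "$NET" alpine wget -qO- http://web

# Clean up.
run docker stop web
run docker rm web
run docker network rm "$NET"
```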
An additional tool, called Docker Compose, was created to make communication between
containers even easier. It allows grouping multiple images, as defined in a
docker-compose.yml file, and managing them with two commands: docker-compose up
and docker-compose stop.
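
The following sketch writes a small docker-compose.yml and manages it with the two commands mentioned above. The service names web and cache, the nginx and redis images and the port mapping are assumptions chosen for illustration. The file omits a version key, which assumes a recent Compose release. The run helper prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper for this sketch: print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else printf '%s\n' "$*"; fi
}

workdir=$(mktemp -d)   # temporary project directory, left in place for inspection

# Two example services; Compose places them in a common network by default.
cat > "$workdir/docker-compose.yml" <<'EOF'
services:
  web:
    image: nginx
    ports:
      - "8080:80"
  cache:
    image: redis
EOF

run docker-compose -f "$workdir/docker-compose.yml" up -d   # start all services in the background
run docker-compose -f "$workdir/docker-compose.yml" stop    # stop them again
```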
