Theory of Cryptography............................................................................................................................. 13
A 10-paragraph Introduction to Ciphers ....................................................................................................... 13
A Brief History of Cryptography ............................................................................................................... 13
Cipher Construction ............................................................................................................................... 14
Kerckhoffs's principle ................................................................................................................................. 15
Kerckhoffs's research publications ........................................................................................................... 15
Kerckhoffs's principle today .................................................................................................................... 15
Steganography .......................................................................................................................................... 16
A Brief History of Steganography ............................................................................................................. 16
Modern steganography .......................................................................................................................... 16
Protocols................................................................................................................................................... 17
Arbitration Protocols .............................................................................................................................. 17
Dispute Protocols................................................................................................................................... 17
Self-Enforcing Protocols.......................................................................................................................... 18
Attacking Protocols ................................................................................................................................ 18
TCP/IP Protocols ....................................................................................................................................... 18
Application Layer................................................................................................................................... 21
Transport Layer..................................................................................................................................... 22
Internet Layer........................................................................................................................................ 25
Network Interface Layer ......................................................................................................................... 26
Internet Relay Chat................................................................................................................................. 29
Notation of Numbers.................................................................................................................................. 31
Base conversion algorithm...................................................................................................................... 32
Number of digits .................................................................................................................................... 32
Binary numbers......................................................................................................................................... 32
Notation of numbers in NBS .................................................................................................................... 33
Notation of numbers in C2....................................................................................................................... 35
Time Estimation of Mathematical Operations................................................................................................ 38
Binary addition ...................................................................................................................................... 38
Binary multiplication.............................................................................................................................. 38
Factorial................................................................................................................................................ 39
Polynomial time..................................................................................................................................... 39
Modular Arithmetic (Clock Arithmetic)......................................................................................................... 40
Modular inversion.................................................................................................................................. 40
Calculating inverse numbers in ZN ............................................................................................................ 40
Information-theoretic security of ciphers ...................................................................................................... 41
Perfect Security...................................................................................................................................... 41
Semantic Security................................................................................................................................... 42
Padding Mechanisms ................................................................................................................................. 42
Bit Padding............................................................................................................................................ 42
TBC (Trailing Bit Complement) Padding ................................................................................................... 42
PKCS#5 and PKCS#7 Padding ................................................................................................................. 43
ISO 7816-4 Padding ............................................................................................................................... 43
ISO 10126-2 Padding.............................................................................................................................. 43
ANSI X9.23 Padding................................................................................................................................ 43
Zero Byte Padding.................................................................................................................................. 43
Block Cipher Modes of Operation ............................................................................................................... 44
ECB (Electronic Codebook) Mode ............................................................................................................ 44
CBC (Cipher-Block Chaining) Mode.......................................................................................................... 46
Security of the CBC mode ........................................................................................................................ 47
PCBC (Propagating or Plaintext Cipher-Block Chaining) Mode .................................................................... 48
CFB (Cipher Feedback) Mode .................................................................................................................. 49
OFB (Output Feedback) Mode ................................................................................................................. 50
CTR (Counter) Mode .............................................................................................................................. 50
Security of the CTR mode ........................................................................................................................ 51
Pseudorandom Generator (PRG)................................................................................................................. 52
PRG Implementation.............................................................................................................................. 52
PRG Output Quality ................................................................................................................................ 52
Pseudorandom Functions and Permutations ................................................................................................ 53
Creating PRF from PRG........................................................................................................................... 53
One-way Function...................................................................................................................................... 55
Trapdoor one-way function .................................................................................................................... 55
One-way hash function ........................................................................................................................... 55
Hash function expanding ........................................................................................................................ 55
Hash functions based on block ciphers ..................................................................................................... 56
Message Authentication Code (MAC) ........................................................................................................... 57
MAC algorithms based on PRF................................................................................................................. 57
CBC MAC............................................................................................................................................... 58
NMAC................................................................................................................................................... 59
CMAC ................................................................................................................................................... 60
PMAC ................................................................................................................................................... 60
One-time MAC ....................................................................................................................................... 61
Carter-Wegman MAC ............................................................................................................................. 62
HMAC................................................................................................................................................... 62
Password-Based Encryption (PBE).............................................................................................................. 63
Salt ....................................................................................................................................................... 65
Iteration Count....................................................................................................................................... 65
Software Signing and Authorisation ............................................................................................................. 65
Secure device architecture ...................................................................................................................... 65
Software signing .................................................................................................................................... 66
Secure software flashing ......................................................................................................................... 68
Software signing usage today .................................................................................................................. 69
Index of Coincidence................................................................................................................................... 70
Using IC in cryptography......................................................................................................................... 70
Expected values for some languages......................................................................................................... 71
What Is Encryption? ......................................................................................................................... 71
Encryption Types / Methods ........................................................................................................... 72
Encryption Algorithms ...................................................................................................................... 72
Encryption Standards ...................................................................................................................... 73
File Encryption Overview ................................................................................................................ 74
Disk Encryption Overview ............................................................................................................... 74
Email Encryption Overview ............................................................................................................. 74
Encryption Best Practices ............................................................................................................... 74
All Simple Ciphers..................................................................................................................................... 75
Simple Substitution Ciphers ........................................................................................................................ 75
Usage.................................................................................................................................................... 75
Description............................................................................................................................................ 75
Security of simple substitution ciphers...................................................................................................... 76
Simple Substitution Ciphers: ............................................................................................................ 76
Caesar Cipher ............................................................................................................................................ 76
Usage.................................................................................................................................................... 76
Algorithm.............................................................................................................................................. 77
Security of the Caesar cipher.................................................................................................................... 77
Implementation..................................................................................................................................... 77
ROT13 ...................................................................................................................................................... 78
Usage.................................................................................................................................................... 78
Algorithm.............................................................................................................................................. 78
Security of the ROT13 cipher ................................................................................................................... 78
Implementation..................................................................................................................................... 78
Homophonic Substitution Ciphers ............................................................................................................... 79
Usage.................................................................................................................................................... 79
Description............................................................................................................................................ 79
Homophonic Substitution Cipher:.................................................................................................... 79
Book Cipher .............................................................................................................................................. 79
Usage.................................................................................................................................................... 79
Algorithm.............................................................................................................................................. 79
Security of the book cipher ...................................................................................................................... 80
Polygraphic Substitution Ciphers ................................................................................................................. 80
Usage.................................................................................................................................................... 80
Description............................................................................................................................................ 80
Polygraphic Substitution Ciphers: ................................................................................................... 81
Playfair Cipher ........................................................................................................................................... 81
Usage.................................................................................................................................................... 81
Algorithm.............................................................................................................................................. 81
Security of the Playfair cipher................................................................................................................... 82
Implementation..................................................................................................................................... 83
Two-Square Cipher (Double Playfair)........................................................................................................... 84
Usage.................................................................................................................................................... 84
Algorithm.............................................................................................................................................. 84
Security of the two-square cipher............................................................................................................. 86
Implementation..................................................................................................................................... 86
Four-Square Cipher.................................................................................................................................... 87
Usage.................................................................................................................................................... 87
Algorithm.............................................................................................................................................. 87
Security of the four-square cipher ............................................................................................................ 89
Implementation..................................................................................................................................... 90
Hill Cipher ................................................................................................................................................. 90
Usage.................................................................................................................................................... 91
Algorithm.............................................................................................................................................. 91
Security of the Hill cipher......................................................................................................................... 94
Number of possible keys ......................................................................................................................... 94
Polyalphabetic Substitution Ciphers ............................................................................................................. 95
Usage.................................................................................................................................................... 95
Description............................................................................................................................................ 95
Security of polyalphabetic substitution ciphers .......................................................................................... 95
Polyalphabetic Substitution Ciphers:............................................................................................... 95
Trithemius Cipher...................................................................................................................................... 96
Usage.................................................................................................................................................... 96
Algorithm.............................................................................................................................................. 96
Security of the Trithemius cipher ............................................................................................................. 97
Vigenère Cipher ......................................................................................................................................... 97
Usage.................................................................................................................................................... 97
Algorithm.............................................................................................................................................. 97
Security of the Vigenère cipher................................................................................................................. 98
Beaufort Cipher ......................................................................................................................................... 99
Usage.................................................................................................................................................... 99
Algorithm............................................................................................................................................ 100
Security of the Beaufort Cipher .............................................................................................................. 100
Implementation................................................................................................................................... 101
Running Key Cipher ................................................................................................................................. 101
Usage.................................................................................................................................................. 101
Algorithm............................................................................................................................................ 101
Security of the running key cipher .......................................................................................................... 102
Autokey Cipher........................................................................................................................................ 103
Usage.................................................................................................................................................. 103
Algorithm............................................................................................................................................ 103
Security of the autokey cipher................................................................................................................ 104
Nihilist Cipher.......................................................................................................................................... 105
Usage.................................................................................................................................................. 105
Algorithm............................................................................................................................................ 105
Security of the Nihilist cipher ................................................................................................................. 106
VIC Cipher ............................................................................................................................................... 107
Usage.................................................................................................................................................. 107
Algorithm............................................................................................................................................ 107
Security of VIC...................................................................................................................................... 109
Transposition Ciphers .............................................................................................................................. 109
Usage.................................................................................................................................................. 109
Description.......................................................................................................................................... 109
Transposition Ciphers:.................................................................................................................... 109
Rail Fence Cipher ..................................................................................................................................... 110
Usage.................................................................................................................................................. 110
Algorithm............................................................................................................................................ 110
Security of the Rail Fence Cipher ............................................................................................................ 111
Implementation................................................................................................................................... 111
Route Cipher ........................................................................................................................................... 112
Usage.................................................................................................................................................. 112
Algorithm............................................................................................................................................ 112
Security of the Route Cipher .................................................................................................................. 113
Implementation................................................................................................................................... 113
Columnar Transposition........................................................................................................................... 114
Usage.................................................................................................................................................. 114
Algorithm............................................................................................................................................ 114
Security of the Columnar Transposition.................................................................................................. 115
Implementation................................................................................................................................... 116
Double Columnar Transposition................................................................................................................ 118
Usage.................................................................................................................................................. 118
Algorithm............................................................................................................................................ 118
Security of the Double Columnar Transposition....................................................................................... 119
Myszkowski Transposition ....................................................................................................................... 119
Usage
Algorithm
Security of the Myszkowski Transposition
Cryptographic Rotor Machines
Usage
Description
Cryptographic Rotor Machines:
Hebern Cryptographic Rotor Machine
Usage
Algorithm
Security of the Hebern rotor machine
Images:
Lorenz Cryptographic Rotor Machine
Usage
Algorithm
Security of the Lorenz rotor machine
Image:
Enigma Cryptographic Rotor Machine
Usage
Algorithm
Security of Enigma
Images:
Simple XOR Cipher
Usage
Algorithm
Security of the simple XOR cipher
Implementation
Symmetric Ciphers
Stream Symmetric Ciphers
All Stream Ciphers:
One-Time Pad (OTP)
Usage
Algorithm
Block Diagram of OTP Algorithm
Maths:
Implementation
RC4
Usage
Algorithm
Creating the Table
Encryption and Decryption
Speed of RC4
Security of RC4
Block Diagram of RC4
Maths:
Implementation:
Keystream Initialisation
Keystream Generation
Salsa20
Usage
Algorithm
Block Diagram of Salsa20 Algorithm
Maths:
CSS (Content Scramble System)
Usage
Algorithm
CSS Modes
CSS Keys
CSS System
CSS Protocol
Block Diagram of CSS Algorithm for Audiovisual Data
Block Diagram of CSS Algorithm for Key Bytes
Block Diagram of CSS Additional Encryption of Keys
Block Diagram of CSS LFSR Registers
Maths:
Block Symmetric Ciphers
DES (Data Encryption Standard)
Usage
Algorithm
Security of DES
Block Diagram of DES Algorithm
Block Diagram of DES Feistel Function
Block Diagram of DES Key Schedule
Maths:
RC2
Usage
Algorithm
Encryption
Decryption
Block Diagram of RC2 Encryption
Block Diagram of RC2 Decryption
Maths:
Triple DES (3DES)
Usage
Algorithm
Block Diagram of 3DES Encryption
Block Diagram of 3DES Decryption
Maths:
AES (Advanced Encryption Standard)
Usage
Algorithm
Block Diagram of AES Encryption:
Block Diagram of AES Key Expansion:
Maths:
Blowfish
Usage
Camellia
Usage
Algorithm
Block Diagram of Camellia Encryption for 128-bit Key
Block Diagram of Camellia Encryption for 192 or 256-bit Key
Block Diagram of Camellia Decryption for 128-bit Key
Block Diagram of Camellia Decryption for 192 or 256-bit Key
Block Diagram of Camellia 6-round Block
Block Diagram of Camellia - Creating Helper Variables of Key
Maths:
Implementation
Serpent
Usage
Twofish
Asymmetric Ciphers
Asymmetric Ciphers:
Merkle's Puzzles
Usage
Algorithm
Security of Merkle's Puzzles
Block Diagram of Merkle's Puzzles Protocol
Maths:
Diffie–Hellman Protocol
Usage
Algorithm
Public key encryption
Security of the Diffie-Hellman protocol
Block Diagram of Diffie-Hellman protocol
Maths:
RSA
Usage
Algorithm
Key Generation
Encryption
Decryption
Message Authentication
Security of RSA
Block Diagram of RSA encryption and decryption
Maths:
Attack Models for Cryptanalysis
Theoretical Attack Models:
Known-Plaintext Attack
Known-Plaintext Attack Efficiency
Chosen-Plaintext Attack
Adaptive-Chosen-Plaintext Attack
Ciphertext-Only (Known Ciphertext) Attack
Chosen-Ciphertext Attack
Adaptive-Chosen-Ciphertext Attack
Chosen-Key Attack
Cryptographic Attacks:
Brute-Force Attack
Dictionary Attack
Reverse Brute-Force Attack
Denial-of-Service Attack
DoS Techniques
Targeting Layers
Attacker's Goal
DDoS (Distributed Denial-of-Service) Attack
Degradation-of-Service
Reflected (Spoofed) Attack
Slowloris Attacks
Zombie Computers
Tools
Man-in-the-Middle Attack
Attack on Two-Time Pad
Venona Project
MS-PPTP
802.11 WEP
Key Reinstallation Attack
KRACK
WPA2 Secret Key
Performing the Attack
Protection against KRACK
Conclusion
Frequency Analysis
Frequency Analysis of Substitution Ciphers
Meet-in-the-middle Attack
Meet-in-the-middle Complexity
Meet-in-the-middle 2D
Meet-in-the-middle nD
Replay Attack
Cut-and-Paste Attack
Homograph Attack
Simple homograph attacks
Non-ASCII URLs
Homograph attacks using non-ASCII characters
Security
Cryptographic Tools
Cryptography in Java
Providers
Policy Files
Security Tokens
Static password tokens
Time-synchronized tokens
Asynchronous tokens
Tokens with public and private keys
Key-Based Authentication (Public Key Authentication)
Public key authentication on Linux
Public key authentication on Windows
Docker
Sandboxes
Docker API
Creating Docker images
Docker networks
Theory of Cryptography
A 10-paragraph Introduction to Ciphers
It is difficult to say with certainty, but it seems probable that soon after mastering the art of writing,
people started to feel the need to hide and mask what was written. Over time, as written messages
grew in importance, this need probably became stronger. The first states were created, and more and
more important information had to be sent in writing over long distances: information that had to
remain undisclosed.
Once methods of breaking simple substitution ciphers appeared, an ordinary exchange of one letter
for another was no longer strong enough. New ciphers were invented. They mixed letters more
thoroughly, obscuring messages and distorting typical language characteristics (letter
frequencies, common pairs of characters). Besides substitution of letters, ciphers started to use
transposition of letters.
Cipher Construction
At this stage a full model of building ciphers was developed. To encrypt a long message, one needs
a recipe (an algorithm), which is a list of steps that change plaintext letters into ciphertext
characters. It is also necessary to choose a secret key, which is used together with the selected
algorithm. For example, the algorithm may be the sentence "move each letter right" and the secret
key may be the phrase "by three positions".
This distinction makes it possible to reduce the amount of information that has to be exchanged in
secret between the interested parties. The number of positions by which all letters should be
moved is the most important piece of information in this situation. The fact that the shift
should be performed to the right is not crucial for the message's security and can be transmitted
at the beginning of the message without encryption. Moreover, the distinction allows the same
algorithm to be used many times with different keys, for example when communicating with
different people.
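The split between a public algorithm and a secret key can be sketched in a few lines of Python. This is only an illustration of the "move each letter right" example above, not a secure cipher; the function names and the sample message are made up for the sketch, and it assumes messages are written in the 26-letter Latin alphabet.

```python
import string

ALPHABET = string.ascii_uppercase


def shift_right(text: str, key: int) -> str:
    """Public algorithm: move each letter `key` positions to the right."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            # Wrap around at the end of the alphabet.
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % len(ALPHABET)])
        else:
            out.append(ch)  # spaces and punctuation are left unchanged
    return "".join(out)


def encrypt(plaintext: str, key: int) -> str:
    return shift_right(plaintext, key)


def decrypt(ciphertext: str, key: int) -> str:
    # Decryption is the same algorithm run with the opposite shift.
    return shift_right(ciphertext, -key)


# The secret key here is "by three positions".
ciphertext = encrypt("ATTACK AT DAWN", 3)
print(ciphertext)              # DWWDFN DW GDZQ
print(decrypt(ciphertext, 3))  # ATTACK AT DAWN
```

The same `shift_right` function can be reused with a different key for each correspondent; only the small integer has to be agreed in secret.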
Current encryption algorithms operate on computers or electronic devices. Secret keys consist of
tens of characters and during encryption and decryption millions of operations are performed.
Encryption algorithms are part of larger algorithms, communication protocols and various
standards. People have to deal with them quite often in many areas of their lives.
Finally, let's note two things. First, the main benefit of using ciphers is that secret keys
are much shorter than the amount of transmitted information. This makes it possible to replace
a difficult problem (secretly delivering long messages) with an easier one (secretly
delivering a shorter key, which can then be used many times).
Second, any algorithm may be publicly known or it may be secret. In theory, the latter option
provides additional security. However, the general rule says that every cipher should be reliable
and secure even after the full publication of its algorithm. Thus, one should always assume that an
intruder knows everything about the attacked system, except its secret keys.
Most modern ciphers are based on publicly known algorithms. In practice, there are two arguments
for this approach. First, after some time every algorithm usually becomes known, whether through
coincidence, bribery, betrayal, or analysis of the equipment or software that uses the cipher.
Second, a publicly known algorithm can be tested and analysed by thousands of honest and wise
people. If they find errors, they can publicly disclose the issues, which makes it possible to
improve the cipher and correct the algorithm.
Kerckhoffs's principle
Auguste Kerckhoffs
Kerckhoffs's principle is one of the basic principles of modern cryptography. It was formulated at
the end of the nineteenth century by the Dutch cryptographer Auguste Kerckhoffs. The principle goes
as follows: A cryptographic system should be secure even if everything about the system,
except the key, is public knowledge.
Keeping algorithms secret may act as a significant barrier to cryptanalysis, but only if such
algorithms are used within a strictly limited circle, which protects the algorithm from being
revealed. Most government ciphers are kept secret. Secret commercial encryption algorithms, once
released to the market, have mostly been broken quite swiftly.
Steganography
Steganography is a way of sending hidden data such that nobody (apart from the
sender and intended recipients) knows that a secret message was sent. Unlike cryptography,
it does not rely on ciphers or other forms of encryption.
During World War II, many agents used so-called microdots. A whole A4 document was
reduced to the size of a dot and placed as part of some other, ordinary text.
Steganography also includes all kinds of invisible inks. They have been used since ancient times
around the world. Over time new technologies have been invented and better recipes for invisible
inks have been developed. Mixtures have become more difficult to detect: odorless, invisible
under ultraviolet light, easily soluble in water, etc. In 1999 the CIA refused to disclose the recipes of
invisible inks that had been used during World War I, arguing that they were still important for
national security.
Modern steganography
With the development of technology, possibilities for data hiding have increased. For example,
a technique similar to microdots is used in many modern colour printers. It marks all created
printouts with tiny dots in a way that is invisible to users.
Currently, a very popular kind of steganography is hiding information in digital pictures. There is
some redundancy in the way images are stored. All pixels in digital pictures are coded using a specified
number of bits, and it is usually impossible to notice changes to the least significant bits. The
least significant bits can therefore be used for storing secret information. A similar situation occurs when
storing digital sound.
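The idea of hiding data in the least significant bits can be sketched in a few lines of Python. This is only an illustration: the function names `embed_lsb` and `extract_lsb` are hypothetical, and the cover is assumed to be a plain sequence of 8-bit samples (for example raw pixel values) rather than a real image file format.

```python
def embed_lsb(cover: bytes, secret: bytes) -> bytes:
    """Hide `secret` in the least significant bits of `cover` (one bit per byte)."""
    # Split the secret into bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the secret")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(stego)


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Recover `length` hidden bytes from the least significant bits of `stego`."""
    out = bytearray()
    for i in range(length):
        value = 0
        for sample in stego[8 * i: 8 * i + 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(100, 180))      # stand-in for 80 pixel samples
stego = embed_lsb(cover, b"hi")     # every sample changes by at most 1
```

Each cover sample changes by at most one unit, which is why the modification is hard to notice by eye.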
Protocols
A protocol is a set of actions that two or more entities need to perform in order to accomplish
a task. All users take the actions step by step and successfully carry out the agreed procedure to
the end.
Computers and other electronic devices use communications protocols to establish a connection
and exchange data. Nowadays there are many protocols and communications standards which
are recognized globally. Thanks to that, various different devices located in different places in the
world may communicate with each other quite easily.
Cryptographic protocols are protocols that use cryptography. They have to guarantee that no
entity will be able to gain more knowledge or access more privileges than the protocol
design allows. Cryptographic protocols include various types of encryption, message
authentication or key agreement algorithms.
Arbitration Protocols
An additional entity, apart from communicating sides, takes part in arbitration protocols. The new
entity is called an arbiter, and by definition, the arbiter is impartial, not interested in
the communication and trusted by all the other sides. He acts like a bank officer, mediating in
financial services.
Arbitration protocols simplify a lot of tasks which are performed by computers. The arbiter makes
it easier to resolve disputes and exchange secret data safely. On the other hand, using arbitration
protocols may sometimes be inconvenient:
o There is a need to find an arbiter, which may be located far from the other
sides, and which would be trusted by all the other entities.
o The servers serving as arbiters must be financed and maintained.
o An arbiter is an obvious bottleneck of the transaction. A damaged, attacked or
faulty arbiter is a serious problem for the communicating parties.
Most modern systems for transferring money, like credit cards and PayPal, require trusted
intermediaries, like banks and credit card companies, to facilitate the transfer.
Dispute Protocols
A dispute protocol is a kind of arbitration protocol, in which the arbiter is involved only when it is
really required. If there are not any problems, then the communicating parties perform the whole
task and exchange information without the participation of the arbiter. On the other hand, if a
problem occurs - an error, an unexpected circumstance or fraud - an arbiter is called for help.
The arbiter has information and power to fix the situation.
Dispute protocols are cheaper and easier to use than arbitration protocols. Usually the mere
existence of the arbiter prevents fraud. Because the arbiter does not have to be involved in
most communications, the major disadvantage of arbitration protocols is overcome.
Self-Enforcing Protocols
In self-enforcing protocols the whole communication doesn't require trusted third parties.
The algorithms are designed in a way that assures that any fraud attempt made by one side is
immediately visible to the others and they are able to prevent it, without suffering any losses.
Undoubtedly, self-enforcing protocols have the largest number of advantages and they eliminate
the need to involve additional entities. Unfortunately, not all operations can be carried out by
using protocols of this kind.
Attacking Protocols
In general, there are two types of attacks on protocols: passive and active.
o Passive attacks: the intruder may eavesdrop on the communication but is not
able to interfere with the exchange of messages.
o Active attacks: the attacker tries to change the protocol by sending new
messages, modifying or removing the existing ones, or even altering the whole
communication channel.
The only goal of a passive attack is to overhear the communication. On the other hand,
the goals of an active attack may vary, and the effects are usually much more dangerous for
the victims. In the most complex active attacks many intruders take part, attacking various points
of the targeted system.
TCP/IP Protocols
TCP/IP (Transmission Control Protocol/Internet Protocol) is a set of protocols that are used for
data transmission over computer networks. The TCP/IP model covers the main functionalities
of the theoretical OSI model. The image below presents the corresponding layers of both TCP/IP
and OSI models.
Every message that is sent by an application has to pass through all the TCP/IP layers, from
the application layer to the lowest network interface layer. Then, it is transmitted over the network to
another computer. Finally, it moves all the way up to the application layer and then to the target
application.
While data is passed down from the application to the network, each layer adds its own header
to every message. Each header is then handled by a corresponding layer on the receiving
computer (where, as we said earlier, messages are passed from the network up to the application
layer and beyond). Both the content and the size of each header depend on the protocol that has
been used in the layer.
There are a lot of application layer protocols that use TCP/IP data transmission. Some of the
popular protocols are:
An internet socket contains an IP address, a port number and a transport layer protocol name.
A unique combination of those three values determines the process that should deal with
the message.
The port number can be assigned automatically by the operating system, manually by the user or
is set as a default for some popular applications. The port number is a 16-bit integer (0 - 65535).
Some popular application layer protocols use by default predefined and well-known port numbers.
For example, HTTP uses port 80, HTTPS uses port 443, SMTP port 25, Telnet port 23, and FTP
uses two ports: 20 for data transmission and 21 for transmission control. The list of such default
port numbers is managed by the Internet Assigned Numbers Authority organization.
The process of associating an application to a socket is called binding. After successful binding
the application doesn't need to care about network management because all further operations
are handled by protocols of lower layers of TCP/IP.
In some operating systems some special privileges are required for applications to bind to port
numbers less than 1024. Therefore, a lot of processes prefer using higher port numbers allocated
for short term use. Such ports are called ephemeral ports.
A user can specify a port number in a URL. For instance, the following URL forces the browser to
try to reach the website using port 8080 instead of the default HTTP port 80:
http://www.example.com:8080/path
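How a client might decide which port to use can be sketched with Python's standard `urllib.parse` module. The helper `effective_port` and the small `DEFAULT_PORTS` table are assumptions made for this illustration; they are not part of any standard API.

```python
from urllib.parse import urlsplit

# Well-known default ports mentioned in the text (illustrative subset).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def effective_port(url: str) -> int:
    """Return the port given in the URL, or the default port of its scheme."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port           # explicit port, e.g. ":8080"
    return DEFAULT_PORTS[parts.scheme]


explicit = effective_port("http://www.example.com:8080/path")
implicit = effective_port("http://www.example.com/path")
```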
Transport Layer
The transport layer receives messages from the application layer. It divides them into smaller
packets, adds a header to each of them, and sends the packets down to the internet layer. The header contains
several pieces of control information, in particular the source and target port numbers.
Port numbers are used by the transport layer while handling incoming packets from the internet
layer (that is, when receiving data). Thanks to the port number, it is possible to determine
which application layer protocol should receive the incoming message.
For example, a packet with the target port number equal to 25 will be delivered
to the protocol connected to this port, usually SMTP. In this case, SMTP will provide data to
the email application that requested it.
TCP
The most common protocol used in the transport layer is TCP (Transmission Control Protocol).
This is a connection-oriented protocol. TCP offers reliable, peer-acknowledged, ordered,
session-based connectivity between two hosts.
All the features mentioned above are provided by the TCP layer itself. This means that it may
operate on top of other, unreliable protocols in the lower layers without affecting the
communication from the application layer perspective.
TCP Reliability
When sending data, TCP makes sure that the data has been delivered to the recipient. The receiver
checks whether the received packet remained intact during transmission (by verifying the checksum of the
data) and, if so, confirms it by sending an acknowledgement to the sender. If
the sender doesn't receive the acknowledgement for a message within some time period, it will
resend the lost packet.
After several unsuccessful attempts, TCP assumes that the receiver is unreachable and informs
the application layer that the transmission has failed.
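The checksum test mentioned above can be illustrated with the 16-bit one's complement sum that TCP and UDP use (the Internet checksum described in RFC 1071). This is a simplified sketch: a real TCP checksum also covers the TCP header and a pseudo-header with IP addresses, and the helper names below are hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                           # pad to a whole number of words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # big-endian 16-bit word
    while total >> 16:                            # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def is_intact(data: bytes, checksum: int) -> bool:
    """Receiver-side check: recompute the checksum and compare."""
    return internet_checksum(data) == checksum


payload = b"hello"
checksum = internet_checksum(payload)
```

A receiver that finds a mismatch simply does not acknowledge the packet, so the sender eventually retransmits it.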
TCP Ordering
The TCP header contains a field with the sequence number. The sequence number
identifies the position of the first data byte of the message in the whole stream, so it grows by the number of bytes sent. When receiving data, TCP rearranges incoming
packets and puts them in the right order. Thanks to that, the application layer doesn't need to care
about the ordering of network packets.
TCP Header
The TCP header consists of 20 or more bytes. The size depends on whether
the optional options field is used. The maximum size of the options field is 40 bytes, thus the
maximum size of the whole header is 60 bytes.
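As a rough illustration, the fixed 20-byte part of a TCP header can be unpacked with Python's `struct` module. The header length is derived from the 4-bit data offset field, which counts 32-bit words (5 to 15, giving 20 to 60 bytes). The function name and the returned dictionary keys are assumptions made for this sketch.

```python
import struct


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4     # data offset is in 32-bit words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "options_len": header_len - 20,       # 0..40 bytes of options
        "syn": bool(offset_flags & 0x02),
        "ack_flag": bool(offset_flags & 0x10),
    }


# A hand-made SYN header without options: data offset 5, SYN flag set.
sample = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(sample)
```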
TCP Session
Two applications need to establish a session to exchange data. TCP requires three messages to
create the session:
1. SYN - the first application (the client) sends a synchronize packet to the host.
The message contains a random sequence number, which has been set by the client.
2. SYN-ACK - the host responds to the client. It increases the sequence number from
the client by one and sends it back in the message as an acknowledgement number.
Also, the response message contains another sequence number chosen randomly by
the host.
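3. ACK - the client sends an acknowledgement message to the host. The message
contains the host's sequence number increased by one (as the acknowledgement number)
and the client's own initial sequence number increased by one (as the sequence number).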
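The numbers exchanged in the three messages can be traced with a small simulation. It only models the sequence and acknowledgement numbers; no real network traffic is produced, and the dictionary layout is an assumption chosen for readability.

```python
import random

MODULUS = 2 ** 32   # sequence numbers are 32-bit values and wrap around


def three_way_handshake(rng: random.Random) -> list:
    """Return the SYN, SYN-ACK and ACK messages of a simulated handshake."""
    client_isn = rng.randrange(MODULUS)   # random initial number set by the client
    server_isn = rng.randrange(MODULUS)   # random initial number set by the host
    syn = {"flags": "SYN", "seq": client_isn}
    syn_ack = {"flags": "SYN-ACK", "seq": server_isn,
               "ack": (syn["seq"] + 1) % MODULUS}
    ack = {"flags": "ACK", "seq": syn_ack["ack"],
           "ack": (syn_ack["seq"] + 1) % MODULUS}
    return [syn, syn_ack, ack]


messages = three_way_handshake(random.Random(1))
```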
When the transmission is completed, the session should be terminated. Either side can terminate
the session. The other side is supposed to acknowledge that.
TCP Usage
TCP is widely used by protocols and applications that require high reliability. It is not as fast
as UDP but, if configured properly, it still provides quite good speed together with high quality of
transmitted data.
There are a lot of application layer protocols that are mostly used together with TCP.
Some of the most popular ones are:
o HTTP, HTTPS
o FTP
o SMTP
o Telnet
UDP
The second popular protocol that is used in the transport layer is UDP (User Datagram Protocol),
a simpler, connectionless protocol. One program just sends
some packets to another, without creating any kind of relation between them.
Due to its simplicity UDP is faster than TCP. On the other hand, it doesn't provide the reliability
of TCP. There is no guarantee that the messages will reach the receiver, and UDP doesn't guarantee that
packets are delivered in the same order in which they were sent. It is up to the application to check that the received
messages are complete and to put the data in the correct order.
UDP Header
The UDP header is 8 bytes long. It is much shorter and simpler than the corresponding TCP
header.
There are a lot of application layer protocols that use UDP, for example:
o DNS
o DHCP
o TFTP
o SNMP
o RIP
o VoIP
DCCP
Datagram Congestion Control Protocol is a protocol that allows applications to use congestion
control mechanisms and to set up and tear down connections reliably. It doesn't provide reliable in-order
delivery of data.
DCCP is used by applications which work with quickly changing data (streaming media, online
games, VoIP). In such situations it is often better to use a new piece of available data than to ask for
retransmission of an old, damaged packet.
RSVP
Resource Reservation Protocol allows for reservation of resources across a network. It is mainly
used by routers and hosts to assure delivering specific levels of quality of service (QoS) for clients.
RSVP can reserve bandwidth for one-to-one and one-to-many transmissions. The protocol is
initiated by the client (receiver), which asks the router to reserve some resources.
SCTP
Stream Control Transmission Protocol allows sending multiple streams of data through one
connection. It ensures reliable and in-order transmission with congestion control, similarly to TCP, but
allows sending related data streams together in the same messages.
In general, SCTP is quite a powerful protocol. However, due to the poor support in routers and
operating systems, at present it is not widely used.
Internet Layer
The internet layer adds another header to the messages received from the transport layer.
The most important fields in the new header are the IP addresses of both the source and target
machines. The IP address is a logical number that allows the device to be found in the network.
Each network device also has another special number assigned to it, called a MAC address. This
is a number assigned by the manufacturer and stored in the hardware, which is intended to identify
the device uniquely throughout the world (although many operating systems allow it to be overridden
in software). However, locating a device based on its MAC address in a global network
is practically impossible because this number is strictly hardware related and it doesn't tell us
anything about the position of the device. IP addresses, on the other hand, reflect the structure of
the network, so routers can use them to deliver messages to any computer. Host names are translated
to IP addresses by DNS servers: every computer can query a DNS server and obtain
the IP address of the target device.
In general, messages travel through several routers before reaching the target server (identified
by the target IP address). To find out the route between the computer and the server, one could
use the Windows command:
tracert www.google.co.uk
There are a few protocols that work in the internet layer. The most important, and the most
popular, of them is IP (Internet Protocol). It would be a good idea to name some other internet
layer protocols:
IP doesn't provide any acknowledgement mechanism, which means that it is unreliable. It is up to TCP, operating
in the transport layer, to make sure that all the requested data has been delivered. Therefore,
a TCP/IP connection is reliable.
IP Datagrams
The data packets are taken from the transport layer and divided into datagrams. Every datagram
consists of the IP header and the bytes received from the transport layer. The maximum size of
a datagram depends on the IP version: $2^{16}-1$ bytes for IPv4, and $2^{32}-1$ bytes for IPv6
(when the optional jumbogram extension is used). If the transport
layer packet is too large, it will be divided into several smaller datagrams.
Usually the data is divided into even smaller datagrams. This is caused by the limited capabilities of
physical networks. For example, the maximum size of the data carried in an Ethernet frame is 1 500 bytes, so
usually the datagrams created in the internet layer on top of Ethernet will be slightly smaller than
1 500 bytes (to allow the lower layers to add additional headers). The maximum datagram size
for a network is called the MTU (Maximum Transmission Unit).
IP allows a datagram to be divided into smaller datagrams (fragments) if it has to go through
a different kind of network with a smaller MTU. The fragments are later re-assembled into
the original datagram (in IPv4 this is done by the destination host). There is a special field in
the IP header to allow such operations (called Fragment Offset).
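The arithmetic of fragmentation can be sketched as follows. In IPv4 the Fragment Offset field counts 8-byte units, so every fragment except the last must carry a multiple of 8 data bytes. The function `fragment` and its return format are hypothetical; a 20-byte IP header without options is assumed.

```python
def fragment(payload_len: int, mtu: int, ip_header_len: int = 20) -> list:
    """Split a payload into fragments that fit the MTU.

    Returns a list of (fragment_offset, data_length, more_fragments) tuples,
    where fragment_offset is expressed in 8-byte units as in the IPv4 header.
    """
    max_data = (mtu - ip_header_len) // 8 * 8   # largest multiple of 8 that fits
    if max_data <= 0:
        raise ValueError("MTU too small")
    fragments = []
    position = 0
    while position < payload_len:
        size = min(max_data, payload_len - position)
        more = position + size < payload_len    # "more fragments" flag
        fragments.append((position // 8, size, more))
        position += size
    return fragments


# 4 000 bytes of data crossing a network with an MTU of 1 500 bytes.
parts = fragment(4000, 1500)
```

With these numbers the payload is split into pieces of 1 480, 1 480 and 1 040 bytes, at offsets 0, 185 and 370.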
The layer simply adds the information about the protocol used in the internet layer, and about
the protocol that is intended to receive the message. This allows the LLC layer on the target
computer to deliver datagrams correctly.
The media access control layer also adds 4 CRC bytes, which are used for detecting transmission errors.
The layer is defined by the IEEE 802.3 standard if a cabled network is being used. For wireless
networks IEEE 802.11 is used.
Physical Layer
The physical layer is responsible for converting messages into electrical signals or electromagnetic
waves (depending on the type of the network) and for transmitting them over the physical network
between communicating machines.
It is described by the same specifications as the MAC layer, IEEE 802.3 and IEEE 802.11.
Before establishing the connection, both sides negotiate the encryption parameters during the
so-called TLS handshake protocol. They must agree on which encryption algorithm will be used and
create the proper cryptographic keys. The encryption used later for securing all messages is
symmetric, and usually the negotiated symmetric key is valid only for one session.
The process of establishing the shared secret key is secure: an eavesdropper cannot obtain the key
even after intercepting all the messages exchanged between the client and the server. What is
more, the handshake protocol guarantees that the negotiated secret key was not altered by an intruder
during transmission, that is, that the communication is reliable.
The whole process of establishing the secure connection is protected against man-in-the-middle
attacks, provided that at least one side (usually the server) is authenticated.
Authentication
Both sides may authenticate themselves before creating the session. The authentication is
performed by using the digital certificates signed by trusted third parties and asymmetric
encryption with public and private keys.
The authentication step is optional and one or both sides may not require it. Usually, for
convenience reasons, only the server authenticates itself.
The client may authenticate the other side by using the other side's public key (available from the
certificate issued by a trusted Certificate Authority) to decrypt some information encrypted
earlier by the other side with the corresponding private key. If the information can be properly
decrypted, then the client may assume that the other side holds the private key and can be trusted.
Message Integrity
The whole communication protected by TLS/SSL is reliable and the protocol itself checks the
integrity of all received messages.
The integrity checks are based on message authentication codes attached to all messages. They
are supposed to protect the messages against accidental damage and deliberate alteration.
Similarly to other TLS/SSL functionalities, message integrity may also be provided by various
different cryptographic algorithms, depending on the client and server capabilities.
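A message authentication code can be demonstrated with Python's standard `hmac` module. This sketch uses HMAC-SHA-256 as one possible algorithm; it does not reproduce the exact record format of TLS, and the helper names `make_tag` and `verify_tag` are assumptions.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute a message authentication code for `message` under `key`."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Check the tag in constant time; False means damage or tampering."""
    return hmac.compare_digest(make_tag(key, message), received_tag)


session_key = b"shared secret established in the handshake"
message = b"transfer 10 EUR to Alice"
tag = make_tag(session_key, message)
```

Only a party that knows the shared key can produce a valid tag, so a modified message is rejected by the receiver.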
Handshake Protocol
The handshake procedure begins just after the sides agreed to use TLS. The client and the server
choose all the parameters of the secure connection they are going to create.
1. First, the client sends a list of supported ciphers and hash functions.
2. Then, the server selects the ones that it supports as well, and notifies the client of
the decision.
3. Usually (but optionally), the server identifies itself by presenting a valid digital
certificate, which contains several pieces of information, such as the name of the server and its public
key. The public key is used by the client to check the server's validity.
4. The client may use the server's public key to encrypt a random number and send it to the
server, thus establishing a secret that only the server will be able to decrypt.
5. Alternatively, an even better approach is to use a key exchange algorithm that provides
perfect forward secrecy. There exist two such asymmetric key exchange
algorithms, Diffie-Hellman and Elliptic curve Diffie-Hellman, both used with ephemeral
(per-session) keys. Perfect forward secrecy means that the secret
symmetric keys established for each session will remain secure even if the long-term
public and private keys used during the handshake protocol are compromised.
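The key agreement idea behind Diffie-Hellman can be shown with modular exponentiation. The parameters below (the Mersenne prime $2^{127}-1$ and the base 3) are toy values chosen only for illustration; real TLS deployments use standardised groups or elliptic curves and authenticate the exchanged values.

```python
import secrets

P = 2 ** 127 - 1   # a Mersenne prime; toy-sized, NOT a secure choice
G = 3              # public base, assumed for this sketch


def keypair() -> tuple:
    """Generate a fresh (ephemeral) private value and the matching public value."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def shared_secret(own_private: int, peer_public: int) -> int:
    """Combine our private value with the peer's public value."""
    return pow(peer_public, own_private, P)


client_private, client_public = keypair()
server_private, server_public = keypair()
client_view = shared_secret(client_private, server_public)
server_view = shared_secret(server_private, client_public)
```

Both sides obtain the same value although only the public values travelled over the network. Because fresh private values are generated for every session and then discarded, a later leak of long-term keys does not reveal old session keys.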
If any of the steps described above fails (on either side), the connection is cancelled. The second
phase of communication, the record protocol, will not be started.
Because negotiating a session with an asymmetric encryption algorithm is a rather
expensive procedure, either side may try to resume
a previously used session instead of creating a new symmetric key. If the other side accepts that, they will use the secret keys created
for the previous session.
Security of TLS/SSL
The secure TLS/SSL connection may be configured to use various underlying symmetric and
asymmetric encryption algorithms. The strength of the protection depends strongly on
the selected cipher and its implementation.
The first two SSL protocol versions are generally considered to be unsafe, whilst the third SSL
version is comparable to TLS 1.0. By contrast, the newer TLS versions are much more
refined and provide much better security. Although there exist several attacks targeting various
TLS algorithm implementations, TLS is considered to be a strong and efficient tool for providing
security when communicating over computer networks.
It is recommended to create secret keys using algorithms which provide perfect forward secrecy.
That guarantees that the compromise of long-term private keys (that belong, for example, to trusted Certificate
Authorities) will not compromise the privacy of all communications protected by the derived
session keys. Certificate Authority organisations were recently targeted by many attacks which
led to the disclosure of many long-term private keys and compromised many digital certificates.
The IRC (Internet Relay Chat) protocol was created in 1988 by a Finnish software engineer, Jarkko Oikarinen. It was
designed mainly for group communication via discussion forums called channels, but
the protocol also allows users to send and receive private messages or data.
IRC Overview
IRC works in a client/server model. At first, every user has to install a client application. Using
the client application, it is possible to send text messages to the IRC server, which transfers
the messages to other clients. The servers are connected together and form larger groups, so they
can exchange messages between themselves.
There are several IRC services that provide some additional functionalities, like bots (sending
messages generated by computer programs to channels) or bouncers (daemon processes that
provide IRC communication to offline users or to computers without any IRC client installed).
The IRC specification is covered by several documents: RFC 1459 and a couple of later ones,
RFC 2811, RFC 2812, and RFC 2813. However, most client and server applications don't follow
the design strictly.
IRC was originally used only for sending text messages. Each character was encoded
using 8 bits, without specifying the type of encoding. This could cause problems when conversing
users were using different encodings. At present, UTF-8 is the most popular encoding used in IRC
messages and it is supported by most IRC applications.
IRC users communicate with servers and other users by sending simple text commands. Every
command specifies the recipient (a server, a channel or another user) and additional
parameters like the text of the message.
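The structure of such a text command can be illustrated with a small parser. It follows the general message layout from RFC 1459 (an optional prefix starting with a colon, a command, parameters, and a trailing parameter introduced by " :"), but it is a simplified sketch: the function name and the returned dictionary are assumptions, and edge cases of the grammar are ignored.

```python
def parse_irc_line(line: str) -> dict:
    """Split one IRC message into prefix, command and parameters."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # The prefix names the origin of the message, e.g. nick!user@host.
        prefix, line = line[1:].split(" ", 1)
    if " :" in line:
        # Everything after " :" is one trailing parameter that may contain spaces.
        head, trailing = line.split(" :", 1)
        params = head.split() + [trailing]
    else:
        params = line.split()
    command = params.pop(0)
    return {"prefix": prefix, "command": command, "params": params}


message = parse_irc_line(":alice!a@host PRIVMSG #crypto :hello there\r\n")
```

Here the recipient is the channel `#crypto` and the last parameter is the text of the message.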
Security of IRC
The original design of IRC is insecure. Most servers don't require users to register an account
and usually people can choose nicknames just before connecting to the channels.
Every change of the network structure is usually problematic and it may cause various
issues (for example, because several users may have the same nickname, not necessarily with
the same privileges). Also, it is assumed that servers trust one another when exchanging
messages. A server that behaves incorrectly can cause problems for the whole network.
In the early 2000s some IRC networks were often attacked using DDoS and other more
sophisticated attacks. This caused many users to migrate to different IRC networks or to abandon
that way of communication completely.
The limitations of the protocol are well known, and therefore improvements are often introduced
in modern implementations. A lot of IRC servers have already started to support
secure SSL/TLS connections.
IRC Today
IRC was the most popular in 2003. It is estimated that it was used by over one million people on
hundreds of thousands of channels. Since then, the number of users has decreased to less than
half a million in 2014. The reasons why people use IRC applications have also changed.
At the beginning, the IRC networks were used for social networking; however, websites like
Facebook or Twitter have now taken over these functions. People used to use IRC networks to broadcast
unofficial or illegal news and information. At present, there are much better ways to do it
(like Tor). IRC channels were used to exchange information about pirated software and warez.
Nowadays, people prefer to look for such information in other places, like P2P networks.
Due to commercialization of the Internet, a lot of companies have decided to invest money in their
own products and to create their own ways of communication instead of using publicly available
IRC. On the other hand, there are several IRC-based commercial or open source projects that
are widely used by development teams and various firms and organizations for internal and
external communication.
IRC is a very old protocol and it has been in use for many years. The way of using the protocol
has changed over that time. One may predict that IRC technology will still be used in various
applications and services, at least over the next several years.
Notation of Numbers
Notation of numbers is a way in which all numbers are represented, by using a limited set of
different digits. Notation that is used currently for representing numbers is called positional
notation (or place-value notation), in contrast to some ancient notations, such as Roman
numerals.
Definition Positional notation is a method of representing numbers in such a way that each digit
in a sequence is multiplied by the appropriate power (with the exponent equal to the digit's position,
counted from the right, minus one) of a number which is considered to be the base of the positional numeral system.
Number of digits
An integer n which satisfies the inequality $b^{k-1} \le n < b^{k}$ has k digits in a numeral system
of base b. This relationship can be represented by the logarithm:
$\text{number of digits} = \lfloor \log_b n \rfloor + 1$
where:
the symbol $\lfloor\ \rfloor$ means the integer part of the number.
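Both the definition of positional notation and the digit-count relationship can be checked with a short sketch. The function names are hypothetical. The digit count is computed by repeated integer division rather than with a floating-point logarithm, because rounding errors in the logarithm can give a wrong result for exact powers of the base.

```python
def from_digits(digits: list, base: int) -> int:
    """Value of a digit sequence (most significant digit first) in a given base."""
    value = 0
    for digit in digits:
        if not 0 <= digit < base:
            raise ValueError("digit out of range for this base")
        value = value * base + digit    # shifts earlier digits one position left
    return value


def digit_count(n: int, base: int) -> int:
    """Number of digits of a positive integer n, i.e. floor(log_base(n)) + 1."""
    if n <= 0 or base < 2:
        raise ValueError("n must be positive and base at least 2")
    count = 0
    while n:
        n //= base
        count += 1
    return count


eleven = from_digits([1, 0, 1, 1], 2)   # 1*8 + 0*4 + 1*2 + 1*1
bits_needed = digit_count(255, 2)
```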
Binary numbers
Binary numbers are stored as sequences of zeros and ones. At present there are a couple of
binary numeral systems in use. The most popular are:
o the Natural Binary System - in short NBS, which can store only
nonnegative numbers,
o the Two's Complement System - in short C2, which can store both
positive and negative numbers.
Definition The decimal value of an (n+m)-digit binary number stored using the natural binary
system as:
$a_{n-1} \dots a_2 a_1 a_0 , a_{-1} a_{-2} \dots a_{-m}$
is equal to:
$2^{n-1}a_{n-1} + \dots + 2^{2}a_2 + 2^{1}a_1 + 2^{0}a_0 + 2^{-1}a_{-1} + 2^{-2}a_{-2} + \dots + 2^{-m}a_{-m}$
where each $a_i$ is a binary digit (0 or 1).
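Addition (61 + 6 = 67), with the carries written above the numbers:
  1111
   111101
+  000110
  1000011
Subtraction (19 - 6 = 13):
  10011
-   110
   1101
Multiplication (22 x 5 = 110):
    10110
x     101
    10110
   00000
+ 10110
  1101110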
One should write the two numbers under a horizontal line, the dividend over the divisor.
The divisor should be moved left, until its most significant non-zero bit is located right under
the most significant non-zero bit of the dividend.
Then the dividend should be compared to the divisor. If the dividend (the number written higher)
is not smaller than the divisor, one ought to subtract the divisor from the dividend and write 1 above
the horizontal line in the last position. During the next comparison, one should use the result of
the subtraction instead of the dividend.
If the divisor is bigger than the dividend, then no subtraction is performed and one ought
to write 0 above the horizontal line in the last position. In both cases one should then move the divisor (the number written lower)
to the right by one position and repeat the comparison.
If the divisor can't be moved to the right any more, the division ends.
The current result of subtraction is the remainder of the division.
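For example, after dividing 11 by 3 one will receive the result 3 and the remainder 2:
   011      (quotient)
  1011      (dividend)
  11        (divisor moved left; 1100 is bigger than 1011, so 0 is written)
  1011
-  11       (divisor moved right by one position, 1 is written)
   101
-   11      (divisor moved right again, 1 is written)
    10      (remainder)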
After obtaining the remainder, it is possible to continue dividing and receive fractional digits (as
during dividing of decimal numbers).
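The compare, subtract and shift procedure for nonnegative integers can be sketched as follows. The function name `divide_binary` is an assumption; it works on Python integers and shifts the divisor with the `<<` operator instead of moving written digits.

```python
def divide_binary(dividend: int, divisor: int) -> tuple:
    """Long division in base 2; returns (quotient, remainder)."""
    if divisor <= 0 or dividend < 0:
        raise ValueError("expected dividend >= 0 and divisor > 0")
    # Move the divisor left until its top bit is under the top bit of the dividend.
    shift = max(dividend.bit_length() - divisor.bit_length(), 0)
    quotient = 0
    remainder = dividend
    for s in range(shift, -1, -1):
        quotient <<= 1                   # make room for the next quotient bit
        if remainder >= (divisor << s):
            remainder -= divisor << s    # subtract and write 1
            quotient |= 1
        # otherwise write 0; the divisor moves right in the next iteration
    return quotient, remainder


result = divide_binary(0b1011, 0b11)     # 11 divided by 3
```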
Notation of numbers in C2
A binary n-digit number stored in C2 can have positive and negative values ranging from $-2^{n-1}$
to $2^{n-1}-1$.
Definition The decimal value of an n-digit binary number stored using the C2 system as:
$a_{n-1} \dots a_2 a_1 a_0$
is equal to:
$-2^{n-1}a_{n-1} + 2^{n-2}a_{n-2} + \dots + 2^{2}a_2 + 2^{1}a_1 + 2^{0}a_0$
where each $a_i$ is a binary digit (0 or 1).
For example, the 4-digit binary number -7 stored in C2 ($1001_2$) can be extended to 5 bits by
copying the sign bit ($11001_2$):
$1001_2 = -2^3 + 2^0 = -8 + 1 = -7$
$11001_2 = -2^4 + 2^3 + 2^0 = -16 + 8 + 1 = -7$
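The value formula and the sign extension can be verified with a short sketch operating on bit strings. The function names `c2_value` and `sign_extend` are hypothetical.

```python
def c2_value(bits: str) -> int:
    """Decimal value of a bit string interpreted in C2 (two's complement)."""
    n = len(bits)
    value = -int(bits[0]) * 2 ** (n - 1)            # the top bit has negative weight
    for i, bit in enumerate(bits[1:], start=1):
        value += int(bit) * 2 ** (n - 1 - i)        # remaining bits weigh as in NBS
    return value


def sign_extend(bits: str, width: int) -> str:
    """Widen a C2 number by copying its sign bit to the left."""
    if width < len(bits):
        raise ValueError("cannot extend to a smaller width")
    return bits[0] * (width - len(bits)) + bits


narrow = c2_value("1001")
wide = c2_value(sign_extend("1001", 5))
```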
The inverse of a number in C2
Computing the inverse of a number in C2 is performed by negating all the bits in the number
(replacing zeros by ones, and ones by zeros) and then adding 1 to the result.
For example, to calculate the representation of the number 11 in C2, knowing the binary representation of
-11 in C2 (it is equal to $110101_2$), one should first negate all the bits:
~110101 = 001010,
and then add 1 to the result:
  001010
+ 000001
  001011
$2^3 + 2^1 + 2^0 = 8 + 2 + 1 = 11$
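The same two steps (negate all bits, then add 1) can be written as a small function on fixed-width bit strings. The name `c2_negate` is an assumption made for this sketch.

```python
def c2_negate(bits: str) -> str:
    """Inverse of a C2 number: flip every bit, add 1, keep the same width."""
    width = len(bits)
    mask = (1 << width) - 1
    inverted = int(bits, 2) ^ mask        # replace zeros by ones and ones by zeros
    result = (inverted + 1) & mask        # add 1, ignoring any carry out
    return format(result, f"0{width}b")


plus_eleven = c2_negate("110101")         # -11 stored on 6 bits
```

Note that the most negative value of a given width (for example 1000 on 4 bits) is mapped to itself, because its inverse does not fit in the same number of bits.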
Binary addition in C2
Binary addition of two numbers in C2 may be presented as simple columnar addition, just like
in NBS. For example, after adding 6 to -11 in C2, one will receive:
    1      (carry)
  110101
+ 000110
  111011
Binary subtraction in C2
Binary subtraction of two numbers in C2 may be presented as simple columnar subtraction, just
like in NBS. Each borrowed unit has (like in NBS) a value equal to the base of
the system, so in this case 10 (or 2 in the decimal notation; to simplify the notation,
the examples below use the latter number). For example, after subtracting 6 from -11 in C2, one
will receive:
   1 2 2
  110101
- 000110
  101111
Similarly, for 7-bit numbers, after subtracting 22 from -11 one will receive:
  1110101
- 0010110
  1011111
The result is equal to the decimal value:
$-2^6 + 2^4 + 2^3 + 2^2 + 2^1 + 2^0 = -64 + 16 + 8 + 4 + 2 + 1 = -33$
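Columnar addition and subtraction in C2 correspond to ordinary integer arithmetic in which only the lowest bits are kept. The sketch below assumes both operands are bit strings of the same width; the function names are hypothetical, and no overflow detection is performed.

```python
def c2_add(a: str, b: str) -> str:
    """Add two C2 numbers of equal width, discarding any carry out."""
    if len(a) != len(b):
        raise ValueError("operands must have the same width")
    width = len(a)
    total = (int(a, 2) + int(b, 2)) & ((1 << width) - 1)
    return format(total, f"0{width}b")


def c2_sub(a: str, b: str) -> str:
    """Subtract b from a (both C2, equal width), discarding any borrow out."""
    if len(a) != len(b):
        raise ValueError("operands must have the same width")
    width = len(a)
    diff = (int(a, 2) - int(b, 2)) & ((1 << width) - 1)
    return format(diff, f"0{width}b")


sum_bits = c2_add("110101", "000110")       # -11 + 6
diff_bits = c2_sub("110101", "000110")      # -11 - 6
```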
Binary multiplication in C2
Binary multiplication in C2 is slightly more complicated than in NBS. One of the efficient
algorithms is called Booth's multiplication algorithm. In order to multiply two
numbers: X of lenX bits and Y of lenY bits, both stored in C2, one should perform the following
Initialization steps:
1. If any of the given numbers (factors) is equal to the most negative number which can
be stored using as many bits as this factor has, then it should be expanded and a new
bit should be added to its left side,
2. Calculate the inverse of X in C2 (-X),
3. Initialize the helper variable A of size lenX+lenY+1 bits: fill the most
significant lenX bits of A with all the bits of X, and the remaining lenY+1 bits of A with zeros
(A stands for addition, which is performed later during the algorithm),
4. Initialize the helper variable S of size lenX+lenY+1 bits: fill the most
significant lenX bits of S with all the bits of -X, and the remaining lenY+1 bits of S with zeros
(S stands for subtraction, because adding S later during the algorithm subtracts X),
5. Initialize the helper variable P of size lenX+lenY+1 bits: fill the most
significant lenX bits of P with zeros, then the next lenY bits of P with all the bits of Y, and
set the last (the least significant) bit of P to 0 (P stands for product, the result of
multiplication).
After initialization of variables, one should repeat the two steps below lenY times:
1. If the two last bits of P are equal to 01, then one should compute P+A (ignoring any
overflow) and assign the result to P. Otherwise, if the two last bits of P are equal to 10,
then one should compute P+S (ignoring any overflow) and assign the result to P.
Otherwise, if they are equal to either 00 or 11, then P should remain unchanged.
2. All the bits of the current number P should be arithmetically shifted right by one position
(abandoning the least significant bit and leaving unchanged the value of the most
significant bit) and the result ought to be assigned to P.
Finally, after completing the steps above, one should remove the least significant bit from
the received number P. The result (the new value of P) is the product of multiplication of the two
numbers X and Y.
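The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name booth_multiply and its parameters are hypothetical, the factors are passed as ordinary Python integers together with their bit lengths, and the result is converted back to a signed integer at the end.

```python
def booth_multiply(x: int, y: int, len_x: int, len_y: int) -> int:
    """Multiply x (len_x bits) by y (len_y bits), both in C2, using Booth's algorithm."""
    # Initialization step 1: widen a factor that equals the largest negative number.
    if x == -(1 << (len_x - 1)):
        len_x += 1
    if y == -(1 << (len_y - 1)):
        len_y += 1

    total = len_x + len_y + 1
    mask = (1 << total) - 1
    x_mask = (1 << len_x) - 1

    # Steps 3-5: A holds X, S holds -X (both in the top len_x bits), P holds Y and a 0 bit.
    a = (x & x_mask) << (len_y + 1)
    s = (-x & x_mask) << (len_y + 1)
    p = (y & ((1 << len_y) - 1)) << 1

    for _ in range(len_y):
        last_two = p & 0b11
        if last_two == 0b01:
            p = (p + a) & mask  # ignore overflow
        elif last_two == 0b10:
            p = (p + s) & mask  # ignore overflow
        # Arithmetic shift right: the most significant bit keeps its value.
        sign = p & (1 << (total - 1))
        p = (p >> 1) | sign

    p >>= 1  # remove the least significant bit
    width = len_x + len_y
    if p & (1 << (width - 1)):  # convert the C2 bit pattern to a signed integer
        p -= 1 << width
    return p


print(booth_multiply(-8, 3, 4, 3))  # -24
```

Python integers have unlimited size, so the sketch masks every intermediate value to lenX+lenY+1 bits to imitate fixed-width registers.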
Example of binary multiplication in C2
Let's consider the multiplication of -8 and 3:
X = -8 = 1000_2
Y = 3 = 011_2
In the first step, the number X should be extended:
X = 11000
Then, it is necessary to calculate the value of -X:
-X = ~11000 + 01 = 00111 + 01 = 01000
Calculating the values of helper variables A, S and P:
A = 1 1000 0000
S = 0 1000 0000
P = 0 0000 0110
Then the next two steps of the algorithm should be repeated three times (because
the number Y is 3-bit long):
Iteration 1:
P = P + S = 0 1000 0110
P >> 1 = 0 0100 0011
Iteration 2:
P >> 1 = 0 0010 0001
Iteration 3:
P = P + A = 1 1010 0001
P >> 1 = 1 1101 0000
In the end, it is necessary to remove the least significant bit of P:
P = 1110 1000
The result is equal to:
-2^7 + 2^6 + 2^5 + 2^3 = -128 + 64 + 32 + 8 = -24
Time Estimation of Mathematical Operations
The time needed by a computer to perform a task can be evaluated by estimating the number of
required operations on bits. One must analyse how many changes of bits are necessary to perform
the calculations on a number that is stored in computer memory in binary notation.
Binary addition
Binary addition of two numbers (stored in Natural Binary System - NBS) may be presented as
a simple columnar addition:
1 1 1
11100
+ 01110
101010
First, the vacant spaces in the shorter number are filled with zeros. As a result, both numbers are
of the same length of k bits. After that, simple summation should be performed for each pair of
digits. If the result of the summation is bigger than 1, then 2 is subtracted from it and
the number 1 is carried to the next position.
The result of addition of two k-digit numbers contains k or k+1 digits. The whole operation of
summation requires k operations on bits.
Binary multiplication
Analogously to addition one can perform multiplication of two binary numbers:
      11101
x     01101
      11101
    1110100
+  11101000
  101111001
During multiplication of two binary numbers n and m of lengths of respectively k and j bits one
can receive up to j rows (the number of rows is equal to the number of ones in the number m)
which contain copies of the number n shifted to the left by the adequate number of positions.
Multiplication can be presented as up to j-1 additions. Each addition requires k operations
on bits.
The number of operations on bits during multiplication of two binary numbers is smaller than
a product of their lengths. The result of multiplication has k+j or k+j-1 digits.
It should be added that considerably faster algorithms for multiplying big numbers are known.
Some of them make it possible to multiply two k-digit numbers using on the order of
k(ln k)(ln (ln k)) operations on bits.
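The columnar method described above can be illustrated with a short Python sketch. The function name shift_and_add is hypothetical. It returns the product together with the number of rows (shifted copies of n) that had to be added, which equals the number of ones in m.

```python
def shift_and_add(n: int, m: int):
    """Multiply n by m by adding shifted copies of n, one per 1 bit of m."""
    product = 0
    rows = 0   # number of shifted copies of n that are added
    shift = 0
    while m:
        if m & 1:
            product += n << shift  # a copy of n shifted left by `shift` positions
            rows += 1
        m >>= 1
        shift += 1
    return product, rows


product, rows = shift_and_add(0b11101, 0b1101)  # 29 x 13
print(bin(product), rows)  # 0b101111001 3
```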
Factorial
The factorial of a non-negative integer n (n!) is the product of all positive integers less than or
equal to n.
The product of n numbers of length k is up to nk-bit long. Therefore, the number n! is also
up to nk-bit long. To calculate n! one must perform n-2 multiplications of numbers of length of
up to k bits and a number of length of up to nk bits.
Therefore, the total number of operations is equal to:
(n-2)·n·k^2
The value can be presented by using only the number n:
(n-2)·n·([log_2 n]+1)^2
Approximately:
n^2 (log_2 n)^2
Polynomial time
According to Cobham's thesis, algorithms are considered to be fast and effective, if they run in
polynomial time.
Definition An algorithm that performs operations on numbers n_1, n_2, ..., n_r, of lengths of
respectively k_1, k_2, ..., k_r digits, runs in polynomial time if there are integers
d_1, d_2, ..., d_r such that the number of operations on bits necessary to perform this algorithm
can be presented as O(k_1^{d_1} k_2^{d_2} ... k_r^{d_r}).
where:
• N is a positive integer,
• if N is a prime, it will be denoted p (and the whole set as Z_p).
To determine the value of an integer for a modulus N, one should divide this number by N. Its
value in Z_N is equal to the remainder of the division. In modular arithmetic, it is possible to define
all typical operations, as in normal arithmetic. They work as one may expect. It is possible to use
the same commutative, associative, and distributive laws.
Modular inversion
Integers in modular arithmetic may (but need not) have inverse numbers. The inverse of a number x
in Z_N is a number y in Z_N such that:
x · y = 1 (in Z_N)
where:
• y is denoted x^-1
For example, if N is an odd number, then the inverse of 2 in Z_N is (N+1)/2:
2 · (N+1)/2 = N + 1 = 1 (in Z_N)
Theorem A number x is invertible in Z_N if and only if the numbers x and N are relatively prime.
The theorem can be proved using the fact that the greatest common divisor of two integers x and y
can be presented as a sum of two products, each of one of the numbers and a properly
selected integer (a or b):
a·x + b·y = gcd(x,y)
Determining inverse numbers in Z_N allows solving linear equations in modular arithmetic:
the equation: a · x + b = 0 (in Z_N)
has the solution: x = -b · a^-1 (in Z_N)
Definition The symbol Z_N^* denotes the set of all elements of Z_N that are invertible in Z_N; that
means the set of numbers x that belong to Z_N such that x and N are relatively prime.
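As an illustration, the inverse can be computed with the extended Euclidean algorithm, which finds the integers a and b from the identity a·x + b·y = gcd(x,y). The sketch below is one possible way to do it. The function names extended_gcd, mod_inverse and solve_linear are hypothetical.

```python
def extended_gcd(x: int, y: int):
    """Return (g, a, b) such that a*x + b*y == g == gcd(x, y)."""
    old_r, r = x, y
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_a, a = a, old_a - q * a
        old_b, b = b, old_b - q * b
    return old_r, old_a, old_b


def mod_inverse(x: int, n: int) -> int:
    """Return the inverse of x in Z_N, or raise if x and n are not relatively prime."""
    g, a, _ = extended_gcd(x % n, n)
    if g != 1:
        raise ValueError("x is not invertible modulo n")
    return a % n


def solve_linear(a: int, b: int, n: int) -> int:
    """Solve a*x + b = 0 in Z_N, assuming a is invertible."""
    return (-b * mod_inverse(a, n)) % n


print(mod_inverse(2, 11))      # 6, which equals (11 + 1) / 2
print(solve_linear(3, 4, 7))   # 1, because 3*1 + 4 = 7 = 0 (in Z_7)
```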
Information-theoretic security of ciphers
The quality of a cipher can be described by checking its resistance to strictly technical attacks
(that is, attacks that do not exploit the human factor). The highest defined level of security is
referred to as perfect security. In practice, to describe the security of ciphers, the semantic
security property is usually used.
Perfect Security
Definition A cipher (E, D) defined over (K, M, C) has perfect secrecy if for every two
messages m1 and m2 (of the same size) belonging to M and for every c belonging to C, there is
an equality:
P[E(k, m1)=c] = P[E(k, m2)=c]
where:
Semantic Security
Definition A cipher is semantically secure if knowledge of the ciphertext and the length of
the original message does not reveal any additional information on the original message that
can be feasibly extracted.
Any probabilistic, polynomial-time algorithm (PPTA) which receives the ciphertext created by
a semantically secure cipher of any certain message and its length cannot determine any partial
information on the message with probability non-negligibly higher than all other PPTAs that only
have access to the message length and don't have access to the ciphertext.
One can prove that the OTP cipher is semantically secure if it uses a random encryption key.
Similarly, each stream cipher can have the property, if a pseudorandom generator used in
the cipher is secure.
Padding Mechanisms
Padding standards are mechanisms for appending some predefined values to messages. They
are used with algorithms which deal with blocks of data. Typical examples of such operations
are block symmetric ciphers and MAC algorithms. These algorithms work on the whole data
blocks. Therefore, if a message length is not a multiple of the block size, a standard for adding
some number of bytes to the end of the message is required.
The information about which padding standard has been used must be provided to the receiver. This
allows them to determine (after decrypting the ciphertext) where the original message ends and
the unimportant pad bytes start.
All the padding standards defined below work in a similar way. They describe which values should
be appended to the message, to fill up the last block.
Using padding is a convenient way of making sure that encrypted data is of the correct size. The
only drawback is the fact that even if the original message contains the correct number of bytes
(a multiple of the block size), some padding must be added to fulfil the process and make sure
that the receiver would be able to understand the message. Usually, a new dummy block must
be added which will contain only the padding bytes.
There are a few padding types described below. The first two paddings are based on bits,
whereas the others are based on bytes.
Bit Padding
A single 1 bit is appended to the data. Then, all other bits of the padding (if any are required) are
zeros.
1 0 1 0 0 0 0 1 1 0 100000
For example, if the message is 3 bytes shorter than an integer multiple of the block size, then
3 pad bytes should be added, each of them of value 3. If 5 bytes should be added, then each of
them should be 5.
0x10 0x11 0x36 0x67 0x38 0xBC 0x03 0x21 0xEF 0x03 0x03 0x03
0x10 0x11 0x36 0x67 0x38 0xBC 0x06 0x06 0x06 0x06 0x06 0x06
0x10 0x11 0x36 0x67 0x38 0xBC 0x03 0x21 0xEF 0x80 0x00 0x00
0x10 0x11 0x36 0x67 0x38 0xBC 0x03 0x21 0xEF 0x23 0x86 0x03
0x10 0x11 0x36 0x67 0x38 0xBC 0x03 0x21 0xEF 0x00 0x00 0x03
0x10 0x11 0x36 0x67 0x38 0xBC 0x03 0x21 0xEF 0x00 0x00 0x00
0x10 0x11 0x36 0x67 0x38 0xBC 0x03 0x21 0x00 0x00 0x00 0x00
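The byte-oriented rule described above, in which N pad bytes are appended and each of them has the value N, can be sketched as follows. The function names pad and unpad are hypothetical, and the block size of 12 bytes in the usage example is an assumption chosen only to match the length of the example rows.

```python
def pad(message: bytes, block_size: int) -> bytes:
    """Append N bytes of value N so that the length becomes a multiple of block_size."""
    n = block_size - len(message) % block_size  # always between 1 and block_size
    return message + bytes([n]) * n


def unpad(padded: bytes, block_size: int) -> bytes:
    """Remove the padding added by pad(), rejecting malformed input."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding bytes")
    return padded[:-n]


data = bytes.fromhex("1011366738bc0321ef")  # 9 bytes
print(pad(data, 12).hex(" "))  # ends with 03 03 03
```

Note that a message whose length is already a multiple of the block size receives a whole extra block of padding, as explained earlier in this section.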
Date: 2020-03-09
Block Cipher Modes of Operation
The modes of operation of block ciphers are configuration methods that allow those ciphers to
work with large data streams, without the risk of compromising the provided security.
While working with block ciphers, it is possible (although not recommended) to use the same
secret key bits for encrypting the same plaintext parts. Using one deterministic algorithm for
a number of identical input blocks results in identical ciphertext blocks.
This is a very dangerous situation for the cipher's users. An intruder would be able to get much
information by knowing the distribution of identical message parts, even if he would not be able
to break the cipher and discover the original messages.
Luckily, there exist ways to blur the cipher output. The idea is to mix the plaintext blocks (which
are known) with the ciphertext blocks (which have been just created), and to use the result as the
cipher input for the next blocks. As a result, the user avoids creating identical output ciphertext
blocks from identical plaintext data. These modifications are called the block cipher modes
of operation.
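The effect can be demonstrated with a small sketch. Everything in it is an assumption made for illustration: the block cipher E is a toy 4-round Feistel construction built on SHA-256 (it is not a real cipher and must not be used for protection), and the helper names ecb_encrypt and cbc_encrypt are hypothetical. The first function encrypts each block independently, the second mixes every plaintext block with the previous ciphertext block before encryption.

```python
import hashlib

BLOCK = 16  # block size in bytes (assumption)


def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation, a stand-in for a real block cipher."""
    half = BLOCK // 2
    left, right = block[:half], block[half:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:half]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def ecb_encrypt(key: bytes, blocks):
    # Every block is encrypted independently of the others.
    return [E(key, b) for b in blocks]


def cbc_encrypt(key: bytes, iv: bytes, blocks):
    # Every plaintext block is mixed with the previous ciphertext block first.
    out, prev = [], iv
    for b in blocks:
        prev = E(key, xor(prev, b))
        out.append(prev)
    return out


key, iv = b"k" * 16, bytes(range(16))
blocks = [b"A" * 16, b"A" * 16, b"B" * 16]
ecb = ecb_encrypt(key, blocks)
cbc = cbc_encrypt(key, iv, blocks)
print(ecb[0] == ecb[1])  # True: identical plaintext blocks leak through
print(cbc[0] == cbc[1])  # False: chaining hides the repetition
```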
A typical example of the weakness of encryption using ECB mode is encoding a bitmap image
(for example a .bmp file). Even a strong encryption algorithm used in ECB mode cannot
effectively blur the plaintext.
The bitmap image encrypted using DES and the same secret key. The ECB mode was used for
the middle image and the more complicated CBC mode was used for the bottom image.
A message that is encrypted using the ECB mode should be extended to a size that is equal to
an integer multiple of the single block length. A popular method of aligning the length of the last
block is to append an additional bit equal to 1 and then fill the rest of the block with bits
equal to 0. This makes it possible to determine precisely the end of the original message. There
exist more methods of aligning the message size.
Apart from revealing the hints regarding the content of plaintext, the ciphers that are used in ECB
mode are also more vulnerable to replay attacks.
During decryption of a ciphertext block, one should XOR the output data received from the
decryption algorithm with the previous ciphertext block. Because the receiver knows all the
ciphertext blocks just after obtaining the encrypted message, they can decrypt the message using
many threads simultaneously.
If one bit of a plaintext message is damaged (for example because of some earlier transmission
error), all subsequent ciphertext blocks will be damaged and it will never be possible to decrypt
the ciphertext received from this plaintext. As opposed to that, if one ciphertext bit is damaged,
only two received plaintext blocks will be damaged. It might be possible to recover the data.
A message that is to be encrypted using the CBC mode should be extended to a size that is
equal to an integer multiple of a single block length (similarly to the case of using the
ECB mode).
In the example presented above, if the intruder is able to predict that the vector IV1 will be used
by the attacked system to produce the response c1, they can guess which one of the two
encrypted messages m0 or m1 is carried by the response c1. This situation breaks the rule that
the intruder shouldn't be able to distinguish between two ciphertexts even if they have chosen
both plaintexts. Therefore, the attacked system is vulnerable to chosen-plaintext attacks.
If the vector IV is generated based on non-random data, for example the user password, it should
be encrypted before use. One should use a separate secret key for this activity.
The initialization vector IV should be changed after using the secret key a number of times. It can
be shown that even a properly created IV, used too many times, makes the system vulnerable
to chosen-plaintext attacks. For the AES cipher the limit is estimated to be 2^48 blocks, while
for 3DES it is about 2^16 plaintext blocks.
In the PCBC mode both encryption and decryption can be performed using only one thread at
a time.
If one bit of a plaintext message is damaged, the corresponding ciphertext block and all
subsequent ciphertext blocks will be damaged. Encryption in CFB mode can be performed only
by using one thread.
On the other hand, as in CBC mode, one can decrypt ciphertext blocks using many threads
simultaneously. Similarly, if one ciphertext bit is damaged, only two received plaintext blocks will
be damaged.
As opposed to the previous block cipher modes, the encrypted message doesn't need to be
extended to a size that is equal to an integer multiple of a single block length.
OFB (Output Feedback) Mode
Algorithms that work in the OFB mode create keystream bits that are used to encrypt
subsequent data blocks. In this regard, the way of working of the block cipher becomes similar to
the way of working of a typical stream cipher.
Because of the continuous creation of keystream bits, both encryption and decryption can
be performed using only one thread at a time. Similarly to the CFB mode, both data encryption
and decryption use the same cipher encryption algorithm.
If one bit of a plaintext or ciphertext message is damaged (for example because of a transmission
error), only one corresponding ciphertext or respectively plaintext bit is damaged as well. It is
possible to use various correction algorithms to restore the previous value of damaged parts of
the received message.
The biggest drawback of OFB is that the repetition of encrypting the initialization vector may
produce the same state that has occurred before. It is an unlikely situation but in such a case the
plaintext will start to be encrypted by the same data as previously.
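A minimal sketch of the OFB keystream is shown below. It assumes that HMAC-SHA256 truncated to 16 bytes stands in for the encryption function of a block cipher (OFB uses the cipher only in the encryption direction, so an invertible function is not needed for the illustration). The names prf and ofb_xor are hypothetical.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes (assumption)


def prf(key: bytes, block: bytes) -> bytes:
    """Stand-in for the block cipher encryption function (an assumption)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ofb_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data in OFB mode; both directions are the same operation."""
    out = bytearray()
    state = iv
    for i in range(0, len(data), BLOCK):
        state = prf(key, state)        # next keystream block depends only on the previous one
        chunk = data[i:i + BLOCK]      # the last chunk may be shorter, so no padding is needed
        out.extend(b ^ s for b, s in zip(chunk, state))
    return bytes(out)


key, iv = b"k" * 16, bytes(16)
message = b"attack at dawn, then retreat"
ciphertext = ofb_xor(key, iv, message)
print(ofb_xor(key, iv, ciphertext) == message)  # True
```

Because the keystream does not depend on the data, flipping one ciphertext bit changes only the corresponding plaintext bit after decryption.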
It is one of the most popular block cipher modes of operation. Both encryption and decryption
can be performed using many threads at the same time.
If one bit of a plaintext or ciphertext message is damaged, only one corresponding output bit is
damaged as well. Thus, it is possible to use various correction algorithms to restore the previous
value of damaged parts of received messages.
The CTR mode is also known as the SIC mode (Segmented Integer Counter).
For example, for the AES cipher the secret key should be changed after about 2^64 plaintext
blocks.
Pseudorandom Generator (PRG)
Pseudorandom generators (PRG) are used to create random sequences of numbers in
deterministic devices. All computer algorithms are strictly deterministic. PRGs allow encryption of
many data blocks using data generated from secret keys which have only a few bits.
Definition A pseudorandom generator (PRG) is an efficient and deterministic function, which
returns a longer pseudorandom output sequence based on the received shorter input:
G: {0,1}^s -> {0,1}^n
where:
• n >> s
A pseudorandom generator has to be unpredictable. There must not be any efficient algorithm that,
after receiving the previous output bits from the PRG, would be able to predict the next output bit
with probability non-negligibly higher than 0.5.
Pseudorandom generators are used for creating pseudorandom functions and permutations,
which are widely used in cryptography (for example, for implementation of block ciphers).
PRG Implementation
Nowadays, pseudorandom generators are implemented in most operating systems
(for example /dev/random in Linux) and in many libraries for various programming languages.
In general, their behaviour is similar.
First, the algorithm initializes the internal state of the generator based on some external
information (for example, the current time or temperature). Then, all bytes of the state are mixed
for the whole time when the generator works. The changes are based on various external and
random input data - the frequency and the way of using the keyboard and mouse by the user,
network traffic, hardware interrupts and other kinds of information from outside the deterministic
environment where the algorithm works.
The pseudorandom generator algorithm continuously changes its internal state. The internal state
is then used to generate output sequences of numbers, which should be as random as possible.
All the modifications of the state are performed in a way that is supposed to provide the best
possible protection against sequence analysis of the produced output data.
Pseudorandom permutations can be defined in a similar way. They create output data
indistinguishable from random sequences.
Having the generator G, one can expand it and define a new generator G1:
G1: K -> K^4
as:
G1(k) = G(G(k)[0]) || G(G(k)[1])
Analogously, it is possible to define an expanded pseudorandom function F1, which takes 2 bits
as its input:
F1: K x {0, 1}^2 -> K
as:
F1(k, c) = G1(k)[c]
where:
• c is any two bits.
N-bit pseudorandom function
The presented procedure can be repeated any number of times and one can receive
a pseudorandom function of any input size. This method of creating pseudorandom
functions is known as the Goldreich-Goldwasser-Micali (GGM) construction, named after
the people who invented it.
It can be proved that if the generator G is secure, then a pseudorandom function working on input
data of length of n bits and defined in a way described above is also secure.
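The construction can be sketched as follows. The length-doubling generator G below is only an assumption for illustration (SHA-256 output split into two 16-byte halves, which is not a proven secure PRG), and the names G1, F1 and ggm_prf are hypothetical.

```python
import hashlib

KEY_LEN = 16  # size of an element of K in bytes (assumption)


def G(k: bytes):
    """Illustrative length-doubling generator K -> K^2 (an assumption, not a proven PRG)."""
    out = hashlib.sha256(b"ggm-prg" + k).digest()
    return out[:KEY_LEN], out[KEY_LEN:]


def G1(k: bytes):
    """Expanded generator K -> K^4: G1(k) = G(G(k)[0]) || G(G(k)[1])."""
    return G(G(k)[0]) + G(G(k)[1])


def F1(k: bytes, c: int) -> bytes:
    """Two-bit pseudorandom function: F1(k, c) = G1(k)[c] for c in 0..3."""
    return G1(k)[c]


def ggm_prf(k: bytes, bits: str) -> bytes:
    """n-bit function: walk down the tree, choosing one half of G per input bit."""
    node = k
    for bit in bits:
        node = G(node)[int(bit)]
    return node


k = b"\x01" * KEY_LEN
print(ggm_prf(k, "10") == F1(k, 0b10))  # True
```

Evaluating the n-bit function requires only n calls to G, one for each input bit, even though the full tree has 2^n leaves.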
One-way Function
One-way functions are easy to compute but it is very difficult to compute their inverse functions.
Thus, having data x it is easy to calculate f(x) but, on the other hand, knowing the value
of f(x) it is quite difficult to calculate the value of x.
There is no mathematical proof that one-way functions exist. In practical applications, functions
that behave similarly to real one-way functions are used.
One-way functions are key elements of various tools useful in modern cryptography. They are
used in pseudorandom generators, authentication of messages and digital signatures.
An example of such a trapdoor one-way function is the multiplication of large prime numbers.
Computing the product is easy, whereas finding the prime factors of a large number is nowadays
practically infeasible. On the other hand, knowing one of the factors, it is easy to compute
the other ones.
A one-way hash function should be collision-free. This means that it should be very difficult to find
two different sequences that produce the same hash value.
Algorithms of one-way hash functions are often known to the public. They provide security thanks
to their properties as one-way functions. Usually, a change of one bit of input data causes
a change of about half of the output bits. Data generated by hash functions should be
pseudorandom (it should not be possible to distinguish output data from ordinary random data).
One-way hash functions are used to protect data against intentional or unintentional
modifications. Having some data, one can calculate a checksum that may be attached to
the message and checked by other recipients (by computing the same checksum and comparing it
with the received checksum value). They are used for example in the message authentication
algorithm HMAC.
Moreover, hash functions are used for storing data efficiently, in so-called hash tables. Data can
be accessed by finding hash values, which are stored in computer memory.
The first h function is initialized by a previously determined vector IV. The last data block should
be filled up to the full length using previously agreed bits. A popular method of aligning the length
of the last block is to append an additional bit equal to 1 and then fill the rest of
the block with bits equal to 0. This makes it possible to determine precisely the end of the real
message.
It may be proved that if the function h is collision-free, then the function H (which is created based
on functions h) is also collision-free. Every collision in the function H would automatically result in
a collision in the function h.
The Davies-Meyer hash function (denoted h) uses the encryption algorithm E that operates
on subsequent data blocks:
h(H, m) = E(m, H) XOR H
A scheme of Davies-Meyer function is presented below:
It can be proved that if E is a secure algorithm of a block cipher, then an intruder would have to
perform about O(2^(n/2)) encryption operations to find a collision (thus, to find a message with
the same hash value as the hash value that he would like to find).
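The Davies-Meyer step h(H, m) = E(m, H) XOR H can be sketched as shown below. The block cipher E is a toy Feistel construction built on SHA-256 and serves only as a stand-in (an assumption, not a real cipher). The names davies_meyer and toy_hash are hypothetical. The padding follows the method mentioned above (a bit equal to 1 followed by zeros). Practical hash functions additionally encode the message length, which this sketch omits.

```python
import hashlib

BLOCK = 16  # block size in bytes (assumption)


def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation, a stand-in for a real block cipher."""
    half = BLOCK // 2
    left, right = block[:half], block[half:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:half]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def davies_meyer(H: bytes, m: bytes) -> bytes:
    """h(H, m) = E(m, H) XOR H: the message block is the key, H is the encrypted data."""
    return bytes(a ^ b for a, b in zip(E(m, H), H))


def toy_hash(message: bytes, iv: bytes = bytes(BLOCK)) -> bytes:
    """Chain the compression function over padded message blocks."""
    padded = message + b"\x80"                   # a bit equal to 1 ...
    padded += b"\x00" * (-len(padded) % BLOCK)   # ... followed by bits equal to 0
    H = iv
    for i in range(0, len(padded), BLOCK):
        H = davies_meyer(H, padded[i:i + BLOCK])
    return H


print(toy_hash(b"abc").hex())
```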
Another kind of hash functions based on block ciphers are Miyaguchi-Preneel functions. There
are 12 kinds of those functions, for example:
h(H, m) = E(m, H) XOR H XOR m
or:
h(H, m) = E(H XOR m, m) XOR m
The abbreviation MAC can also be used for describing algorithms that can create
an authentication code and verify its correctness.
The primary disadvantage of this method is the lack of protection against intentional modifications
in the message content. The intruder can change the message, then calculate a new checksum,
and eventually replace the original checksum by the new value. An ordinary CRC algorithm allows
only to detect randomly damaged parts of messages (but not intentional changes made by
the attacker).
The following paragraphs present several MAC algorithms that provide security against
intentional changes of authentication codes.
• S(k, m) := F(k, m)
• V(k, m, t): returns a value true if t = F(k, m) or false otherwise
• One should consider that the set Y should be large enough that a probability of guessing the
result of the F function would be negligible.
For example, to encrypt a 16-byte long message one can use the AES encryption algorithm or
any other similar symmetric cipher that operates on data blocks of size of 16 bytes.
The following paragraphs present some MAC algorithms that make it possible to protect longer messages.
CBC MAC
CBC MAC is based on a pseudorandom function (for convenience called F). It works similarly to
encryption performed in the CBC mode, with a difference that intermediate values are
not returned. Moreover, after encryption of the last data block, one additional encryption of
the current result is performed using the second secret key.
The additional encryption is performed to protect the calculated code. The whole process,
including the last additional step, is often referred to as ECBC MAC (Encrypted MAC), in contrast
to the previous algorithm steps called Raw CBC MAC.
Without the last algorithm step (that is, without encryption using the second key), an intruder could
attack CBC MAC security using a chosen-plaintext attack:
1. The intruder chooses a message m of size of one block.
2. The intruder obtains a value of authentication code of the message from the attacked
system: t = F(k, m).
3. At this moment, the attacker can determine a value of authentication code of
the message m1 of the size of two blocks m1 = (m, t XOR m):
rawCBC(k, m1) = rawCBC(k, (m, t XOR m)) = F(k, F(k, m) XOR (t XOR m))
= F(k, t XOR (t XOR m)) = F(k, m) = t
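The attack can be reproduced with a short sketch. It assumes that HMAC-SHA256 truncated to 16 bytes plays the role of the pseudorandom function F, and that the chaining starts from an all-zero block. The names raw_cbc_mac and ecbc_mac are hypothetical.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes (assumption)


def F(k: bytes, block: bytes) -> bytes:
    """Stand-in pseudorandom function (an assumption for illustration)."""
    return hmac.new(k, block, hashlib.sha256).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def raw_cbc_mac(k: bytes, blocks) -> bytes:
    """Raw CBC MAC: chain F over the blocks without the final extra encryption."""
    state = bytes(BLOCK)
    for block in blocks:
        state = F(k, xor(state, block))
    return state


def ecbc_mac(k1: bytes, k2: bytes, blocks) -> bytes:
    """ECBC MAC: the raw result is encrypted once more with the second key."""
    return F(k2, raw_cbc_mac(k1, blocks))


k = b"secret key 00001"
m = b"pay 100 to Alice"              # one block chosen by the intruder
t = raw_cbc_mac(k, [m])              # tag obtained from the attacked system
forged = [m, xor(t, m)]              # two-block message built without knowing k
print(raw_cbc_mac(k, forged) == t)   # True: t is a valid tag for the forged message
```

With ECBC MAC the intruder sees only the result of the last encryption and not the value t, so the second block t XOR m cannot be constructed.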
CBC MAC can protect a message of any length, from one to many blocks. To ensure security,
while using CBC MAC one should change the secret key from time to time. It can be proved that
after sending a number of messages roughly equal to the square root of the number of all
possible values of data blocks, the key is no longer safe.
The last data block should be filled up to the full length using previously agreed bits.
The additional bits should clearly determine the end of the original message to prevent attackers
from using a potential ambiguity. A popular method of aligning the length of the last block is to
append an additional bit equal to 1 and then fill the rest of the block up with bits equal to 0. If
there is not enough free space in the last block, one should add one more extra block and fill it
with the additional padding bits.
For comparison, adding only zeros would make it ambiguous where the last bit of the transmitted
message is (because the original message may have zeros as the last bits of data). Furthermore,
many messages with different contents that differ only in the number of zeros at the end would
have the same authentication codes. This situation would break the safety rules of message encoding.
ECBC MAC is used in various applications, for example in banking systems (ANSI X9.9, X9.19
and FIPS 186-3 standards). It is often based on the AES algorithm, which is used as the function F.
NMAC
The NMAC algorithm (Nested MAC) is similar to the CBC MAC algorithm described earlier. It uses
a slightly different pseudorandom function F. The function F returns numbers that are correct
values of secret keys (thus, not the values of data blocks).
As in the case of CBC MAC, after encryption of the last data block, one additional encryption of
the result is performed, using the second secret encryption key. Because the previous result of
encryption of the last data block consists of the same number of bits as the secret key, an
additional sequence of bits (a fixed pad) should be appended, to assure that the result has the same
size as data blocks. NMAC is usually used in systems where the length of data blocks is much
bigger than the size of secret keys.
The last additional encryption is performed to protect the calculated code, as in the case of
CBC MAC. The part of NMAC that processes the subsequent blocks, without the last step, is
commonly referred to as a Cascade.
Without the last step of the algorithm (that is, without encryption using the second key),
an intruder would be able to append any number of blocks to the intercepted message with
the correctly calculated authentication code. Then, he could calculate a new authentication code
and attach it to the modified message. As input to the first new added function F, the attacker
would use the original authentication code of the original message.
To ensure NMAC security, one should change the secret key from time to time. It can be proved
that after sending a number of messages roughly equal to the square root of the number of all
possible values of secret keys, the key is no longer safe.
The NMAC algorithm uses the same methods for adding padding bits to the end of the last
incomplete message block, as the CBC MAC algorithm.
CMAC
The CMAC algorithm is similar to the previously described CBC MAC algorithm. It uses the same
pseudorandom function F, which returns numbers that are elements of the set of all possible
values of data blocks.
Instead of the last additional encryption that uses a second key, CMAC uses two additional keys,
one of which is XORed with the input bits of the last call of the F function. Depending on whether
the last message block is completely filled up with data bits, or it must be filled up with
a previously determined sequence of padding characters, the corresponding key should be used.
Adding the additional key in the last encryption step protects against appending new blocks of
modified messages by a potential intruder. It is not necessary to use an additional encryption by
the F function, unlike in other MAC algorithms. Thanks to this solution, there is no need to add
an additional block to make room for padding (it is enough to choose the correct additional key).
CMAC is considered to be secure. It provides a safe way for message authentication. It is certified
for example by the American institute NIST.
PMAC
The PMAC algorithm (Parallel MAC) can be performed using many threads at a time, unlike other
MAC algorithms described above (which require sequential processing of data blocks).
PMAC uses two secret encryption keys. The first secret key is used in P functions. All P functions
also receive the subsequent numbers of the additional counter. Output bits of P functions are
XORed with data blocks. The result is encrypted by a pseudorandom function F, which uses
the second secret key. The P function should be uncomplicated and it should work much faster
than F functions.
Output bits from all F functions and output bits from the last data block (which is not encrypted by
the F function) are XORed together, then the result is encrypted using the F function
algorithm, with the second secret encryption key.
As usual, it can be proved that PMAC is secure if the secret key is changed from time to time. A
new key should be created after sending a number of messages roughly equal to
the square root of the number of all possible values of data blocks.
PMAC makes it possible to update authentication codes easily and quickly when one of the
message blocks has been replaced by a new one. For example, if a block m[x] is replaced by m'[x],
then the following calculations should be performed:
tag' = F(k2, F^-1(k2, tag) XOR F(k2, m[x] XOR P(k1, x)) XOR F(k2, m'[x] XOR P(k1, x)))
One-time MAC
Similarly to one-time encryption, one can define a one-time MAC algorithm, which provides security
against potential attacks and is generally faster than other message authentication algorithms
based on PRF functions.
Definition One-time MAC is a pair of algorithms (S, V):
• S(m, k1, k2) := P(m, k1) + k2 (mod q): returns an authentication code t
• V(m, k1, k2, t): returns a value true or false depending on the correctness of
the examined authentication code t
where:
Definition Having a secure one-time MAC (S, V) defined over sets (M, K_J, T) and a secure
pseudorandom function F: K_F x {0,1}^n -> {0,1}^n, one can define a pair of algorithms called
the Carter-Wegman MAC:
• S_CW(m, k_F, k_J) := (r, F(k_F, r) XOR S(m, k_J)): returns an authentication code that is
a pair (r, t_CW)
• V_CW(m, k_J, F(k_F, r) XOR t_CW): returns true or false depending on the correctness of
the examined authentication code (r, t_CW)
where:
HMAC
HMAC is a popular system of checking message integrity, which is quite similar to NMAC.
The HMAC algorithm uses one-way hash functions to produce MAC values.
The input parameters ipad and opad are used to modify the secret key. They may have various
values assigned. It is recommended to choose the values that would make both inputs to the hash
functions look as dissimilar as possible (that is, that modify the secret key in two different ways).
Using a secure hash function (that is, a function for which it is infeasible to find different
input data that produce the same output) guarantees the security of the HMAC algorithm.
Nowadays, the HMAC algorithm is used in many systems, including some popular Internet
protocols (SSL, IPsec, SSH).
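The role of ipad and opad can be illustrated by computing HMAC by hand and comparing the result with Python's standard hmac module. The sketch assumes SHA-256 as the hash function and the pad values 0x36 and 0x5C from RFC 2104. The function name hmac_sha256 is hypothetical.

```python
import hashlib
import hmac


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC computed step by step: H((key XOR opad) || H((key XOR ipad) || message))."""
    block_size = 64  # input block size of SHA-256 in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()   # keys that are too long are hashed first
    key = key.ljust(block_size, b"\x00")     # shorter keys are filled up with zeros
    ipad = bytes(b ^ 0x36 for b in key)      # the key modified in the first way
    opad = bytes(b ^ 0x5C for b in key)      # the key modified in the second way
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()


key, message = b"secret key", b"message to authenticate"
tag = hmac_sha256(key, message)
expected = hmac.new(key, message, hashlib.sha256).digest()
print(hmac.compare_digest(tag, expected))  # True
```

In real code the library function should be preferred, and tags should be compared with hmac.compare_digest rather than with the == operator.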
Password-Based Encryption (PBE)
Password-based encryption is a popular method of creating strong cryptographic keys.
The strength of the cipher depends on the strength of the secret key. A strong secret key must
contain characters that are not easily predictable, thus the secret key cannot be simply derived
from the user's password (because passwords are usually memorable subsets of ASCII or
UTF-8 characters).
Password-based encryption allows to create strong secret keys based on passwords provided by
the users. The produced key bytes are supposed to be as random and unpredictable as possible.
PBE algorithms use a user's password together with some additional input parameters:
o salt
o iteration count
There are two popular PBE standards that describe how to convert password bytes into the secret
key: PKCS #5 (supports ASCII characters) and PKCS #12 (which supports 16-bit characters).
In essence, they use a mixing function based around a secure hash function which is applied a
number of times (specified by an iteration count). After the mixing, the output bytes are used to
create the key for the cipher (together with the initialization vector if needed).
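As a rough illustration of the mixing step, the sketch below uses PBKDF2 (the key derivation function defined in PKCS #5 v2) through Python's hashlib.pbkdf2_hmac. The choice of SHA-256, the 32-byte key, the 16-byte initialization vector and the function name derive_key_and_iv are assumptions made for the example, not requirements of the standard.

```python
import hashlib
import os

def derive_key_and_iv(password: str, salt: bytes, iterations: int):
    """Derive 48 bytes from the password: 32 for the key, 16 for the IV."""
    material = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=48
    )
    return material[:32], material[32:]

# Example use: a fresh random salt is generated for every encryption and is
# stored, together with the iteration count, next to the ciphertext.
salt = os.urandom(16)
key, iv = derive_key_and_iv("correct horse battery staple", salt, 100_000)
```

The receiver repeats the same call with the transmitted salt and iteration count and obtains the same key and initialization vector.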
A diagram of PBE algorithms
Salt
The salt is a random value. It is meant to defeat dictionary attacks that rely on precomputation.
Without the salt, an intruder could run the same PBE algorithm once over a list of popular
phrases, often used as passwords, and reuse the resulting keys against every user. Mixing in
a random value makes the combined input to the PBE algorithm different for every encryption, so
the attacker can no longer prepare the keys for all the likely passwords in advance.
Because the salt is random, it is highly unlikely that the same salt will be used for more than
one encryption. The salt is not a secret value. It may be transmitted along with
the ciphertext to the receiver.
Salt values should be produced by a cryptographically secure pseudorandom number generator.
Ideally, the length of the salt should be the same as the output size of the hash function used
in the mixing function.
Iteration Count
The key derivation procedure may be made more expensive by applying the mixing function many
times. This makes the process of creating the secret key much more time consuming. Such
a delay is acceptable for the legitimate user, who performs the authentication procedure
rarely and does not mind a short wait. The attacker, on the other hand, who mounts a brute-force
attack and has to check thousands of candidate passwords, pays the increased cost for every
single guess.
Like the salt, the iteration count may be transmitted to the receiver in the clear, along with
the ciphertext.
PKCS #5 recommends at least 1000 iterations; on current hardware, considerably higher counts
are advisable.
Nowadays, most embedded devices allow downloading updated software versions and data. The
new software can be loaded into the device by using an application called a boot loader. Boot
loaders are programs that are stored in the device memory and are able to download data from
outside and flash the device.
Signing the software prevents it from being further modified without detection. Also, a signed
digest makes it possible to authenticate the producer (or, at least, the approver) of the
application version, thus making sure that no malicious third-party application is executed.
Software signing
The software stored in the device should be executed only if it was successfully validated during
the boot process. Digital signatures are used most often to confirm the software authenticity.
In order to sign the software, the first step is to use a hash function (for example, one of the SHA
algorithms) to calculate a short software digest. The digest has a fixed, relatively short
(compared to the size of the full firmware) length.
The digest should be then encrypted, by using an asymmetric cipher and a private key, which is
known only to the manufacturer. RSA is one of the most popular ciphers used for digest
encryption.
A corresponding public key should be saved in the read-only memory area, next to the bootloader.
Public keys are often contained in signed digital certificates.
The encrypted digest is stored in the normal memory of the device. It changes every time a new
software version is loaded, for example during a software update. During device start-up,
the bootloader should check whether the stored digest value is correct. First, it should decrypt
the digest by using the public key. Then, it has to calculate a second digest value by itself, using
the same hash function that was used originally. Finally, the two values should be compared. If they
are not the same, then one should assume that the code was modified after being signed. The
program execution should stop, and the device should turn off.
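The sign-then-check flow described above can be sketched with textbook RSA. Everything in this example is a deliberately simplified assumption: the two primes are far too small for real use, the digest is reduced modulo n instead of being padded as real signature schemes require, and the names sign and boot_check are hypothetical.

```python
import hashlib

# Toy RSA parameters (two Mersenne primes); real keys are much larger.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                              # public exponent, stored with the bootloader
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent, known to the manufacturer

def digest(firmware: bytes) -> int:
    """SHA-256 digest of the firmware, reduced modulo N for this toy example."""
    return int.from_bytes(hashlib.sha256(firmware).digest(), "big") % N

def sign(firmware: bytes) -> int:
    """Manufacturer side: 'encrypt' the digest with the private key."""
    return pow(digest(firmware), D, N)

def boot_check(firmware: bytes, signature: int) -> bool:
    """Bootloader side: recover the digest with the public key and compare."""
    return pow(signature, E, N) == digest(firmware)
```

A bootloader following this scheme would refuse to start the application whenever boot_check returns False.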
It is crucial that this check takes place at the beginning of the software flow,
while the processor still executes from the read-only memory and control has not yet reached
the larger, writeable memory area.
In general, every secure microcontroller should start executing software from an internal,
immutable memory. All the software verification procedures should be stored in that read-only,
initially executed memory. Such a location is often called trusted, and the device starting point of
execution is called the root of trust.
Some more complex schemes of signing could be used to protect the code. The digest may be
signed by multiple entities, for example by the producer and the external validation authority. It is
also possible to add intermediate keys to introduce more sophisticated security mechanisms.
Generally, during manufacturing, the code that was developed and prepared for a device is
signed using a proper private key and stored (together with the digest) in a secure database.
The asymmetric cipher algorithms mentioned above are most often used in this procedure as well.
While the device is being programmed, the flashing application downloads the software and the
corresponding digest. After boot loader authentication, the secured software (obtained earlier
from the database) is sent to the device.
The boot loader decrypts the received signature, and calculates the software digest. If both digest
values are the same, then the boot loader program should accept the new application version.
Software signing usage today
Secure software flashing is implemented in numerous applications, operating in many areas,
especially in telecommunication, aeronautical, and automotive industries. It is particularly useful
if the software systems are distributed and consist of many, semi-independent devices.
Because many small embedded devices have relatively little computational power,
special care should be taken when using complex hash and encryption algorithms. It is
recommended to configure the asymmetric ciphers in a way that makes the public-key
decryption procedure as cheap as possible. For example, in the case of the RSA cipher, the public
exponent should be small (say, 3, or the commonly used 65537) to make computation faster.
Index of Coincidence
The index of coincidence (IC) is the probability that, when two texts are compared letter by
letter, the two letters at a given position are the same.
Using IC in cryptography
The index of coincidence is used in cryptography for breaking substitution ciphers and simple
XOR ciphers.
IC can be used to determine the length of the secret key if a secret message is encrypted using
one of those ciphers with a periodically repeated key. This is done by comparing (letter by
letter or byte by byte) the encrypted text with a copy of itself shifted by a number of characters
equal to the key size currently being tested. For each candidate (so for each key size, from 1
until the solution is found) one calculates the value of IC and records it.
When the tested offset is equal to the length of the secret key, the confusion introduced by
the secret key disappears, because the compared letters were encrypted with the same part of
the key.
When two texts are compared at a wrong offset, the letters (bytes) in the first text have been
changed differently than those in the second text. The letters can therefore be regarded as
belonging to different languages, with different letter frequencies in the first and the second text.
A significantly larger value of IC will be calculated for all shifts equal to the key length or to one
of its multiples (because the same key is repeated periodically).
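The shift-and-compare procedure can be sketched as follows for a repeating-key XOR cipher. The helper names are hypothetical, and the sketch simply picks the shift with the highest rate of coincidences; on short or unusual texts a multiple of the key length, or a wrong value, may win.

```python
def xor_encrypt(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR; applying it twice with the same key decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def coincidences(data: bytes, shift: int) -> int:
    """Count positions where the text equals itself shifted by `shift`."""
    return sum(1 for a, b in zip(data, data[shift:]) if a == b)

def guess_key_length(ciphertext: bytes, max_length: int) -> int:
    """Return the shift (1..max_length) with the highest coincidence rate.

    max_length must be smaller than the length of the ciphertext.
    """
    return max(
        range(1, max_length + 1),
        key=lambda s: coincidences(ciphertext, s) / (len(ciphertext) - s),
    )
```

Once a key length L has been guessed, every L-th byte was encrypted with the same key byte, so each such group can be attacked separately as a single-byte XOR cipher.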
Approximate expected (normalized) values of IC for texts in selected languages:
o English - 1.73
o Russian - 1.76
o Spanish - 1.94
o Portuguese - 1.94
o Italian - 1.94
o French - 2.02
o German - 2.05
Sometimes, the values of indexes of coincidence are presented without the normalization (the
normalized value depends on the number of letters in the alphabet). For example, for English
language, the expected IC value without normalization is equal to:
1.73 / 26 ≈ 0.067
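For a single text, the index of coincidence is usually estimated from letter counts. The sketch below assumes the common estimator sum(n_i * (n_i - 1)) / (N * (N - 1)), where n_i is the count of the i-th letter and N is the number of letters; it returns both the value without normalization and the value multiplied by the alphabet size. The function name is illustrative.

```python
from collections import Counter

def index_of_coincidence(text: str, alphabet_size: int = 26):
    """Return (raw IC, normalized IC) for the Latin letters found in text."""
    letters = [c for c in text.lower() if "a" <= c <= "z"]
    n = len(letters)
    if n < 2:
        raise ValueError("at least two letters are needed")
    counts = Counter(letters)
    raw = sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))
    return raw, raw * alphabet_size
```

For a long English text the first value should come out near 0.067 and the second near 1.73, while text with uniformly random letters gives values near 1/26 and 1.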
What Is Encryption?
In public-key encryption schemes, the encryption key is published for anyone to use and
for encrypting messages. Only the receiving party has access to the decryption key that
enables messages to be read. Public-key encryption was first described in a secret
document in 1973. Before that, all encryption schemes were symmetric-key (also called
private-key).
Symmetric Encryption
In symmetric-key schemes, the encryption and decryption keys are the same.
Communicating parties must have the same key in order to achieve secure communication.
Encryption Algorithms
Triple DES Encryption
Triple DES was designed to replace the original Data Encryption Standard (DES) algorithm,
which hackers learned to defeat with ease. At one time, Triple DES was the recommended
standard and the most widely used symmetric algorithm in the industry.
Triple DES uses three individual keys with 56 bits each. The total key length adds up to 168
bits, but experts put the effective key strength at closer to 112 bits.
Though it is slowly being phased out, Triple DES is still a dependable hardware encryption
solution for financial services and other industries.
RSA Encryption
RSA is a public-key encryption algorithm and the standard for encrypting data sent over
the internet. It also happens to be one of the methods used in PGP and GPG programs.
Unlike Triple DES, RSA is considered an asymmetric encryption algorithm because it uses a
pair of keys. The public key is used to encrypt a message and a private key to decrypt it. It
takes attackers quite a bit of time and processing power to break this encryption code.
The Advanced Encryption Standard (AES) is the algorithm trusted as the standard by the
U.S. government and many other organizations.
Although it is extremely efficient in 128-bit form, AES encryption also uses keys of 192 and
256 bits for heavy-duty encryption.
AES is considered resistant to all attacks, with the exception of brute-force attacks, which
attempt to decipher messages using all possible combinations in the 128-, 192- or 256-bit
cipher. Still, security experts believe that AES will eventually become the standard for
encrypting data in the private sector.
Encryption Standards
There are a number of standards related to cryptography. Here are the following standards
for encryption:
Usage
Simple substitution ciphers were used as early as ancient times. They were one of the first
ways (after steganography) to secure messages.
Description
Simple substitution ciphers work by replacing each plaintext character by another character.
To decode ciphertext letters, one should use the reverse substitution and change the letters back.
Before using a substitution cipher, one should choose substitutions that will be used for changing
all alphabet letters. This can be performed by writing all alphabet letters in the alphabetical order
in the first row, and then in the second row the same letters but in any other random order. Letters
from upper and lower rows form pairs that should be used during encryption.
For example, let us consider the following two sequences of letters which define the substitution
cipher:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
TIMEODANSFRBCGHJKLPQUVWXYZ
One can notice that the letter F is encoded using the letter D, while the ciphertext
letter F corresponds to the plaintext letter J. On the other hand, the letter Z corresponds to Z (it is
not changed during encryption).
The alphabet can be mixed by choosing a keyword (or a few keywords), writing it
down (skipping repeated letters) under the original alphabet letters, and filling the remaining
empty positions at the end with the remaining alphabet letters. This makes it easier to remember
the substitutions and to exchange the secret key between the parties.
This method was used to create the substitution in the example above. A Latin
sentence: timeo Danaos et dona ferentes was used as its keywords.
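The keyword method can be sketched in Python as follows. The function names are illustrative; the sketch works on uppercase Latin letters and leaves all other characters unchanged.

```python
import string

def mixed_alphabet(keyword: str) -> str:
    """Keyword letters first (repeats skipped), then the unused letters."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    return "".join(seen)

def encrypt(plaintext: str, keyword: str) -> str:
    table = str.maketrans(string.ascii_uppercase, mixed_alphabet(keyword))
    return plaintext.upper().translate(table)

def decrypt(ciphertext: str, keyword: str) -> str:
    # Decryption applies the same table in the reverse direction.
    table = str.maketrans(mixed_alphabet(keyword), string.ascii_uppercase)
    return ciphertext.upper().translate(table)
```

Calling mixed_alphabet("timeo Danaos et dona ferentes") reproduces the second row of the example, TIMEODANSFRBCGHJKLPQUVWXYZ.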
One of the characteristics of simple substitution ciphers is that different plaintext alphabet letters
would always produce different ciphertext letters. It is not possible that both of them would be
encrypted by the same alphabet letter.
The Caesar cipher is a simple substitution cipher, which replaces each plaintext letter by a
different letter of the alphabet. The cipher is named after Gaius Julius Caesar (100 BC – 44 BC),
who used it for communication with his friends and allies.
Usage
Julius Caesar encrypted his correspondence in many ways, for example by writing texts
in reverse order or writing Latin texts using Greek letters. Some ancient authors (for example,
the Roman historian Gaius Suetonius Tranquillus, who lived in the first century of our era) wrote
that he was using the cipher with various shifts, of one or three characters.
Algorithm
The Caesar cipher is one of the simplest substitution ciphers.
Each plaintext letter is replaced by another one, which is offset by a certain amount of alphabet
positions (always in the same direction). If the algorithm points to the position after the last letter
in the alphabet, one should move to the beginning of the alphabet.
The cipher can be presented using mathematical formulas for encrypting and decrypting
characters:
En(x) = (x+n) mod 26
Dn(x) = (x-n) mod 26
where:
n is the offset (the secret key) and 26 is the total number of letters in the Latin alphabet
(of course, for other languages one should use a different number).
Implementation
Simple encryption and decryption functions implemented in Python:
KEY = 3

def encrypt(text):
    # Shift every Latin letter forward by KEY positions, wrapping around
    # the end of the alphabet. Characters other than letters are dropped.
    encrypted = ""
    for ch in text:
        if ord(ch) >= ord('a') and ord(ch) <= ord('z'):
            newCode = ord(ch) + KEY
            if newCode > ord('z'):
                newCode -= 26
            encrypted += chr(newCode)
        elif ord(ch) >= ord('A') and ord(ch) <= ord('Z'):
            newCode = ord(ch) + KEY
            if newCode > ord('Z'):
                newCode -= 26
            encrypted += chr(newCode)
    return encrypted

def decrypt(text):
    # Shift every Latin letter back by KEY positions, wrapping around
    # the beginning of the alphabet.
    decrypted = ""
    for ch in text:
        if ord(ch) >= ord('a') and ord(ch) <= ord('z'):
            newCode = ord(ch) - KEY
            if newCode < ord('a'):
                newCode += 26
            decrypted += chr(newCode)
        elif ord(ch) >= ord('A') and ord(ch) <= ord('Z'):
            newCode = ord(ch) - KEY
            if newCode < ord('A'):
                newCode += 26
            decrypted += chr(newCode)
    return decrypted
ROT13
SIMPLE SUBSTITUTION CIPHER
ROT13 is a simple substitution cipher. It is a special case of the Caesar cipher. It came
into use in the early 1980s.
Usage
The cipher is often used for hiding the content of transmitted information and for evading
automatic algorithms that check the words used in messages. It is used for hiding e-mail
addresses from spambots or prohibited content in posts on internet forums.
Algorithm
The ROT13 algorithm shifts each letter by 13 positions in the Latin alphabet (which
contains 26 letters in total). Because 13 is half of 26, applying ROT13 twice restores the original
text, so the same operation is used for both encryption and decryption.
Interestingly (and sadly), there are known cases of the cipher being used in respectable and
popular applications (Netscape Communicator, ebooks of New Paradigm Research Group)
in order to protect the stored data.
Implementation
ROT13 can be implemented using a Linux command tr:
alias rot13="tr a-zA-Z n-za-mN-ZA-M"
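The same translation can be sketched in Python with a table that mirrors the two arguments of the tr command; the name rot13 is only illustrative.

```python
import string

LOWER, UPPER = string.ascii_lowercase, string.ascii_uppercase

# Map a-z to n-za-m and A-Z to N-ZA-M; other characters are left untouched.
ROT13_TABLE = str.maketrans(
    LOWER + UPPER,
    LOWER[13:] + LOWER[:13] + UPPER[13:] + UPPER[:13],
)

def rot13(text: str) -> str:
    return text.translate(ROT13_TABLE)
```

Applying rot13 to its own output returns the original text, so no separate decryption function is needed.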
Homophonic Substitution Ciphers
Homophonic substitution ciphers convert each plaintext character to one of the previously
determined letters or graphic symbols.
Usage
Homophonic substitution ciphers were invented as an improvement of simple substitution ciphers.
They were very popular during the Renaissance and they were used by diplomats in Europe for
many centuries.
Description
Homophonic substitution ciphers work by replacing each plaintext character by another character,
number, word or even graphic symbol. To decode the ciphertext, one should use the reverse
substitution and change the symbols back into letters.
The main motivation for introducing ciphers of this type was the possibility of obscuring
the frequencies of ciphertext characters. Usually, each popular letter is replaced by one of several
characters, numbers or phrases. The different replacements are chosen at random, which makes
frequency analysis much more difficult.
Because all 26 letters of the Latin alphabet have to be replaced by many
corresponding symbols, the most popular technique is to assign a few numbers to each letter.
One can also expand the alphabet and add a few new characters, for example by assigning
different meanings to small and large letters, writing letters upside down or inventing new graphic
symbols.
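A minimal sketch of the number-based variant is shown below. The table layout is an assumption made for the example: six frequent English letters receive three two-digit codes each and every other letter receives one, and a seed stands in for the secret key. Python's random module is used only for illustration and is not suitable for real secrets.

```python
import random
import string

def build_table(seed: int) -> dict:
    """Assign unique two-digit codes: 3 per common letter, 1 per other letter."""
    codes = [f"{i:02d}" for i in range(100)]
    random.Random(seed).shuffle(codes)      # the seed plays the role of the key
    table = {}
    for letter in string.ascii_lowercase:
        count = 3 if letter in "etaoin" else 1
        table[letter] = [codes.pop() for _ in range(count)]
    return table

def encrypt(plaintext: str, table: dict) -> str:
    """Replace each letter by one of its codes, chosen at random."""
    return " ".join(random.choice(table[ch]) for ch in plaintext.lower() if ch in table)

def decrypt(ciphertext: str, table: dict) -> str:
    reverse = {code: letter for letter, codes in table.items() for code in codes}
    return "".join(reverse[code] for code in ciphertext.split())
```

Encrypting the same plaintext twice will usually give different ciphertexts, yet both decrypt to the same letters, which is exactly what flattens the frequency statistics.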
The first mention of book ciphers appeared in 1526 in the works of Jacobus Silvestri.
Usage
The first book ciphers were invented around seventy years after the first efficient methods of
printing books were developed in the 15th century. Thanks to their simplicity, they remained in
use for hundreds of years.
Algorithm
There are several types of book cipher's algorithms. The most popular method consists of
replacing each letter of the plaintext by three numbers - the number of a page, the number of
a line and the number of a character in the line. The numbers are chosen in such a way,
to indicate the same letter, as in the plaintext. Therefore the ciphertext consists of long
sequences of numbers. During decryption one must find each letter of the message pointed by
the triple sequences of numbers.
The other type of the book cipher consists in replacing each letter of the plaintext by two
numbers - the number of a page and the number of a word on the page. Both sides must agree
in advance, which letter of the pointed words will be used for encryption (for example the first one
or - because usually the first letters of words are generally less diverse - the second one or
the third one).
In order to use the book cipher, both sides must agree in advance for using exactly the same
book (including exactly the same edition) during their communication. Because of its popularity
and because all its verses are numbered, the common practice is to use the Bible as the key.
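The first variant (page, line, character) can be sketched as follows. The tiny two-page BOOK is a made-up stand-in for a real shared book, and the function names are hypothetical; numbering starts at 1, as it would for a reader counting by hand.

```python
import random

# Hypothetical shared "book": a list of pages, each a list of lines.
BOOK = [
    ["the quick brown fox", "jumps over the lazy dog"],
    ["pack my box with", "five dozen liquor jugs"],
]

def index_book(book):
    """Map every letter to all (page, line, character) positions where it occurs."""
    positions = {}
    for p, page in enumerate(book, 1):
        for l, line in enumerate(page, 1):
            for c, ch in enumerate(line, 1):
                if ch.isalpha():
                    positions.setdefault(ch, []).append((p, l, c))
    return positions

def encrypt(plaintext, book):
    """Replace each letter by a randomly chosen position of that letter."""
    positions = index_book(book)
    return [random.choice(positions[ch]) for ch in plaintext.lower() if ch.isalpha()]

def decrypt(triples, book):
    return "".join(book[p - 1][l - 1][c - 1] for p, l, c in triples)
```

A letter that never occurs in the book cannot be encrypted (the sketch raises KeyError), which is one practical reason for choosing a long book.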
Second, there is a real danger that the intruder will learn or guess which book is used by both sides
for encrypting their communication. In the era of computers, it is not a big problem to quickly
check a large number of books potentially used as secret keys by using brute-force attacks.
Usage
Polygraphic substitution ciphers were invented as an improvement of simple substitution ciphers.
They were very popular during the Renaissance and they were used in Europe for many
centuries.
Description
Polygraphic substitution ciphers work by dividing the plaintext into groups of characters, and
replacing each group by a word, a single character or number, or anything else. To decrypt the
ciphertext, one should use the reverse substitution and change the phrases back.
The main motivation for introducing ciphers of this type was the possibility of obscuring
the frequencies of ciphertext characters. To achieve that, each popular plaintext phrase should be
replaced by one of a few characters, numbers, or other phrases previously assigned to it. The
different replacements should be chosen at random, thus making frequency analysis much more difficult.
Polygraphic substitution ciphers provide more randomness and flexibility than homophonic
substitution ciphers because they can encrypt whole groups of characters at once.
A popular technique used in polygraphic substitution ciphers is to assign several predefined words
or numbers to each popular plaintext word. European diplomats used codenames to encode
important institutions, places, and names of important people.
Polygraphic Substitution Ciphers:
Playfair Cipher
POLYGRAPHIC SUBSTITUTION CIPHER
The cipher was invented by the British inventor Charles Wheatstone, who lived in the 19th
century. Its first description was presented in 1854. The cipher is named after the Scottish
scientist and politician, Lyon Playfair, who heavily popularized its use.
Usage
With the support of Baron Playfair, the cipher was adopted by the British Army. It was
used during the Second Boer War, and then in World War I and World War II (also by other
countries). Like all other ciphers of that period, it was withdrawn from use when the first computers
appeared. Nowadays, it can be broken relatively quickly by using brute force attacks.
Algorithm
The Playfair cipher is a kind of polygraphic substitution cipher. A plaintext is divided into groups
of characters and then one of the predefined characters is assigned to each group. The Playfair's
algorithm operates on groups of size of two letters.
Before encryption, one should prepare a table based on a secret keyword. The table has
dimensions of 5 by 5 letters and contains 25 letters of the Latin alphabet (the Latin alphabet has
26 letters, so one should skip one of the rare letters - for example x or q; or should
count i and j as one letter).
To fill the table cells, one should use the secret keyword (or a few secret words). First, all
duplicated letters in the secret word should be skipped (only the first occurrences should be used).
Then, all the remaining letters should be entered into the table, without changing their original
order found in the keyword. Before doing that, the parties should agree in which order the table
ought to be filled (for example, row by row from left to right and from top to bottom). The remaining
cells of the table should be filled with the remaining alphabet letters in the ordinary alphabetical order.
For example, if one uses a Latin sentence as a keyword: pecunia non olet (it is believed that its
author was Roman Emperor Vespasian), counting i and j as one letter and filling the table row
by row, from top to bottom, one will receive the following table:
p e c u n
i a o l t
b d f g h
k m q r s
v w x y z
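The table-filling procedure can be sketched in Python as follows, assuming that i and j are counted as one letter and that the table is filled row by row; the function name is illustrative.

```python
def playfair_table(keyword: str) -> list:
    """Return the 5x5 table as five 5-letter strings, filled row by row."""
    letters = []
    # Keyword letters first, then the alphabet without "j" (i and j are merged).
    for ch in keyword.lower() + "abcdefghiklmnopqrstuvwxyz":
        if ch == "j":
            ch = "i"
        if "a" <= ch <= "z" and ch not in letters:
            letters.append(ch)
    return ["".join(letters[r:r + 5]) for r in range(0, 25, 5)]

for row in playfair_table("pecunia non olet"):
    print(" ".join(row))
```

Every letter of the 25-letter alphabet appears in the resulting table exactly once, which is a quick sanity check for a table built by hand.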
The next step during encryption is about dividing the plaintext into parts of length of two letters. If
necessary, one can append a rare letter (for example X or Q) to the original text.
The algorithm finds both letters of each pair in the table and designates a rectangle that has two
corners pointed by the letters. Then, these two letters should be replaced by another two letters,
determined by two other rectangle's corners. This procedure should be performed for all plaintext
pairs, all of them should be replaced by letters received from the table.
Both parties should agree in which order the new letters ought to be appended to the ciphertext
(for example, the first letter would be a letter in a corner that is determined by a row that contains
the first of both encoding plaintext letters).
The case in which both letters of the pair being encrypted are located in the same row should
be handled differently. In such a case, one should usually replace them with the two letters lying
directly to the right of them. If the original letter is located in the last position of the row, one should
take the first letter of the row.
If both letters of the pair being encrypted are in the same column, one should perform
similar operations. Usually, one ought to replace them with the two letters lying directly below them.
If the original letter is in the last position of the column, one should take the first letter of
the column.
The last case to consider is the situation when the current pair consists of two identical
letters. One should add an additional rare letter (for example X or Q) before the first letter of
the pair. Then, after encrypting the new pair, one should continue the whole procedure for the rest
of the characters (starting from the second letter of the original pair).
Ciphertext decryption is performed in a similar way. First, the recipient must create (knowing
the secret keyword) the same table as the sender. Then, he decodes pairs of letters, using
analogous operations (determining rectangle corners in the reverse order and skipping
unnecessary added letters like X or Q).
Because of its simplicity, the cipher is characterized by features that make it easier to break. First,
one can notice that a pair of letters and its inverse pair (that means pairs like AC and CA)
produce ciphertext pairs that are also inverses of each other. This can be exploited by creating
databases of popular words and phrases that contain such combinations. Also, the Playfair cipher's
ciphertext never contains a pair made of two identical letters.
The other method of attacking the cipher is about randomly filling the table and trying to decode
the ciphertext based on its current values. Then, the attacker can slightly modify the table and try
to decode the ciphertext again. He should continue modifying the table, accepting changes that
improve quality of the current proposed plaintext. It is a relatively simple method, quite easy
to implement.
The third very effective method of breaking the Playfair cipher is about guessing plaintext
fragments, for example salutations to a sender, or dates and places of sending the message.
Knowing the ciphertext and probable plaintext parts, one can very easily recreate the table that
was used for encryption. This was a very common method of attacking German ciphers similar to
the Playfair cipher during the Second World War.
Implementation
Implementing the Playfair Cipher is a relatively simple task. The main challenge is to deal properly
with row and column numbers, for each pair of characters.
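Below, there is a JavaScript function which performs encryption of the input message, and returns
the result. Note that the key table is passed as a one-dimensional string:
function encryptPlayfair(messageInput, keySquare) {
    // keySquare: the 5x5 table as a 25-letter string, row by row (assumed);
    // messageInput: lowercase letters only, with j already replaced by i
    var messageOutput = '';
    var pos = 0;
    while (pos < messageInput.length) {
        var m1 = messageInput[pos];
        var m2 = messageInput[pos + 1] || 'x';                      // pad an odd-length message
        if (m1 === m2) { m1 = 'x'; pos += 1; } else { pos += 2; }   // split a double letter
        var p1 = keySquare.indexOf(m1), p2 = keySquare.indexOf(m2);
        var r1 = Math.floor(p1 / 5), k1 = p1 % 5;
        var r2 = Math.floor(p2 / 5), k2 = p2 % 5;
        if (r1 === r2) { k1 = (k1 + 1) % 5; k2 = (k2 + 1) % 5; }        // same row: move right
        else if (k1 === k2) { r1 = (r1 + 1) % 5; r2 = (r2 + 1) % 5; }   // same column: move down
        else { var t = k1; k1 = k2; k2 = t; }                           // rectangle: swap columns
        var c1 = keySquare[r1 * 5 + k1];
        var c2 = keySquare[r2 * 5 + k2];
        messageOutput = messageOutput.concat(c1);
        messageOutput = messageOutput.concat(c2);
    }
    return messageOutput;
}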
You may try out the Playfair Cipher online on Crypto-Online website.
A two-square cipher is a modification of the Playfair cipher and provides slightly better protection
of exchanged messages.
Usage
The cipher was widely used by diplomats and armies until World War II. Nowadays, it is
considered to be easily breakable by using brute force attacks.
Algorithm
The two-square cipher is a polygraphic substitution cipher. The original plaintext is divided into
groups of a few letters. Then, each group is replaced by another previously determined group
of characters. The two-square cipher operates on groups of the size of two letters.
Before encryption one should prepare two tables, using words (or longer phrases) that are used
as secret keys. Both tables have dimensions of 5 by 5 letters and contain 25 letters of the Latin
alphabet. The Latin alphabet has 26 letters, so to create the table, one of the rare letters should
be skipped (for example, x or q), or the letters i and j should be treated as one letter.
During inserting letters into the two tables, one should use the secret keywords. First, all
duplicated letters in the keywords should be skipped (only the first occurrences should remain).
Then, all the remaining letters should be entered (in the original order) into the tables (letters from
the first keyword to the first table, and letters from the second one to the second table). Of course,
before that, the parties should agree, in which order the table cells ought to be filled (for example,
row by row from left to right, and from top to bottom). The rest cells of the table should be filled
with the remaining alphabet letters, usually in the alphabetical order.
For example, if one used names of both parents of the Roman Emperor Vespasian as secret
keywords: Titus Flavius Sabinus and Vespasia Polla, treating the letters i and j as one letter, and
filling the tables row by row, from left to right and from top to bottom, one would receive
the following two tables:
t i u s f
l a v b n
c d e g h
k m o p q
r w x y z
v e s p a
i o l b c
d f g h k
m n q r t
u w x y z
The tables should be placed side by side in such a way that lines of rows (or lines of columns)
are aligned.
To perform encryption, as a next step, one should divide the plaintext into pairs. Each pair should
consist of two consecutive letters. If necessary, a rare letter may be appended to the original text
(for example X or Q).
Then, one should find the first letter of each pair in the first table, and the second letter in
the second table. Then, one should create a rectangle that covers over two tables and has
corners in cells determined by the two letters. To encrypt those two letters, they have to be
replaced by another two letters, that are determined by two other corners of the rectangle.
The same steps should be repeated for all plaintext pairs. All letters should be replaced by letters
chosen based on the tables.
For example, encrypting of a pair of letters AS by using the tables defined above, produces one
of the two pairs: IL or LI.
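The example can be reproduced with a short Python sketch. It assumes the tables are placed side by side (rows aligned), that i and j are merged, and that the first ciphertext letter is taken from the left table; the function names are illustrative.

```python
def square(keyword: str) -> str:
    """Return a 5x5 table as a 25-letter string, filled row by row."""
    letters = []
    for ch in keyword.lower() + "abcdefghiklmnopqrstuvwxyz":
        if ch == "j":
            ch = "i"
        if "a" <= ch <= "z" and ch not in letters:
            letters.append(ch)
    return "".join(letters)

def encrypt_pair(pair: str, left: str, right: str) -> str:
    """Find the first letter in the left table and the second in the right one,
    then take the other two corners of the rectangle (rows are swapped)."""
    r1, c1 = divmod(left.index(pair[0]), 5)
    r2, c2 = divmod(right.index(pair[1]), 5)
    return left[r2 * 5 + c1] + right[r1 * 5 + c2]

left = square("Titus Flavius Sabinus")
right = square("Vespasia Polla")
print(encrypt_pair("as", left, right))
```

With these conventions the pair AS is encrypted as IL; taking the letters in the opposite order would give LI. A pair whose letters lie in the same row is returned unchanged.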
Note, that both parties should agree in which order the new letters ought to be appended to
the ciphertext (for example, the first letter would be a letter from the left table and the second letter
would be taken from the right table).
If both letters of the pair being encrypted are located in the same row (or, respectively, in
the same column), then the new pair should be the same as the original one. It must be highlighted
that one of the weaknesses of the two-square cipher is that such a situation takes place for twenty
percent of all possible pairs of letters.
Ciphertext decryption is performed in a similar way. Firstly, the recipient should create (knowing
the secret keywords) the same two tables as the sender. Then, he should decode pairs of letters,
by using analogous operations. He should find the two ciphertext letters in the two tables, then he
should determine the rectangle corners (which will locate the original plaintext letters), and finally
the plaintext letters should be appended in the correct order to the plaintext.
Ciphertext frequency analysis is about finding frequent repetitions of the same pairs of letters.
Knowing approximate frequencies of digraphs in a given language, one can try to match popular
ciphertext pairs to popular digraphs occurring in the language. Because of using two secret keys,
this is a more difficult task than the same operation performed against the Playfair cipher.
Guessing plaintext fragments is about finding ciphertext letters that correspond to popular phrases
expected during communications, for example welcoming, or date and place of creating the
message. Knowing the ciphertext and probable plaintext fragments, one can recreate the tables
used for encryption. This was a very common method of attacking German ciphers similar to
the two-square cipher, during the Second World War.
Implementation
Implementing the Two-Square Cipher is a relatively simple task. After preparing the input text and
passwords, the main challenge is to deal properly with row and column numbers, for each
character.
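Below, there is a JavaScript function which performs encryption of the input message, and returns
the result. Note that both key tables are passed as one-dimensional strings:
function encryptTwoSquare(messageInput, square1, square2) {
    // square1, square2: the two 5x5 tables as 25-letter strings, row by row (assumed);
    // messageInput: lowercase letters only, with j already replaced by i
    var messageOutput = '';
    var pos = 0;
    while (pos < messageInput.length) {
        var m1 = messageInput[pos];
        var m2 = messageInput[pos + 1] || 'x';   // pad an odd-length message
        pos += 2;
        var p1 = square1.indexOf(m1);            // first letter: left table
        var p2 = square2.indexOf(m2);            // second letter: right table
        var r1 = Math.floor(p1 / 5), r2 = Math.floor(p2 / 5);
        // the other two corners of the rectangle: rows swapped, columns kept
        var c1 = square1[r2 * 5 + p1 % 5];
        var c2 = square2[r1 * 5 + p2 % 5];
        messageOutput = messageOutput.concat(c1);
        messageOutput = messageOutput.concat(c2);
    }
    return messageOutput;
}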
You may try out the Two-Square Cipher online on Crypto-Online website.
Four-Square Cipher
POLYGRAPHIC SUBSTITUTION CIPHER
The four-square cipher is a modified version of the Playfair cipher. It provides better security
of protected data. It was invented by a French cryptanalyst Félix Delastelle in 19th century.
Usage
It was used by all armies during World War II. Nowadays, it is considered to be easily breakable
by using brute force attacks.
Algorithm
The four-square cipher is a polygraphic substitution cipher. The whole plaintext is divided into
groups of letters. Then, each group is replaced by another previously determined group of
characters. The four-square cipher operates on groups of the size of two letters.
Before encryption, it is necessary to prepare four tables. Each table has dimensions of 5 by
5 letters and contains 25 letters of the Latin alphabet. Because the Latin alphabet
contains 26 letters, one of the rare letters (for example x or q) should be skipped, or the
letters i and j should be treated as one letter. The tables should be placed side by side in such
a way that they create a bigger square with a side length of two tables. The rows and columns
of the individual tables should be kept in line with each other.
Two tables (the upper left and lower right ones) contain letters in alphabetical order. The
two other tables are filled using two secret keywords (the ones that protect the data).
First, all duplicated letters in the keywords should be skipped (only the first
occurrences should be used). Then, all the remaining keyword letters should be entered, in the
original order, into the tables: the letters from the first keyword into the first remaining table
(for example the upper right one) and the letters from the second keyword into the second table.
The communicating parties must agree in advance on the order in which each table is filled (for
example row by row, from left to right and from top to bottom). The remaining cells of both tables
should be filled with the remaining letters of the alphabet, usually in alphabetical order.
For example, if one used the names of both parents of the Roman Emperor Vespasian as secret
keywords (Titus Flavius Sabinus and Vespasia Polla), treating the letters i and j as one letter and
filling the tables row by row, from left to right and from top to bottom, the following four tables
would be obtained (upper left and upper right in the first block, lower left and lower right in
the second block):
a b c d e   t i u s f
f g h i k   l a v b n
l m n o p   c d e g h
q r s t u   k m o p q
v w x y z   r w x y z

v e s p a   a b c d e
i o l b c   f g h i k
d f g h k   l m n o p
m n q r t   q r s t u
u w x y z   v w x y z
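The table-filling rule described above can be sketched in JavaScript. This is only an illustrative sketch: the function name buildTable is an assumption, and the code assumes that j is merged into i and that each table is written row by row as one 25-letter string.

```javascript
// Build a 25-letter keyed table, written row by row as one string.
// Assumption: the letter j is treated as i, so the alphabet has 25 letters.
function buildTable(keyword) {
    var alphabet = 'abcdefghiklmnopqrstuvwxyz';
    var letters = keyword.toLowerCase().replace(/j/g, 'i').replace(/[^a-z]/g, '');
    // keyword letters first, then the rest of the alphabet
    var source = letters + alphabet;
    var table = '';
    for (var k = 0; k < source.length; ++k) {
        // keep only the first occurrence of every letter
        if (table.indexOf(source[k]) === -1) {
            table += source[k];
        }
    }
    return table;
}

var upperRight = buildTable('Titus Flavius Sabinus');
var lowerLeft = buildTable('Vespasia Polla');
```

Reading each result five letters at a time gives the rows of the corresponding keyed table.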
In the next step of encryption, the whole plaintext should be split into pairs. Each pair should
consist of two consecutive letters. If required, a rare letter (for example X or Q) should be
appended to the original text to complete the last pair.
During encryption, two consecutive letters are encoded at a time. One should find the first letter
of each pair in the upper left table and the second letter in the lower right table. Then, one should
create a rectangle that spans the four tables and has corners in the cells determined by the two
plaintext letters. The letters are encrypted by replacing them with the two letters found in
the other two corners of the rectangle. The same steps should be repeated for all plaintext
pairs. All letters should be replaced by letters determined by the four encryption tables.
For example, after encryption of the pair AS using the tables defined above, one would receive
the new pair UM or MU. A lies in the first row and first column of the upper left table and S in
the fourth row and third column of the lower right table, so the other two corners are U (first row,
third column of the upper right table) and M (fourth row, first column of the lower left table).
Both parties should agree in which order the new letters ought to be appended to the ciphertext
(for example, the first letter would be a letter from the upper right table, and the second letter
should be taken from the lower left table).
Ciphertext decryption is performed in a similar way. First, the recipient, knowing the secret
keywords, should create the same four tables as the sender. Then, he should decode all letters pair
by pair, using analogous operations. He needs to find the two ciphertext letters in the upper right and
lower left tables. After that, the other two rectangle corners should be found; they point to the
two plaintext letters (taken from the upper left and lower right tables).
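Decryption of a single pair can be sketched as follows. The function name decryptPair is an assumption, as is the agreed order of the ciphertext letters (first letter from the upper right table, second letter from the lower left table). The two keyed tables are passed as 25-letter strings written row by row.

```javascript
// Decrypt one ciphertext pair of the four-square cipher.
// Assumptions: c1 comes from the upper right table, c2 from the lower left table,
// and the plain tables hold the alphabet without j.
function decryptPair(c1, c2, upperRight, lowerLeft) {
    var alphabet = 'abcdefghiklmnopqrstuvwxyz';
    var i1 = upperRight.indexOf(c1);
    var i2 = lowerLeft.indexOf(c2);
    // first plaintext letter: upper left table, row of c1, column of c2
    var m1 = alphabet[5 * Math.floor(i1 / 5) + i2 % 5];
    // second plaintext letter: lower right table, row of c2, column of c1
    var m2 = alphabet[5 * Math.floor(i2 / 5) + i1 % 5];
    return m1 + m2;
}

var pair = decryptPair('u', 'm', 'tiusflavbncdeghkmopqrwxyz', 'vespaiolbcdfghkmnqrtuwxyz');
```

With the Vespasian tables, the pair um decrypts to as.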
Implementation
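Implementing the Four-Square Cipher is a relatively simple task. After preparing the input text
and passwords, the main task is to deal properly with the row and column numbers of each pair of
characters.
Below is a JavaScript function which performs encryption of the input message and returns
the result. Note that both keywords are passed as one-dimensional strings. In this listing each
string is assumed to be already expanded to its full 25-letter table, written row by row (keyword1
for the upper right table, keyword2 for the lower left one):
function encrypt(messageInput, keyword1, keyword2) {
    // the upper left and lower right tables hold the plain alphabet (without j)
    var alphabet = 'abcdefghiklmnopqrstuvwxyz';
    var messageOutput = '';
    var pos = 0;
    while (pos < messageInput.length) {
        var m1 = messageInput[pos];
        var m2 = (pos + 1 < messageInput.length) ? messageInput[pos + 1] : 'x';
        var i1 = alphabet.indexOf(m1);
        var i2 = alphabet.indexOf(m2);
        var c1 = keyword1[5 * Math.floor(i1 / 5) + i2 % 5];
        var c2 = keyword2[5 * Math.floor(i2 / 5) + i1 % 5];
        messageOutput = messageOutput.concat(c1);
        messageOutput = messageOutput.concat(c2);
        pos += 2;
    }
    return messageOutput;
}
You may try out the Four-Square Cipher online on the Crypto-Online website.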
Hill Cipher
POLYGRAPHIC SUBSTITUTION CIPHER
The Hill cipher is considered to be the first polygraphic cipher in which it is practical to work on
more than three symbols at once.
Usage
The Hill cipher was created in 1929 by Lester S. Hill, an American mathematician.
Algorithm
In the Hill cipher each letter corresponds to one unique number, from 0 to 25. The simplest
scheme (A = 0, B = 1, ..., Z = 25) is used most often, but one can choose another assignment
as well.
Messages are divided into n-letter blocks. Encryption is performed by multiplying each block
by one n x n secret matrix, which also contains numbers from 0 to 25. All the results should be
reduced modulo 26. The matrix can be defined based on a secret keyword which contains n^2 letters
(any further letters are simply ignored).
The decryption algorithm is similar to the encryption process. One should divide the ciphertext
into blocks (each with n letters) and multiply them by the inverse, modulo 26, of the matrix used for
encryption. Such an inverse exists only if the determinant of the matrix has no common factor with 26.
Encryption and Decryption
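To encrypt the message vino using a 2 x 2 matrix, one should divide the message into two blocks
of two letters. Then one should change the letters into numbers (using the letter numbering agreed
for this example):
$$\text{vi} \rightarrow \begin{pmatrix} 21 \\ 10 \end{pmatrix}, \qquad \text{no} \rightarrow \begin{pmatrix} 14 \\ 15 \end{pmatrix}$$
The key matrix is:
$$K = \begin{pmatrix} 3 & 3 \\ 2 & 5 \end{pmatrix}$$
It is easy to calculate the inverse of the matrix modulo 26, which is used for decryption:
$$K^{-1} = \begin{pmatrix} 15 & 17 \\ 20 & 9 \end{pmatrix}$$
Encryption of the two plaintext blocks consists of multiplying them by the key matrix:
$$\begin{pmatrix} 3 & 3 \\ 2 & 5 \end{pmatrix} \begin{pmatrix} 21 \\ 10 \end{pmatrix} = \begin{pmatrix} 93 \bmod 26 \\ 92 \bmod 26 \end{pmatrix} = \begin{pmatrix} 15 \\ 14 \end{pmatrix}$$
$$\begin{pmatrix} 3 & 3 \\ 2 & 5 \end{pmatrix} \begin{pmatrix} 14 \\ 15 \end{pmatrix} = \begin{pmatrix} 87 \bmod 26 \\ 103 \bmod 26 \end{pmatrix} = \begin{pmatrix} 9 \\ 25 \end{pmatrix}$$
Decryption multiplies the two ciphertext blocks by the inverse matrix:
$$\begin{pmatrix} 15 & 17 \\ 20 & 9 \end{pmatrix} \begin{pmatrix} 15 \\ 14 \end{pmatrix} = \begin{pmatrix} 463 \bmod 26 \\ 426 \bmod 26 \end{pmatrix} = \begin{pmatrix} 21 \\ 10 \end{pmatrix}$$
$$\begin{pmatrix} 15 & 17 \\ 20 & 9 \end{pmatrix} \begin{pmatrix} 9 \\ 25 \end{pmatrix} = \begin{pmatrix} 560 \bmod 26 \\ 405 \bmod 26 \end{pmatrix} = \begin{pmatrix} 14 \\ 15 \end{pmatrix}$$
The four numbers obtained can be changed back into the original plaintext letters vino.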
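The arithmetic of this example can be checked with a few lines of JavaScript. This is a minimal sketch: the function name multiplyMod26 is an assumption, and the blocks are passed directly as numbers, so no letter-to-number mapping is implied.

```javascript
// Multiply a square matrix by a column block and reduce every entry modulo 26.
function multiplyMod26(matrix, block) {
    return matrix.map(function (row) {
        var sum = 0;
        for (var j = 0; j < row.length; ++j) {
            sum += row[j] * block[j];
        }
        return sum % 26;
    });
}

var K = [[3, 3], [2, 5]];
var Kinv = [[15, 17], [20, 9]];

var cipherBlock = multiplyMod26(K, [21, 10]);       // encrypt the first block
var plainBlock = multiplyMod26(Kinv, cipherBlock);  // decrypt it again
```

Multiplying by K and then by its inverse returns the block that was started with, which is what makes the inverse matrix usable for decryption.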
Polyalphabetic Substitution Ciphers
Each plaintext character is replaced by another letter. The substitution changes cyclically
and depends on the current position of the letter being modified.
Usage
Polyalphabetic substitution ciphers were invented by an artist, philosopher and scientist Leon
Battista Alberti. In 1467 he presented a device called the cipher disk. It provides polyalphabetic
substitutions with mixed alphabets.
Description
In polyalphabetic substitution ciphers one should define a few possible combinations of
substitutions of all alphabet letters by other letters. Then, one should use the substitutions
cyclically, one after the other, changing the replacement after each new letter.
To use this cipher, one should choose, remember and deliver to all parties some substitutions of
all alphabet letters. Then, the substitutions should be used in a specific order. To decrypt
the message, one should use the corresponding substitutions in the same order, but the letters
should be changed in the opposite direction.
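The cyclic use of several substitutions can be sketched as follows. The two substitution alphabets below are arbitrary assumptions chosen for illustration, and the function names are hypothetical; messages are assumed to contain only the capital letters A to Z.

```javascript
// Each string lists the replacements for A, B, C, ..., Z in turn.
// These two alphabets are illustrative assumptions, not a recommended key.
var PLAIN = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
var substitutions = [
    'BCDEFGHIJKLMNOPQRSTUVWXYZA',
    'ZYXWVUTSRQPONMLKJIHGFEDCBA'
];

function encrypt(message) {
    var output = '';
    for (var pos = 0; pos < message.length; ++pos) {
        // the substitution changes after every letter and then repeats
        var alphabet = substitutions[pos % substitutions.length];
        output += alphabet[PLAIN.indexOf(message[pos])];
    }
    return output;
}

function decrypt(ciphertext) {
    var output = '';
    for (var pos = 0; pos < ciphertext.length; ++pos) {
        // same substitutions in the same order, applied in the opposite direction
        var alphabet = substitutions[pos % substitutions.length];
        output += PLAIN[alphabet.indexOf(ciphertext[pos])];
    }
    return output;
}
```

Note how the two B letters of ABBA end up as different ciphertext letters, because they fall on different substitutions.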
The strongest version of a polyalphabetic substitution cipher is to define all its transformations
randomly. Such a method was preferred by Alberti himself.
On the other hand, due to the large amount of data to remember, some easy to remember and
easy to hand over to another person substitutions were invented and widely used. The Vigenère
cipher is an example of such an approach.
Security of polyalphabetic substitution ciphers
A properly implemented polyalphabetic substitution cipher is quite difficult to break. Its strength is
based on the many possible combinations of changing alphabet letters. Some effective methods of
attacking such ciphers were discovered in the nineteenth century. Their first step is to guess
the length of the secret key. After that, one can examine the ciphertext using frequency
analysis methods.
Trithemius Cipher
POLYALPHABETIC SUBSTITUTION CIPHER
The cipher was invented by the German monk Johannes Trithemius, who lived at the turn
of the fifteenth and sixteenth centuries. He described it in his book Polygraphia, written in 1508
and first printed in 1518. This is considered to be one of the first books dedicated entirely to
cryptography.
Usage
The cipher is very simple and it doesn't provide good security of transmitted messages. However,
the Trithemius cipher was an important step during developing polyalphabetic ciphers
in sixteenth-century Europe.
Algorithm
The Trithemius cipher was one of many polyalphabetic ciphers designed to be easy to use
frequently. Instead of using random combinations of alphabet letters, Trithemius proposed using a
special table. The table was later named tabula recta.
Tabula Recta
The first row contains all alphabet letters in the original order. Next rows also contain all letters
but in each row they are shifted to the left by one position. The table has 26 rows and 26 columns
(there are 26 letters in the Latin alphabet).
During encryption, consecutive plaintext letters are replaced by the corresponding letters from
consecutive rows of the table. After using the last row, one should move back to the first row. This
means that every plaintext letter is shifted by the number of positions determined by the current
row. Therefore the first letter is encrypted without a shift, the second letter with the shift
determined by the second row (so by one position), the third letter with the shift determined by
the third row (so by two positions) and so on.
For example, the word MACHINE encoded using the cipher gives the ciphertext MBEKMSK.
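The row-by-row shifting can be sketched in JavaScript as follows. The function name trithemiusEncrypt is an assumption, and the message is assumed to contain only the capital letters A to Z.

```javascript
// Shift the letter at position pos by pos places (modulo 26),
// which corresponds to walking down the rows of the tabula recta.
function trithemiusEncrypt(message) {
    var A = 'A'.charCodeAt(0);
    var output = '';
    for (var pos = 0; pos < message.length; ++pos) {
        var shift = pos % 26; // after the last row, start again from the first row
        var letter = message.charCodeAt(pos) - A;
        output += String.fromCharCode(A + (letter + shift) % 26);
    }
    return output;
}

var ciphertext = trithemiusEncrypt('MACHINE');
```

For the word MACHINE the shifts are 0, 1, 2, 3, 4, 5 and 6, which reproduces the ciphertext given above.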
Vigenère Cipher
POLYALPHABETIC SUBSTITUTION CIPHER
The cipher was invented by the Italian Giovan Battista Bellaso, who described it in 1553 in his
book "La cifra del. Sig. Giovan Battista Bellaso". However, because of a mistaken belief that became
widespread in the nineteenth century, it is named after the French diplomat and alchemist Blaise de
Vigenère, who lived in the sixteenth century.
Usage
The Vigenère cipher is quite easy to use and provides relatively good security. It was widely used
for a long time, until the twentieth century.
Algorithm
The Vigenère cipher is a kind of polyalphabetic substitution cipher. It replaces plaintext
letters with other letters. The parties have to agree on a shared keyword (which may
also be a sentence), which is used by the encryption algorithm. They don't have to specify all
26 substitutions for all possible letters of the alphabet.
During encryption and decryption, one should use a table which contains all alphabet letters in
the correct order in the first row and then, in each subsequent row, the letters shifted to the left
by one further position. The table has the Latin name tabula recta and it was first used in
cryptography by the German monk Johannes Trithemius.
Tabula Recta
In order to encrypt a message, one should use a secret keyword (or a few words). The keyword
is used to choose rows with variously shifted alphabet letters. Subsequent plaintext letters are
replaced by subsequent corresponding letters in rows, which are pointed by keyword letters.
For example, if one chooses the word rex as a secret keyword, the first message letter should be
encrypted using the row r, the second letter using the row e, and the third letter using the row x.
Therefore, the first letter should be shifted by 17 positions, the second plaintext letter should be
shifted by 4 positions and the third letter should be shifted by 23 alphabet positions. Then, one
should use the keyword letters again from the beginning. The fourth plaintext letter will be
encrypted using the row r.
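A minimal JavaScript sketch of this procedure is shown below. The function names are assumptions, and both the message and the keyword are assumed to contain only the capital letters A to Z.

```javascript
// Shift every message letter by the alphabet position of the current keyword letter.
// direction = 1 encrypts, direction = -1 decrypts.
function vigenere(message, keyword, direction) {
    var A = 'A'.charCodeAt(0);
    var output = '';
    for (var pos = 0; pos < message.length; ++pos) {
        // after the last keyword letter, start again from the first one
        var shift = keyword.charCodeAt(pos % keyword.length) - A;
        var letter = message.charCodeAt(pos) - A;
        output += String.fromCharCode(A + (letter + direction * shift + 26) % 26);
    }
    return output;
}

function vigenereEncrypt(message, keyword) {
    return vigenere(message, keyword, 1);
}

function vigenereDecrypt(ciphertext, keyword) {
    return vigenere(ciphertext, keyword, -1);
}

var shifted = vigenereEncrypt('AAAA', 'REX');
```

Encrypting a run of the letter A simply reveals the keyword letters in turn, here R, E, X and then R again, which mirrors the shifts 17, 4 and 23 described above.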
There is a simple variant of the Vigenère cipher, referred to as Variant Beaufort. Using this
variant, one should encrypt the message using the Vigenère decryption method and decrypt
the ciphertext using the Vigenère encryption algorithm. One simply moves the letters in the opposite
direction to the original algorithm. This method has nothing in common with the Beaufort
cipher, so they shouldn't be confused.
After determining the length of the key, further cryptanalysis is based on frequency analysis
of ciphertext letters. Ciphertext letters encrypted with different secret key letters should
be analyzed separately: the letters encrypted using the first secret key letter form one group,
the letters encrypted using the second secret key letter form another group, and so on.
Knowing the key size, the main task is therefore to break a few texts, each separately
encrypted using the Caesar cipher.
Kasiski examination
This method of determining the secret key length was created by the German soldier,
archaeologist and cryptographer Friedrich Kasiski in the nineteenth century. One should search
through the ciphertext looking for repeated sequences of characters. Finding such sequences may
mean that they were created by encoding the same parts of the plaintext using the same parts of
the secret key.
For example, the principle can be noticed when the following plaintext letters are encrypted using
the following secret key letters:
Key: NATURAENATURAENATURAENATURAE
Plaintext: ALIUDESTFACEREALIUDESTDICERE
Ciphertext: NLBOUEWGFTWVRINLBOUEWGDBWVRI
Here the sequences NLBOUEWG and WVRI each appear twice, 14 positions apart, and 14 is a multiple
of the key length 7. If during ciphertext analysis one found several repeated sequences, then one
could assume that the length of the secret key is one of the numbers that divide the distances
between the repetitions.
Repeated sequences of ciphertext characters may also be caused by random mixing of various
plaintext and secret key letters. The more repeated sequences are found in the ciphertext,
the more likely it is that they are caused by encrypting the same parts of the plaintext using
the same secret key letters (and are not just a random coincidence).
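The search for repeated sequences can be sketched as below. This is an illustrative sketch only: the function name repeatedSequences and the choice of three-letter sequences are assumptions.

```javascript
// List every sequence of the given length that occurs again later in the text,
// together with the distance to its previous occurrence.
function repeatedSequences(ciphertext, length) {
    var lastSeen = new Map();
    var repeats = [];
    for (var pos = 0; pos + length <= ciphertext.length; ++pos) {
        var seq = ciphertext.substring(pos, pos + length);
        if (lastSeen.has(seq)) {
            repeats.push({ sequence: seq, distance: pos - lastSeen.get(seq) });
        }
        lastSeen.set(seq, pos);
    }
    return repeats;
}

var repeats = repeatedSequences('NLBOUEWGFTWVRINLBOUEWGDBWVRI', 3);
```

For the ciphertext of the example every reported distance is 14, so the candidate key lengths are the divisors of 14, which include the true key length 7.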
Friedman test
William Friedman was a cryptographer in the US Army. In the third decade of the twentieth century
he developed a method of guessing the keyword length for the Vigenère cipher. It is based
on calculating an index of coincidence: one compares the ciphertext letters with the same
letters shifted by various numbers of positions.
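One simple reading of this idea is to count, for each shift, how many positions carry the same letter in the ciphertext and in its shifted copy. The sketch below does only that counting step (the function name coincidences is an assumption); shifts that are multiples of the key length tend to give noticeably higher counts.

```javascript
// Count the positions at which the text and its copy shifted by `shift` agree.
function coincidences(text, shift) {
    var count = 0;
    for (var pos = 0; pos + shift < text.length; ++pos) {
        if (text[pos] === text[pos + shift]) {
            count += 1;
        }
    }
    return count;
}

var atFourteen = coincidences('NLBOUEWGFTWVRINLBOUEWGDBWVRI', 14);
var atSeven = coincidences('NLBOUEWGFTWVRINLBOUEWGDBWVRI', 7);
```

On such a short ciphertext the counts are only indicative; a real test would use a longer text and normalise the counts by the number of compared positions.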
Beaufort Cipher
POLYALPHABETIC SUBSTITUTION CIPHER
The cipher is named after British admiral Francis Beaufort, who lived at the turn of the 18th and
19th centuries.
Usage
The Beaufort cipher is a simple polyalphabetic cipher. It uses a table called tabula recta, which
was first introduced in the Trithemius cipher. It shouldn't be confused with a special variant of
the Vigenère cipher, named Variant Beaufort.
The Beaufort cipher was used in the rotor-based Hagelin M-209 cipher machine in the middle of the
20th century.
Algorithm
The Beaufort cipher's algorithm is based on the table called tabula recta:
Tabula Recta
During encryption, all plaintext letters are replaced by other letters, based on the tabula
recta table. Both sides share one secret key, which consists of one or more words. Each plaintext
letter is encrypted by using one key letter. After the last key character has been used, the
algorithm goes back to the first key letter and starts taking key characters again from the
beginning.
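The encryption process is as follows. For every plaintext letter, one should perform the
following operations, using one key character: find the column of the tabula recta headed by
the plaintext letter, move down that column to the cell containing the key letter, and read the letter
that labels that row. This is the ciphertext letter. Numerically, the ciphertext letter equals the key
letter minus the plaintext letter, modulo 26. Decryption uses exactly the same steps.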
Implementation
A simple encryption/decryption function implemented in Python:
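A JavaScript counterpart, in the style of the other listings here, can be sketched as follows. It is a minimal sketch using the standard Beaufort rule (ciphertext letter = key letter minus plaintext letter, modulo 26); the function name beaufort is an assumption, and only the capital letters A to Z are handled.

```javascript
// The same function encrypts and decrypts, because subtracting from the key twice
// gives back the original letter.
function beaufort(message, keyword) {
    var A = 'A'.charCodeAt(0);
    var output = '';
    for (var pos = 0; pos < message.length; ++pos) {
        // after the last key letter, start again from the first one
        var k = keyword.charCodeAt(pos % keyword.length) - A;
        var m = message.charCodeAt(pos) - A;
        output += String.fromCharCode(A + (k - m + 26) % 26);
    }
    return output;
}

var ciphertext = beaufort('DEFEND', 'FORTIF');
var plaintext = beaufort(ciphertext, 'FORTIF');
```

Applying the function twice with the same key returns the original message.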
Running Key Cipher
POLYALPHABETIC SUBSTITUTION CIPHER
The running key cipher is a variation of the Vigenère cipher. Each letter of the plaintext is shifted
along some number of alphabet positions in one specified direction.
Usage
Like other polyalphabetic ciphers, the running key cipher was quite widely used until the first
part of the twentieth century, when effective attacks on this kind of cipher were discovered.
Algorithm
Encrypting with a running key is about shifting plaintext letters along some number of alphabet
positions. The numbers are determined by letters of a secret key (as in other substitution
ciphers). To look up the proper letters during encryption and decryption, one can use the tabula
recta, as in, for example, the Trithemius cipher or the Vigenère cipher, which are both based on
the same idea.
Tabula Recta
Instead of determining a secret keyword and then using it repeatedly to encrypt all
messages, the running key cipher uses long, publicly available sets of letters: books and other
similar long texts. The parties should agree exactly which book (and exactly which edition) they will
use during the communication. They must determine the number of the first page used for
encryption, the first row and the number of the letter in the row.
All letters of the message are encrypted using consecutive letters found in the book. After
encrypting some characters, one may jump to another, arbitrarily selected position in the book
and continue taking key letters from the new position. It is possible to encode the number of the new
page, the number of the new row and the number of the first letter in the row as a sequence of letters.
The letters can be appended to the plaintext and both can be encrypted together. The second
party, after finding the letters and decoding them, jumps to the new position of the secret key
letters. One may also provide information about changing the book used during encryption.
In order to increase the cipher's security, the parties can take ciphering letters not from one
sequence but from several different sequences (in different parts of the text) at the same time.
The attacker would have to guess the rules used for changing the sequences. In this case, the
analysis is much more difficult because the secret key letters don't form correct words.
Another idea to make cryptanalysis more difficult is to assign a few words to each alphabet
letter and to use those words instead of keyword letters. The method is intended to make it difficult
to distinguish ciphertext letters from plaintext letters. Usually ciphertext doesn't consist of words,
unlike plaintext and secret key sequences.
Effective and popular methods for improving the cipher and creating better secret key characters
include using texts which contain unusual expressions (this was often done, for example,
by the KGB) or avoiding the use of the tabula recta and replacing it with random combinations.
Autokey Cipher
POLYALPHABETIC SUBSTITUTION CIPHER
The autokey cipher was presented in 1586 by a French diplomat and alchemist Blaise de
Vigenère.
Usage
The autokey cipher was used in Europe until the 20th century. Currently it is considered to be
easy to break. However, the idea to create key letters based on plaintext letters is used in many
modern ciphers.
Algorithm
As in other polyalphabetic substitution ciphers, the autokey cipher algorithm is about
changing plaintext letters based on secret key letters. Each letter of the message is shifted along
some number of alphabet positions. The number of positions is equal to the place in the alphabet of
the current key letter.
To simplify calculations, one can use a table whose consecutive rows contain alphabets with
letters shifted by an increasingly large number of positions. The table is called tabula recta and
looks like the one below:
Tabula Recta
Unlike in other similar ciphers, after using all of the secret key letters, the algorithm doesn't go
back to the first key letter but starts to take plaintext letters as new key letters.
For example, after encrypting the two words Opinio communis using the secret key Ab ovo one
receives:
Plaintext: OPINIOCOMMUNIS
Key: ABOVOOPINIOCOM
Ciphertext: OQWIWCRWZUIPWE
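The example can be reproduced with the following JavaScript sketch. The function names are assumptions, and both inputs are assumed to contain only the capital letters A to Z with spaces already removed.

```javascript
var A = 'A'.charCodeAt(0);

// The key is the secret keyword followed by the plaintext itself.
function autokeyEncrypt(message, keyword) {
    var key = (keyword + message).substring(0, message.length);
    var output = '';
    for (var pos = 0; pos < message.length; ++pos) {
        var shift = key.charCodeAt(pos) - A;
        output += String.fromCharCode(A + (message.charCodeAt(pos) - A + shift) % 26);
    }
    return output;
}

// While decrypting, every recovered plaintext letter is appended to the key.
function autokeyDecrypt(ciphertext, keyword) {
    var key = keyword;
    var output = '';
    for (var pos = 0; pos < ciphertext.length; ++pos) {
        var shift = key.charCodeAt(pos) - A;
        var letter = String.fromCharCode(A + (ciphertext.charCodeAt(pos) - A - shift + 26) % 26);
        output += letter;
        key += letter;
    }
    return output;
}

var ciphertext = autokeyEncrypt('OPINIOCOMMUNIS', 'ABOVO');
```

Decryption has to proceed strictly from left to right, because each key letter beyond the keyword only becomes known once the corresponding plaintext letter has been recovered.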
To break the cipher, the intruder should try to guess some parts of the plaintext (for example by
trying some common sequences of letters). Comparing them with the ciphertext yields some
characters of the secret key. One should look for guesses which disclose correct
words among the secret key characters.
Nihilist Cipher
POLYALPHABETIC SUBSTITUTION CIPHER
First used in the 1880s in Russia by Nihilist organizations.
Usage
The cipher is named after the Nihilist movement, whose members fought against tsarism in Russia
and attacked tsarist officials in the nineteenth century. They assassinated Tsar Alexander II in
1881.
The original algorithm was not very strong but there are some modifications which provide much
better security. One of the ciphers which belong to the Nihilist family of ciphers is the VIC cipher.
Algorithm
The algorithm of the Nihilist cipher uses a matrix called a Polybius square. It has 5 rows and
5 columns and it is filled with all Latin letters (there are 26 Latin letters, so usually
the letters i and j are treated as one character).
The order of letters in the table depends on a secret word which is shared by the two
communicating parties. To determine the order of letters, one should remove duplicate letters from
the secret word and then enter the remaining letters into the table. Usually the letters are written
starting from the top leftmost cell and going to the right, row by row. However, the parties can agree
on a different order. The remaining empty cells are filled with the letters which aren't contained
in the secret word. Usually the parties use alphabetical order.
For example, if the full name of Tsar Alexander II, killed by the Nihilists (Aleksandr II Nikolaevich),
is used as the secret word, then the table can be written as below:
  1 2 3 4 5
1 a l e k s
2 n d r i o
3 v c h b f
4 g m p q t
5 u w x y z
Each letter of the secret key and each letter of a given message is changed into a two-digit
number, determined by the digits of its row and column. The secret key is usually different from
the secret word used to create the table in the previous step. The secret key is used during
encryption of the whole communication. Both the secret word (used for creating the table) and
the secret key (used for encrypting all messages) must be shared between the communicating
parties.
During encryption one should add, one by one, the numbers created from the plaintext letters to
the numbers created from the secret key's letters. The results can be two-digit or three-digit
numbers. The ciphertext can be left as a sequence of numbers, or the numbers
can be changed into letters using the same table and the inverse transform.
For example, the steps during encrypting of a sentence Acta est fabula using a secret
keyword Vivere and the Polybius square defined above, are presented below.
The plaintext and the secret encrypting key should be written in two rows, one under another:
a c t a e s t f a b u l a
v i v e r e v i v e r e v
After replacing letters by numbers, the rows have the following form:
11 32 45 11 13 15 45 35 11 34 51 12 11
31 24 31 13 23 13 31 24 31 13 23 13 31
The ciphertext is created by adding the plaintext numbers to the secret key numbers:
42 56 76 24 36 28 76 59 42 47 74 25 42
The recipient, who knows the secret key, subtracts the secret key numbers from the ciphertext
numbers. He receives the plaintext numbers, which can be changed into letters using the same
table as the sender used for encryption.
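The number-based form of the cipher can be sketched in JavaScript. The function names are assumptions; the square is the one built from the secret word Aleksandr II Nikolaevich, written row by row, and messages are assumed to be in lowercase without spaces and without the letter j.

```javascript
// Polybius square from the example, written row by row.
var square = 'aleksndriovchbfgmpqtuwxyz';

// Letter -> two-digit number (row digit, then column digit).
function toNumber(letter) {
    var index = square.indexOf(letter);
    return 10 * (Math.floor(index / 5) + 1) + (index % 5) + 1;
}

// Two-digit number -> letter.
function toLetter(number) {
    return square[5 * (Math.floor(number / 10) - 1) + (number % 10) - 1];
}

function nihilistEncrypt(message, key) {
    var numbers = [];
    for (var pos = 0; pos < message.length; ++pos) {
        // plaintext number plus key number; the key repeats cyclically
        numbers.push(toNumber(message[pos]) + toNumber(key[pos % key.length]));
    }
    return numbers;
}

function nihilistDecrypt(numbers, key) {
    var message = '';
    for (var pos = 0; pos < numbers.length; ++pos) {
        message += toLetter(numbers[pos] - toNumber(key[pos % key.length]));
    }
    return message;
}

var ciphertext = nihilistEncrypt('actaestfabula', 'vivere');
```

The first and last plaintext letters are both a, and both are added to the key letter v, so both are encrypted as 11 + 31 = 42.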
Then, for each table obtained, one should check the numbers in all its columns. The numbers in
each column should have tens digits which differ from each other by no more than 5, and ones digits
which differ from each other by no more than 5. This follows from the fact that in each column all
the characters have been encoded using the same letter of the secret key, so each plaintext number
in the column has been added to the same secret key number. The value of the secret key number
does not affect the differences between the plaintext numbers which were encoded using this number.
When adding any numbers from the Polybius square to another number which also belongs to
the Polybius square, one always receives numbers whose tens digits and ones digits
don't differ from each other by more than 5 (so the difference between the biggest and the smallest
tens digits in the column can't be bigger than 5; the same applies to the ones digits). In
the next steps, one should use only the tables (only such potential lengths of the secret key) which
satisfy this condition.
The second step is to determine the possible key numbers which could have been used for encryption.
For each column, one should find all possible numbers which, when subtracted from all ciphertext
numbers in the column, result in numbers with values from 11 to 55. Every other number, which doesn't
satisfy the condition, should be discarded. In practice, it is possible to eliminate a lot of potential
secret key numbers in that way.
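This elimination step can be sketched as below. The function names are assumptions, and the sketch applies a slightly stricter condition than the range 11 to 55 alone: both digits of every number must lie between 1 and 5, since only such numbers occur in a 5 by 5 Polybius square.

```javascript
// True for numbers that can come from a 5 x 5 Polybius square (both digits in 1..5).
function isSquareNumber(n) {
    var tens = Math.floor(n / 10);
    var ones = n % 10;
    return tens >= 1 && tens <= 5 && ones >= 1 && ones <= 5;
}

// Key numbers that turn every ciphertext number of one column into a square number.
function candidateKeyNumbers(column) {
    var candidates = [];
    for (var key = 11; key <= 55; ++key) {
        if (!isSquareNumber(key)) {
            continue;
        }
        var fits = column.every(function (c) {
            return isSquareNumber(c - key);
        });
        if (fits) {
            candidates.push(key);
        }
    }
    return candidates;
}

// Numbers 42, 76 and 42 were all produced with the same key letter in the example above.
var candidates = candidateKeyNumbers([42, 76, 42]);
```

For this column only the key numbers 21 and 31 survive, and 31 (the letter v) is the one actually used in the example.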
Further analysis of the ciphertext can rely on changing numbers into letters using a trial and error
method, looking for solutions which disclose fragments of the original plaintext. For each plausible
solution, one should create all possible encrypting keys and subtract them from the ciphertext,
receiving potential plaintext letters.
VIC Cipher
POLYALPHABETIC SUBSTITUTION CIPHER
Used by Soviet spies all over the world in the middle of the twentieth century. Its name is based
on the nickname VICTOR of a Soviet agent spying in the USA under the name Reino Häyhänen.
In 1957 he surrendered to American intelligence and disclosed details of the cipher.
Usage
The VIC cipher is regarded as the most complex modification of the Nihilist cipher family. It is
considered to be one of the strongest ciphers which can be used manually, without computers.
Until it was disclosed as a result of betrayal, American counterintelligence had not managed
to break the cipher.
Algorithm
The VIC cipher uses a table which allows changing letters of plaintext into numbers. It is called
a straddling checkerboard.
It differs from the tables used in other substitution ciphers because it produces shorter sequences
of numbers (which are much more convenient to send to the second party).
  0 1 2 3 4 5 6 7 8 9
  E T   A O N   R I S
2 B C D F G H J K L M
6 P Q / U V W X Y Z .
The top row is populated with the ten digits from 0 to 9. The second row is typically filled with
the most frequent letters in any order. In English the mnemonic ESTONIA-R can be used to remember
the most frequent letters. Free cells should be left under two of the digits and in the leftmost column.
Each of the two lower rows is labelled with one of the two remaining digits, which aren't used in
the second row. Then, the two rows should be filled with the remaining letters in alphabetical order.
Because two cells remain empty, two additional special characters may be entered into the table.
They can be used for special purposes or shortcuts agreed previously between the two parties.
During encryption using VIC one should replace the letters of the message by numbers created
from the numbers of rows and columns. The most popular letters should be replaced by only one
digit, the digit of the column (which results in a shorter ciphertext).
For example, one can encrypt the name of the famous Scottish queen using the table presented
above:
M A R Y Q U E E N O F S C O T S
29 3 7 67 61 63 0 0 5 4 23 9 21 4 1 9
Note that many of the numbers in the resulting sequence have only one digit.
The next step is to add a secret sequence of digits to the digits of the created sequence. One
should add, one by one, the digits of the message to the digits of the secret sequence.
After the last digit of the secret sequence, the algorithm goes back to the first digit of the sequence
and continues its work. The addition is done modulo 10, so if the result is 10 or more then
the tens digit should be discarded.
Continuing the example, one could add the received numbers to the secret sequence of four
digits, the year of Mary's birth (1542):
2 9 3 7 6 7 6 1 6 3 0 0 5 4 2 3 9 2 1 4 1 9
+ 1 5 4 2 1 5 4 2 1 5 4 2 1 5 4 2 1 5 4 2 1 5
= 3 4 7 9 7 2 0 3 7 8 4 2 6 9 6 5 0 7 5 6 2 4
The resulting digits can be used as the ciphertext and sent to the second party. Sometimes, it is
a good idea to change the digits back into letters, using the same table as during encryption.
Changing numbers into letters is straightforward and intuitive. After finding one of the two digits
which are assigned to the two lower rows, one should read it together with the next digit as
a two-digit number.
The sequence of digits received previously can be changed into a sequence of letters as below:
3 4 7 9 7 20 3 7 8 4 26 9 65 0 7 5 62 4
A O R S R B A R I O J S W E R N / O
Decrypting can be performed using the same straddling checkerboard, the same secret number
and the steps performed in reverse order. The secret number's digits should be subtracted from
the ciphertext's digits. If any of the results is smaller than 0, then one should add 10 to
that result.
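The checkerboard conversion and the digit-by-digit addition can be sketched in JavaScript. The function and variable names are assumptions; the board is the one shown above, with empty cells under the digits 2 and 6 of the second row, and messages are assumed to contain only characters present in the board.

```javascript
// Second row of the board: the cells under digits 2 and 6 are empty (spaces here).
var topRow = 'ET AON RIS';
// The two lower rows, labelled with the digits 2 and 6.
var lowerRows = { '2': 'BCDFGHJKLM', '6': 'PQ/UVWXYZ.' };

function toDigits(message) {
    var digits = '';
    for (var pos = 0; pos < message.length; ++pos) {
        var ch = message[pos];
        if (ch === ' ') {
            continue; // spaces are not part of the board
        }
        var col = topRow.indexOf(ch);
        if (col !== -1) {
            digits += col; // frequent letter: one digit
            continue;
        }
        for (var label in lowerRows) {
            col = lowerRows[label].indexOf(ch);
            if (col !== -1) {
                digits += label + col; // other letter: row digit, then column digit
            }
        }
    }
    return digits;
}

function toLetters(digits) {
    var letters = '';
    var pos = 0;
    while (pos < digits.length) {
        var row = lowerRows[digits[pos]];
        if (row) {
            // a row label: read it together with the following digit
            letters += row[Number(digits[pos + 1])];
            pos += 2;
        } else {
            letters += topRow[Number(digits[pos])];
            pos += 1;
        }
    }
    return letters;
}

// sign = 1 adds the secret digits (encryption), sign = -1 subtracts them (decryption).
function applyKey(digits, key, sign) {
    var output = '';
    for (var pos = 0; pos < digits.length; ++pos) {
        var k = Number(key[pos % key.length]);
        output += (Number(digits[pos]) + sign * k + 10) % 10;
    }
    return output;
}

var plainDigits = toDigits('MARYQUEENOFSCOTS');
var cipherDigits = applyKey(plainDigits, '1542', 1);
var cipherLetters = toLetters(cipherDigits);
```

This sketch does not handle a ciphertext that ends with a lone row label (2 or 6); in practice the parties would need to agree on how to pad such a sequence.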
Security of VIC
The VIC cipher is well designed and provides quite good security. It makes ciphertext analysis
very time-consuming by breaking the original frequency distribution.
There are many modifications of the VIC cipher. Changes can be introduced in the straddling
checkerboard by changing the order of letters. Some cells may be left empty, which makes
cryptanalysis more difficult.
The resulting ciphertext characters can be modified at the end of encryption using one of
the transposition cipher algorithms.
Transposition Ciphers
To encrypt data, transposition ciphers rearrange the original message letters. The same letters
will appear in both plaintext and ciphertext, but the idea is that the permutation used to protect
data should be difficult to break without the knowledge of the secret key.
Usage
Transposition ciphers have been used since ancient times. They are perhaps as old as the oldest
substitution ciphers and steganography methods. At present, in modern ciphers, various
transpositions are used together with substitutions to make cryptanalysis more difficult.
Description
There is no common algorithm that would be used in all transposition ciphers. The main
idea is to change the letter order in such a way that attackers are prevented from reading the
message while, at the same time, the receiver can decrypt it easily and effectively.
Both sender and receiver should share a common secret, usually a keyword, that determines the
exact transpositions that should be applied to the text.
Transposition ciphers usually require more memory and more complex operations than
substitution ciphers. That is why modern ciphers implemented programmatically and electronically
are usually based on substitutions, and less often on transpositions.
Rail Fence Cipher
TRANSPOSITION CIPHER
The Rail Fence Cipher is a transposition cipher, which rearranges the plaintext letters by drawing
them in a way that they form a shape of the rails of an imaginary fence.
Usage
The Rail Fence Cipher belongs to a family of transposition methods known since ancient times.
The Greeks created a special tool, called the scytale, to make transposition encryption and
decryption easier. Currently, the Rail Fence Cipher is usually used with a piece of paper. The letters
are arranged in a way which is similar to the shape of the top edge of a rail fence.
Algorithm
To encrypt the message, the letters should be written in a zigzag pattern, going downwards and
upwards between the levels of the top and bottom imaginary rails. The shape that is formed by
the letters is similar to the shape of the top edge of the rail fence.
Next, all the letters should be read off and concatenated, to produce one line of ciphertext. The
letters should be read in rows, usually from the top row down to the bottom one.
The secret key is the number of levels in the rail. It is also a number of rows of letters that are
created during encryption. This number cannot be very big, so the number of possible keys is
quite limited.
For example, let us encrypt the name of one of the countries in Europe: The United Kingdom.
Let's assume that the secret key is 3, so three levels of rails will be produced.
First, we will remove the spaces and write the remaining letters in capitals:
THEUNITEDKINGDOM
Next, the plaintext letters will form the shape of the fence:
T . . . N . . . D . . . G . . .
. H . U . I . E . K . N . D . M
. . E . . . T . . . I . . . O .
Then, the letters should be read row by row, starting from the top one. Finally, they ought to be
concatenated to form one ciphertext message. In our example, the calculated ciphertext
sequence would be:
TNDGHUIEKNDMETIO
To decrypt the message, the receiver should know the secret key, that is the number of levels of
the rail. Based on the number of rows and the ciphertext length, it is possible to reconstruct the
grid and fill it with letters in the right order (that is, in the same way as used by the sender during
encryption).
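The reconstruction described above can be sketched in Python. This is only a minimal illustration under stated assumptions: the function names (rail_pattern, rail_fence_encrypt, rail_fence_decrypt) are made up for this example, the message is assumed to be already stripped of spaces, and the key is assumed to be at least 1.

```python
def rail_pattern(length, rails):
    """Return, for every letter position, the rail (row) it lands on."""
    if rails < 2:
        return [0] * length
    pattern = []
    rail, direction = 0, 1
    for _ in range(length):
        pattern.append(rail)
        # Turn back at the top and bottom rails.
        if rail + direction < 0 or rail + direction >= rails:
            direction = -direction
        rail += direction
    return pattern


def rail_fence_encrypt(message, rails):
    pattern = rail_pattern(len(message), rails)
    # Read the rails off from the top one down to the bottom one.
    return "".join(
        ch for rail in range(rails)
        for ch, p in zip(message, pattern) if p == rail
    )


def rail_fence_decrypt(ciphertext, rails):
    pattern = rail_pattern(len(ciphertext), rails)
    # Positions in the order in which encryption read them off
    # (sorted() is stable, so positions on one rail stay left to right).
    order = sorted(range(len(ciphertext)), key=lambda i: pattern[i])
    plaintext = [""] * len(ciphertext)
    for ch, i in zip(ciphertext, order):
        plaintext[i] = ch
    return "".join(plaintext)


print(rail_fence_encrypt("THEUNITEDKINGDOM", 3))  # TNDGHUIEKNDMETIO
print(rail_fence_decrypt("TNDGHUIEKNDMETIO", 3))  # THEUNITEDKINGDOM
```

The receiver never needs the grid itself: knowing the key and the ciphertext length is enough to recompute which rail each position belongs to.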
Implementation
The encryption function in the Rail Fence Cipher performs two major steps. First, the letters are
entered into a table, that represents the imaginary fence. Then, the letters should be read off in
rows.
Below, there is a JavaScript function which performs encryption of the input message and returns
the result. Note, that rowNumber input parameter is determined by the cipher's secret key:
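function encrypt(messageInput, rowNumber) {
    var messageOutput = '';
    if (rowNumber < 2) return messageInput; // a single rail leaves the text unchanged
    // One array of letters for each rail of the imaginary fence.
    var fenceTable = [];
    for (var i = 0; i < rowNumber; ++i) {
        fenceTable.push([]);
    }
    // Write the letters in a zigzag, turning back at the top and bottom rails.
    var r = 0;
    var direction = 1;
    for (var j = 0; j < messageInput.length; ++j) {
        fenceTable[r].push(messageInput[j]);
        if (r + direction < 0 || r + direction >= rowNumber) {
            direction = -direction;
        }
        r = r + direction;
    }
    // Read the rails off row by row, from the top rail to the bottom one.
    var row = 0;
    while (row < rowNumber) {
        for (var pos = 0; pos < fenceTable[row].length; ++pos) {
            messageOutput = messageOutput.concat(fenceTable[row][pos]);
        }
        ++row;
    }
    return messageOutput;
}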
You may try out the Rail Fence Cipher online on Crypto-Online website.
Route Cipher
TRANSPOSITION CIPHER
The Route Cipher is a transposition cipher. It rearranges the plaintext letters based on a shape of
an imaginary path drawn on a grid.
Usage
The Route Cipher is a simple transposition cipher that can be performed manually, without the
need of using additional equipment. It was quite popular throughout centuries, and used to protect
information when more sophisticated ways were not available.
Currently, the Route Cipher is usually used with a piece of paper. The letters fill the grid which
has dimensions defined by the secret key.
Algorithm
To encrypt the message, the first step is to create a grid with one dimension determined by the
secret key, and the other dimension dependent on the data size. The parties must also agree
which dimension (width or height) is described by the secret key, and in what way the grid will be
filled with plaintext letters (row by row, or column by column). If some cells in the grid remain
empty, one of two possible approaches should be taken:
1. The cells may be left empty, and just ignored during all further operations.
2. The sender may enter there some rare letters, and treat them as a part of the plaintext.
After decryption, the receiver should be able to determine, that the letters have no
sense, and that they should be ignored.
The sender and receiver must agree on one more thing in order to encrypt the message: the
order in which the letters in the grid are appended to the ciphertext. The order should not be
too simple, to prevent parts of the plaintext from appearing in the produced ciphertext. On the other
hand, it should not be too complicated, so that the communicating parties do not have to remember
overly difficult configurations.
Users often choose rules that form some kind of path on the grid, which is followed
during encryption, for example 'spiral clockwise inwards, starting from the top left corner'. The
name of the cipher is derived from these paths, used for every data encryption and decryption.
The way in which the path is defined is also a part of the secret key of this cipher.
As an example, let's encrypt a name of a city in Great Britain, Brighton and Hove. The secret
key will be 3, and it will determine the width of the grid. We will fill the grid row by row, from left to
right. Finally, we will read the grid clockwise, going inwards, and starting from the top right corner.
As usual, the encryption starts by removing the non-letter characters, and capitalizing all the
letters:
BRIGHTONANDHOVE
The letters are then entered into the grid, which is 3-column wide:
BRI
GHT
ONA
NDH
OVE
Luckily, in our case, there is no need to add any additional characters at the bottom of the grid.
The letters are then read, and appended to the ciphertext. The reading starts from the top right,
and spiral clockwise inwards. The produced encrypted text will be:
ITAHEVONOGBRHND
As we can see, the original text is hidden, and the ciphertext does not reveal any parts of the plaintext.
Knowing the length of the ciphertext and the secret key, the receiver is able to recreate a grid of
the same size as the one used for encryption. Then, knowing the path directions, the receiver
can simply enter the letters into the correct cells. Finally, the plaintext is revealed by reading the grid
in the same way as the sender used to enter the letters into the table.
Implementation
The main functionality of the Route Cipher is reading and following the shape which forms the
path. The implementation versions will differ, depending on the types of used paths.
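As an illustration, here is a minimal Python sketch for one particular path, the one used in the example above: clockwise, spiralling inwards, starting from the top right corner, with the key giving the grid width and the grid filled row by row. The function names and the choice of 'X' as the filler letter for empty cells are assumptions made for this sketch; other paths would need their own traversal function.

```python
def spiral_from_top_right(height, width):
    """Return grid cells (row, column) in clockwise spiral order,
    starting from the top right corner and going inwards."""
    top, bottom, left, right = 0, height - 1, 0, width - 1
    cells = []
    while top <= bottom and left <= right:
        for r in range(top, bottom + 1):          # down the right column
            cells.append((r, right))
        right -= 1
        if left <= right:
            for c in range(right, left - 1, -1):  # leftwards along the bottom row
                cells.append((bottom, c))
            bottom -= 1
        if top <= bottom and left <= right:
            for r in range(bottom, top - 1, -1):  # up the left column
                cells.append((r, left))
            left += 1
        if top <= bottom and left <= right:
            for c in range(left, right + 1):      # rightwards along the top row
                cells.append((top, c))
            top += 1
    return cells


def route_encrypt(message, width, filler="X"):
    letters = [ch.upper() for ch in message if ch.isalpha()]
    while len(letters) % width != 0:   # fill the empty cells of the last row
        letters.append(filler)
    height = len(letters) // width
    path = spiral_from_top_right(height, width)
    return "".join(letters[r * width + c] for r, c in path)


def route_decrypt(ciphertext, width):
    height = len(ciphertext) // width
    grid = [""] * len(ciphertext)
    # Put every ciphertext letter back into the cell it was read from.
    for ch, (r, c) in zip(ciphertext, spiral_from_top_right(height, width)):
        grid[r * width + c] = ch
    return "".join(grid)   # read row by row, left to right


print(route_encrypt("Brighton and Hove", 3))   # ITAHEVONOGBRHND
print(route_decrypt("ITAHEVONOGBRHND", 3))     # BRIGHTONANDHOVE
```

Encryption and decryption share the same traversal function, so supporting another route only requires replacing spiral_from_top_right.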
You may try out the Route Cipher online on Crypto-Online website.
Columnar Transposition
TRANSPOSITION CIPHER
The Columnar Transposition rearranges the plaintext letters, based on a matrix filled with letters
in the order determined by the secret keyword.
Usage
The Columnar Transposition is a simple transposition cipher that can be performed manually,
without the need of using additional equipment. It was very popular throughout centuries, and it
was used in various situations by diplomats, soldiers, and spies.
The encryption and decryption can be performed by hand, using a piece of paper and a simple
matrix, into which the user enters the letters of the message.
Algorithm
The name of the cipher comes from the operations on a matrix that are performed during both
encryption and decryption. The number of columns of the matrix is determined by the secret key.
The secret key is usually a word (or just a sequence of letters). It has to be converted into a
sequence of numbers. The numbers are defined by an alphabetical order of the letters in the
keyword. The letter which is first in the alphabet will be the number 1, the second letter in the
alphabetical order will be 2, and so on.
If there are multiple identical letters in the keyword, each next occurrence of the same letter should
be converted into a number that is equal to the number for the previous occurrence increased by
one.
If, after entering the whole message, there are some empty cells in the bottom row of the matrix,
one of two approaches can be taken:
1. The cells may be left empty, and just ignored during all further operations (this is so
called an irregular columnar transposition cipher).
2. The sender may enter there some rare letters, and treat them as a part of the plaintext.
After decryption, the receiver should be able to determine, that the letters have no
sense, and that they should be ignored (in this case, the cipher is called a regular
columnar transposition cipher).
Next, the letters should be read off in a specific way and written down to form the ciphertext. The
order of reading the letters is determined by the sequence of numbers produced from the
keyword. They should be read column by column, from top to bottom, starting from the column
whose position is the same as the position of the number 1 in the key sequence. The next column
to read off is determined by the number 2 in the key sequence, and so on, until all the columns
are read off. To make this step easier, it is recommended to write the sequence numbers above
the corresponding columns.
As an example, let's encrypt the message A Midsummer Night's Dream, the title of a comedy written
by Shakespeare. We will use a seven-letter secret keyword whose derived number sequence
is 6723154, so the matrix created for the encryption will have seven columns.
After removing all non-letter characters, and changing the letters to upper case, the message
should be entered into the table:
6 7 2 3 1 5 4
A M I D S U M
M E R N I G H
T S D R E A M
Above the message, there are numbers derived from the keyword. These numbers determine the
order, in which the columns should be read (top to bottom), and appended to the produced
ciphertext. In our example, the first column will be SIE, the second will be IRD, and so on. The
produced ciphertext is:
SIE IRD DNR MHM UGA AMT MES
Finally, after removing the spaces, which were added to indicate separate columns, we receive
the encrypted message:
SIEIRDDNRMHMUGAAMTMES
To decrypt a received ciphertext, the receiver has to perform the following steps:
1. Knowing the secret keyword, and the length of the received message, the table of the
same size, as the one used for encryption, should be created.
2. The ciphertext should be entered into columns, from the leftmost column to the
rightmost column, from top to bottom.
3. The columns should be rearranged, and put into the order defined by the keyword.
4. The decrypted message should be read out, row by row, starting from the top row, and
from left to right.
To break the ciphertext, an attacker should try to create the tables of different sizes, enter the
encrypted message down into the columns, and for each table look for anagrams appearing in
rows.
Implementation
Encryption
Below, there are encryption functions written in Python. The input parameters are the message
and the secret keyword. The main function, encrypt, uses two helper functions to create the
matrix and the keyword sequence of numbers.
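def encrypt(message, keyword):
    matrix = createEncMatrix(len(keyword), message)
    keywordSequence = getKeywordSequence(keyword)
    ciphertext = ""
    # Read the columns in the order given by the keyword sequence: 1, 2, 3, ...
    for num in range(len(keywordSequence)):
        pos = keywordSequence.index(num+1)
        for row in range(len(matrix)):
            # The last row may be shorter than the others.
            if len(matrix[row]) > pos:
                ciphertext += matrix[row][pos]
    return ciphertext

def createEncMatrix(width, message):
    # Enter the message row by row into a matrix with the given number of columns.
    matrix = []
    for start in range(0, len(message), width):
        matrix.append(list(message[start:start + width]))
    return matrix

def getKeywordSequence(keyword):
    # Number the keyword letters in alphabetical order; repeated letters are
    # numbered from left to right.
    sequence = []
    for pos, ch in enumerate(keyword):
        previousLetters = keyword[:pos]
        newNumber = 1
        for previousPos, previousCh in enumerate(previousLetters):
            if previousCh > ch:
                sequence[previousPos] += 1
            else:
                newNumber += 1
        sequence.append(newNumber)
    return sequence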
Decryption
The Python functions below decrypt a Columnar Transposition ciphertext. The input
parameters are the message and the secret keyword. The main function, decrypt, uses helper
functions to create the matrix and the keyword sequence of numbers.
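def decrypt(message, keyword):
    matrix = createDecrMatrix(getKeywordSequence(keyword), message)
    plaintext = ""
    # Read the filled matrix row by row, from left to right.
    for r in range(len(matrix)):
        for c in range(len(matrix[r])):
            plaintext += matrix[r][c]
    return plaintext

def createDecrMatrix(keywordSequence, message):
    width = len(keywordSequence)
    # Build an empty matrix of the same shape as the one used for encryption;
    # the last row is shorter when the message does not fill the whole grid.
    matrix = []
    for start in range(0, len(message), width):
        matrix.append([''] * min(width, len(message) - start))
    # Enter the ciphertext column by column, in the order given by the keyword.
    pos = 0
    for num in range(len(keywordSequence)):
        column = keywordSequence.index(num+1)
        r = 0
        while (r < len(matrix)) and (len(matrix[r]) > column):
            matrix[r][column] = message[pos]
            r += 1
            pos += 1
    return matrix

def getKeywordSequence(keyword):
    sequence = []
    for pos, ch in enumerate(keyword):
        previousLetters = keyword[:pos]
        newNumber = 1
        for previousPos, previousCh in enumerate(previousLetters):
            if previousCh > ch:
                sequence[previousPos] += 1
            else:
                newNumber += 1
        sequence.append(newNumber)
    return sequence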
Double Columnar Transposition
TRANSPOSITION CIPHER
The Double Columnar Transposition rearranges the plaintext letters, based on matrices filled with
letters in the order determined by the secret keyword.
Usage
The Double Columnar Transposition was introduced as a modification of the Columnar
Transposition. It is quite similar to its predecessor, and it has been used in similar situations.
The encryption and decryption can be performed by hand, using a piece of paper and a simple
matrix, in a similar way as it is done for the Columnar Transposition.
Algorithm
The Double Columnar Transposition was introduced to make cryptanalysis of messages
encrypted by the Columnar Transposition more difficult. It was supposed to prevent anagrams of
the plaintext words appearing in the analysed ciphertext.
The main idea behind the Double Columnar Transposition is to encrypt the message twice, by
using the original Columnar Transposition, with identical or different secret keys. The output from
the first encryption would be the input to the second encryption.
The matrices used in both steps may have different sizes, if the two keywords of different lengths
have been used.
All the operations performed during encryption and decryption, and all the parameters that have to
be defined, remain the same as in the Columnar Transposition.
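The chaining of two passes can be sketched in Python as follows. This is a compact illustration rather than a reference implementation: it uses the irregular variant (empty cells are simply skipped), all function names are invented for this sketch, and the keywords SWINDON and LONDON are arbitrary examples (SWINDON happens to yield the sequence 6723154).

```python
def keyword_order(keyword):
    """Column indices in reading order: alphabetical by keyword letter,
    with repeated letters taken from left to right."""
    return sorted(range(len(keyword)), key=lambda i: (keyword[i], i))


def columnar_encrypt(message, keyword):
    width = len(keyword)
    # message[col::width] is exactly one column of the row-by-row matrix.
    return "".join(message[col::width] for col in keyword_order(keyword))


def columnar_decrypt(ciphertext, keyword):
    width = len(keyword)
    full_rows, extra = divmod(len(ciphertext), width)
    plaintext = [""] * len(ciphertext)
    pos = 0
    for col in keyword_order(keyword):
        # The first `extra` columns hold one letter more than the rest.
        length = full_rows + (1 if col < extra else 0)
        plaintext[col::width] = ciphertext[pos:pos + length]
        pos += length
    return "".join(plaintext)


def double_encrypt(message, first_key, second_key):
    # The output of the first pass is the input of the second pass.
    return columnar_encrypt(columnar_encrypt(message, first_key), second_key)


def double_decrypt(ciphertext, first_key, second_key):
    # Undo the passes in reverse order.
    return columnar_decrypt(columnar_decrypt(ciphertext, second_key), first_key)


message = "AMIDSUMMERNIGHTSDREAM"
secret = double_encrypt(message, "SWINDON", "LONDON")
print(secret)
print(double_decrypt(secret, "SWINDON", "LONDON"))  # AMIDSUMMERNIGHTSDREAM
```

Because the two keywords have different lengths here, the two matrices have different widths, which is what breaks up the column patterns left by a single pass.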
An attacker has to try many different combinations of keywords in order to find patterns in the
ciphertext. The cipher is more likely to be broken if multiple messages of the same length and
encrypted with the same keys were intercepted. They can be anagrammed simultaneously, which
makes the cryptanalysis much more effective.
It may be estimated that having a few messages of the same length, encrypted with identical
keys, would allow the attacker to determine both the plaintexts and the secret keys. This technique
was widely used by the French for breaking German messages at the beginning of World War I,
until the Germans improved their system.
The Double Columnar Transposition remains one of the strongest ciphers that can be used
manually, without the need for electronic equipment. Another cipher that is considered to
be equally strong is the VIC cipher.
Myszkowski Transposition
TRANSPOSITION CIPHER
The Myszkowski Transposition rearranges the plaintext letters, based on a matrix filled with letters
in the order determined by the secret keyword.
Usage
The Myszkowski Transposition is a very similar cipher to the Columnar Transposition. It was
proposed by Émile Victor Théodore Myszkowski in 1902.
The encryption and decryption can be performed by hand, using a piece of paper and a simple
matrix, into which the user enters the letters of the message.
Algorithm
Similarly to the Columnar Transposition, both encryption and decryption are performed by using
a matrix. The number of columns of the matrix is determined by the secret key.
The Myszkowski Transposition requires the secret key that is usually a word (or just a sequence
of letters). It should be converted into a sequence of numbers. The numbers are defined by an
alphabetical order of the letters in the keyword. The letter which is first in the alphabet will be the
number 1, the second letter in the alphabetical order will be 2, and so on.
Contrary to the Columnar Transposition, the keyword has to contain some recurrent letters.
Identical letters should have the same numbers assigned (which is again, different from the
Columnar Transposition).
If, after entering the whole message, there are some empty cells in the bottom row of the matrix,
one of two approaches can be taken:
1. The cells may be left empty, and just ignored during all further operations.
2. The sender may enter there some rare letters, and treat them as a part of the plaintext.
After decryption, the receiver should be able to determine, that the letters have no
sense, and that they should be ignored.
Next, the letters should be read off in a specific way and written down to form the ciphertext. The
order of reading the letters is determined by the sequence of numbers produced from the
keyword. They should be read column by column, from top to bottom, starting from the column
whose position is the same as the position of the number 1 in the key sequence. The next column
to read off is determined by the number 2 in the key sequence, and so on, until all the columns
are read off.
If several columns have the same number assigned to them (and as we know, this is a
requirement for this cipher), the letters from those columns should be read off together, row by
row. This procedure is different from the one performed in the Columnar Transposition,
so the two ciphers produce different ciphertexts, even if all other
parameters are identical.
To make this step easier, and to allow the user to quickly locate each column, it is recommended
to write the sequence numbers above the corresponding columns.
For example, let's encrypt the message A Midsummer Night's Dream, the title of a comedy written by
Shakespeare (this is the same message we encrypted with the Columnar Transposition). We will
use a seven-letter secret keyword whose derived number sequence
is 5623143, so the matrix created for the encryption will have seven columns.
After removing all non-letter characters, and changing the letters to upper case, the message
should be entered into the table:
5 6 2 3 1 4 3
A M I D S U M
M E R N I G H
T S D R E A M
There are numbers derived from the keyword written above the message. These numbers
determine the order, in which the columns should be read (top to bottom), and appended to the
produced ciphertext. In our example, the first column will be SIE, and the second will be IRD.
Then, there are two columns which have threes assigned: DNR and MHM. The letters from them
should be read off row by row, starting from the top. All the remaining columns should be dealt
with in a usual way. The produced ciphertext is:
SIE IRD DMNHRM UGA AMT MES
Finally, after removing the spaces, which were added to indicate separate columns, we receive
the encrypted message:
SIEIRDDMNHRMUGAAMTMES
To decrypt a received ciphertext, the receiver has to perform the following steps:
1. Knowing the secret keyword, and the length of the received message, the table of the
same size, as the one used for encryption, should be created.
2. The ciphertext should be entered into columns, from the leftmost column to the
rightmost column, from top to bottom. The columns with the same numbers should be
filled together, row by row.
3. The columns should be rearranged, and put into the order defined by the keyword.
4. The decrypted message should be read out, row by row, starting from the top row, and
from left to right.
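The encryption and decryption steps above can be sketched in Python. The sketch makes a few assumptions: empty cells in the last row are skipped, the function names are invented for illustration, and the keyword SWINDON is only an example that produces the sequence 5623143.

```python
def myszkowski_order(length, keyword):
    """Return the plaintext positions in the order in which they are read off."""
    width = len(keyword)
    height = -(-length // width)   # number of rows, rounded up
    order = []
    # Visit the distinct keyword letters alphabetically: numbers 1, 2, 3, ...
    for key_letter in sorted(set(keyword)):
        columns = [c for c, ch in enumerate(keyword) if ch == key_letter]
        # A single column is read top to bottom; columns sharing a number
        # are read together, row by row.
        for row in range(height):
            for col in columns:
                index = row * width + col
                if index < length:
                    order.append(index)
    return order


def myszkowski_encrypt(message, keyword):
    letters = [ch.upper() for ch in message if ch.isalpha()]
    return "".join(letters[i] for i in myszkowski_order(len(letters), keyword))


def myszkowski_decrypt(ciphertext, keyword):
    plaintext = [""] * len(ciphertext)
    for ch, i in zip(ciphertext, myszkowski_order(len(ciphertext), keyword)):
        plaintext[i] = ch
    return "".join(plaintext)


print(myszkowski_encrypt("A Midsummer Night's Dream", "SWINDON"))
# SIEIRDDMNHRMUGAAMTMES
print(myszkowski_decrypt("SIEIRDDMNHRMUGAAMTMES", "SWINDON"))
# AMIDSUMMERNIGHTSDREAM
```

Computing the reading order once and reusing it for both directions avoids building and rearranging the matrix explicitly.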
To break the ciphertext, an attacker should act in a similar way. That is, he should create the
tables of different sizes, enter the encrypted message down into the columns, and for each table
look for anagrams appearing in rows.
Cryptographic Rotor Machines
Electric rotor machines were electro-mechanical devices that made it possible to use encryption
algorithms much more complex than the ciphers that were applied manually. They were developed in the
middle of the second decade of the 20th century and remained among the most important
cryptographic solutions in the world for the following decades.
Usage
The concept of using rotor machines in cryptography occurred to a number of inventors
independently. At present, two Dutch naval officers, Theo A. van Hengel (1875 – 1939) and R. P.
C. Spengler (1875 – 1955), are credited with inventing the first rotor cipher machine in 1915. Four
more people created (more or less independently) their own cryptographic rotor
machines shortly afterwards: Edward Hebern, Arvid Damm, Hugo Koch and Arthur Scherbius.
Electro-mechanical machines fitted with movable rotors were able to produce long, seemingly random
keystreams, which made it possible to encrypt messages with complicated polyalphabetic substitution
ciphers.
Description
The main idea that lies behind rotor machines is relatively simple. One can imagine a simple
device, similar to a typewriter, with a number of keys used to input text. The number of keys may
differ, but usually there are 26 to 32 characters. Each key is permanently wired to one output
character.
For example, if someone pressed K on the keyboard, the machine would always produce C.
As a result, the machine would encrypt the messages by using a simple substitution cipher.
Adding a rotor
Starting from a simple substitution machine, one can imagine adding an internal rotor with its
own internal wiring. The rotor is turned by a gear after each keystroke. As a result, when
the same letter is pressed twice, it is encoded differently, because the wiring path has changed.
For example, if someone pressed KK on the keyboard, the machine would produce CB (because
the wiring changed after the first keystroke, due to the rotor movement).
The internal wiring of the rotor should be kept secret, however we may expect that over time the
enemy will discover its design. It will make it easier for them to break the cipher but it won't
compromise the security altogether.
To decode a ciphertext the receiver would need a machine with the same rotor. Adding the rotor
caused the encryption to become a stronger polyalphabetic substitution cipher.
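A single stepping rotor can be modelled in a few lines of Python. This is a simplified sketch and does not reproduce any historical machine: the wiring string is just an arbitrary permutation of the alphabet chosen for illustration, the rotor is assumed to start at position 0 and to advance by one step after every keystroke, and the function names are hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase
# Hypothetical rotor wiring: any permutation of the 26 letters would do.
WIRING = "EKMFLGDQVZNTOWYHXUSPAIBRCJ"


def rotor_encrypt(text, wiring=WIRING):
    out = []
    for position, ch in enumerate(text):   # the rotor has turned `position` steps
        entry = (ALPHABET.index(ch) + position) % 26   # contact hit on the rotor
        exit_contact = ALPHABET.index(wiring[entry])   # follow the internal wire
        out.append(ALPHABET[(exit_contact - position) % 26])
    return "".join(out)


def rotor_decrypt(text, wiring=WIRING):
    out = []
    for position, ch in enumerate(text):
        exit_contact = (ALPHABET.index(ch) + position) % 26
        entry = wiring.index(ALPHABET[exit_contact])   # follow the wire backwards
        out.append(ALPHABET[(entry - position) % 26])
    return "".join(out)


print(rotor_encrypt("KK"))                      # NS: the same letter, two outputs
print(rotor_decrypt(rotor_encrypt("ATTACK")))   # ATTACK
```

With only one rotor, the substitution alphabet repeats every 26 keystrokes, which is why practical machines chained several rotors together.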
Make it difficult
To improve the security, one could add more rotors. The output of one rotor would be connected
to the input of the second rotor. Similarly, the second rotor output would be connected to the third
one, and so on. The strength of the encryption depends on several factors:
o the number of rotors inside the machine.
o the size of each rotor.
o the number of rotor types (with different internal wirings).
Each rotor would contain a different internal wiring. The substitution performed by each rotor
should be unknown to the enemy. To make cryptanalysis more difficult and to ensure that the
wiring paths of the individual rotors change with different frequencies, the discs should rotate at
different speeds.
Additionally, depending on the design of the machine, some extra features may be added
to ensure that the produced substitution is as random as possible (for example, an
additional fixed substitution that does not depend on the rotors).
Some rotor machines (most notably Enigma) were designed to be symmetrical. That means that
encrypting the same message twice (with the same settings), would produce the original
message.
Cryptographic Rotor Machines:
Hebern Cryptographic Rotor Machine
The Hebern rotor machine was one of the first cryptographic rotor machines that made it possible to encrypt
messages automatically and effectively, and was supposed to provide more complex
Usage
Edward Hugh Hebern created his first rotor machine in 1917 and patented it one year later. By
that time, he had already invented several electric machines that were supposed to be used for
message encryption and decryption. His cryptographic rotor machines never became
popular because of design flaws, which led the US Army to purchase no more than a few
copies.
Algorithm
Similar to other cryptographic rotor machines, the Hebern machine used a disk with electrical
wires to encode and decode characters. Each rotor contained 26 electrical contacts on either
side. While rotating, the contacts changed the connections to the wires on both sides of the disc.
The wires on both sides of the disc were connected to input and output characters, so the rotating
The rotor installed in the Hebern machine rotated a gear each time a key was pressed. The secret
key in this case might be presented as an internal wiring of the rotor. Because the rotor had
26 connections, the key settings would be reused after 26 characters. Attacking such a cipher
would be a relatively easy task, and the amount of work would be comparable to attacking
Over time, to make the key size longer, Hebern added additional rotors to the machine. All input
letters were passing through all rotors, which means that every letter was changed several times,
The first rotor moved after a keystroke, while each next rotor turned once after the previous one
Friedman. He proved that due to the fact that the rotors moved only when the previous disc had
rotated a full turn, the whole algorithm might be divided into a number of single substitution
ciphers, each one with 26-letter long texts. This means that the encryption was easy to break by
Date: 2020-03-09
Lorenz Cryptographic Rotor Machine
The Lorenz rotor machines were used by the Germans during World War II for strategic
communication between major cities in German–occupied Europe.
Three versions of the Lorenz machine were created during the 1940s: SZ40 (started being used
in 1941), SZ42A (1943) and SZ42B (1944). The letters SZ, which form the model names,
originated from the German word Schlüsselzusatz, which means cipher attachment. And indeed,
these machines were constructed as an attachment to a standard Lorenz teleprinter. Thus, the
cryptographic extension could be attached to a teleprinter and extend its functionality.
Usage
The Lorenz rotor machine was developed in 1940 by the German company Lorenz, which was
a major producer of telecommunication equipment in Germany at that time.
Algorithm
The Lorenz cryptographic machine was supposed to implement one-time pad (OTP) encryption. The idea
was to use an electro-mechanical machine to overcome the problem of distributing keystream
characters (a task that would be even more difficult during the war). The rotors would turn at
different speeds, thus generating a seemingly random sequence. The receiver would be able to
regenerate the sequence if they used the same rotors and the same initial parameters.
All characters in the Lorenz teleprinter were encoded by using 5-bit Baudot codes. Both input and
output letters were encoded on a paper tape. Each plaintext letter was XOR-ed with a secret-key
character, which was also encoded by using 5 bits. A pseudorandom keystream was generated
character-by-character by the internal rotor mechanism.
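The XOR step can be illustrated with a short Python sketch. The 5-bit values below are arbitrary numbers chosen for the example, not real teleprinter codes or a real Lorenz keystream, and xor_stream is a made-up helper name.

```python
def xor_stream(codes, keystream):
    """XOR each 5-bit character code with the matching keystream code."""
    return [code ^ key for code, key in zip(codes, keystream)]


plaintext = [0b10100, 0b00001, 0b11000]   # three arbitrary 5-bit character codes
keystream = [0b01101, 0b11011, 0b00110]   # made-up key characters

ciphertext = xor_stream(plaintext, keystream)
recovered = xor_stream(ciphertext, keystream)

print([format(c, "05b") for c in ciphertext])   # ['11001', '11010', '11110']
print(recovered == plaintext)                   # True
```

Because XOR is its own inverse, the receiver applies exactly the same operation with the same keystream to recover the plaintext.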
A diagram of the Lorenz machine
Pseudorandom key bits were generated by 10 rotors. Five rotors turned after every
keystroke, whereas the other five rotated only after some characters, depending on the
output of two additional discs, called the motor wheels.
The main rotors were connected in pairs. Each of the five plaintext bits (which encoded one
letter) passed first through an always-rotating wheel and then through a corresponding
sometimes-rotating wheel. The signal value could be changed by either of them, depending
on the rotor positions.
The two motor rotors were connected one after another. The movement of the second motor rotor
was triggered by the first one. The five sometimes-rotating wheels would move together if the
position of the second motor rotor triggered that.
Each wheel was fitted with a different number of cams, thus they all rotated with different speeds.
Also, the numbers were all co-prime with each other, to provide the longest possible time before
the pattern repeated.
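The effect of co-prime wheel sizes can be checked with a small calculation. For wheels that all step on every character, the combined pattern repeats after the least common multiple of the wheel sizes, which equals their product only when the sizes are pairwise co-prime. The wheel sizes below are small hypothetical values, not the sizes of the real Lorenz wheels, and combined_period is an illustrative name.

```python
from functools import reduce
from math import gcd


def combined_period(wheel_sizes):
    """Steps needed before all wheels are back at their starting positions
    at the same time (the least common multiple of the sizes)."""
    return reduce(lambda a, b: a * b // gcd(a, b), wheel_sizes, 1)


print(combined_period([3, 4, 5]))   # 60 = 3 * 4 * 5, the sizes are co-prime
print(combined_period([2, 4, 6]))   # 12, far less than 2 * 4 * 6 = 48
```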
The key sequence generated by the Lorenz rotor machine depended on its initial configuration:
o The patterns of cams on the wheels.
o The starting rotor positions.
The cam settings were changed daily from the second half of 1944 (and much less
frequently before). They were distributed in secret codebooks.
The initial wheel positions (12-letter indicator) were chosen by the operator before each
transmission and sent without encryption at the beginning of the message. Later, the procedure
changed and the operators sent 2-digit codes, which could be found in a codebook called the QEP
book. The codes corresponded to the initial wheel positions.
One of the typical dangers related to the usage of one-time encryption by radio operators was
sending the same message twice, encrypted by using the same secret key (the same initial
settings). This situation might occur when the receiver had some problems with recording
the message.
If the sender used exactly the same secret key to encrypt exactly the same message, intercepting
the communication would not provide any additional information to the eavesdropper. Unfortunately,
while sending the second message, the sender could make some small changes in the text, like adding
abbreviations or changing single words.
The Allies were lucky to intercept the message that allowed them to break the cipher in the middle
of 1941. It was broadcast twice, by using the same secret key. The second message had some
abbreviations made at the beginning of it. Also, it was long enough to allow the British to break
the code and to recover both plaintexts and the keystream characters.
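Why a reused keystream is so dangerous can be shown with a toy Python example. The numbers are arbitrary 5-bit values invented for the illustration, not the intercepted messages, and xor_codes is a hypothetical helper.

```python
def xor_codes(a, b):
    """XOR two sequences of 5-bit codes position by position."""
    return [x ^ y for x, y in zip(a, b)]


keystream = [13, 27, 6, 19]    # the same made-up keystream is used twice
first = [20, 1, 24, 7]         # first version of the message
second = [20, 1, 9, 30]        # retransmission with small changes

c1 = xor_codes(first, keystream)
c2 = xor_codes(second, keystream)

# The keystream cancels out: the attacker obtains the XOR of the two plaintexts
# without knowing the key.
print(xor_codes(c1, c2) == xor_codes(first, second))   # True

# Once one plaintext is guessed correctly, the keystream itself is exposed.
print(xor_codes(c1, first) == keystream)               # True
```

Positions where the two messages are identical give zero in the combined stream, so the differences introduced by the operator are exactly what the attacker can work on.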
After discovering the secret keystream, the Allies managed to determine the internal structure of
the machine, in spite of the fact that almost until the end of the war they hadn't seen any Lorenz
machine. The rotor mechanism design was not optimal and the true security provided by the
machine turned out to be much weaker than predicted by the Germans.
According to the German inventors, the number of possible combinations of internal rotor
positions was impressive, too large to make it possible to break the security by using brute force
attacks. However, due to the fact that each bit of the letter was encoded separately (each bit was
passed through only two rotors), the actual number of possible combinations was much smaller.
Also, because all the sometimes-rotating wheels turned at the same moment, the machine
produced relatively long parts of ciphertext that were not affected by those wheels (between their
turns). This turned out to be the crucial drawback of the cipher.
As a result, the team of British code breakers managed to build their own machine for decrypting
intercepted messages. The Allies were able to read all German communication encrypted by
the Lorenz machine.
Image:
The Lorenz SZ42 machine
Usage
The Enigma machines started to be used commercially in the early 1920s but gained most
popularity for their military applications. During World War II they were used by the German
armies and became their main cryptographic tool for encryption and decryption of tactical and
strategic communication.
Even after the war, the Enigma machines were still used by various agencies in some countries,
until it was revealed in the second half of the 20th century that their security had been
compromised by the Allies as early as the beginning of World War II.
Algorithm
The Enigma machines were produced in many versions and they were continuously improved
over time. As for other cryptographic rotor machines, the strength of encryption depended mainly
on several rotors, each with 26 electrical connections corresponding to the 26 letters of the alphabet.
To make the cipher even stronger, the Germans added some more elements, like the plugboard.
Also, to allow encryption and decryption by using the same machines with the same parameters,
an additional important device was added, called the reflector.
In the most popular Enigma version, a signal from the key pressed by the user passed first
through the plugboard, then the entry wheel, then through the three moving rotors until it reached
the reflector. After being processed by the reflector, the signal went all the way back through
the three rotors, the entry wheel and the plugboard up to the output panel of lamps.
Rotors
The number of rotors in different Enigma versions varied. At the beginning, the German army
used a version with three different available rotors. All three rotors were installed in the
machines each day and used for encryption. The rotor ordering (Walzenlage) and the ring
settings (Ringstellung) were a part of the secret daily configuration, distributed in
codebooks. The ring settings were the relative positions of the alphabet ring to the rotor wiring.
The initial position of each rotor was supposed to be set randomly by the operator before sending
a message.
Later (in December 1938), two new rotors were added, but still only three rotors were
installed in the machines each day. The number of possible initial combinations increased
significantly.
The Naval version of Enigma was distributed with more rotors than the ordinary Army version.
Starting from six, the number of available rotors gradually increased up to eight. Also, the Naval
Enigma discs rotated with a higher frequency due to a different number of notches. The later
versions of Naval Enigma used four rotors at once, instead of three.
The plugboard
The military versions of Enigma contained an additional element called
the plugboard (Steckerbrett in German). If installed, it would connect the keyboard output to the
rest of the machine.
The plugboard allowed the operator to create pairs of letters, by using cables plugged into their
corresponding connectors. The signals from the letters connected by a cable were swapped
twice: first before they entered the main rotor mechanism, and then at the end of
the encryption process, just before producing the output.
It was possible to create up to 13 connections but usually only 10 were used. The connections of
the plugs in the plugboard (Steckerverbindungen) were a part of the initial everyday configuration,
available from the secret codebook.
The plugboard turned out to be a useful feature, which significantly increased the strength of the
cipher.
There were several versions of the reflector. It made using the machine much easier and was one
of the reasons for the popularity of Enigma. On the other hand, because of its mathematical
properties (for example, no letter could ever be encrypted to itself), it significantly reduced
the cipher strength.
Security of Enigma
Because the Germans used Enigma so widely during World War II, breaking it was
an extremely important success for the Allies. It allowed them to read many kinds of intercepted
communication for almost the whole war.
The history of Enigma cryptanalysis is undoubtedly fascinating, but because of the many versions
of the machine and the many accounts written from different perspectives, this website is
simply not large enough to accommodate the topic.
One should refer to the books, lectures and simulations that deal with the history of Enigma
cryptanalysis.
The first three-rotor military Enigma machines were broken by the Polish Biuro Szyfrów agency
in 1932, long before the outbreak of World War II. From then on, Polish intelligence was able
to read the messages encoded by Enigma almost in real time. Three cryptologists had a
particularly great impact on breaking the cipher: Marian Rejewski (1905-1980), Jerzy Różycki
(1909-1942) and Henryk Zygalski (1908-1978).
Just before the outbreak of World War II, when the Germans added more rotor designs to Enigma,
thus making decoding impractical for the Poles, Polish intelligence handed over its complete
documentation to the French and the British.
Over the following years, the burden of breaking new Enigma versions was taken on chiefly by
British intelligence. The teams located at Bletchley Park were able to break many
Enigma versions. More importantly, the British commanders were able to use the received
information to their advantage. Alan Turing (1912-1954) was a remarkable British cryptologist and
mathematician of that time.
Towards the end of the war, the Americans built huge and powerful machines that were able
to break the latest and most complex versions of the Naval Enigma (fitted with four rotors).
Images:
The Enigma machine with three rotors
The Enigma plugboard (with two cables used)
Despite its simplicity and susceptibility to attacks, the simple XOR cipher was used in many
commercial applications, thanks to its speed and uncomplicated implementation.
Usage
The simple XOR cipher was quite popular in the early days of computing, on the MS-DOS and
Macintosh operating systems.
Algorithm
The simple XOR cipher is a variation of the Vigenère cipher. It differs from the original version
because it operates on bytes stored in computer memory instead of letters.
Instead of adding two alphabet letters, as in the original Vigenère cipher, the XOR algorithm
combines each successive plaintext byte with the next secret key byte using the XOR operation.
After using the last secret key byte, one returns to the first byte (as in Vigenère encryption).
Decryption takes the same steps as encryption: each successive ciphertext byte is combined with
the next secret key byte using the XOR operation.
Both encryption and decryption can be presented using the following equations:
M XOR K = C
C XOR K = M
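The two equations can be illustrated with a minimal Python sketch. The function name `xor_cipher`, the sample message and the key are illustrative assumptions, not part of the original listing.

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the next key byte, wrapping around the key."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

message = b"ATTACK AT DAWN"   # hypothetical plaintext
key = b"KEY"                  # hypothetical secret key

ciphertext = xor_cipher(message, key)     # M XOR K = C
recovered = xor_cipher(ciphertext, key)   # C XOR K = M
```

Because XOR is its own inverse, the same function serves for both encryption and decryption.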
Security of the simple XOR cipher
The simple XOR cipher is quite easy to break. It doesn't offer better protection than other
classical polyalphabetic substitution ciphers. Using a computer, it is possible to break the cipher
in a relatively short time.
Almost always, the first step in breaking the cipher is to guess the length of the secret key. It
can usually be estimated by calculating the index of coincidence of the ciphertext.
After determining the length of the key, one should write down the same ciphertext in two lines,
one under another. Bytes in the lower line should be offset by the secret key length with respect
to the same bytes in the upper line. Then, after combining both lines with XOR (each pair of bytes
in the same column), the key bytes cancel out and one receives a sequence of bytes that depends
only on the plaintext.
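The cancellation of the key can be checked with a short Python sketch. The helper names, the sample plaintext and the 4-byte key are assumptions made for illustration only.

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR, as described in the Algorithm section."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

def xor_with_shift(data: bytes, shift: int) -> bytes:
    """XOR every byte with the byte that lies `shift` positions later."""
    return bytes(a ^ b for a, b in zip(data, data[shift:]))

plaintext = b"the quick brown fox jumps over the lazy dog"
key = b"\x13\x37\xc0\xde"                 # hypothetical 4-byte key
ciphertext = xor_cipher(plaintext, key)

# Offsetting by the true key length pairs up bytes encrypted with the same
# key byte, so the key disappears: C[i] ^ C[i+4] == M[i] ^ M[i+4].
key_free = xor_with_shift(ciphertext, len(key))
```

The result still has to be attacked with language statistics, but it no longer depends on the key.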
Implementation
An application written in C that encrypts a given text file using a simple XOR cipher:
#include <stdio.h>
Symmetric Ciphers
Symmetric ciphers use the same cryptographic keys for both encryption of plaintext and
decryption of ciphertext. They are faster than asymmetric ciphers and allow encrypting large sets
of data. However, they require sophisticated mechanisms to securely distribute the secret keys
to both parties.
Definition: A symmetric cipher defined over (K, M, C) is a pair of efficient algorithms (E, D), where:
• E: K × M -> C
• D: K × C -> M
such that for every m belonging to M and every k belonging to K the following equality holds:
D(k, E(k, m)) = m
Usage
It has been proven that OTP is impossible to crack if it is used correctly. It has the perfect
secrecy property and allows very fast encryption and decryption. However, the secret key must be
at least as long as the message, which makes it quite inconvenient to use when sending large
amounts of electronic data.
Algorithm
Data encryption and decryption with OTP take place in the same way. All bytes of
the message (or of the ciphertext) are combined using XOR with the bytes of the secret key.
The bytes are processed one by one, and each operation produces one output byte:
m_i XOR k_i = c_i
c_i XOR k_i = m_i
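A minimal Python sketch of the byte-by-byte operation is shown below. The function name `otp_xor` and the sample message are assumptions; the key is drawn from the standard `secrets` module only as one possible source of random bytes.

```python
import secrets

def otp_xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a key that must be at least as long as the data."""
    if len(key) < len(data):
        raise ValueError("the key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"MEET AT NOON"                   # hypothetical plaintext
key = secrets.token_bytes(len(message))     # fresh random key, used only once

ciphertext = otp_xor(message, key)     # m_i XOR k_i = c_i
decrypted = otp_xor(ciphertext, key)   # c_i XOR k_i = m_i
```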
Using the same key repeatedly
Each part of the secret key can be used only once, to encrypt exactly one part of the message
(of the same length, of course). Using the same key bytes more than once allows the attacker
to discover the XOR of the two original messages:
M1 XOR K = C1
M2 XOR K = C2
C1 XOR C2 = M1 XOR K XOR M2 XOR K = M1 XOR M2
Having the XOR of two original messages, the intruder can try to break the cipher by using
attacks based on language and encoding features.
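The effect of reusing a key can be reproduced in a few lines of Python. The two messages and the helper `xor_bytes` are illustrative assumptions.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings position by position."""
    return bytes(x ^ y for x, y in zip(a, b))

m1 = b"ATTACK AT DAWN"               # hypothetical first message
m2 = b"RETREAT AT SIX"               # hypothetical second message, same length
key = secrets.token_bytes(len(m1))   # the same key is (wrongly) used twice

c1 = xor_bytes(m1, key)
c2 = xor_bytes(m2, key)

# The attacker sees only c1 and c2, yet the key cancels out:
leak = xor_bytes(c1, c2)             # equals M1 XOR M2

# If one message is known or guessed, the other follows immediately.
guessed_m2 = xor_bytes(leak, m1)
```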
Providing no integrity
It is possible to modify the ciphertext in such a way that the receiver is not able to detect
the change. What is worse, the changes have a predictable impact on the message: flipping a bit
of the ciphertext flips the same bit of the decrypted plaintext. If the attackers know
the structure of the message, they are able to change only the desired parts of the message.
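The following Python sketch shows such a targeted modification. The message layout (an amount stored at bytes 4 to 6) is a hypothetical example chosen for illustration.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings position by position."""
    return bytes(x ^ y for x, y in zip(a, b))

message = b"PAY 100 USD TO BOB"          # hypothetical message format
key = secrets.token_bytes(len(message))
ciphertext = xor_bytes(message, key)

# The attacker does not know the key, but knows that bytes 4..6 hold "100".
delta = xor_bytes(b"100", b"999")
tampered = ciphertext[:4] + xor_bytes(ciphertext[4:7], delta) + ciphertext[7:]

# The receiver decrypts without noticing anything unusual.
received = xor_bytes(tampered, key)
```

Detecting this kind of change requires a separate integrity mechanism, which OTP itself does not provide.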
K = K1 XOR K2 XOR K3
C = M XOR K1 XOR K2 XOR K3
M = C XOR K1 XOR K2 XOR K3
Block Diagram of OTP Algorithm
Maths:
XOR
The only operation during the OTP encryption and decryption is Exclusive Or (XOR). The key
bytes are added XOR to the data bytes, one after another.
Each time, the 8 bits of the first byte are combined using XOR with the 8 bits of the second byte.
b1 b2 b1 XOR b2
0 0 0
0 1 1
1 0 1
1 1 0
Implementation
OTP encryption implemented in C++:
return ciphertext;
}
RC4
STREAM CIPHER WITH SYMMETRIC SECRET KEY
Usage
RC4 was designed by Ron Rivest of RSA Security in 1987. Its implementation was a trade secret
until September 1994, when it was anonymously posted to the Cypherpunks mailing list. RC4 is
often referred to as ARCFOUR or ARC4 to avoid problems with the trademarked RC4 name.
The cipher is officially named "Rivest Cipher 4", but the acronym RC is alternatively
understood to stand for "Ron's Code".
RC4 was for many years one of the most popular ciphers. It was widely used in popular protocols,
for example to protect Internet traffic in TLS (Transport Layer Security) or to protect wireless
networks in WEP (Wired Equivalent Privacy).
Algorithm
RC4 is a symmetric stream cipher. It operates by creating long keystream sequences and combining
them with data bytes.
RC4 encrypts data by combining it using XOR, byte by byte, with keystream bytes. The
whole RC4 algorithm is based on creating keystream bytes. The keystream is produced from
a one-dimensional table called the T table.
Speed of RC4
The RC4 algorithm was designed especially for software implementations, because it only
manipulates single bytes. Unlike many other stream ciphers, it doesn't use linear feedback shift
registers (LFSRs), which can be implemented efficiently in hardware but are not as fast in software.
Security of RC4
The cipher was created quite a long time ago and it has some weaknesses which have been
addressed in modern stream ciphers. It is possible to find keystream byte values that are slightly
more likely to occur than others. In fact, over the last 20 years, several such biases
have been found, and some attacks based on this weakness were discovered.
Probably the most important weakness of the RC4 cipher is its insufficient key schedule. Because
of that issue, it is possible to obtain some information about the secret key from the first bytes
of the keystream. It is recommended to simply discard a number of initial keystream bytes. This
improvement is known as RC4-dropN, where N is usually a multiple of 256.
RC4 does not take a separate nonce alongside the key for every encryption. Therefore, the
cryptosystem must ensure that keystreams are unique and specify how to combine
the nonce with the original secret key. The best idea would be to hash the nonce and the key
together to generate the base for creating the RC4 keystream. Unfortunately, many applications
simply concatenate the key and the nonce, which makes them vulnerable to so-called related-key
attacks. This weakness of RC4 was used in the Fluhrer, Mantin and Shamir (FMS) attack against
WEP, published in 2001.
Implementation:
Keystream Initialisation
Initialisation of the T table, used for generating keystream bytes. K is the secret key, an array
of length k_len.
for i from 0 to 255
    T[i] := i
endfor
x_temp := 0
for i from 0 to 255
    x_temp := (x_temp + T[i] + K[i mod k_len]) mod 256
    swap(T[i], T[x_temp])
endfor
Keystream Generation
To generate keystream bytes, the loop below is executed for as long as new bytes are needed.
p1 := 0
p2 := 0
while GeneratingOutput
    p1 := (p1 + 1) mod 256
    p2 := (p2 + T[p1]) mod 256
    swap(T[p1], T[p2])
    send(T[(T[p1] + T[p2]) mod 256])
endwhile
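The two pseudocode fragments above can be translated into a short Python sketch. The function names are assumptions; the variable names T, p1 and p2 follow the pseudocode. The sketch is meant for study only, since RC4 should not be used to protect real data.

```python
def rc4_init(key: bytes) -> list:
    """Keystream initialisation: build the T table from the secret key."""
    T = list(range(256))
    x_temp = 0
    for i in range(256):
        x_temp = (x_temp + T[i] + key[i % len(key)]) % 256
        T[i], T[x_temp] = T[x_temp], T[i]
    return T

def rc4_keystream(T: list):
    """Keystream generation: yield one keystream byte per iteration."""
    T = T[:]                      # work on a copy of the table
    p1 = p2 = 0
    while True:
        p1 = (p1 + 1) % 256
        p2 = (p2 + T[p1]) % 256
        T[p1], T[p2] = T[p2], T[p1]
        yield T[(T[p1] + T[p2]) % 256]

def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing the data with the keystream."""
    return bytes(d ^ k for d, k in zip(data, rc4_keystream(rc4_init(key))))

ciphertext = rc4(b"Key", b"Plaintext")
plaintext = rc4(b"Key", ciphertext)
```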
Salsa20
STREAM CIPHER WITH SYMMETRIC SECRET KEY
Usage
Salsa20 is a cipher that was submitted to the eSTREAM project, which ran from 2004 to 2008 and
was intended to promote the development of stream ciphers. It is considered to be a well-designed
and efficient algorithm. There are no known effective attacks on the Salsa20 family of
ciphers.
Algorithm
Salsa20 is a stream cipher that works on data blocks of size of 64 bytes.
Encryption
For each 64-byte data block, the algorithm uses the Salsa20 expansion function. The input to
the function is the secret key (which can have either 32 or 16 bytes) and an 8-byte
nonce concatenated with a block number, whose value ranges from 0 to 2^64 - 1
(it is also stored in 8 bytes). Every call to the expansion function increases the block number
by one.
The core of the Salsa20 encryption algorithm is a hash function which receives 64 bytes of input
data from the Salsa20 expansion function, mixes them, and eventually returns 64 bytes of
output. The Salsa20 hash function works on the received sequence of bytes, which consists of:
o the secret key.
o the nonce with the block number.
o four constant vectors supplied by the expansion function, whose values
depend on the size of the secret key.
The hash function operates on data divided into words. Every word contains 4 bytes and can
have values from 0 to 2^32 - 1. Therefore, the input data is 16 words long, a key contains 8 or
4 words, and the nonce has 2 words.
The output from the Salsa20 expansion function is added XOR to the 64-byte block of data.
The result is a 64-byte block of ciphertext.
Decryption
The same algorithm should be used during decryption. The data should be divided into parts of
the same size.
The output from the Salsa20 expansion function should be added XOR to the 64-byte block of
ciphertext. The result is a 64-byte block of plaintext.
Maths:
The operations of the Salsa20 algorithm are presented in order from the low-level functions to
the more complex functions, which use the functions described above them. In the formulas,
+ denotes addition modulo 2^32 and <<< denotes left rotation of a 32-bit word.
Quarterround Function
If x is a 4-word input:
x = (x0, x1, x2, x3)
then the function can be defined as follows:
quarterround(x) = (y0, y1, y2, y3)
where:
y1 = x1 XOR ((x0 + x3) <<< 7)
y2 = x2 XOR ((y1 + x0) <<< 9)
y3 = x3 XOR ((y2 + y1) <<< 13)
y0 = x0 XOR ((y3 + y2) <<< 18)
The Quarterround Function can be performed in place, without the need of allocating any
additional memory. First, x1 changes to y1, then x2 changes to y2, next x3 changes to y3, then
x0 changes to y0. The Quarterround Function is invertible because all the modifications above are
invertible.
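The Quarterround Function and its inverse can be sketched in Python as follows. The helper names `rotl32` and `quarterround_inverse` are assumptions introduced for this example; addition is taken modulo 2^32.

```python
MASK = 0xFFFFFFFF  # keeps every intermediate value within 32 bits

def rotl32(value: int, count: int) -> int:
    """Rotate a 32-bit word left by `count` bits."""
    return ((value << count) & MASK) | (value >> (32 - count))

def quarterround(x0, x1, x2, x3):
    """Apply the four steps in the order y1, y2, y3, y0."""
    y1 = x1 ^ rotl32((x0 + x3) & MASK, 7)
    y2 = x2 ^ rotl32((y1 + x0) & MASK, 9)
    y3 = x3 ^ rotl32((y2 + y1) & MASK, 13)
    y0 = x0 ^ rotl32((y3 + y2) & MASK, 18)
    return y0, y1, y2, y3

def quarterround_inverse(y0, y1, y2, y3):
    """Undo the four steps in the opposite order: x0, x3, x2, x1."""
    x0 = y0 ^ rotl32((y3 + y2) & MASK, 18)
    x3 = y3 ^ rotl32((y2 + y1) & MASK, 13)
    x2 = y2 ^ rotl32((y1 + x0) & MASK, 9)
    x1 = y1 ^ rotl32((x0 + x3) & MASK, 7)
    return x0, x1, x2, x3

result = quarterround(0x00000001, 0, 0, 0)
```

Each step XORs one word with a value computed from two other words, so repeating the same XOR in reverse order restores the input.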
Rowround Function
The Rowround Function takes 16 words as input, transforms them, and returns a 16-word
sequence.
This function is very similar to the Columnround Function but it operates on the words in
a different order.
If x is a 16-word input:
x = (x0, x1, x2, ..., x15)
then the function can be defined as follows:
rowround(x) = (y0, y1, y2, ..., y15)
where:
(y0, y1, y2, y3) = quarterround(x0, x1, x2, x3)
(y5, y6, y7, y4) = quarterround(x5, x6, x7, x4)
(y10, y11, y8, y9) = quarterround(x10, x11, x8, x9)
(y15, y12, y13, y14) = quarterround(x15, x12, x13, x14)
The 16-word input can be presented as a square matrix:
x0  x1  x2  x3
x4  x5  x6  x7
x8  x9  x10 x11
x12 x13 x14 x15
The rows in the matrix can be changed in parallel. Each of them is modified by the Quarterround
Function.
In the first row, the words are changed in the following order:
1. x1
2. x2
3. x3
4. x0
In the second row, the words are changed in the following order:
1. x6
2. x7
3. x4
4. x5
In the third row, the words are modified in the order:
1. x11
2. x8
3. x9
4. x10
Finally, in the last, fourth row, the words are changed in the order:
1. x12
2. x13
3. x14
4. x15
Columnround Function
The Columnround Function takes 16 words as input and returns a 16-word sequence.
This function is very similar to the Rowround Function but operates on the words in a different
order.
If x is a 16-word input:
x = (x0, x1, x2, ..., x15)
then the function can be defined as follows:
columnround(x) = (y0, y1, y2, ..., y15)
where:
(y0, y4, y8, y12) = quarterround(x0, x4, x8, x12)
(y5, y9, y13, y1) = quarterround(x5, x9, x13, x1)
(y10, y14, y2, y6) = quarterround(x10, x14, x2, x6)
(y15, y3, y7, y11) = quarterround(x15, x3, x7, x11)
The 16-word input can be presented as a square matrix:
x0  x1  x2  x3
x4  x5  x6  x7
x8  x9  x10 x11
x12 x13 x14 x15
The columns in the matrix can be changed in parallel. Each of them is modified by
the Quarterround Function.
In the first column, the words are changed in the following order:
1. x4
2. x8
3. x12
4. x0
In the second column, the words are modified in the following order:
1. x9
2. x13
3. x1
4. x5
In the third column, the words are changed in the order:
1. x14
2. x2
3. x6
4. x10
Finally, in the last, fourth column, the words are modified in the order:
1. x3
2. x7
3. x11
4. x15
Doubleround Function
The Doubleround Function takes 16 words as input and returns a 16-word sequence.
If x is a 16-word input, then the Doubleround Function can be defined as follows:
doubleround(x) = rowround(columnround(x))
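The Rowround, Columnround and Doubleround Functions differ only in which word indices are passed to the Quarterround Function, which the Python sketch below makes explicit. The index tables `ROWS` and `COLUMNS` and the helper `_apply` are naming assumptions; the indices themselves are copied from the definitions above.

```python
MASK = 0xFFFFFFFF

def rotl32(value, count):
    return ((value << count) & MASK) | (value >> (32 - count))

def quarterround(x0, x1, x2, x3):
    y1 = x1 ^ rotl32((x0 + x3) & MASK, 7)
    y2 = x2 ^ rotl32((y1 + x0) & MASK, 9)
    y3 = x3 ^ rotl32((y2 + y1) & MASK, 13)
    y0 = x0 ^ rotl32((y3 + y2) & MASK, 18)
    return y0, y1, y2, y3

# Word indices fed to each quarterround call, as in the definitions above.
ROWS = [(0, 1, 2, 3), (5, 6, 7, 4), (10, 11, 8, 9), (15, 12, 13, 14)]
COLUMNS = [(0, 4, 8, 12), (5, 9, 13, 1), (10, 14, 2, 6), (15, 3, 7, 11)]

def _apply(x, groups):
    """Run quarterround on each group of four indices and store the results."""
    y = list(x)
    for group in groups:
        outputs = quarterround(*(x[i] for i in group))
        for index, value in zip(group, outputs):
            y[index] = value
    return y

def rowround(x):
    return _apply(x, ROWS)

def columnround(x):
    return _apply(x, COLUMNS)

def doubleround(x):
    return rowround(columnround(x))

sample = [1, 0, 0, 0] * 4          # x0 = x4 = x8 = x12 = 1, all other words 0
rows_out = rowround(sample)
columns_out = columnround(sample)
```

The four groups in each table do not share any index, which is why the rows (or columns) can be processed in parallel.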
Littleendian Function
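The Littleendian Function converts a 4-byte sequence into a word, treating the first byte as
the least significant one:
littleendian(b0, b1, b2, b3) = b0 + 2^8 * b1 + 2^16 * b2 + 2^24 * b3
The Littleendian Function is invertible. It simply changes the order of bytes in a word.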
Salsa20 Hash Function
First, the Hash Function creates 16 words from the received 64-byte input. If the input is
a sequence of 64 bytes:
input =(b0, b1, b2, ..., b63)
then 16 words are created as below:
w0 = littleendian(b0, b1, b2, b3)
w1 = littleendian(b4, b5, b6, b7)
[...]
w15 = littleendian(b60, b61, b62, b63)
Then, all 16 words are modified by 10 iterations of the Doubleround Function:
(x0, x1, ..., x15) = doubleround^10(w0, w1, ..., w15)
Finally, the 16 words received as input are added modulo 2^32 to the modified 16 words
and changed to 64 new bytes using the inverse of the Littleendian Function. The bytes are output
from the Salsa20 Hash Function (here || denotes concatenation of byte sequences):
output = littleendian^-1(x0+w0) || littleendian^-1(x1+w1) || ... || littleendian^-1(x15+w15)
Salsa20 Expansion Function
The Salsa20 Expansion Function takes two sequences of bytes. The first sequence can have
either 16 or 32 bytes and the second sequence (n) is always 16 bytes long. The function returns
another sequence of 64 bytes.
If the first sequence is 32 bytes long, then it is divided into two shorter sequences of 16 bytes
(k0 and k1). The Salsa20 Expansion Function is defined by using the Salsa20 Hash Function,
as shown below:
Salsa20Expansion_{k0,k1}(n) = Salsa20Hash(a0, k0, a1, n, a2, k1, a3)
where:
a0 = (101, 120, 112, 97)
a1 = (110, 100, 32, 51)
a2 = (50, 45, 98, 121)
a3 = (116, 101, 32, 107)
If the first sequence is 16 bytes long (k), then the Salsa20 Expansion Function is defined by using
the Salsa20 Hash Function as below:
Salsa20Expansion_k(n) = Salsa20Hash(b0, k, b1, n, b2, k, b3)
where:
b0 = (101, 120, 112, 97)
b1 = (110, 100, 32, 49)
b2 = (54, 45, 98, 121)
b3 = (116, 101, 32, 107)
The constant values of the vector (a0, a1, a2, a3) mean 'expand 32-byte k' in ASCII code. Similarly,
the constant values of the second vector (b0, b1, b2, b3) mean 'expand 16-byte k' in ASCII.
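Putting the pieces together, the Hash Function, the Expansion Function and the encryption loop might look as follows in Python. This is a sketch for study, not a vetted implementation: the function names are assumptions, `struct` with the `<` prefix is used as a stand-in for the Littleendian Function, and the block number is assumed to be stored as 8 little-endian bytes after the nonce.

```python
import struct

MASK = 0xFFFFFFFF
ROWS = [(0, 1, 2, 3), (5, 6, 7, 4), (10, 11, 8, 9), (15, 12, 13, 14)]
COLUMNS = [(0, 4, 8, 12), (5, 9, 13, 1), (10, 14, 2, 6), (15, 3, 7, 11)]

def rotl32(value, count):
    return ((value << count) & MASK) | (value >> (32 - count))

def quarterround(x0, x1, x2, x3):
    y1 = x1 ^ rotl32((x0 + x3) & MASK, 7)
    y2 = x2 ^ rotl32((y1 + x0) & MASK, 9)
    y3 = x3 ^ rotl32((y2 + y1) & MASK, 13)
    y0 = x0 ^ rotl32((y3 + y2) & MASK, 18)
    return y0, y1, y2, y3

def _apply(x, groups):
    y = list(x)
    for group in groups:
        for index, value in zip(group, quarterround(*(x[i] for i in group))):
            y[index] = value
    return y

def doubleround(x):
    return _apply(_apply(x, COLUMNS), ROWS)   # rowround(columnround(x))

def salsa20_hash(data: bytes) -> bytes:
    """64 bytes in, 64 bytes out: 10 doublerounds plus the feed-forward addition."""
    if len(data) != 64:
        raise ValueError("input must be 64 bytes long")
    w = list(struct.unpack("<16I", data))      # 16 little-endian words
    x = w
    for _ in range(10):
        x = doubleround(x)
    return struct.pack("<16I", *((a + b) & MASK for a, b in zip(x, w)))

def salsa20_expand(key: bytes, n: bytes) -> bytes:
    """Interleave the constants, the key and the 16-byte n, then hash."""
    if len(n) != 16:
        raise ValueError("n must be 16 bytes long")
    if len(key) == 32:
        k0, k1, c = key[:16], key[16:], b"expand 32-byte k"
    elif len(key) == 16:
        k0, k1, c = key, key, b"expand 16-byte k"
    else:
        raise ValueError("key must be 16 or 32 bytes long")
    return salsa20_hash(c[0:4] + k0 + c[4:8] + n + c[8:12] + k1 + c[12:16])

def salsa20_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR each 64-byte block with its keystream block."""
    out = bytearray()
    for block_number, start in enumerate(range(0, len(data), 64)):
        n = nonce + struct.pack("<Q", block_number)
        keystream = salsa20_expand(key, n)
        out += bytes(d ^ k for d, k in zip(data[start:start + 64], keystream))
    return bytes(out)
```

A call such as `salsa20_xor(key, nonce, salsa20_xor(key, nonce, data))` returns the original data, because encryption and decryption are the same XOR with the same keystream.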
Usage
Because of its poor design, the effective key size is only about 16 bits.
It was compromised in 1999 by a brute force attack. In that year, the DeCSS application was
published, which was able to break the CSS protection quite quickly.
Algorithm
CSS is a stream cipher. The internal state machine is initialized using a 5-byte secret key.
The state machine has 42 bits and contains two linear feedback shift registers (LFSRs). The stream
of bytes generated by the registers is combined using XOR with the stream of input data. Before
this step, data bytes are substituted using a lookup table and keystream bytes pass through
optional inverters. Several lookup tables are defined, and they contain different coefficients.
The CSS cipher was created to protect audiovisual data on DVDs. There are a few different keys
in the whole CSS system. They are used for mutual authentication and for the encryption of sectors
and whole files. Some keys are stored encrypted and must be decrypted before use (using
the CSS cipher, where the encrypted data are the bytes of the wanted secret key).
Because CSS performs many different tasks (working with audiovisual data and with different
kinds of keys), a few modes of the CSS algorithm exist. All of them are broadly similar, but
they differ in details (for example, the coefficients in the tables).
When the CSS algorithm decrypts one of the secret keys, the 5 encoded bytes of this key are mixed
with the bytes received from the registers in a more complicated way. Instead of simple lookup
tables and XOR with the keystream, there is a key mangling operation. Each byte goes through two
lookup tables and is combined twice using XOR with one byte from the keystream.
CSS Modes
o authentication and establishing connection - the host establishes
communication with DVD and both sides create a bus key,
o disc key decryption - the host receives and decrypts a disc key using one of its
player keys,
o title key decryption - the host receives and decrypts a title key using
the obtained disc key,
o audiovisual data decryption - using the obtained title key and a sector key read
from a DVD, the host decrypts audiovisual data stored in one sector of
the DVD.
CSS Keys
o player keys - stored in a DVD driver; they are used for decryption of a disc key
(which is stored on the DVD),
o disc key - encrypted on the DVD and decrypted by the DVD driver using its player
keys; it is used for decryption of a title key,
o sector key - stored unencrypted in each sector of the DVD and read by the DVD
driver; it is used together with a title key for decryption of audiovisual data
in one DVD sector,
o title keys - encrypted on the DVD and decrypted by the DVD driver using the disc
key obtained earlier; they are used for decryption of audiovisual data,
o session key or bus key - a random key created during authentication between
the host and the DVD drive; it is used for encryption of future communication
between them.
CSS System
The whole CSS system consists of three elements: a DVD, a DVD driver and a host (a computer,
an application for playing DVDs).
Every DVD contains an encoded unique disc key. Similarly, each DVD driver has a few player
keys. Each DVD has a hidden sector, which contains many copies of the disc key, each encrypted
with one of the 409 existing player keys. On writeable DVDs, the hidden sector is cleared and
can't be changed. A DVD driver reads the DVD and uses its player keys to decrypt one of
the copies of the encrypted disc key on the DVD.
After each attempt, which yields a result that may be a correct disc key, the DVD driver performs
the following test: using the 5 bytes obtained as a key, it tries to decrypt a test
sequence (stored on the DVD), which is the real disc key encrypted using the real disc key. If
the DVD driver receives the same 5 bytes as the 5 bytes it used as a key, then it is certain that
those 5 bytes are the real and correct disc key.
A DVD usually contains a few encoded title keys. Each of them protects one part of the movie,
called a VTS (Video Title Set). Each VTS contains a set of files named VTS_AA_B.CCC, where
every A and B means one digit. CCC may be one of three possible file extensions
(.VOB, .BUP or .IFO). All the files which have the same number AA belong to the same VTS.
Title keys are decrypted using the disc key.
Each data sector on a DVD is 2048 bytes long (so it has the same size as sectors on CD-ROMs).
A sector starts with an MPEG-2 PACK header, 128 bytes long. After the header there are either
audiovisual data (called stream data; for example MPEG-2 data or AC-3 data) or other information
(PCI or DSI). A sector key is stored unencrypted in bytes 80-84 of the header.
If a sector contains audiovisual data, then the MPEG-2 PACK header is followed by a header
of audiovisual data (stream header), which contains 2 bits determining the encryption type.
o 00 - no encryption
o 01 - CSS encryption
o 10 - reserved/not used
o 11 - CPRM encryption
If a sector doesn't contain audiovisual data, those bits are not stored in this sector (because
non-stream data are not encrypted).
Two keys are used for decryption of data stored on a DVD: a sector key (a different value for
every sector) and a title key (each DVD usually contains a few title keys, one for each VTS).
The first two bytes of the title key are combined using XOR with the first two bytes of
the sector key (bytes 80-81 of the sector header) and then passed into the LFSR-17 register.
The last three bytes of the title key are combined using XOR with the last three bytes of
the sector key (bytes 82-84 of the sector header) and passed into the LFSR-25 register.
The host obtains the title key using the disc key decrypted earlier.
Each DVD also stores a region code, which determines the part of the world where the DVD can
be played.
CSS Protocol
1. Mutual authentication
For decryption of a DVD in a DVD driver, a host must authenticate itself to the DVD using
a challenge-response protocol and CSS encryption. Theoretically, the DVD must also
authenticate itself to the host; however, a host's application usually skips this check.
During the authentication both sides use a predefined authentication key - F4 10 45 A3 E2.
The authentication requires the following steps:
   1. The host receives an AGID (Authentication Grant ID) number from the DVD
   drive. The AGID is used as a session ID for the current communication.
   2. The host generates 10 random bytes and sends them to the DVD driver. The
   driver encrypts them and sends a 5-byte sequence back to the host.
   3. The host decrypts the driver's answer and checks whether the result is the same as
   the challenge previously sent to the driver. The DVD driver can answer using one of
   32 variants, so the host must make 32 tests to check which of them has been
   chosen by the driver.
   4. The DVD driver generates a random 10-byte sequence and sends it to the
   host. The host encrypts it.
   5. The host sends the encrypted sequence, the new key KEY2, back to the driver.
   At this point, both the host and the driver know all 10 bytes - the two keys created by
   the host and the driver. The bus key is created by encrypting the 10-byte
   sequence using CSS in mode 3. It is 5 bytes long and prevents eavesdropping on
   future communication (particularly the sending of title and disc keys).
   6. The host reads a disc key from a hidden sector on the DVD.
   If the last step succeeds, the host will be able to read the DVD using ordinary
   SCSI read commands. Otherwise, the authentication fails.
2. Decoding a disc key
A DVD driver decrypts a disc key from a DVD using all its player keys. Each manufacturer
of DVD drivers usually possesses one or a few player keys and uses them in its products.
3. Reading and decoding title keys
Encrypted title keys are sent from a DVD to a host. The transmission of title keys, like
the entire transmission in general, is encrypted using a bus key.
Bits V and i are set to 1 (to prevent initialisation of the LFSR registers by zeros) and bit CC is set
to 0.
LFSR-17 Operations
Bits are shifted right by one position. The new bit, which appears in the leftmost position of
the register and at the output, is the sum modulo 2 (XOR) of the first and fifteenth bits.
LFSR-17 register operations
LFSR-25 Operations
Bits are shifted right by one position. The new bit, which appears in the leftmost position of
the register and at the output, is the sum modulo 2 (XOR) of four bits from the register.
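The shift-and-feedback step described for both registers can be sketched generically in Python. The function names are assumptions, and so is the bit numbering: the sketch assumes that the "first" and "fifteenth" bits are counted from the rightmost bit, which the text does not state explicitly. The four LFSR-25 tap positions are not listed in the text, so they are left out.

```python
def lfsr_step(state: int, width: int, taps):
    """Shift right by one; the XOR of the tapped bits enters on the left
    and is also returned as the output bit."""
    feedback = 0
    for tap in taps:
        feedback ^= (state >> tap) & 1
    state = (state >> 1) | (feedback << (width - 1))
    return state, feedback

def lfsr_bits(state: int, width: int, taps, count: int):
    """Collect `count` output bits, starting from the given state."""
    bits = []
    for _ in range(count):
        state, bit = lfsr_step(state, width, taps)
        bits.append(bit)
    return bits

# Assumed numbering: bit 1 is the rightmost bit, so the taps are indices 0 and 14.
LFSR17_TAPS = (0, 14)
stream = lfsr_bits(0x1ABCD, 17, LFSR17_TAPS, 16)   # hypothetical 17-bit seed

# A register loaded with zeros never leaves the all-zero state.
stuck = lfsr_bits(0, 17, LFSR17_TAPS, 32)
```

The second example shows why an all-zero initial state must be avoided: the register would output zeros forever and the data would pass through unencrypted.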
Inverter Modes
Depending on the current CSS mode, the inverters invert or do not invert the output bits
from the LFSR registers.
The following table shows for which registers and in which CSS modes the bits of each byte
are inverted.
Mode             LFSR-17   LFSR-25
Authentication   yes       no
Disc key         no        no
Lookup Table
Each byte to be encrypted or decrypted is replaced by another byte, based on one of the five
lookup tables used in CSS. There are different tables for encryption and decryption and different
tables for different CSS modes.
33 73 3B 26 63 23 6B 76 3E 7E 36 2B 6E 2E 66 7B
D3 93 DB 06 43 03 4B 96 DE 9E D6 0B 4E 0E 46 9B
57 17 5F 82 C7 87 CF 12 5A 1A 52 8F CA 8A C2 1F
D9 99 D1 00 49 09 41 90 D8 98 D0 01 48 08 40 91
3D 7D 35 24 6D 2D 65 74 3C 7C 34 25 6C 2C 64 75
DD 9D D5 04 4D 0D 45 94 DC 9C D4 05 4C 0C 44 95
59 19 51 80 C9 89 C1 10 58 18 50 81 C8 88 C0 11
D7 97 DF 02 47 07 4F 92 DA 9A D2 0F 4A 0A 42 9F
53 13 5B 86 C3 83 CB 16 5E 1E 56 8B CE 8E C6 1B
B3 F3 BB A6 E3 A3 EB F6 BE FE B6 AB EE AE E6 FB
37 77 3F 22 67 27 6F 72 3A 7A 32 2F 6A 2A 62 7F
B9 F9 B1 A0 E9 A9 E1 F0 B8 F8 B0 A1 E8 A8 E0 F1
5D 1D 55 84 CD 8D C5 14 5C 1C 54 85 CC 8C C4 15
BD FD B5 A4 ED AD E5 F4 BC FC B4 A5 EC AC E4 F5
39 79 31 20 69 29 61 70 38 78 30 21 68 28 60 71
B7 F7 BF A2 E7 A7 EF F2 BA FA B2 AF EA AA E2 FF
Block cipher algorithms are often able to combine data from different blocks in order to provide
additional security (e.g. AES in CBC mode).
Block ciphers may be described as efficient and deterministic functions which permute the contents
of data blocks. They simply mix all the bits in each block. The permutation functions must be
pseudorandom and the output should be indistinguishable from purely random data. To allow
decryption, the inverse permutations must be used. The inverse permutations also need to be
quite efficient.
Usage
DES is one of the most thoroughly examined encryption algorithms. In 1981 it was included
in ANSI standards as the Data Encryption Algorithm for the private sector.
At the beginning of the 21st century, DES started to be considered insecure, mainly due to
its relatively short secret key, which makes it vulnerable to brute force attacks. In 2001
the DES cipher was replaced by AES. DES is still one of the most popular ciphers.
Algorithm
DES uses a key which is 64 bits long; however, only 56 bits are actually used by the algorithm.
Every 8th bit of the key is a parity bit and can be used for parity control.
In the encryption process, the data is first divided into 64-bit long blocks. Then, each block
undergoes the following operations:
1. Initial permutation rearranges bits in a certain, predefined way. This step does not
enhance the security of algorithm. It was introduced to make passing data into
encryption machines easier, at the times when the cipher was invented.
2. The input data is divided into two 32-bit parts: the left one and the right one.
3. 56 bits are selected from the 64-bit key (Permutation PC-1). They are then divided into
two 28-bit parts.
4. Sixteen rounds of the following operations (so-called Feistel functions) are then
performed:
   1. Both halves of the key are rotated left by one or two bits (specified for each round).
   Then 48 subkey bits are selected by Permutation PC-2.
   2. The right half of data is expanded to 48 bits using the Expansion Permutation.
   3. The expanded half of data is combined using the XOR operation with the 48-bit
   subkey chosen earlier.
   4. The combined data is divided into eight 6-bit pieces. Each piece is then an input to
   one of the S-Boxes (the first 6-bit piece is the input to the first S-Box, the second
   6-bit piece enters the second S-Box, and so on). The first and the last bits stand
   for the row, and the rest of the bits define the column of an S-Box table. After
   determining the location in the table, the value is read and converted to binary
   format. The output from each S-Box is 4 bits long, so the output from all S-Boxes
   is 32 bits long. Each S-Box has a different structure.
   5. The output bits from the S-Boxes are combined, and they undergo the P-Box
   Permutation.
   6. Then, the bits of the changed right side are combined using XOR with the bits of
   the left side.
   7. The modified left half of data becomes the new right half, and the previous right
   half becomes the new left half.
5. After all sixteen rounds, the left and the right halves of data are swapped and joined
into one 64-bit block.
6. The Final Permutation is performed.
During decryption, the same set of operations is performed, but the subkeys are applied
in reverse order (compared to encryption).
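The round structure, and the reason why decryption only needs the subkeys in reverse order, can be seen in a toy Feistel network written in Python. This is not DES: the round function `toy_round` (a truncated SHA-256 hash) and the subkey values are hypothetical stand-ins for the expansion, S-Boxes, P-Box and key schedule described above.

```python
import hashlib

def toy_round(half: int, subkey: int) -> int:
    """Hypothetical round function: mixes a 32-bit half with a 48-bit subkey."""
    digest = hashlib.sha256(half.to_bytes(4, "big") + subkey.to_bytes(6, "big")).digest()
    return int.from_bytes(digest[:4], "big")

def feistel(block: int, subkeys) -> int:
    """Process one 64-bit block; the same code encrypts and decrypts."""
    left, right = block >> 32, block & 0xFFFFFFFF
    for subkey in subkeys:
        # New left half is the old right half; new right half is the old
        # left half XORed with the round function output.
        left, right = right, left ^ toy_round(right, subkey)
    # After the last round the halves are swapped and joined.
    return (right << 32) | left

# 16 made-up 48-bit subkeys (a real cipher derives them from the secret key).
subkeys = [(0x123456789ABC * (i + 1)) & 0xFFFFFFFFFFFF for i in range(16)]

plaintext = 0x0123456789ABCDEF
ciphertext = feistel(plaintext, subkeys)
decrypted = feistel(ciphertext, subkeys[::-1])   # same rounds, reversed subkeys
```

The round function never has to be inverted, which is why the S-Boxes of DES do not need to be reversible.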
Let's assume that somebody is designing a hardware circuit which should do some encryption
with DES. The data will be received in blocks of 8 bits. This means that there are 8 lines, each
yielding one bit at each clock. A common device for accumulating data is a shift register: the input
line plugs into a one-bit register, which itself plugs into another, which plugs into a third register,
and so on. At each clock, each register receives the contents from the previous register, and
the first register accepts the new bit. Therefore, the contents are shifted.
With an 8-bit bus, 8 shift registers are needed, each receiving 8 bits for every input block. The first
register receives bits 1, 9, 17, 25, 33, 41, 49 and 57. The second register receives bits 2, 10, 18,
..., and so on. After eight clocks, eight registers received the complete 64-bit block and it is time
to proceed with the DES algorithm itself.
If initial permutation was not used, then the first step of the first round would extract the 'left half'
(32 bits) which, at that point, would consist of the leftmost 4 bits of each of the 8 shift registers.
The 'right half' would also get bits from all the 8 shift registers. If you think of it as wires from
the shift registers to the units which use the bits, then you end up with a bunch of wires which
heavily cross each other. Crossing is doable but requires some circuit area, which is
the expensive resource in hardware designs.
On the other hand, if you consider that the wires must extract the input bits and permute them as
per the DES specification, you will find out that there is no crossing anymore. In other words,
the accumulation of bits into the shift registers inherently performs a permutation of the bits, which
is exactly the initial permutation of DES. By defining that initial permutation, the DES standard
says: 'well, now that you have accumulated the bits in eight shift registers, just use them in that
order, that's fine'.
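The argument above can be checked with a small simulation. This is only a sketch under assumptions: the 8-line bus is modelled as Python lists, and the read-out order (even-numbered lines first, then odd-numbered lines, newest bit first) is an observation read off the Initial Permutation table rather than something stated in the text. The names accumulate and read_out are hypothetical.

```python
# Sketch: 8 shift registers fed from an 8-bit bus (assumed model, not a hardware description).
IP = [
    58, 50, 42, 34, 26, 18, 10, 2,
    60, 52, 44, 36, 28, 20, 12, 4,
    62, 54, 46, 38, 30, 22, 14, 6,
    64, 56, 48, 40, 32, 24, 16, 8,
    57, 49, 41, 33, 25, 17, 9, 1,
    59, 51, 43, 35, 27, 19, 11, 3,
    61, 53, 45, 37, 29, 21, 13, 5,
    63, 55, 47, 39, 31, 23, 15, 7,
]

def accumulate(bit_numbers):
    """Feed 64 bit labels, 8 per clock, into 8 shift registers."""
    registers = [[] for _ in range(8)]
    for clock in range(8):
        for line in range(8):
            # The new bit enters at the front; older bits shift along.
            registers[line].insert(0, bit_numbers[8 * clock + line])
    return registers

def read_out(registers):
    """Read lines 2, 4, 6, 8 first, then lines 1, 3, 5, 7 (0-based indices below)."""
    order = [1, 3, 5, 7, 0, 2, 4, 6]
    return [bit for line in order for bit in registers[line]]

registers = accumulate(list(range(1, 65)))
print(read_out(registers) == IP)  # True
```

Under this model, simply reading the registers one after another reproduces the Initial Permutation table, so no extra crossing wires are needed.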
The same thing is done again at the end of the algorithm during the Final Permutation.
DES was designed at a time when 8-bit buses were the state of the art and one thousand
transistors were an awfully expensive amount of logic.
Security of DES
DES is considered to be a well-designed and effective algorithm. However, soon after
its publication, many cryptographers argued that its key is too short. Today,
a 56-bit key can be recovered relatively cheaply by a brute-force attack within a few
days.
It is quite easy to attack DES when some plaintext is known. The intruder can try all 2^56 possible
keys, looking for a key that decrypts an encrypted block to exactly the known plaintext.
In practice, it is enough to know two or three blocks
of plaintext to decide whether a candidate key that works for them will also
work for other blocks. The probability that a found key is incorrect and happens to decrypt
only the known plaintext blocks correctly is negligibly small.
The fastest known attacks on DES use linear cryptanalysis. They require 2^43 known blocks of
plaintext and their time complexity is around 2^39 to 2^43.
58 50 42 34 26 18 10 2
60 52 44 36 28 20 12 4
62 54 46 38 30 22 14 6
64 56 48 40 32 24 16 8
57 49 41 33 25 17 9 1
59 51 43 35 27 19 11 3
61 53 45 37 29 21 13 5
63 55 47 39 31 23 15 7
Initial Permutation Table
Left half
57 49 41 33 25 17 9
1 58 50 42 34 26 18
10 2 59 51 43 35 27
19 11 3 60 52 44 36
Right half
63 55 47 39 31 23 15
7 62 54 46 38 30 22
14 6 61 53 45 37 29
21 13 5 28 20 12 4
PC-1 Permutation Table
Expansion Permutation
Expansion Permutation initiates each round of Feistel functions. It expands the right half of data
from 32 bits to 48 bits.
32 1 2 3 4 5 4 5
6 7 8 9 8 9 10 11
12 13 12 13 14 15 16 17
16 17 18 19 20 21 20 21
22 23 24 25 24 25 26 27
28 29 28 29 30 31 32 1
Binary Rotation
In each round of Feistel functions, the 28-bit halves of the key are rotated left by one or two bits.
Round   Bits rotated
1 1
2 1
3 2
4 2
5 2
6 2
7 2
8 2
9 1
10 2
11 2
12 2
13 2
14 2
15 2
16 1
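The rotation schedule can be sketched as follows. The function names and the two sample 28-bit values are assumptions chosen for illustration. Because the shift amounts add up to 28, both halves return to their starting values after the sixteenth round.

```python
# Per-round left-rotation amounts, copied from the table above.
SHIFTS = [1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1]
MASK28 = (1 << 28) - 1

def rotl28(value, count):
    """Rotate a 28-bit value left by count bits."""
    return ((value << count) | (value >> (28 - count))) & MASK28

def rotated_halves(c0, d0):
    """Yield the (C, D) key halves after each of the 16 rounds."""
    c, d = c0, d0
    for shift in SHIFTS:
        c, d = rotl28(c, shift), rotl28(d, shift)
        yield c, d

# Arbitrary sample halves (hypothetical values, not derived from a real key).
halves = list(rotated_halves(0x0F0F0F0, 0x5555555))
print(sum(SHIFTS))                            # 28
print(halves[-1] == (0x0F0F0F0, 0x5555555))   # True
```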
14 17 11 24 1 5 3 28
15 6 21 10 23 19 12 4
26 8 16 7 27 20 13 2
41 52 31 37 47 55 30 40
51 45 33 48 44 49 39 56
34 53 46 42 50 36 29 32
PC-2 Permutation Table
S-Blocks
In the S-Boxes, each 6-bit input block is replaced by a 4-bit output.
If the input bits are marked as a1, a2, a3, a4, a5 and a6, then a1 and a6 form a 2-bit number
that selects the row, and a2, a3, a4 and a5 form a 4-bit number that selects the column (both
columns and rows are numbered from zero). The output of the block is the value found where the row and
column cross. For example, if the input to S1 is 101010, the row is 10 (2), the column is 0101 (5), and the output is 0110 (6).
x00 x00 x00 x00 x01 x01 x01 x01 x10 x10 x10 x10 x11 x11 x11 x11
00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x
0yy
yy0 14 4 13 1 2 15 11 8 3 10 6 12 5 9 0 7
0yy
yy1 0 15 7 4 14 2 13 1 10 6 12 11 9 5 3 8
1yy
yy0 4 1 14 8 13 6 2 11 15 12 9 7 3 10 5 0
1yy
yy1 15 12 8 2 4 9 1 7 5 11 3 14 10 0 6 13
S1
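The lookup rule can be sketched in a few lines. The function name sbox_lookup and the choice of bit strings as input are assumptions made for readability. The table is S1 as listed above.

```python
# S1, one list per row (rows are selected by bits a1 and a6).
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]

def sbox_lookup(table, bits):
    """Map a 6-character bit string a1..a6 to a 4-character output string."""
    row = int(bits[0] + bits[5], 2)   # outer bits a1 and a6
    column = int(bits[1:5], 2)        # inner bits a2..a5
    return format(table[row][column], "04b")

print(sbox_lookup(S1, "101010"))  # 0110
```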
x00 x00 x00 x00 x01 x01 x01 x01 x10 x10 x10 x10 x11 x11 x11 x11
00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x
0yy
yy0 15 1 8 14 6 11 3 4 9 7 2 13 12 0 5 10
0yy
yy1 3 13 4 7 15 2 8 14 12 0 1 10 6 9 11 5
1yy
yy0 0 14 7 11 10 4 13 1 5 8 12 6 9 3 2 15
1yy
yy1 13 8 10 1 3 15 4 2 11 6 7 12 0 5 14 9
S2
x00 x00 x00 x00 x01 x01 x01 x01 x10 x10 x10 x10 x11 x11 x11 x11
00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x
0yy
yy0 10 0 9 14 6 3 15 5 1 13 12 7 11 4 2 8
0yy
yy1 13 7 0 9 3 4 6 10 2 8 5 14 12 11 15 1
1yy
yy0 13 6 4 9 8 15 3 0 11 1 2 12 5 10 14 7
1yy
yy1 1 10 13 0 6 9 8 7 4 15 14 3 11 5 2 12
S3
x00 x00 x00 x00 x01 x01 x01 x01 x10 x10 x10 x10 x11 x11 x11 x11
00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x
0yy
yy0 7 13 14 3 0 6 9 10 1 2 8 5 11 12 4 15
0yy
yy1 13 8 11 5 6 15 0 3 4 7 2 12 1 10 14 9
1yy
yy0 10 6 9 0 12 11 7 13 15 1 3 14 5 2 8 4
1yy
yy1 3 15 0 6 10 1 13 8 9 4 5 11 12 7 2 14
S4
x00 x00 x00 x00 x01 x01 x01 x01 x10 x10 x10 x10 x11 x11 x11 x11
00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x
0yy
yy0 2 12 4 1 7 10 11 6 8 5 3 15 13 0 14 9
0yy
yy1 14 11 2 12 4 7 13 1 5 0 15 10 3 9 8 6
1yy
yy0 4 2 1 11 10 13 7 8 15 9 12 5 6 3 0 14
1yy
yy1 11 8 12 7 1 14 2 13 6 15 0 9 10 4 5 3
S5
x00 x00 x00 x00 x01 x01 x01 x01 x10 x10 x10 x10 x11 x11 x11 x11
00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x
0yy
yy0 12 1 10 15 9 2 6 8 0 13 3 4 14 7 5 11
0yy
yy1 10 15 4 2 7 12 9 5 6 1 13 14 0 11 3 8
1yy
yy0 9 14 15 5 2 8 12 3 7 0 4 10 1 13 11 6
1yy
yy1 4 3 2 12 9 5 15 10 11 14 1 7 6 0 8 13
S6
x00 x00 x00 x00 x01 x01 x01 x01 x10 x10 x10 x10 x11 x11 x11 x11
00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x
0yy
yy0 4 11 2 14 15 0 8 13 3 12 9 7 5 10 6 1
0yy
yy1 13 0 11 7 4 9 1 10 14 3 5 12 2 15 8 6
1yy
yy0 1 4 11 13 12 3 7 14 10 15 6 8 0 5 9 2
1yy
yy1 6 11 13 8 1 4 10 7 9 5 0 15 14 2 3 12
S7
x00 x00 x00 x00 x01 x01 x01 x01 x10 x10 x10 x10 x11 x11 x11 x11
00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x 00x 01x 10x 11x
0yy
yy0 13 2 8 4 6 15 11 1 10 9 3 14 5 0 12 7
0yy
yy1 1 15 13 8 10 3 7 4 12 5 6 11 0 14 9 2
1yy
yy0 7 11 4 1 9 12 14 2 0 6 10 13 15 3 5 8
1yy
yy1 2 1 14 7 4 10 8 13 15 12 9 0 3 5 6 11
S8
Permutation P
Permutation P is processed on the 32-bit output block from eight S-Boxes.
16 7 20 21 29 12 28 17
1 15 23 26 5 18 31 10
2 8 24 14 32 27 3 9
19 13 30 6 22 11 4 25
Permutation P Table
Scheme of permutation P
Final Permutation
Final Permutation is processed for each block of data after the last round of Feistel functions. It
is the inverse of the Initial Permutation.
40 8 48 16 56 24 64 32
39 7 47 15 55 23 63 31
38 6 46 14 54 22 62 30
37 5 45 13 53 21 61 29
36 4 44 12 52 20 60 28
35 3 43 11 51 19 59 27
34 2 42 10 50 18 58 26
33 1 41 9 49 17 57 25
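That the Final Permutation undoes the Initial Permutation can be checked directly from the two tables. In this sketch the helper name permute is an assumption, and bits are represented by their position labels 1..64.

```python
IP = [
    58, 50, 42, 34, 26, 18, 10, 2, 60, 52, 44, 36, 28, 20, 12, 4,
    62, 54, 46, 38, 30, 22, 14, 6, 64, 56, 48, 40, 32, 24, 16, 8,
    57, 49, 41, 33, 25, 17, 9, 1, 59, 51, 43, 35, 27, 19, 11, 3,
    61, 53, 45, 37, 29, 21, 13, 5, 63, 55, 47, 39, 31, 23, 15, 7,
]
FP = [
    40, 8, 48, 16, 56, 24, 64, 32, 39, 7, 47, 15, 55, 23, 63, 31,
    38, 6, 46, 14, 54, 22, 62, 30, 37, 5, 45, 13, 53, 21, 61, 29,
    36, 4, 44, 12, 52, 20, 60, 28, 35, 3, 43, 11, 51, 19, 59, 27,
    34, 2, 42, 10, 50, 18, 58, 26, 33, 1, 41, 9, 49, 17, 57, 25,
]

def permute(bits, table):
    """Output position i takes the input bit numbered table[i] (1-based)."""
    return [bits[index - 1] for index in table]

block = list(range(1, 65))
print(permute(permute(block, IP), FP) == block)  # True
```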
RC2
BLOCK CIPHER WITH SYMMETRIC SECRET KEY
Usage
RC2 was designed in 1987 by Ron Rivest of RSA Security, who also created several other ciphers.
The RC2 algorithm was kept secret until 1996, when it was anonymously posted to the sci.crypt
newsgroup.
RC2 is also known as ARC2. The acronym RC is understood as "Rivest Cipher" or "Ron's Code".
Algorithm
RC2 is a block cipher, and the block size is 8 bytes (64 bits). This means that the input data is
first divided into blocks of 8 bytes and then each of them is processed separately.
Each data block is treated as four words, each word has 16 bits (2 bytes). The array of four words
is presented as R[0] R[1] R[2] R[3]. Both encryption and decryption take this array as input
and modify the four words. The output is returned in the same array.
Key Expansion
Apart from the data, the RC2 cipher takes a secret user key as input. The key provided by the
user may be from one byte up to 128 bytes long. Let us denote the key size (in bytes) as Keysize.
The first operation which RC2 performs is to expand the key into 128 new key bytes,
which are used for encryption or decryption of all data bytes.
The user also provides a second value, denoted Keybit-limit, which determines the maximum
effective key size in bits. This means that no matter how many bytes of the secret key
have been provided by the user, the cipher will operate on a key whose effective size is
Keybyte-limit bytes, where:
Keybyte-limit = (Keybit-limit + 7) / 8    (integer division)
Of course, the real strength of the key will be even smaller if the user provides fewer key bytes
than Keybyte-limit.
To ensure that only that many bits are used to secure the data, the following bit mask
is defined:
Keymask = 255 mod 2^(8 + Keybit-limit - 8·Keybyte-limit)
The 8 - (8·Keybyte-limit - Keybit-limit) least significant bits of Keymask are set, the rest are zeros.
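The two formulas can be evaluated for a few sample limits. This is a minimal sketch; the function name key_limits is an assumption, and the division is taken to be integer division.

```python
def key_limits(keybit_limit):
    """Return (Keybyte-limit, Keymask) for a given Keybit-limit."""
    keybyte_limit = (keybit_limit + 7) // 8
    keymask = 255 % (2 ** (8 + keybit_limit - 8 * keybyte_limit))
    return keybyte_limit, keymask

for bits in (1, 40, 63, 64):
    limit, mask = key_limits(bits)
    print(bits, limit, format(mask, "08b"))
# 1 1 00000001
# 40 5 11111111
# 63 8 01111111
# 64 8 11111111
```

For a limit of 63 bits, for example, the mask clears the top bit of the last of the 8 key bytes, which leaves 63 effective bits.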
Key Words and Bytes
The mathematical operations that are performed on the key operate on both single bytes and
whole words. So, for convenience, the expanded key may be presented either as 64 words:
K[0], K[1], ..., K[63]
or as 128 bytes:
L[0], L[1], ..., L[127]
The & operation with Keymask, performed during key expansion, limits the effective size of the key to
just Keybit-limit bits.
Encryption
The encryption procedure takes as input four words: R[0] R[1] R[2] R[3], which form one
block of data. Every block of data will be encrypted by using the same 64 words of expanded
secret key: K[0] K[1] ... K[63].
The following encryption steps are performed on every data block:
Each Mixing Round consists of four Mixing operations, performed on four words of the data block:
1. Mixing R[0]
2. Mixing R[1]
3. Mixing R[2]
4. Mixing R[3]
Each Mashing Round consists of four Mashing operations, performed on four words of the data
block:
1. Mashing R[0]
2. Mashing R[1]
3. Mashing R[2]
4. Mashing R[3]
Decryption
The decryption procedure takes as input four ciphertext words: R[0] R[1] R[2] R[3], which
form one block of encrypted data. Every block of data will be decrypted by using the same 64
words of expanded secret key: K[0] K[1] ... K[63].
The following decryption steps are performed on every ciphertext block:
1. R-Mixing R[3]
2. R-Mixing R[2]
3. R-Mixing R[1]
4. R-Mixing R[0]
Each R-Mashing Round consists of four R-Mashing operations, performed on four words of the
encrypted data block:
1. R-Mashing R[3]
2. R-Mashing R[2]
3. R-Mashing R[1]
4. R-Mashing R[0]
Table PI (TPI)
Table PI is a fixed array of 256 elements, used during key expansion.
It is filled with a random permutation of all possible byte values from
0 to 255. The order of the numbers is based on the digits of PI: 3.1415...
d9 78 f9 c4 19 dd b5 ed 28 e9 fd 79 4a a0 d8 9d
c6 7e 37 83 2b 76 53 8e 62 4c 64 88 44 8b fb a2
17 9a 59 f5 87 b3 4f 13 61 45 6d 8d 09 81 7d 32
bd 8f 40 eb 86 b7 7b 0b f0 95 21 22 5c 6b 4e 82
54 d6 65 93 ce 60 b2 1c 73 56 c0 14 a7 8c f1 dc
12 75 ca 1f 3b be e4 d1 42 3d d4 30 a3 3c b6 26
6f bf 0e da 46 69 07 57 27 f2 1d 9b bc 94 43 03
f8 11 c7 f6 90 ef 3e e7 06 c3 d5 2f c8 66 1e d7
08 e8 ea de 80 52 ee f7 84 aa 72 ac 35 4d 6a 2a
96 1a d2 71 5a 15 49 74 4b 9f d0 5e 04 18 a4 ec
c2 e0 41 6e 0f 51 cb cc 24 91 af 50 a1 f4 70 39
99 7c 3a 85 23 b8 b4 7a fc 02 36 5b 25 55 97 31
2d 5d fa 98 e3 8a 92 ae 05 df 29 10 67 6c ba c9
d3 00 e6 cf e1 9e a8 2c 63 16 01 3f 58 e2 89 a9
0d 38 34 1b ab 33 ff b0 bb 48 0c 5f b9 b1 cd 2e
c5 f3 db 47 e5 a5 9c 77 0a a6 20 68 fe 7f c1 ad
Table PI in hexadecimal notation
where j is the counter variable and the vector s contains the following
values: s[0] = 1, s[1] = 2, s[2] = 3, s[3] = 5.
Mashing Operation
Mashing Operation modifies one data word during encryption process.
where j is the counter variable and the vector s contains the following
values: s[0] = 1, s[1] = 2, s[2] = 3, s[3] = 5.
R-Mashing Operation
R-Mashing Operation modifies one data word during decryption process.
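The four operations can be sketched as below. The formulas and the round order (5 mixing rounds, 1 mashing round, 6 mixing rounds, 1 mashing round, 5 mixing rounds) follow the public RC2 description in RFC 2268, which is an assumption about what this text intends. The key words K used in the demonstration are an arbitrary stand-in, not the output of a real key expansion.

```python
S = [1, 2, 3, 5]   # rotation amounts s[0..3]
MASK16 = 0xFFFF

def rol16(x, n):
    return ((x << n) | (x >> (16 - n))) & MASK16

def ror16(x, n):
    return ((x >> n) | (x << (16 - n))) & MASK16

def mix_round(R, K, j):
    # Negative indices wrap around, so R[i-1] means R[(i-1) mod 4].
    for i in range(4):
        R[i] = (R[i] + K[j] + (R[i-1] & R[i-2]) + (~R[i-1] & R[i-3])) & MASK16
        j += 1
        R[i] = rol16(R[i], S[i])
    return j

def mash_round(R, K):
    for i in range(4):
        R[i] = (R[i] + K[R[i-1] & 63]) & MASK16

def r_mix_round(R, K, j):
    for i in (3, 2, 1, 0):
        R[i] = ror16(R[i], S[i])
        R[i] = (R[i] - K[j] - (R[i-1] & R[i-2]) - (~R[i-1] & R[i-3])) & MASK16
        j -= 1
    return j

def r_mash_round(R, K):
    for i in (3, 2, 1, 0):
        R[i] = (R[i] - K[R[i-1] & 63]) & MASK16

def process(block, K, j, mix, mash):
    R = list(block)
    for rounds, do_mash in ((5, True), (6, True), (5, False)):
        for _ in range(rounds):
            j = mix(R, K, j)
        if do_mash:
            mash(R, K)
    return R

def encrypt_block(block, K):
    return process(block, K, 0, mix_round, mash_round)

def decrypt_block(block, K):
    return process(block, K, 63, r_mix_round, r_mash_round)

# Stand-in key words for demonstration only (hypothetical values).
K = [(i * 40503 + 12345) & MASK16 for i in range(64)]
block = [0x0123, 0x4567, 0x89AB, 0xCDEF]
print(decrypt_block(encrypt_block(block, K), K) == block)  # True
```

Each R-operation undoes its counterpart step by step, which is why decryption walks the words and the key index j in the opposite direction.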
Usage
3DES cipher was developed because DES encryption, invented in the early 1970s and protected
by a 56-bit key, turned out to be too weak and too easy to break using the computers of that time.
The effective security which 3DES provides is 112 bits, when an attacker uses meet-in-the-middle
attacks.
For several years, Triple DES was often used for electronic payments (for example, in the EMV
standard). New protocols based on the cipher are still being created and maintained (as of 2016).
It was also used in several Microsoft products (for example, in Microsoft Outlook 2007, Microsoft
OneNote, Microsoft System Center Configuration Manager 2012) for protecting user configuration
and user data.
Algorithm
Triple DES algorithm performs three iterations of a typical DES algorithm. In its strongest version,
it uses a secret key which consists of 168 bits. The key is then divided into three 56-bit keys.
3DES Encryption
1. Encryption using the first secret key
2. Decryption using the second secret key
3. Encryption using the third secret key
The encryption and decryption operations may be presented as mathematical equations.
Encryption:
c = E3(D2(E1(m)))
Decryption:
m = D1(E2(D3(c)))
3DES with shorter keys
Using the DES decryption operation in the second step of 3DES encryption provides backward
compatibility with the original DES algorithm. In this case, either the first and second secret keys or the
second and third secret keys should be identical. Their value is not important, because the two operations cancel out.
c = E3(D1(E1(m))) = E3(m)
c = E3(D3(E1(m))) = E1(m)
It is also possible to use the 3DES cipher with a secret key of size of 112 bits. In this case, the
first and third secret keys should be identical. Such an approach is stronger than simple DES
encryption used twice (with two separate 56-bit keys) because it provides better protection
against meet-in-the-middle attacks.
c = E1(D2(E1(m)))
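The encrypt-decrypt-encrypt composition can be illustrated with a toy cipher. This is only a sketch: toy_encrypt and toy_decrypt are hypothetical stand-ins with no security value and are not DES. They only provide an invertible keyed function so that the algebra of the equations above can be checked.

```python
MASK64 = (1 << 64) - 1

def rotl64(x, n):
    return ((x << n) | (x >> (64 - n))) & MASK64

def rotr64(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK64

def toy_encrypt(key, block):
    """Toy invertible keyed function on 64-bit blocks (NOT DES)."""
    return (rotl64(block ^ key, 7) + key) & MASK64

def toy_decrypt(key, block):
    """Inverse of toy_encrypt for the same key."""
    return rotr64((block - key) & MASK64, 7) ^ key

def ede_encrypt(k1, k2, k3, m):
    # c = E3(D2(E1(m)))
    return toy_encrypt(k3, toy_decrypt(k2, toy_encrypt(k1, m)))

def ede_decrypt(k1, k2, k3, c):
    # m = D1(E2(D3(c)))
    return toy_decrypt(k1, toy_encrypt(k2, toy_decrypt(k3, c)))

k1, k2, k3 = 0x0123456789ABCDEF, 0x1122334455667788, 0xFEDCBA9876543210
m = 0x0011223344556677
print(ede_decrypt(k1, k2, k3, ede_encrypt(k1, k2, k3, m)) == m)   # True
print(ede_encrypt(k1, k1, k3, m) == toy_encrypt(k3, m))           # True
print(ede_encrypt(k1, k3, k3, m) == toy_encrypt(k1, m))           # True
```

The last two lines show the backward-compatibility case: when two neighbouring keys are equal, the triple operation collapses to a single encryption.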
Block Diagram of 3DES Encryption
Each iteration of DES algorithm executes the following operations for all input data blocks:
the initial permutation, 16 iterations of Feistel functions, and the final permutation.
During key manipulation, the following operations are executed: binary rotation, PC-1
permutation, and PC-2 permutation.
Usage
AES is considered a strong and secure cipher. Over the last few years (mostly 2005-2010) several
attacks against different AES implementations have been described, but generally speaking they
concern only some special cases and are not considered a threat to the AES algorithm itself.
Algorithm
A secret key in AES, for both data encryption and decryption, may contain 128 or 192 or 256 bits.
Based on the length of the key, a different number of encrypting cycles is performed.
Encryption
During encryption, the input data (plaintext) is divided into 128-bit blocks. The blocks of data are
presented as column-major matrices of size 4 bytes × 4 bytes, called states. The following
operations are performed for all blocks:
1. Preparing Subkeys: one starting subkey is created first, and later one more subkey for
every subsequent cycle of encryption (see below).
2. Initial Round: all bytes of data block are added to corresponding bytes of the starting
subkey using XOR operation.
3. A number of encrypting cycles takes place. The number of repetitions depends on the
length of the secret key:
- 9 cycles of repetition for a 128-bit key,
- 11 cycles of repetition for a 192-bit key,
- 13 cycles of repetition for a 256-bit key.
Decryption
During decryption, the ciphertext is used as input data to the algorithm. The inverse of each
encryption operation is performed, in reverse order:
AES Performance
In order to accelerate the application, one can decide to pre-compute the functions in different
rounds and replace them by simple byte substitution based on the calculated tables.
The disadvantage of this approach is that the size of the application will be much larger. It may
increase from several to tens of kilobytes, depending on the size of the secret key that is used.
The byte substitutions are presented in a table below. In the rows, there are specified the more
significant halves of input bytes. In the columns, there are the less significant halves of input
bytes. The value of the output byte may be found inside the table, at the intersection of
the specified row and the column.
x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x 63 7c 77 7b f2 6b 6f c5 30 01 67 2b fe d7 ab 76
1x ca 82 c9 7d fa 59 47 f0 ad d4 a2 af 9c a4 72 c0
2x b7 fd 93 26 36 3f f7 cc 34 a5 e5 f1 71 d8 31 15
3x 04 c7 23 c3 18 96 05 9a 07 12 80 e2 eb 27 b2 75
4x 09 83 2c 1a 1b 6e 5a a0 52 3b d6 b3 29 e3 2f 84
5x 53 d1 00 ed 20 fc b1 5b 6a cb be 39 4a 4c 58 cf
6x d0 ef aa fb 43 4d 33 85 45 f9 02 7f 50 3c 9f a8
7x 51 a3 40 8f 92 9d 38 f5 bc b6 da 21 10 ff f3 d2
8x cd 0c 13 ec 5f 97 44 17 c4 a7 7e 3d 64 5d 19 73
9x 60 81 4f dc 22 2a 90 88 46 ee b8 14 de 5e 0b db
Ax e0 32 3a 0a 49 06 24 5c c2 d3 ac 62 91 95 e4 79
Bx e7 c8 37 6d 8d d5 4e a9 6c 56 f4 ea 65 7a ae 08
Cx ba 78 25 2e 1c a6 b4 c6 e8 dd 74 1f 4b bd 8b 8a
Dx 70 3e b5 66 48 03 f6 0e 61 35 57 b9 86 c1 1d 9e
Ex e1 f8 98 11 69 d9 8e 94 9b 1e 87 e9 ce 55 28 df
Fx 8c a1 89 0d bf e6 42 68 41 99 2d 0f b0 54 bb 16
Rijndael S-Box
For example, for an input byte 3F, the new output byte is 75.
For decryption, the Inverse Rijndael S-Boxes are used. They can be obtained from the original
Rijndael S-Boxes.
x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 xA xB xC xD xE xF
0x 52 09 6a d5 30 36 a5 38 bf 40 a3 9e 81 f3 d7 fb
1x 7c e3 39 82 9b 2f ff 87 34 8e 43 44 c4 de e9 cb
2x 54 7b 94 32 a6 c2 23 3d ee 4c 95 0b 42 fa c3 4e
3x 08 2e a1 66 28 d9 24 b2 76 5b a2 49 6d 8b d1 25
4x 72 f8 f6 64 86 68 98 16 d4 a4 5c cc 5d 65 b6 92
5x 6c 70 48 50 fd ed b9 da 5e 15 46 57 a7 8d 9d 84
6x 90 d8 ab 00 8c bc d3 0a f7 e4 58 05 b8 b3 45 06
7x d0 2c 1e 8f ca 3f 0f 02 c1 af bd 03 01 13 8a 6b
8x 3a 91 11 41 4f 67 dc ea 97 f2 cf ce f0 b4 e6 73
9x 96 ac 74 22 e7 ad 35 85 e2 f9 37 e8 1c 75 df 6e
Ax 47 f1 1a 71 1d 29 c5 89 6f b7 62 0e aa 18 be 1b
Bx fc 56 3e 4b c6 d2 79 20 9a db c0 fe 78 cd 5a f4
Cx 1f dd a8 33 88 07 c7 31 b1 12 10 59 27 80 ec 5f
Dx 60 51 7f a9 19 b5 4a 0d 2d e5 7a 9f 93 c9 9c ef
Ex a0 e0 3b 4d ae 2a f5 b0 c8 eb bb 3c 83 53 99 61
Fx 17 2b 04 7e ba 77 d6 26 e1 69 14 63 55 21 0c 7d
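Both tables can also be generated instead of stored. The sketch below assumes the standard construction from the AES specification (multiplicative inverse in Rijndael's finite field followed by an affine transformation), which the text above does not spell out. The inverse table is then obtained by swapping inputs and outputs. The function names are hypothetical.

```python
def gmul(a, b):
    """Multiply two bytes in Rijndael's finite field GF(2^8)."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B   # reduce by the Rijndael polynomial
        b >>= 1
    return product

def rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF

def build_sbox():
    sbox = []
    for x in range(256):
        # Multiplicative inverse by brute force; 0 maps to 0 by convention.
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gmul(x, y) == 1)
        s = inv
        for shift in range(1, 5):
            s ^= rotl8(inv, shift)
        sbox.append(s ^ 0x63)
    return sbox

def invert(table):
    """Swap inputs and outputs of a byte permutation."""
    inverse = [0] * 256
    for index, value in enumerate(table):
        inverse[value] = index
    return inverse

SBOX = build_sbox()
INV_SBOX = invert(SBOX)
print(format(SBOX[0x3F], "02x"))      # 75
print(format(INV_SBOX[0x75], "02x"))  # 3f
```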
Multiplication of columns
Each column of a state matrix is multiplied by a predefined matrix of size 4 bytes × 4 bytes.
The result of each multiplication is a new column which contains four new bytes.
Multiplication of the square matrix with the column c results in a new column r, with new
values:
\[
\begin{bmatrix} r_0 \\ r_1 \\ r_2 \\ r_3 \end{bmatrix}
=
\begin{bmatrix} 2 & 3 & 1 & 1 \\ 1 & 2 & 3 & 1 \\ 1 & 1 & 2 & 3 \\ 3 & 1 & 1 & 2 \end{bmatrix}
\begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{bmatrix}
\]
Decryption
During decryption, an inverted matrix is used for multiplication:
e b d 9
9 e b d
d 9 e b
b d 9 e
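A minimal sketch of the column multiplication follows. It assumes, as in the AES specification, that bytes are multiplied in Rijndael's finite field and added with XOR. The sample column db 13 53 45 is a commonly used test column, and the names gmul and mul_column are hypothetical.

```python
def gmul(a, b):
    """Multiply two bytes in Rijndael's finite field GF(2^8)."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return product

MIX = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]
INV_MIX = [
    [0x0E, 0x0B, 0x0D, 0x09],
    [0x09, 0x0E, 0x0B, 0x0D],
    [0x0D, 0x09, 0x0E, 0x0B],
    [0x0B, 0x0D, 0x09, 0x0E],
]

def mul_column(matrix, column):
    """Multiply a 4x4 byte matrix by a 4-byte column (XOR is the addition)."""
    result = []
    for row in matrix:
        acc = 0
        for coefficient, byte in zip(row, column):
            acc ^= gmul(coefficient, byte)
        result.append(acc)
    return result

column = [0xDB, 0x13, 0x53, 0x45]
mixed = mul_column(MIX, column)
print([format(b, "02x") for b in mixed])        # ['8e', '4d', 'a1', 'bc']
print(mul_column(INV_MIX, mixed) == column)     # True
```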
Rcon Operation
In each iteration of the key generation process, the first byte of the current 4-byte temporary
vector is XORed with 2 raised to the power one less than the current iteration
number. The exponentiation is performed in Rijndael's finite field.
These values can be calculated in runtime or stored in a table in the application memory:
i 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
xi 01 02 04 08 10 20 40 80 1b 36 6c d8 ab 4d 9a
Powers of x = 0x02
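Calculating the values at runtime takes only a few lines. In this sketch the names xtime and powers_of_x are assumptions. Doubling is a left shift followed, on overflow, by a reduction with the Rijndael polynomial 0x11b.

```python
def xtime(value):
    """Multiply a byte by x = 0x02 in Rijndael's finite field."""
    value <<= 1
    if value & 0x100:
        value ^= 0x11B
    return value

def powers_of_x(count):
    """Return the first count powers of 0x02, starting with x^0 = 0x01."""
    value, powers = 1, []
    for _ in range(count):
        powers.append(value)
        value = xtime(value)
    return powers

print(" ".join(format(p, "02x") for p in powers_of_x(15)))
# 01 02 04 08 10 20 40 80 1b 36 6c d8 ab 4d 9a
```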
Blowfish
BLOCK CIPHER WITH SYMMETRIC SECRET KEY
Usage
It is used in many encryption products and provides good encryption quality. No effective
cryptanalysis of this cipher is known.
Camellia
BLOCK CIPHER WITH SYMMETRIC SECRET KEY
Algorithm
Camellia is a symmetric block cipher with secret key of size of 128, 192 or 256 bits. The length
of plaintext and ciphertext blocks is always equal to 128 bits.
In the following description, the original names of variables and functions from the Camellia
documentation are used to describe its algorithm.
The most important elements of the algorithm are F-functions. They are used during encryption,
decryption and creating helper variables of the key. The F-function takes 64 input bits, mixes
them with bits of subkeys ki and returns 64 new bits. Modification of bits in the F-function is
referred to as one round of the algorithm. F-function calls are gathered in blocks. Each block
contains six rounds.
Six-round blocks (that is, blocks of six calls of the F-function) are separated by calls of
FL-functions and FL-1-functions. They manipulate 64-bit long parts of data and mix them with
subkeys kli.
Both encryption and decryption algorithms perform several repetitions of the 6-round
blocks described above. The number of repetitions depends on the length of the
secret key:
o 3 repetitions (18 rounds) for a 128-bit key
o 4 repetitions (24 rounds) for a 192-bit or 256-bit key
Key schedule
The secret key used in the Camellia cipher can consist of 128, 192 or 256 bits. In order to encrypt
data blocks, one has to create a few helper variables and then subkeys, based on the secret key
bits. Each subkey is 64 bits long.
At first, one should calculate two variables of size of 128 bits (KL and KR) and four variables of size
of 64 bits (KLL, KLR, KRL and KRR). The following equations describe connections between those
variables:
o KLL = 64 left bits of KL
o KLR = 64 right bits of KL
o KRL = 64 left bits of KR
o KRR = 64 right bits of KR
The rest of connections should be determined based on the length of the secret key K.
o for the 128-bit long key:
    o KL = K
    o KR = 0
o for the 192-bit long key:
    o KL = 128 left bits of K
    o KRL = 64 right bits of K
    o KRR = ~KRL (negation of bits)
o for the 256-bit long key:
    o KL = 128 left bits of K
    o KR = 128 right bits of K
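The rules above can be sketched with Python integers standing in for bit strings. The function name split_key is an assumption, and the 192-bit sample key is an arbitrary value chosen so that the negation is easy to read.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1

def split_key(key, bits):
    """Return (KL, KR) as 128-bit integers for a key of 128, 192 or 256 bits."""
    if bits == 128:
        kl, kr = key, 0
    elif bits == 192:
        kl = key >> 64
        krl = key & MASK64          # 64 right bits of K
        krr = ~krl & MASK64         # bitwise negation of KRL
        kr = (krl << 64) | krr
    elif bits == 256:
        kl = key >> 128
        kr = key & MASK128
    else:
        raise ValueError("key must be 128, 192 or 256 bits long")
    return kl, kr

key192 = 0x0123456789ABCDEFFEDCBA98765432100011223344556677
kl, kr = split_key(key192, 192)
print(format(kl, "032x"))  # 0123456789abcdeffedcba9876543210
print(format(kr, "032x"))  # 0011223344556677ffeeddccbbaa9988
```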
Then, it is possible to calculate two new helper variables: KA and KB, based on the previous ones.
They are both 128 bits long. KB is used only if the secret key consists of 192 or 256 bits.
While creating KA and KB one should use six constant values, which are referred to as ∑i.
At the end, based on four 128-bit long just created variables KL, KR, KA and KB, one should compute
all secret subkeys of size of 64 bits: ki, kwi and kli. Subkeys are used in all the steps during
encryption and decryption in the Camellia algorithm.
S-function
S-function is used inside the F-function. The input data of size 64 bits is substituted by
8 other bytes, which are returned for further processing.
The function uses four substitution tables. They are referred to as s-boxes.
The input data is divided into eight separate bytes x1,...,x8. x1 contains the eight leftmost
bits, x2 contains the next eight bits and x8 contains the eight rightmost bits.
Each of the s-boxes changes the eight received bits into eight other bits indicated by the table.
The four s-boxes are referred to as s1,...,s4. If y1,...,y8 are the eight subsequent output
bytes (the byte y1 contains the leftmost output bits, y8 contains the rightmost output bits), modifications
performed by the S-function can be defined as:
y1 = s1(x1)
y2 = s2(x2)
y3 = s3(x3)
y4 = s4(x4)
y5 = s2(x5)
y6 = s3(x6)
y7 = s4(x7)
y8 = s1(x8)
s-box s1 contains 256 following numbers:
112 130 44 236 179 39 192 229 228 133 87 53 234 12 174 65
139 13 154 102 251 204 176 45 116 18 43 32 240 177 132 153
223 76 203 194 52 126 118 5 109 183 169 49 209 23 4 215
254 68 207 178 195 181 122 145 36 8 232 168 96 252 105 80
170 208 160 125 161 137 98 151 84 91 30 149 224 255 100 210
16 196 0 72 163 247 117 219 138 3 230 218 9 63 221 148
135 92 131 2 205 74 144 51 115 103 246 243 157 127 191 226
233 121 167 140 159 110 188 142 41 245 249 182 47 253 180 89
120 152 6 106 231 70 113 186 212 37 171 66 136 162 141 250
64 40 211 123 187 201 67 193 21 227 173 244 119 199 128 158
S-boxes return values determined by the one-byte input number. All cells in the table are
numbered from 0 to 255, from left to right and from top to bottom. For example s1[0] is equal
to 112, s1[1] is equal to 130, and s1[255] is equal to 158.
The remaining three s-boxes can also be defined as tables of 256 numbers. Alternatively, their
substitutions can be defined as operations on s1 (<<< and >>> denote rotations of an 8-bit value):
s2(x) := (s1(x) <<< 1)
s3(x) := (s1(x) >>> 1)
s4(x) := s1(x<<<1)
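The derivation of s2, s3 and s4, and their use in the S-function, can be sketched as follows. The table stand_in_s1 is a hypothetical byte permutation used only to keep the example self-contained; it is not the real s1 table, and the function names are assumptions.

```python
def rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF

def rotr8(x, n):
    return ((x >> n) | (x << (8 - n))) & 0xFF

def derive_sboxes(s1):
    """Build s2, s3 and s4 from s1 using the three rules above."""
    s2 = [rotl8(value, 1) for value in s1]
    s3 = [rotr8(value, 1) for value in s1]
    s4 = [s1[rotl8(x, 1)] for x in range(256)]
    return s2, s3, s4

def s_function(x, s1):
    """Apply the S-function to a list of eight input bytes x1..x8."""
    s2, s3, s4 = derive_sboxes(s1)
    order = [s1, s2, s3, s4, s2, s3, s4, s1]
    return [table[byte] for table, byte in zip(order, x)]

# Stand-in permutation of 0..255 (NOT the Camellia s1 table).
stand_in_s1 = [(7 * x + 3) & 0xFF for x in range(256)]
print(s_function([0, 0, 0, 0x81, 0, 0, 0, 0], stand_in_s1))
# With the real table, s1[0] = 112 would give s2(0) = 224 and s3(0) = 56.
print(rotl8(112, 1), rotr8(112, 1))  # 224 56
```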
P-function
P-function is used inside the F-function. P-function takes input data of size of 8 bytes (which are
output bytes of the S-function), modifies them and returns the vector, which is also 8-byte long.
Output data of the P-function is also output data of the F-function.
The input data is divided into eight separate bytes x1,...,x8. The byte x1 contains leftmost eight
bits, x2 next eight bits, and x8 contains rightmost eight bits.
If y1,...,y8 are the eight subsequent output bytes (y1 contains eight leftmost bits, y2 next eight
bits, and y8 contains rightmost eight output bits), modifications performed by the P-function can
be defined as:
y1 = x1 XOR x3 XOR x4 XOR x6 XOR x7 XOR x8
y2 = x1 XOR x2 XOR x4 XOR x5 XOR x7 XOR x8
y3 = x1 XOR x2 XOR x3 XOR x5 XOR x6 XOR x8
y4 = x2 XOR x3 XOR x4 XOR x5 XOR x6 XOR x7
y5 = x1 XOR x2 XOR x6 XOR x7 XOR x8
y6 = x2 XOR x3 XOR x5 XOR x7 XOR x8
y7 = x3 XOR x4 XOR x5 XOR x6 XOR x8
y8 = x1 XOR x4 XOR x5 XOR x6 XOR x7
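The eight equations can be written as a table of input positions that are XORed together. This is a minimal sketch; the names P_TAPS and p_function are assumptions.

```python
# For each output byte y1..y8, the 1-based input bytes that are XORed together.
P_TAPS = [
    (1, 3, 4, 6, 7, 8),
    (1, 2, 4, 5, 7, 8),
    (1, 2, 3, 5, 6, 8),
    (2, 3, 4, 5, 6, 7),
    (1, 2, 6, 7, 8),
    (2, 3, 5, 7, 8),
    (3, 4, 5, 6, 8),
    (1, 4, 5, 6, 7),
]

def p_function(x):
    """Apply the P-function to a list of eight input bytes x1..x8."""
    output = []
    for taps in P_TAPS:
        acc = 0
        for tap in taps:
            acc ^= x[tap - 1]
        output.append(acc)
    return output

# A single set byte in x1 shows which outputs depend on x1.
print(p_function([1, 0, 0, 0, 0, 0, 0, 0]))  # [1, 1, 1, 0, 1, 0, 0, 1]
```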
F-function
F-function is one of the main functions. It is used during the encryption and decryption processes, and
when creating subkeys. The input data X of size 64 bits is mixed with one of
the subkeys k (also 64 bits long). The function returns a 64-bit output block Y.
Data bits are XORed with key bits and the result is modified by the two functions S and P:
(X, k) -> Y, where Y = P(S(X XOR k))
FL-function
FL-function is used during both encryption and decryption processes. It takes 64-bit long input
data and one of the subkeys, then it performs some modifications and finally it returns a block of
data, which contains also 64 bits.
An input data block is referred to as X, while Y is a 64-bit long output block. kl is one of
the subkeys created before:
(X, kl) -> Y => (XL || XR, klL || klR) -> YL || YR
At the beginning X is divided into two 32-bit long parts: XL contains 32 left bits of X,
and XR contains 32 right bits of X. Then, two new blocks (each 32-bit long) are calculated:
YR = ((XL AND klL) <<< 1) XOR XR
YL = (YR OR klR) XOR XL
YL contains 32 left output bits of the FL-function, and YR contains 32 right output bits.
FL-1-function
FL-1-function is used during both encryption and decryption processes. It takes 64-bit long input
data and one of the subkeys, then it performs some modifications and finally it returns a block of
data, which contains also 64 bits.
An input data block is referred to as Y, while X is a 64-bit long output block. kl is one of
the subkeys created before:
(Y, kl) -> X => (YL || YR, klL || klR) -> XL || XR
At the beginning Y is divided into two 32-bit long parts: YL contains 32 left bits of Y,
and YR contains 32 right bits of Y. Then, two new blocks (each 32-bit long) are calculated:
XL = (YR OR klR) XOR YL
XR = ((XL AND klL) <<< 1) XOR YR
XL contains 32 left output bits of the FL-1-function, and XR contains 32 right output bits.
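The two functions can be sketched with 64-bit integers, which also makes it easy to check that FL-1 undoes FL. The names fl and fl_inv and the sample values are assumptions.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK32

def fl(x, kl):
    """FL-function on a 64-bit block x with a 64-bit subkey kl."""
    xl, xr = x >> 32, x & MASK32
    kll, klr = kl >> 32, kl & MASK32
    yr = rotl32(xl & kll, 1) ^ xr
    yl = (yr | klr) ^ xl
    return (yl << 32) | yr

def fl_inv(y, kl):
    """FL-1-function: recovers x from y = fl(x, kl)."""
    yl, yr = y >> 32, y & MASK32
    kll, klr = kl >> 32, kl & MASK32
    xl = (yr | klr) ^ yl
    xr = rotl32(xl & kll, 1) ^ yr
    return (xl << 32) | xr

x = 0x0123456789ABCDEF
kl = 0xA5A5A5A55A5A5A5A
print(fl_inv(fl(x, kl), kl) == x)  # True
```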
The key schedule constants
When creating subkeys for the encryption and decryption processes, one should use six
predefined constants, commonly referred to as ∑i.
The values below are 64 bits long and are presented in hexadecimal.
∑1 = 0xA09E667F3BCC908B
∑2 = 0xB67AE8584CAA73B2
∑3 = 0xC6EF372FE94F82BE
∑4 = 0x54FF53A5F1D36F1C
∑5 = 0x10E527FADE682D1D
∑6 = 0xB05688C2B3E6C1FD
Creating Subkeys
Using four 128-bit long variables KL, KR, KA and KB one should calculate
subkeys ki, kwi and kli (all subkeys have 64 bits).
The table for creating subkeys for the secret key of size of 128 bits:
subkey   received value                   where used
kw1      64 left bits of KL               at the beginning
kw2      64 right bits of KL              at the beginning
k1       64 left bits of KA               F (round 1)
k2       64 right bits of KA              F (round 2)
k3       64 left bits of (KL <<< 15)      F (round 3)
k4       64 right bits of (KL <<< 15)     F (round 4)
k5       64 left bits of (KA <<< 15)      F (round 5)
k6       64 right bits of (KA <<< 15)     F (round 6)
kl1      64 left bits of (KA <<< 30)      FL
kl2      64 right bits of (KA <<< 30)     FL-1
k7       64 left bits of (KL <<< 45)      F (round 7)
k8       64 right bits of (KL <<< 45)     F (round 8)
k9       64 left bits of (KA <<< 45)      F (round 9)
k10      64 right bits of (KL <<< 60)     F (round 10)
k11      64 left bits of (KA <<< 60)      F (round 11)
k12      64 right bits of (KA <<< 60)     F (round 12)
kl3      64 left bits of (KL <<< 77)      FL
kl4      64 right bits of (KL <<< 77)     FL-1
k13      64 left bits of (KL <<< 94)      F (round 13)
k14      64 right bits of (KL <<< 94)     F (round 14)
k15      64 left bits of (KA <<< 94)      F (round 15)
k16      64 right bits of (KA <<< 94)     F (round 16)
k17      64 left bits of (KL <<< 111)     F (round 17)
k18      64 right bits of (KL <<< 111)    F (round 18)
kw3      64 left bits of (KA <<< 111)     at the end
kw4      64 right bits of (KA <<< 111)    at the end
The table for creating subkeys for the secret keys of size of 192 bits and 256 bits:
subkey   received value                   where used
kw1      64 left bits of KL               at the beginning
kw2      64 right bits of KL              at the beginning
k1       64 left bits of KB               F (round 1)
k2       64 right bits of KB              F (round 2)
k3       64 left bits of (KR <<< 15)      F (round 3)
k4       64 right bits of (KR <<< 15)     F (round 4)
k5       64 left bits of (KA <<< 15)      F (round 5)
k6       64 right bits of (KA <<< 15)     F (round 6)
kl1      64 left bits of (KR <<< 30)      FL
kl2      64 right bits of (KR <<< 30)     FL-1
k7       64 left bits of (KB <<< 30)      F (round 7)
k8       64 right bits of (KB <<< 30)     F (round 8)
k9       64 left bits of (KL <<< 45)      F (round 9)
k10      64 right bits of (KL <<< 45)     F (round 10)
k11      64 left bits of (KA <<< 45)      F (round 11)
k12      64 right bits of (KA <<< 45)     F (round 12)
kl3      64 left bits of (KL <<< 60)      FL
kl4      64 right bits of (KL <<< 60)     FL-1
k13      64 left bits of (KR <<< 60)      F (round 13)
k14      64 right bits of (KR <<< 60)     F (round 14)
k15      64 left bits of (KB <<< 60)      F (round 15)
k16      64 right bits of (KB <<< 60)     F (round 16)
k17      64 left bits of (KL <<< 77)      F (round 17)
k18      64 right bits of (KL <<< 77)     F (round 18)
kl5      64 left bits of (KA <<< 77)      FL
kl6      64 right bits of (KA <<< 77)     FL-1
k19      64 left bits of (KR <<< 94)      F (round 19)
k20      64 right bits of (KR <<< 94)     F (round 20)
k21      64 left bits of (KA <<< 94)      F (round 21)
k22      64 right bits of (KA <<< 94)     F (round 22)
k23      64 left bits of (KL <<< 111)     F (round 23)
k24      64 right bits of (KL <<< 111)    F (round 24)
kw3      64 left bits of (KB <<< 111)     at the end
kw4      64 right bits of (KB <<< 111)    at the end
Implementation
On the website of NTT Corporation you can find source codes and detailed descriptions of
the Camellia cipher:
info.isl.ntt.co.jp/crypt/eng/camellia
Serpent
BLOCK CIPHER WITH SYMMETRIC SECRET KEY
Usage
The Serpent cipher is considered to be stronger but also slower than AES. It has not been
patented and is in the public domain. Everybody can use it in his software without any limitations.
Twofish
BLOCK CIPHER WITH SYMMETRIC SECRET KEY
Asymmetric Ciphers
Asymmetric ciphers are also referred to as ciphers with public and private keys. They use two
keys, one for encryption of messages and the other one during decryption.
Definition The system of asymmetric encryption consists of three algorithms (G, E, D):
The intruder can encrypt any message using the known public key. Asymmetric ciphers are
therefore exposed to chosen-plaintext attacks, and ciphers with public key encryption must
provide security against such attacks. After two messages have been encrypted using the same public key,
the intruder must not be able to distinguish which ciphertext corresponds to which plaintext. Likewise,
an observer who analyses two messages encrypted using the same algorithm and the same
public key must not be able to tell their ciphertexts apart.
Asymmetric ciphers are much slower than symmetric ciphers (typically around a thousand times slower). It
is common practice to use public key encryption only to establish the secure connection and
negotiate the new secret key, which is then used to protect further communication by using
symmetric encryption.
Asymmetric Ciphers:
Merkle's Puzzles
KEY-AGREEMENT PROTOCOL
o without authentication
Merkle's Puzzles protocol allows any two parties to negotiate a shared secret key. The secret key
can be later used to protect their further communication.
Usage
Merkle's Puzzles is one of the first algorithms for a public-key cryptosystem. It was invented
by Ralph Merkle in 1974 and published in 1978.
Algorithm
The Merkle's Puzzles algorithm describes a communication between two parties which allows them
to create a shared secret key. Deducing the key should be impractical for a potential eavesdropper.
The key may later be used during further communication, protected by symmetric
encryption.
The protocol of the exchange of information between two parties (commonly referred to as Alice
and Bob) is presented below. To determine the shared secret key, the following steps should be
performed:
1. One of the parties (Alice) prepares a lot of messages, which consist of a unique random
id and a unique random key.
2. Then, Alice encrypts each message using a weak cipher (for example a symmetric
cipher with a secret 20-bit long key) and sends all the encrypted messages to the second
party (to Bob).
3. Bob randomly chooses one of the received encrypted messages and breaks its security
using a brute force attack.
4. Bob tells Alice the id value found in the message he selected.
5. At this point, both Alice and Bob possess the shared secret key, which was in
the message chosen and broken by Bob. They can use it during further
communication.
The main idea of Merkle's Puzzles is that a large number of messages is
encrypted using a cipher weak enough to be broken by the receiver with a brute force attack.
The secret key is identified by the message id, which is transmitted without encryption.
The attacker, who wants to capture the key, can overhear all messages. He would have to break
many messages using brute force attacks, hoping to find the message which was chosen
randomly. Only after decrypting it would he learn the secret key. Therefore,
the key used for encryption of the messages must be long enough that decrypting
a large number of messages is practically impossible.
Maths:
Encryption used in Merkle's Puzzles
The large number of messages sent at the beginning of the protocol must be encrypted by the
first person using a symmetric cipher.
To achieve this, one of the common and popular symmetric encryption algorithms is used.
The only difference lies in the intentional weakening of the cipher, by using shorter symmetric
keys. Usually, the original random key (let us say 128 bits long) is replaced by a key consisting
of 98 bits set to 0 followed by 30 random bits of the real key. The receiver must break
those 30 random bits to read the message and get the symmetric key.
Diffie–Hellman Protocol
KEY-AGREEMENT PROTOCOL
MESSAGE ENCRYPTION
o without authentication
The Diffie-Hellman Algorithm is one of the most popular key-agreement algorithms. It is used by
many protocols, including SSL/TLS. It is considered to be slightly faster than RSA, which can be
used for the same purpose.
The Diffie-Hellman Protocol may also be used for message encryption using the public key.
Usage
The algorithm was first published by the American cryptographers Whitfield Diffie and Martin
Hellman in 1976. However, it was later revealed that the scheme had been discovered even
earlier at the British intelligence agency GCHQ (by James H. Ellis, Clifford Cocks, and
Malcolm J. Williamson) but had remained undisclosed.
Algorithm
The Diffie-Hellman algorithm describes a way of communication between two parties which
allows them to create a shared secret key. The key may later be used during further
communication, protected by symmetric encryption. A potential eavesdropper cannot feasibly
deduce the key.
The algorithm relies on the difficulty of computing discrete logarithms in certain groups.
The protocol of the exchange of information between two parties (commonly referred to as Alice
and Bob) is presented below. To determine the shared secret key using the Diffie-Hellman
protocol, the following steps should be performed:
1. Alice and Bob agree on a common prime number p and a generating element g.
2. Alice picks a random and secret natural number a and then
sends A = g^a mod p to Bob.
3. Bob picks a random and secret natural number b and then
sends B = g^b mod p to Alice.
4. Alice computes a number k = B^a mod p.
5. Bob computes the same number k = A^b mod p.
6. At this point, both Alice and Bob possess the secret number k.
The number k computed by both parties is very large: the numbers a and b used in the algorithm
have at least 100 digits and the prime number p has at least 300 digits. The value k can be
converted into a symmetric key using a hash function. The symmetric key is then used for
the encryption of further communication.
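The exchange can be sketched in a few lines of Python. The parameters are assumptions chosen only for illustration: p is the Mersenne prime 2^127 - 1 (much smaller than the sizes mentioned above) and g = 3 is an arbitrary base that has not been checked to be a generator.

```python
import hashlib
import secrets

p = 2**127 - 1   # a known prime, far too small for real use
g = 3            # assumed base, for illustration only

# Alice's side
a = secrets.randbelow(p - 3) + 2      # secret exponent
A = pow(g, a, p)                      # sent to Bob in the clear

# Bob's side
b = secrets.randbelow(p - 3) + 2      # secret exponent
B = pow(g, b, p)                      # sent to Alice in the clear

k_alice = pow(B, a, p)                # (g^b)^a mod p
k_bob = pow(A, b, p)                  # (g^a)^b mod p

# Turn the shared number into a fixed-length symmetric key with a hash function.
symmetric_key = hashlib.sha256(k_alice.to_bytes(16, "big")).digest()
```

Both sides arrive at the same k because (g^a)^b and (g^b)^a are equal modulo p, while only A and B ever travel over the network.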
When the protocol is used for message encryption, Bob can send a message to Alice, encrypting
it with Alice's public key, which contains three numbers: g, p, g^a mod p. To send a secret
message to Alice, Bob chooses a random number b, and then sends the unencrypted
number g^b mod p to Alice. Eventually, the message itself can be sent, encrypted with
a symmetric key (g^a)^b mod p.
Only Alice, who knows a, will be able to compute the same key (g^b)^a mod p and decrypt
the message. Using a public key that is known in advance to belong to Alice protects the sides
from man-in-the-middle attacks.
The algorithm thus allows asymmetric encryption of data. At present, however, RSA is a much
more popular algorithm for encryption using public and private keys.
Maths:
Powers modulo a prime number
Raising a number to a power modulo a prime number is one of the most important operations
in modern cryptography. It follows the general rules of modular arithmetic. Both the base and
the exponent are positive integers. The result is also a natural
number.
The operation of raising a number to a power modulo a prime number may be presented as
a sequence of operations which consist of repeated squaring, with every intermediate result
reduced modulo the prime number.
For example, raising 4 to the power of 9 modulo 7 can be calculated in the following way:
4^9 mod 7:
4^2 mod 7 = 16 mod 7 = 2
4^4 mod 7 = 2^2 mod 7 = 4 mod 7 = 4
4^8 mod 7 = 4^2 mod 7 = 16 mod 7 = 2
4^9 mod 7 = (4^1 mod 7 * 4^8 mod 7) mod 7 = 4 * 2 mod 7 = 1
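The same square-and-multiply procedure can be written as a short function. The name power_mod is hypothetical. Python's built-in pow(base, exponent, modulus) performs the same computation and is what one would normally use.

```python
def power_mod(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent mod modulus by repeated squaring."""
    result = 1 % modulus
    square = base % modulus          # holds base^(2^i) mod modulus
    while exponent > 0:
        if exponent & 1:             # this bit of the exponent is set
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1
    return result


print(power_mod(4, 9, 7))   # prints 1, matching the worked example
```

Since 9 is 1001 in binary, the function multiplies together 4^1 and 4^8, exactly as in the last line of the example.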
RSA
MESSAGE ENCRYPTION OR AUTHENTICATION
Algorithm
The RSA algorithm makes it possible to create a pair of keys: a public key and a private key.
Everyone can receive the public key and use it to encrypt a message. Only the owner of
the private key would be able to decrypt that message.
Similarly, the owner of the private key can encrypt some data by using it, thus allowing everyone
else to use the corresponding public key to decrypt the data.
RSA security is based on the practical difficulty of factoring the product of two large prime
numbers (the so-called factoring problem).
Key Generation
Both RSA keys are generated using the following algorithm:
1. Choose two different prime numbers, usually they are denoted by p and q. The numbers
should be chosen at random and they should be of similar bit-length.
2. Calculate: n = p·q
The number n is used as the modulus for both private and public keys. Its length is
the length of the RSA key.
3. Calculate a value of Euler's totient function for n:
φ(n) = φ(p)·φ(q) = (p − 1)·(q − 1)
4. Choose an integer e that is larger than 1 and smaller than previously computed
value φ(n). The numbers e and φ(n) should be coprime. The number e is used as the
public key exponent.
5. Compute a number d such that: d·e ≡ 1 (mod φ(n))
The number d is used as the private key exponent.
The public key consists of the modulus n and the public exponent e. The private key consists of
the modulus n and the private exponent d. The modulus n is public, but the exponent d must be
kept secret, together with three other numbers: p, q and φ(n), which can be used to compute d.
A lot of users can use the same value of e. Its length should be relatively short, because the time
complexity of encryption depends significantly on the number of bits of e. The prime
number 2^16 + 1 (thus 65537) is often used as the value of e. One can also use much smaller
numbers (for example 3) but they are considered to be less secure in some circumstances.
Each user should possess their own number n (which is computed from the two prime numbers).
Encryption
During encryption one should use a public key (n, e). All messages should be divided into
a number of parts. Then, each part should be converted to a number (that must be larger
than 0 and smaller than n). In practice, the message should be divided into fragments of the size
of a certain number of bits.
Then, every number of the message is raised modulo n to the power e:
c_i = m_i^e (mod n)
RSA can be used multiple times (with different keys) to encrypt a message. The received
ciphertext can be decrypted in any order. The result is always the same. It does not matter in
which order the operations have been performed. However, one shouldn't encrypt a message in
this way more than twice, because of attacks based on the Chinese remainder theorem.
Encryption can be performed by using a private key as well. The procedure is the same, as
described above, but the private key (n, d) should be used instead. The receiver will have to use
the public key to decrypt the message.
Decryption
During decryption one should use a private key (n, d).
The received ciphertext consists of numbers, which are smaller than n. Each ciphertext number
ought to be raised modulo n to the power d:
m_i = c_i^d (mod n)
The received plaintext numbers should be combined in the correct order into the original plaintext
message.
If the message was encrypted by a private key, decryption should be performed by using
the corresponding public key. The procedure is the same as the one presented above, but for
decryption the public key (n, e) should be used instead.
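Key generation, encryption and decryption can be tried out with very small numbers. The sketch below uses the toy primes 61 and 53 and the exponent 17, which are assumptions picked for readability. Real keys use primes hundreds of digits long and a padding scheme. The helper name apply_key is hypothetical.

```python
from math import gcd

# Key generation with toy primes.
p, q = 61, 53
n = p * q                      # modulus, 3233
phi = (p - 1) * (q - 1)        # Euler's totient, 3120
e = 17                         # public exponent, coprime with phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

public_key, private_key = (n, e), (n, d)


def apply_key(number: int, key: tuple) -> int:
    """Raise `number` to the key's exponent modulo the key's modulus."""
    modulus, exponent = key
    return pow(number, exponent, modulus)


message = [65, 1234, 7]        # message already split into numbers smaller than n
ciphertext = [apply_key(m, public_key) for m in message]
decrypted = [apply_key(c, private_key) for c in ciphertext]
```

The same function serves for both directions, which mirrors the text: encryption and decryption differ only in which exponent is used.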
Message Authentication
RSA can be used to sign messages. A sender should produce a hash value of the message
content and then raise it to the power of d (modulo n). Therefore, he should perform the same
operations as during ordinary encryption procedure. The encoded hash value should be attached
to the message.
The recipient of the message can raise the received encrypted hash value to the power
of e (modulo n) and compare the result with a hash value calculated by him. If both values are
the same, then the recipient is assured that the message hasn't been changed.
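A minimal sketch of signing and verification, assuming the toy key n = 3233, e = 17, d = 2753 and SHA-256 as the hash function. Reducing the hash modulo such a tiny n is done only to make the example run. Real signatures use a full-size modulus and a padding scheme. The function names are hypothetical.

```python
import hashlib

n, e, d = 3233, 17, 2753      # toy RSA key, for illustration only


def digest_as_number(message: bytes) -> int:
    """Hash the message and squeeze the result below n (toy simplification)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    # The sender raises the hash value to the power d modulo n.
    return pow(digest_as_number(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    # The recipient raises the signature to the power e and compares hashes.
    return pow(signature, e, n) == digest_as_number(message)


text = b"pay 10 to Bob"
signature = sign(text)
print(verify(text, signature))                # True
print(verify(text, (signature + 1) % n))      # False: the signature was altered
```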
Security of RSA
If one used a small exponent e (for example 3) to encrypt a small value m (smaller than n^(1/e)),
then m^e would be smaller than the modulus n, so no modular reduction would take place. In such
a case the value of m can be determined from the ciphertext by computing an ordinary e-th root,
which is fast and effective.
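The weakness can be demonstrated with plain integer arithmetic. In this sketch the modulus is the product of two known Mersenne primes and e = 5 is used, because for this particular toy modulus 3 is not coprime with φ(n). These values and the helper integer_root are assumptions chosen for the example.

```python
from math import gcd


def integer_root(value: int, degree: int) -> int:
    """Largest r with r**degree <= value, found by binary search."""
    low, high = 0, 1
    while high ** degree <= value:
        high *= 2
    while low + 1 < high:
        mid = (low + high) // 2
        if mid ** degree <= value:
            low = mid
        else:
            high = mid
    return low


p, q = 2**127 - 1, 2**107 - 1          # two known primes, toy key size
n = p * q
e = 5
assert gcd(e, (p - 1) * (q - 1)) == 1  # e is a valid public exponent here

m = int.from_bytes(b"hello", "big")    # short, unpadded message
c = pow(m, e, n)                       # textbook RSA encryption

# m**e is smaller than n, so the ciphertext is just m**e and the private
# key is not needed to undo it.
recovered = integer_root(c, e).to_bytes(5, "big")
```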
To protect against the use of the algorithm for encrypting plaintext numbers that are too small,
one should add random padding, which increases the number values. Also, thanks to random
padding, the same plaintext numbers are encoded by various ciphertext numbers. There are
a lot of popular padding schemes, for example OAEP, PKCS#1 or RSA-PSS.
The basic RSA algorithm (without padding) is deterministic, thus the cipher is vulnerable to
chosen-plaintext attacks. It is possible to encrypt a lot of messages using a known public key.
Therefore, an attacker can guess the content of captured encrypted messages by comparing them
with the messages he has created himself.
Another feature of this cipher is that a ciphertext of the product of two plaintext numbers is
the same as the product of ciphertexts that correspond to those plaintexts.
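This multiplicative property is easy to check numerically. The sketch assumes the toy public key n = 3233, e = 17 and two arbitrary plaintext numbers.

```python
n, e = 3233, 17               # toy public key, for illustration only


def encrypt(m: int) -> int:
    return pow(m, e, n)


m1, m2 = 42, 55
product_of_ciphertexts = (encrypt(m1) * encrypt(m2)) % n
ciphertext_of_product = encrypt((m1 * m2) % n)

print(product_of_ciphertexts == ciphertext_of_product)   # True
```

The equality holds because (m1^e)·(m2^e) = (m1·m2)^e, and reduction modulo n preserves products.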
For example, raising 4 to the power of 9 modulo 7 can be calculated in the following way:
4^9 mod 7:
4^2 mod 7 = 16 mod 7 = 2
4^4 mod 7 = 2^2 mod 7 = 4 mod 7 = 4
4^8 mod 7 = 4^2 mod 7 = 16 mod 7 = 2
4^9 mod 7 = (4^1 mod 7 * 4^8 mod 7) mod 7 = 4 * 2 mod 7 = 1
As opposed to ordinary exponentiation, exponentiation in modular arithmetic is difficult to
reverse: there are no known efficient algorithms to perform the reverse operation.
This gives the attacker far more opportunities to break the cipher than just
performing ciphertext-only attacks. However, he is not able to actively provide customized data or
secret keys which would be processed by the cipher.
Known-Plaintext Attack Efficiency
Known-plaintext attacks are most effective when they are used against the simplest kinds of
ciphers. For example, applying them against simple substitution ciphers allows the attacker to
break them almost immediately.
Known-plaintext attacks were commonly used for attacking the ciphers used during the Second
World War. The most notable example would perhaps be the attempts made by the British while
attacking German Enigma ciphers. British intelligence targeted some common phrases
appearing frequently in encrypted German messages, like weather forecasts or geographical
names.
The simple XOR cipher, used in the early days of computers, can also be broken easily by
knowing only some parts of the plaintext and the corresponding encrypted messages.
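A short sketch of such an attack on a repeating-key XOR cipher. The key, the message and the assumption that the attacker knows the first eight plaintext bytes and the key length are all invented for the example.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Encrypt or decrypt by XOR-ing the data with a repeating key."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


secret_key = b"K3y!"
plaintext = b"WEATHER REPORT: clear skies over the channel"
ciphertext = xor_cipher(plaintext, secret_key)

# The attacker knows (or guesses) how the message begins.
known_prefix = b"WEATHER "
leaked_keystream = bytes(c ^ p for c, p in zip(ciphertext, known_prefix))

# Assuming a key length of 4, the first four keystream bytes are the key.
recovered_key = leaked_keystream[:4]
recovered_text = xor_cipher(ciphertext, recovered_key)
```

XOR-ing a ciphertext byte with the matching plaintext byte yields the key byte directly, so a few known bytes are enough to reveal the whole repeating key.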
Modern ciphers are generally resistant to purely known-plaintext attacks. One of the
unfortunate exceptions was the old encryption method used in the PKZIP application. Having just
one copy of an encrypted file, together with its original version, it was possible to completely
recover the secret key.
In most cases however, the attacker should use more sophisticated types of cryptographic attacks
in order to break a well-designed modern cipher.
Date: 2020-03-09
Chosen-Plaintext Attack
During the chosen-plaintext attack, a cryptanalyst can choose arbitrary plaintext data to be
encrypted and then he receives the corresponding ciphertext. He tries to acquire the secret
encryption key or alternatively to create an algorithm which would allow him to decrypt any
ciphertext messages encrypted using this key (but without actually knowing the secret key).
This is a rather comfortable situation for the attacker. He can obtain more information about
the secret key and about the whole attacked system, because he is able to choose any text to be
processed by the cipher. He can analyse the system behaviour and output ciphertext, based on
any kind of input data.
When attacking deterministic public-key ciphers, the intruder can easily create
a database of ciphertexts of popular messages, for example popular queries to the server. After
that he will be able to find the meaning of many intercepted encrypted messages by simply
comparing them with his own database entries.
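The database idea can be sketched with textbook (unpadded) RSA. The public key below is built from two known Mersenne primes and the candidate queries are invented. Both are assumptions made only for the example.

```python
n = (2**127 - 1) * (2**107 - 1)   # toy public modulus
e = 65537                         # public exponent


def encrypt(message: bytes) -> int:
    """Deterministic textbook RSA: the same input always gives the same output."""
    return pow(int.from_bytes(message, "big"), e, n)


# The attacker encrypts likely messages himself, using only the public key.
candidates = [b"GET /index", b"GET /login", b"GET /admin"]
database = {encrypt(text): text for text in candidates}

# Later he intercepts a ciphertext produced by a legitimate user.
intercepted = encrypt(b"GET /login")
guess = database.get(intercepted)     # a simple lookup reveals the plaintext
```

Random padding defeats this lookup, because the same plaintext then encrypts to a different ciphertext each time.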
The best-known chosen-plaintext attacks were performed by the Allied cryptanalysts during
World War II against the German Enigma ciphers.
Adaptive-Chosen-Plaintext Attack
In this kind of chosen-plaintext attack, the intruder has the capability to choose plaintext for
encryption many times. Instead of using one big block of text, he can choose a smaller one,
receive its ciphertext and then, based on the answer, choose another one, and so on.
This allows him to investigate the attacked system in much more detail.
Ciphertext-Only (Known Ciphertext) Attack
During ciphertext-only attacks, the attacker has access only to a number of encrypted messages.
He has no idea what the plaintext data or the secret key may be. The goal is to recover as many
plaintext messages as possible or (preferably) to guess the secret key. After discovering the
encryption key, it will be possible to break all the other messages which have been encrypted with
this key.
There are a few techniques which proved to be very effective even when targeting modern ciphers
and which are based only on the knowledge of the ciphertext messages. The most important
methods are:
o Attack on Two-Time Pad
o Frequency Analysis
Chosen-Ciphertext Attack
During the chosen-ciphertext attack, a cryptanalyst can analyse any chosen ciphertexts together
with their corresponding plaintexts. His goal is to acquire the secret key or to get as much
information about the attacked system as possible.
The attacker has the capability to make the victim (who obviously knows the secret key) decrypt
any ciphertext and send him back the result. By analysing the chosen ciphertext and the
corresponding received plaintext, the intruder tries to guess the secret key which has been used
by the victim.
Chosen-ciphertext attacks are usually used for breaking systems with public key encryption. For
example, early versions of the RSA cipher were vulnerable to such attacks. They are used less
often for attacking systems protected by symmetric ciphers. Some self-synchronizing stream
ciphers have been also attacked successfully in that way.
Adaptive-Chosen-Ciphertext Attack
The adaptive-chosen-ciphertext attack is a kind of chosen-ciphertext attack, during which
an attacker can make the attacked system decrypt many different ciphertexts, one after another.
Each new ciphertext is created based on the responses (plaintexts) received previously.
There exist rather few practical adaptive-chosen-ciphertext attacks. This model is rather used for
analysing the security of a given system. Proving that this attack doesn't break the security
confirms that any realistic chosen-ciphertext attack will not succeed.
Chosen-Key Attack
Chosen-key attacks are a bit different than other kinds of cryptographic attacks. Usually, they are
intended to not just break a cipher but to break the larger system which relies on that cipher.
The attacker should have some knowledge regarding the relationship between various keys that
can be used in the cipher. Usually, he knows exactly what keys have been used or he himself can
choose the secret key.
Cryptographic Attacks:
Brute-Force Attack
During the brute-force attack, the intruder tries all possible keys (or passwords), and checks which
one of them returns the correct plaintext. A brute-force attack is also called an exhaustive key
search.
The amount of time necessary to break a cipher is proportional to the number of possible keys,
so it grows exponentially with the size of the secret key. The maximum number of attempts is
equal to 2^(key size), where key size is the number of bits in the key. Nowadays, it is possible
to break a cipher with a key of around 60 bits by using a brute-force attack in less than
one day.
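The growth of the search space can be illustrated with a few lines of arithmetic. The testing rate of 10^13 keys per second is a purely hypothetical figure, used only to turn key counts into durations.

```python
SECONDS_PER_DAY = 86_400
RATE = 1e13   # hypothetical number of keys tested per second


def keyspace(key_bits: int) -> int:
    """Maximum number of attempts for a key of the given length."""
    return 2 ** key_bits


def worst_case_days(key_bits: int, keys_per_second: float = RATE) -> float:
    return keyspace(key_bits) / keys_per_second / SECONDS_PER_DAY


for bits in (40, 56, 60, 64, 128):
    print(f"{bits:3d} bits: {keyspace(bits):.3e} keys, "
          f"{worst_case_days(bits):.3e} days in the worst case")
```

Every additional key bit doubles the work, which is why a 128-bit key stays out of reach even at rates where a 60-bit key falls in about a day.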
Using brute-force attacks may be beneficial against all ciphers in which the number of all possible
keys values is smaller than the number of all possible different messages. Therefore, all ciphers
may be targeted, with the exception of ciphers providing perfect security.
For breaking ciphers using brute-force attacks, very fast specially designed supercomputers are
often used. They are owned by big research laboratories or government agencies, and they
contain tens or hundreds of processors. Alternatively, large networks of thousands of regular
computers working together may be used to break the same cipher. Cryptographic brute-force
attacks are very scalable processes.
Dictionary Attack
Dictionary attacks are a kind of brute-force attack, in which the intruder attempts to guess
a password by trying existing words or popular expressions.
Such an approach reduces significantly the number of possible passwords that have to be tested.
On the other hand, users often choose (or are required) to add some additional characters, like
numbers, to their passwords, thus making the passwords impossible to be found in dictionaries.
The applications that perform dictionary attacks often perform some common modifications of
tested words, for example they may append current years.
To prevent such attacks, administrators can ban using some popular and too predictable
passwords.
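A minimal sketch of such a ban, written from the administrator's side. It undoes the most common modification (appended digits and punctuation) before checking a small list. The list contents and the function name is_predictable are hypothetical.

```python
import re

# Hypothetical list of words that users are not allowed to build passwords on.
BANNED_WORDS = {"password", "qwerty", "letmein", "dragon"}


def is_predictable(password: str, banned=BANNED_WORDS) -> bool:
    """Return True if the password is a banned word plus trailing digits or symbols."""
    core = re.sub(r"[\d\W_]+$", "", password.lower())
    return core in banned


print(is_predictable("Password2024!"))          # True
print(is_predictable("qwerty"))                 # True
print(is_predictable("correct horse battery"))  # False
```

The same normalisation is what dictionary-attack tools apply in reverse when they append years to tested words.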
Denial-of-Service Attack
A Denial-of-Service attack (DoS attack) is an attack where an attacker attempts to disrupt the
services provided by a host, by not allowing its intended users to access the host from the Internet.
If the attack succeeds, the targeted computer will become unresponsive and nobody will be able
to connect with it.
DoS Techniques
There are a lot of methods that can be used to disable a server.
Reducing Performance
The most popular techniques are based on flooding the attacked system with thousands of fake
messages, thus forcing it to deal with them and making it unable to react to genuine requests
from the real users or clients.
By preparing the messages carefully and targeting the correct parts of the system, it is possible
to prepare such requests that would cause most difficulties to the victim's computer. Their
processing should be as time consuming as possible, and the maximum of server's computational
power should be used up.
Exhausting Resources
Instead of just sending random messages, the attacks may be designed to use up all available
host's resources of some particular type. For example, the attacker may prepare the messages
that would lead to allocating all host's network connections, thus making it unable to accept any
other network requests.
An example of this attack type is a SYN flood. During this attack a victim's computer receives
thousands of fake TCP SYN packets, which force it to open a separate TCP connection for each
of them.
Similarly, the messages may be designed in a way that will cause the server to fill up the whole
available memory or disk space (for example, with log messages or core dump files).
Crashing
Finally, other methods of DoS attacks are supposed to completely crash the attacked host, by
using some known vulnerabilities of its software.
This may be achieved for example by sending malformed messages which cause troubles for the
handlers on the server side. A lot of operating systems were vulnerable to the attacks of this type,
that were targeting the Internet Layer (therefore, they were dealing with IP addresses).
Targeting Layers
By choosing the way of constructing the messages, the attacker can target different network
layers of the attacked system. Usually attacks are performed against the application layer or
the functionalities that handle the popular lower-level protocols TCP and UDP. The complexity of
high-level algorithms allows the intruders to construct a lot of complicated messages, targeting
various vulnerabilities of the attacked systems.
More sophisticated attacks target lower layers of the TCP/IP network model. For example, there
exist a lot of tools working similarly to popular ping programs. They create large numbers of IP
packets, which are supposed to flood the network and reduce the network's bandwidth. The
packets may be either valid (in this case we could call the attack ICMP flooding) or invalid
(which may lead to the so-called Nuke attacks).
A popular lower layer attack is called a ping of death. This is basically a malformed ping packet,
which may lead to a system crash on unprepared systems.
Attacker's Goal
Disabling the attacked computer may be a goal by itself. This is often the case in various political
attacks, when intruders want mainly to manifest their slogans. Also, it is often enough to disable
a targeted system or even just to pose the threat of doing that if the attackers want only to demand
a ransom for stopping the attack.
Other DoS attacks are more sophisticated. In such situations, disabling online services is just the
first step, and the attack will be continued in order to exploit the vulnerability of the system. Quite
often, after removing the original server, the attacker creates its own identical service which is
supposed to imitate the original one. Having a fake copy of the attacked system, the attackers
may take advantage of its users, and use the controlled fake server to steal their data.
The most dangerous DoS attacks are perhaps the attacks which result in damaging the actual
hardware. They are called permanent denial-of-service attacks or phlashing. A well-designed
attack may disable the components of the targeted system which are crucial for the actual
mechanical devices, thus breaking them and forcing the administrators to reinstall or even replace
damaged hardware.
Thanks to the usage of thousands of computers, the number of generated messages, that have
to be handled by the attacked system, is really huge. Nowadays, the largest DDoS attacks can
generate as many as terabits of data per second.
Degradation-of-Service
Degradation-of-service attacks are similar to denial-of-service attacks but they are intended not
to completely block the server but rather to disturb it and reduce its performance. The number of
messages sent is much smaller, and the server should be able to cope with the increased
traffic.
Therefore, these attacks are not as dangerous as normal DoS attacks. They are intended to
reduce the performance of the attacked host, discouraging its clients, and to force the
administrators to take additional actions to improve the server's performance. All that results in
increased costs and financial damage to the owners of the attacked system.
Well-performed degradation-of-service attacks are designed in a way which makes it unclear
to the administrators whether an attack is taking place at all or if they just face increased traffic.
Slowloris Attacks
During Slowloris attacks, the attacker sends requests slowly but in large numbers. The
targeted system has to keep all the connections open, because they are perfectly valid
and therefore there is no reason to discard them.
This will result in using up the whole available pool of network connections on the server side.
These attacks require less sophisticated hardware to be used by the intruders, and make both
detection of and protection against them more difficult.
Of course, the attacker creates hundreds or thousands of such connections, depleting the
resources of the targeted system.
Shrew Attack
The shrew attack is another example of a DoS attack which is based on sending messages to the
attacked system at a slow rate. It targets the TCP protocol, so it operates on a lower level than
the HTTP Post DoS attack, described above.
An attacker sends the messages at a carefully chosen rate, exploiting the TCP retransmission
mechanism. The TCP connections are not allowed to be closed, due to the ongoing
communication, and soon the whole TCP traffic may be disrupted.
Zombie Computers
DoS attacks are often performed indirectly. It means that the messages are not sent from the
intruder's computer but from other machines, which are controlled by the attacker. Those
machines are called zombies because their genuine users don't have the full control over them.
Quite often, zombie computers are infected by specialized malware. After receiving the order from
the attackers, the hidden applications will start sending packets to the targeted system. The users
working on those machines won't often be aware of their participation in the attack.
Using zombie computers has at least two advantages. Firstly, it makes it possible to create large
networks of computers, which will attempt to break the target system. The computational power of
many (hundreds or thousands of) connected zombies working together is much larger than the power
of any other network that could be built by the intruders.
Secondly, similarly as in the case of any other cryptographic attack, an additional separation
between the attacked system and the attackers always increases the difficulty of any
countermeasures that can be taken by the system administrators. For example, blocking the IP
addresses of zombie computers located all over the world and belonging to different operators is
much more difficult than just blocking the access from one organisation or location.
Tools
There are a lot of tools and applications available on the Internet that can perform various types
of DoS attacks. In fact, the underground market offers a variety of products, with different features
and prices. One could name programs like HOIC or MyDoom.
One could also mention two tools for DDoS attacks which were created in the UK by GCHQ. They are
called Predators Face and Rolling Thunder.
There are also tools which can be used for Slowloris attacks, like PyLoris, QSlowloris, and Goloris.
Man-in-the-Middle Attack
During the man-in-the-middle attack, the hidden intruder joins the communication and intercepts
all messages.
First, the attacker creates two secret keys. Then, he uses the first key to start the communication
with the first side. The received answer is encrypted but the intruder can decrypt it easily, as he
knows the key. He encrypts the message again, this time with the second key. The encrypted
message is then sent to the second side. Then, after receiving the answer from the second
side, he decrypts the message, reads it, encrypts it with the first key and sends it back to the
first side.
In this way, the whole communication passes through the attacker. He can obtain a lot
of information about the whole system and even successfully impersonate authorized persons
and gain access to hidden data.
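The relay described above can be sketched under the assumption that both sides agree on keys with unauthenticated Diffie-Hellman. The prime 2^127 - 1, the base 3, the toy XOR cipher and the name Mallory for the attacker are all choices made for this example.

```python
import hashlib
import secrets

p = 2**127 - 1   # small known prime, for illustration only
g = 3            # assumed base


def keypair():
    secret = secrets.randbelow(p - 3) + 2
    return secret, pow(g, secret, p)


def xor_crypt(message: bytes, shared: int) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 keystream (messages up to 32 bytes)."""
    stream = hashlib.sha256(shared.to_bytes(16, "big")).digest()
    return bytes(m ^ s for m, s in zip(message, stream))


a, A = keypair()       # Alice
b, B = keypair()       # Bob
m1, M1 = keypair()     # Mallory's first key, used towards Alice
m2, M2 = keypair()     # Mallory's second key, used towards Bob

# Mallory replaces the public values in transit: Alice receives M1, Bob receives M2.
key_alice = pow(M1, a, p)
key_bob = pow(M2, b, p)
key_mallory_alice = pow(A, m1, p)
key_mallory_bob = pow(B, m2, p)

message = b"transfer 100 to Bob"
from_alice = xor_crypt(message, key_alice)
read_by_mallory = xor_crypt(from_alice, key_mallory_alice)   # decrypt and read
to_bob = xor_crypt(read_by_mallory, key_mallory_bob)         # re-encrypt
received_by_bob = xor_crypt(to_bob, key_bob)
```

Neither Alice nor Bob notices anything unusual, because each of them holds a key that works correctly, only it is shared with the attacker rather than with the other side.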
To defend against this attack, a strong mutual authentication method must be used before starting
transmission of secret data. The other way of protection is to use known public keys, which can
be obtained, for example, from trusted databases, instead of using any encryption key obtained
from one of the sides of the communication (so in this case, from the attacker).
This attack is often used for eavesdropping on the communication with Wi-Fi access points or with
base stations in GSM networks. As an example, you can refer to the KRACK attack against
WPA2.
Venona Project
During and after the Second World War, hundreds of cryptanalysts of intelligence agencies of
the United States and the United Kingdom were collaborating against intelligence agencies of
the Soviet Union. All the messages sent by Soviet spies and diplomats were constantly stored
and analysed. The most important messages were encrypted with a One-Time Pad system.
Many secret Soviet messages were revealed due to a serious blunder on the part of the Soviets.
Because of shortages of code books, the operators reused some parts of the secret OTP keys
for encryption of multiple messages. Every page of the code book should have been used exactly
once, and then it should have been destroyed.
This mistake broke the security of the One-Time Pad cipher. It allowed the Allies to decrypt many
secret messages and gain an advantage over their communist opponents.
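Why reusing a pad is fatal can be shown with a few XOR operations. The two messages and the guessed fragment are invented for the example.

```python
import secrets


def xor(left: bytes, right: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(left, right))


pad = secrets.token_bytes(32)                 # one page of the code book
first = b"MEET AGENT AT THE STATION"
second = b"SEND FUNDS BY NEXT FRIDAY"

# The same pad is wrongly used for both messages.
c1 = xor(first, pad)
c2 = xor(second, pad)

# XOR-ing the two ciphertexts cancels the pad completely.
mixed = xor(c1, c2)
assert mixed == xor(first, second)

# Guessing a fragment of one message reveals the same part of the other one.
guess = b"MEET AGENT"
revealed = xor(mixed, guess)                  # b"SEND FUNDS"
```

The attacker never learns the pad itself. He only needs plausible guesses about one message to peel plaintext off the other.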
MS-PPTP
PPTP (Point to Point Tunnelling Protocol) is one of the communication protocols which make it
possible to create virtual private networks (VPN) using tunnelling. The implementation of this
protocol created by Microsoft was one of the most popular (used in Windows 98 and Windows NT),
and also one of the most faulty. MS-PPTP has been considered cryptographically broken by
Microsoft since 2012, and it is no longer recommended.
One of the MS-PPTP weaknesses is the lack of proper synchronisation between the client and
the server. They use the same secret key (usually created from the user's password) in exactly
the same way for sending their messages. Neither party creates a unique keystream by
mixing in some unique numbers.
Encryption in the MS-PPTP protocol
The client groups his messages together, and then encrypts them by using the shared secret key.
In the meantime, the same operations are performed by the server. It also groups the messages,
encrypts them using the same shared secret key, and sends them to the client.
Because the secret key bytes used are the same, the attacker may eavesdrop on messages from
the client and from the server which are encoded by using exactly the same keystream bytes.
Having such data, the attacker has a good chance of breaking the cipher and recovering the
original data.
802.11 WEP
802.11 is a group of IEEE (Institute of Electrical and Electronics Engineers) standards of wireless
network protocols. In their older versions, it was recommended to use a WEP (Wired Equivalent
Privacy) standard (created in 1997) for encryption of wireless transmission.
WEP encryption
The messages exchanged between the client and the host are encrypted using the RC4 symmetric
stream cipher. Both sides use the same 5-byte secret key for generating the keystream, so each
side generates the same keystream. To ensure that every message is encrypted using different
bytes, they add three additional bytes of an initialization vector (IV) to every key sequence.
The IV is attached unencrypted to each encrypted message. This allows the receiver to decrypt
all messages.
However, because the IV has only 24 bits, its values begin to repeat after a relatively short
time. This happens after around 16 million frames, so (if network traffic is high) after
around 5 hours. Moreover, some devices reset the IV during a restart, which makes it possible to
observe the same byte sequences even sooner.
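The figures above follow from simple arithmetic on the 24-bit IV. In this sketch the frame rate of 1000 frames per second is an assumed value, and the second function covers the case (also an assumption) where a device picks IVs at random instead of counting, so that the birthday bound applies.

```python
import math

IV_BITS = 24
IV_SPACE = 2 ** IV_BITS          # 16,777,216 possible IV values


def hours_until_wrap(frames_per_second: float) -> float:
    """Time after which a counter-based IV must start repeating."""
    return IV_SPACE / frames_per_second / 3600


def frames_for_collision(probability: float = 0.5) -> int:
    """Approximate number of random IVs needed for a repeat (birthday bound)."""
    return math.ceil(math.sqrt(2 * IV_SPACE * math.log(1 / (1 - probability))))


print(IV_SPACE)                          # 16777216
print(round(hours_until_wrap(1000), 2))  # about 4.66 hours at 1000 frames/s
print(frames_for_collision())            # a few thousand frames
```

With randomly chosen IVs a repeat becomes likely after only a few thousand frames, far sooner than the full wrap-around of a counter.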
There exist a few other flaws that make WEP even less secure. Usually, regular counters are
used for creating the IV. This makes the key bytes used for encrypting messages in both
directions relatively similar. Moreover, some values of the IV are considered weak because
they make it possible to attack specific bytes of the secret key. Because of all these weaknesses,
WEP encryption can be broken in a very short time (within a few minutes).
In newer versions of IEEE standards, newer security protocols are recommended: WPA (Wi-Fi
Protected Access) and WPA2.
KRACK
Key Reinstallation Attack (KRACK) is a complex attack against the WPA2 protocol. It is a
combination of a known-ciphertext attack and a man-in-the-middle attack. The intruder performs
the attack during the WPA2 handshake, that is, during the initialisation of a WPA2 connection. The
attack is based on flaws in the standard and its implementations.
At the moment when the details regarding this attack were published (on 16th October 2017) most
existing WPA2 implementations were vulnerable to the KRACK attack. The authors carefully prepared
their publication, by creating a dedicated website, creating videos, and even preparing a special
logo (available in many resolutions). They proposed how to fix the standard and prepared patches
for all major WPA2 implementations, which should protect them from being vulnerable to attacks
based on KRACK.
One may predict that the lifetime of the KRACK attack will not be particularly long. Most producers
released high-priority patches, which fix the issue, shortly after the publication. However, it
seems reasonable to briefly present the way of performing this attack. This is a practical example
of a cryptographic attack on a two-time pad.
To achieve that, the first step of the intruder is to perform a man-in-the-middle attack. After
faking his MAC address, he places himself between the two communicating sides and pretends
to be one of them. The attacker must be able to intercept and block the messages sent
between the router and the client.
In the second step, the attacker should intercept and save the third handshake message. He must
also block the client response (the fourth handshake message) and prevent the router from
receiving it.
The flawed WPA2 specification directs the client to install the secret key every time it
receives the third handshake message. Thanks to that, after some time the attacker can send
the third message again, which sets the key to the same value as previously (because it is created
from the same data as before: the password, two nonces and the MAC addresses).
Depending on the particular algorithm used for encryption of further communication between the
router and the client, the attacker may compromise the security to a different extent: starting from
discovering the secret key protecting the communication in one direction, through the two-
direction communication key (CCMP and GCMP protocols), up to breaking the key completely
(because some Linux and Android versions reset the key to zeros as a result of this attack).
It is worth mentioning that the attacker cannot discover the network password stored on both
devices. Only the secret key is stolen, which is used only for message encryption during the
current session.
First of all, after receiving the third handshake message and before actually resetting the key, it
would be a good idea to check if the key hasn't been already generated. If the key exists, it
shouldn't be calculated again.
Also, the values of the nonces and the counters (incremented after sending every message)
shouldn't be reset if the key already exists and is used for encryption of ongoing communication.
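The two checks above can be modelled with a toy Python sketch. It is only an illustration of the idea: the Client class, its fields and the nonces_after_replay helper are hypothetical names, and the real handshake carries much more state than shown here.

```python
class Client:
    """Toy model of the client side of the handshake (hypothetical names)."""

    def __init__(self, patched: bool):
        self.patched = patched
        self.key = None
        self.packet_counter = 0      # used as the per-packet nonce

    def on_message3(self, key: bytes) -> None:
        if self.patched and self.key == key:
            return                   # key already installed: keep the counter
        self.key = key
        self.packet_counter = 0      # installing the key resets the nonce

    def send(self) -> int:
        self.packet_counter += 1
        return self.packet_counter   # nonce used for this packet


def nonces_after_replay(patched: bool) -> list[int]:
    client = Client(patched)
    client.on_message3(b"session-key")
    used = [client.send(), client.send()]
    client.on_message3(b"session-key")   # attacker replays message 3
    used.append(client.send())
    return used
```

In this model the unpatched client uses the nonces 1, 2, 1 (the first nonce is reused with the same key, which gives the two-time pad), while the patched client continues with 1, 2, 3.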
Conclusion
KRACK attack presented on this webpage is an interesting example of man-in-the-middle attack,
performed in order to break two-time-pad encryption.
Above, I presented only the most popular version of this attack. The authors presented a number
of attacks on similar protocols (Fast BSS, TDLS, PeerKey), which are based on similar handshake
algorithms. They also presented a way of stealing the group key and various modifications of this
attack for different protocol versions and operating systems. To find out more, you can visit the
website devoted to the KRACK attack: www.krackattacks.com.
Frequency Analysis
Frequency analysis is one of the known ciphertext attacks. It is based on the study of
the frequency of letters or groups of letters in a ciphertext.
In all languages, different letters are used with different frequencies. For each language the
proportions in which the characters appear are slightly different, so texts written in a given
language share certain common properties, which make it possible to distinguish them from texts
written in other languages.
For example, the vowels e, o and a and the consonant t are used often in English. On the other
hand, there are some very rare letters, for example z or x. There exist tables of letter
frequencies for different languages. The frequencies can be determined only approximately because
they differ slightly between kinds of texts (scientific, historical, fiction).
Each language has some typical and popular sequences of letters. In English, there are some
common bigrams, like tr, er, on, an, ss, tt and ee. Based on that, one can distinguish
an English text from texts written in other languages. It is possible to determine the correct order
of letters from mixed words.
The attacker usually checks some possibilities and makes some substitutions of letters
in the ciphertext. He looks for words that start to appear and, based on that, makes more
substitutions. Using computers, it is possible to try a lot of combinations in a relatively short time.
For example, if in the analyzed ciphertext the most popular letter is x, one may predict
that x replaced e or o (one of the most popular letters in English) from the plaintext.
It is useful to look for popular pairs of letters or even try to predict some frequent longer
sequences of letters or whole words. The intruder always tries to find sequences of letters which
are often used in the selected language.
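The first step of such an analysis can be sketched in Python. The example below is a minimal illustration under simple assumptions: the text is encrypted with a Caesar shift (chosen here only because it is the simplest substitution cipher), the sample sentence is made up, and the attacker assumes that the most frequent ciphertext letter stands for e.

```python
from collections import Counter


def letter_frequencies(text: str) -> list[tuple[str, float]]:
    """Return (letter, relative frequency) pairs, most frequent first."""
    letters = [c for c in text.lower() if "a" <= c <= "z"]
    total = len(letters)
    return [(ch, n / total) for ch, n in Counter(letters).most_common()]


def caesar(text: str, shift: int) -> str:
    """Shift lowercase letters by `shift` positions; keep other characters."""
    out = []
    for c in text:
        if "a" <= c <= "z":
            out.append(chr((ord(c) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(c)
    return "".join(out)


plaintext = "meet me here when the evening ends; we need three weeks"
ciphertext = caesar(plaintext, 3)

# Assume the most frequent ciphertext letter replaced the plaintext letter e.
top_letter = letter_frequencies(ciphertext)[0][0]
guessed_shift = (ord(top_letter) - ord("e")) % 26
recovered = caesar(ciphertext, -guessed_shift)
```

For real ciphertexts the guess based on a single letter is often wrong, which is why the attacker also compares bigrams and tries several candidate substitutions.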
Meet-in-the-middle Attack
The meet-in-the-middle attack is a type of known-plaintext attack. The intruder has
to know some parts of the plaintext and the corresponding ciphertexts. Using meet-in-the-middle
attacks it is possible to break ciphers that use two or more secret keys for multiple encryption
with the same algorithm. For example, the 3DES cipher works in this way. The meet-in-the-middle
attack was first presented by Diffie and Hellman for cryptanalysis of the DES algorithm.
A cipher, which is to be broken using meet-in-the-middle attack, can be defined as two
algorithms, one for encryption and one for decryption. Each of them contains two simpler
algorithms:
C = Eb(kb, Ea(ka, P))
P = Da(ka, Db(kb, C))
where:
• C is a ciphertext,
• P is a plaintext,
• E is an algorithm for encryption,
• D is an algorithm for decryption,
• ka and kb are two secret keys
The following equation can be written for the cipher defined above:
Db(kb, C) = Ea(ka, P)
Where C is the ciphertext, known to the intruder, which corresponds to the message P, also
known to the intruder.
The first step of the attack is to create a table with all possible values for one side of the equation.
One should calculate all possible ciphertexts of the known plaintext P created using the first secret
key, so Ea(ka,P). The number of rows in the table is equal to the number of possible secret keys. It
is a good idea to sort the table by the computed ciphertexts Ea(ka,P), in order to simplify
searching it later.
The second step of the attack is to calculate values of Db(kb,C) for the second side of
the equation. One should compare them with the values of the first side of the equation, computed
earlier and stored in the table. The intruder searches for a pair of secret keys ka and kb, for which
the value Ea(ka,P) found in the table and the just calculated value Db(kb,C) are the same.
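The two steps can be sketched in Python on a toy cipher. Everything in the sketch is an assumption made for illustration: the 16-bit block cipher with an 8-bit key is invented and has no security value, and a dictionary lookup is used in place of the sorted table and binary search described above.

```python
from collections import defaultdict

MASK = 0xFFFF  # toy 16-bit block


def rotl(x: int, r: int) -> int:
    return ((x << r) | (x >> (16 - r))) & MASK


def rotr(x: int, r: int) -> int:
    return ((x >> r) | (x << (16 - r))) & MASK


def toy_encrypt(key: int, block: int) -> int:
    """Invented invertible cipher with an 8-bit key (illustration only)."""
    x = (block + key * 0x0101) & MASK
    x = rotl(x, 5)
    return x ^ ((key * 0x01F3) & MASK)


def toy_decrypt(key: int, block: int) -> int:
    x = block ^ ((key * 0x01F3) & MASK)
    x = rotr(x, 5)
    return (x - key * 0x0101) & MASK


def double_encrypt(ka: int, kb: int, block: int) -> int:
    return toy_encrypt(kb, toy_encrypt(ka, block))    # C = Eb(kb, Ea(ka, P))


def meet_in_the_middle(pairs):
    """Return all (ka, kb) consistent with the known (P, C) pairs."""
    (p, c), rest = pairs[0], pairs[1:]
    table = defaultdict(list)
    for ka in range(256):                 # step 1: all values of Ea(ka, P)
        table[toy_encrypt(ka, p)].append(ka)
    found = []
    for kb in range(256):                 # step 2: all values of Db(kb, C)
        middle = toy_decrypt(kb, c)
        for ka in table.get(middle, []):
            # confirm each candidate against the remaining known pairs
            if all(double_encrypt(ka, kb, p2) == c2 for p2, c2 in rest):
                found.append((ka, kb))
    return found


SECRET = (0x3A, 0xC5)                     # hypothetical key pair to recover
pairs = [(p, double_encrypt(*SECRET, p)) for p in (0x1234, 0xBEEF)]
recovered = meet_in_the_middle(pairs)
```

The attack performs 2 * 256 cipher operations instead of the 256 * 256 needed by brute force. The list recovered always contains the real key pair; for such a weak toy cipher it may also contain other pairs that happen to agree on all supplied blocks.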
The scheme of meet-in-the-middle attack
It is also possible to attack encryption systems in which the two encrypting algorithms E are
different (and use keys which do not necessarily have the same lengths). In that case, in the first
step the table is created for the weaker of the two algorithms.
Meet-in-the-middle Complexity
The meet-in-the-middle attack breaks the cipher much faster than an ordinary brute force
attack. The time complexity depends on the lengths of the two encryption keys ka and kb.
It may be presented as a sum of two products:
$2^{\mathrm{len}(k_a)} \cdot \log(2^{\mathrm{len}(k_a)}) + 2^{\mathrm{len}(k_b)} \cdot \log(2^{\mathrm{len}(k_a)})$
Where:
o $2^{\mathrm{len}(k_a)}$ - creating the table with all possible values of $E_a(k_a, P)$,
o $\log(2^{\mathrm{len}(k_a)})$ - sorting the table with all possible values of $E_a(k_a, P)$,
o $2^{\mathrm{len}(k_b)}$ - calculating all possible values of $D_b(k_b, C)$,
o $\log(2^{\mathrm{len}(k_a)})$ - searching the sorted table with values of $E_a(k_a, P)$.
If the lengths of both keys ka and kb are the same and equal to $L_k$, then the time complexity of
the meet-in-the-middle attack can be presented as $O(2^{L_k+1})$. Memory usage can be approximated
as $O(2^{L_k})$. The time complexity of the brute force attack is much greater and equals
approximately $O(2^{L_k+L_k})$. However, the brute force attack uses only $O(1)$ memory.
Meet-in-the-middle 2D
If an analyzed algorithm can be divided into two simpler algorithms with one intermediate state
and if the state is smaller than a secret key, then it is possible to perform the two-dimensional
meet-in-the-middle attack. In modern block ciphers, algorithms often operate on small data blocks
using quite long secret keys.
If it is possible to find the intermediate state S, then the analyzed cipher can be presented as:
C = E1(k1, E2(k2, P))
P = D2(k2, D1(k1, C))
Where values E2(k2,P) can be found in the set which contains all possible values of
the intermediate state S.
Both encryption algorithms E1 and E2 can be broken using a two-dimensional meet-in-the-middle
attack. A scheme of the two-dimensional attack is presented below:
In order to break a cipher using the two-dimensional meet-in-the-middle attack, one should take
the following steps:
1. Calculate all possible values of Ea1(ka1,P) (for known P and all possible values of
the key ka1), then insert them to a table together with values of corresponding keys ka1.
The table should be sorted by calculated values of Ea1(ka1,P).
2. Calculate all possible values of Db2(kb2,C) (for known C corresponding to P and all
possible values of the key kb2), then insert them to a table together with values
of corresponding keys kb2. The table should be sorted by calculated values of Db2(kb2,C).
3. For all possible values of the intermediate state S:
1. Calculate all possible values of Db1(kb1,S) (for all possible values of the key kb1),
then insert them to a table together with values of corresponding keys kb1. The
table should be sorted by calculated values of Db1(kb1,S).
2. Calculate all possible values of Ea2(ka2,S) (for all possible values of the key ka2),
then insert them to a table together with values of corresponding keys ka2.
The table should be sorted by calculated values of Ea2(ka2,S).
3. Compare values in four created tables, searching for
equality Ea1(ka1,P) = Db1(kb1,S) (one should receive a pair of keys (ka1,kb1))
and Ea2(ka2,S) = Db2(kb2,C) (a pair of keys (ka2,kb2)). All combinations of the two
pairs are the potential secret key for the whole cipher. One should check all
received combinations with other known plaintext and ciphertexts blocks.
Usually four extracted keys ka1, kb1, ka2 and kb2 share some bits. One should assign
an independent variable to each bit in the keys and treat them separately.
Meet-in-the-middle nD
It is possible that the attacked cipher can be divided into more than two simpler ciphers. In
the general case one could find n intermediate states and n+1 encryption algorithms which can
be broken using the meet-in-the-middle method. A scheme of the multidimensional
meet-in-the-middle attack is presented below:
In order to break a cipher using the multidimensional meet-in-the-middle attack, one should take
the following steps:
1. Calculate all possible values of Ea1(ka1,P) (for known P and all possible values of
the key ka1), then insert them to a table together with values of corresponding keys ka1.
The table should be sorted by calculated values of Ea1(ka1,P).
2. Calculate all possible values of Dbn+1(kbn+1,C) (for known C corresponding to P and all
possible values of the key kbn+1), then insert them to a table together with values
of corresponding keys kbn+1. The table should be sorted by calculated values
of Dbn+1(kbn+1,C).
3. For all possible values of the intermediate state S1:
1. Calculate all possible values of Db1(kb1,S1) (for all possible values of the key kb1),
then insert them to a table together with values of corresponding keys kb1.
The table should be sorted by calculated values of Db1(kb1,S1).
2. Calculate all possible values of Ea2(ka2,S1) (for all possible values of the key ka2),
then insert them to a table together with values of corresponding keys ka2.
The table should be sorted by calculated values of Ea2(ka2,S1).
3. For all possible values of the intermediate state S2:
1. Calculate all possible values of Db2(kb2,S2) (for all possible values of
the key kb2), then insert them to a table together with values
of corresponding keys kb2. The table should be sorted by calculated
values of Db2(kb2,S2).
2. ...repeat analogous operations up to the intermediate state Sn...
3. For all possible values of the intermediate state Sn:
1. Calculate all possible values of Dbn(kbn,Sn) (for all possible values
of the key kbn), then insert them to a table together with values
of corresponding keys kbn. The table should be sorted by
calculated values of Dbn(kbn,Sn).
2. Calculate all possible values of Ean+1(kan+1,Sn) (for all possible
values of the key kan+1), then insert them to a table together with
values of corresponding keys kan+1. The table should be sorted by
calculated values of Ean+1(kan+1,Sn).
3. Analyze the values in all created tables, comparing corresponding
values of Eai with Dbi. One should receive a few combinations of
corresponding pairs of keys (kai,kbi). The pairs should
be checked against other known parts of plaintext and ciphertext;
one combination of pairs should encrypt and decrypt
all the data properly.
One can choose an arbitrary order of analyzed intermediate states. The states are smaller than
encryption keys, so tables created in the multidimensional meet-in-the-middle attack have
approximately the same size as tables created during the regular meet-in-the-middle attack.
Replay Attack
During replay attacks the intruder sends to the victim the same message as was already used in
the victim's communication. The message is correctly encrypted, so its receiver may treat it as
a correct request and take actions desired by the intruder.
The attacker might either have eavesdropped a message between two sides before or he may
know the message format from his previous communication with one of the sides. This message
may contain some kind of the secret key and be used for authentication.
For example, when one makes an order to the bank to transfer money to some specified account,
the attacker may eavesdrop the frames. If that happens, the attacker can send the same (correct)
messages to the bank one more time, hoping that the bank will transfer money again to the same
account (probably connected with the intruder).
There are some methods to avoid replay attacks. First of all, before starting the communication
both sides may negotiate and create a random session key, valid only for a specified time and
during a specified process. Instead of session keys, it is also reasonable to use timestamps in all
messages and accept only messages that were sent recently enough. Another popular
technique is to use one-time passwords for each request. This method of prevention is very often
used for banking operations.
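The timestamp technique can be sketched in Python as follows. This is a minimal illustration, assuming that both sides already share a secret key; the message layout, the 30-second acceptance window and the names sign, make_message and Receiver are hypothetical. A random one-time value (nonce) is added so that a message repeated within the window is rejected as well.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, body: bytes, timestamp: float, nonce: str) -> bytes:
    """Authenticate the body together with its timestamp and nonce."""
    data = f"{timestamp}|{nonce}|".encode() + body
    return hmac.new(key, data, hashlib.sha256).digest()


def make_message(key: bytes, body: bytes, now: float) -> dict:
    nonce = secrets.token_hex(16)            # random one-time value
    return {"body": body, "timestamp": now, "nonce": nonce,
            "tag": sign(key, body, now, nonce)}


class Receiver:
    def __init__(self, key: bytes, max_age: float = 30.0):
        self.key = key
        self.max_age = max_age               # assumed acceptance window (s)
        self.seen_nonces = set()             # grows forever in this sketch

    def accept(self, message: dict, now: float) -> bool:
        expected = sign(self.key, message["body"],
                        message["timestamp"], message["nonce"])
        if not hmac.compare_digest(expected, message["tag"]):
            return False                     # forged or modified message
        if abs(now - message["timestamp"]) > self.max_age:
            return False                     # sent too long ago
        if message["nonce"] in self.seen_nonces:
            return False                     # already processed: a replay
        self.seen_nonces.add(message["nonce"])
        return True
```

A receiver created with Receiver(key) accepts a fresh message once and rejects the same message when it is sent again. A real system would also drop stored nonces that are older than the acceptance window.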
Cut-and-Paste Attack
In this variation of the replay attack, an attacker mixes parts of different ciphertexts and sends
them to the victim. Most likely, the newly created message will be incorrect, but the receiver may
react in a way that allows the intruder to obtain more information about the attacked system.
Date: 2020-03-09
Homograph Attack
A homograph attack is based on standards of the modern Internet that make it possible to create
(and display in web browsers) URLs with characters from various language sets (with non-ASCII
letters). Different languages may contain different but very similar characters. Attackers can
register their own domain names that look similar to existing web addresses. Then they can create
their own websites that are, again, the same as or very similar to the original sites (which
usually belong to banks, corporations, email or news services). The phony websites are used for
stealing data from users who happen to visit them.
http://www.g00gle.co.uk
http://bl00mberg.com
Non-ASCII URLs
The ability to use non-English characters in URL addresses was added in 2003, due to
the increasing number of non-English-speaking people using the Internet. The change
made it possible to register and use domain names that could be understood by a much larger
number of interested people. Thus it became possible to create web addresses that were
combinations of ASCII and non-ASCII characters, or addresses that consisted only of national
symbols:
http://россия.net
http://газета.ру
http://budyń.pl
All non-ASCII addresses need to be encoded in a special way to be handled by DNS servers. This
format is known as Punycode and all browsers translate non-ASCII URLs into Punycode in
the background before performing a DNS lookup. A Punycode domain name always starts
with xn-- and then contains the ASCII characters of the original address followed by encoded
Unicode data. For instance, the last address from the example above will be encoded in
the following form:
http://xn--budy-e2a.pl
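The translation can be reproduced with the idna codec from the Python standard library, as in the short sketch below. The helper names to_punycode and from_punycode are made up for this example, and the built-in codec follows the original 2003 IDNA rules, so browsers that use newer rules may treat some characters differently.

```python
def to_punycode(hostname: str) -> str:
    """Encode every non-ASCII label of a host name into its xn-- form."""
    return hostname.encode("idna").decode("ascii")


def from_punycode(hostname: str) -> str:
    """Decode xn-- labels back into Unicode."""
    return hostname.encode("ascii").decode("idna")


encoded = to_punycode("budyń.pl")             # "xn--budy-e2a.pl"
decoded = from_punycode("xn--budy-e2a.pl")    # "budyń.pl"
```

Labels that contain only ASCII characters, like pl in this example, are left unchanged.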
Domain names that contain non-ASCII characters are called Internationalized
Domain Names (IDNs). They are handled in various ways by different web browsers. Usually every
vendor implements its own algorithms for determining the display format of requested URLs
and usually one of two solutions (with some minor modifications) is preferred:
o Display all URL characters using Unicode, or
o Display all URL characters using Unicode if and only if all the characters
belong to the same language that is chosen by user settings; display
Punycode URL otherwise.
For example, the Latin and Cyrillic alphabets contain a couple of letters that look the same but
have completely different meanings and are encoded in different ways, such as the Latin a (U+0061)
and the Cyrillic а (U+0430).
Finally, it may be shown that an attacker can use a character that happens to look like the actual
ASCII slash / (U+002F) - the mathematical division operator ∕ (U+2215). It allows him to set up
a subdomain that looks like another real domain, using his own name server and a top-level
domain. The fake URL address could look like the one below:
http://example.com∕a-top-level-domain.com/
The character located after .com is a mathematical division operator. In a web browser's address
bar the character would look like a common slash and the whole URL could be easily confused
with the directory a-top-level-domain.com located in the root directory under
the domain example.com:
http://example.com/a-top-level-domain.com/
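A simple check for both tricks can be sketched in Python. It is only a heuristic written for illustration: the function names are hypothetical, the script of a letter is guessed from the first word of its Unicode character name, and real browsers use much more elaborate rules.

```python
import unicodedata

DIVISION_SLASH = "\u2215"   # looks like "/" but is a different character


def scripts_used(label: str) -> set[str]:
    """Guess the scripts in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # e.g. "LATIN SMALL LETTER A" -> "LATIN"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_suspicious(hostname: str) -> bool:
    """Flag slash look-alikes and labels that mix several scripts."""
    if DIVISION_SLASH in hostname:
        return True
    return any(len(scripts_used(label)) > 1 for label in hostname.split("."))


spoofed = "ex\u0430mple.com"          # the third letter is the Cyrillic а
fake_path = "example.com" + DIVISION_SLASH + "a-top-level-domain.com"
```

With these definitions looks_suspicious returns True for spoofed and fake_path, and False for example.com, россия.net and budyń.pl, because each of their labels uses a single script.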
Security
The best protection against homograph attacks seems to be provided by web browsers that warn
about such phony addresses or handle them properly. Unfortunately this is not always the case and,
what is more, the behaviour may differ depending on the browser vendor.
Cryptographic Tools
Cryptography in Java
The cryptographic functionality in Java is provided mainly by two libraries, Java Cryptography
Architecture (JCA) and Java Cryptography Extension (JCE). The first one, JCA, is tightly
integrated with the core Java API, and delivers the most basic cryptographic features. The latter
one, JCE, provides various advanced cryptographic operations.
In the past, both JCA and JCE libraries used to be treated differently by US export policies. Over
time however the regulations were relaxed, and at present they both are delivered as part of
Java SE and the division is no longer important (one should keep in mind that it does not mean
that the law won't change in the future).
The API functions and classes defined in JCA and JCE allow cryptographic operations to be
performed in Java applications. In addition to operations, the classes describe various objects
and security concepts. All classes belonging to JCA and JCE are called engines.
All JCA engines are located in the java.security package, whereas the JCE classes are
located in the javax.crypto package.
Among others, JCA delivers engines for random number generation (SecureRandom), key
generation and management (KeyPairGenerator, KeyStore), message digests and digital signatures
(MessageDigest, Signature), and for certificate management
(CertificateFactory, CertPathBuilder, CertStore).
JCE contains engines that allow actual encryption and decryption (Cipher), secret key
generation and agreement (KeyGenerator, SecretKeyFactory, KeyAgreement), and
message authentication operations (Mac).
Providers
Whilst JCA and JCE define all cryptographic operations and objects, the actual implementations
of functionalities are located in separate classes, called providers. The providers implement the
API defined in JCA and JCE, and they are responsible for providing the actual cryptographic
algorithms.
Thanks to that, the whole cryptographic architecture is relatively flexible. It separates the
interfaces and generic classes from their implementations. For most of the time, after the
initialization, the programmers need to deal only with abstract terms, like 'cipher' or 'secret key'.
In order to be used in Java applications, all providers must be signed by using a certificate
from Oracle. A detailed instruction can be found in the JDK documentation.
The providers can be installed by configuring the Java Runtime: installing the JAR containing the
provider, and then enabling it by adding its name to the java.security file. Alternatively, the
providers may be installed during execution (by calling Security.addProvider(..) function)
by the application itself.
Each functionality, for example the AES cipher algorithm, may be defined by several providers.
The application, when calling the JCA and JCE API functions, can specify which provider should
be used. Alternatively, the Java engine will choose an available provider based on the preference
order specified in the java.security file.
Popular Providers
A default set of SUN providers (nowadays owned by Oracle) is installed together with the main
Java cryptographic functionality. There are many different kinds of SUN providers (SUN, SunJCE,
SunPKCS11, and so on), and they are used by both JCA and JCE libraries. They define most (if
not all) cryptographic functionalities and can be used straight away in Java applications.
An example of a different set of providers is the collection of classes called Bouncy Castle. It was
developed by an Australian charitable organization, so the US law restrictions do not apply to it.
Bouncy Castle provides a large number of classes implementing various cryptographic
operations. The full description of the project may be found on the website: www.bouncycastle.org.
Another set of providers was created by the Cryptix organization; however, the project has not been
actively developed since 2005. The Cryptix website is located at: www.cryptix.org.
Policy Files
By default, Java cryptographic functionalities have some limitations related to the size of various
types of secret keys. The restrictions are related to the US law and they are supposed to prevent
the application from using too strong ciphers.
One can overcome the limitations by downloading and applying the unlimited strength policy files.
They can usually be acquired from the original Java download web page. The download link is
usually located somewhere at the bottom of the page. After getting the packed archive, the user
should follow the instructions that can be found in the README file.
Security Tokens
Security tokens are tools that allow users to prove their identity electronically. They are usually
used as an additional means of authentication, typically together with passwords.
The tokens may be either physical devices or pure software applications, operating on computers
or mobile devices. Depending on their implementation, security tokens may be referred to as
authentication tokens, cryptographic tokens, hardware or software tokens, USB tokens, or key
fobs.
Irrespective of the type, the main functionality of all security tokens is basically the same. Every
token provides some kind of authentication code for the users, which allows them to access
a particular service (for example, an online bank account).
Another typical application of tokens is hardware dongles. They are required by some
applications to prove ownership of the software. During the startup, the program queries the token
connected to the USB port and checks the authentication code.
Usually a security token requires a password to release the internal authentication code. The
password usually has the form of a short PIN. Sometimes more sophisticated ways of
authentication are implemented, for example fingerprint readers.
The way the authentication code is delivered may also vary between tokens. The simplest (and
the most popular) method is to display the code on the device display, so that the user may use
it later when required. Other tokens use NFC or Bluetooth technologies for transmitting the
password, or need to be connected in another way, for example to the computer USB port or
a smart card reader.
Tokens may use different means for generating authentication codes.
Time-synchronized tokens
The time-synchronized tokens generate a password based on the current time. They must contain
a timer which is synchronized with another timer, operating on the authentication server side. The
passwords generated by time-synchronized tokens change constantly at a set time interval, for
example every minute.
The time-synchronized tokens may, over time, become unsynchronized. In such a case, the
passwords generated by them cannot be used to access the protected service, until
a resynchronization is performed.
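A time-synchronized code generator can be sketched in Python as shown below. The text above does not name a specific algorithm, so the sketch assumes the widely used TOTP scheme (RFC 6238) with HMAC-SHA1 and a 30-second interval; the shared secret is the test value from that RFC.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: float, step: int = 30, digits: int = 6) -> str:
    """Derive a code from the number of `step`-second intervals since 1970."""
    counter = int(unix_time) // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                               # dynamic truncation
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


secret = b"12345678901234567890"      # RFC 6238 test secret, not a real key
code_on_token = totp(secret, 59)      # token clock
code_on_server = totp(secret, 45)     # server clock, same 30-second interval
code_next_step = totp(secret, 60)     # clock already in the next interval
```

Both sides obtain the same code as long as their clocks fall into the same interval. When the clocks drift apart by more than one interval the codes differ, which is the unsynchronized state described above; servers often accept codes from neighbouring intervals to tolerate small drift.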
Asynchronous tokens
The passwords generated by asynchronous tokens change every time they are generated. The
algorithms may be based on hash functions that generate series of one-time codes based on a
shared secret symmetric key.
Each created password must be impossible to predict, even if all the previously generated
passwords are known. One of the popular algorithms used in asynchronous tokens is the OATH
HOTP algorithm (HMAC-based one-time password).
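A counter-based generator of this kind can be sketched in Python as follows. The sketch assumes the HOTP scheme published by OATH as RFC 4226, where the code is derived from a shared secret and a counter that both sides increase after every use; the secret below is the test value from that RFC.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """One-time code derived from a shared secret and an event counter."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                               # dynamic truncation
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


secret = b"12345678901234567890"      # RFC 4226 test secret, not a real key
codes = [hotp(secret, counter) for counter in range(3)]
# codes == ["755224", "287082", "359152"]
```

Because every code depends on the secret key through HMAC, knowing earlier codes does not help in predicting the next one.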
Tokens with public and private keys
If the token contains a private key, the server may use the corresponding public key to
authenticate it, without the need of transmitting the private key outside the token environment.
Usually the server sends the data encrypted with the public key. After decrypting the message,
the token sends it back to the server, allowing it to confirm the token identity. In such a case,
a direct communication between the token and the server must be established.
Date: 2020-03-09
Key-Based Authentication
(Public Key Authentication)
Key-based authentication is a kind of authentication that may be used as an alternative to
password authentication. Instead of requiring a user's password, it is possible to confirm the
client's identity by using asymmetric cryptography algorithms, with public and private keys.
Nowadays, password authentication is more popular than public key authentication. It does not
require much preparation (at least, from the client's point of view), and perhaps is generally more
intuitive. In order to log into a server, users have to provide their secret passwords, which are
verified by the server. A disadvantage of this method is that, when a server is publicly available,
it may be targeted by various types of brute force and dictionary attacks, and the password may
eventually be broken and revealed. Moreover, this method requires the users to remember their
(ideally difficult and complex) passwords.
Public key authentication offers a solution to these problems. The idea is to assign a pair of
asymmetric keys to every user. Users store their public keys in each system they want to
use, while at the same time their private keys are kept secure on the computers from which the
users connect to those secured systems. While establishing the connection, the server
uses the public key to authenticate the client, for example by encrypting some number and
asking the client to decrypt it with the corresponding private key.
Of course, brute force attacks may still be performed by an attacker, but the space of
long and unreadable keys is much larger, and such attacks have a significantly smaller
chance of success. The asymmetric keys used at present consist of thousands of bits (as of
2016, the recommended lengths are 2048 and 4096 bits).
While generating the keys (for example with the ssh-keygen command), the user is asked for a few
configuration parameters, like the desired location to save the keys and, optionally, a passphrase
which will be used to protect the private key. Usually the keys are created in the ~/.ssh
directory, in the home folder. The public key is stored in the id_rsa.pub file, whereas the
private key can be found in id_rsa.
The private key can be stored on an external memory stick. The users will be able to authenticate
themselves on any computer they happen to use, as long as they are able to present the proper
memory card.
The encrypted private key can be attacked only when the attacker has gained full access to the
computer (for example, by stealing it). Before the cipher is broken, the user has time to remove
his old public key from the servers that he was using, and to create a new secure pair of
asymmetric keys. Such an approach is much more secure than normal, password-based authentication,
where an attacker can start attacking the user's password online.
ssh-copy-id <username>@<host>
The user may be warned that the host is unknown, and will be asked for the proper password,
but apart from that the command automatically transfers the public key to the specified server.
ssh <username>@<host>
If successful (that is, if the key-based authentication is correctly configured), this command
will establish a secure connection to the server, without the need to provide the remote account's
password. If the private key is encrypted, then the user will be asked for the password which
protects it.
After enabling the key-based authentication on the server, the password authentication could be
disabled, to prevent brute-force attacks. It can be done by changing the
flag PasswordAuthentication in /etc/ssh/sshd_config, and restarting the SSH service.
It is also possible to specify the IP addresses which would be allowed to use the password
authentication, and block the functionality for the others.
The configuration file mentioned above is also a place where you can enable or disable the key-
based authentication on the server altogether. The flags are
called PubkeyAuthentication and RSAAuthentication, and they both should be enabled.
The programs described below can be used to generate the pair of keys. After that, the public key
needs to be manually provided to Linux, to the file .ssh/authorized_keys mentioned above.
Git Bash
Git Bash is a terminal client which (as one may guess) provides bash functionalities to Windows
machines. It is quite similar to Cygwin but faster and less complex. For the purposes of
generating asymmetric keys, it is sufficient.
The web address of the Git project is git-scm.com, and its Windows version can be downloaded
from there.
The pair of keys can be generated using the same commands as on Linux (ssh-keygen, and
so on).
PuTTY
PuTTY is an SSH and telnet client, an open source application available for Windows platforms.
It consists of several components, PuTTY, PSCP, PSFTP, PuTTYtel, Plink, Pageant, PuTTYgen,
pterm. All of them are quite useful. The project home page is www.putty.org.
To generate the keys, one should use PuTTYgen tool, which can be found in PuTTY installation
directory. The user should just press the 'Generate' button. The application requires enough
random data to generate random byte streams used for key-generation, so the user will be asked
to make some random mouse movements.
Bitvise
Bitvise is a company that specializes in SSH server and client applications development. It
provides a few services that are not as free as the two tools described above; however, they
can be used freely for personal purposes. The website of the project is www.bitvise.com.
To generate the pair of keys, one should use the User Keypair Manager tool, which can be found on
the front page. After clicking the 'Generate New' button, the user will be asked for the type of
the key and for the optional password, just as when generating the keys on Linux.
User Keypair Manager makes it possible to export the created keys. One has to select the output
format (which is, usually, OpenSSL).
Docker
Docker is an application that allows deploying programs inside sandbox packages called
containers, which are far more efficient than commonly used virtual machines. The Docker
application was created in France in 2013. The official website of the project is www.docker.com.
Docker allows a user to create a sandbox container that contains the application with all the
required dependencies. The container may be later used for running the prepared software
multiple times, or for future software development.
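As a rough sketch of packaging an application together with its dependencies, the example below creates a tiny application and a Dockerfile in a temporary directory, builds an image from them and runs it twice. The application script, the image name demo-app, the alpine:3 base image and the curl package (standing in for "a dependency") are all hypothetical choices made for this illustration. The build step is executed only if a Docker daemon is reachable.

```shell
# Build context in a scratch directory.
ctx="$(mktemp -d)"

# A tiny "application" (hypothetical).
cat > "$ctx/app.sh" <<'EOF'
#!/bin/sh
echo "hello from inside the container"
EOF

# The Dockerfile names the base image, installs a dependency and adds the application.
cat > "$ctx/Dockerfile" <<'EOF'
FROM alpine:3
RUN apk add --no-cache curl
COPY app.sh /usr/local/bin/app.sh
RUN chmod +x /usr/local/bin/app.sh
CMD ["/usr/local/bin/app.sh"]
EOF

# Build the image and run it twice, but only if a Docker daemon is reachable.
if docker info >/dev/null 2>&1; then
    docker build -t demo-app "$ctx" \
        && docker run --rm demo-app \
        && docker run --rm demo-app \
        || echo "build or run failed" >&2
else
    echo "Docker daemon not reachable; image not built" >&2
fi
```

Each docker run call starts a fresh container from the same prepared image, which is what makes the package reusable.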
Docker may be used for creating and managing distributed software systems, because the user can
modify the application relatively quickly by changing the containers that form its services and
processes. By adding new containers to the network, the user can easily improve the
performance and effectiveness of the system. Docker containers may work on many
physical or virtual machines, and their internal environment is not affected by their hosts'
configurations.
Software containers are not purely cryptographic tools. However, due to the portability, efficiency
and flexibility of sandboxes, they are a good way to deploy and test security solutions, and to
perform all types of software operations.
At present, there exist a few more container solutions, but Docker is definitely one of the most
popular ones. It is worth mentioning that many popular cloud service providers, such as Amazon
and Microsoft, have added support for Docker images.
Sandboxes
The overhead caused by the additional layers introduced by Docker is much smaller than the cost of
running a whole virtual machine. Instead of starting another complete operating system,
Docker containers share the low-level functionality of the host (its kernel) and change only what
is necessary in the upper layers of the system.
At first, Docker was available only for Linux operating systems, but over time support for
Windows was added. Docker uses modern functionality available in operating systems that
provides various types of resource isolation. For example, when running on Linux, Docker takes
advantage of kernel features such as namespaces, cgroups, the aufs file system, and virtualization
interfaces (libcontainer, libvirt, and LXC).
The application that runs inside Docker is isolated from the host operating system in terms of the
file system, other processes and users, CPU and memory, network interfaces, and input/output
devices.
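The isolation described above can be observed with a single command. The sketch below starts a throwaway container with capped memory and CPU, no network access and a read-only root file system, and then prints what the container can see. The image (alpine:3) and the specific limits are assumptions chosen for the example. Typically the output lists only the container's own processes and a single loopback interface.

```shell
image="alpine:3"   # example image, chosen for this sketch

if docker info >/dev/null 2>&1; then
    # --memory and --cpus cap RAM and CPU usage, --network none removes
    # network access, --read-only mounts the root file system read-only,
    # and --rm deletes the container when it exits.
    if docker run --rm --memory 128m --cpus 0.5 --network none --read-only "$image" \
        sh -c 'echo "host name: $(hostname)"; echo "processes:"; ps; echo "interfaces:"; ls /sys/class/net'
    then
        isolation_demo="ran"
    else
        isolation_demo="failed"
    fi
else
    echo "Docker daemon not reachable; commands not executed" >&2
    isolation_demo="skipped"
fi
```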
Docker API
One of the reasons for Docker's popularity is its ease of use. After creating a Docker
image, the user can carry out the work by using a few simple commands.
A Docker image is the package containing the application which is going to be used, together with
all its dependencies and configuration parameters. Each time an image is started, Docker creates
a new container and initializes it with fresh parameters. Of course, a number of separate
containers based on the same image can be created and run at the same time. Every created
container receives its own unique ID.
o docker pull image_name: downloads the specified image from a Docker repository.
o docker images: lists all available images.
o docker run image_name: creates a new container from the specified image, starts it, and
performs the predefined operations. Each time the run command is used, a new container is
created.
o docker ps: lists all currently running containers.
o docker stop container_id: stops the specified container.
o docker rm container_id: removes the specified container.
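The commands listed above can be combined into one short session. The sketch below is illustrative only: the image name (nginx:alpine) is an arbitrary example, and the -d option, which runs a container in the background and prints its ID, is added so that the IDs can be captured in variables. The commands are executed only if a Docker daemon is reachable.

```shell
image="nginx:alpine"   # example image, chosen for this sketch

if docker info >/dev/null 2>&1; then
    docker pull "$image"                  # download the image
    docker images                         # list the locally available images

    # Two containers from the same image; each receives its own unique ID.
    first="$(docker run -d "$image")"
    second="$(docker run -d "$image")"
    echo "first container:  $first"
    echo "second container: $second"

    docker ps                             # both containers are listed as running
    docker stop "$first" "$second"        # stop them...
    docker rm "$first" "$second"          # ...and remove them
    lifecycle_demo="ran"
else
    echo "Docker daemon not reachable; commands not executed" >&2
    lifecycle_demo="skipped"
fi
```

Note that docker ps shows only running containers; adding the -a option also lists stopped ones that have not been removed yet.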
Docker networks
It is possible to run multiple Docker containers that are connected via a network and are able
to exchange data. Generally, it is recommended to create many containers, each one of them
running just one service, and allow them all to work together within one network.
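A minimal sketch of this arrangement, assuming a user-defined network, is shown below. The network name (demo-net), the container name (demo-web), the images and the one-second pause are hypothetical choices made for this example. On a user-defined network, containers can usually reach each other by container name, which the second container uses to fetch a page from the first.

```shell
net="demo-net"   # hypothetical network name
web="demo-web"   # hypothetical container name

if docker info >/dev/null 2>&1; then
    docker network create "$net"

    # One service per container: a web server attached to the network.
    docker run -d --name "$web" --network "$net" nginx:alpine
    sleep 1   # give the server a moment to start

    # A second, short-lived container on the same network contacts the
    # first one by its container name.
    docker run --rm --network "$net" alpine:3 wget -qO- "http://$web" \
        || echo "request failed" >&2

    # Clean up the container and the network.
    docker stop "$web"
    docker rm "$web"
    docker network rm "$net"
    network_demo="ran"
else
    echo "Docker daemon not reachable; commands not executed" >&2
    network_demo="skipped"
fi
```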