You are on page 1of 1056

Information security

principles and practice, 2nd


edition, by mark stamp

Chapter 1  Introduction 1
Chapter 1: Introduction

“Begin at the beginning,” the King said, very gravely,


“and go on till you come to the end: then stop.”
 Lewis Carroll, Alice in Wonderland

Chapter 1  Introduction 2
The Cast of Characters
 Alice and Bob are the good guys

 Trudy is the bad “guy”

 Trudy is our generic “intruder”

Chapter 1  Introduction 3
Alice’s Online Bank
 Alice opens Alice’s Online Bank (AOB)
 What are Alice’s security concerns?
 If Bob is a customer of AOB, what
are his security concerns?
 How are Alice’s and Bob’s concerns
similar? How are they different?
 How does Trudy view the situation?

Chapter 1  Introduction 4
CIA
 CIA == Confidentiality, Integrity, and
Availability
 AOB must prevent Trudy from
learning Bob’s account balance
 Confidentiality: prevent unauthorized
reading of information
o Cryptography used for confidentiality

Chapter 1  Introduction 5
CIA
 Trudy must not be able to change
Bob’s account balance
 Bob must not be able to improperly
change his own account balance
 Integrity: detect unauthorized
writing of information
o Cryptography used for integrity

Chapter 1  Introduction 6
CIA
 AOB’s information must be available
whenever it’s needed
 Alice must be able to make transaction
o If not, she’ll take her business elsewhere
 Availability: Data is available in a timely
manner when needed
 Availability a relatively new security issue
o Denial of service (DoS) attacks

Chapter 1  Introduction 7
Beyond CIA: Crypto
 How does Bob’s computer know that
“Bob” is really Bob and not Trudy?
 Bob’s password must be verified
o This requires some clever cryptography
 What are security concerns of pwds?
 Are there alternatives to passwords?

Chapter 1  Introduction 8
Beyond CIA: Protocols
 When Bob logs into AOB, how does AOB
know that “Bob” is really Bob?
 As before, Bob’s password is verified
 Unlike the previous case, network security
issues arise
 How do we secure network transactions?
o Protocols are critically important
o Crypto plays a major role in security protocols

Chapter 1  Introduction 9
Beyond CIA: Access Control
 Once Bob is authenticated by AOB, then
AOB must restrict actions of Bob
o Bob can’t view Charlie’s account info

o Bob can’t install new software, and so on…

 Enforcing such restrictions: authorization


 Access control includes both
authentication and authorization

Chapter 1  Introduction 10
Beyond CIA: Software
 Cryptography, protocols, and access control
are all implemented in software
o Software is foundation on which security rests
 What are security issues of software?
o Real-world software is complex and buggy
o Software flaws lead to security flaws
o How does Trudy attack software?
o How to reduce flaws in software development?
o And what about malware?

Chapter 1  Introduction 11
Your Textbook
 The text consists of four major parts
o Cryptography
o Access control
o Protocols
o Software
 We’ll focus on technical issues
 But, people cause lots of problems…
Chapter 1  Introduction 12
The People Problem
 People often break security
o Both intentionally and unintentionally
o Here, we consider an unintentional case
 For example, suppose you want to buy
something online
o Say, Information Security: Principles and
Practice, 3rd edition from amazon.com

Chapter 1  Introduction 13
The People Problem
 To buy from amazon.com…
o Your browser uses the SSL protocol
o SSL relies on cryptography
o Many access control issues arise
o All security mechanisms are in software
 Suppose all of this security stuff
works perfectly
o Then you would be safe, right?
Chapter 1  Introduction 14
The People Problem
 What could go wrong?
 Trudy tries man-in-the-middle attack
o SSL is secure, so attack does not “work”
o But, Web browser warns of problem
o What do you, the user, do?
 If user ignores warning, attack works!
o None of the security mechanisms failed
o But user unintentionally broke security
Chapter 1  Introduction 15
Cryptography
 “Secret codes”
 The book covers
o Classic cryptography
o Symmetric ciphers
o Public key cryptography
o Hash functions++
o Advanced cryptanalysis

Chapter 1  Introduction 16
Access Control
 Authentication
o Passwords
o Biometrics
o Other methods of authentication
 Authorization
o Access Control Lists and Capabilities
o Multilevel security (MLS), security modeling,
covert channel, inference control
o Firewalls, intrusion detection (IDS)

Chapter 1  Introduction 17
Protocols
 “Simple” authentication protocols
o Focus on basics of security protocols
o Lots of applied cryptography in protocols
 Real-world security protocols
o SSH, SSL, IPSec, Kerberos
o Wireless: WEP, GSM

Chapter 1  Introduction 18
Software
 Security-critical flaws in software
o Buffer overflow
o Race conditions, etc.
 Malware
o Examples of viruses and worms
o Prevention and detection
o Future of malware?

Chapter 1  Introduction 19
Software
 Software reverse engineering (SRE)
o How hackers “dissect” software
 Digital rights management (DRM)
o Shows difficulty of security in software
o Also raises OS security issues
 Software and testing
o Open source, closed source, other topics

Chapter 1  Introduction 20
Software
 Operating systems
o Basic OS security issues
o “Trusted OS” requirements
o NGSCB: Microsoft’s trusted OS for the PC
 Software is a BIG security topic
o Lots of material to cover
o Lots of security problems to consider
o But not nearly enough time…

Chapter 1  Introduction 21
Think Like Trudy
 In the past, no respectable sources
talked about “hacking” in detail
o After all, such info might help Trudy
 Recently, this has changed
o Lots of info on network hacking,
malware, how to hack software, and more
o Classes taught on virus writing, SRE, …

Chapter 1  Introduction 22
Think Like Trudy
 Good guys must think like bad guys!
A police detective…
o …must study and understand criminals
 In information security
o We want to understand Trudy’s methods
o We might think about Trudy’s motives
o We’ll often pretend to be Trudy

Chapter 1  Introduction 23
Think Like Trudy
 Is it a good idea to discuss security
problems and attacks?
 Bruce Schneier, referring to Security
Engineering, by Ross Anderson:
o “It’s about time somebody wrote a book
to teach the good guys what the bad
guys already know.”

Chapter 1  Introduction 24
Think Like Trudy
 We must try to think like Trudy
 We must study Trudy’s methods
 We can admire Trudy’s cleverness
 Often, we can’t help but laugh at Alice’s
and/or Bob’s stupidity
 But, we cannot act like Trudy
o Except in this class …
o … and even then, there are limits

Chapter 1  Introduction 25
In This Course…
 Thinklike the bad guy
 Always look for weaknesses
o Find the weak link before Trudy does
 It’s OK to break the rules
o What rules?
 Think like Trudy
 But don’t do anything illegal!

Chapter 1  Introduction 26
Part I: Crypto

Part 1  Cryptography 27
Chapter 2: Crypto Basics
MXDXBVTZWVMXNSPBQXLIMSCCSGXSCJXBOVQXCJZMOJZCVC
TVWJCZAAXZBCSSCJXBQCJZCOJZCNSPOXBXSBTVWJC
JZDXGXXMOZQMSCSCJXBOVQXCJZMOJZCNSPJZHGXXMOSPLH
JZDXZAAXZBXHCSCJXTCSGXSCJXBOVQX
 plaintext from Lewis Carroll, Alice in Wonderland

The solution is by no means so difficult as you might


be led to imagine from the first hasty inspection of the characters.
These characters, as any one might readily guess,
form a cipher  that is to say, they convey a meaning…
 Edgar Allan Poe, The Gold Bug
Part 1  Cryptography 28
Crypto
 Cryptology  The art and science of
making and breaking “secret codes”
 Cryptography  making “secret
codes”
 Cryptanalysis  breaking “secret
codes”
 Crypto  all of the above (and more)

Part 1  Cryptography 29
How to Speak Crypto
 A cipher or cryptosystem is used to encrypt
the plaintext
 The result of encryption is ciphertext
 We decrypt ciphertext to recover plaintext
 A key is used to configure a cryptosystem
 A symmetric key cryptosystem uses the same
key to encrypt as to decrypt
 A public key cryptosystem uses a public key
to encrypt and a private key to decrypt

Part 1  Cryptography 30
Crypto
 Basic assumptions
o The system is completely known to the attacker
o Only the key is secret
o That is, crypto algorithms are not secret
 This is known as Kerckhoffs’ Principle
 Why do we make such an assumption?
o Experience has shown that secret algorithms
tend to be weak when exposed
o Secret algorithms never remain secret
o Better to find weaknesses beforehand
Part 1  Cryptography 31
Crypto as Black Box

key key

plaintext encrypt decrypt plaintext


ciphertext

A generic view of symmetric key crypto

Part 1  Cryptography 32
Simple Substitution
 Plaintext: fourscoreandsevenyearsago
 Key:

Plaintext a b c d e f g h i j k l m n o p q r s t u v w x y z

Ciphertext D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

 Ciphertext:
IRXUVFRUHDQGVHYHQBHDUVDJR
 Shift by 3 is “Caesar’s cipher”
Part 1  Cryptography 33
Ceasar’s Cipher Decryption
 Suppose we know a Caesar’s cipher is
being used:

Plaintext a b c d e f g h i j k l m n o p q r s t u v w x y z

Ciphertext D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

 Given ciphertext:
VSRQJHEREVTXDUHSDQWV
 Plaintext: spongebobsquarepants

Part 1  Cryptography 34
Not-so-Simple Substitution
 Shift by n for some n  {0,1,2,…,25}
 Then key is n
 Example: key n = 7

Plaintext a b c d e f g h i j k l m n o p q r s t u v w x y z

Ciphertext H I J K L M N O P Q R S T U V W X Y Z A B C D E F G

Part 1  Cryptography 35
Cryptanalysis I: Try Them All
 A simple substitution (shift by n) is used
o But the key is unknown
 Given ciphertext: CSYEVIXIVQMREXIH
 How to find the key?
 Only 26 possible keys  try them all!
 Exhaustive key search
 Solution: key is n = 4

Part 1  Cryptography 36
Simple Substitution: General Case
 In general, simple substitution key can be
any permutation of letters
o Not necessarily a shift of the alphabet
 For example

Plaintext a b c d e f g h i j k l m n o p q r s t u v w x y z

Ciphertext J I C A X S E Y V D K W B Q T Z R H F M P N U L G O

 Then 26! > 288 possible keys

Part 1  Cryptography 37
Cryptanalysis II: Be Clever
 We know that a simple substitution is used
 But not necessarily a shift by n
 Find the key given the ciphertext:
PBFPVYFBQXZTYFPBFEQJHDXXQVAPTPQJKTOYQWIPBVWLXTOX
BTFXQWAXBVCXQWAXFQJVWLEQNTOZQGGQLFXQWAKVWLXQ
WAEBIPBFXFQVXGTVJVWLBTPQWAEBFPBFHCVLXBQUFEVWLXGD
PEQVPQGVPPBFTIXPFHXZHVFAGFOTHFEFBQUFTDHZBQPOTHXTY
FTODXQHFTDPTOGHFQPBQWAQJJTODXQHFOQPWTBDHHIXQV
APBFZQHCFWPFHPBFIPBQWKFABVYYDZBOTHPBQPQJTQOTOGHF
QAPBFEQJHDXXQVAVXEBQPEFZBVFOJIWFFACFCCFHQWAUVWF
LQHGFXVAFXQHFUFHILTTAVWAFFAWTEVOITDHFHFQAITIXPFH
XAFQHEFZQWGFLVWPTOFFA

Part 1  Cryptography 38
Cryptanalysis II
 Cannot try all 288 simple substitution keys
 Can we be more clever?
 English letter frequency counts…

0.14

0.12

0.10

0.08

0.06

0.04

0.02

0.00
A C E G I K M O Q S U W Y

Part 1  Cryptography 39
Cryptanalysis II
 Ciphertext:
PBFPVYFBQXZTYFPBFEQJHDXXQVAPTPQJKTOYQWIPBVWLXTOXBTFXQ
WAXBVCXQWAXFQJVWLEQNTOZQGGQLFXQWAKVWLXQWAEBIPBFXFQ
VXGTVJVWLBTPQWAEBFPBFHCVLXBQUFEVWLXGDPEQVPQGVPPBFTIXPFH
XZHVFAGFOTHFEFBQUFTDHZBQPOTHXTYFTODXQHFTDPTOGHFQPBQW
AQJJTODXQHFOQPWTBDHHIXQVAPBFZQHCFWPFHPBFIPBQWKFABVYY
DZBOTHPBQPQJTQOTOGHFQAPBFEQJHDXXQVAVXEBQPEFZBVFOJIWFF
ACFCCFHQWAUVWFLQHGFXVAFXQHFUFHILTTAVWAFFAWTEVOITDHFH
FQAITIXPFHXAFQHEFZQWGFLVWPTOFFA

 Analyze this message using statistics below

Ciphertext frequency counts:


A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
21 26 6 10 12 51 10 25 10 9 3 10 0 1 15 28 42 0 0 27 4 24 22 28 6 8

Part 1  Cryptography 40
Cryptanalysis: Terminology
 Cryptosystem is secure if best know
attack is to try all keys
o Exhaustive key search, that is
 Cryptosystem is insecure if any
shortcut attack is known
 But then insecure cipher might be
harder to break than a secure cipher!
o What the … ?
Part 1  Cryptography 41
Double Transposition
 Plaintext: attackxatxdawn

Permute rows
and columns


 Ciphertext: xtawxnattxadakc
 Key is matrix size and permutations:
(3,5,1,4,2) and (1,3,2)
Part 1  Cryptography 42
One-Time Pad: Encryption
e=000 h=001 i=010 k=011 l=100 r=101 s=110 t=111

Encryption: Plaintext  Key = Ciphertext

h e i l h i t l e r
Plaintext: 001 000 010 100 001 010 111 100 000 101
Key: 111 101 110 101 111 100 000 101 110 000
Ciphertext: 110 101 100 001 110 110 111 001 110 101

s r l h s s t h s r
Part 1  Cryptography 43
One-Time Pad: Decryption
e=000 h=001 i=010 k=011 l=100 r=101 s=110 t=111

Decryption: Ciphertext  Key = Plaintext

s r l h s s t h s r
Ciphertext: 110 101 100 001 110 110 111 001 110 101
Key: 111 101 110 101 111 100 000 101 110 000
Plaintext: 001 000 010 100 001 010 111 100 000 101

h e i l h i t l e r
Part 1  Cryptography 44
One-Time Pad
Double agent claims following “key” was used:
s r l h s s t h s r
Ciphertext: 110 101 100 001 110 110 111 001 110 101
“key”: 101 111 000 101 111 100 000 101 110 000
“Plaintext”: 011 010 100 100 001 010 111 100 000 101

k i l l h i t l e r
e=000 h=001 i=010 k=011 l=100 r=101 s=110 t=111

Part 1  Cryptography 45
One-Time Pad
Or claims the key is…
s r l h s s t h s r
Ciphertext: 110 101 100 001 110 110 111 001 110 101
“key”: 111 101 000 011 101 110 001 011 101 101
“Plaintext”: 001 000 100 010 011 000 110 010 011 000

h e l i k e s i k e
e=000 h=001 i=010 k=011 l=100 r=101 s=110 t=111

Part 1  Cryptography 46
One-Time Pad Summary
 Provably secure
o Ciphertext gives no useful info about plaintext
o All plaintexts are equally likely
 BUT, only when be used correctly
o Pad must be random, used only once
o Pad is known only to sender and receiver
 Note: pad (key) is same size as message
 So, why not distribute msg instead of pad?

Part 1  Cryptography 47
Real-World One-Time Pad
 Project VENONA
o Soviet spies encrypted messages from U.S. to
Moscow in 30’s, 40’s, and 50’s
o Nuclear espionage, etc.
o Thousands of messages
 Spy carried one-time pad into U.S.
 Spy used pad to encrypt secret messages
 Repeats within the “one-time” pads made
cryptanalysis possible
Part 1  Cryptography 48
VENONA Decrypt (1944)
[C% Ruth] learned that her husband [v] was called up by the army but
he was not sent to the front. He is a mechanical engineer and is now
working at the ENORMOUS [ENORMOZ] [vi] plant in SANTA FE, New
Mexico. [45 groups unrecoverable]
detain VOLOK [vii] who is working in a plant on ENORMOUS. He is a
FELLOWCOUNTRYMAN [ZEMLYaK] [viii]. Yesterday he learned that
they had dismissed him from his work. His active work in progressive
organizations in the past was cause of his dismissal. In the
FELLOWCOUNTRYMAN line LIBERAL is in touch with CHESTER [ix].
They meet once a month for the payment of dues. CHESTER is
interested in whether we are satisfied with the collaboration and
whether there are not any misunderstandings. He does not inquire
about specific items of work [KONKRETNAYa RABOTA]. In as much
as CHESTER knows about the role of LIBERAL's group we beg
consent to ask C. through LIBERAL about leads from among people
who are working on ENOURMOUS and in other technical fields.
 “Ruth” == Ruth Greenglass
 “Liberal” == Julius Rosenberg
 “Enormous” == the atomic bomb
Part 1  Cryptography 49
Codebook Cipher
 Literally, a book filled with “codewords”
 Zimmerman Telegram encrypted via codebook
Februar 13605
fest 13732
finanzielle 13850
folgender 13918
Frieden 17142
Friedenschluss 17149
: :

 Modern block ciphers are codebooks!


 More about this later…
Part 1  Cryptography 50
Codebook Cipher: Additive
 Codebooks also (usually) use additive
 Additive  book of “random” numbers
o Encrypt message with codebook
o Then choose position in additive book
o Add in additives to get ciphertext
o Send ciphertext and additive position (MI)
o Recipient subtracts additives before
decrypting
 Why use an additive sequence?
Part 1  Cryptography 51
Zimmerman
Telegram
 Perhaps most
famous codebook
ciphertext ever
 A major factor in
U.S. entry into
World War I

Part 1  Cryptography 52
Zimmerman
Telegram
Decrypted
 British had
recovered
partial
codebook
 Then able to
fill in missing
parts

Part 1  Cryptography 53
Random Historical Items
 Crypto timeline
 Spartan Scytale  transposition
cipher
 Caesar’s cipher
 Poe’s short story: The Gold Bug
 Election of 1876

Part 1  Cryptography 54
Election of 1876
 “Rutherfraud” Hayes vs “Swindling” Tilden
o Popular vote was virtual tie
 Electoral college delegations for 4 states
(including Florida) in dispute
 Commission gave all 4 states to Hayes
o Voted on straight party lines
 Tilden accused Hayes of bribery
o Was it true?
Part 1  Cryptography 55
Election of 1876
 Encrypted messages by Tilden supporters
later emerged
 Cipher: Partial codebook, plus transposition
 Codebook substitution for important words
ciphertext plaintext
Copenhagen Greenbacks
Greece Hayes
Rochester votes
Russia Tilden
Warsaw telegram
: :

Part 1  Cryptography 56
Election of 1876
 Apply codebook to original message
 Pad message to multiple of 5 words (total
length, 10,15,20,25 or 30 words)
 For each length, a fixed permutation
applied to resulting message
 Permutations found by comparing several
messages of same length
 Note that the same key is applied to all
messages of a given length

Part 1  Cryptography 57
Election of 1876
 Ciphertext: Warsaw they read all
unchanged last are idiots can’t situation
 Codebook: Warsaw  telegram
 Transposition: 9,3,6,1,10,5,2,7,4,8
 Plaintext: Can’t read last telegram.
Situation unchanged. They are all idiots.
 A weak cipher made worse by reuse of key
 Lesson? Don’t overuse keys!

Part 1  Cryptography 58
Early 20th Century
 WWI  Zimmerman Telegram
 “Gentlemen do not read each other’s mail”
o Henry L. Stimson, Secretary of State, 1929
 WWII  golden age of cryptanalysis
o Midway/Coral Sea
o Japanese Purple (codename MAGIC)
o German Enigma (codename ULTRA)

Part 1  Cryptography 59
Post-WWII History
 Claude Shannon  father of the science of
information theory
 Computer revolution  lots of data to protect
 Data Encryption Standard (DES), 70’s
 Public Key cryptography, 70’s
 CRYPTO conferences, 80’s
 Advanced Encryption Standard (AES), 90’s
 The crypto genie is out of the bottle…
Part 1  Cryptography 60
Claude Shannon
 The founder of Information Theory
 1949 paper: Comm. Thy. of Secrecy Systems
 Fundamental concepts
o Confusion  obscure relationship between
plaintext and ciphertext
o Diffusion  spread plaintext statistics through
the ciphertext
 Proved one-time pad is secure
 One-time pad is confusion-only, while double
transposition is diffusion-only
Part 1  Cryptography 61
Taxonomy of Cryptography
 Symmetric Key
o Same key for encryption and decryption
o Modern types: Stream ciphers, Block ciphers
 Public Key (or “asymmetric” crypto)
o Two keys, one for encryption (public), and one
for decryption (private)
o And digital signatures  nothing comparable in
symmetric key crypto
 Hash algorithms
o Can be viewed as “one way” crypto

Part 1  Cryptography 62
Taxonomy of Cryptanalysis
 From perspective of info available to Trudy…
o Ciphertext only  Trudy’s worst case scenario
o Known plaintext
o Chosen plaintext
 “Lunchtime attack”
 Some protocols will encrypt chosen data
o Adaptively chosen plaintext
o Related key
o Forward search (public key crypto)
o And others…
Part 1  Cryptography 63
Chapter 3:
Symmetric Key Crypto

The chief forms of beauty are order and symmetry…


 Aristotle

“You boil it in sawdust: you salt it in glue:


You condense it with locusts and tape:
Still keeping one principal object in view 
To preserve its symmetrical shape.”
 Lewis Carroll, The Hunting of the Snark

Part 1  Cryptography 64
Symmetric Key Crypto
 Stream cipher  generalize one-time pad
o Except that key is relatively short
o Key is stretched into a long keystream
o Keystream is used just like a one-time pad
 Block cipher  generalized codebook
o Block cipher key determines a codebook
o Each key yields a different codebook
o Employs both “confusion” and “diffusion”

Part 1  Cryptography 65
Stream Ciphers

Part 1  Cryptography 66
Stream Ciphers
 Once upon a time, not so very long ago…
stream ciphers were the king of crypto
 Today, not as popular as block ciphers
 We’ll discuss two stream ciphers:
 A5/1
o Based on shift registers
o Used in GSM mobile phone system
 RC4
o Based on a changing lookup table
o Used many places
Part 1  Cryptography 67
A5/1: Shift Registers
 A5/1 uses 3 shift registers
o X: 19 bits (x0,x1,x2, …,x18)
o Y: 22 bits (y0,y1,y2, …,y21)
o Z: 23 bits (z0,z1,z2, …,z22)

Part 1  Cryptography 68
A5/1: Keystream
 At each iteration: m = maj(x8, y10, z10)
o Examples: maj(0,1,0) = 0 and maj(1,1,0) = 1
 If x8 = m then X steps
o t = x13  x16  x17  x18
o xi = xi1 for i = 18,17,…,1 and x0 = t
 If y10 = m then Y steps
o t = y20  y21
o yi = yi1 for i = 21,20,…,1 and y0 = t
 If z10 = m then Z steps
o t = z7  z20  z21  z22
o zi = zi1 for i = 22,21,…,1 and z0 = t
 Keystream bit is x18  y21  z22
Part 1  Cryptography 69
A5/1
X x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 x17 x18

Y y0 y1 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11 y12 y13 y14 y15 y16 y17 y18 y19 y20 y21 

Z z0 z1 z2 z3 z4 z5 z6 z7 z8 z9 z10 z11 z12 z13 z14 z15 z16 z17 z18 z19 z20 z21 z22


 Each variable here is a single bit
 Key is used as initial fill of registers
 Each register steps (or not) based on maj(x8, y10, z10)
 Keystream bit is XOR of rightmost bits of registers
Part 1  Cryptography 70
A5/1
X 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

Y 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 0 1 

Z 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 1


 In this example, m = maj(x8, y10, z10) = maj(1,0,1) = 1
 Register X steps, Y does not step, and Z steps
 Keystream bit is XOR of right bits of registers
 Here, keystream bit will be 0  1  0 = 1
Part 1  Cryptography 71
Shift Register Crypto
 Shift register crypto efficient in hardware
 Often, slow if implemented in software
 In the past, very, very popular
 Today, more is done in software due to
fast processors
 Shift register crypto still used some
o Especially in resource-constrained devices

Part 1  Cryptography 72
RC4
 A self-modifying lookup table
 Table always contains a permutation of the
byte values 0,1,…,255
 Initialize the permutation using key
 At each step, RC4 does the following
o Swaps elements in current lookup table
o Selects a keystream byte from table
 Each step of RC4 produces a byte
o Efficient in software
 Each step of A5/1 produces only a bit
o Efficient in hardware
Part 1  Cryptography 73
RC4 Initialization
 S[] is permutation of 0,1,...,255
 key[] contains N bytes of key
for i = 0 to 255
S[i] = i
K[i] = key[i (mod N)]
next i
j = 0
for i = 0 to 255
j = (j + S[i] + K[i]) mod 256
swap(S[i], S[j])
next i
i = j = 0

Part 1  Cryptography 74
RC4 Keystream
 At each step, swap elements in table and
select keystream byte
i = (i + 1) mod 256
j = (j + S[i]) mod 256
swap(S[i], S[j])
t = (S[i] + S[j]) mod 256
keystreamByte = S[t]
 Use keystream bytes like a one-time pad
 Note: first 256 bytes should be discarded
o Otherwise, related key attack exists

Part 1  Cryptography 75
Stream Ciphers
 Stream ciphers were popular in the past
o Efficient in hardware
o Speed was needed to keep up with voice, etc.
o Today, processors are fast, so software-based
crypto is usually more than fast enough
 Future of stream ciphers?
o Shamir declared “the death of stream ciphers”
o May be greatly exaggerated…

Part 1  Cryptography 76
Block Ciphers

Part 1  Cryptography 77
(Iterated) Block Cipher
 Plaintext and ciphertext consist of
fixed-sized blocks
 Ciphertext obtained from plaintext
by iterating a round function
 Input to round function consists of
key and output of previous round
 Usually implemented in software

Part 1  Cryptography 78
Feistel Cipher: Encryption
 Feistel cipher is a type of block cipher
o Not a specific block cipher
 Split plaintext block into left and right
halves: P = (L0, R0)
 For each round i = 1, 2, ..., n, compute
Li = Ri1
Ri = Li1  F(Ri1, Ki)
where F is round function and Ki is subkey
 Ciphertext: C = (Ln, Rn)
Part 1  Cryptography 79
Feistel Cipher: Decryption
 Start with ciphertext C = (Ln, Rn)
 For each round i = n, n1, …, 1, compute
Ri1 = Li
Li1 = Ri  F(Ri1, Ki)
where F is round function and Ki is subkey
 Plaintext: P = (L0, R0)
 Decryption works for any function F
o But only secure for certain functions F

Part 1  Cryptography 80
Data Encryption Standard
 DES developed in 1970’s
 Based on IBM’s Lucifer cipher
 DES was U.S. government standard
 Development of DES was controversial
o NSA secretly involved
o Design process was secret
o Key length reduced from 128 to 56 bits
o Subtle changes to Lucifer algorithm

Part 1  Cryptography 81
DES Numerology
 DES is a Feistel cipher with…
o 64 bit block length
o 56 bit key length
o 16 rounds
o 48 bits of key used each round (subkey)
 Round function is simple (for block cipher)
 Security depends heavily on “S-boxes”
o Each S-box maps 6 bits to 4 bits

Part 1  Cryptography 82
L R key
32 28 28

One
expand shift shift
32 48 28 28
Round
Ki

48 48 compress

S-boxes
of
DES
28 28
32

P box
32
32

32

L R key
Part 1  Cryptography 83
DES Expansion Permutation
 Input 32 bits
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

 Output 48 bits
31 0 1 2 3 4 3 4 5 6 7 8
7 8 9 10 11 12 11 12 13 14 15 16
15 16 17 18 19 20 19 20 21 22 23 24
23 24 25 26 27 28 27 28 29 30 31 0

Part 1  Cryptography 84
DES S-box
8 “substitution boxes” or S-boxes
 Each S-box maps 6 bits to 4 bits
 Here is S-box number 1
input bits (0,5)
 input bits (1,2,3,4)
| 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
------------------------------------------------------------------------------------
00 | 1110 0100 1101 0001 0010 1111 1011 1000 0011 1010 0110 1100 0101 1001 0000 0111
01 | 0000 1111 0111 0100 1110 0010 1101 0001 1010 0110 1100 1011 1001 0101 0011 1000
10 | 0100 0001 1110 1000 1101 0110 0010 1011 1111 1100 1001 0111 0011 1010 0101 0000
11 | 1111 1100 1000 0010 0100 1001 0001 0111 0101 1011 0011 1110 1010 0000 0110 1101

Part 1  Cryptography 85
DES P-box
 Input 32 bits
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

 Output 32 bits
15 6 19 20 28 11 27 16 0 14 22 25 4 17 30 9
1 7 23 13 31 26 2 8 18 12 29 5 21 10 3 24

Part 1  Cryptography 86
DES Subkey
 56 bit DES key, numbered 0,1,2,…,55
 Left half key bits, LK
49 42 35 28 21 14 7
0 50 43 36 29 22 15
8 1 51 44 37 30 23
16 9 2 52 45 38 31
 Right half key bits, RK
55 48 41 34 27 20 13
6 54 47 40 33 26 19
12 5 53 46 39 32 25
18 11 4 24 17 10 3

Part 1  Cryptography 87
DES Subkey
 For rounds i=1,2,...,16
o Let LK = (LK circular shift left by ri)
o Let RK = (RK circular shift left by ri)
o Left half of subkey Ki is of LK bits
13 16 10 23 0 4 2 27 14 5 20 9
22 18 11 3 25 7 15 6 26 19 12 1
o Right half of subkey Ki is RK bits
12 23 2 8 18 26 1 11 22 16 4 19
15 20 10 27 5 24 17 13 21 7 0 3

Part 1  Cryptography 88
DES Subkey
 For rounds 1, 2, 9 and 16 the shift ri is 1,
and in all other rounds ri is 2
 Bits 8,17,21,24 of LK omitted each round
 Bits 6,9,14,25 of RK omitted each round
 Compression permutation yields 48 bit
subkey Ki from 56 bits of LK and RK
 Key schedule generates subkey

Part 1  Cryptography 89
DES Last Word (Almost)
 An initial permutation before round 1
 Halves are swapped after last round
A final permutation (inverse of initial
perm) applied to (R16, L16)
 None of this serves any security
purpose

Part 1  Cryptography 90
Security of DES
 Security depends heavily on S-boxes
o Everything else in DES is linear
 35+ years of intense analysis has revealed
no back door
 Attacks, essentially exhaustive key search
 Inescapable conclusions
o Designers of DES knew what they were doing
o Designers of DES were way ahead of their time
(at least wrt certain cryptanalytic techniques)
Part 1  Cryptography 91
Block Cipher Notation
 P = plaintext block
 C = ciphertext block
 Encrypt P with key K to get ciphertext C
o C = E(P, K)
 Decrypt C with key K to get plaintext P
o P = D(C, K)
 Note: P = D(E(P, K), K) and C = E(D(C, K), K)
o But P  D(E(P, K1), K2) and C  E(D(C, K1), K2) when
K1  K2
Part 1  Cryptography 92
Triple DES
 Today, 56 bit DES key is too small
o Exhaustive key search is feasible
 But DES is everywhere, so what to do?
 Triple DES or 3DES (112 bit key)
o C = E(D(E(P,K1),K2),K1)
o P = D(E(D(C,K1),K2),K1)
 Why Encrypt-Decrypt-Encrypt with 2 keys?
o Backward compatible: E(D(E(P,K),K),K) = E(P,K)
o And 112 is a lot of bits
Part 1  Cryptography 93
3DES
 Why not C = E(E(P,K),K) instead?
o Trick question  still just 56 bit key
 Why not C = E(E(P,K1),K2) instead?
 A (semi-practical) known plaintext attack
o Pre-compute table of E(P,K1) for every possible
key K1 (resulting table has 256 entries)
o Then for each possible K2 compute D(C,K2) until
a match in table is found
o When match is found, have E(P,K1) = D(C,K2)
o Result gives us keys: C = E(E(P,K1),K2)

Part 1  Cryptography 94
Advanced Encryption Standard
 Replacement for DES
 AES competition (late 90’s)
o NSA openly involved
o Transparent selection process
o Many strong algorithms proposed
o Rijndael Algorithm ultimately selected
(pronounced like “Rain Doll” or “Rhine Doll”)
 Iterated block cipher (like DES)
 Not a Feistel cipher (unlike DES)
Part 1  Cryptography 95
AES: Executive Summary
 Block size: 128 bits (others in Rijndael)
 Key length: 128, 192 or 256 bits
(independent of block size in Rijndael)
 10 to 14 rounds (depends on key length)
 Each round uses 4 functions (3 “layers”)
o ByteSub (nonlinear layer)
o ShiftRow (linear mixing layer)
o MixColumn (nonlinear layer)
o AddRoundKey (key addition layer)

Part 1  Cryptography 96
AES ByteSub
 Treat 128 bit block as 4x4 byte array

 ByteSub is AES’s “S-box”


 Can be viewed as nonlinear (but invertible)
composition of two math operations

Part 1  Cryptography 97
AES “S-box”
Last 4 bits of input

First 4
bits of
input

Part 1  Cryptography 98
AES ShiftRow
 Cyclic shift rows

Part 1  Cryptography 99
AES MixColumn
 Invertible, linear operation applied to
each column

 Implemented as a (big) lookup table


Part 1  Cryptography 100
AES AddRoundKey
 XOR subkey with block

Block Subkey

 RoundKey (subkey) determined by key


schedule algorithm

Part 1  Cryptography 101


AES Decryption
 To decrypt, process must be invertible
 Inverse of MixAddRoundKey is easy, since
“” is its own inverse
 MixColumn is invertible (inverse is also
implemented as a lookup table)
 Inverse of ShiftRow is easy (cyclic shift
the other direction)
 ByteSub is invertible (inverse is also
implemented as a lookup table)

Part 1  Cryptography 102


A Few Other Block Ciphers
 Briefly…
o IDEA
o Blowfish
o RC6
 More detailed…
o TEA

Part 1  Cryptography 103


IDEA
 Invented by James Massey
o One of the giants of modern crypto
 IDEA has 64-bit block, 128-bit key
 IDEA uses mixed-mode arithmetic
 Combine different math operations
o IDEA the first to use this approach
o Frequently used today

Part 1  Cryptography 104


Blowfish
 Blowfish encrypts 64-bit blocks
 Key is variable length, up to 448 bits
 Invented by Bruce Schneier
 Almost a Feistel cipher
Ri = Li1  Ki
Li = Ri1  F(Li1  Ki)
 The round function F uses 4 S-boxes
o Each S-box maps 8 bits to 32 bits
 Key-dependent S-boxes
o S-boxes determined by the key
Part 1  Cryptography 105
RC6
 Invented by Ron Rivest
 Variables
o Block size
o Key size
o Number of rounds
 An AES finalist
 Uses data dependent rotations
o Unusual for algorithm to depend on plaintext

Part 1  Cryptography 106


Time for TEA…
 Tiny Encryption Algorithm (TEA)
 64 bit block, 128 bit key
 Assumes 32-bit arithmetic
 Number of rounds is variable (32 is
considered secure)
 Uses “weak” round function, so large
number of rounds required

Part 1  Cryptography 107


TEA Encryption
Assuming 32 rounds:
(K[0], K[1], K[2], K[3]) = 128 bit key
(L,R) = plaintext (64-bit block)
delta = 0x9e3779b9
sum = 0
for i = 1 to 32
sum += delta
L += ((R<<4)+K[0])^(R+sum)^((R>>5)+K[1])
R += ((L<<4)+K[2])^(L+sum)^((L>>5)+K[3])
next i
ciphertext = (L,R)
Part 1  Cryptography 108
TEA Decryption
Assuming 32 rounds:
(K[0], K[1], K[2], K[3]) = 128 bit key
(L,R) = ciphertext (64-bit block)
delta = 0x9e3779b9
sum = delta << 5
for i = 1 to 32
R = ((L<<4)+K[2])^(L+sum)^((L>>5)+K[3])
L = ((R<<4)+K[0])^(R+sum)^((R>>5)+K[1])
sum = delta
next i
plaintext = (L,R)
Part 1  Cryptography 109
TEA Comments
 “Almost” a Feistel cipher
o Uses + and - instead of  (XOR)
 Simple, easy to implement, fast, low
memory requirement, etc.
 Possibly a “related key” attack
 eXtended TEA (XTEA) eliminates related
key attack (slightly more complex)
 Simplified TEA (STEA)  insecure version
used as an example for cryptanalysis
Part 1  Cryptography 110
Block Cipher Modes

Part 1  Cryptography 111


Multiple Blocks
 How to encrypt multiple blocks?
 Do we need a new key for each block?
o If so, as impractical as a one-time pad!
 Encrypt each block independently?
 Is there any analog of codebook “additive”?
 How to handle partial blocks?
o We won’t discuss this issue

Part 1  Cryptography 112


Modes of Operation
 Many modes  we discuss 3 most popular
 Electronic Codebook (ECB) mode
o Encrypt each block independently
o Most obvious approach, but a bad idea
 Cipher Block Chaining (CBC) mode
o Chain the blocks together
o More secure than ECB, virtually no extra work
 Counter Mode (CTR) mode
o Block ciphers acts like a stream cipher
o Popular for random access
Part 1  Cryptography 113
ECB Mode
 Notation: C = E(P, K)
 Given plaintext P0, P1, …, Pm, …
 Most obvious way to use a block cipher:
Encrypt Decrypt
C0 = E(P0, K) P0 = D(C0, K)
C1 = E(P1, K) P1 = D(C1, K)
C2 = E(P2, K) … P2 = D(C2, K) …
 For fixed key K, this is “electronic” version
of a codebook cipher (without additive)
o With a different codebook for each key
Part 1  Cryptography 114
ECB Cut and Paste
 Suppose plaintext is
Alice digs Bob. Trudy digs Tom.
 Assuming 64-bit blocks and 8-bit ASCII:
P0 = “Alice di”, P1 = “gs Bob. ”,
P2 = “Trudy di”, P3 = “gs Tom. ”
 Ciphertext: C0, C1, C2, C3
 Trudy cuts and pastes: C0, C3, C2, C1
 Decrypts as
Alice digs Tom. Trudy digs Bob.

Part 1  Cryptography 115


ECB Weakness
 Suppose Pi = Pj
 Then Ci = Cj and Trudy knows Pi = Pj
 This gives Trudy some information,
even if she does not know Pi or Pj
 Trudy might know Pi
 Is this a serious issue?

Part 1  Cryptography 116


Alice Hates ECB Mode
 Alice’s uncompressed image, and ECB encrypted (TEA)

 Why does this happen?


 Same plaintext yields same ciphertext!
Part 1  Cryptography 117
CBC Mode
 Blocks are “chained” together
 A random initialization vector, or IV, is
required to initialize CBC mode
 IV is random, but not secret
Encryption Decryption
C0 = E(IV  P0, K), P0 = IV  D(C0, K),
C1 = E(C0  P1, K), P1 = C0  D(C1,
K),
C2 = E(C1  P2, K),… P2 = C1  D(C2, K),…
 Analogous to classic codebook with additive
Part 1  Cryptography 118
CBC Mode
 Identical plaintext blocks yield different
ciphertext blocks  this is very good!
 But what about errors in transmission?
o If C1 is garbled to, say, G then
P1  C0  D(G, K), P2  G  D(C2, K)
o But P3 = C2  D(C3, K), P4 = C3  D(C4, K), …
o Automatically recovers from errors!
 Cut and paste is still possible, but more
complex (and will cause garbles)
Part 1  Cryptography 119
Alice Likes CBC Mode
 Alice’s uncompressed image, Alice CBC encrypted (TEA)

 Why does this happen?


 Same plaintext yields different ciphertext!
Part 1  Cryptography 120
Counter Mode (CTR)
 CTR is popular for random access
 Use block cipher like a stream cipher
Encryption Decryption
C0 = P0  E(IV, K), P0 = C0  E(IV,
K),
C1 = P1  E(IV+1, K), P1 = C1  E(IV+1, K),
C2 = P2  E(IV+2, K),… P2 = C2  E(IV+2, K),…
 Note: CBC also works for random access
o But there is a significant limitation…
Part 1  Cryptography 121
Integrity

Part 1  Cryptography 122


Data Integrity
 Integrity  detect unauthorized writing
(i.e., detect unauthorized mod of data)
 Example: Inter-bank fund transfers
o Confidentiality may be nice, integrity is critical
 Encryption provides confidentiality
(prevents unauthorized disclosure)
 Encryption alone does not provide integrity
o One-time pad, ECB cut-and-paste, etc., etc.

Part 1  Cryptography 123


MAC
 Message Authentication Code (MAC)
o Used for data integrity
o Integrity not the same as confidentiality
 MAC is computed as CBC residue
o That is, compute CBC encryption, saving
only final ciphertext block, the MAC
o The MAC serves as a cryptographic
checksum for data
Part 1  Cryptography 124
MAC Computation
 MAC computation (assuming N blocks)
C0 = E(IV  P0, K),
C1 = E(C0  P1, K),
C2 = E(C1  P2, K),…
CN1 = E(CN2  PN1, K) = MAC
 Send IV, P0, P1, …, PN1 and MAC
 Receiver does same computation and
verifies that result agrees with MAC
 Both sender and receiver must know K
Part 1  Cryptography 125
Does a MAC work?
 Suppose Alice has 4 plaintext blocks
 Alice computes
C0 = E(IVP0, K), C1 = E(C0P1, K),
C2 = E(C1P2, K), C3 = E(C2P3, K) = MAC
 Alice sends IV, P0, P1, P2, P3 and MAC to Bob
 Suppose Trudy changes P1 to X
 Bob computes
C0 = E(IVP0, K), C1 = E(C0X, K),
C2 = E(C1P2, K), C3 = E(C2P3, K) = MAC  MAC
 It works since error propagates into MAC
 Trudy can’t make MAC == MAC without K
Part 1  Cryptography 126
Confidentiality and Integrity
 Encrypt with one key, MAC with another key
 Why not use the same key?
o Send last encrypted block (MAC) twice?
o This cannot add any security!
 Using different keys to encrypt and
compute MAC works, even if keys are
related
o But, twice as much work as encryption alone
o Can do a little better  about 1.5 “encryptions”
 Confidentiality and integrity with same work
as one encryption is a research topic
Part 1  Cryptography 127
Uses for Symmetric Crypto
 Confidentiality
o Transmitting data over insecure channel
o Secure storage on insecure media
 Integrity (MAC)
 Authentication protocols (later…)
 Anything you can do with a hash
function (upcoming chapter…)

Part 1  Cryptography 128


Chapter 4:
Public Key Cryptography

You should not live one way in private, another in public.


 Publilius Syrus

Three may keep a secret, if two of them are dead.


 Ben Franklin

Part 1  Cryptography 129


Public Key Cryptography
 Two keys, one to encrypt, another to decrypt
o Alice uses Bob’s public key to encrypt
o Only Bob’s private key decrypts the message
 Based on “trap door, one way function”
o “One way” means easy to compute in one direction,
but hard to compute in other direction
o Example: Given p and q, product N = pq easy to
compute, but hard to find p and q from N
o “Trap door” is used when creating key pairs

Part 1  Cryptography 130


Public Key Cryptography
 Encryption
o Suppose we encrypt M with Bob’s public key
o Bob’s private key can decrypt C to recover M
 Digital Signature
o Bob signs by “encrypting” with his private key
o Anyone can verify signature by “decrypting”
with Bob’s public key
o But only Bob could have signed
o Like a handwritten signature, but much better…
Part 1  Cryptography 131
Knapsack

Part 1  Cryptography 132


Knapsack Problem
 Given a set of n weights W0,W1,...,Wn-1 and a
sum S, find ai  {0,1} so that
S = a0W0+a1W1 + ... + an-1Wn-1
(technically, this is the subset sum problem)
 Example
o Weights (62,93,26,52,166,48,91,141)
o Problem: Find a subset that sums to S = 302
o Answer: 62 + 26 + 166 + 48 = 302
 The (general) knapsack is NP-complete
Part 1  Cryptography 133
Knapsack Problem
 General knapsack (GK) is hard to solve
 But superincreasing knapsack (SIK) is easy
 SIK  each weight greater than the sum of
all previous weights
 Example
o Weights (2,3,7,14,30,57,120,251)
o Problem: Find subset that sums to S = 186
o Work from largest to smallest weight
o Answer: 120 + 57 + 7 + 2 = 186
Part 1  Cryptography 134
Knapsack Cryptosystem
1. Generate superincreasing knapsack (SIK)
2. Convert SIK to “general” knapsack (GK)
3. Public Key: GK
4. Private Key: SIK and conversion factor
 Goal…
o Easy to encrypt with GK
o With private key, easy to decrypt (solve SIK)
o Without private key, Trudy has no choice but
to try to solve GK

Part 1  Cryptography 135


Example
 Start with (2,3,7,14,30,57,120,251) as the SIK
 Choose m = 41 and n = 491 (m, n relatively
prime, n exceeds sum of elements in SIK)
 Compute “general” knapsack
2  41 mod 491 = 82
3  41 mod 491 = 123
7  41 mod 491 = 287
14  41 mod 491 = 83
30  41 mod 491 = 248
57  41 mod 491 = 373
120  41 mod 491 = 10
251  41 mod 491 = 471
 “General” knapsack:
(82,123,287,83,248,373,10,471)
Part 1  Cryptography 136
Knapsack Example
 Private key: (2,3,7,14,30,57,120,251)
m1 mod n = 411 mod 491 = 12
 Public key: (82,123,287,83,248,373,10,471),
n=491
 Example: Encrypt 10010110
82 + 83 + 373 + 10 = 548
 To decrypt, use private key…
o 548 · 12 = 193 mod 491
o Solve (easy) SIK with S = 193
o Obtain plaintext 10010110
Part 1  Cryptography 137
Knapsack Weakness
 Trapdoor: Convert SIK into “general”
knapsack using modular arithmetic
 One-way: General knapsack easy to
encrypt, hard to solve; SIK easy to solve
 This knapsack cryptosystem is insecure
o Broken in 1983 with Apple II computer
o The attack uses lattice reduction
 “General knapsack” is not general enough!
o This special case of knapsack is easy to break

Part 1  Cryptography 138


RSA

Part 1  Cryptography 139


RSA
 Invented by Clifford Cocks (GCHQ) and
Rivest, Shamir, and Adleman (MIT)
o RSA is the gold standard in public key crypto
 Let p and q be two large prime numbers
 Let N = pq be the modulus
 Choose e relatively prime to (p−1)(q−1)
 Find d such that ed = 1 mod (p−1)(q−1)
 Public key is (N,e)
 Private key is d
Part 1  Cryptography 140
RSA
 Message M is treated as a number
 To encrypt M we compute
C = Me mod N
 To decrypt ciphertext C, we compute
M = Cd mod N
 Recall that e and N are public
 If Trudy can factor N = pq, she can use e
to easily find d since ed = 1 mod
(p−1)(q−1)
 So, factoring the modulus breaks RSA
o Is factoring the only way to break RSA?
Part 1  Cryptography 141
Does RSA Really Work?
 Given C = Me mod N we want to show that
M = Cd mod N = Med mod N
 We’ll need Euler’s Theorem:
If x is relatively prime to n then x(n) = 1 mod n
 Facts:
1) ed = 1 mod (p − 1)(q − 1)
2) By definition of “mod”, ed = k(p − 1)(q − 1) + 1
3) (N) = (p − 1)(q − 1)
 Then ed − 1 = k(p − 1)(q − 1) = k(N)
 So, Cd = Med = M(ed  1) + 1 = MMed  1 = MMk(N)
= M(M(N))k mod N = M1k mod N = M mod N
Part 1  Cryptography 142
Simple RSA Example
 Example of textbook RSA
o Select “large” primes p = 11, q = 3
o Then N = pq = 33 and (p − 1)(q − 1) = 20
o Choose e = 3 (relatively prime to 20)
o Find d such that ed = 1 mod 20
 We find that d = 7 works
 Public key: (N, e) = (33, 3)
 Private key: d = 7

Part 1  Cryptography 143


Simple RSA Example
 Public key: (N, e) = (33, 3)
 Private key: d = 7
 Suppose message to encrypt is M = 8
 Ciphertext C is computed as
C = Me mod N = 83 = 512 = 17 mod 33
 Decrypt C to recover the message M by
M = Cd mod N = 177 = 410,338,673
= 12,434,505  33 + 8 = 8 mod 33

Part 1  Cryptography 144


More Efficient RSA (1)
 Modular exponentiation example
o 520 = 95367431640625 = 25 mod 35
 A better way: repeated squaring
o 20 = 10100 base 2
o (1, 10, 101, 1010, 10100) = (1, 2, 5, 10, 20)
o Note that 2 = 1 2, 5 = 2  2 + 1, 10 = 2  5, 20 = 2  10
o 51= 5 mod 35
o 52= (51)2 = 52 = 25 mod 35
o 55= (52)2  51 = 252  5 = 3125 = 10 mod 35
o 510 = (55)2 = 102 = 100 = 30 mod 35
o 520 = (510)2 = 302 = 900 = 25 mod 35
 No huge numbers and it’s efficient!
Part 1  Cryptography 145
More Efficient RSA (2)
 Use e = 3 for all users (but not same N or d)
+ Public key operations only require 2 multiplies
o Private key operations remain expensive
- If M < N1/3 then C = Me = M3 and cube root attack
- For any M, if C1, C2, C3 sent to 3 users, cube root
attack works (uses Chinese Remainder Theorem)
 Can prevent cube root attack by padding
message with random bits
 Note: e = 216 + 1 also used (“better” than e =
3)
Part 1  Cryptography 146
Diffie-Hellman

Part 1  Cryptography 147


Diffie-Hellman Key Exchange
 Invented by Williamson (GCHQ) and,
independently, by D and H (Stanford)
 A “key exchange” algorithm
o Used to establish a shared symmetric key
o Not for encrypting or signing
 Based on discrete log problem
o Given: g, p, and gk mod p
o Find: exponent k

Part 1  Cryptography 148


Diffie-Hellman
 Let p be prime, let g be a generator
o For any x  {1,2,…,p-1} there is n s.t. x = gn mod p
 Alice selects her private value a
 Bob selects his private value b
 Alice sends ga mod p to Bob
 Bob sends gb mod p to Alice
 Both compute shared secret, gab mod p
 Shared secret can be used as symmetric key
Part 1  Cryptography 149
Diffie-Hellman
 Public: g and p
 Private: Alice’s exponent a, Bob’s exponent b

ga mod p
gb mod p

Alice, a Bob, b
 Alice computes (gb)a = gba = gab mod p
 Bob computes (ga)b = gab mod p
 They can use K = gab mod p as symmetric key
Part 1  Cryptography 150
Diffie-Hellman
 Suppose Bob and Alice use Diffie-Hellman
to determine symmetric key K = gab mod p
 Trudy can see ga mod p and gb mod p
o But… ga gb mod p = ga+b mod p  gab mod p
 If Trudy can find a or b, she gets K
 If Trudy can solve discrete log problem,
she can find a or b

Part 1  Cryptography 151


Diffie-Hellman
 Subject to man-in-the-middle (MiM) attack

ga mod p gt mod p

gt mod p gb mod p

Alice, a Trudy, t Bob, b

 Trudy shares secret gat mod p with Alice


 Trudy shares secret gbt mod p with Bob
 Alice and Bob don’t know Trudy is MiM
Part 1  Cryptography 152
Diffie-Hellman
 How to prevent MiM attack?
o Encrypt DH exchange with symmetric key
o Encrypt DH exchange with public key
o Sign DH values with private key
o Other?
 At this point, DH may look pointless…
o …but it’s not (more on this later)
 You MUST be aware of MiM attack on
Diffie-Hellman
Part 1  Cryptography 153
Elliptic Curve Cryptography

Part 1  Cryptography 154


Elliptic Curve Crypto (ECC)
 “Elliptic curve” is not a cryptosystem
 Elliptic curves provide different way
to do the math in public key system
 Elliptic curve versions of DH, RSA, …
 Elliptic curves are more efficient
o Fewer bits needed for same security
o But the operations are more complex,
yet it is a big “win” overall
Part 1  Cryptography 155
What is an Elliptic Curve?
 An elliptic curve E is the graph of
an equation of the form
y2 = x3 + ax + b
 Also includes a “point at infinity”
 What do elliptic curves look like?
 See the next slide!

Part 1  Cryptography 156


Elliptic Curve Picture
y
 Consider elliptic curve
E: y2 = x3 - x + 1
P2  If P1 and P2 are on E, we
P1
can define addition,
x
P3 = P 1 + P 2
P3
as shown in picture
 Addition is all we need…

Part 1  Cryptography 157


Points on Elliptic Curve
 Consider y2 = x3 + 2x + 3 (mod 5)
x = 0  y2 = 3  no solution (mod 5)
x = 1  y2 = 6 = 1  y = 1,4 (mod 5)
x = 2  y2 = 15 = 0  y = 0 (mod 5)
x = 3  y2 = 36 = 1  y = 1,4 (mod 5)
x = 4  y2 = 75 = 0  y = 0 (mod 5)
 Then points on the elliptic curve are
(1,1) (1,4) (2,0) (3,1) (3,4) (4,0)
and the point at infinity: 

Part 1  Cryptography 158


Elliptic Curve Math
 Addition on: y2 = x3 + ax + b (mod p)
P1=(x1,y1), P2=(x2,y2)
P1 + P2 = P3 = (x3,y3) where
x3 = m2 - x1 - x2 (mod p)
y3 = m(x1 - x3) - y1 (mod p)
And m = (y2-y1)(x2-x1)-1 mod p, if P1P2
m = (3x12+a)(2y1)-1 mod p, if P1 = P2
Special cases: If m is infinite, P3 = , and
 + P = P for all P

Part 1  Cryptography 159


Elliptic Curve Addition
 Consider y2 = x3 + 2x + 3 (mod 5).
Points on the curve are (1,1) (1,4)
(2,0) (3,1) (3,4) (4,0) and 
 What is (1,4) + (3,1) = P3 = (x3,y3)?
m = (1-4)(3-1)-1 = -32-1
= 2(3) = 6 = 1 (mod 5)
x3 = 1 - 1 - 3 = 2 (mod 5)
y3 = 1(1-2) - 4 = 0 (mod 5)
 On this curve, (1,4) + (3,1) = (2,0)

Part 1  Cryptography 160


ECC Diffie-Hellman
 Public: Elliptic curve and point (x,y) on curve
 Private: Alice’s A and Bob’s B

A(x,y)
B(x,y)

Alice, A Bob, B

 Alice computes A(B(x,y))


 Bob computes B(A(x,y))
 These are the same since AB = BA

Part 1  Cryptography 161


ECC Diffie-Hellman
 Public: Curve y2 = x3 + 7x + b (mod 37)
and point (2,5)  b = 3
 Alice’s private: A = 4
 Bob’s private: B = 7
 Alice sends Bob: 4(2,5) = (7,32)
 Bob sends Alice: 7(2,5) = (18,35)
 Alice computes: 4(18,35) = (22,1)
 Bob computes: 7(7,32) = (22,1)

Part 1  Cryptography 162


Larger ECC Example
 Example from Certicom ECCp-109
o Challenge problem, solved in 2002
 Curve E: y2 = x3 + ax + b (mod
p)
 Where
p = 564538252084441556247016902735257
a = 321094768129147601892514872825668
b = 430782315140218274262276694323197
 Now what?
Part 1  Cryptography 163
ECC Example
 The following point P is on the curve E
(x,y) = (97339010987059066523156133908935,
149670372846169285760682371978898)
 Letk = 281183840311601949668207954530684
 The kP is given by
(x,y) = (44646769697405861057630861884284,
522968098895785888047540374779097)
 And this point is also on the curve E
Part 1  Cryptography 164
Really Big Numbers!
 Numbers are big, but not big enough
o ECCp-109 bit (32 digit) solved in 2002
 Today,ECC DH needs bigger numbers
 But RSA needs way bigger numbers
o Minimum RSA modulus today is 1024 bits
o That is, more than 300 decimal digits
o That’s about 10x the size in ECC example
o And 2048 bit RSA modulus is common…
Part 1  Cryptography 165
Uses for Public Key Crypto

Part 1  Cryptography 166


Uses for Public Key Crypto
 Confidentiality
o Transmitting data over insecure channel
o Secure storage on insecure media
 Authentication protocols (later)
 Digital signature
o Provides integrity and non-repudiation
o No non-repudiation with symmetric keys

Part 1  Cryptography 167


Non-non-repudiation
 Alice orders 100 shares of stock from Bob
 Alice computes MAC using symmetric key
 Stock drops, Alice claims she did not order
 Can Bob prove that Alice placed the order?
 No! Bob also knows the symmetric key, so
he could have forged the MAC
 Problem: Bob knows Alice placed the order,
but he can’t prove it

Part 1  Cryptography 168


Non-repudiation
 Alice orders 100 shares of stock from Bob
 Alice signs order with her private key
 Stock drops, Alice claims she did not order
 Can Bob prove that Alice placed the order?
 Yes! Alice’s private key used to sign the
order  only Alice knows her private key
 This assumes Alice’s private key has not
been lost/stolen

Part 1  Cryptography 169


Public Key Notation
 Sign message M with Alice’s
private key: [M]Alice
 Encrypt message M with Alice’s
public key: {M}Alice
 Then
{[M]Alice}Alice = M
[{M}Alice]Alice = M

Part 1  Cryptography 170


Sign and Encrypt
vs
Encrypt and Sign

Part 1  Cryptography 171


Confidentiality and
Non-repudiation?
 Suppose that we want confidentiality
and integrity/non-repudiation
 Can public key crypto achieve both?
 Alice sends message to Bob
o Sign and encrypt: {[M]Alice}Bob
o Encrypt and sign: [{M}Bob]Alice
 Can the order possibly matter?

Part 1  Cryptography 172


Sign and Encrypt
 M = “I love you”

{[M]Alice}Bob {[M]Alice}Charlie

Alice Bob Charlie

 Q: What’s the problem?


 A: No problem  public key is public

Part 1  Cryptography 173


Encrypt and Sign
 M = “My theory, which is mine….”

[{M}Bob]Alice [{M}Bob]Charlie

Alice Charlie Bob

 Note that Charlie cannot decrypt M


 Q: What is the problem?
 A: No problem  public key is public
Part 1  Cryptography 174
Public Key Infrastructure

Part 1  Cryptography 175


Public Key Certificate
 Digital certificate contains name of user and
user’s public key (possibly other info too)
 It is signed by the issuer, a Certificate
Authority (CA), such as VeriSign
M = (Alice, Alice’s public key), S = [M]CA
Alice’s Certificate = (M, S)
 Signature on certificate is verified using
CA’s public key
Must verify that M = {S}CA
Part 1  Cryptography 176
Certificate Authority
 Certificate authority (CA) is a trusted 3rd
party (TTP)  creates and signs certificates
 Verify signature to verify integrity & identity
of owner of corresponding private key
o Does not verify the identity of the sender of
certificate  certificates are public!
 Big problem if CA makes a mistake
o CA once issued Microsoft cert. to someone else
 A common format for certificates is X.509
Part 1  Cryptography 177
PKI
 Public Key Infrastructure (PKI): the stuff
needed to securely use public key crypto
o Key generation and management
o Certificate authority (CA) or authorities
o Certificate revocation lists (CRLs), etc.
 No general standard for PKI
 We mention 3 generic “trust models”
o We only discuss the CA (or CAs)
Part 1  Cryptography 178
PKI Trust Models
 Monopoly model
o One universally trusted organization is
the CA for the known universe
o Big problems if CA is ever compromised
o Who will act as CA ???
 System is useless if you don’t trust the CA!

Part 1  Cryptography 179


PKI Trust Models
 Oligarchy
o Multiple (as in, “a few”) trusted CAs
o This approach is used in browsers today
o Browser may have 80 or more CA
certificates, just to verify certificates!
o User can decide which CA or CAs to trust

Part 1  Cryptography 180


PKI Trust Models
 Anarchy model
o Everyone is a CA…
o Users must decide who to trust
o This approach used in PGP: “Web of trust”
 Why is it anarchy?
o Suppose certificate is signed by Frank and you
don’t know Frank, but you do trust Bob and Bob
says Alice is trustworthy and Alice vouches for
Frank. Should you accept the certificate?
 Many other trust models/PKI issues
Part 1  Cryptography 181
Confidentiality
in the Real World

Part 1  Cryptography 182


Symmetric Key vs Public Key
 Symmetric key +’s
o Speed
o No public key infrastructure (PKI) needed
(but have to generate/distribute keys)
 Public Key +’s
o Signatures (non-repudiation)
o No shared secret (but, do have to get
private keys to the right user…)
Part 1  Cryptography 183
Notation Reminder
 Public key notation
o Sign M with Alice’s private key
[M]Alice
o Encrypt M with Alice’s public key
{M}Alice
 Symmetric key notation
o Encrypt P with symmetric key K
C = E(P,K)
o Decrypt C with symmetric key K
P = D(C,K)
Part 1  Cryptography 184
Real World Confidentiality
 Hybrid cryptosystem
o Public key crypto to establish a key
o Symmetric key crypto to encrypt data…

I’m Alice, {K}Bob

E(Bob’s data, K)
E(Alice’s data, K)
Alice Bob

 Can Bob be sure he’s talking to Alice?


Part 1  Cryptography 185
Chapter 5: Hash Functions++
“I'm sure [my memory] only works one way.” Alice remarked.
“I can't remember things before they happen.”
“It's a poor sort of memory that only works backwards,”
the Queen remarked.
“What sort of things do you remember best?" Alice ventured to ask.
“Oh, things that happened the week after next,"
the Queen replied in a careless tone.
 Lewis Carroll, Through the Looking Glass

Part 1  Cryptography 186


Chapter 5: Hash Functions++
A boat, beneath a sunny sky
Lingering onward dreamily
In an evening of July 
Children three that nestle near,
Eager eye and willing ear,
...
 Lewis Carroll, Through the Looking Glass

Part 1  Cryptography 187


Hash Function Motivation
 Suppose Alice signs M
o Alice sends M and S = [M]Alice to Bob
o Bob verifies that M = {S}Alice
o Can Alice just send S?
 If M is big, [M]Alice costly to compute & send
 Suppose instead, Alice signs h(M), where h(M)
is a much smaller “fingerprint” of M
o Alice sends M and S = [h(M)]Alice to Bob
o Bob verifies that h(M) = {S}Alice

Part 1  Cryptography 188


Hash Function Motivation
 So, Alice signs h(M)
o That is, Alice computes S = [h(M)]Alice
o Alice then sends (M, S) to Bob
o Bob verifies that h(M) = {S}Alice
 What properties must h(M) satisfy?
o Suppose Trudy finds M’ so that h(M) = h(M’)
o Then Trudy can replace (M, S) with (M’, S)
 Does Bob detect this tampering?
o No, since h(M’) = h(M) = {S}Alice
Part 1  Cryptography 189
Crypto Hash Function
 Crypto hash function h(x) must provide
o Compression  output length is small
o Efficiency  h(x) easy to compute for any x
o One-way  given a value y it is infeasible to
find an x such that h(x) = y
o Weak collision resistance  given x and h(x),
infeasible to find y  x such that h(y) = h(x)
o Strong collision resistance  infeasible to find
any x and y, with x  y such that h(x) = h(y)
 Lots of collisions exist, but hard to find any
Part 1  Cryptography 190
Pre-Birthday Problem
 Suppose N people in a room
 How large must N be before the
probability someone has same
birthday as me is  1/2 ?
o Solve: 1/2 = 1  (364/365)N for N
o We find N = 253

Part 1  Cryptography 191


Birthday Problem
 How many people must be in a room before
probability is  1/2 that any two (or more)
have same birthday?
o 1  365/365  364/365   (365N+1)/365
o Set equal to 1/2 and solve: N = 23
 Surprising? A paradox?
 Maybe not: “Should be” about sqrt(365) since
we compare all pairs x and y
o And there are 365 possible birthdays

Part 1  Cryptography 192


Of Hashes and Birthdays
 If h(x) is N bits, then 2N different hash
values are possible
 So, if you hash about sqrt(2N) = 2N/2 values
then you expect to find a collision
 Implication? “Exhaustive search” attack…
o Secure N-bit hash requires 2N/2 work to “break”
o Recall that secure N-bit symmetric cipher has
work factor of 2N1
 Hash output length vs cipher key length?
Part 1  Cryptography 193
Non-crypto Hash (1)
 Data X = (X1,X2,X3,…,Xn), each Xi is a byte
 Define h(X) = (X1+X2+X3+…+Xn) mod 256
 Is this a secure cryptographic hash?
 Example: X = (10101010, 00001111)
 Hash is h(X) = 10111001
 If Y = (00001111, 10101010) then h(X) = h(Y)
 Easy to find collisions, so not secure…

Part 1  Cryptography 194


Non-crypto Hash (2)
 Data X = (X0,X1,X2,…,Xn-1)
 Suppose hash is defined as
h(X) = (nX1+(n1)X2+(n2)X3+…+2Xn-1+Xn) mod
256
 Is this a secure cryptographic hash?
 Note that
h(10101010, 00001111)  h(00001111,
10101010)
 But hash of (00000001, 00001111) is same
as hash of (00000000, 00010001)
 Not “secure”, but this hash is used in the
Part 1  Cryptography
(non-crypto) application rsync
195
Non-crypto Hash (3)
 Cyclic Redundancy Check (CRC)
 Essentially, CRC is the remainder in a long
division calculation
 Good for detecting burst errors
o Such random errors unlikely to yield a collision
 But easy to construct collisions
o In crypto, Trudy is the enemy, not “random”
 CRC has been mistakenly used where
crypto integrity check is required (e.g.,
WEP)
Part 1  Cryptography 196
Popular Crypto Hashes
 MD5  invented by Rivest (of course…)
o 128 bit output
o MD5 collisions easy to find, so it’s broken
 SHA-1  A U.S. government standard,
inner workings similar to MD5
o 160 bit output
 Many other hashes, but MD5 and SHA-1
are the most widely used
 Hashes work by hashing message in blocks

Part 1  Cryptography 197


Crypto Hash Design
 Desired property: avalanche effect
o Change to 1 bit of input should affect about
half of output bits
 Crypto hash functions consist of some
number of rounds
 Want security and speed
o “Avalanche effect” after few rounds
o But simple rounds
 Analogous to design of block ciphers
Part 1  Cryptography 198
Tiger Hash
 “Fast and strong”
 Designed by Ross Anderson and Eli
Biham  leading cryptographers
 Design criteria
o Secure
o Optimized for 64-bit processors
o Easy replacement for MD5 or SHA-1

Part 1  Cryptography 199


Tiger Hash
 Like MD5/SHA-1, input divided into 512 bit
blocks (padded)
 Unlike MD5/SHA-1, output is 192 bits
(three 64-bit words)
o Truncate output if replacing MD5 or SHA-1
 Intermediate rounds are all 192 bits
 4 S-boxes, each maps 8 bits to 64 bits
 A “key schedule” is used

Part 1  Cryptography 200


a b c
Xi Tiger Outer Round
F5 W  Input is X
key schedule o X = (X0,X1,…,Xn-1)
o X is padded
F7 W
o Each Xi is 512 bits
key schedule
 There are n iterations
F9 W of diagram at left
o One for each input block
  
a b c  Initial (a,b,c) constants
 Final (a,b,c) is hash
a b c
 Looks like block cipher!
Part 1  Cryptography 201
Tiger Inner Rounds
a b c
 Each Fm consists of
precisely 8 rounds fm,0 w0

 512 bit input W to Fm fm.1 w1


o W=(w0,w1,…,w7)
fm,2 w2
o W is one of the input
blocks Xi
 All lines are 64 bits
 The fm,i depend on the fm,7 w7
S-boxes (next slide)
a b c
Part 1  Cryptography 202
Tiger Hash: One Round
 Each fm,i is a function of a,b,c,wi and m
o Input values of a,b,c from previous round
o And wi is 64-bit block of 512 bit W
o Subscript m is multiplier
o And c = (c0,c1,…,c7)
 Output of fm,i is
o c = c  wi
o a = a  (S0[c0]  S1[c2]  S2[c4]  S3[c6])
o b = b + (S3[c1]  S2[c3]  S1[c5]  S0[c7])
o b=bm
 Each Si is S-box: 8 bits mapped to 64 bits

Part 1  Cryptography 203


Tiger Hash x0 = x0  (x7  0xA5A5A5A5A5A5A5A5)
Key Schedule x1 = x1  x0
x2 = x2  x1

 Input is X x3 = x3  (x2  ((~x1) << 19))


x4 = x4  x3
o X=(x0,x1,…,x7) x5 = x5 +x4
x6 = x6  (x5  ((~x4) >> 23))
 Small change x7 = x7  x6
in X will x0 = x0 +x7
x1 = x1  (x0  ((~x7) << 19))
produce large x2 = x2  x1
change in key x3 = x3 +x2
schedule x4 = x4  (x3  ((~x2) >> 23))

output x5 = x5  x4
x6 = x6 +x5
x7 = x7 (x6  0x0123456789ABCDEF)
Part 1  Cryptography 204
Tiger Hash Summary (1)
 Hash and intermediate values are 192 bits
 24 (inner) rounds
o S-boxes: Claimed that each input bit affects a,
b and c after 3 rounds
o Key schedule: Small change in message affects
many bits of intermediate hash values
o Multiply: Designed to ensure that input to S-box
in one round mixed into many S-boxes in next
 S-boxes, key schedule and multiply together
designed to ensure strong avalanche effect
Part 1  Cryptography 205
Tiger Hash Summary (2)
 Uses lots of ideas from block ciphers
o S-boxes
o Multiple rounds
o Mixed mode arithmetic
 At a higher level, Tiger employs
o Confusion
o Diffusion

Part 1  Cryptography 206


HMAC
 Can compute a MAC of the message M with
key K using a “hashed MAC” or HMAC
 HMAC is a keyed hash
o Why would we need a key?
 How to compute HMAC?
 Two obvious choices: h(K,M) and h(M,K)
 Which is better?

Part 1  Cryptography 207


HMAC
 Should we compute HMAC as h(K,M) ?
 Hashes computed in blocks
o h(B1,B2) = F(F(A,B1),B2) for some F and constant A
o Then h(B1,B2) = F(h(B1),B2)
 Let M’ = (M,X)
o Then h(K,M’) = F(h(K,M),X)
o Attacker can compute HMAC of M’ without K
 Is h(M,K) better?
o Yes, but… if h(M’) = h(M) then we might have
h(M,K)=F(h(M),K)=F(h(M’),K)=h(M’,K)
Part 1  Cryptography 208
Correct Way to HMAC
 Described in RFC 2104
 Let B be the block length of hash, in bytes
o B = 64 for MD5 and SHA-1 and Tiger
 ipad = 0x36 repeated B times
 opad = 0x5C repeated B times
 Then
HMAC(M,K) = h(K  opad, h(K  ipad, M))

Part 1  Cryptography 209


Hash Uses
 Authentication (HMAC)
 Message integrity (HMAC)
 Message fingerprint
 Data corruption detection
 Digital signature efficiency
 Anything you can do with symmetric crypto
 Also, many, many clever/surprising uses…

Part 1  Cryptography 210


Online Bids
 Suppose Alice, Bob and Charlie are bidders
 Alice plans to bid A, Bob B and Charlie C
 They don’t trust that bids will stay secret
 A possible solution?
o Alice, Bob, Charlie submit hashes h(A), h(B), h(C)
o All hashes received and posted online
o Then bids A, B, and C submitted and revealed
 Hashes don’t reveal bids (one way)
 Can’t change bid after hash sent (collision)
 But there is a serious flaw here…
Part 1  Cryptography 211
Hashing for Spam Reduction
 Spam reduction
 Before accept email, want proof that
sender had to “work” to create email
o Here, “work” == CPU cycles
 Goal is to limit the amount of email
that can be sent
o This approach will not eliminate spam
o Instead, make spam more costly to send
Part 1  Cryptography 212
Spam Reduction
 Let M = complete email message
R = value to be determined
T = current time
 Sender must determine R so that
h(M,R,T) = (00…0,X), that is,
initial N bits of hash value are all zero
 Sender then sends (M,R,T)
 Recipient accepts email, provided that…
h(M,R,T) begins with N zeros
Part 1  Cryptography 213
Spam Reduction
 Sender: h(M,R,T) begins with N zeros
 Recipient: verify that h(M,R,T) begins with
N zeros
 Work for sender: on average 2N hashes
 Work for recipient: always 1 hash
 Sender’s work increases exponentially in N
 Small work for recipient, regardless of N
 Choose N so that…
o Work acceptable for normal amounts of email
o Work is too high for spammers
Part 1  Cryptography 214
Secret Sharing

Part 1  Cryptography 215


Shamir’s Secret Sharing
Y
Two points determine a line
 Give (X0,Y0) to Alice

(X1,Y1) (X0,Y0)  Give (X1,Y1) to Bob


 Then Alice and Bob must
(0,S) cooperate to find secret S
 Also works in discrete case
X  Easy to make “m out of n”
2 out of 2 scheme for any m  n

Part 1  Cryptography 216


Shamir’s Secret Sharing
Y
 Give (X0,Y0) to Alice
(X0,Y0)  Give (X1,Y1) to Bob

(X1,Y1)  Give (X2,Y2) to Charlie


 Then any two can cooperate
(X2,Y2)

(0,S) to find secret S


 No one can determine S
X  A “2 out of 3” scheme
2 out of 3

Part 1  Cryptography 217


Shamir’s Secret Sharing
Y  Give (X0,Y0) to Alice
(X0,Y0)  Give (X1,Y1) to Bob
(X1,Y1)  Give (X2,Y2) to Charlie
(X2,Y2)  3 pts determine parabola
(0,S) Alice, Bob, and Charlie
must cooperate to find S
X  A “3 out of 3” scheme
3 out of 3
 What about “3 out of
4”?
Part 1  Cryptography 218
Secret Sharing Use?
 Key escrow  suppose it’s required that
your key be stored somewhere
 Key can be “recovered” with court order
 But you don’t trust FBI to store your keys
 We can use secret sharing
o Say, three different government agencies
o Two must cooperate to recover the key

Part 1  Cryptography 219


Secret Sharing Example
Y
 Your symmetric key is K
(X0,Y0)  Point (X0,Y0) to FBI

(X1,Y1)  Point (X1,Y1) to DoJ


 Point (X2,Y2) to DoC
(X2,Y2)

(0,K)  To recover your key K,


two of the three agencies
X must cooperate
 No one agency can get K

Part 1  Cryptography 220


Visual Cryptography
 Another form of secret sharing…
 Alice and Bob “share” an image
 Both must cooperate to reveal the image
 Nobody can learn anything about image
from Alice’s share or Bob’s share
o That is, both shares are required
 Is this possible?

Part 1  Cryptography 221


Visual Cryptography
 How to “share” a pixel?
 Suppose image is black and white

 Then each pixel


is either black
or white
 We split pixels
as shown

Part 1  Cryptography 222


Sharing Black & White Image
 If pixel is white, randomly choose a
or b for Alice’s/Bob’s shares
 If pixel is
black, randomly
choose c or d
 No information
in one “share”

Part 1  Cryptography 223


Visual Crypto Example
 Alice’s  Bob’s  Overlaid
share share shares

Part 1  Cryptography 224


Visual Crypto
 How does visual “crypto” compare to
regular crypto?
 In visual crypto, no key…
o Or, maybe both images are the key?
 With encryption, exhaustive search
o Except for the one-time pad
 Exhaustive search on visual crypto?
o No exhaustive search is possible!
Part 1  Cryptography 225
Visual Crypto
 Visual crypto  no exhaustive search…
 How does visual crypto compare to crypto?
o Visual crypto is “information theoretically”
secure  also true of secret sharing schemes
o With regular encryption, goal is to make
cryptanalysis computationally infeasible
 Visual crypto an example of secret sharing
o Not really a form of crypto, in the usual sense

Part 1  Cryptography 226


Random Numbers in
Cryptography

Part 1  Cryptography 227


Random Numbers
 Random numbers used to generate keys
o Symmetric keys
o RSA: Prime numbers
o Diffie Hellman: secret values
 Random numbers used for nonces
o Sometimes a sequence is OK
o But sometimes nonces must be random
 Random numbers also used in simulations,
statistics, etc.
o In such apps, need “statistically” random numbers
Part 1  Cryptography 228
Random Numbers
 Cryptographic random numbers must be
statistically random and unpredictable
 Suppose server generates symmetric keys
o Alice: KA
o Bob: KB
o Charlie: KC
o Dave: KD
 Alice, Bob, and Charlie don’t like Dave…
 Alice, Bob, and Charlie, working together,
must not be able to determine KD
Part 1  Cryptography 229
Non-random Random Numbers
 Online version of Texas Hold ‘em Poker
o ASF Software, Inc.

 Random numbers used to shuffle the deck


 Program did not produce a random shuffle
 A serious problem, or not?
Part 1  Cryptography 230
Card Shuffle
 There are 52! > 2225 possible shuffles
 The poker program used “random” 32-bit
integer to determine the shuffle
o So, only 232 distinct shuffles could occur
 Code used Pascal pseudo-random number
generator (PRNG): Randomize()
 Seed value for PRNG was function of
number of milliseconds since midnight
 Less than 227 milliseconds in a day
o So, less than 227 possible shuffles
Part 1  Cryptography 231
Card Shuffle
 Seed based on milliseconds since midnight
 PRNG re-seeded with each shuffle
 By synchronizing clock with server, number
of shuffles that need to be tested  218
 Could then test all 218 in real time
o Test each possible shuffle against “up” cards
 Attacker knows every card after the first
of five rounds of betting!

Part 1  Cryptography 232


Poker Example
 Poker program is an extreme example
o But common PRNGs are predictable
o Only a question of how many outputs must be
observed before determining the sequence
 Crypto random sequences not predictable
o For example, keystream from RC4 cipher
o But “seed” (or key) selection is still an issue!
 How to generate initial random values?
o Keys (and, in some cases, seed values)

Part 1  Cryptography 233


What is Random?
 True “random” is hard to define
 Entropy is a measure of randomness
 Good sources of “true” randomness
o Radioactive decay  but, radioactive
computers are not too popular
o Hardware devices  many good ones on
the market
o Lava lamp  relies on chaotic behavior

Part 1  Cryptography 234


Randomness
 Sources of randomness via software
o Software is supposed to be deterministic
o So, must rely on external “random” events
o Mouse movements, keyboard dynamics, network
activity, etc., etc.
 Can get quality random bits by such methods
 But quantity of bits is very limited
 Bottom line: “The use of pseudo-random
processes to generate secret quantities can
result in pseudo-security”
Part 1  Cryptography 235
Information Hiding

Part 1  Cryptography 236


Information Hiding
 Digital Watermarks
o Example: Add “invisible” info to data
o Defense against music/software piracy
 Steganography
o “Secret” communication channel
o Similar to a covert channel (more later)
o Example: Hide data in an image file
Part 1  Cryptography 237
Watermark
 Add a “mark” to data
 Visibility (or not) of watermarks
o Invisible  Watermark is not obvious
o Visible  Such as TOP SECRET
 Strength (or not) of watermarks
o Robust  Readable even if attacked
o Fragile  Damaged if attacked
Part 1  Cryptography 238
Watermark Examples
 Add robust invisible mark to digital music
o If pirated music appears on Internet, can trace
it back to original source of the leak
 Add fragile invisible mark to audio file
o If watermark is unreadable, recipient knows
that audio has been tampered with (integrity)
 Combinations of several types are
sometimes used
o E.g., visible plus robust invisible watermarks

Part 1  Cryptography 239


Watermark Example (1)
 Non-digital watermark: U.S. currency

 Image embedded in paper on rhs


o Hold bill to light to see embedded info
Part 1  Cryptography 240
Watermark Example (2)
 Add invisible watermark to photo
 Claim is that 1 inch2 contains enough
info to reconstruct entire photo
 If photo is damaged, watermark can
be used to reconstruct it!

Part 1  Cryptography 241


Steganography
 According to Herodotus (Greece 440 BC)
o Shaved slave’s head
o Wrote message on head
o Let hair grow back
o Send slave to deliver message
o Shave slave’s head to expose a message
warning of Persian invasion
 Historically, steganography used by
military more often than cryptography
Part 1  Cryptography 242
Images and Steganography
 Images use 24 bits for color: RGB
o 8 bits for red, 8 for green, 8 for blue
 For example
o 0x7E 0x52 0x90 is this color
o 0xFE 0x52 0x90 is this color
 While
o 0xAB 0x33 0xF0 is this color
o 0xAB 0x33 0xF1 is this color
 Low-order bits don’t matter…
Part 1  Cryptography 243
Images and Stego
 Given an uncompressed image file…
o For example, BMP format
 …we can insert information into low-order
RGB bits
 Since low-order RGB bits don’t matter,
changes will be “invisible” to human eye
o But, computer program can “see” the bits

Part 1  Cryptography 244


Stego Example 1

 Left side: plain Alice image


 Right side: Alice with entire Alice in
Wonderland (pdf) “hidden” in the image
Part 1  Cryptography 245
Non-Stego Example
 Walrus.html in web browser

 “View source” reveals:


<font color=#000000>"The time has come," the Walrus said,</font><br>
<font color=#000000>"To talk of many things: </font><br>
<font color=#000000>Of shoes and ships and sealing wax </font><br>
<font color=#000000>Of cabbages and kings </font><br>
<font color=#000000>And why the sea is boiling hot </font><br>
<font color=#000000>And whether pigs have wings." </font><br>

Part 1  Cryptography 246


Stego Example 2
 stegoWalrus.html in web browser

 “View source” reveals:


<font color=#000101>"The time has come," the Walrus said,</font><br>
<font color=#000100>"To talk of many things: </font><br>
<font color=#010000>Of shoes and ships and sealing wax </font><br>
<font color=#010000>Of cabbages and kings </font><br>
<font color=#000000>And why the sea is boiling hot </font><br>
<font color=#010001>And whether pigs have wings." </font><br>

 “Hidden” message: 011 010 100 100 000 101


Part 1  Cryptography 247
Steganography
 Some formats (e.g., image files) are more
difficult than html for humans to read
o But easy for computer programs to read…
 Easy to hide info in unimportant bits
 Easy to damage info in unimportant bits
 To be robust, must use important bits
o But stored info must not damage data
o Collusion attacks are also a concern
 Robust steganography is tricky!
Part 1  Cryptography 248
Information Hiding:
The Bottom Line
 Not-so-easy to hide digital information
o “Obvious” approach is not robust
o Stirmark: tool to make most watermarks in
images unreadable without damaging the image
o Stego/watermarking are active research topics
 If information hiding is suspected
o Attacker may be able to make
information/watermark unreadable
o Attacker may be able to read the information,
given the original document (image, audio, etc.)
Part 1  Cryptography 249
Chapter 6:
Advanced Cryptanalysis
For there is nothing covered, that shall not be revealed;
neither hid, that shall not be known.
 Luke 12:2

The magic words are squeamish ossifrage


 Solution to RSA challenge problem
posed in 1977 by Ron Rivest, who
estimated that breaking the message
would require 40 quadrillion years.
It was broken in 1994.

Part 1  Cryptography 250


Advanced Cryptanalysis
 Modern block cipher cryptanalysis
o Differential cryptanalysis
o Linear cryptanalysis
 Side channel attack on RSA
 Lattice reduction attack on knapsack
 Hellman’s TMTO attack on DES

Part 1  Cryptography 251


Linear and Differential
Cryptanalysis

Part 1  Cryptography 252


Introduction
 Both linear and differential cryptanalysis
developed to attack DES
 Applicable to other block ciphers

 Differential  Biham and Shamir, 1990


o Apparently known to NSA in 1970s
o For analyzing ciphers, not a practical attack
o A chosen plaintext attack
 Linear cryptanalysis  Matsui, 1993
o Perhaps not know to NSA in 1970s
o Slightly more feasible than differential
o A known plaintext attack
Part 1  Cryptography 253
L R
DES Overview
Linear stuff
 8 S-boxes
 Each S-box maps
XOR Ki subkey 6 bits to 4 bits
 Example: S-box 1

input bits (0,5)


S-boxes
 input bits (1,2,3,4)
| 0 1 2 3 4 5 6 7 8 9 A B C D E F
-----------------------------------
Linear stuff 0 | E 4 D 1 2 F B 8 3 A 6 C 5 9 0 7
1 | 0 F 7 4 E 2 D 1 A 6 C B 9 5 3 4
2 | 4 1 E 8 D 6 2 B F C 9 7 3 A 5 0
3 | F C 8 2 4 9 1 7 5 B 3 E A 0 6 D
L R
Part 1  Cryptography 254
Overview of Differential
Cryptanalysis

Part 1  Cryptography 255


Differential Cryptanalysis
 Recall that all of DES is linear except for
the S-boxes
 Differential attack focuses on overcoming
this nonlinearity
 Idea is to compare input and output
differences
 For simplicity, first consider only one
round and only one S-box

Part 1  Cryptography 256


Differential Cryptanalysis
 Suppose a cipher has 3-bit to 2-bit S-box
column
row 00 01 10 11
0 10 01 11 00
1 00 10 01 11

 Sbox(abc) is element in row a column bc


 Example: Sbox(010) = 11

Part 1  Cryptography 257


Differential Cryptanalysis
column
row 00 01 10 11
0 10 01 11 00
1 00 10 01 11

 Suppose X1 = 110, X2 = 010, K = 011


 Then X1  K = 101 and X2  K = 001
 Sbox(X1  K) = 10 and Sbox(X2  K) = 01

Part 1  Cryptography 258


column
row 00 01 10 11 Differential
0 10 01 11 00 Cryptanalysis
1 00 10 01 11

 Suppose
o Unknown key: K
o Known inputs: X = 110, X = 010
o Known outputs: Sbox(X  K) = 10, Sbox(X  K) =
01
 Know X  K  {000,101}, X  K  {001,110}
 Then K  {110,011}  {011,100}  K = 011
 Like a known plaintext attack on S-box
Part 1  Cryptography 259
Differential Cryptanalysis
 Attacking one S-box not very useful!
o And Trudy can’t always see input and output
 To make this work we must do 2 things
1. Extend the attack to one round
o Have to deal with all S-boxes
o Choose input so only one S-box “active”
2. Then extend attack to (almost) all rounds
o Output of one round is input to next round
o Choose input so output is “good” for next round
Part 1  Cryptography 260
Differential Cryptanalysis
 We deal with input and output differences
 Suppose we know inputs X and X
o For X the input to S-box is X  K
o For X the input to S-box is X  K
o Key K is unknown
o Input difference: (X  K)  (X  K) = X  X
 Input difference is independent of key K
 Output difference: Y  Y is (almost) input
difference to next round
 Goal is to “chain” differences thru rounds
Part 1  Cryptography 261
Differential Cryptanalysis
 If we obtain known output difference from
known input difference…
o May be able to chain differences thru rounds
o It’s OK if this only occurs with some probability
 If input difference is 0…
o …output difference is 0
o Allows us to make some S-boxes “inactive” with
respect to differences

Part 1  Cryptography 262


column
S-box row 00 01 10 11
Differential 0 10 01 11 00
Analysis 1 00 10 01 11
Sbox(X)  Sbox(X)
 Input diff 000
not interesting 00 01 10 11
 Input diff 010
000 8 0 0 0
always gives 001 0 0 4 4
output diff 01 X 010 0 8 0 0
 More biased,
 011 0 0 4 4
the better (for X 100 0 0 4 4
Trudy) 101 4 4 0 0
110 0 0 4 4
111 4 4 0 0
Part 1  Cryptography 263
Overview of Linear
Cryptanalysis

Part 1  Cryptography 264


Linear Cryptanalysis
 Like differential cryptanalysis, we target
the nonlinear part of the cipher
 But instead of differences, we
approximate the nonlinearity with linear
equations
 For DES-like cipher we need to
approximate S-boxes by linear functions
 How well can we do this?

Part 1  Cryptography 265


column
S-box row 00 01 10 11
Linear 0 10 01 11 00
Analysis 1 00 10 01 11
output
 Input x0x1x2
y0 y1 y0y1
where x0 is row
0 4 4 4
and x1x2 is column i x0 4 4 4
 Output y0y1 n x1 4 6 2
 Count of 4 is p x2 4 4 4
unbiased u x0x1 4 2 2
 Count of 0 or 8 t x0x2 0 4 4
is best for Trudy x1x2 4 6 6
x0x1x2 4 6 2
Part 1  Cryptography 266
column
Linear row 00 01 10 11
Analysis 0 10 01 11 00
 For example, 1 00 10 01 11
y1 = x1 output
with prob. 3/4 y0 y1 y0y1
 And 0 4 4 4
y0 = x0x21 i x0 4 4 4
with prob. 1 n x1 4 6 2
 And
p x2 4 4 4
u x0x1 4 2 2
y0y1=x1x2
t x0x2 0 4 4
with prob. 3/4 x1x2 4 6 6
x0x1x2 4 6 2
Part 1  Cryptography 267
Linear Cryptanalysis
 Consider a single DES S-box
 Let Y = Sbox(X)
 Suppose y3 = x2  x5 with high probability
o I.e., a good linear approximation to output y3
 Can we extend this so that we can solve
linear equations for the key?
 As in differential cryptanalysis, we need to
“chain” thru multiple rounds

Part 1  Cryptography 268


Linear Cryptanalysis of DES
 DES is linear except for S-boxes
 How well can we approximate S-boxes with
linear functions?
 DES S-boxes designed so there are no good
linear approximations to any one output bit
 But there are linear combinations of output
bits that can be approximated by linear
combinations of input bits

Part 1  Cryptography 269


Tiny DES

Part 1  Cryptography 270


Tiny DES (TDES)
 A much simplified version of DES
o 16 bit block
o 16 bit key
o 4 rounds
o 2 S-boxes, each maps 6 bits to 4 bits
o 12 bit subkey each round
 Plaintext = (L0, R0)
 Ciphertext = (L4, R4)
 No useless junk
Part 1  Cryptography 271
L R key
8 8 8

One
expand shift shift
8 12 8 8

XOR
Ki
compress
Round
12
6 6 of
TDES
8 8
SboxLeft SboxRight

4 4
8
XOR
8

L R key
Part 1  Cryptography 272
TDES Fun Facts
 TDES is a Feistel Cipher
 (L0,R0) = plaintext
 For i = 1 to 4
Li = Ri-1
Ri = Li-1  F(Ri-1, Ki)
 Ciphertext = (L4,R4)
 F(Ri-1, Ki) = Sboxes(expand(Ri-1)  Ki)
where Sboxes(x0x1x2…x11) =
(SboxLeft(x0x1…x5), SboxRight(x6x7…x11))

Part 1  Cryptography 273


TDES Key Schedule
 Key: K = k0k1k2k3k4k5k6k7k8k9k10k11k12k13k14k15
 Subkey
o Left: k0k1…k7 rotate left 2, select 0,2,3,4,5,7
o Right: k8k9…k15 rotate left 1, select 9,10,11,13,14,15
 Subkey K1 = k2k4k5k6k7k1k10k11k12k14k15k8
 Subkey K2 = k4k6k7k0k1k3k11k12k13k15k8k9
 Subkey K3 = k6k0k1k2k3k5k12k13k14k8k9k10
 Subkey K4 = k0k2k3k4k5k7k13k14k15k9k10k11

Part 1  Cryptography 274


TDES expansion perm
 Expansion permutation: 8 bits to 12 bits

r0r1r2r3r4r5r6r7

r4r7r2r1r5r7r0r2r6r5r0r3

 We can write this as


expand(r0r1r2r3r4r5r6r7) = r4r7r2r1r5r7r0r2r6r5r0r3

Part 1  Cryptography 275


TDES S-boxes
0 1 2 3 4 5 6 7 8 9 A B C D E F  Right S-box
0 C 5 0 A E 7 2 8 D 4 3 9 6 F 1 B  SboxRight
1 1 C 9 6 3 E B 2 F 8 4 5 D A 0 7
2 F A E 6 D 8 2 4 1 7 9 0 3 5 B C
3 0 A 3 C 8 2 1 E 9 7 F 6 B 5 D 4

0 1 2 3 4 5 6 7 8 9 A B C D E F
0 6 9 A 3 4 D 7 8 E 1 2 B 5 C F 0
 Left S-box 1 9 E B A 4 5 0 7 8 6 3 2 C D 1 F
 SboxLeft 2 8 1 C 2 D 3 E F 0 9 5 A 4 B 6 7
3 9 0 2 5 A D 6 E 1 8 B C 3 4 7 F

Part 1  Cryptography 276


Differential Cryptanalysis of
TDES

Part 1  Cryptography 277


TDES
 TDES SboxRight
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 C 5 0 A E 7 2 8 D 4 3 9 6 F 1 B
1 1 C 9 6 3 E B 2 F 8 4 5 D A 0 7
2 F A E 6 D 8 2 4 1 7 9 0 3 5 B C
3 0 A 3 C 8 2 1 E 9 7 F 6 B 5 D 4

 For X and X suppose X  X = 001000


 Then SboxRight(X)  SboxRight(X) = 0010
with probability 3/4

Part 1  Cryptography 278


Differential Crypt. of TDES
 The game plan…
 Select P and P so that
P  P = 0000 0000 0000 0010 = 0x0002
 Note that P and P differ in exactly 1 bit
 Let’s carefully analyze what happens as
these plaintexts are encrypted with TDES

Part 1  Cryptography 279


TDES
 If Y  Y = 001000 then with probability 3/4
SboxRight(Y)  SboxRight(Y) = 0010
 Y  Y = 001000  (YK)  (YK) = 001000
 If Y  Y = 000000 then for any S-box, we
have Sbox(Y)  Sbox(Y) = 0000
 Difference of (0000 0010) is expanded by
TDES expand perm to diff. (000000 001000)
 The bottom line: If X  X = 00000010 then
F(X, K)  F(X, K) = 00000010 with prob. 3/4

Part 1  Cryptography 280


TDES
 From the previous slide
o Suppose R  R = 0000 0010
o Suppose K is unknown key
o Then with probability 3/4
F(R,K)  F(R,K) = 0000 0010
 The bottom line? With probability 3/4…
o Input to next round same as current round
 So we can chain thru multiple rounds
Part 1  Cryptography 281
TDES Differential Attack
 Select P and P with P  P = 0x0002
(L0,R0) = P (L0,R0) = P P  P = 0x0002

L1 = R 0 L1 = R 0 With probability 3/4


R1 = L0  F(R0,K1) R1 = L0  F(R0,K1) (L1,R1)  (L1,R1) = 0x0202

L2 = R 1 L2 = R 1 With probability (3/4)2


R2 = L1  F(R1,K2) R2 = L1  F(R1,K2) (L2,R2)  (L2,R2) = 0x0200

L3 = R 2 L3 = R 2 With probability (3/4)2


R3 = L2  F(R2,K3) R3 = L2  F(R2,K3) (L3,R3)  (L3,R3) = 0x0002

L4 = R 3 L4 = R 3 With probability (3/4)3


R4 = L3  F(R3,K4) R4 = L3  F(R3,K4) (L4,R4)  (L4,R4) = 0x0202

C = (L4,R4) C = (L4,R4) C  C = 0x0202


Part 1  Cryptography 282
TDES Differential Attack
 Choose P and P with P  P = 0x0002
 If C  C = 0x0202 then
R4 = L3  F(R3, K4) R4 = L3  F(R3, K4)
R4 = L3  F(L4, K4) R4 = L3  F(L4, K4)
and (L3, R3)  (L3, R3) = 0x0002
 Then L3 = L3 and C=(L4, R4) and C=(L4, R4)
are both known
 Since L3 = R4  F(L4, K4) and L3 = R4  F(L4,
K4), for correct choice of subkey K4 we have
R4  F(L4, K4) = R4  F(L4, K4)
Part 1  Cryptography 283
TDES Differential Attack
 Choose P and P with P  P = 0x0002
 If C  C = (L4, R4)  (L4, R4) = 0x0202
 Then for the correct subkey K4
R4  F(L4, K4) = R4  F(L4, K4)
which we rewrite as
R4  R4 = F(L4, K4)  F(L4, K4)
where the only unknown is K4
 Let L4 = l0l1l2l3l4l5l6l7. Then we have
0010 = SBoxRight( l0l2l6l5l0l3 
k13k14k15k9k10k11)
 SBoxRight( l0l2l6l5l0l3  k13k14k15k9k10k11
Part 1  Cryptography 284)
TDES Differential Attack
Algorithm to find right 6 bits of subkey K4
count[i] = 0, for i = 0,1,. . .,63
for i = 1 to iterations
Choose P and P with P  P = 0x0002
Obtain corresponding C and C
if C  C = 0x0202
for K = 0 to 63
if 0010 == (SBoxRight( l0l2l6l5l0l3 K)  SBoxRight( l0l2l6l5l0l3 K))
++count[K]
end if
next K
end if
next i
All K with max count[K] are possible (partial) K4

Part 1  Cryptography 285


TDES Differential Attack
 Experimental results
 Choose 100 pairs P and P with P  P=
0x0002
 Found 47 of these give C  C = 0x0202
 Tabulated counts for these 47
o Max count of 47 for each
K  {000001,001001,110000,111000}
o No other count exceeded 39
 Implies that K4 is one of 4 values, that is,
k13k14k15k9k10k11 {000001,001001,110000,111000}
PartActual
 key
1  Cryptography is K=1010 1001 1000 0111 286
Linear Cryptanalysis of
TDES

Part 1  Cryptography 287


Linear Approx. of Left S-Box
 TDES left S-box or SboxLeft
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 6 9 A 3 4 D 7 8 E 1 2 B 5 C F 0
1 9 E B A 4 5 0 7 8 6 3 2 C D 1 F
2 8 1 C 2 D 3 E F 0 9 5 A 4 B 6 7
3 9 0 2 5 A D 6 E 1 8 B C 3 4 7 F

 Notation: y0y1y2y3 = SboxLeft(x0x1x2x3x4x5)


 For this S-box, y1=x2 and y2=x3 both with
probability 3/4
 Can we “chain” this thru multiple rounds?
Part 1  Cryptography 288
TDES Linear Relations
 Recall that the expansion perm is
expand(r0r1r2r3r4r5r6r7) = r4r7r2r1r5r7r0r2r6r5r0r3
 And y0y1y2y3 = SboxLeft(x0x1x2x3x4x5) with y1=x2 and
y2=x3 each with probability 3/4
 Also, expand(Ri1)  Ki is input to Sboxes at round i
 Then y1=r2km and y2=r1kn both with prob 3/4
 New right half is y0y1y2y3… plus old left half
 Bottom line: New right half bits: r1  r2  km  l1
and r2  r1  kn  l2 both with probability 3/4

Part 1  Cryptography 289


Recall TDES Subkeys
 Key: K = k0k1k2k3k4k5k6k7k8k9k10k11k12k13k14k15
 Subkey K1 = k2k4k5k6k7k1k10k11k12k14k15k8
 Subkey K2 = k4k6k7k0k1k3k11k12k13k15k8k9
 Subkey K3 = k6k0k1k2k3k5k12k13k14k8k9k10
 Subkey K4 = k0k2k3k4k5k7k13k14k15k9k10k11

Part 1  Cryptography 290


TDES Linear Cryptanalysis
 Known P=p0p1p2…p15 and C=c0c1c2…c15
(L0,R0) = (p0…p7,p8…p15) Bit 1, Bit 2 probability
(numbering from 0)
L1 = R 0 p9, p10 1
R1 = L0  F(R0,K1) p1p10k5, p2p9k6 3/4

L2 = R 1 p1p10k5, p2p9k6 3/4


R2 = L1  F(R1,K2) p2k6k7, p1k5k0 (3/4)2

L3 = R 2 p2k6k7, p1k5k0 (3/4)2


R3 = L2  F(R2,K3) p10k0k1, p9k7k2 (3/4)3

p10k0k1, p9k7k2 (3/4)3


L4 = R 3
R4 = L3  F(R3,K4)
k0  k1 = c1  p10 (3/4)3
C = (L4,R4) k7  k2 = c2  p9
(3/4)3
Part 1  Cryptography 291
TDES Linear Cryptanalysis
 Experimental results
 Use 100 known plaintexts, get ciphertexts.
o Let P=p0p1p2…p15 and let C=c0c1c2…c15
 Resulting counts
o c1  p10 = 0 occurs 38 times
o c1  p10 = 1 occurs 62 times
o c2  p9 = 0 occurs 62 times
o c2  p9 = 1 occurs 38 times
 Conclusions
o Since k0  k1 = c1  p10 we have k0  k1 = 1
o Since k7  k2 = c2  p9 we have k7  k2 = 0
 Actual key is K = 1010 0011 0101 0110
Part 1  Cryptography 292
To Build a Better Block Cipher…
 How can cryptographers make linear and
differential attacks more difficult?
1. More rounds  success probabilities diminish
with each round
2. Better confusion (S-boxes)  reduce success
probability on each round
3. Better diffusion (permutations)  more
difficult to chain thru multiple rounds
 Limited mixing and limited nonlinearity,
means that more rounds required: TEA
 Strong mixing and nonlinearity, then
fewer (but more complex) rounds: AES
Part 1  Cryptography 293
Side Channel Attack on RSA

Part 1  Cryptography 294


Side Channel Attacks
 Sometimes possible to recover key without
directly attacking the crypto algorithm
 A side channel consists of “incidental info”
 Side channels can arise due to
o The way that a computation is performed
o Media used, power consumed, emanations, etc.
 Induced faults can also reveal information
 Side channel may reveal a crypto key
 Paul Kocher one of the first in this field
Part 1  Cryptography 295
Types of Side Channels
 Emanations security (EMSEC)
o Electromagnetic field (EMF) from computer screen can
allow screen image to be reconstructed at a distance
o Smartcards have been attacked via EMF emanations
 Differential power analysis (DPA)
o Smartcard power usage depends on the computation
 Differential fault analysis (DFA)
o Key stored on smartcard in GSM system could be read
using a flashbulb to induce faults
 Timing analysis
o Different computations take different time
o RSA keys recovered over a network (openSSL)!
Part 1  Cryptography 296
The Scenario
 Alice’s public key: (N,e)
 Alice’s private key: d
 Trudy wants to find d
 Trudy can send any message M to Alice and
Alice will respond with Md mod N
o That is, Alice signs M and sends result to Trudy
 Trudy can precisely time Alice’s
computation of Md mod N
Part 1  Cryptography 297
Timing Attack on RSA
 Consider Md mod N
Repeated Squaring
 We want to find private
key d, where d = d0d1…dn x=M
 Spse repeated squaring for j = 1 to n
used for Md mod N x = mod(x2,N)
 Suppose, for efficiency if dj == 1 then
mod(x,N)
x = mod(xM,N)
if x >= N
x=x%N end if
end if next j
return x
return x

Part 1  Cryptography 298


Timing Attack Repeated Squaring
x=M
for j = 1 to n
 If dj = 0 then x = mod(x2,N)
o x = mod(x2,N) if dj == 1 then

 If dj = 1 then x = mod(xM,N)
end if
o x = mod(x2,N) next j
o x = mod(xM,N) return x
 Computation time
differs in each case mod(x,N)
if x >= N
 Can attacker take x=x%N
advantage of this? end if
return x
Part 1  Cryptography 299
Timing Attack Repeated Squaring
x=M
 Choose M with M3 < N
for j = 1 to n
 Choose M with M2 < N < M3 x = mod(x2,N)
 Let x = M and x = M if dj == 1 then
x = mod(xM,N)
 Consider j = 1
end if
o x = mod(x2,N) does no “%”
next j
o x = mod(xM,N) does no “%”
return x
o x = mod(x2,N) does no “%”
o x = mod(xM,N) does “%” only if d1=1
mod(x,N)
 If d1 = 1 then j = 1 step takes if x >= N
longer for M than for M
x=x%N
 But more than one round… end if
return x
Part 1  Cryptography 300
Timing Attack on RSA
 An example of a chosen plaintext attack
 Choose M0,M1,…,Mm-1 with
o Mi3 < N for i=0,1,…,m-1
 Let ti be time to compute Mid mod N
o t = (t0 + t1 + … + tm-1) / m
 Choose M0,M1,…,Mm-1 with
o Mi2 < N < Mi3 for i=0,1,…,m-1
 Let ti be time to compute Mid mod N
o t = (t0 + t1 + … + tm-1) / m
 If t > t then d1 = 1 otherwise d1 = 0
 Once d1 is known, find d2 then d3 then …

Part 1  Cryptography 301


Side Channel Attacks
 If crypto is secure Trudy looks for shortcut
 What is good crypto?
o More than mathematical analysis of algorithms
o Many other issues (such as side channels) must
be considered
o See Schneier’s article
 Lesson: Attacker’s don’t play by the rules!

Part 1  Cryptography 302


Knapsack Lattice Reduction
Attack

Part 1  Cryptography 303


Lattice?
 Many problems can be solved by
finding a “short” vector in a lattice
 Let b1,b2,…,bn be vectors in m
 All 1b1+2b2+…+nbn, each i is an
integer is a discrete set of points

Part 1  Cryptography 304


What is a Lattice?
 Suppose b1=[1,3]T and b2=[2,1]T
 Then any point in the plane can be written
as 1b1+2b2 for some 1,2  
o Since b1 and b2 are linearly independent
 We say the plane 2 is spanned by (b1,b2)
 If 1,2 are restricted to integers, the
resulting span is a lattice
 Then a lattice is a discrete set of points

Part 1  Cryptography 305


Lattice Example
 Suppose
b1=[1,3]T and
b2=[2,1]T
 The lattice
spanned by
(b1,b2) is
pictured to the
right

Part 1  Cryptography 306


Exact Cover
 Exact cover  given a set S and a
collection of subsets of S, find a
collection of these subsets with each
element of S is in exactly one subset
 Exact cover is can be solved by
finding a “short” vector in a lattice

Part 1  Cryptography 307


Exact Cover Example
 Set S = {0,1,2,3,4,5,6}
 Spse m = 7 elements and n = 13 subsets
Subset: 0 1 2 3 4 5 6 7 8 9 10 11 12
Elements: 013 015 024 025 036 124 126 135 146 1 256 345 346

 Find a collection of these subsets with each


element of S in exactly one subset
 Could try all 213 possibilities
 If problem is too big, try heuristic search
 Many different heuristic search techniques
Part 1  Cryptography 308
Exact Cover Solution
 Exact cover in matrix form
o Set S = {0,1,2,3,4,5,6}
o Spse m = 7 elements and n = 13 subsets
Subset: 0 1 2 3 4 5 6 7 8 9 10 11 12
Elements: 013 015 024 025 036 124 126 135 146 1 256 345 346

subsets
e
l
Solve: AU = B
e where ui  {0,1}
m
e
n Solution:
t
s U = [0001000001001]T
mxn mx1

Part 1  Cryptography
nx1 309
Example
 We can restate AU = B as MV = W where

Matrix M Vector V Vector W

 The desired solution is U


o Columns of M are linearly independent
 Let c0,c1,c2,…,cn be the columns of M
 Let v0,v1,v2,…,vn be the elements of V
 Then W = v0c0 + v1c1 + … + vncn

Part 1  Cryptography 310


Example
 Let L be the lattice spanned by
c0,c1,c2,…,cn (ci are the columns of M)
 Recall MV = W
o Where W = [U,0]T and we want to find U
o But if we find W, we’ve also solved it!
 Note W is in lattice L since all vi are
integers and W = v0c0 + v1c1 + … + vncn

Part 1  Cryptography 311


Facts
 W = [u0,u1,…,un-1,0,0,…,0]  L, each ui  {0,1}
 The length of a vector Y  N is
||Y|| = sqrt(y02+y12+…+yN-12)
 Then the length of W is
||W|| = sqrt(u02+u12+…+un-12)  sqrt(n)
 So W is a very short vector in L where
o First n entries of W all 0 or 1
o Last m elements of W are all 0
 Can we use these facts to find U?
Part 1  Cryptography 312
Lattice Reduction
 If we can find a short vector in L, with first
n entries all 0 or 1 and last m entries all 0…
o Then we might have found solution U
 LLL lattice reduction algorithm will
efficiently find short vectors in a lattice
 About 30 lines of pseudo-code specify LLL
 No guarantee LLL will find desired vector
 But probability of success is often good

Part 1  Cryptography 313


Knapsack Example
 What does lattice reduction have to do with
the knapsack cryptosystem?
 Suppose we have
o Superincreasing knapsack
S = [2,3,7,14,30,57,120,251]
o Suppose m = 41, n = 491  m1 = 12 mod n
o Public knapsack: ti = 41  si mod 491
T = [82,123,287,83,248,373,10,471]
 Public key: T Private key: (S,m1,n)
Part 1  Cryptography 314
Knapsack Example
 Public key: T Private key: (S,m1,n)
S = [2,3,7,14,30,57,120,251]
T = [82,123,287,83,248,373,10,471]
n = 491, m1 = 12
 Example: 10010110 is encrypted as
82+83+373+10 = 548
 Then receiver computes
548  12 = 193 mod 491
and uses S to solve for 10010110

Part 1  Cryptography 315


Knapsack LLL Attack
 Attacker knows public key
T = [82,123,287,83,248,373,10,471]
 Attacker knows ciphertext: 548
 Attacker wants to find ui  {0,1} s.t.
82u0+123u1+287u2+83u3+248u4+373u5+10u6+471u7=548

 This can be written as a matrix equation


(dot product): T  U = 548
Part 1  Cryptography 316
Knapsack LLL Attack
 Attacker knows: T = [82,123,287,83,248,373,10,471]
 Wants to solve: T  U = 548 where each ui  {0,1}
o Same form as AU = B on previous slides!
o We can rewrite problem as MV = W where

 LLL gives us short vectors in the lattice spanned by


the columns of M
Part 1  Cryptography 317
LLL Result
 LLL finds short vectors in lattice of M
 Matrix M’ is result of applying LLL to M

 Column marked with “” has the right form


 Possible solution: U = [1,0,0,1,0,1,1,0]T
 Easy to verify this is actually the plaintext
Part 1  Cryptography 318
Bottom Line
 Lattice reduction is a surprising
method of attack on knapsack
A cryptosystem is only secure as long
as nobody has found an attack
 Lesson: Advances in mathematics
can break cryptosystems!

Part 1  Cryptography 319


Hellman’s TMTO Attack

Part 1  Cryptography 320


Popcnt
 Before we consider Hellman’s attack,
consider a simple Time-Memory TradeOff
 “Population count” or popcnt
o Let x be a 32-bit integer
o Define popcnt(x) = number of 1’s in binary
expansion of x
o How to compute popcnt(x) efficiently?

Part 1  Cryptography 321


Simple Popcnt
 Most obvious thing to do is
popcnt(x) // assuming x is 32-bit value
t=0
for i = 0 to 31
t = t + ((x >> i) & 1)
next i
return t
end popcnt
 But is it the most efficient?
Part 1  Cryptography 322
More Efficient Popcnt
 Precompute popcnt for all 256 bytes
 Store precomputed values in a table
 Given x, lookup its bytes in this table
o Sum these values to find popcnt(x)
 Note that precomputation is done once
 Each popcnt now requires 4 steps, not 32

Part 1  Cryptography 323


More Efficient Popcnt
Initialize: table[i] = popcnt(i) for i = 0,1,…,255

popcnt(x) // assuming x is 32-bit value


p = table[ x & 0xff ]
+ table[ (x >> 8) & 0xff ]
+ table[ (x >> 16) & 0xff ]
+ table[ (x >> 24) & 0xff ]
return p
end popcnt

Part 1  Cryptography 324


TMTO Basics
 A precomputation
o One-time work
o Results stored in a table
 Precomputation results used to make each
subsequent computation faster
 Balancing “memory” and “time”
 In general, larger precomputation requires
more initial work and larger “memory” but
each subsequent computation is less “time”
Part 1  Cryptography 325
Block Cipher Notation
 Consider a block cipher
C = E(P, K)
where
P is plaintext block of size n
C is ciphertext block of size n
K is key of size k

Part 1  Cryptography 326


Block Cipher as Black Box

 For TMTO, treat block cipher as black box


 Details of crypto algorithm not important

Part 1  Cryptography 327


Hellman’s TMTO Attack
 Chosen plaintext attack: choose P and
obtain C, where C = E(P, K)
 Want to find the key K
 Two “obvious” approaches
1. Exhaustive key search
o “Memory” is 0, but “time” of 2k-1 for each attack
2. Pre-compute C = E(P, K) for all possible K
o Then given C, can simply look up key K in the table
o “Memory” of 2k but “time” of 0 for each attack

 TMTO lies between 1. and 2.

Part 1  Cryptography 328


Chain of Encryptions
 Assume block and key lengths equal: n = k
 Then a chain of encryptions is
SP = K0 = Starting Point
K1 = E(P, SP)
K2 = E(P, K1)
:
:
EP = Kt = E(P, Kt1) = End Point

Part 1  Cryptography 329


Encryption Chain

 Ciphertext used as key at next iteration


 Same (chosen) plaintext at each iteration

Part 1  Cryptography 330


Pre-computation
 Pre-compute m encryption chains, each
of length t +1
 Save only the start and end points
EP0
(SP0, EP0) SP0
EP1
(SP1, EP1) SP1

: EPm-1
SPm-1
(SPm-1, EPm-1)

Part 1  Cryptography 331


TMTO Attack
 Memory: Pre-compute encryption chains and
save (SPi, EPi) for i = 0,1,…,m1
o This is one-time work
 Then to attack a particular unknown key K
o For the same chosen P used to find chains, we
know C where C = E(P, K) and K is unknown key
o Time: Compute the chain (maximum of t steps)
X0 = C, X1 = E(P, X0), X2 = E(P, X1),…

Part 1  Cryptography 332


TMTO Attack
 Consider the computed chain
X0 = C, X1 = E(P, X0), X2 = E(P, X1),…
 Suppose for some i we find Xi = EPj
EPj
SPj C

 Since C = E(P, K) key K before C in


chain!
Part 1  Cryptography 333
TMTO Attack
 To summarize, we compute chain
X0 = C, X1 = E(P, X0), X2 = E(P, X1),…
 If for some i we find Xi = EPj
 Then reconstruct chain from SPj
Y0 = SPj, Y1 = E(P,Y0), Y2 = E(P,Y1),…
 Find C = Yti = E(P, Yti1) (always?)
 Then K = Yti1 (always?)

Part 1  Cryptography 334


Trudy’s Perfect World
 Suppose block cipher has k = 56
o That is, the key length is 56 bits
 Suppose we find m = 228 chains, each of
length t = 228 and no chains overlap
 Memory: 228 pairs (SPj, EPi)
 Time: about 228 (per attack)
o Start at C, find some EPj in about 227 steps
o Find K with about 227 more steps
 Attack never fails
Part 1  Cryptography 335
Trudy’s Perfect World
 No chains overlap
 Any ciphertext C is in some chain

SP0 EP0

C EP1
SP1
K
EP2
SP2

Part 1  Cryptography 336


The Real World
 Chains are not so well-behaved!
 Chains can cycle and merge

K C
EP

SP

 Chain from C goes to EP


 Chain from SP to EP does not contain K
 Is this Trudy’s nightmare?
Part 1  Cryptography 337
Real-World TMTO Issues
 Merging, cycles, false alarms, etc.
 Pre-computation is lots of work
o Must attack many times to make it worthwhile
 Success is not assured
o Probability depends on initial work
 What if block size not equal key length?
o This is easy to deal with
 What is the probability of success?
o This is not so easy to compute
Part 1  Cryptography 338
To Reduce Merging
 Compute chain as F(E(P, Ki1)) where F
permutes the bits
 Chains computed using different functions
can intersect, but they will not merge

SP0
F0 chain
EP1

SP1 F1 chain
EP0

Part 1  Cryptography 339


Hellman’s TMTO in Practice
 Let
o m = random starting points for each F
o t = encryptions in each chain
o r = number of “random” functions F
 Then mtr = total precomputed chain elements
 Pre-computation is O(mtr) work
 Each TMTO attack requires
o O(mr) “memory” and O(tr) “time”
 If we choose m = t = r = 2k/3 then
o Probability of success is at least 0.55
Part 1  Cryptography 340
TMTO: The Bottom Line
 Attack is feasible against DES
 Pre-computation is about 256 work
 Each attack requires about
o 237 “memory”
o 237 “time”
 Attack is not particular to DES
 No fancy math is required!
 Lesson: Clever algorithms can break crypto!
Part 1  Cryptography 341
Crypto Summary
 Terminology
 Symmetric key crypto
o Stream ciphers
 A5/1 and RC4
o Block ciphers
 DES, AES, TEA
 Modes of operation
 Integrity

Part 1  Cryptography 342


Crypto Summary
 Public key crypto
o Knapsack
o RSA
o Diffie-Hellman
o ECC
o Non-repudiation
o PKI, etc.

Part 1  Cryptography 343


Crypto Summary
 Hashing
o Birthday problem
o Tiger hash
o HMAC
 Secret sharing
 Random numbers

Part 1  Cryptography 344


Crypto Summary
 Information hiding
o Steganography
o Watermarking
 Cryptanalysis
o Linear and differential cryptanalysis
o RSA timing attack
o Knapsack attack
o Hellman’s TMTO
Part 1  Cryptography 345
Coming Attractions…
 Access Control
o Authentication -- who goes there?
o Authorization -- can you do that?
 We’ll see some crypto in next chapter
 We’llsee lots of crypto in protocol
chapters

Part 1  Cryptography 346


Part II: Access Control

Part 2  Access Control 347


Access Control
 Two parts to access control…
 Authentication: Are you who you say you are?
o Determine whether access is allowed or not
o Authenticate human to machine
o Or, possibly, machine to machine
 Authorization: Are you allowed to do that?
o Once you have access, what can you do?
o Enforces limits on actions
 Note: “access control” often used as synonym
for authorization
Part 2  Access Control 348
Chapter 7: Authentication
Guard: Halt! Who goes there?
Arthur: It is I, Arthur, son of Uther Pendragon,
from the castle of Camelot. King of the Britons,
defeater of the Saxons, sovereign of all England!
 Monty Python and the Holy Grail

Then said they unto him, Say now Shibboleth:


and he said Sibboleth: for he could not frame to pronounce it right.
Then they took him, and slew him at the passages of Jordan:
and there fell at that time of the Ephraimites forty and two thousand.
 Judges 12:6

Part 2  Access Control 349


Are You Who You Say You Are?
 Authenticate a human to a machine?
 Can be based on…
o Something you know
 For example, a password
o Something you have
 For example, a smartcard
o Something you are
 For example, your fingerprint

Part 2  Access Control 350


Something You Know
 Passwords

 Lots of things act as passwords!


o PIN
o Social security number
o Mother’s maiden name
o Date of birth
o Name of your pet, etc.

Part 2  Access Control 351


Trouble with Passwords
 “Passwords are one of the biggest practical
problems facing security engineers today.”
 “Humans are incapable of securely storing
high-quality cryptographic keys, and they
have unacceptable speed and accuracy when
performing cryptographic operations. (They
are also large, expensive to maintain, difficult
to manage, and they pollute the environment.
It is astonishing that these devices continue
to be manufactured and deployed.)”
Part 2  Access Control 352
Why Passwords?
 Why is “something you know” more
popular than “something you have” and
“something you are”?
 Cost: passwords are free
 Convenience: easier for sysadmin to
reset pwd than to issue a new thumb

Part 2  Access Control 353


Keys vs Passwords

 Crypto keys  Passwords


 Spse key is 64 bits  Spse passwords are 8
characters, and 256
 Then 264 keys
different characters
 Choose key at  Then 2568 = 264 pwds
random…  Users do not select
 …then attacker must passwords at random
try about 263 keys  Attacker has far less
than 263 pwds to try
(dictionary attack)
Part 2  Access Control 354
Good and Bad Passwords
 Bad passwords  Good Passwords?
o frank o jfIej,43j-EmmL+y
o Fido o 0986437653726
o Password 3
o incorrect o P0kem0N
o Pikachu o FSa7Yago
o 102560 o 0nceuP0nAt1m8
o AustinStamp
o PokeGCTall150

Part 2  Access Control 355


Password Experiment
 Three groups of users  each group
advised to select passwords as follows
o Group A: At least 6 chars, 1 non-letter
winner o Group B: Password based on passphrase
o Group C: 8 random characters
 Results
o Group A: About 30% of pwds easy to crack
o Group B: About 10% cracked
 Passwords easy to remember
o Group C: About 10% cracked
 Passwords hard to remember
Part 2  Access Control 356
Password Experiment
 User compliance hard to achieve
 In each case, 1/3rd did not comply
o And about 1/3rd of those easy to crack!
 Assigned passwords sometimes best
 If passwords not assigned, best advice is…
o Choose passwords based on passphrase
o Use pwd cracking tool to test for weak pwds
 Require periodic password changes?
Part 2  Access Control 357
Attacks on Passwords
 Attacker could…
o Target one particular account
o Target any account on system
o Target any account on any system
o Attempt denial of service (DoS) attack
 Common attack path
o Outsider  normal user  administrator
o May only require one weak password!

Part 2  Access Control 358


Password Retry
 Suppose system locks after 3 bad
passwords. How long should it lock?
o 5 seconds
o 5 minutes
o Until SA restores service
 What are +’s and -’s of each?

Part 2  Access Control 359


Password File?
 Bad idea to store passwords in a file
 But we need to verify passwords
 Solution? Hash passwords
o Store y = h(password)
o Can verify entered password by hashing
o If Trudy obtains the password file, she
does not (directly) obtain passwords
 But Trudy can try a forward search
o Guess x and check whether y = h(x)
Part 2  Access Control 360
Dictionary Attack
 Trudy pre-computes h(x) for all x in a
dictionary of common passwords
 Suppose Trudy gets access to password
file containing hashed passwords
o She only needs to compare hashes to her pre-
computed dictionary
o After one-time work of computing hashes in
dictionary, actual attack is trivial
 Can we prevent this forward search
attack? Or at least make it more difficult?
Part 2  Access Control 361
Salt
 Hash password with salt
 Choose random salt s and compute
y = h(password, s)
and store (s,y) in the password file
 Note that the salt s is not secret
o Analogous to IV
 Still easy to verify salted password
 But lots more work for Trudy
o Why?

Part 2  Access Control 362


Password Cracking:
Do the Math
 Assumptions:
 Pwds are 8 chars, 128 choices per character
o Then 1288 = 256 possible passwords
 There is a password file with 210 pwds
 Attacker has dictionary of 220 common pwds
 Probability 1/4 that password is in dictionary
 Work is measured by number of hashes

Part 2  Access Control 363


Password Cracking: Case I
 Attack 1 specific password without
using a dictionary
o E.g., administrator’s password
o Must try 256/2 = 255 on average
o Like exhaustive key search
 Does salt help in this case?

Part 2  Access Control 364


Password Cracking: Case II
 Attack 1 specific password with
dictionary
 With salt
o Expected work: 1/4 (219) + 3/4 (255) ≈ 254.6
o In practice, try all pwds in dictionary…
o …then work is at most 220 and probability of
success is 1/4
 What if no salt is used?
o One-time work to compute dictionary: 220
o Expected work is of same order as above
o But with precomputed dictionary hashes, the
“in practice” attack is essentially free…
Part 2  Access Control 365
Password Cracking: Case III
 Any of 1024 pwds in file, without dictionary
o Assume all 210 passwords are distinct
o Need 255 comparisons before expect to find pwd
 If no salt is used
o Each computed hash yields 210 comparisons
o So expected work (hashes) is 255/210 = 245
 If salt is used
o Expected work is 255
o Each comparison requires a hash computation
Part 2  Access Control 366
Password Cracking: Case IV
 Any of 1024 pwds in file, with dictionary
o Prob. one or more pwd in dict.: 1 – (3/4)1024 ≈ 1
o So, we ignore case where no pwd is in dictionary
 If salt is used, expected work less than 222
o See book, or slide notes for details
o Work ≈ size of dictionary / P(pwd in dictionary)
 What if no salt is used?
o If dictionary hashes not precomputed, work is
about 219/210 = 29

Part 2  Access Control 367


Other Password Issues
 Too many passwords to remember
o Results in password reuse
o Why is this a problem?
 Who suffers from bad password?
o Login password vs ATM PIN
 Failure to change default passwords
 Social engineering
 Error logs may contain “almost” passwords
 Bugs, keystroke logging, spyware, etc.

Part 2  Access Control 368


Passwords
 The bottom line…
 Password attacks are too easy
o Often, one weak password will break security
o Users choose bad passwords
o Social engineering attacks, etc.
 Trudy has (almost) all of the advantages
 All of the math favors bad guys
 Passwords are a BIG security problem
o And will continue to be a problem
Part 2  Access Control 369
Password Cracking Tools
 Popular password cracking tools
o Password Crackers
o Password Portal
o L0phtCrack and LC4 (Windows)
o John the Ripper (Unix)
 Admins should use these tools to test for
weak passwords since attackers will
 Good articles on password cracking
o Passwords - Conerstone of Computer Security
o Passwords revealed by sweet deal
Part 2  Access Control 370
Biometrics

Part 2  Access Control 371


Something You Are
 Biometric
o “You are your key”  Schneier
 Examples
o Fingerprint
o Handwritten signature Are
o Facial recognition Know Have
o Speech recognition
o Gait (walking) recognition
o “Digital doggie” (odor recognition)
o Many more!
Part 2  Access Control 372
Why Biometrics?
 May be better than passwords
 But, cheap and reliable biometrics needed
o Today, an active area of research
 Biometrics are used in security today
o Thumbprint mouse
o Palm print for secure entry
o Fingerprint to unlock car door, etc.
 But biometrics not really that popular
o Has not lived up to its promise/hype (yet?)

Part 2  Access Control 373


Ideal Biometric
 Universal  applies to (almost) everyone
o In reality, no biometric applies to everyone
 Distinguishing  distinguish with certainty
o In reality, cannot hope for 100% certainty
 Permanent  physical characteristic being
measured never changes
o In reality, OK if it to remains valid for long time
 Collectable  easy to collect required data
o Depends on whether subjects are cooperative
 Also, safe, user-friendly, and ???
Part 2  Access Control 374
Identification vs Authentication
 Identification  Who goes there?
o Compare one-to-many
o Example: FBI fingerprint database
 Authentication  Are you who you say you are?
o Compare one-to-one
o Example: Thumbprint mouse
 Identification problem is more difficult
o More “random” matches since more comparisons
 We are (mostly) interested in authentication
Part 2  Access Control 375
Enrollment vs Recognition
 Enrollment phase
o Subject’s biometric info put into database
o Must carefully measure the required info
o OK if slow and repeated measurement needed
o Must be very precise
o May be a weak point in real-world use
 Recognition phase
o Biometric detection, when used in practice
o Must be quick and simple
o But must be reasonably accurate
Part 2  Access Control 376
Cooperative Subjects?
 Authentication  cooperative subjects
 Identification  uncooperative subjects
 For example, facial recognition
o Used in Las Vegas casinos to detect known
cheaters (also, terrorists in airports, etc.)
o Often, less than ideal enrollment conditions
o Subject will try to confuse recognition phase
 Cooperative subject makes it much easier
o We are focused on authentication
o So, we can assume subjects are cooperative

Part 2  Access Control 377


Biometric Errors
 Fraud rate versus insult rate
o Fraud  Trudy mis-authenticated as Alice
o Insult  Alice not authenticated as Alice
 For any biometric, can decrease fraud or
insult, but other one will increase
 For example
o 99% voiceprint match  low fraud, high insult
o 30% voiceprint match  high fraud, low insult
 Equal error rate: rate where fraud == insult
o A way to compare different biometrics
Part 2  Access Control 378
Fingerprint History
 1823  Professor Johannes Evangelist
Purkinje discussed 9 fingerprint patterns
 1856  Sir William Hershel used
fingerprint (in India) on contracts
 1880  Dr. Henry Faulds article in Nature
about fingerprints for ID
 1883  Mark Twain’s Life on the
Mississippi (murderer ID’ed by fingerprint)

Part 2  Access Control 379


Fingerprint History
 1888  Sir Francis Galton developed
classification system
o His system of “minutia” can be used today
o Also verified that fingerprints do not change
 Some countries require fixed number of
“points” (minutia) to match in criminal cases
o In Britain, at least 15 points
o In US, no fixed number of points

Part 2  Access Control 380


Fingerprint Comparison
 Examples of loops, whorls, and arches
 Minutia extracted from these features

Loop (double) Whorl Arch

Part 2  Access Control 381


Fingerprint: Enrollment

 Capture image of fingerprint


 Enhance image
 Identify “points”
Part 2  Access Control 382
Fingerprint: Recognition

 Extracted points are compared with


information stored in a database
 Is it a statistical match?
 Aside: Do identical twins’ fingerprints differ?
Part 2  Access Control 383
Hand Geometry
 A popular biometric
 Measures shape of hand
o Width of hand, fingers
o Length of fingers, etc.
 Human hands not so unique
 Hand geometry sufficient
for many situations
 OK for authentication
 Not useful for ID problem

Part 2  Access Control 384


Hand Geometry
 Advantages
o Quick  1 minute for enrollment,
5 seconds for recognition
o Hands are symmetric  so what?
 Disadvantages
o Cannot use on very young or very old
o Relatively high equal error rate

Part 2  Access Control 385


Iris Patterns

 Iris pattern development is “chaotic”


 Little or no genetic influence
 Even for identical twins, uncorrelated
 Pattern is stable through lifetime
Part 2  Access Control 386
Iris Recognition: History
 1936  suggested by ophthalmologist
 1980s  James Bond film(s)
 1986  first patent appeared
 1994  John Daugman patents new-
and-improved technique
o Patents owned by Iridian Technologies

Part 2  Access Control 387


Iris Scan
 Scanner locates iris
 Take b/w photo
 Use polar coordinates…
 2-D wavelet transform
 Get 256 byte iris code

Part 2  Access Control 388


Measuring Iris Similarity
 Based on Hamming distance
 Define d(x,y) to be
o # of non-match bits / # of bits compared
o d(0010,0101) = 3/4 and d(101111,101001) = 1/3
 Compute d(x,y) on 2048-bit iris code
o Perfect match is d(x,y) = 0
o For same iris, expected distance is 0.08
o At random, expect distance of 0.50
o Accept iris scan as match if distance < 0.32
Part 2  Access Control 389
Iris Scan Error Rate
distance Fraud rate

0.29 1 in 1.31010
0.30 1 in 1.5109
0.31 1 in 1.8108
0.32 1 in 2.6107
0.33 1 in 4.0106
0.34 1 in 6.9105
0.35 1 in 1.3105
== equal error rate
Part 2  Access Control
distance 390
Attack on Iris Scan
 Good photo of eye can be scanned
o Attacker could use photo of eye
 Afghan woman was authenticated by
iris scan of old photo
o Story can be found here
 To prevent attack, scanner could use
light to be sure it is a “live” iris

Part 2  Access Control 391


Equal Error Rate Comparison
 Equal error rate (EER): fraud == insult rate
 Fingerprint biometrics used in practice have
EER ranging from about 10-3 to as high as 5%
 Hand geometry has EER of about 10-3
 In theory, iris scan has EER of about 10-6
o Enrollment phase may be critical to accuracy
 Most biometrics much worse than fingerprint!
 Biometrics useful for authentication…
o …but for identification, not so impressive today

Part 2  Access Control 392


Biometrics: The Bottom Line
 Biometrics are hard to forge
 But attacker could
o Steal Alice’s thumb
o Photocopy Bob’s fingerprint, eye, etc.
o Subvert software, database, “trusted path” …
 And how to revoke a “broken” biometric?
 Biometrics are not foolproof
 Biometric use is relatively limited today
 That should change in the (near?) future

Part 2  Access Control 393


Something You Have
 Something in your possession
 Examples include following…
o Car key
o Laptop computer (or MAC address)
o Password generator (next)
o ATM card, smartcard, etc.

Part 2  Access Control 394


Password Generator
1. “I’m Alice”
3. PIN, R
2. R
4. h(K,R)
password
generator 5. h(K,R)
K Alice Bob, K
 Alice receives random “challenge” R from Bob
 Alice enters PIN and R in password generator
 Password generator hashes symmetric key K with R
 Alice sends “response” h(K,R) back to Bob
 Bob verifies response
 Note: Alice has pwd generator and knows PIN
Part 2  Access Control 395
2-factor Authentication
 Requires any 2 out of 3 of
o Something you know
o Something you have
o Something you are
 Examples
o ATM: Card and PIN
o Credit card: Card and signature
o Password generator: Device and PIN
o Smartcard with password/PIN

Part 2  Access Control 396


Single Sign-on
 A hassle to enter password(s) repeatedly
o Alice would like to authenticate only once
o “Credentials” stay with Alice wherever she goes
o Subsequent authentications transparent to Alice
 Kerberos  a single sign-on protocol
 Single sign-on for the Internet?
o Microsoft: Passport
o Everybody else: Liberty Alliance
o Security Assertion Markup Language (SAML)

Part 2  Access Control 397


Web Cookies
 Cookie is provided by a Website and stored
on user’s machine
 Cookie indexes a database at Website
 Cookies maintain state across sessions
o Web uses a stateless protocol: HTTP
o Cookies also maintain state within a session
 Sorta like a single sign-on for a website
o But, very, very weak form of authentication
 Cookies also create privacy concerns
Part 2  Access Control 398
Authorization

Part 2  Access Control 399


Chapter 8: Authorization
It is easier to exclude harmful passions than to rule them,
and to deny them admittance
than to control them after they have been admitted.
 Seneca

You can always trust the information given to you


by people who are crazy;
they have an access to truth not available through regular channels.
 Sheila Ballantyne

Part 2  Access Control 400


Authentication vs
Authorization
 Authentication  Are you who you say you are?
o Restrictions on who (or what) can access system

 Authorization  Are you allowed to do that?


o Restrictions on actions of authenticated users

 Authorization is a form of access control


 But first, we look at system certification…

Part 2  Access Control 401


System Certification
 Government attempt to certify
“security level” of products
 Of historical interest
o Sorta like a history of authorization
 Still important today if you want to
sell a product to the government
o Tempting to argue it’s a failure since
government is so insecure, but…
Part 2  Access Control 402
Orange Book
 Trusted Computing System Evaluation
Criteria (TCSEC), 1983
o Universally known as the “orange book”
o Name is due to color of it’s cover
o About 115 pages
o Developed by U.S. DoD (NSA)
o Part of the “rainbow series”
 Orange book generated a pseudo-religious
fervor among some people
o Less and less intensity as time goes by
Part 2  Access Control 403
Orange Book Outline
 Goals
o Provide way to assess security products
o Provide general guidance/philosophy on
how to build more secure products
 Four divisions labeled D thru A
o D is lowest, A is highest
 Divisions split into numbered classes

Part 2  Access Control 404


D and C Divisions
D  minimal protection
o Losers that can’t get into higher division
C  discretionary protection, i.e.,
don’t enforce security, just have
means to detect breaches (audit)
o C1  discretionary security protection
o C2  controlled access protection
o C2 slightly stronger than C1 (both vague)
Part 2  Access Control 405
B Division
B  mandatory protection
 B is a huge step up from C
o C: break security, you might get caught
o B: “mandatory”, so you can’t break it
 B1  labeled security protection
o All data labeled, which restricts what
can be done with it
o This access control cannot be violated
Part 2  Access Control 406
B and A Divisions
 B2  structured protection
o Adds covert channel protection onto B1
 B3  security domains
o On top of B2 protection, adds that code
must be tamperproof and “small”
A  verified protection
o Like B3, but proved using formal methods
o Such methods still (mostly) impractical
Part 2  Access Control 407
Orange Book: Last Word
 Also a 2nd part, discusses rationale
 Not very practical or sensible, IMHO
 But some people insist we’d be better
off if we’d followed it
 Others think it was a dead end
o And resulted in lots of wasted effort
o Aside… people who made the orange
book, now set security education
standards
Part 2  Access Control 408
Common Criteria
 Successor to the orange book (ca. 1998)
o Due to inflation, more than 1000 pages
 An international government standard
o And it reads like it…
o Won’t ever stir same passions as orange book
 CC is relevant in practice, but usually only
if you want to sell to the government
 Evaluation Assurance Levels (EALs)
o 1 thru 7, from lowest to highest security

Part 2  Access Control 409


EAL
 Note: product with high EAL may not be
more secure than one with lower EAL
o Why?
 Similarly,product with an EAL may not
be any more secure than one without
o Why?

Part 2  Access Control 410


EAL 1 thru 7
 EAL1 functionally tested
 EAL2  structurally tested
 EAL3  methodically tested, checked
 EAL4  designed, tested, reviewed
 EAL5  semiformally designed, tested
 EAL6  verified, designed, tested
 EAL7  formally … (blah blah blah)

Part 2  Access Control 411


Common Criteria
 EAL4 is most commonly sought
o Minimum needed to sell to government
 EAL7 requires formal proofs
o Author could only find 2 EAL7 products…
 Who performs evaluations?
o Government accredited labs, of course
(for a hefty fee, like 6 figures)

Part 2  Access Control 412


Authentication vs
Authorization
 Authentication  Are you who you say you are?
o Restrictions on who (or what) can access system
 Authorization  Are you allowed to do that?
o Restrictions on actions of authenticated users
 Authorization is a form of access control
 Classic view of authorization…
o Access Control Lists (ACLs)
o Capabilities (C-lists)

Part 2  Access Control 413


Lampson’s Access Control Matrix
 Subjects (users) index the rows
 Objects (resources) index the columns
Accounting Accounting Insurance Payroll
OS program data data data

Bob rx rx r  

Alice rx rx r rw rw

Sam rwx rwx r rw rw


Accounting
program rx rx rw rw rw
Part 2  Access Control 414
Are You Allowed to Do That?
 Access control matrix has all relevant info
 Could be 100’s of users, 10,000’s of resources
o Then matrix has 1,000,000’s of entries
 How to manage such a large matrix?
 Note: We need to check this matrix before
access to any resource by any user
 How to make this more efficient/practical?

Part 2  Access Control 415


Access Control Lists (ACLs)
 ACL: store access control matrix by column
 Example: ACL for insurance data is in blue
Accounting Accounting Insurance Payroll
OS program data data data

Bob rx rx r  

Alice rx rx r rw rw

Sam rwx rwx r rw rw


Accounting
program rx rx rw rw rw
Part 2  Access Control 416
Capabilities (or C-Lists)
 Store access control matrix by row
 Example: Capability for Alice is in red
Accounting Accounting Insurance Payroll
OS program data data data

Bob rx rx r  

Alice rx rx r rw rw

Sam rwx rwx r rw rw


Accounting
program rx rx rw rw rw
Part 2  Access Control 417
ACLs vs Capabilities
r r
Alice --- file1 Alice w file1
r rw

w ---
Bob r file2 Bob r file2
--- r

rw r
Fred r file3 Fred --- file3
r r

Access Control List Capability

 Note that arrows point in opposite directions…


 With ACLs, still need to associate users to files
Part 2  Access Control 418
Confused Deputy
 Two resources  Access control matrix
o Compiler and BILL
file (billing info) Compiler BILL
 Compiler can write Alice x 
file BILL
 Alice can invoke Compiler rx rw
compiler with a
debug filename
 Alice not allowed to
write to BILL

Part 2  Access Control 419


ACL’s and Confused Deputy

Compiler

Alice BILL

 Compiler is deputy acting on behalf of Alice


 Compiler is confused
o Alice is not allowed to write BILL
 Compiler has confused its rights with Alice’s
Part 2  Access Control 420
Confused Deputy
 Compiler acting for Alice is confused
 There has been a separation of authority
from the purpose for which it is used
 With ACLs, more difficult to prevent this
 With Capabilities, easier to prevent problem
o Must maintain association between authority and
intended purpose
 Capabilities  easy to delegate authority

Part 2  Access Control 421


ACLs vs Capabilities
 ACLs
o Good when users manage their own files
o Protection is data-oriented
o Easy to change rights to a resource
 Capabilities
o Easy to delegate  avoid the confused deputy
o Easy to add/delete users
o More difficult to implement
o The “Zen of information security”
 Capabilities loved by academics
o Capability Myths Demolished

Part 2  Access Control 422


Multilevel Security (MLS)
Models

Part 2  Access Control 423


Classifications and Clearances
 Classificationsapply to objects
 Clearances apply to subjects
 US Department of Defense (DoD)
uses 4 levels:
TOP SECRET
SECRET
CONFIDENTIAL
UNCLASSIFIED

Part 2  Access Control 424


Clearances and Classification
 To obtain a SECRET clearance
requires a routine background check
 A TOP SECRET clearance requires
extensive background check
 Practical classification problems
o Proper classification not always clear
o Level of granularity to apply
classifications
o Aggregation  flipside of granularity
Part 2  Access Control 425
Subjects and Objects
 Let O be an object, S a subject
o O has a classification
o S has a clearance
o Security level denoted L(O) and L(S)
 For DoD levels, we have
TOP SECRET > SECRET >
CONFIDENTIAL > UNCLASSIFIED

Part 2  Access Control 426


Multilevel Security (MLS)
 MLS needed when subjects/objects at
different levels access same system
 MLS is a form of Access Control
 Military and government interest in MLS
for many decades
o Lots of research into MLS
o Strengths and weaknesses of MLS well
understood (almost entirely theoretical)
o Many possible uses of MLS outside military

Part 2  Access Control 427


MLS Applications
 Classified government/military systems
 Business example: info restricted to
o Senior management only, all management,
everyone in company, or general public
 Network firewall
 Confidential medical info, databases, etc.
 Usually, MLS not really a technical system
o More like part of a legal structure

Part 2  Access Control 428


MLS Security Models
 MLS models explain what needs to be done
 Models do not tell you how to implement
 Models are descriptive, not prescriptive
o That is, high-level description, not an algorithm
 There are many MLS models
 We’ll discuss simplest MLS model
o Other models are more realistic
o Other models also more complex, more difficult
to enforce, harder to verify, etc.
Part 2  Access Control 429
Bell-LaPadula
 BLP security model designed to express
essential requirements for MLS
 BLP deals with confidentiality
o To prevent unauthorized reading
 Recall that O is an object, S a subject
o Object O has a classification
o Subject S has a clearance
o Security level denoted L(O) and L(S)

Part 2  Access Control 430


Bell-LaPadula
 BLP consists of
Simple Security Condition: S can read O
if and only if L(O)  L(S)
*-Property (Star Property): S can write O
if and only if L(S)  L(O)
 No read up, no write down

Part 2  Access Control 431


McLean’s Criticisms of BLP
 McLean: BLP is “so trivial that it is hard to
imagine a realistic security model for which it
does not hold”
 McLean’s “system Z” allowed administrator to
reclassify object, then “write down”
 Is this fair?
 Violates spirit of BLP, but not expressly
forbidden in statement of BLP
 Raises fundamental questions about the
nature of (and limits of) modeling
Part 2  Access Control 432
B and LP’s Response
 BLP enhanced with tranquility property
o Strong tranquility: security labels never change
o Weak tranquility: security label can only change
if it does not violate “established security policy”
 Strong tranquility impractical in real world
o Often want to enforce “least privilege”
o Give users lowest privilege for current work
o Then upgrade as needed (and allowed by policy)
o This is known as the high water mark principle
 Weak tranquility allows for least privilege
(high water mark), but the property is vague
Part 2  Access Control 433
BLP: The Bottom Line
 BLP is simple, probably too simple
 BLP is one of the few security models that
can be used to prove things about systems
 BLP has inspired other security models
o Most other models try to be more realistic
o Other security models are more complex
o Models difficult to analyze, apply in practice

Part 2  Access Control 434


Biba’s Model
 BLP for confidentiality, Biba for integrity
o Biba is to prevent unauthorized writing
 Biba is (in a sense) the dual of BLP
 Integrity model
o Spse you trust the integrity of O but not O
o If object O includes O and O then you cannot
trust the integrity of O
 Integrity level of O is minimum of the
integrity of any object in O
 Low water mark principle for integrity
Part 2  Access Control 435
Biba
 Let I(O) denote the integrity of object O
and I(S) denote the integrity of subject S
 Biba can be stated as
Write Access Rule: S can write O if and only if
I(O)  I(S)
(if S writes O, the integrity of O  that of S)
Biba’s Model: S can read O if and only if
I(S)  I(O)
(if S reads O, the integrity of S  that of O)
 Often, replace Biba’s Model with
Low Water Mark Policy: If S reads O, then
I(S) = min(I(S), I(O))
Part 2  Access Control 436
BLP vs Biba
high BLP Biba high

l L(O) L(O) I(O) l


e e
v v
e e
l L(O) I(O) I(O) l

low Confidentiality Integrity low

Part 2  Access Control 437


Compartments

Part 2  Access Control 438


Compartments
 Multilevel Security (MLS) enforces access
control up and down
 Simple hierarchy of security labels is
generally not flexible enough
 Compartments enforces restrictions across
 Suppose TOP SECRET divided into TOP
SECRET {CAT} and TOP SECRET {DOG}
 Both are TOP SECRET but information flow
restricted across the TOP SECRET level
Part 2  Access Control 439
Compartments
 Why compartments?
o Why not create a new classification level?
 May not want either of
o TOP SECRET {CAT}  TOP SECRET {DOG}
o TOP SECRET {DOG}  TOP SECRET {CAT}
 Compartments designed to enforce the need
to know principle
o Regardless of clearance, you only have access to
info that you need to know to do your job

Part 2  Access Control 440


Compartments
 Arrows indicate “” relationship
TOP SECRET {CAT, DOG}

TOP SECRET {CAT} TOP SECRET {DOG}

TOP SECRET

SECRET {CAT, DOG}

SECRET {CAT} SECRET {DOG}

SECRET
 Not all classifications are comparable, e.g.,
TOP SECRET {CAT} vs SECRET {CAT, DOG}
Part 2  Access Control 441
MLS vs Compartments
 MLS can be used without compartments
o And vice-versa
 But, MLS almost always uses compartments
 Example
o MLS mandated for protecting medical records of
British Medical Association (BMA)
o AIDS was TOP SECRET, prescriptions SECRET
o What is the classification of an AIDS drug?
o Everything tends toward TOP SECRET
o Defeats the purpose of the system!
o Compartments-only approach used instead
Part 2  Access Control 442
Covert Channel

Part 2  Access Control 443


Covert Channel
 MLS designed to restrict legitimate
channels of communication
 May be other ways for information to flow
 For example, resources shared at
different levels could be used to “signal”
information
 Covert channel: a communication path not
intended as such by system’s designers

Part 2  Access Control 444


Covert Channel Example
 Alice has TOP SECRET clearance, Bob has
CONFIDENTIAL clearance
 Suppose the file space shared by all users
 Alice creates file FileXYzW to signal “1” to
Bob, and removes file to signal “0”
 Once per minute Bob lists the files
o If file FileXYzW does not exist, Alice sent 0
o If file FileXYzW exists, Alice sent 1
 Alice can leak TOP SECRET info to Bob
Part 2  Access Control 445
Covert Channel Example

Alice: Create file Delete file Create file Delete file

Bob: Check file Check file Check file Check file Check file

Data: 1 0 1 1 0

Time:

Part 2  Access Control 446


Covert Channel
 Other possible covert channels?
o Print queue
o ACK messages
o Network traffic, etc.
 When does covert channel exist?
1. Sender and receiver have a shared resource
2. Sender able to vary some property of resource
that receiver can observe
3. “Communication” between sender and receiver
can be synchronized
Part 2  Access Control 447
Covert Channel
 Potential covert channels are everywhere
 But, it’s easy to eliminate covert channels:
o “Just” eliminate all shared resources and all
communication!
 Virtually impossible to eliminate covert
channels in any useful information system
o DoD guidelines: reduce covert channel capacity
to no more than 1 bit/second
o Implication? DoD has given up on eliminating
covert channels

Part 2  Access Control 448


Covert Channel
 Consider 100MB TOP SECRET file
o Plaintext stored in TOP SECRET location
o Ciphertext  encrypted with AES using 256-
bit key  stored in UNCLASSIFIED location
 Suppose we reduce covert channel capacity
to 1 bit per second
 It would take more than 25 years to leak
entire document thru a covert channel
 But it would take less than 5 minutes to
leak 256-bit AES key thru covert channel!
Part 2  Access Control 449
Real-World Covert Channel
bits
0 8 16 24 31

Source Port Destination Port


Sequence Number
Acknowledgement Number
Offset reserved U A P R S F Window
Checksum Urgent Pointer
Options Padding
Data (variable length)
 Hide data in TCP header “reserved” field
 Or use covert_TCP, tool to hide data in
o Sequence number
o ACK number
Part 2  Access Control 450
Real-World Covert Channel
 Hide data in TCP sequence numbers
 Tool: covert_TCP
 Sequence number X contains covert info

ACK (or RST)


SYN Source: B
Spoofed source: C Destination: C
Destination: B ACK: X
SEQ: X B. Innocent
server

A. Covert_TCP C. Covert_TCP
sender receiver
Part 2  Access Control 451
Inference Control

Part 2  Access Control 452


Inference Control Example
 Suppose we query a database
o Question: What is average salary of female CS
professors at SJSU?
o Answer: $95,000
o Question: How many female CS professors at
SJSU?
o Answer: 1
 Specific information has leaked from
responses to general questions!

Part 2  Access Control 453


Inference Control &
Research
 For example, medical records are
private but valuable for research
 How to make info available for
research and protect privacy?
 How to allow access to such data
without leaking specific information?

Part 2  Access Control 454


Naïve Inference Control
 Remove names from medical records?
 Stillmay be easy to get specific info
from such “anonymous” data
 Removing names is not enough
o As seen in previous example
 What more can be done?

Part 2  Access Control 455


Less-naïve Inference Control
 Query set size control
o Don’t return an answer if set size is too small
 N-respondent, k% dominance rule
o Do not release statistic if k% or more
contributed by N or fewer
o Example: Avg salary in Bill Gates’ neighborhood
o This approach used by US Census Bureau
 Randomization
o Add small amount of random noise to data
 Many other methods  none satisfactory
Part 2  Access Control 456
Netflix Example
 Netflix prize  $1M to first to improve
recommendation system by 10% or more
 Netflix created dataset for contest
o Movie preferences of real users
o Usernames removed, some “noise” added
 Insufficient inference control
o Researchers able to correlate IMDB
reviews with those in Netflix dataset
Part 2  Access Control 457
Something Better Than Nothing?
 Robust inference control may be impossible
 Is weak inference control better than nothing?
o Yes: Reduces amount of information that leaks
 Is weak covert channel protection better than
nothing?
o Yes: Reduces amount of information that leaks
 Is weak crypto better than no crypto?
o Probably not: Encryption indicates important data
o May be easier to filter encrypted data
Part 2  Access Control 458
CAPTCHA

Part 2  Access Control 459


Turing Test
 Proposed by Alan Turing in 1950
 Human asks questions to a human and a
computer, without seeing either
 If questioner cannot distinguish human
from computer, computer passes
 This is the gold standard in AI
 No computer can pass this today
o But some claim they are close to passing
Part 2  Access Control 460
CAPTCHA
 CAPTCHA
o Completely Automated Public Turing test to tell
Computers and Humans Apart
 Completely Automated  test is generated
and scored by a computer
 Public  program and data are public
 Turing test to tell…  humans can pass the
test, but machines cannot
o Also known as HIP == Human Interactive Proof
 Like an inverse Turing test (sort of…)
Part 2  Access Control 461
CAPTCHA Paradox?
 “…CAPTCHA is a program that can
generate and grade tests that it itself
cannot pass…”
 “…much like some professors…”
 Paradox  computer creates and scores
test that it itself cannot pass!
 CAPTCHA purpose?
o Only humans get access (not bots/computers)
 So, CAPTCHA is for access control
Part 2  Access Control 462
CAPTCHA Uses?
 Original motivation?
o Automated bots stuffed ballot box in vote for
best CS grad school
o SJSU vs Stanford? No, it was MIT vs CMU
 Free email services  spammers like to use
bots to sign up for 1000s of email accounts
o CAPTCHA employed so only humans get accounts
 Sites that do not want to be automatically
indexed by search engines
o CAPTCHA would force human intervention
Part 2  Access Control 463
CAPTCHA: Rules of the Game
 Easy for most humans to pass
 Difficult or impossible for machines to pass
o Even with access to CAPTCHA software
 From Trudy’s perspective, the only unknown
is a random number
o Similar to Kerckhoffs’ Principle
 Good to have different CAPTCHAs in case
someone cannot pass one type
o E.g., blind person could not pass visual CAPTCHA
Part 2  Access Control 464
Do CAPTCHAs Exist?
 Test: Find 2 words in the following

 Easy for most humans


 A (difficult?) OCR problem for computer
o OCR  Optical Character Recognition
Part 2  Access Control 465
CAPTCHAs
 Current types of CAPTCHAs
o Visual  like previous example
o Audio  distorted words or music
 No text-based CAPTCHAs
o Maybe this is impossible…

Part 2  Access Control 466


CAPTCHA’s and AI
 OCR is a challenging AI problem
o Hardest part is the segmentation problem
o Humans good at solving this problem
 Distorted sound makes good CAPTCHA
o Humans also good at solving this
 Hackers who break CAPTCHA have solved a
hard AI problem (such as OCR)
o So, putting hacker’s effort to good use!
 Other ways to defeat CAPTCHAs???
Part 2  Access Control 467
Firewalls

Part 2  Access Control 468


Firewalls

Internal
Internet Firewall network

 Firewall decides what to let in to internal


network and/or what to let out
 Access control for the network
Part 2  Access Control 469
Firewall as Secretary
 A firewall is like a secretary
 To meet with an executive
o First contact the secretary
o Secretary decides if meeting is important
o So, secretary filters out many requests
 You want to meet chair of CS department?
o Secretary does some filtering
 You want to meet POTUS?
o Secretary does lots of filtering
Part 2  Access Control 470
Firewall Terminology
 No standard firewall terminology
 Types of firewalls
o Packet filter  works at network layer
o Stateful packet filter  transport layer
o Application proxy  application layer
 Lots of other terms often used
o E.g., “deep packet inspection”

Part 2  Access Control 471


Packet Filter
 Operates at network layer
application
 Can filters based on…
o Source IP address transport
o Destination IP address
network
o Source Port
o Destination Port link
o Flag bits (SYN, ACK, etc.)
physical
o Egress or ingress

Part 2  Access Control 472


Packet Filter
 Advantages? application
o Speed
transport
 Disadvantages?
o No concept of state network

o Cannot see TCP connections link


o Blind to application data
physical

Part 2  Access Control 473


Packet Filter
 Configured via Access Control Lists (ACLs)
o Different meaning than at start of Chapter 8
Source Dest Source Dest Flag
Action IP IP Port Port Protocol Bits

Allow Inside Outside Any 80 HTTP Any

Allow Outside Inside 80 > 1023 HTTP ACK

Deny All All All All All All

 Q: Intention?
 A: Restrict traffic to Web browsing
Part 2  Access Control 474
TCP ACK Scan
 Attacker scans for open ports thru firewall
o Port scanning often first step in network attack
 Attacker sends packet with ACK bit set,
without prior 3-way handshake
o Violates TCP/IP protocol
o ACK packet pass thru packet filter firewall
o Appears to be part of an ongoing connection
o RST sent by recipient of such packet

Part 2  Access Control 475


TCP ACK Scan
ACK dest port 1207

ACK dest port 1208

ACK dest port 1209

Trudy RST Internal


Packet
Network
Filter

 Attacker knows port 1209 open thru firewall


 A stateful packet filter can prevent this
o Since scans not part of established connections
Part 2  Access Control 476
Stateful Packet Filter
 Adds state to packet filter application
 Operates at transport layer transport
 Remembers TCP connections, network
flag bits, etc.
 Can even remember UDP
link

packets (e.g., DNS requests) physical

Part 2  Access Control 477


Stateful Packet Filter
 Advantages? application
o Can do everything a packet filter
can do plus... transport
o Keep track of ongoing connections
network
(e.g., prevents TCP ACK scan)
 Disadvantages? link
o Cannot see application data
physical
o Slower than packet filtering

Part 2  Access Control 478


Application Proxy
 A proxy is something that
acts on your behalf application

 Application proxy looks at transport

incoming application data network


 Verifies that data is safe link
before letting it in
physical

Part 2  Access Control 479


Application Proxy
 Advantages?
application
o Complete view of connections
and applications data transport
o Filter bad data at application
layer (viruses, Word macros) network

 Disadvantages? link
o Speed
physical

Part 2  Access Control 480


Application Proxy
 Creates a new packet before sending it
thru to internal network
 Attacker must talk to proxy and convince
it to forward message
 Proxy has complete view of connection
 Can prevent some scans stateful packet
filter cannot  next slides

Part 2  Access Control 481


Firewalk
 Tool to scan for open ports thru firewall
 Attacker knows IP address of firewall and
IP address of one system inside firewall
o Set TTL to 1 more than number of hops to
firewall, and set destination port to N
 If firewall allows data on port N thru
firewall, get time exceeded error message
o Otherwise, no response

Part 2  Access Control 482


Firewalk and Proxy Firewall
Packet
filter
Trudy Router Router Router

Dest port 12343, TTL=4


Dest port 12344, TTL=4
Dest port 12345, TTL=4
Time exceeded

 This will not work thru an application proxy (why?)


 The proxy creates a new packet, destroys old TTL

Part 2  Access Control 483


Deep Packet Inspection
 Many buzzwords used for firewalls
o One example: deep packet inspection
 What could this mean?
 Look into packets, but don’t really
“process” the packets
o Like an application proxy, but faster

Part 2  Access Control 484


Firewalls and Defense in Depth
 Typical network security architecture
DMZ

FTP server
Web server

DNS server

Intranet with
Packet Application additional
Internet Filter Proxy defense

Part 2  Access Control 485


Intrusion Detection Systems

Part 2  Access Control 486


Intrusion Prevention
 Want to keep bad guys out
 Intrusion prevention is a traditional
focus of computer security
o Authentication is to prevent intrusions
o Firewalls a form of intrusion prevention
o Virus defenses aimed at intrusion
prevention
o Like locking the door on your car

Part 2  Access Control 487


Intrusion Detection
 In spite of intrusion prevention, bad guys
will sometime get in
 Intrusion detection systems (IDS)
o Detect attacks in progress (or soon after)
o Look for unusual or suspicious activity
 IDS evolved from log file analysis
 IDS is currently a hot research topic
 How to respond when intrusion detected?
o We don’t deal with this topic here…

Part 2  Access Control 488


Intrusion Detection Systems
 Who is likely intruder?
o May be outsider who got thru firewall
o May be evil insider
 What do intruders do?
o Launch well-known attacks
o Launch variations on well-known attacks
o Launch new/little-known attacks
o “Borrow” system resources
o Use compromised system to attack others. etc.

Part 2  Access Control 489


IDS
 Intrusion detection approaches
o Signature-based IDS
o Anomaly-based IDS
 Intrusion detection architectures
o Host-based IDS
o Network-based IDS
 Any IDS can be classified as above
o In spite of marketing claims to the contrary!

Part 2  Access Control 490


Host-Based IDS
 Monitor activities on hosts for
o Known attacks
o Suspicious behavior
 Designed to detect attacks such as
o Buffer overflow
o Escalation of privilege, …
 Little or no view of network activities

Part 2  Access Control 491


Network-Based IDS
 Monitor activity on the network for…
o Known attacks
o Suspicious network activity
 Designed to detect attacks such as
o Denial of service
o Network probes
o Malformed packets, etc.
 Some overlap with firewall
 Little or no view of host-base attacks
 Can have both host and network IDS

Part 2  Access Control 492


Signature Detection Example
 Failed login attempts may indicate
password cracking attack
 IDS could use the rule “N failed login
attempts in M seconds” as signature
 If N or more failed login attempts in M
seconds, IDS warns of attack
 Note that such a warning is specific
o Admin knows what attack is suspected
o Easy to verify attack (or false alarm)

Part 2  Access Control 493


Signature Detection
 Suppose IDS warns whenever N or more
failed logins in M seconds
o Set N and M so false alarms not common
o Can do this based on “normal” behavior
 But, if Trudy knows the signature, she can
try N  1 logins every M seconds…
 Then signature detection slows down Trudy,
but might not stop her

Part 2  Access Control 494


Signature Detection
 Many techniques used to make signature
detection more robust
 Goal is to detect “almost” signatures
 For example, if “about” N login attempts in
“about” M seconds
o Warn of possible password cracking attempt
o What are reasonable values for “about”?
o Can use statistical analysis, heuristics, etc.
o Must not increase false alarm rate too much

Part 2  Access Control 495


Signature Detection
 Advantages of signature detection
o Simple
o Detect known attacks
o Know which attack at time of detection
o Efficient (if reasonable number of signatures)
 Disadvantages of signature detection
o Signature files must be kept up to date
o Number of signatures may become large
o Can only detect known attacks
o Variation on known attack may not be detected
Part 2  Access Control 496
Anomaly Detection
 Anomaly detection systems look for unusual
or abnormal behavior
 There are (at least) two challenges
o What is normal for this system?
o How “far” from normal is abnormal?
 No avoiding statistics here!
o mean defines normal
o variance gives distance from normal to abnormal

Part 2  Access Control 497


How to Measure Normal?
 How to measure normal?
o Must measure during “representative”
behavior
o Must not measure during an attack…
o …or else attack will seem normal!
o Normal is statistical mean
o Must also compute variance to have any
reasonable idea of abnormal

Part 2  Access Control 498


How to Measure Abnormal?
 Abnormal is relative to some “normal”
o Abnormal indicates possible attack
 Statistical discrimination techniques include
o Bayesian statistics
o Linear discriminant analysis (LDA)
o Quadratic discriminant analysis (QDA)
o Neural nets, hidden Markov models (HMMs), etc.
 Fancy modeling techniques also used
o Artificial intelligence
o Artificial immune system principles
o Many, many, many others

Part 2  Access Control 499


Anomaly Detection (1)
 Spse we monitor use of three commands:
open, read, close
 Under normal use we observe Alice:
open, read, close, open, open, read, close, …
 Of the six possible ordered pairs, we see
four pairs are normal for Alice,
(open,read), (read,close), (close,open), (open,open)
 Can we use this to identify unusual activity?

Part 2  Access Control 500


Anomaly Detection (1)
 We monitor use of the three commands
open, read, close
 If the ratio of abnormal to normal pairs is
“too high”, warn of possible attack
 Could improve this approach by
o Also use expected frequency of each pair
o Use more than two consecutive commands
o Include more commands/behavior in the model
o More sophisticated statistical discrimination

Part 2  Access Control 501


Anomaly Detection (2)
 Over time, Alice has  Recently, “Alice”
accessed file Fn at has accessed Fn at
rate Hn rate An

H0 H1 H2 H3 A0 A1 A2 A3
.10 .40 .40 .10 .10 .40 .30 .20

 Is this normal use for Alice?


 We compute S = (H0A0)2+(H1A1)2+…+(H3A3)2 = .02
o We consider S < 0.1 to be normal, so this is normal
 How to account for use that varies over time?
Part 2  Access Control 502
Anomaly Detection (2)
 To allow “normal” to adapt to new use, we
update averages: Hn = 0.2An + 0.8Hn
 In this example, Hn are updated…
H2=.2.3+.8.4=.38 and H3=.2.2+.8.1=.12
 And we now have

H0 H1 H2 H3
.10 .40 .38 .12

Part 2  Access Control 503


Anomaly Detection (2)
 The updated long  Suppose new
term average is observed rates…
H0 H1 H2 H3 A0 A1 A2 A3
.10 .40 .38 .12 .10 .30 .30 .30

 Is this normal use?


 Compute S = (H0A0)2+…+(H3A3)2 = .0488
o Since S = .0488 < 0.1 we consider this normal
 And we again update the long term averages:
Hn = 0.2An + 0.8Hn
Part 2  Access Control 504
Anomaly Detection (2)
 The starting  After 2 iterations,
averages were: averages are:

H0 H1 H2 H3 H0 H1 H2 H3
.10 .40 .40 .10 .10 .38 .364 .156

 Statistics slowly evolve to match behavior


 This reduces false alarms for SA
 But also opens an avenue for attack…
o Suppose Trudy always wants to access F3
o Can she convince IDS this is normal for Alice?
Part 2  Access Control 505
Anomaly Detection (2)
 To make this approach more robust, must
incorporate the variance
 Can also combine N stats Si as, say,
T = (S1 + S2 + S3 + … + SN) / N
to obtain a more complete view of “normal”
 Similar (but more sophisticated) approach
is used in an IDS known as NIDES
 NIDES combines anomaly & signature IDS

Part 2  Access Control 506


Anomaly Detection Issues
 Systems constantly evolve and so must IDS
o Static system would place huge burden on admin
o But evolving IDS makes it possible for attacker to
(slowly) convince IDS that an attack is normal
o Attacker may win simply by “going slow”
 What does “abnormal” really mean?
o Indicates there may be an attack
o Might not be any specific info about “attack”
o How to respond to such vague information?
o In contrast, signature detection is very specific
Part 2  Access Control 507
Anomaly Detection
 Advantages?
o Chance of detecting unknown attacks
 Disadvantages?
o Cannot use anomaly detection alone…
o …must be used with signature detection
o Reliability is unclear
o May be subject to attack
o Anomaly detection indicates “something unusual”,
but lacks specific info on possible attack

Part 2  Access Control 508


Anomaly Detection: The
Bottom Line
 Anomaly-based IDS is active research topic
 Many security experts have high hopes for its
ultimate success
 Often cited as key future security technology
 Hackers are not convinced!
o Title of a talk at Defcon: “Why Anomaly-based
IDS is an Attacker’s Best Friend”
 Anomaly detection is difficult and tricky
 As hard as AI?

Part 2  Access Control 509


Access Control Summary
 Authentication and authorization
o Authentication  who goes there?
 Passwords  something you know
 Biometrics  something you are (you
are your key)
 Something you have

Part 2  Access Control 510


Access Control Summary
 Authorization  are you allowed to do that?
o Access control matrix/ACLs/Capabilities
o MLS/Multilateral security
o BLP/Biba
o Covert channel
o Inference control
o CAPTCHA
o Firewalls
o IDS
Part 2  Access Control 511
Coming Attractions…
 Security protocols
o Generic authentication protocols
o SSH
o SSL
o IPSec
o Kerberos
o WEP
o GSM
 We’ll see lots of crypto applications in the
protocol chapters
Part 2  Access Control 512
Part III: Protocols

Part 3  Protocols 513


Protocol
 Human protocols  the rules followed in
human interactions
o Example: Asking a question in class
 Networking protocols  rules followed in
networked communication systems
o Examples: HTTP, FTP, etc.
 Security protocol  the (communication)
rules followed in a security application
o Examples: SSL, IPSec, Kerberos, etc.

Part 3  Protocols 514


Protocols
 Protocol flaws can be very subtle
 Several well-known security protocols
have significant flaws
o Including WEP, GSM, and IPSec
 Implementation errors can also occur
o Recently, IE implementation of SSL
 Not easy to get protocols right…

Part 3  Protocols 515


Ideal Security Protocol
 Must satisfy security requirements
o Requirements need to be precise
 Efficient
o Minimize computational requirement
o Minimize bandwidth usage, delays…
 Robust
o Works when attacker tries to break it
o Works if environment changes (slightly)
 Easy to implement, easy to use, flexible…
 Difficult to satisfy all of these!
Part 3  Protocols 516
Chapter 9:
Simple Security Protocols
“I quite agree with you,” said the Duchess; “and the moral of that is
‘Be what you would seem to be’ or
if you'd like it put more simply‘Never imagine yourself not to be
otherwise than what it might appear to others that what you were
or might have been was not otherwise than what you
had been would have appeared to them to be otherwise.’ ”
 Lewis Carroll, Alice in Wonderland

Seek simplicity, and distrust it.


 Alfred North Whitehead

Part 2  Access Control 517


Secure Entry to NSA
1. Insert badge into reader
2. Enter PIN
3. Correct PIN?
Yes? Enter
No? Get shot by security guard

Part 3  Protocols 518


ATM Machine Protocol
1. Insert ATM card
2. Enter PIN
3. Correct PIN?
Yes? Conduct your transaction(s)
No? Machine (eventually) eats card

Part 3  Protocols 519


Identify Friend or Foe (IFF)

Russian
MIG
Angola

SAAF 2. E(N,K)
Impala
K 1. N
Namibia
K
Part 3  Protocols 520
MIG in the Middle
3. N
SAAF
Impala 4. E(N,K)
K Angola
2. N

5. E(N,K)

6. E(N,K)
Russian
MiG
1. N
Namibia
K
Part 3  Protocols 521
Authentication Protocols

Part 3  Protocols 522


Authentication
 Alice must prove her identity to Bob
o Alice and Bob can be humans or computers
 May also require Bob to prove he’s Bob
(mutual authentication)
 Probably need to establish a session key
 May have other requirements, such as
o Public keys, symmetric keys, hash functions, …
o Anonymity, plausible deniability, perfect
forward secrecy, etc.

Part 3  Protocols 523


Authentication
 Authentication on a stand-alone computer is
relatively simple
o For example, hash a password with a salt
o “Secure path,” attacks on authentication
software, keystroke logging, etc., can be issues
 Authentication over a network is challenging
o Attacker can passively observe messages
o Attacker can replay messages
o Active attacks possible (insert, delete, change)

Part 3  Protocols 524


Simple Authentication
“I’m Alice”

Prove it

My password is “frank”
Alice Bob

 Simple and may be OK for standalone system


 But highly insecure for networked system
o Subject to a replay attack (next 2 slides)
o Also, Bob must know Alice’s password
Part 3  Protocols 525
Authentication Attack
“I’m Alice”

Prove it

My password is “frank”
Alice Bob

Trudy
Part 3  Protocols 526
Authentication Attack

“I’m Alice”

Prove it

My password is “frank”
Trudy Bob

 This is an example of a replay attack


 How can we prevent a replay?
Part 3  Protocols 527
Simple Authentication

I’m Alice, my password is “frank”

Alice Bob

 More efficient, but…


 … same problem as previous version
Part 3  Protocols 528
Better Authentication
“I’m Alice”

Prove it

h(Alice’s password)
Alice Bob

 This approach hides Alice’s password


o From both Bob and Trudy
 But still subject to replay attack
Part 3  Protocols 529
Challenge-Response
 To prevent replay, use challenge-response
o Goal is to ensure “freshness”
 Suppose Bob wants to authenticate Alice
o Challenge sent from Bob to Alice
 Challenge is chosen so that…
o Replay is not possible
o Only Alice can provide the correct response
o Bob can verify the response

Part 3  Protocols 530


Nonce
 To ensure freshness, can employ a nonce
o Nonce == number used once
 What to use for nonces?
o That is, what is the challenge?
 What should Alice do with the nonce?
o That is, how to compute the response?
 How can Bob verify the response?
 Should we use passwords or keys?

Part 3  Protocols 531


Challenge-Response
“I’m Alice”

Nonce

h(Alice’s password, Nonce)


Alice Bob
 Nonce is the challenge
 The hash is the response
 Nonce prevents replay (ensures freshness)
 Password is something Alice knows
 Note: Bob must know Alice’s pwd to verify

Part 3  Protocols 532


Generic Challenge-Response
“I’m Alice”

Nonce

Something that could only be


Alice from Alice, and Bob can verify Bob

 In practice, how to achieve this?


 Hashed password works, but…
 …encryption is much better here (why?)
Part 3  Protocols 533
Symmetric Key Notation
 Encrypt plaintext P with key K
C = E(P,K)
 Decrypt ciphertext C with key K
P = D(C,K)
 Here, we are concerned with attacks on
protocols, not attacks on cryptography
o So, we assume crypto algorithms are secure

Part 3  Protocols 534


Authentication: Symmetric Key
 Alice and Bob share symmetric key K
 Key K known only to Alice and Bob
 Authenticate by proving knowledge of
shared symmetric key
 How to accomplish this?
o Cannot reveal key, must not allow replay
(or other) attack, must be verifiable, …

Part 3  Protocols 535


Authenticate Alice Using
Symmetric Key
“I’m Alice”
R
E(R,K)
Alice, K Bob, K

 Secure method for Bob to authenticate Alice


 But, Alice does not authenticate Bob
 So, can we achieve mutual authentication?
Part 3  Protocols 536
Mutual Authentication?

“I’m Alice”, R

E(R,K)

E(R,K)
Alice, K Bob, K

 What’s wrong with this picture?


 “Alice” could be Trudy (or anybody else)!
Part 3  Protocols 537
Mutual Authentication
 Since we have a secure one-way
authentication protocol…
 The obvious thing to do is to use the
protocol twice
o Once for Bob to authenticate Alice
o Once for Alice to authenticate Bob
 This has got to work…

Part 3  Protocols 538


Mutual Authentication
“I’m Alice”, RA

RB, E(RA, K)

E(RB, K)
Alice, K Bob, K

 This provides mutual authentication…


 …or does it? Subject to reflection attack
o Next slide

Part 3  Protocols 539


Mutual Authentication Attack
1. “I’m Alice”, RA
2. RB, E(RA, K)

Trudy Bob, K

3. “I’m Alice”, RB

4. RC, E(RB, K)

Trudy Bob, K
Part 3  Protocols 540
Mutual Authentication
 Our one-way authentication protocol is
not secure for mutual authentication
o Protocols are subtle!
o In this case, “obvious” solution is not secure
 Also, if assumptions or environment
change, protocol may not be secure
o This is a common source of security failure
o For example, Internet protocols

Part 3  Protocols 541


Symmetric Key Mutual
Authentication
“I’m Alice”, RA

RB, E(“Bob”,RA,K)

E(“Alice”,RB,K)
Alice, K Bob, K

 Do these “insignificant” changes help?


 Yes!
Part 3  Protocols 542
Public Key Notation
 Encrypt M with Alice’s public key: {M}Alice
 Sign M with Alice’s private key: [M]Alice
 Then
o [{M}Alice ]Alice = M
o {[M]Alice }Alice = M
 Anybody can use Alice’s public key
 Only Alice can use her private key

Part 3  Protocols 543


Public Key Authentication

“I’m Alice”
{R}Alice

R
Alice Bob
 Is this secure?
 Trudy can get Alice to decrypt anything!
Prevent this by having two key pairs
Part 3  Protocols 544
Public Key Authentication

“I’m Alice”

[R]Alice
Alice Bob
 Is this secure?
 Trudy can get Alice to sign anything!
o Same a previous  should have two key pairs
Part 3  Protocols 545
Public Keys
 Generally, a bad idea to use the same
key pair for encryption and signing
 Instead, should have…
o …one key pair for encryption/decryption
and signing/verifying signatures…
o …and a different key pair for
authentication

Part 3  Protocols 546


Session Key
 Usually, a session key is required
o A symmetric key for current session
o Used for confidentiality and/or integrity
 How to authenticate and establish a
session key (i.e., shared symmetric key)?
o When authentication completed, Alice and Bob
share a session key
o Trudy cannot break the authentication…
o …and Trudy cannot determine the session key

Part 3  Protocols 547


Authentication & Session Key
“I’m Alice”, R
{R, K}Alice

{R +1, K}Bob
Alice Bob

 Is this secure?
o Alice is authenticated and session key is secure
o Alice’s “nonce”, R, useless to authenticate Bob
o The key K is acting as Bob’s nonce to Alice
 No mutual authentication
Part 3  Protocols 548
Public Key Authentication
and Session Key
“I’m Alice”, R
[R, K]Bob

[R +1, K]Alice
Alice Bob

 Is this secure?
o Mutual authentication (good), but…
o … session key is not protected (very bad)

Part 3  Protocols 549


Public Key Authentication
and Session Key
“I’m Alice”, R
{[R, K]Bob}Alice

{[R +1, K]Alice}Bob


Alice Bob

 Is this secure?
 No! It’s subject to subtle MiM attack
o See the next slide…
Part 3  Protocols 550
Public Key Authentication
and Session Key
1. “I’m Alice”, R 2. “I’m Trudy”, R
4. { }Alice 3. {[R, K]Bob}Trudy

5. {[R +1, K]Alice}Bob 6. time out


Alice Trudy Bob

 Trudy can get [R, K]Bob and K from 3.


 Alice uses this same key K
 And Alice thinks she’s talking to Bob
Part 3  Protocols 551
Public Key Authentication
and Session Key
“I’m Alice”, R
[{R, K}Alice]Bob

[{R +1, K}Bob]Alice


Alice Bob

 Is this secure?
 Seems to be OK
o Anyone can see {R, K}Alice and {R +1, K}Bob
Part 3  Protocols 552
Timestamps
 A timestamp T is derived from current time
 Timestamps can be used to prevent replay
o Used in Kerberos, for example
 Timestamps reduce number of msgs (good)
o A challenge that both sides know in advance
 “Time” is a security-critical parameter (bad)
o Clocks not same and/or network delays, so must
allow for clock skew  creates risk of replay
o How much clock skew is enough?

Part 3  Protocols 553


Public Key Authentication
with Timestamp T
“I’m Alice”, {[T, K]Alice}Bob
{[T +1, K]Bob}Alice

Alice Bob

 Secure mutual authentication?


 Session key secure?
 Seems to be OK
Part 3  Protocols 554
Public Key Authentication
with Timestamp T
“I’m Alice”, [{T, K}Bob]Alice
[{T +1, K}Alice]Bob

Alice Bob

 Secure authentication and session key?


 Trudy can use Alice’s public key to find
{T, K}Bob and then…
Part 3  Protocols 555
Public Key Authentication
with Timestamp T

“I’m Trudy”, [{T, K}Bob]Trudy


[{T +1, K}Trudy]Bob

Trudy Bob

 Trudy obtains Alice-Bob session key K


 Note: Trudy must act within clock skew

Part 3  Protocols 556


Public Key Authentication
 Sign and encrypt with nonce…
o Insecure
 Encrypt and sign with nonce…
o Secure
 Sign and encrypt with timestamp…
o Secure
 Encrypt and sign with timestamp…
o Insecure
 Protocols can be subtle!

Part 3  Protocols 557


Public Key Authentication
with Timestamp T

“I’m Alice”, [{T, K}Bob]Alice


[{T +1}Alice]Bob

Alice Bob

 Is this “encrypt and sign” secure?


o Yes, seems to be OK
 Does “sign and encrypt” also work here?
Part 3  Protocols 558
Perfect Forward Secrecy
 Consider this “issue”…
o Alice encrypts message with shared key K and
sends ciphertext to Bob
o Trudy records ciphertext and later attacks
Alice’s (or Bob’s) computer to recover K
o Then Trudy decrypts recorded messages
 Perfect forward secrecy (PFS): Trudy
cannot later decrypt recorded ciphertext
o Even if Trudy gets key K or other secret(s)
 Is PFS possible?
Part 3  Protocols 559
Perfect Forward Secrecy
 Suppose Alice and Bob share key K
 For perfect forward secrecy, Alice and Bob
cannot use K to encrypt
 Instead they must use a session key KS and
forget it after it’s used
 Can Alice and Bob agree on session key KS
in a way that provides PFS?

Part 3  Protocols 560


Naïve Session Key Protocol

E(KS, K)

E(messages, KS)

Alice, K Bob, K

 Trudy could record E(KS, K)


 If Trudy later gets K then she can get KS
o Then Trudy can decrypt recorded messages
 No perfect forward secrecy in this case
Part 3  Protocols 561
Perfect Forward Secrecy
 We can use Diffie-Hellman for PFS
 Recall: public g and p

ga mod p
gb mod p

Alice, a Bob, b
 But Diffie-Hellman is subject to MiM
 How to get PFS and prevent MiM?

Part 3  Protocols 562


Perfect Forward Secrecy
E(ga mod p, K)
E(gb mod p, K)

Alice: K, a Bob: K, b
 Session key KS = gab mod p
 Alice forgets a, Bob forgets b
 This is known as Ephemeral Diffie-Hellman
 Neither Alice nor Bob can later recover KS
 Are there other ways to achieve PFS?
Part 3  Protocols 563
Mutual Authentication,
Session Key and PFS
“I’m Alice”, RA
RB, [RA, gb mod p]Bob

[RB, ga mod p]Alice


Alice Bob

 Session key is K = gab mod p


 Alice forgets a and Bob forgets b
 If Trudy later gets Bob’s and Alice’s secrets,
she cannot recover session key K
Part 3  Protocols 564
Authentication and TCP

Part 3  Protocols 565


TCP-based Authentication
 TCP not intended for use as an
authentication protocol
 But IP address in TCP connection may
be (mis)used for authentication
 Also, one mode of IPSec relies on IP
address for authentication

Part 3  Protocols 566


TCP 3-way Handshake

SYN, SEQ a

SYN, ACK a+1, SEQ b

ACK b+1, data


Alice Bob

 Initial sequence numbers: SEQ a and SEQ b


o Supposed to be selected at random
 If not, might have problems…
Part 3  Protocols 567
TCP Authentication Attack

Trudy Bob

5.
5.
5.

5. Alice
Part 3  Protocols 568
TCP Authentication Attack

Initial SEQ numbers


Random SEQ numbers Mac OS X
 If initial SEQ numbers not very random…
 …possible to guess initial SEQ number…
 …and previous attack will succeed
Part 3  Protocols 569
TCP Authentication Attack
 Trudy cannot see what Bob sends, but she can
send packets to Bob, while posing as Alice
 Trudy must prevent Alice from receiving Bob’s
response (or else connection will terminate)
 If password (or other authentication) required,
this attack fails
 If TCP connection is relied on for authentication,
then attack might succeed
 Bad idea to rely on TCP for authentication

Part 3  Protocols 570


Zero Knowledge Proofs

Part 3  Protocols 571


Zero Knowledge Proof (ZKP)
 Alice wants to prove that she knows a
secret without revealing any info about it
 Bob must verify that Alice knows secret
o But, Bob gains no information about the secret
 Process is probabilistic
o Bob can verify that Alice knows the secret to
an arbitrarily high probability
 An “interactive proof system”

Part 3  Protocols 572


Bob’s Cave
 Alice knows secret
phrase to open path P
between R and S
(“open sarsaparilla”)
 Can she convince Q
Bob that she knows R S
the secret without
revealing phrase?

Part 3  Protocols 573


Bob’s Cave
 Bob: “Alice, come out on S side” P

 Alice (quietly):
“Open sarsaparilla”
Q
 If Alice does not
R S
know the secret…
 …then Alice could come out from the correct side
with probability 1/2
 If Bob repeats this n times and Alice does not know
secret, she can only fool Bob with probability 1/2n

Part 3  Protocols 574


Fiat-Shamir Protocol
 Cave-based protocols are inconvenient
o Can we achieve same effect without the cave?
 Finding square roots modulo N is difficult
o Equivalent to factoring
 Suppose N = pq, where p and q prime
 Alice has a secret S
 N and v = S2 mod N are public, S is secret
 Alice must convince Bob that she knows S
without revealing any information about S
Part 3  Protocols 575
Fiat-Shamir
x = r2 mod N
e  {0,1}
y = r  Se mod N
Alice Bob
secret S random e
random r
 Public: Modulus N and v = S2 mod N
 Alice selects random r, Bob chooses e  {0,1}
 Bob verifies: y2 = x  ve mod N
o Note that y2 = r2  S2e = r2  (S2)e = x  ve mod N
Part 3  Protocols 576
Fiat-Shamir: e = 1
x = r2 mod N
e=1
y = r  S mod N
Alice Bob
secret S random e
random r
 Public: Modulus N and v = S2 mod N
 Alice selects random r, Bob chooses e =1
 If y2 = x  v mod N then Bob accepts it
o And Alice passes this iteration of the protocol
 Note that Alice must know S in this case
Part 3  Protocols 577
Fiat-Shamir: e = 0
x = r2 mod N
e=0
y = r mod N
Alice Bob
secret S random e
random r

 Public: Modulus N and v = S2 mod N


 Alice selects random r, Bob chooses e = 0
 Bob must checks whether y2 = x mod N
 “Alice” does not need to know S in this case!

Part 3  Protocols 578


Fiat-Shamir
 Public: modulus N and v = S2 mod N
 Secret: Alice knows S
 Alice selects random r and commits to r by
sending x = r2 mod N to Bob
 Bob sends challenge e  {0,1} to Alice
 Alice responds with y = r  Se mod N
 Bob checks whether y2 = x  ve mod N
o Does this prove response is from Alice?

Part 3  Protocols 579


Does Fiat-Shamir Work?
 If everyone follows protocol, math works:
o Public: v = S2 mod N
o Alice to Bob: x = r2 mod N and y = r  Se mod N
o Bob verifies: y2 = x  ve mod N
 Can Trudy convince Bob she is Alice?
o If Trudy expects e = 0, she follows the
protocol: send x = r2 in msg 1 and y = r in msg 3
o If Trudy expects e = 1, she sends x = r2  v1 in
msg 1 and y = r in msg 3
 If Bob chooses e  {0,1} at random, Trudy
can only trick Bob with probability 1/2
Part 3  Protocols 580
Fiat-Shamir Facts
 Trudy can trick Bob with probability 1/2, but…
o …after n iterations, the probability that Trudy can
convince Bob that she is Alice is only 1/2n
o Just like Bob’s cave!
 Bob’s e  {0,1} must be unpredictable
 Alice must use new r each iteration, or else…
o If e = 0, Alice sends r mod N in message 3
o If e = 1, Alice sends r  S mod N in message 3
o Anyone can find S given r mod N and r  S mod N

Part 3  Protocols 581


Fiat-Shamir Zero Knowledge?
 Zero knowledge means that nobody learns
anything about the secret S
o Public: v = S2 mod N
o Trudy sees r2 mod N in message 1
o Trudy sees r  S mod N in message 3 (if e = 1)
 If Trudy can find r from r2 mod N, she gets
S
o But that requires modular square root calculation
o If Trudy could find modular square roots, she
could get S from public v
 Protocol does not seem to “help” to find S582
Part 3  Protocols
ZKP in the Real World
 Public key certificates identify users
o No anonymity if certificates sent in plaintext
 ZKP offers a way to authenticate without
revealing identities
 ZKP supported in MS’s Next Generation
Secure Computing Base (NGSCB), where…
o …ZKP used to authenticate software “without
revealing machine identifying data”
 ZKP is not just pointless mathematics!
Part 3  Protocols 583
Best Authentication Protocol?
 It depends on…
o The sensitivity of the application/data
o The delay that is tolerable
o The cost (computation) that is tolerable
o What crypto is supported (public key,
symmetric key, …)
o Whether mutual authentication is required
o Whether PFS, anonymity, etc., are concern
 …and possibly other factors
Part 3  Protocols 584
Chapter 10:
Real-World Protocols
The wire protocol guys don't worry about security because that's really
a network protocol problem. The network protocol guys don't
worry about it because, really, it's an application problem.
The application guys don't worry about it because, after all,
they can just use the IP address and trust the network.
 Marcus J. Ranum

In the real world, nothing happens at the right place at the right time.
It is the job of journalists and historians to correct that.
 Mark Twain

Part 2  Access Control 585


Real-World Protocols
 Next, we look at real protocols
o SSH  relatively simple & useful protocol
o SSL  practical security on the Web
o IPSec  security at the IP layer
o Kerberos  symmetric key, single sign-on
o WEP  “Swiss cheese” of security protocols
o GSM  mobile phone (in)security

Part 3  Protocols 586


Secure Shell (SSH)

Part 3  Protocols 587


SSH
 Creates a “secure tunnel”
 Insecure command sent thru SSH
“tunnel” are then secure
 SSH used with things like rlogin
o Why is rlogin insecure without SSH?
o Why is rlogin secure with SSH?
 SSH is a relatively simple protocol

Part 3  Protocols 588


SSH
 SSH authentication can be based on:
o Public keys, or
o Digital certificates, or
o Passwords
 Here, we consider certificate mode
o Other modes in homework problems
 We consider slightly simplified SSH…

Part 3  Protocols 589


Simplified SSH
Alice, CP, RA
CS, RB
ga mod p
gb mod p, certificateB, SB
Alice E(Alice, certificateA, SA, K) Bob

 CP = “crypto proposed”, and CS = “crypto selected”


 H = h(Alice,Bob,CP,CS,RA,RB,ga mod p,gb mod p,gab mod p)
 SB = [H]Bob
 SA = [H, Alice, certificateA]Alice
 K = gab mod p
Part 3  Protocols 590
MiM Attack on SSH?
Alice, RA Alice, RA
RB RB
ga mod p gt mod p
gt mod p, certB, SB gb mod p, certB, SB
Alice E(Alice,certA,SA,K) Trudy E(Alice,certA,SA,K) Bob

 Where does this attack fail?


 Alice computes
Ha = h(Alice,Bob,CP,CS,RA,RB,ga mod p,gt mod p,gat mod p)
 But Bob signs
Hb = h(Alice,Bob,CP,CS,RA,RB,gt mod p,gb mod p,gbt mod p)

Part 3  Protocols 591


Secure Socket Layer

Part 3  Protocols 592


Socket layer
 “Socket layer”
lives between Socket application User
application “layer”
and transport transport
OS
layers
network
 SSL usually
between HTTP link
NIC
and TCP
physical

Part 3  Protocols 593


What is SSL?
 SSL is the protocol used for majority of
secure Internet transactions today
 For example, if you want to buy a book at
amazon.com…
o You want to be sure you are dealing with Amazon
(authentication)
o Your credit card information must be protected
in transit (confidentiality and/or integrity)
o As long as you have money, Amazon does not
really care who you are…
o …so, no need for mutual authentication

Part 3  Protocols 594


Simple SSL-like Protocol
I’d like to talk to you securely

Here’s my certificate
{K}Bob

Alice protected HTTP Bob

 Is Alice sure she’s talking to Bob?


 Is Bob sure he’s talking to Alice?

Part 3  Protocols 595


Simplified SSL Protocol
Can we talk?, cipher list, RA
certificate, cipher, RB
{S}Bob, E(h(msgs,CLNT,K),K)
h(msgs,SRVR,K)
Alice Data protected with key K Bob

 S is the so-called pre-master secret


 K = h(S,RA,RB)
 “msgs” means all previous messages
 CLNT and SRVR are constants

Part 3  Protocols 596


SSL Keys
6 “keys” derived from K = h(S,RA,RB)
o 2 encryption keys: client and server
o 2 integrity keys: client and server
o 2 IVs: client and server
o Why different keys in each direction?
 Q: Why is h(msgs,CLNT,K) encrypted?
 A: Apparently, it adds no security…

Part 3  Protocols 597


SSL Authentication
 Alice authenticates Bob, not vice-versa
o How does client authenticate server?
o Why would server not authenticate client?
 Mutual authentication is possible: Bob
sends certificate request in message 2
o Then client must have a valid certificate
o But, if server wants to authenticate client,
server could instead require password

Part 3  Protocols 598


SSL MiM Attack?
RA RA
certificateT, RB certificateB, RB
{S1}Trudy,E(X1,K1) {S2}Bob,E(X2,K2)
h(Y1,K1) h(Y2,K2)
Alice Trudy
E(data,K1) E(data,K2) Bob

 Q: What prevents this MiM “attack”?


 A: Bob’s certificate must be signed by a
certificate authority (CA)
 What does browser do if signature not valid?
 What does user do when browser complains?
Part 3  Protocols 599
SSL Sessions vs Connections
 SSL session is established as shown on
previous slides
 SSL designed for use with HTTP 1.0
 HTTP 1.0 often opens multiple simultaneous
(parallel) connections
o Multiple connections per session
 SSL session is costly, public key operations
 SSL has an efficient protocol for opening
new connections given an existing session
Part 3  Protocols 600
SSL Connection
session-ID, cipher list, RA
session-ID, cipher, RB,
h(msgs,SRVR,K)
h(msgs,CLNT,K)

Alice Protected data Bob

 Assuming SSL session exists


 So, S is already known to Alice and Bob
 Both sides must remember session-ID
 Again, K = h(S,RA,RB)
 No public key operations! (relies on known S)
Part 3  Protocols 601
SSL vs IPSec
 IPSec  discussed in next section
o Lives at the network layer (part of the OS)
o Encryption, integrity, authentication, etc.
o Is overly complex, has some security “issues”
 SSL (and IEEE standard known as TLS)
o Lives at socket layer (part of user space)
o Encryption, integrity, authentication, etc.
o Relatively simple and elegant specification

Part 3  Protocols 602


SSL vs IPSec
 IPSec: OS must be aware, but not apps
 SSL: Apps must be aware, but not OS
 SSL built into Web early-on (Netscape)
 IPSec often used in VPNs (secure tunnel)
 Reluctance to retrofit applications for SSL
 IPSec not widely deployed (complexity, etc.)
 The bottom line?
 Internet less secure than it could be!
Part 3  Protocols 603
IPSec

Part 3  Protocols 604


IPSec and SSL
 IPSec lives at
the network application User
layer SSL

 IPSec is
transport
OS
transparent to IPSec network
applications
link
NIC

physical

Part 3  Protocols 605


IPSec and Complexity
 IPSec is a complex protocol
 Over-engineered
o Lots of (generally useless) features
 Flawed  Some significant security issues
 Interoperability is serious challenge
o Defeats the purpose of having a standard!
 Complex
 And, did I mention, it’s complex?

Part 3  Protocols 606


IKE and ESP/AH
 Two parts to IPSec…
 IKE: Internet Key Exchange
o Mutual authentication
o Establish session key
o Two “phases”  like SSL session/connection
 ESP/AH
o ESP: Encapsulating Security Payload  for
confidentiality and/or integrity
o AH: Authentication Header  integrity only
Part 3  Protocols 607
IKE

Part 3  Protocols 608


IKE
 IKE has 2 phases
o Phase 1  IKE security association (SA)
o Phase 2  AH/ESP security association
 Phase 1 is comparable to SSL session
 Phase 2 is comparable to SSL connection
 Not an obvious need for two phases in IKE
o In the context of IPSec, that is
 If multiple Phase 2’s do not occur, then it is
more costly to have two phases!
Part 3  Protocols 609
IKE Phase 1
 4 different “key options”
o Public key encryption (original version)
o Public key encryption (improved version)
o Public key signature
o Symmetric key
 For each of these, 2 different “modes”
o Main mode and aggressive mode
 There are 8 versions of IKE Phase 1!
 Need more evidence it’s over-engineered?

Part 3  Protocols 610


IKE Phase 1
 We discuss 6 of the 8 Phase 1 variants
o Public key signatures (main & aggressive modes)
o Symmetric key (main and aggressive modes)
o Public key encryption (main and aggressive)
 Why public key encryption and public key
signatures?
o Always know your own private key
o May not (initially) know other side’s public key

Part 3  Protocols 611


IKE Phase 1
 Uses ephemeral Diffie-Hellman to
establish session key
o Provides perfect forward secrecy (PFS)
 Let a be Alice’s Diffie-Hellman exponent
 Let b be Bob’s Diffie-Hellman exponent
 Let g be generator and p prime
 Recall that p and g are public

Part 3  Protocols 612


IKE Phase 1: Digital Signature
(Main Mode)
IC, CP
IC,RC, CS
IC,RC, ga mod p, RA
IC,RC, gb mod p, RB
IC,RC, E(“Alice”, proofA, K)
Alice IC,RC, E(“Bob”, proofB, K) Bob

 CP = crypto proposed, CS = crypto selected


 IC = initiator “cookie”, RC = responder “cookie”
 K = h(IC,RC,gab mod p,RA,RB)
 SKEYID = h(RA, RB, gab mod p)
 proofA = [h(SKEYID,ga mod p,gb mod
p,IC,RC,CP,“Alice”)]
Part 3  Protocols Alice 613
IKE Phase 1: Public Key
Signature (Aggressive Mode)
IC, “Alice”, ga mod p, RA, CP

IC,RC, “Bob”, RB,


gb mod p, CS, proofB

IC,RC, proofA
Alice Bob

 Main differences from main mode


o Not trying to hide identities
o Cannot negotiate g or p

Part 3  Protocols 614


Main vs Aggressive Modes
 Main mode MUST be implemented
 Aggressive mode SHOULD be implemented
o So, if aggressive mode is not implemented, “you
should feel guilty about it”
 Might create interoperability issues
 For public key signature authentication
o Passive attacker knows identities of Alice and
Bob in aggressive mode, but not in main mode
o Active attacker can determine Alice’s and Bob’s
identity in main mode
Part 3  Protocols 615
IKE Phase 1: Symmetric Key
(Main Mode)
IC, CP
IC,RC, CS
IC,RC, ga mod p, RA
IC,RC, gb mod p, RB
IC,RC, E(“Alice”, proofA, K)
Alice Bob
KAB IC,RC, E(“Bob”, proofB, K) KAB

 Same as signature mode except


o KAB = symmetric key shared in advance
o K = h(IC,RC,gab mod p,RA,RB,KAB)
o SKEYID = h(K, gab mod p)
o proofA = h(SKEYID,ga mod p,gb mod
Part 3 p,IC,RC,CP,“Alice”)
Protocols 616
Problems with Symmetric
Key (Main Mode)
 Catch-22
o Alice sends her ID in message 5
o Alice’s ID encrypted with K
o To find K Bob must know KAB
o To get KAB Bob must know he’s talking to Alice!
 Result: Alice’s IP address used as ID!
 Useless mode for the “road warrior”
 Why go to all of the trouble of trying to
hide identities in 6 message protocol?
Part 3  Protocols 617
IKE Phase 1: Symmetric Key
(Aggressive Mode)
IC, “Alice”, ga mod p, RA, CP

IC,RC, “Bob”, RB,


gb mod p, CS, proofB

IC,RC, proofA
Alice Bob

 Same format as digital signature aggressive mode


 Not trying to hide identities…
 As a result, does not have problems of main mode
 But does not (pretend to) hide identities
Part 3  Protocols 618
IKE Phase 1: Public Key
Encryption (Main Mode)
IC, CP
IC,RC, CS
IC,RC, ga mod p, {RA}Bob, {“Alice”}Bob

IC,RC, gb mod p, {RB}Alice, {“Bob”}Alice


IC,RC, E(proofA, K)
Alice Bob
IC,RC, E(proofB, K)
 CP = crypto proposed, CS = crypto selected
 IC = initiator “cookie”, RC = responder “cookie”
 K = h(IC,RC,gab mod p,RA,RB)
 SKEYID = h(RA, RB, gab mod p)
 proofA = h(SKEYID,ga mod p,gb mod
Partp,IC,RC,CP,“Alice”)
3  Protocols 619
IKE Phase 1: Public Key
Encryption (Aggressive Mode)
IC, CP, ga mod p,
{“Alice”}Bob, {RA}Bob
IC,RC, CS, gb mod p,
{“Bob”}Alice, {RB}Alice, proofB

IC,RC, proofA
Alice Bob
 K, proofA, proofB computed as in main mode
 Note that identities are hidden
o The only aggressive mode to hide identities
o So, why have a main mode?
Part 3  Protocols 620
Public Key Encryption Issue?
 In public key encryption, aggressive mode…
 Suppose Trudy generates
o Exponents a and b
o Nonces RA and RB
 Trudy can compute “valid” keys and proofs:
gab mod p, K, SKEYID, proofA and proofB
 All of this also works in main mode

Part 3  Protocols 621


Public Key Encryption Issue?
IC, CP, ga mod p,
{“Alice”}Bob, {RA}Bob
IC,RC, CS, gb mod p,
{“Bob”}Alice, {RB}Alice, proofB
Trudy IC,RC, proofA Trudy
(as Alice) (as Bob)

 Trudy can create messages that appears to


be between Alice and Bob
 Appears valid to any observer, including
Alice and Bob!
Part 3  Protocols 622
Plausible Deniability
 Trudy can create fake “conversation” that
appears to be between Alice and Bob
o Appears valid, even to Alice and Bob!
 A security failure?
 In IPSec public key option, it is a feature…
o Plausible deniability: Alice and Bob can deny
that any conversation took place!
 In some cases it might create a problem
o E.g., if Alice makes a purchase from Bob, she
could later repudiate it (unless she had signed)
Part 3  Protocols 623
IKE Phase 1 “Cookies”
 IC and RC  cookies (or “anti-clogging
tokens”) supposed to prevent DoS attacks
o No relation to Web cookies
 To reduce DoS threats, Bob wants to remain
stateless as long as possible
 But Bob must remember CP from message 1
(required for proof of identity in message 6)
 Bob must keep state from 1st message on
o So, these “cookies” offer little DoS protection

Part 3  Protocols 624


IKE Phase 1 Summary
 Result of IKE phase 1 is
o Mutual authentication
o Shared symmetric key
o IKE Security Association (SA)
 But phase 1 is expensive
o Especially in public key and/or main mode
 Developers of IKE thought it would be used
for lots of things  not just IPSec
o Partly explains the over-engineering…
Part 3  Protocols 625
IKE Phase 2
 Phase 1 establishes IKE SA
 Phase 2 establishes IPSec SA
 Comparison to SSL…
o SSL session is comparable to IKE Phase 1
o SSL connections are like IKE Phase 2
 IKE could be used for lots of things, but in
practice, it’s not!

Part 3  Protocols 626


IKE Phase 2
IC, RC, CP, E(hash1,SA,RA,K)

IC, RC, CS, E(hash2,SA,RB,K)

IC, RC, E(hash3,K)


Alice Bob
 Key K, IC, RC and SA known from Phase 1
 Proposal CP includes ESP and/or AH
 Hashes 1,2,3 depend on SKEYID, SA, RA and RB
 Keys derived from KEYMAT = h(SKEYID,RA,RB,junk)
 Recall SKEYID depends on phase 1 key method
 Optional PFS (ephemeral Diffie-Hellman exchange)
Part 3  Protocols 627
IPSec
 After IKE Phase 1, we have an IKE SA
 After IKE Phase 2, we have an IPSec SA
 Authentication completed and have a
shared symmetric key (session key)
 Now what?
o We want to protect IP datagrams
o But what is an IP datagram?
o From the perspective of IPSec…

Part 3  Protocols 628


IP Review
 IP datagram is of the form

IP header data

 Where IP header is

Part 3  Protocols 629


IP and TCP
 Consider Web traffic, for example
o IP encapsulates TCP and…
o …TCP encapsulates HTTP

IP header data

IP header TCP hdr HTTP hdr app data

 IP data includes TCP header, etc.


Part 3  Protocols 630
IPSec Transport Mode
 IPSec Transport Mode
IP header data

IP header IPSec header data

 Transport mode designed for host-to-host


 Transport mode is efficient
o Adds minimal amount of extra header
 The original header remains
o Passive attacker can see who is talking
Part 3  Protocols 631
IPSec: Host-to-Host
 IPSec transport mode used here

 There may be firewalls in between


o If so, is that a problem?
Part 3  Protocols 632
IPSec Tunnel Mode
 IPSec Tunnel Mode
IP header data

new IP hdr IPSec hdr IP header data

 Tunnel mode for firewall-to-firewall traffic


 Original IP packet encapsulated in IPSec
 Original IP header not visible to attacker
o New IP header from firewall to firewall
o Attacker does not know which hosts are talking
Part 3  Protocols 633
IPSec: Firewall-to-Firewall
 IPSec tunnel mode used here

 Note: Local networks not protected


 Is there any advantage here?

Part 3  Protocols 634


Comparison of IPSec Modes
 Transport Mode  Transport Mode
IP header data
o Host-to-host
 Tunnel Mode
IP header IPSec header data o Firewall-to-
firewall
 Tunnel Mode  Transport Mode
IP header data not necessary…
 …but it’s more
new IP hdr IPSec hdr IP header data efficient
Part 3  Protocols 635
IPSec Security
 What kind of protection?
o Confidentiality?
o Integrity?
o Both?
 What to protect?
o Data?
o Header?
o Both?
 ESP/AH allow some combinations of these
Part 3  Protocols 636
AH vs ESP
 AH  Authentication Header
o Integrity only (no confidentiality)
o Integrity-protect everything beyond IP header
and some fields of header (why not all fields?)

 ESP  Encapsulating Security Payload


o Integrity and confidentiality both required
o Protects everything beyond IP header
o Integrity-only by using NULL encryption

Part 3  Protocols 637


ESP NULL Encryption
 According to RFC 2410
o NULL encryption “is a block cipher the origins of
which appear to be lost in antiquity”
o “Despite rumors”, there is no evidence that NSA
“suppressed publication of this algorithm”
o Evidence suggests it was developed in Roman
times as exportable version of Caesar’s cipher
o Can make use of keys of varying length
o No IV is required
o Null(P,K) = P for any P and any key K
 Is ESP with NULL encryption same as AH ?
Part 3  Protocols 638
Why Does AH Exist? (1)
 Cannot encrypt IP header
o Routers must look at the IP header
o IP addresses, TTL, etc.
o IP header exists to route packets!
 AH protects immutable fields in IP header
o Cannot integrity protect all header fields
o TTL, for example, will change
 ESP does not protect IP header at all
Part 3  Protocols 639
Why Does AH Exist? (2)
 ESP encrypts everything beyond the IP
header (if non-null encryption)
 If ESP-encrypted, firewall cannot look at
TCP header in host-to-host case
 Why not use ESP with NULL encryption?
o Firewall sees ESP header, but does not know
whether null encryption is used
o End systems know, but not the firewalls

Part 3  Protocols 640


Why Does AH Exist? (3)
 The real reason why AH exists:
o At one IETF meeting “someone from
Microsoft gave an impassioned speech
about how AH was useless…”
o “…everyone in the room looked around and
said `Hmm. He’s right, and we hate AH
also, but if it annoys Microsoft let’s leave
it in since we hate Microsoft more than we
hate AH.’ ”

Part 3  Protocols 641


Kerberos

Part 3  Protocols 642


Kerberos
 In Greek mythology, Kerberos is 3-headed
dog that guards entrance to Hades
o “Wouldn’t it make more sense to guard the exit?”
 In security, Kerberos is an authentication
protocol based on symmetric key crypto
o Originated at MIT
o Based on Needham and Schroeder protocol
o Relies on a Trusted Third Party (TTP)

Part 3  Protocols 643


Motivation for Kerberos
 Authentication using public keys
o N users  N key pairs
 Authentication using symmetric keys
o N users requires (on the order of) N2 keys
 Symmetric key case does not scale
 Kerberos based on symmetric keys but only
requires N keys for N users
- Security depends on TTP
+ No PKI is needed

Part 3  Protocols 644


Kerberos KDC
 Kerberos Key Distribution Center or KDC
o KDC acts as the TTP
o TTP is trusted, so it must not be compromised
 KDC shares symmetric key KA with Alice,
key KB with Bob, key KC with Carol, etc.
 And a master key KKDC known only to KDC
 KDC enables authentication, session keys
o Session key for confidentiality and integrity
 In practice, crypto algorithm is DES
Part 3  Protocols 645
Kerberos Tickets
 KDC issue tickets containing info needed to
access network resources
 KDC also issues Ticket-Granting Tickets
or TGTs that are used to obtain tickets
 Each TGT contains
o Session key
o User’s ID
o Expiration time
 Every TGT is encrypted with KKDC
o So, TGT can only be read by the KDC
Part 3  Protocols 646
Kerberized Login
 Alice enters her password
 Then Alice’s computer does following:
o Derives KA from Alice’s password
o Uses KA to get TGT for Alice from KDC
 Alice then uses her TGT (credentials) to
securely access network resources
 Plus: Security is transparent to Alice
 Minus: KDC must be secure  it’s trusted!

Part 3  Protocols 647


Kerberized Login
Alice wants
Alice’s a TGT
password
E(SA,TGT,KA)

Alice Computer KDC


 Key KA = h(Alice’s password)
 KDC creates session key SA
 Alice’s computer decrypts SA and TGT
o Then it forgets KA
 TGT = E(“Alice”, SA, KKDC)
Part 3  Protocols 648
Alice Requests “Ticket to Bob”
I want to
talk to Bob
Talk to Bob REQUEST

REPLY

Alice Computer KDC


 REQUEST = (TGT, authenticator)
o authenticator = E(timestamp, SA)
 REPLY = E(“Bob”, KAB, ticket to Bob, SA)
o ticket to Bob = E(“Alice”, KAB, KB)
 KDC gets SA from TGT to verify timestamp
Part 3  Protocols 649
Alice Uses Ticket to Bob

ticket to Bob, authenticator

E(timestamp + 1, KAB)

Alice’s Bob
Computer

 ticket to Bob = E(“Alice”, KAB, KB)


 authenticator = E(timestamp, KAB)
 Bob decrypts “ticket to Bob” to get KAB which he
then uses to verify timestamp
Part 3  Protocols 650
Kerberos
 Key SA used in authentication
o For confidentiality/integrity
 Timestamps for authentication and
replay protection
 Recall, that timestamps…
o Reduce the number of messages  like a
nonce that is known in advance
o But, “time” is a security-critical parameter
Part 3  Protocols 651
Questions about Kerberos
 When Alice logs in, KDC sends E(SA, TGT, KA)
where TGT = E(“Alice”, SA, KKDC)
Q: Why is TGT encrypted with KA?
A: Enables Alice to be anonymous when she later
uses her TGT to request a ticket
 In Alice’s “Kerberized” login to Bob, why
can Alice remain anonymous?
 Why is “ticket to Bob” sent to Alice?
o Why doesn’t KDC send it directly to Bob?

Part 3  Protocols 652


Kerberos Alternatives
 Could have Alice’s computer remember
password and use that for authentication
o Then no KDC required
o But hard to protect passwords
o Also, does not scale
 Could have KDC remember session key
instead of putting it in a TGT
o Then no need for TGT
o But stateless KDC is major feature of Kerberos

Part 3  Protocols 653


Kerberos Keys
 In Kerberos, KA = h(Alice’s password)
 Could instead generate random KA
o Compute Kh = h(Alice’s password)
o And Alice’s computer stores E(KA, Kh)
 Then KA need not change when Alice changes
her password
o But E(KA, Kh) must be stored on computer
 This alternative approach is often used
o But not in Kerberos
Part 3  Protocols 654
WEP

Part 3  Protocols 655


WEP
 WEP  Wired Equivalent Privacy
 The stated goal of WEP is to make
wireless LAN as secure as a wired LAN
 According to Tanenbaum:
o “The 802.11 standard prescribes a data link-
level security protocol called WEP (Wired
Equivalent Privacy), which is designed to make
the security of a wireless LAN as good as that
of a wired LAN. Since the default for a wired
LAN is no security at all, this goal is easy to
achieve, and WEP achieves it as we shall see.”
Part 3  Protocols 656
WEP Authentication
Authentication Request
R

E(R, K)
Alice, K Bob, K

 Bob is wireless access point


 Key K shared by access point and all users
o Key K seldom (if ever) changes
 WEP has many, many, many security flaws
Part 3  Protocols 657
WEP Issues
 WEP uses RC4 cipher for confidentiality
o RC4 can be a strong cipher
o But WEP introduces a subtle flaw…
o …making cryptanalytic attacks feasible
 WEP uses CRC for “integrity”
o Should have used a MAC, HMAC, or similar
o CRC is for error detection, not crypto integrity
o Everyone should know NOT to use CRC here…

Part 3  Protocols 658


WEP Integrity Problems
 WEP “integrity” gives no crypto integrity
o CRC is linear, so is stream cipher (XOR)
o Trudy can change ciphertext and CRC so that
checksum on plaintext remains valid
o Then Trudy’s introduced changes go undetected
o Requires no knowledge of the plaintext!
 CRC does not provide a cryptographic
integrity check
o CRC designed to detect random errors
o Not to detect intelligent changes

Part 3  Protocols 659


More WEP Integrity Issues
 Suppose Trudy knows destination IP
 Then Trudy also knows keystream used to
encrypt IP address, since
C = destination IP address  keystream
 Then Trudy can replace C with
C = Trudy’s IP address  keystream
 And change the CRC so no error detected
o Then what happens??
 Moral: Big problems when integrity fails
Part 3  Protocols 660
WEP Key
 Recall WEP uses a long-term key K
 RC4 is a stream cipher, so each packet
must be encrypted using a different key
o Initialization Vector (IV) sent with packet
o Sent in the clear, that is, IV is not secret
o Note: IV similar to MI in WWII ciphers
 Actual RC4 key for packet is (IV,K)
o That is, IV is pre-pended to long-term key K

Part 3  Protocols 661


WEP Encryption

IV, E(packet,KIV)

Alice, K Bob, K

 KIV = (IV,K)
o That is, RC4 key is K with 3-byte IV pre-pended
 The IV is known to Trudy

Part 3  Protocols 662


WEP IV Issues
 WEP uses 24-bit (3 byte) IV
o Each packet gets its own IV
o Key: IV pre-pended to long-term key, K
 Long term key K seldom changes
 If long-term key and IV are same,
then same keystream is used
o This is bad, bad, really really bad!
o Why?
Part 3  Protocols 663
WEP IV Issues
 Assume 1500 byte packets, 11 Mbps link
 Suppose IVs generated in sequence
o Since 1500  8/(11  106)  224 = 18,000 seconds,
an IV repeat in about 5 hours of traffic
 Suppose IVs generated at random
o By birthday problem, some IV repeats in
seconds
 Again, repeated IV (with same K) is bad

Part 3  Protocols 664


Another Active Attack
 Suppose Trudy can insert traffic and
observe corresponding ciphertext
o Then she knows the keystream for some IV
o She can decrypt any packet that uses that IV
 If Trudy does this many times, she can
then decrypt data for lots of IVs
o Remember, IV is sent in the clear
 Is such an attack feasible?
Part 3  Protocols 665
Cryptanalytic Attack
 WEP data encrypted using RC4
o Packet key is IV with long-term key K
o 3-byte IV is pre-pended to K
o Packet key is (IV,K)
 Recall IV is sent in the clear (not secret)
o New IV sent with every packet
o Long-term key K seldom changes (maybe never)
 So Trudy always knows IV and ciphertext
o Trudy wants to find the key K
Part 3  Protocols 666
Cryptanalytic Attack
 3-byte IV pre-pended to key
 Denote the RC4 key bytes …
o … as K0,K1,K2,K3,K4,K5, …
o Where IV = (K0,K1,K2) , which Trudy knows
o Trudy wants to find K = (K3,K4,K5, …)
 Given enough IVs, Trudy can easily find key K
o Regardless of the length of the key
o Provided Trudy knows first keystream byte
o Known plaintext attack (1st byte of each packet)
o Prevent by discarding first 256 keystream bytes
Part 3  Protocols 667
WEP Conclusions
 Many attacks are practical
 Attacks have been used to recover keys
and break real WEP traffic
 How to prevent these attacks?
o Don’t use WEP
o Good alternatives: WPA, WPA2, etc.
 How to make WEP a little better?
o Restrict MAC addresses, don’t broadcast ID, …

Part 3  Protocols 668


GSM (In)Security

Part 3  Protocols 669


Cell Phones
 First generation cell phones
o Brick-sized, analog, few standards
o Little or no security
o Susceptible to cloning
 Second generation cell phones: GSM
o Began in 1982 as “Groupe Speciale Mobile”
o Now, Global System for Mobile Communications
 Third generation?
o 3rd Generation Partnership Project (3GPP)

Part 3  Protocols 670


GSM System Overview

air
interface

Mobile
Base AuC
VLR
Station
“land line”
HLR
PSTN
Base Internet
etc. Home
Visited Station Network
Network Controller

Part 3  Protocols 671


GSM System Components
 Mobile phone
o Contains SIM (Subscriber
Identity Module)
 SIM is the security module
o IMSI (International Mobile
Subscriber ID)
o User key: Ki (128 bits) SIM
o Tamper resistant (smart card)
o PIN activated (often not used)

Part 3  Protocols 672


GSM System Components
 Visited network  network where mobile is
currently located
o Base station  one “cell”
o Base station controller  manages many cells
o VLR (Visitor Location Register)  info on all
visiting mobiles currently in the network
 Home network  “home” of the mobile
o HLR (Home Location Register)  keeps track of
most recent location of mobile
o AuC (Authentication Center)  has IMSI and Ki
Part 3  Protocols 673
GSM Security Goals
 Primary design goals
o Make GSM as secure as ordinary telephone
o Prevent phone cloning
 Not designed to resist an active attacks
o At the time this seemed infeasible
o Today such an attacks are clearly feasible…
 Designers considered biggest threats to be
o Insecure billing
o Corruption
o Other low-tech attacks
Part 3  Protocols 674
GSM Security Features
 Anonymity
o Intercepted traffic does not identify user
o Not so important to phone company
 Authentication
o Necessary for proper billing
o Very, very important to phone company!
 Confidentiality
o Confidentiality of calls over the air interface
o Not important to phone company…
o …except for marketing
Part 3  Protocols 675
GSM: Anonymity
 IMSI used to initially identify caller
 Then TMSI (Temporary Mobile Subscriber
ID) used
o TMSI changed frequently
o TMSI’s encrypted when sent
 Not a strong form of anonymity
 But probably useful in many cases

Part 3  Protocols 676


GSM: Authentication
 Caller is authenticated to base station
 Authentication is not mutual
 Authentication via challenge-response
o Home network generates RAND and computes
XRES = A3(RAND, Ki) where A3 is a hash
o Then (RAND,XRES) sent to base station
o Base station sends challenge RAND to mobile
o Mobile’s response is SRES = A3(RAND, Ki)
o Base station verifies SRES = XRES
 Note: Ki never leaves home network
Part 3  Protocols 677
GSM: Confidentiality
 Data encrypted with stream cipher
 Error rate estimated at about 1/1000
o Error rate is high for a block cipher
 Encryption key Kc
o Home network computes Kc = A8(RAND, Ki)
where A8 is a hash
o Then Kc sent to base station with (RAND,XRES)
o Mobile computes Kc = A8(RAND, Ki)
o Keystream generated from A5(Kc)
 Note: Ki never leaves home network
Part 3  Protocols 678
GSM Security
1. IMSI
2. IMSI
4. RAND
3. (RAND,XRES,Kc)
5. SRES
Mobile Base Home
6. Encrypt with Kc Station Network

 SRES and Kc must be uncorrelated


o Even though both are derived from RAND and Ki
 Must not be possible to deduce Ki from known
RAND/SRES pairs (known plaintext attack)
 Must not be possible to deduce Ki from chosen
RAND/SRES pairs (chosen plaintext attack)
o With possession of SIM, attacker can choose RAND’s

Part 3  Protocols 679


GSM Insecurity (1)
 Hash used for A3/A8 is COMP128
o Broken by 160,000 chosen plaintexts
o With SIM, can get Ki in 2 to 10 hours Base
Station
 Encryption between mobile and base
station but no encryption from base
VLR
station to base station controller
o Often transmitted over microwave link
 Encryption algorithm A5/1 Base
Station
o Broken with 2 seconds of known plaintext Controller

Part 3  Protocols 680


GSM Insecurity (2)
 Attacks on SIM card
o Optical Fault Induction  could attack SIM
with a flashbulb to recover Ki
o Partitioning Attacks  using timing and power
consumption, could recover Ki with only 8
adaptively chosen “plaintexts”
 With possession of SIM, attacker could
recover Ki in seconds

Part 3  Protocols 681


GSM Insecurity (3)
 Fake base station exploits two flaws
1. Encryption not automatic
2. Base station not authenticated

RAND
SRES Call to
destination
No
Mobile Fake
encryption Base Station Base Station

 Note: GSM bill goes to fake base station!

Part 3  Protocols 682


GSM Insecurity (4)
 Denial of service is possible
o Jamming (always an issue in wireless)
 Can replay triple: (RAND,XRES,Kc)
o One compromised triple gives attacker a
key Kc that is valid forever
o No replay protection here

Part 3  Protocols 683


GSM Conclusion
 Did GSM achieve its goals?
o Eliminate cloning? Yes, as a practical matter
o Make air interface as secure as PSTN? Perhaps…
 But design goals were clearly too limited
 GSM insecurities  weak crypto, SIM
issues, fake base station, replay, etc.
 PSTN insecurities  tapping, active attack,
passive attack (e.g., cordless phones), etc.
 GSM a (modest) security success?
Part 3  Protocols 684
3rd Generation Partnership
Project (3GPP)
 3G security built on GSM (in)security
 3G fixed known GSM security problems
o Mutual authentication
o Integrity-protect signaling (such as “start
encryption” command)
o Keys (encryption/integrity) cannot be reused
o Triples cannot be replayed
o Strong encryption algorithm (KASUMI)
o Encryption extended to base station controller
Part 3  Protocols 685
Protocols Summary
 Generic authentication protocols
o Protocols are subtle!
 SSH
 SSL
 IPSec
 Kerberos
 Wireless: GSM and WEP
Part 3  Protocols 686
Coming Attractions…
 Software and security
o Software flaws  buffer overflow, etc.
o Malware  viruses, worms, etc.
o Software reverse engineering
o Digital rights management
o OS and security/NGSCB

Part 3  Protocols 687


Part IV: Software

Part 4  Software 688


Why Software?
 Why is software as important to security
as crypto, access control, protocols?
 Virtually all information security features
are implemented in software
 If your software is subject to attack, your
security can be broken
o Regardless of strength of crypto, access
control, or protocols
 Software is a poor foundation for security

Part 4  Software 689


Chapter 11:
Software Flaws and Malware
If automobiles had followed the same development cycle as the computer,
a Rolls-Royce would today cost $100, get a million miles per gallon,
and explode once a year, killing everyone inside.
 Robert X. Cringely

My software never has bugs. It just develops random features.


 Anonymous

Part 4  Software 690


Bad Software is Ubiquitous
 NASA Mars Lander (cost $165 million)
o Crashed into Mars due to…
o …error in converting English and metric units of measure
o Believe it or not
 Denver airport
o Baggage handling system  very buggy software
o Delayed airport opening by 11 months
o Cost of delay exceeded $1 million/day
o What happened to person responsible for this fiasco?
 MV-22 Osprey
o Advanced military aircraft
o Faulty software can be fatal

Part 4  Software 691


Software Issues
Alice and Bob Trudy
 Find bugs and flaws  Actively looks for
by accident bugs and flaws
 Hate bad software…  Likes bad software…
 …but they learn to  …and tries to make
live with it it misbehave
 Must make bad  Attacks systems via
software work bad software

Part 4  Software 692


Complexity
 “Complexity is the enemy of security”, Paul
Kocher, Cryptography Research, Inc.

System Lines of Code (LOC)


Netscape 17 million
Space Shuttle 10 million
Linux kernel 2.6.0 5 million
Windows XP 40 million
Mac OS X 10.4 86 million
Boeing 777 7 million

 A new car contains more LOC than was required


to land the Apollo astronauts on the moon
Part 4  Software 693
Lines of Code and Bugs
 Conservative estimate: 5 bugs/10,000 LOC
 Do the math
o Typical computer: 3k exe’s of 100k LOC each
o Conservative estimate: 50 bugs/exe
o Implies about 150k bugs per computer
o So, 30,000-node network has 4.5 billion bugs
o Maybe only 10% of bugs security-critical and
only 10% of those remotely exploitable
o Then “only” 45 million critical security flaws!

Part 4  Software 694


Software Security Topics
 Program flaws (unintentional)
o Buffer overflow
o Incomplete mediation
o Race conditions
 Malicious software (intentional)
o Viruses
o Worms
o Other breeds of malware

Part 4  Software 695


Program Flaws
 An error is a programming mistake
o To err is human
 An error may lead to incorrect state: fault
o A fault is internal to the program
 A fault may lead to a failure, where a
system departs from its expected behavior
o A failure is externally observable

error fault failure

Part 4  Software 696


Example
char array[10];
for(i = 0; i < 10; ++i)
array[i] = `A`;
array[10] = `B`;
 This program has an error
 This error might cause a fault
o Incorrect internal state
 If a fault occurs, it might lead to a failure
o Program behaves incorrectly (external)
 We use the term flaw for all of the above
Part 4  Software 697
Secure Software
 In software engineering, try to ensure that
a program does what is intended
 Secure software engineering requires that
software does what is intended…
 …and nothing more
 Absolutely secure software? Dream on…
o Absolute security anywhere is impossible
 How can we manage software risks?

Part 4  Software 698


Program Flaws
 Program flaws are unintentional
o But can still create security risks
 We’ll consider 3 types of flaws
o Buffer overflow (smashing the stack)
o Incomplete mediation
o Race conditions
 These are the most common flaws
Part 4  Software 699
Buffer Overflow

Part 4  Software 700


Attack Scenario
 Users enter data into a Web form
 Web form is sent to server
 Server writes data to array called buffer,
without checking length of input data
 Data “overflows” buffer
o Such overflow might enable an attack
o If so, attack could be carried out by anyone
with Internet access

Part 4  Software 701


Buffer Overflow
int main(){
int buffer[10];
buffer[20] = 37;}

 Q: What happens when code is executed?


 A: Depending on what resides in memory
at location “buffer[20]”
o Might overwrite user data or code
o Might overwrite system data or code
o Or program could work just fine
Part 4  Software 702
Simple Buffer Overflow
 Consider boolean flag for authentication
 Buffer overflow could overwrite flag
allowing anyone to authenticate
Boolean flag
buffer
F OU R S C … T
F

 In some cases, Trudy need not be so lucky


as in this example
Part 4  Software 703
Memory Organization
 low
 Text  code text address
 Data  static variables
data
 Heap  dynamic data
heap
 Stack  “scratch paper” 
 stack
o Dynamic local variables pointer (SP)

o Parameters to functions
stack  high
o Return address address

Part 4  Software 704


Simplified Stack Example
low 

void func(int a, int b){ :


char buffer[10]; :
}
void main(){
func(1,2);  SP
buffer
}
return
 SP
ret address
a  SP

high  b  SP

Part 4  Software 705


Smashing the Stack
low 

 What happens if :
??? :
buffer overflows?
 Program“returns”  SP
to wrong location buffer
ret… NOT!
 SP
A crash is likely overflow
ret
overflow
a  SP

high  b  SP

Part 4  Software 706


Smashing the Stack
low 
 Trudy has a
:
better idea… :
 Code injection
 Trudy can run  SP
code of her evil code

choosing… ret
ret  SP
a  SP
o …on your machine
high  b  SP

Part 4  Software 707


Smashing the Stack
:
:
 Trudy may not know…
NOP
1) Address of evil code :
2) Location of ret on stack NOP

 Solutions evil code

1) Precede evil code with ret


 ret
NOP “landing pad” ret
:
2) Insert ret many times
ret
:
Part 4  Software : 708
Stack Smashing Summary
 A buffer overflow must exist in the code
 Not all buffer overflows are exploitable
o Things must align properly
 If exploitable, attacker can inject code
 Trial and error is likely required
o Fear not, lots of help is available online
o Smashing the Stack for Fun and Profit, Aleph One
 Stack smashing is “attack of the decade”…
o …for many recent decades
o Also heap & integer overflows, format strings, etc.
Part 4  Software 709
Stack Smashing Example
 Suppose program asks for a serial number
that Trudy does not know
 Also, Trudy does not have source code
 Trudy only has the executable (exe)

 Program quits on incorrect serial number


Part 4  Software 710
Buffer Overflow Present?
 By trial and error, Trudy discovers
apparent buffer overflow

 Note that 0x41 is ASCII for “A”


 Looks like ret overwritten by 2 bytes!
Part 4  Software 711
Disassemble Code
 Next, disassemble bo.exe to find

 The goal is to exploit buffer overflow


to jump to address 0x401034
Part 4  Software 712
Buffer Overflow Attack
 Find that, in ASCII, 0x401034 is “@^P4”

 Byte order is reversed? What the …


 X86 processors are “little-endian”
Part 4  Software 713
Overflow Attack, Take 2
 Reverse the byte order to “4^P@” and…

 Success! We’ve bypassed serial number


check by exploiting a buffer overflow
 What just happened?
o Overwrote return address on the stack

Part 4  Software 714


Buffer Overflow
 Trudy did not require access to the
source code
 Only tool used was a disassembler to
determine address to jump to
 Find desired address by trial and error?
o Necessary if attacker does not have exe
o For example, a remote attack

Part 4  Software 715


Source Code
 Source code for buffer overflow example
 Flaw easily
exploited by
attacker…
 …without
access to
source code!

Part 4  Software 716


Stack Smashing Defenses
 Employ non-executable stack
o “No execute” NX bit (if available)
o Seems like the logical thing to do, but some real
code executes on the stack (Java, for example)
 Use a canary
 Address space layout randomization (ASLR)
 Use safe languages (Java, C#)
 Use safer C functions
o For unsafe functions, safer versions exist
o For example, strncpy instead of strcpy

Part 4  Software 717


Stack Smashing Defenses
low 
:
 Canary :

o Run-time stack check


o Push canary onto stack
o Canary value: buffer
 Constant 0x000aff0d overflow
canary 

 Or, may depends on ret overflow


ret
a
high  b
Part 4  Software 718
Microsoft’s Canary
 Microsoft added buffer security check
feature to C++ with /GS compiler flag
o Based on canary (or “security cookie”)
Q: What to do when canary dies?
A: Check for user-supplied “handler”
 Handler shown to be subject to attack
o Claimed that attacker can specify handler code
o If so, formerly “safe” buffer overflows become
exploitable when /GS is used!

Part 4  Software 719


ASLR
 Address Space Layout Randomization
o Randomize place where code loaded in memory
 Makes most buffer overflow attacks
probabilistic
 Windows Vista uses 256 random layouts
o So about 1/256 chance buffer overflow works
 Similar thing in Mac OS X and other OSs
 Attacks against Microsoft’s ASLR do exist
o Possible to “de-randomize”

Part 4  Software 720


Buffer Overflow
 A major security threat yesterday, today,
and tomorrow
 The good news?
o It is possible to reduce overflow attacks (safe
languages, NX bit, ASLR, education, etc.)
 The bad news?
o Buffer overflows will exist for a long time
o Why? Legacy code, bad development practices,
clever attacks, etc.
Part 4  Software 721
Incomplete Mediation

Part 4  Software 722


Input Validation
 Consider: strcpy(buffer, argv[1])
 A buffer overflow occurs if
len(buffer) < len(argv[1])
 Software must validate the input by
checking the length of argv[1]
 Failure to do so is an example of a more
general problem: incomplete mediation

Part 4  Software 723


Input Validation
 Consider web form data
 Suppose input is validated on client
 For example, the following is valid
http://www.things.com/orders/final&custID=112&
num=55A&qty=20&price=10&shipping=5&total=205
 Suppose input is not checked on server
o Why bother since input checked on client?
o Then attacker could send http message
http://www.things.com/orders/final&custID=112&
num=55A&qty=20&price=10&shipping=5&total=25

Part 4  Software 724


Incomplete Mediation
 Linux kernel
o Research revealed many buffer overflows
o Lots of these due to incomplete mediation
 Linux kernel is “good” software since
o Open-source
o Kernel  written by coding gurus
 Tools exist to help find such problems
o But incomplete mediation errors can be subtle
o And tools useful for attackers too!
Part 4  Software 725
Race Conditions

Part 4  Software 726


Race Condition
 Security processes should be atomic
o Occur “all at once”
 Race conditions can arise when security-
critical process occurs in stages
 Attacker makes change between stages
o Often, between stage that gives authorization,
but before stage that transfers ownership
 Example: Unix mkdir

Part 4  Software 727


mkdir Race Condition
 mkdircreates new directory
 How mkdir is supposed to work

mkdir
1. Allocate
space
2. Transfer
ownership

Part 4  Software 728


mkdir Attack
 The mkdir race condition
mkdir
1. Allocate
space
3. Transfer
ownership

2. Create link to
password file

 Not really a “race”


o But attacker’s timing is critical
Part 4  Software 729
Race Conditions
 Race conditions are common
 Race conditions may be more prevalent
than buffer overflows
 But race conditions harder to exploit
o Buffer overflow is “low hanging fruit” today
 To prevent race conditions, make security-
critical processes atomic
o Occur all at once, not in stages
o Not always easy to accomplish in practice
Part 4  Software 730
Malware

Part 4  Software 731


Malicious Software
 Malware is not new…
o Fred Cohen’s initial virus work in 1980’s
o Cohen used viruses to break MLS systems
 Types of malware (no standard definition)
o Virus  passive propagation
o Worm  active propagation
o Trojan horse  unexpected functionality
o Trapdoor/backdoor  unauthorized access
o Rabbit  exhaust system resources
o Spyware  steals info, such as passwords
Part 4  Software 732
Where do Viruses Live?
 They live just about anywhere, such as…
 Boot sector
o Take control before anything else
 Memory resident
o Stays in memory
 Applications, macros, data, etc.
 Library routines
 Compilers, debuggers, virus checker, etc.
o These would be particularly nasty!
Part 4  Software 733
Malware Examples
 Brain virus (1986)
 Morris worm (1988)
 Code Red (2001)
 SQL Slammer (2004)
 Stuxnet (2010)
 Botnets (currently fashionable malware)
 Future of malware?
Part 4  Software 734
Brain
 First appeared in 1986
 More annoying than harmful
 A prototype for later viruses
 Not much reaction by users
 What it did
1. Placed itself in boot sector (and other places)
2. Screened disk calls to avoid detection
3. Each disk read, checked boot sector to see if
boot sector infected; if not, goto 1
 Brain did nothing really malicious
Part 4  Software 735
Morris Worm
 First
appeared in 1988
 What it tried to do
o Determine where it could spread, then…
o …spread its infection and…
o …remain undiscovered
 Morris claimed his worm had a bug!
o It tried to re-infect infected systems
o Led to resource exhaustion
o Effect was like a so-called rabbit
Part 4  Software 736
How Morris Worm Spread
 Obtained access to machines by…
o User account password guessing
o Exploit buffer overflow in fingerd
o Exploit trapdoor in sendmail
 Flaws in fingerd and sendmail were
well-known, but not widely patched

Part 4  Software 737


Bootstrap Loader
 Once Morris worm got access…
 “Bootstrap loader” sent to victim
o 99 lines of C code
 Victim compiled and executed code
 Bootstrap loader fetched the worm
 Victim authenticated sender
o Don’t want user to get a bad worm…
Part 4  Software 738
How to Remain Undetected?
 If transmission interrupted, all code
deleted
 Code encrypted when downloaded
 Code deleted after decrypt/compile
 When running, worm regularly changed
name and process identifier (PID)

Part 4  Software 739


Morris Worm: Bottom Line
 Shock to the Internet community of 1988
o Internet of 1988 much different than today
 Internet designed to survive nuclear war
o Yet, brought down by one graduate student!
o At the time, Morris’ father worked at NSA…
 Could have been much worse
 Result? CERT, more security awareness
 But should have been a wakeup call
Part 4  Software 740
Code Red Worm
 Appeared in July 2001
 Infected more than 250,000 systems
in about 15 hours
 Eventually infected 750,000 out of
about 6,000,000 vulnerable systems
 Exploited buffer overflow in
Microsoft IIS server software
o Then monitor traffic on port 80, looking
for other susceptible servers
Part 4  Software 741
Code Red: What it Did
 Day 1 to 19 of month: spread its infection
 Day 20 to 27: distributed denial of service
attack (DDoS) on www.whitehouse.gov
 Later version (several variants)
o Included trapdoor for remote access
o Rebooted to flush worm, leaving only trapdoor
 Some said it was “beta test for info warfare”
o But, no evidence to support this

Part 4  Software 742


SQL Slammer
 Infected 75,000 systems
in 10 minutes!
 At its peak, infections
doubled every 8.5 seconds
 Spread “too fast”…
 …so it “burned out”
available bandwidth

Part 4  Software 743


Why was Slammer Successful?
 Worm size: one 376-byte UDP packet
 Firewalls often let one packet thru
o Then monitor ongoing “connections”
 Expectation was that much more data
required for an attack
o So no need to worry about 1 small packet
 Slammer defied “experts”

Part 4  Software 744


Stuxnet
 Malware for information warfare…
 Discovered in 2010
o Origins go back to 2008, or earlier
 Apparently, targeted Iranian nuclear
processing facility
o Reprogrammed specific type of PLC
o Changed speed of centrifuges, causing
damage to about 1000 of them

Part 4  Software 745


Stuxnet
 Many advanced features including…
o Infect system via removable drives 
able to get behind “airgap” firewalls
o Used 4 unpatched MS vulnerabilities
o Updates via P2P over a LAN
o Contact C&C server for code/updates
o Includes a Windows rootkit for stealth
o Significant exfiltration/recon capability
o Used a compromised private key
Part 4  Software 746
Malware Related to Stuxnet
 Duqu (2011)
o Likely that developers had access to
Stuxnet source code
o Apparently, used mostly for info stealing
 Flame (2012)
o May be “most complex” malware ever
o Very sophisticated spyware mechanisms

Part 4  Software 747


Trojan Horse Example
 Trojan: unexpected functionality
 Prototype trojan for the Mac
 File icon for freeMusic.mp3:
 For a real mp3, double click on icon
o iTunes opens
o Music in mp3 file plays
 But for freeMusic.mp3, unexpected results…

Part 4  Software 748


Mac Trojan
 Double click on freeMusic.mp3
o iTunes opens (expected)
o “Wild Laugh” (not expected)
o Message box (not expected)

Part 4  Software 749


Trojan Example
 How does freeMusic.mp3 trojan work?
 This “mp3” is an application, not data

 This trojan is harmless, but…


 …could have done anything user could do
o Delete files, download files, launch apps, etc.
Part 4  Software 750
Malware Detection
 Three common detection methods
o Signature detection
o Change detection
o Anomaly detection
 We briefly discuss each of these
o And consider advantages…
o …and disadvantages

Part 4  Software 751


Signature Detection
 A signature may be a string of bits in exe
o Might also use wildcards, hash values, etc.
 For example, W32/Beast virus has signature
83EB 0274 EB0E 740A 81EB 0301 0000
o That is, this string of bits appears in virus
 We can search for this signature in all files
 If string found, have we found W32/Beast?
o Not necessarily  string could be in normal code
o At random, chance is only 1/2112
o But software is not random…
Part 4  Software 752
Signature Detection
 Advantages
o Effective on “ordinary” malware
o Minimal burden for users/administrators
 Disadvantages
o Signature file can be large (10s of thousands)…
o …making scanning slow
o Signature files must be kept up to date
o Cannot detect unknown viruses
o Cannot detect some advanced types of malware
 The most popular detection method
Part 4  Software 753
Change Detection
 Viruses must live somewhere
 Ifyou detect a file has changed, it
might have been infected
 How to detect changes?
o Hash files and (securely) store hash values
o Periodically re-compute hashes and
compare
o If hash changes, file might be infected
Part 4  Software 754
Change Detection
 Advantages
o Virtually no false negatives
o Can even detect previously unknown malware
 Disadvantages
o Many files change  and often
o Many false alarms (false positives)
o Heavy burden on users/administrators
o If suspicious change detected, then what?
Might fall back on signature detection

Part 4  Software 755


Anomaly Detection
 Monitor system for anything “unusual” or
“virus-like” or “potentially malicious” or …
 Examples of anomalous things
o Files change in some unexpected way
o System misbehaves in some way
o Unexpected network activity
o Unexpected file access, etc., etc., etc., etc.
 But, we must first define “normal”
o And normal can (and must) change over time
Part 4  Software 756
Anomaly Detection
 Advantages
o Chance of detecting unknown malware
 Disadvantages
o No proven track record
o Trudy can make abnormal look normal (go slow)
o Must be combined with another method (e.g.,
signature detection)
 Also popular in intrusion detection (IDS)
 Difficult unsolved (unsolvable?) problem
o Reminds me of AI…
Part 4  Software 757
Future of Malware
 Recent trends
o Encrypted, polymorphic, metamorphic malware
o Fast replication/Warhol worms
o Flash worms, slow worms
o Botnets
 The future is bright for malware
o Good news for the bad guys…
o …bad news for the good guys
 Future of malware detection?
Part 4  Software 758
Encrypted Viruses
 Virus writers know signature detection used
 So, how to evade signature detection?
 Encrypting the virus is a good approach
o Ciphertext looks like random bits
o Different key, then different “random” bits
o So, different copies have no common signature
 Encryption often used in viruses today

Part 4  Software 759


Encrypted Viruses
 How to detect encrypted viruses?
 Scan for the decryptor code
o More-or-less standard signature detection
o But may be more false alarms
 Why not encrypt the decryptor code?
o Then encrypt the decryptor of the decryptor
(and so on…)
 Encryption of limited value to virus writers

Part 4  Software 760


Polymorphic Malware
 Polymorphic worm
o Body of worm is encrypted
o Decryptor code is “mutated” (or “morphed”)
o Trying to hide decryptor signature
o Like an encrypted worm on steroids…
Q: How to detect?
A: Emulation  let the code decrypt itself
o Slow, and anti-emulation is possible

Part 4  Software 761


Metamorphic Malware
 A metamorphic worm mutates before
infecting a new system
o Sometimes called “body polymorphic”
 Such a worm can, in principle, evade
signature-based detection
 Mutated worm must function the same
o And be “different enough” to avoid detection
 Detection is a difficult research problem

Part 4  Software 762


Metamorphic Worm
 One approach to metamorphic replication…
o The worm is disassembled
o Worm then stripped to a base form
o Random variations inserted into code (permute
the code, insert dead code, etc., etc.)
o Assemble the resulting code

 Result is a worm with same functionality as


original, but different signature

Part 4  Software 763


Warhol Worm
 “In the future everybody will be world-
famous for 15 minutes”  Andy Warhol
 Warhol Worm is designed to infect the
entire Internet in 15 minutes
 Slammer infected 250,000 in 10 minutes
o “Burned out” bandwidth
o Could not have infected entire Internet in 15
minutes  too bandwidth intensive
 Can rapid worm do “better” than Slammer?

Part 4  Software 764


A Possible Warhol Worm
 Seed worm with an initial hit list containing
a set of vulnerable IP addresses
o Depends on the particular exploit
o Tools exist for identifying vulnerable systems
 Each successful initial infection would
attack selected part of IP address space
 Could infect entire Internet in 15 minutes!
 No worm this sophisticated has yet been
seen in the wild (as of 2011)
o Slammer generated random IP addresses
Part 4  Software 765
Flash Worm
 Can we do “better” than Warhol worm?
 Infect entire Internet in less than 15 minutes?
 Searching for vulnerable IP addresses is the
slow part of any worm attack
 Searching might be bandwidth limited
o Like Slammer
 Flash worm designed to infect entire Internet
almost instantly

Part 4  Software 766


Flash Worm
 Predetermine all vulnerable IP addresses
o Depends on details of the attack
 Embed these addresses in worm(s)
o Results in huge worm(s)
o But, the worm replicates, it splits
 No wasted time or bandwidth!
Original worm(s)

1st generation

2nd
generation
Part 4  Software 767
Flash Worm
 Estimated that ideal flash worm could
infect the entire Internet in 15 seconds!
o Some debate as to actual time it would take
o Estimates range from 2 seconds to 2 minutes
 In any case…
 …much faster than humans could respond
 So, any defense must be fully automated
 How to defend against such attacks?
Part 4  Software 768
Rapid Malware Defenses
 Master IDS watches over network
o “Infection” proceeds on part of network
o Determines whether an attack or not
o If so, IDS saves most of the network
o If not, only a slight delay
 Beneficial worm
o Disinfect faster than the worm infects
 Other approaches?
Part 4  Software 769
Push vs Pull Malware
 Viruses/worms examples of “push”
 Recently, a lot of “pull” malware
 Scenario
o A compromised web server
o Visit a website at compromised server
o Malware loaded on you machine
 Good paper: Ghost in the Browser
Part 4  Software 770
Botnet
 Botnet: a “network” of infected machines
 Infected machines are “bots”
o Victim is unaware of infection (stealthy)
 Botmaster controls botnet
o Generally, using IRC
o P2P botnet architectures exist
 Botnets used for…
o Spam, DoS attacks, keylogging, ID theft, etc.

Part 4  Software 771


Botnet Examples
 XtremBot
o Similar bots: Agobot, Forbot, Phatbot
o Highly modular, easily modified
o Source code readily available (GPL license)
 UrXbot
o Similar bots: SDBot, UrBot, Rbot
o Less sophisticated than XtremBot type
 GT-Bots and mIRC-based bots
o mIRC is common IRC client for Windows

Part 4  Software 772


More Botnet Examples
 Mariposa
o Used to steal credit card info
o Creator arrested in July 2010
 Conficker
o Estimated 10M infected hosts (2009)
 Kraken
o Largest as of 2008 (400,000 infections)
 Srizbi
o For spam, one of largest as of 2008
Part 4  Software 773
Computer Infections
 Analogies are made between computer
viruses/worms and biological diseases
 There are differences
o Computer infections are much quicker
o Ability to intervene in computer outbreak is more
limited (vaccination?)
o Bio disease models often not applicable
o “Distance” almost meaningless on Internet
 But there are some similarities…
Part 4  Software 774
Computer Infections
 Cyber “diseases” vs biological diseases
 One similarity
o In nature, too few susceptible individuals and
disease will die out
o In the Internet, too few susceptible systems and
worm might fail to take hold
 One difference
o In nature, diseases attack more-or-less at random
o Cyber attackers select most “desirable” targets
o Cyber attacks are more focused and damaging
 Mobile devices an interesting hybrid case
Part 4  Software 775
Future Malware Detection?
 Malware today far outnumbers “goodware”
o Metamorphic copies of existing malware
o Many virus toolkits available
o Trudy can recycle old viruses, new signatures
 So, may be better to “detect” good code
o If code not on approved list, assume it’s bad
o That is, use whitelist instead of blacklist

Part 4  Software 776


Miscellaneous
Software-Based
Attacks

Part 4  Software 777


Miscellaneous Attacks
 Numerous attacks involve software
 We’ll discuss a few issues that do not
fit into previous categories
o Salami attack
o Linearization attack
o Time bomb
o Can you ever trust software?

Part 4  Software 778


Salami Attack
 What is Salami attack?
o Programmer “slices off” small amounts of money
o Slices are hard for victim to detect
 Example
o Bank calculates interest on accounts
o Programmer “slices off” any fraction of a cent
and puts it in his own account
o No customer notices missing partial cent
o Bank may not notice any problem
o Over time, programmer makes lots of money!
Part 4  Software 779
Salami Attack
 Such attacks are possible for insiders
 Do salami attacks actually occur?
o Or is it just Office Space folklore?
 Programmer added a few cents to every
employee payroll tax withholding
o But money credited to programmer’s tax
o Programmer got a big tax refund!
 Rent-a-car franchise in Florida inflated gas
tank capacity to overcharge customers
Part 4  Software 780
Salami Attacks
 Employee reprogrammed Taco Bell cash
register: $2.99 item registered as $0.01
o Employee pocketed $2.98 on each such item
o A large “slice” of salami!
 In LA, four men installed computer chip
that overstated amount of gas pumped
o Customers complained when they had to pay for
more gas than tank could hold
o Hard to detect since chip programmed to give
correct amount when 5 or 10 gallons purchased
o Inspector usually asked for 5 or 10 gallons

Part 4  Software 781


Linearization Attack
 Program checks for
serial number
S123N456
 For efficiency,
check made one
character at a time
 Can attacker take
advantage of this?

Part 4  Software 782


Linearization Attack
 Correct number takes longer than incorrect
 Trudy tries all 1st characters
o Find that S takes longest
 Then she guesses all 2nd characters: S
o Finds S1 takes longest
 And so on…
 Trudy can recover one character at a time!
o Same principle as used in lock picking

Part 4  Software 783


Linearization Attack
 What is the advantage to attacking serial
number one character at a time?
 Suppose serial number is 8 characters and
each has 128 possible values
o Then 1288 = 256 possible serial numbers
o Attacker would guess the serial number in
about 255 tries  a lot of work!
o Using the linearization attack, the work is
about 8  (128/2) = 29 which is easy

Part 4  Software 784


Linearization Attack
 A real-world linearization attack
 TENEX (an ancient timeshare system)
o Passwords checked one character at a time
o Careful timing was not necessary, instead…
o …could arrange for a “page fault” when next
unknown character guessed correctly
o Page fault register was user accessible
 Attack was very easy in practice

Part 4  Software 785


Time Bomb
 In 1986 Donald Gene Burleson told employer
to stop withholding taxes from his paycheck
 His company refused
 He planned to sue his company
o He used company time to prepare legal docs
o Company found out and fired him
 Burleson had been working on malware…
o After being fired, his software “time bomb”
deleted important company data

Part 4  Software 786


Time Bomb
 Company was reluctant to pursue the case
 So Burleson sued company for back pay!
o Then company finally sued Burleson
 In 1988 Burleson fined $11,800
o Case took years to prosecute…
o Cost company thousands of dollars…
o Resulted in a slap on the wrist for attacker
 One of the first computer crime cases
 Many cases since follow a similar pattern
o Companies reluctant to prosecute
Part 4  Software 787
Trusting Software
 Can you ever trust software?
o See Reflections on Trusting Trust
 Consider the following thought experiment
 Suppose C compiler has a virus
o When compiling login program, virus creates
backdoor (account with known password)
o When recompiling the C compiler, virus
incorporates itself into new C compiler
 Difficult to get rid of this virus!
Part 4  Software 788
Trusting Software
 Suppose you notice something is wrong
 So you start over from scratch
 First, you recompile the C compiler
 Then you recompile the OS
o Including login program…
o You have not gotten rid of the problem!
 In the real world
o Attackers try to hide viruses in virus scanner
o Imagine damage that would be done by attack
on virus signature updates
Part 4  Software 789
Chapter 12:
Insecurity in Software
Every time I write about the impossibility of effectively protecting digital files
on a general-purpose computer, I get responses from people decrying the
death of copyright. “How will authors and artists get paid for their work?”
they ask me. Truth be told, I don’t know. I feel rather like the physicist
who just explained relativity to a group of would-be interstellar travelers,
only to be asked: “How do you expect us to get to the stars, then?”
I’m sorry, but I don't know that, either.
 Bruce Schneier

So much time and so little to do! Strike that. Reverse it. Thank you.
 Willy Wonka
Part 4  Software 790
Software Reverse
Engineering (SRE)

Part 4  Software 791


SRE
 Software Reverse Engineering
o Also known as Reverse Code Engineering (RCE)
o Or simply “reversing”
 Can be used for good...
o Understand malware
o Understand legacy code
 …or not-so-good
o Remove usage restrictions from software
o Find and exploit flaws in software
o Cheat at games, etc.
Part 4  Software 792
SRE
 We assume…
o Reverse engineer is an attacker
o Attacker only has exe (no source code)
o No bytecode (i.e., not Java, .Net, etc.)
 Attacker might want to
o Understand the software
o Modify (“patch”) the software
 SRE usually focused on Windows
o So we focus on Windows

Part 4  Software 793


SRE Tools
 Disassembler
o Converts exe to assembly (as best it can)
o Cannot always disassemble 100% correctly
o In general, not possible to re-assemble
disassembly into working executable
 Debugger
o Must step thru code to completely understand it
o Labor intensive  lack of useful tools
 Hex Editor
o To patch (modify) exe file
 Process Monitor, VMware, etc.
Part 4  Software 794
SRE Tools
 IDA Pro  good disassembler/debugger
o Costs a few hundred dollars (free version exists)
o Converts binary to assembly (as best it can)
 OllyDbg  high-quality shareware debugger
o Includes a good disassembler
 Hex editor  to view/modify bits of exe
o UltraEdit is good  freeware
o HIEW  useful for patching exe
 Process Monitor  freeware

Part 4  Software 795


Why is Debugger Needed?
 Disassembly gives static results
o Good overview of program logic
o User must “mentally execute” program
o Difficult to jump to specific place in the code
 Debugging is dynamic
o Can set break points
o Can treat complex code as “black box”
o And code not always disassembled correctly
 Disassembly and debugging both required
for any serious SRE task
Part 4  Software 796
SRE Necessary Skills
 Working knowledge of target assembly code
 Experience with the tools
o IDA Pro  sophisticated and complex
o OllyDbg  good choice for this class
 Knowledge of Windows Portable Executable
(PE) file format
 Boundless patience and optimism
 SRE is a tedious, labor-intensive process!

Part 4  Software 797


SRE Example
 We consider a simple example
 This example only requires disassembly
(IDA Pro used here) and hex editor
o Trudy disassembles to understand code
o Trudy also wants to patch (modify) the code
 For most real-world code, would also need a
debugger (e.g., OllyDbg)

Part 4  Software 798


SRE Example
 Program requires serial number
 But Trudy doesn’t know the serial number…

 Can Trudy get serial number from exe?

Part 4  Software 799


SRE Example
 IDA Pro disassembly

 Looks like serial number is S123N456

Part 4  Software 800


SRE Example
 Try the serial number S123N456

 It works!
 Can Trudy do “better”?
Part 4  Software 801
SRE Example
 Again, IDA Pro disassembly

 And hex view…

Part 4  Software 802


SRE Example

 “test eax,eax” is AND of eax with itself


o So, zero flag set only if eax is 0
o If test yields 0, then jz is true
 Trudy wants jz to always be true
 Can Trudy patch exe so jz always holds?
Part 4  Software 803
SRE Example
 Can Trudy patch exe so that jz always true?

xor  jz always true!!!

Assembly Hex
test eax,eax 85 C0 …
xor eax,eax 33 C0 …

Part 4  Software 804


SRE Example
 Can edit serial.exe with hex editor

serial.exe

serialPatch.exe

 Save as serialPatch.exe
Part 4  Software 805
SRE Example

 Any “serial number” now works!


 Very convenient for Trudy

Part 4  Software 806


SRE Example
 Back to IDA Pro disassembly…

serial.exe

serialPatch.exe

Part 4  Software 807


SRE Attack Mitigation
 Impossible to prevent SRE on open system
 Can we make such attacks more difficult?
 Anti-disassembly techniques
o To confuse static view of code
 Anti-debugging techniques
o To confuse dynamic view of code
 Tamper-resistance
o Code checks itself to detect tampering
 Code obfuscation
o Make code more difficult to understand
Part 4  Software 808
Anti-disassembly
 Anti-disassembly methods include
o Encrypted or “packed” object code
o False disassembly
o Self-modifying code
o Many other techniques
 Encryption prevents disassembly
o But need plaintext decryptor to decrypt code!
o Same problem as with polymorphic viruses

Part 4  Software 809


Anti-disassembly Example
 Suppose actual code instructions are

inst 1 jmp junk inst 3 inst 4 …

 What a “dumb” disassembler sees


inst 1 inst 2 inst 3 inst 4 inst 5 inst 6 …

 This is example of “false disassembly”


 Persistent attacker will figure it out

Part 4  Software 810


Anti-debugging
 IsDebuggerPresent()
 Can also monitor for
o Use of debug registers
o Inserted breakpoints
 Debuggers don’t handle threads well
o Interacting threads may confuse debugger…
o …and therefore, confuse attacker
 Many other debugger-unfriendly tricks
o See next slide for one example

Part 4  Software 811


Anti-debugger Example
inst 1 inst 2 inst 3 inst 4 inst 5 inst 6 …

 Suppose when program gets inst 1, it pre-


fetches inst 2, inst 3, and inst 4
o This is done to increase efficiency
 Suppose when debugger executes inst 1, it
does not pre-fetch instructions
 Can we use this difference to confuse the
debugger?
Part 4  Software 812
Anti-debugger Example
inst 1 inst 2 inst 3 inst
junk4 inst 5 inst 6 …

 Suppose inst 1 overwrites inst 4 in memory


 Then program (without debugger) will be OK
since it fetched inst 4 at same time as inst 1
 Debugger will be confused when it reaches
junk where inst 4 is supposed to be
 Problem if this segment of code executed
more than once!
o Also, self-modifying code is platform-dependent
 Again, clever attacker can figure this out
Part 4  Software 813
Tamper-resistance
 Goal is to make patching more difficult
 Code can hash parts of itself
 If tampering occurs, hash check fails
 Research has shown, can get good coverage
of code with small performance penalty
 But don’t want all checks to look similar
o Or else easy for attacker to remove checks
 This approach sometimes called “guards”
Part 4  Software 814
Code Obfuscation
 Goal is to make code hard to understand
o Opposite of good software engineering
o Spaghetti code is a good example
 Much research into more robust obfuscation
o Example: opaque predicate
int x,y
:
if((xy)(xy) > (xx2xy+yy)){…}
o The if() conditional is always false
 Attacker wastes time analyzing dead code
Part 4  Software 815
Code Obfuscation
 Code obfuscation sometimes promoted as a
powerful security technique
 Diffie and Hellman’s original idea for public
key crypto was based on code obfuscation
o But public key crypto didn’t work out that way
 It has been shown that obfuscation probably
cannot provide strong, crypto-like security
o On the (im)possibility of obfuscating programs
 Obfuscation might still have practical uses
o Even if it can never be as strong as crypto
Part 4  Software 816
Authentication Example
 Software used to determine authentication
 Ultimately, authentication is 1-bit decision
o Regardless of method used (pwd, biometric, …)
o Somewhere in authentication software, a single
bit determines success/failure
 If Trudy can find this bit, she can force
authentication to always succeed
 Obfuscation makes it more difficult for
attacker to find this all-important bit

Part 4  Software 817


Obfuscation
 Obfuscation forces attacker to analyze
larger amounts of code
 Method could be combined with
o Anti-disassembly techniques
o Anti-debugging techniques
o Code tamper-checking
 All of these increase work/pain for attacker
 But a persistent attacker can ultimately win

Part 4  Software 818


Software Cloning
 Suppose we write a piece of software
 We then distribute an identical copy (or clone)
to each customers
 If an attack is found on one copy, the same
attack works on all copies
 This approach has no resistance to “break
once, break everywhere” (BOBE)
 This is the usual situation in software
development

Part 4  Software 819


Metamorphic Software
 Metamorphism sometimes used in malware
 Can metamorphism also be used for good?
 Suppose we write a piece of software
 Each copy we distribute is different
o This is an example of metamorphic software
 Two levels of metamorphism are possible
o All instances are functionally distinct (only possible
in certain application)
o All instances are functionally identical but differ
internally (always possible)
o We consider the latter case
Part 4  Software 820
Metamorphic Software
 If we distribute N copies of cloned software
o One successful attack breaks all N

 If we distribute N metamorphic copies, where


each of N instances is functionally identical,
but they differ internally…
o An attack on one instance does not necessarily
work against other instances
o In the best case, N times as much work is required
to break all N instances

Part 4  Software 821


Metamorphic Software
 We cannot prevent SRE attacks
 The best we can hope for is BOBE resistance
 Metamorphism can improve BOBE resistance
 Consider the analogy to genetic diversity
o If all plants in a field are genetically identical,
one disease can rapidly kill all of the plants
o If the plants in a field are genetically diverse,
one disease can only kill some of the plants

Part 4  Software 822


Cloning vs Metamorphism
 Spse our software has a buffer overflow
 Cloned software
o Same buffer overflow attack will work against
all cloned copies of the software
 Metamorphic software
o Unique instances  all are functionally the
same, but they differ in internal structure
o Buffer overflow likely exists in all instances
o But a specific buffer overflow attack will only
work against some instances
o Buffer overflow attacks are delicate!
Part 4  Software 823
Metamorphic Software
 Metamorphic software is intriguing concept
 But raises concerns regarding…
o Software development, upgrades, etc.
 Metamorphism does not prevent SRE, but
could make it infeasible on a large scale
 Metamorphism might be a practical tool for
increasing BOBE resistance
 Metamorphism currently used in malware
 So, metamorphism is not just for evil!

Part 4  Software 824


Digital Rights Management

Part 4  Software 825


Digital Rights Management
 DRM is a good example of limitations
of doing security in software
 We’ll discuss
o What is DRM?
o A PDF document protection system
o DRM for streaming media
o DRM in P2P application
o DRM within an enterprise

Part 4  Software 826


What is DRM?
 “Remote control” problem
o Distribute digital content
o Retain some control on its use, after delivery
 Digital book example
o Digital book sold online could have huge market
o But might only sell 1 copy!
o Trivial to make perfect digital copies
o A fundamental change from pre-digital era
 Similar comments for digital music, video, etc.

Part 4  Software 827


Persistent Protection
 “Persistent protection” is the fundamental
problem in DRM
o How to enforce restrictions on use of content
after delivery?
 Examples of such restrictions
o No copying
o Limited number of reads/plays
o Time limits
o No forwarding, etc.

Part 4  Software 828


What Can be Done?
 The honor system?
o Example: Stephen King’s, The Plant
 Give up?
o Internet sales? Regulatory compliance? etc.
 Lame software-based DRM?
o The standard DRM system today
 Better software-based DRM?
o MediaSnap’s goal
 Tamper-resistant hardware?
o Closed systems: Game Cube, etc.
o Open systems: TCG/NGSCB for PCs

Part 4  Software 829


Is Crypto the Answer?

 Attacker’s goal is to recover the key


 In standard crypto scenario, attacker has
o Ciphertext, some plaintext, side-channel info, etc.
 In DRM scenario, attacker has
o Everything in the box (at least)
 Crypto was not designed for this problem!

Part 4  Software 830


Is Crypto the Answer?
 But crypto is necessary
o To securely deliver the bits
o To prevent trivial attacks
 Then attacker will not try to directly
attack crypto
 Attacker will try to find keys in software
o DRM is “hide and seek” with keys in software!

Part 4  Software 831


Current State of DRM
 At best, security by obscurity
o A derogatory term in security
 Secret designs
o In violation of Kerckhoffs Principle
 Over-reliance on crypto
o “Whoever thinks his problem can be solved
using cryptography, doesn’t understand his
problem and doesn’t understand cryptography.”
 Attributed by Roger Needham and Butler Lampson to each other

Part 4  Software 832


DRM Limitations
 The analog hole
o When content is rendered, it can be captured in
analog form
o DRM cannot prevent such an attack
 Human nature matters
o Absolute DRM security is impossible
o Want something that “works” in practice
o What works depends on context
 DRM is not strictly a technical problem!

Part 4  Software 833


Software-based DRM
 Strong software-based DRM is impossible
 Why?
o We can’t really hide a secret in software
o We cannot prevent SRE
o User with full admin privilege can eventually
break any anti-SRE protection
 Bottom line: The killer attack on software-
based DRM is SRE

Part 4  Software 834


DRM for PDF Documents
 Based on design of MediaSnap, Inc., a
small Silicon Valley startup company
 Developed a DRM system
o Designed to protect PDF documents
 Two parts to the system
o Server  Secure Document Server (SDS)
o Client  PDF Reader “plugin” software

Part 4  Software 835


Protecting a Document
persistent
encrypt protection

Alice SDS Bob

 Alice creates PDF document


 Document encrypted and sent to SDS
 SDS applies desired “persistent protection”
 Document sent to Bob

Part 4  Software 836


Accessing a Document
Request key

key
Alice SDS Bob
 Bob authenticates to SDS
 Bob requests key from SDS
 Bob can then access document, but only thru
special DRM software

Part 4  Software 837


Security Issues
 Server side (SDS)
o Protect keys, authentication data, etc.
o Apply persistent protection
 Client side (PDF plugin)
o Protect keys, authenticate user, etc.
o Enforce persistent protection
 Remaining discussion concerns client

Part 4  Software 838


Security Overview

Tamper-resistance

Obfuscation

 A tamper-resistant outer layer


 Software obfuscation applied within

Part 4  Software 839


Tamper-Resistance

Anti-debugger Encrypted code

 Encrypted code will prevent static analysis


of PDF plugin software
 Anti-debugging to prevent dynamic analysis
of PDF plugin software
 These two designed to protect each other
 But the persistent attacker will get thru!

Part 4  Software 840


Obfuscation
 Obfuscation can be used for
o Key management
o Authentication
o Caching (keys and authentication info)
o Encryption and “scrambling”
o Key parts (data and/or code)
o Multiple keys/key parts
 Obfuscation can only slow the attacker
 The persistent attacker still wins!

Part 4  Software 841


Other Security Features
 Code tamper checking (hashing)
o To validate all code executing on system
 Anti-screen capture
o To prevent obvious attack on digital documents
 Watermarking
o In theory, can trace stolen content
o In practice, of limited value
 Metamorphism (or individualization)
o For BOBE-resistance

Part 4  Software 842


Security Not Implemented
 More general code obfuscation
 Code “fragilization”
o Code that hash checks itself
o Tampering should cause code to break
 OS cannot be trusted
o How to protect against “bad” OS?
o Not an easy problem!

Part 4  Software 843


DRM for Streaming Media
 Stream digital content over Internet
o Usually audio or video
o Viewed in real time
 Want to charge money for the content
 Can we protect content from capture?
o So content can’t be redistributed
o We want to make money!

Part 4  Software 844


Attacks on Streaming Media
 Spoof the stream between endpoints
 Man in the middle
 Replay and/or redistribute data
 Capture the plaintext
o This is the threat we are concerned with
o Must prevent malicious software from
capturing plaintext stream at client end

Part 4  Software 845


Design Features
 Scrambling algorithms
o Encryption-like algorithms
o Many distinct algorithms available
o A strong form of metamorphism!
 Negotiation of scrambling algorithm
o Server and client must both know the algorithm
 Decryption at receiver end
o To remove the strong encryption
 De-scrambling in device driver
o De-scramble just prior to rendering

Part 4  Software 846


Scrambling Algorithms
 Server has a large set of scrambling
algorithms
o Suppose N of these numbered 1 thru N
 Each client has a subset of algorithms
o For example: LIST = {12,45,2,37,23,31}
 The LIST is stored on client, encrypted
with server’s key: E(LIST,Kserver)

Part 4  Software 847


Server-side Scrambling
 On server side

scrambled encrypted
data data scrambled data

 Server must scramble data with an


algorithm the client supports
 Client must send server list of algorithms it
supports
 Server must securely communicate algorithm
choice to client

Part 4  Software 848


Select Scrambling Algorithm
E(LIST, Kserver)

E(m,K)

scramble (encrypted) data


Alice using Alice’s m-th algorithm Bob
(client) (server)

 The key K is a session key


 The LIST is unreadable by client
o Reminiscent of Kerberos TGT

Part 4  Software 849


Client-side De-scrambling
 On client side
encrypted scrambled
scrambled data data data

 Try to keep plaintext away from


potential attacker
 “Proprietary” device driver
o Scrambling algorithms “baked in”
o Able to de-scramble at last moment

Part 4  Software 850


Why Scrambling?
 Metamorphism deeply embedded in system
 If a scrambling algorithm is known to be
broken, server will not choose it
 If client has too many broken algorithms,
server can force software upgrade
 Proprietary algorithm harder for SRE
 We cannot trust crypto strength of
proprietary algorithms, so we also encrypt

Part 4  Software 851


Why Metamorphism?
 The most serious threat is SRE
 Attacker does not need to reverse
engineer any standard crypto algorithm
o Attacker only needs to find the key
 Reverse engineering a scrambling
algorithm may be difficult
 This is just security by obscurity
 But appears to help with BOBE-resistance

Part 4  Software 852


DRM for a P2P Application
 Today, much digital content is delivered via
peer-to-peer (P2P) networks
o P2P networks contain lots of pirated music
 Is it possible to get people to pay for digital
content on such P2P networks?
 How can this possibly work?
 A peer offering service (POS) is one idea

Part 4  Software 853


P2P File Sharing: Query
 Suppose Alice requests “Hey Jude”
 Black arrows: query flooding
 Red arrows: positive responses

Frank Alice Dean Bob Marilyn


Carol
Pat

Ted Carol Pat Fred

 Alice can select from: Carol, Pat


Part 4  Software 854
P2P File Sharing with POS
 Suppose Alice requests “Hey Jude”
 Black arrow: query
 Red arrow: positive response
Bill
Ben
Joe

POS Alice Dean Bob Marilyn


Carol
Pat

Ted Carol Pat Fred

 Alice selects from: Bill, Ben, Carol, Joe, Pat


 Bill, Ben, and Joe have legal content!
Part 4  Software 855
POS
 Bill, Ben and Joe must appear normal to Alice
 If “victim” (Alice) clicks POS response
o DRM protected (legal) content downloaded
o Then small payment required to play
 Alice can choose not to pay
o But then she must download again
o Is it worth the hassle to avoid paying small fee?
o POS content can also offer extras

Part 4  Software 856


POS Conclusions
 A very clever idea!
 Piggybacking on existing P2P networks
 Weak DRM works very well here
o Pirated content already exists
o DRM only needs to be more hassle to break
than the hassle of clicking and waiting
 Current state of POS?
o Very little interest from the music industry
o Considerable interest from the “adult” industry

Part 4  Software 857


DRM in the Enterprise
 Why enterpise DRM?
 Health Insurance Portability and
Accountability Act (HIPAA)
o Medical records must be protected
o Fines of up to $10,000 “per incident”
 Sarbanes-Oxley Act (SOA)
o Must preserve documents of interest to SEC
 DRM-like protections needed by
corporations for regulatory compliance

Part 4  Software 858


What’s Different in
Enterprise DRM?
 Technically, similar to e-commerce
 But motivation for DRM is different
o Regulatory compliance
o To satisfy a legal requirement
o Not to make money  to avoid losing money!
 Human dimension is completely different
o Legal threats are far more plausible
 Legally, corporation is OK provided an
active attack on DRM is required

Part 4  Software 859


Enterprise DRM
 Moderate DRM security is sufficient
 Policy management issues
o Easy to set policies for groups, roles, etc.
o Yet policies must be flexible
 Authentication issues
o Must interface with existing system
o Must prevent network authentication spoofing
(authenticate the authentication server)
 Enterprise DRM is a solvable problem!

Part 4  Software 860


DRM Failures
 Many examples of DRM failures
o One system defeated by a felt-tip pen
o One defeated my holding down shift key
o Secure Digital Music Initiative (SDMI)
completely broken before it was finished
o Adobe eBooks
o Microsoft MS-DRM (version 2)
o Many, many others!

Part 4  Software 861


DRM Conclusions
 DRM nicely illustrates limitations of doing
security in software
 Software in a hostile environment is
extremely vulnerable to attack
 Protection options are very limited
 Attacker has enormous advantage
 Tamper-resistant hardware and a trusted
OS can make a difference
o We’ll discuss this more later: TCG/NGSCB

Part 4  Software 862


Secure Software
Development

Part 4  Software 863


Penetrate and Patch
 Usual approach to software development
o Develop product as quickly as possible
o Release it without adequate testing
o Patch the code as flaws are discovered
 In security, this is “penetrate and patch”
o A bad approach to software development
o An even worse approach to secure software!

Part 4  Software 864


Why Penetrate and Patch?
 First to market advantage
o First to market likely to become market leader
o Market leader has huge advantage in software
o Users find it safer to “follow the leader”
o Boss won’t complain if your system has a flaw,
as long as everybody else has same flaw…
o User can ask more people for support, etc.
 Sometimes called “network economics”

Part 4  Software 865


Why Penetrate and Patch?
 Secure software development is hard
o Costly and time consuming development
o Costly and time consuming testing
o Cheaper to let customers do the work!
 No serious economic disincentive
o Even if software flaw causes major losses, the
software vendor is not liable
o Is any other product sold this way?
o Would it matter if vendors were legally liable?
Part 4  Software 866
Penetrate and Patch Fallacy
 Fallacy: If you keep patching software,
eventually it will be secure
 Why is this a fallacy?
 Empirical evidence to the contrary
 Patches often add new flaws
 Software is a moving target: new versions,
features, changing environment, new uses,…

Part 4  Software 867


Open vs Closed Source
 Open source software
o The source code is available to user
o For example, Linux
 Closed source
o The source code is not available to user
o For example, Windows
 What are the security implications?
Part 4  Software 868
Open Source Security
 Claimed advantages of open source is
o More eyeballs: more people looking at the code
should imply fewer flaws
o A variant on Kerchoffs Principle
 Is this valid?
o How many “eyeballs” looking for security flaws?
o How many “eyeballs” focused on boring parts?
o How many “eyeballs” belong to security experts?
o Attackers can also look for flaws!
o Evil coder might be able to insert a flaw
Part 4  Software 869
Open Source Security
 Open source example: wu-ftp
o About 8,000 lines of code
o A security-critical application
o Was deployed and widely used
o After 10 years, serious security flaws discovered!
 More generally, open source software has
done little to reduce security flaws
 Why?
o Open source follows penetrate and patch model!
Part 4  Software 870
Closed Source Security
 Claimed advantage of closed source
o Security flaws not as visible to attacker
o This is a form of “security by obscurity”
 Is this valid?
o Many exploits do not require source code
o Possible to analyze closed source code…
o …though it is a lot of work!
o Is “security by obscurity” real security?

Part 4  Software 871


Open vs Closed Source
 Advocates of open source often cite the
Microsoft fallacy which states
1. Microsoft makes bad software
2. Microsoft software is closed source
3. Therefore all closed source software is bad
 Why is this a fallacy?
o Not logically correct
o More relevant is the fact that Microsoft
follows the penetrate and patch model

Part 4  Software 872


Open vs Closed Source
 No obvious security advantage to
either open or closed source
 More significant than open vs closed
source is software development
practices
 Both open and closed source follow the
“penetrate and patch” model

Part 4  Software 873


Open vs Closed Source
 If there is no security difference, why is
Microsoft software attacked so often?
o Microsoft is a big target!
o Attacker wants most “bang for the buck”
 Few exploits against Mac OS X
o Not because OS X is inherently more secure
o An OS X attack would do less damage
o Would bring less “glory” to attacker
 Next, we consider the theoretical
differences
o See this paper
Part 4  Software 874
Security and Testing
 Can be shown that probability of a security
failure after t units of testing is about
E = K/t where K is a constant
 This approximation holds over large range of t
 Then the “mean time between failures” is
MTBF = t/K
 The good news: security improves with testing
 The bad news: security only improves linearly
with testing!
Part 4  Software 875
Security and Testing
 The “mean time between failures” is
approximately
MTBF = t/K
 To have 1,000,000 hours between security
failures, must test 1,000,000 hours!
 Suppose open source project has MTBF = t/K
 If flaws in closed source are twice as hard
to find, do we then have MTBF = 2t/K ?
o No! Testing not as effective MTBF = 2(t/2)/K = t/K
 The same result for open and closed source!
Part 4  Software 876
Security and Testing
 Closed source advocates might argue
o Closed source has “open source” alpha testing,
where flaws found at (higher) open source rate
o Followed by closed source beta testing and use,
giving attackers the (lower) closed source rate
o Does this give closed source an advantage?
 Alpha testing is minor part of total testing
o Recall, first to market advantage
o Products rushed to market
 Probably no real advantage for closed source
Part 4  Software 877
Security and Testing
 No security difference between open and
closed source?
 Provided that flaws are found “linearly”
 Is this valid?
o Empirical results show security improves linearly
with testing
o Conventional wisdom is that this is the case for
large and complex software systems

Part 4  Software 878


Security and Testing
 The fundamental problem
o Good guys must find (almost) all flaws
o Bad guy only needs 1 (exploitable) flaw
 Software reliability far more
difficult in security than elsewhere
 How much more difficult?
o See the next slide…

Part 4  Software 879


Security Testing: Do the Math
 Recall that MTBF = t/K
 Suppose 106 security flaws in some software
o Say, Windows XP

 Suppose each bug has MTBF of 109 hours


 Expect to find 1 bug for every 103 hours testing
 Good guys spend 107 hours testing: find 104 bugs
o Good guys have found 1% of all the bugs

 Trudy spends 103 hours of testing: finds 1 bug


 Chance good guys found Trudy’s bug is only 1% !!!
Part 4  Software 880
Software Development
 General software development model
o Specify
o Design
o Implement
o Test
o Review
o Document
o Manage
o Maintain
Part 4  Software 881
Secure Software Development
 Goal: move away from “penetrate and patch”
 Penetrate and patch will always exist
o But if more care taken in development, then
fewer and less severe flaws to patch
 Secure software development not easy
 Much more time and effort required thru
entire development process
 Today, little economic incentive for this!

Part 4  Software 882


Secure Software Development
 We briefly discuss the following
o Design
o Hazard analysis
o Peer review
o Testing
o Configuration management
o Postmortem for mistakes

Part 4  Software 883


Design
 Careful initial design
 Try to avoid high-level errors
o Such errors may be impossible to correct later
o Certainly costly to correct these errors later
 Verify assumptions, protocols, etc.
 Usually informal approach is used
 Formal methods
o Possible to rigorously prove design is correct
o In practice, only works in simple cases
Part 4  Software 884
Hazard Analysis
 Hazard analysis (or threat modeling)
o Develop hazard list
o List of what ifs
o Schneier’s “attack tree”
 Many formal approaches
o Hazard and operability studies (HAZOP)
o Failure modes and effective analysis (FMEA)
o Fault tree analysis (FTA)

Part 4  Software 885


Peer Review
 Three levels of peer review
o Review (informal)
o Walk-through (semi-formal)
o Inspection (formal)
 Each level of review is important
 Much evidence that peer review is effective
 Although programmers might not like it!

Part 4  Software 886


Levels of Testing
 Module testing  test each small
section of code
 Component testing  test
combinations of a few modules
 Unit testing  combine several
components for testing
 Integration testing  put everything
together and test

Part 4  Software 887


Types of Testing
 Function testing  verify that system
functions as it is supposed to
 Performance testing  other requirements
such as speed, resource use, etc.
 Acceptance testing  customer involved
 Installation testing  test at install time
 Regression testing  test after any change

Part 4  Software 888


Other Testing Issues
 Active fault detection
o Don’t wait for system to fail
o Actively try to make it fail  attackers will!
 Fault injection
o Insert faults into the process
o Even if no obvious way for such a fault to occur
 Bug injection
o Insert bugs into code
o See how many of injected bugs are found
o Can use this to estimate number of bugs
o Assumes injected bugs similar to unknown bugs
Part 4  Software 889
Testing Case History
 In one system with 184,000 lines of code
 Flaws found
o 17.3% inspecting system design
o 19.1% inspecting component design
o 15.1% code inspection
o 29.4% integration testing
o 16.6% system and regression testing
 Conclusion: must do many kinds of testing
o Overlapping testing is necessary
o Provides a form of “defense in depth”
Part 4  Software 890
Security Testing: The
Bottom Line
 Security testing is far more demanding
than non-security testing
 Non-security testing  does system do
what it is supposed to?
 Security testing  does system do what it
is supposed to and nothing more?
 Usually impossible to do exhaustive testing
 How much testing is enough?

Part 4  Software 891


Security Testing: The
Bottom Line
 How much testing is enough?
 Recall MTBF = t/K
 Seems to imply testing is nearly hopeless!
 But there is some hope…
o If we eliminate an entire class of flaws then
statistical model breaks down
o For example, if a single test (or a few tests)
find all buffer overflows

Part 4  Software 892


Configuration Issues
 Types of changes
o Minor changes  maintain daily
functioning
o Adaptive changes  modifications
o Perfective changes  improvements
o Preventive changes  no loss of
performance
 Any change can introduce new flaws!
Part 4  Software 893
Postmortem
 After fixing any security flaw…
 Carefully analyze the flaw
 To learn from a mistake
o Mistake must be analyzed and understood
o Must make effort to avoid repeating mistake
 In security, always learn more when things
go wrong than when they go right
 Postmortem may be the most under-used
tool in all of security engineering!
Part 4  Software 894
Software Security
 First to market advantage
o Also known as “network economics”
o Security suffers as a result
o Little economic incentive for secure software!
 Penetrate and patch
o Fix code as security flaws are found
o Fix can result in worse problems
o Mostly done after code delivered
 Proper development can reduce flaws
o But costly and time-consuming
Part 4  Software 895
Software and Security
 Even with best development practices,
security flaws will still exist
 Absolute security is (almost) never possible
 So, it is not surprising that absolute
software security is impossible
 The goal is to minimize and manage risks of
software flaws
 Do not expect dramatic improvements in
consumer software security anytime soon!

Part 4  Software 896


Chapter 13:
Operating Systems and
Security
UNIX is basically a simple operating system,
but you have to be a genius to understand the simplicity.
 Dennis Ritchie

And it is a mark of prudence never to trust wholly


in those things which have once deceived us.
 Rene Descartes

Part 4  Software 897


OS and Security
 OSs are large, complex programs
o Many bugs in any such program
o We have seen that bugs can be security threats
 Here we are concerned with security
provided by OS
o Not concerned with threat of bad OS software
 Concerned with OS as security enforcer
 In this section we only scratch the surface

Part 4  Software 898


OS Security Challenges
 Modern OS is multi-user and multi-tasking
 OS must deal with
o Memory
o I/O devices (disk, printer, etc.)
o Programs, threads
o Network issues
o Data, etc.
 OS must protect processes from other
processes and users from other users
o Whether accidental or malicious

Part 4  Software 899


OS Security Functions
 Memory protection
o Protect memory from users/processes
 File protection
o Protect user and system resources
 Authentication
o Determines and enforce authentication results
 Authorization
o Determine and enforces access control

Part 4  Software 900


Memory Protection
 Fundamental problem
o How to keep users/processes separate?
 Separation
o Physical separation  separate devices
o Temporal separation  one at a time
o Logical separation  sandboxing, etc.
o Cryptographic separation  make information
unintelligible to outsider
o Or any combination of the above

Part 4  Software 901


Memory Protection
 Fence  users cannot cross a
specified address
o Static fence  fixed size OS
o Dynamic fence  fence register

 Base/bounds register  lower and upper


address limit
 Assumes contiguous space

Part 4  Software 902


Memory Protection
 Tagging  specify protection of each address
+ Extremely fine-grained protection
- High overhead  can be reduced by tagging
sections instead of individual addresses
- Compatibility
 More common is segmentation and/or paging
o Protection is not as flexible
o But much more efficient

Part 4  Software 903


Segmentation
 Divide memory into logical units, such as
o Single procedure
o Data in one array, etc.
 Can enforce different access restrictions
on different segments
 Any segment can be placed in any memory
location (if location is large enough)
 OS keeps track of actual locations

Part 4  Software 904


Segmentation
memory

program

Part 4  Software 905


Segmentation
 OS can place segments anywhere
 OS keeps track of segment locations
as <segment,offset>
 Segments can be moved in memory
 Segments can move out of memory
 All address references go thru OS

Part 4  Software 906


Segmentation Advantages
 Every address reference can be checked
o Possible to achieve complete mediation
 Different protection can be applied to
different segments
 Users can share access to segments
 Specific users can be restricted to
specific segments

Part 4  Software 907


Segmentation Disadvantages
 How to reference <segment,offset> ?
o OS must know segment size to verify access is
within segment
o But some segments can grow during execution (for
example, dynamic memory allocation)
o OS must keep track of variable segment sizes
 Memory fragmentation is also a problem
o Compacting memory changes tables
 A lot of work for the OS
 More complex  more chance for mistakes

Part 4  Software 908


Paging
 Like segmentation, but fixed-size segments
 Access via <page,offset>
 Plusses and minuses
+ Avoids fragmentation, improved efficiency
+ OS need not keep track of variable segment sizes
- No logical unity to pages
- What protection to apply to a given page?

Part 4  Software 909


Paging memory

program Page 1

Page 0
Page 2
Page 1
Page 0
Page 2
Page 3
Page 4
Page 4

Page 3

Part 4  Software 910


Other OS Security Functions
 OS must enforce access control
 Authentication
o Passwords, biometrics
o Single sign-on, etc.
 Authorization
o ACL
o Capabilities
 These topics discussed previously
 OS is an attractive target for attack!

Part 4  Software 911


Trusted Operating System

Part 4  Software 912


Trusted Operating System
 An OS is trusted if we rely on it for
o Memory protection
o File protection
o Authentication
o Authorization
 Every OS does these things
 But if a trusted OS fails to provide these,
our security fails

Part 4  Software 913


Trust vs Security
 Trust implies reliance  Security is a
 Trust is binary
judgment of
effectiveness
 Ideally, only trust
 Judge based on
secure systems
specified policy
 All trust relationships
 Security depends on
should be explicit
trust relationships

 Note: Some authors use different terminology!

Part 4  Software 914


Trusted Systems
 Trust implies reliance
 A trusted system is relied on for security
 An untrusted system is not relied on for
security
 If all untrusted systems are compromised,
your security is unaffected
 Ironically, only a trusted system can
break your security!

Part 4  Software 915


Trusted OS
 OS mediates interactions between
subjects (users) and objects
(resources)
 Trusted OS must decide
o Which objects to protect and how
o Which subjects are allowed to do what

Part 4  Software 916


General Security Principles
 Least privilege  like “low watermark”
 Simplicity
 Open design (Kerchoffs Principle)
 Complete mediation
 White listing (preferable to black listing)
 Separation
 Ease of use
 But commercial OSs emphasize features
o Results in complexity and poor security

Part 4  Software 917


OS Security
 Any OS must provide some degree of
o Authentication
o Authorization (users, devices and data)
o Memory protection
o Sharing
o Fairness
o Inter-process communication/synchronization
o OS protection

Part 4  Software 918


OS Services
users
Synchronization
Concurrency
Deadlock
Communication
User interface Audit trail, etc.

Operating system

Data, programs,
CPU, memory,
I/O devices, etc.

Part 4  Software 919


Trusted OS
 A trusted OS also provides some or all of
o User authentication/authorization
o Mandatory access control (MAC)
o Discretionary access control (DAC)
o Object reuse protection
o Complete mediation  access control
o Trusted path
o Audit/logs

Part 4  Software 920


Trusted OS Services
users
Synchronization
Concurrency
Deadlock
Communication
User interface Audit trail, etc.
Authentication

Operating system
Data, programs,
CPU, memory,
I/O devices, etc.

Part 4  Software 921


MAC and DAC
 Mandatory Access Control (MAC)
o Access not controlled by owner of object
o Example: User does not decide who holds a
TOP SECRET clearance
 Discretionary Access Control (DAC)
o Owner of object determines access
o Example: UNIX/Windows file protection
 If DAC and MAC both apply, MAC wins

Part 4  Software 922


Object Reuse Protection
 OS must prevent leaking of info
 Example
o User creates a file
o Space allocated on disk
o But same space previously used
o “Leftover” bits could leak information
o Magnetic remanence is a related issue

Part 4  Software 923


Trusted Path
 Suppose you type in your password
o What happens to the password?
 Depends on the software!
 How can you be sure software is not evil?
 Trusted path problem:
“I don't know how to to be confident even of a digital
signature I make on my own PC, and I've worked in
security for over fifteen years. Checking all of the
software in the critical path between the display and the
signature software is way beyond my patience. ”
 Ross Anderson

Part 4  Software 924


Audit
 System should log security-related events
 Necessary for postmortem
 What to log?
o Everything? Who (or what) will look at it?
o Don’t want to overwhelm administrator
o Needle in haystack problem
 Should we log incorrect passwords?
o “Almost” passwords in log file?
 Logging is not a trivial matter

Part 4  Software 925


Security Kernel
 Kernel is the lowest-level part of the OS
 Kernel is responsible for
o Synchronization
o Inter-process communication
o Message passing
o Interrupt handling
 The security kernel is the part of the
kernel that deals with security
 Security kernel contained within the kernel

Part 4  Software 926


Security Kernel
 Why have a security kernel?
 All accesses go thru kernel
o Ideal place for access control
 Security-critical functions in one location
o Easier to analyze and test
o Easier to modify
 More difficult for attacker to get in
“below” security functions

Part 4  Software 927


Reference Monitor
 The part of the security kernel that deals
with access control
o Mediates access of subjects to objects
o Tamper-resistant
o Analyzable (small, simple, etc.)

Objects Subjects

Reference monitor
Part 4  Software 928
Trusted Computing Base
 TCB  everything in the OS that we rely
on to enforce security
 If everything outside TCB is subverted,
trusted OS would still be trusted
 TCB protects users from each other
o Context switching between users
o Shared processes
o Memory protection for users
o I/O operations, etc.

Part 4  Software 929


TCB Implementation
 Security may occur many places within OS
 Ideally, design security kernel first, and
build the OS around it
o Reality is usually the other way around
 Example of a trusted OS: SCOMP
o Developed by Honeywell
o Less than 10,000 LOC in SCOMP security kernel
o Win XP has 40,000,000 lines of code!

Part 4  Software 930


Poor TCB Design

Hardware
OS kernel
Operating system
User space

Security critical activities

Problem: No clear security layer


Part 4  Software 931
Better TCB Design

Hardware
Security kernel
Operating system
User space

Security kernel is the security layer


Part 4  Software 932
Trusted OS Summary
 Trust implies reliance
 TCB (trusted computing OS
base) is everything in OS
we rely on for security
 If everything outside
OS Kernel
TCB is subverted, we still
have trusted system
 If TCB subverted,
security is broken Security Kernel

Part 4  Software 933


NGSCB

Part 4  Software 934


Next Generation Secure
Computing Base
 NGSCB pronounced “n-scub” (the G is silent)
 Was supposed to be part of Vista OS
o Vista was once known as Longhorn…
 TCG (Trusted Computing Group)
o Led by Intel, TCG makes special hardware
 NGSCB is the part of Windows that will
interface with TCG hardware
 TCG/NGSCB formerly TCPA/Palladium
o Why the name changes?
Part 4  Software 935
NGSCB
 The original motivation for TCPA/Palladium
was digital rights management (DRM)
 Today, TCG/NGSCB is promoted as general
security-enhancing technology
o DRM just one of many potential applications
 Depending on who you ask, TCG/NGSCB is
o Trusted computing
o Treacherous computing

Part 4  Software 936


Motivation for TCG/NGSCB
 Closed systems: Game consoles, etc.
o Good at protecting secrets (tamper resistant)
o Good at forcing people to pay for software
o Limited flexibility
 Open systems: PCs
o Incredible flexibility
o Poor at protecting secrets
o Very poor at defending their own software
 TCG: closed system security on open platform
 “virtual set-top box inside your PC”  Rivest

Part 4  Software 937


TCG/NGSCB
 TCG provides tamper-resistant hardware
o Secure place to store cryptographic key
o Key secure from a user with admin privileges!
 TCG hardware is in addition to ordinary
hardware, not in place of it
 PC has two OSs  regular OS and special
trusted OS to deal with TCG hardware
 NGSCB is Microsoft’s trusted OS

Part 4  Software 938


NGSCB Design Goals
 Provide high assurance
o High confidence that system behaves correctly
o Correct behavior even if system is under attack
 Provide authenticated operation
o Authenticate “things” (software, devices, etc.)
 Protection against hardware tampering is
concern of TCG, not NGSCB

Part 4  Software 939


NGSCB Disclaimer
 Specific details are sketchy
 Based on available info, Microsoft may not
have resolved all of the details
o Maybe un-resolvable?
 What follows: author’s best guesses
 This should all become much clearer in the
not-too-distant future
o At least I thought so a couple of years ago…

Part 4  Software 940


NGSCB Architecture
Left-hand side (LHS) Right-hand side (RHS)

u Application
n NCA t
t NCA r
r Application u
u User space s
s Kernel
t
t Regular OS e
e d
Nexus
d
Drivers

 Nexus is the Trusted Computing Base in NGSCB


 The NCA (Nexus Computing Agents) talk to Nexus
and LHS
Part 4  Software 941
NGSCB
 NGSCB has 4 “feature groups”
1. Strong process isolation
o Processes do not interfere with each other
2. Sealed storage
o Data protected (tamper resistant hardware)
3. Secure path
o Data to and from I/O protected
4. Attestation
o “Things” securely authenticated
o Allows TCB to be extended via NCAs
 All are aimed at malicious code
 4. also provides (secure) extensibility

Part 4  Software 942


NGSCB Process Isolation
 Curtained memory
 Process isolation and the OS
o Protect trusted OS (Nexus) from untrusted OS
o Isolate trusted OS from untrusted stuff
 Process isolation and NCAs
o NCAs isolated from software they do not trust
 Trust determined by users, to an extent…
o User can disable a trusted NCA
o User cannot enable an untrusted NCA

Part 4  Software 943


NGSCB Sealed Storage
 Sealed storage contains secret data
o If code X wants access to secret, a hash of X
must be verified (integrity check of X)
o Implemented via symmetric key cryptography
 Confidentiality of secret is protected since
only accessed by trusted software
 Integrity of secret is assured since it’s in
sealed storage

Part 4  Software 944


NGSCB Secure Path
 Secure path for input
o From keyboard to Nexus
o From mouse to Nexus
o From any input device to Nexus
 Secure path for output
o From Nexus to the screen
 Uses crypto (digital signatures)

Part 4  Software 945


NGSCB Attestation (1)
 Secure authentication of things
o Authenticate devices, services, code, etc.
o Separate from user authentication
 Public key cryptography used
o Certified key pair required
o Private key not user-accessible
o Sign and send result to remote system
 TCB extended via attestation of NCAs
o This is a major feature!

Part 4  Software 946


NGSCB Attestation (2)
 Public key used for attestation
o However, public key reveals the user identity
o Using public keys, anonymity would be lost
 Trusted third party (TTP) can be used
o TTP verifies signature
o Then TTP vouches for signature
o Anonymity preserved (except to TTP)
 Support for zero knowledge proofs
o Verify knowledge of a secret without revealing it
o Anonymity “preserved unconditionally”

Part 4  Software 947


NGSCB Compelling Apps (1)
 Type your Word document in Windows
o I.e., the untrusted LHS
 Move document to trusted RHS
 Read document carefully
 Digitally sign the document
 Assured that “what you see is what you sign”
o Practically impossible to get this on your PC

Part 4  Software 948


NGSCB Compelling Apps (2)
 Digital Rights Management (DRM)
 Many DRM problems solved by NGSCB
 Protect secret  sealed storage
o Impossible without something like NGSCB
 Scraping data  secure path
o Cannot prevent without something like NGSCB
 Positively ID users
o Higher assurance with NGSCB

Part 4  Software 949


NGSCB According to MS
 All of Windows works on untrusted LHS
 User is in charge of…
o Which Nexus(es) will run on system
o Which NCAs will run on system
o Which NCAs allowed to identify system, etc.
 No external process enables Nexus or NCA
 Nexus can’t block, delete, censor data
o NCA does, but NCAs authorized by user
 Nexus is open source

Part 4  Software 950


NGSCB Critics
 Many critics  we consider two
 Ross Anderson
o Perhaps the most influential critic
o Also one of the harshest critics
 Clark Thomborson
o Lesser-known critic
o Criticism strikes at heart of NGSCB

Part 4  Software 951


Anderson’s NGSCB Criticism (1)
 Digital object controlled by its creator, not
user of machine where it resides: Why?
o Creator can specify the NCA
o If user does not accept NCA, access is denied
o Aside: This is critical for, say, MLS applications
 If Microsoft Word encrypts all documents
with key only available to Microsoft products
o Then difficult to stop using Microsoft products

Part 4  Software 952


Anderson’s NGSCB Criticism (2)
 Files from a compromised machine could be
blacklisted to, e.g., prevent music piracy
 Suppose everyone at SJSU uses same pirated
copy of Microsoft Word
o If you stop this copy from working on all NGSCB
machines, SJSU users will not use NGSCB
o Instead, make all NGSCB machines refuse to open
documents created with this copy of Word…
o …so SJSU user can’t share docs with NGSCB user…

Part 4  Software 953


Anderson’s NGSCB Criticism (3)
 Going off the deep end…
o “The Soviet Union tried to register and
control all typewriters. NGSCB attempts
to register and control all computers.”
o “In 2010 President Clinton may have two
red buttons on her desk  one that
sends missiles to China and another that
turns off all of the PCs in China…”

Part 4  Software 954


Thomborson’s NGSCB Criticism
 NGSCB acts like a security guard
 By passive observation, NGSCB
“security guard” can see sensitive info
 Former student worked as security
guard at apartment complex
o By passive observations…
o …he learned about people who lived there

Part 4  Software 955


Thomborson’s NGSCB Criticism
 CanNGSCB spy on you?
 According to Microsoft
o Nexus software is public
o NCAs can be debugged (for development)
o NGSCB is strictly “opt in”
 Loophole?
o Release version of NCA can’t be debugged
and debug and release versions differ

Part 4  Software 956


NGSCB Bottom Line (1)
 NGCSB: trusted OS on an open platform
 Without something similar, PC may lose out
o Particularly in entertainment-related areas
o Copyright holders will not trust PC
o Already lost? (iPod, Kindle, iPad, etc., etc.)
 With NGSCB, will users lose some control
of their PCs?
 But NGSCB users must choose to “opt in”
o If user does not opt in, what has been lost?

Part 4  Software 957


NGSCB Bottom Line (2)
 NGSCB is a trusted system
 Only trusted system can break security
o By definition, an untrusted system is not
trusted with security critical tasks
o Also by definition, a trusted system is trusted
with security critical tasks
o If untrusted system is compromised, security is
not at risk
o If a trusted system is compromised (or simply
malfunctions), security is at risk

Part 4  Software 958


Software Summary
 Software flaws
o Buffer overflow
o Race conditions
o Incomplete mediation
 Malware
o Viruses, worms, etc.
 Other software-based attacks

Part 4  Software 959


Software Summary
 Software Reverse Engineering (SRE)
 Digital Rights Management (DRM)
 Secure software development
o Penetrate and patch
o Open vs closed source
o Testing

Part 4  Software 960


Software Summary
 Operating systems and security
o How does OS enforce security?
 Trusted OS design principles
 Microsoft’s NGSCB
o A trusted OS for DRM

Part 4  Software 961


Course Summary
 Crypto
o Symmetric key, public key, hash functions,
cryptanalysis
 Access Control
o Authentication, authorization
 Protocols
o Simple auth., SSL, IPSec, Kerberos, GSM
 Software
o Flaws, malware, SRE, Software development,
trusted OS

Part 4  Software 962


Conclusion

Conclusion 963
Course Summary
 Crypto
o Basics, symmetric key, public key, hash
functions and other topics, cryptanalysis
 Access Control
o Authentication, authorization, firewalls, IDS
 Protocols
o Simplified authentication protocols
o Real-World protocols
 Software
o Flaws, malware, SRE, development, trusted OS
Conclusion 964
Crypto Basics
 Terminology

 Classic ciphers
o Simple substitution
o Double transposition
o Codebook
o One-time pad
 Basic cryptanalysis

Conclusion 965
Symmetric Key
 Stream ciphers
o A5/1
o RC4
 Block ciphers
o DES
o AES, TEA, etc.
o Modes of operation
 Data integrity (MAC)

Conclusion 966
Public Key
 Knapsack (insecure)
 RSA

 Diffie-Hellman

 Elliptic curve crypto (ECC)


 Digital signatures and non-repudiation
 PKI

Conclusion 967
Hashing and Other
 Birthday problem
 Tiger Hash
 HMAC
 Clever uses (online bids, spam reduction, …)
 Other topics
o Secret sharing
o Random numbers
o Information hiding (stego, watermarking)

Conclusion 968
Advanced Cryptanalysis
 Enigma

 RC4 (as used in WEP)


 Linear and differential cryptanalysis
 Knapsack attack (lattice reduction)
 RSA timing attacks

Conclusion 969
Authentication
 Passwords
o Verification and storage (salt, etc.)
o Cracking (math)
 Biometrics
o Fingerprint, hand geometry, iris scan, etc.
o Error rates
 Two-factor, single sign on, Web cookies

Conclusion 970
Authorization
 History/system certification
 ACLs and capabilities
 Multilevel security (MLS)
o BLP, Biba, compartments, covert channel,
inference control
 CAPTCHA
 Firewalls
 IDS
Conclusion 971
Simple Protocols
 Authentication
o Using symmetric key
o Using public key
o Session key
o Perfect forward secrecy (PFS)
o Timestamps
 Zero knowledge proof (Fiat-Shamir)

Conclusion 972
Real-World Protocols
 SSH
 SSL
 IPSec
o IKE
o ESP/AH, tunnel/transport modes, …
 Kerberos
 Wireless: WEP & GSM
Conclusion 973
Software Flaws and Malware
 Flaws
o Buffer overflow
o Incomplete mediation, race condition, etc.
 Malware
o Brain, Morris Worm, Code Red, Slammer
o Malware detection
o Future of malware, botnets, etc.
 Other software-based attacks
o Salami, linearization, etc.
Conclusion 974
Insecurity in Software
 Software reverse engineering (SRE)
o Software protection
 Digital rights management (DRM)
 Software development
o Open vs closed source
o Finding flaws (do the math)

Conclusion 975
Operating Systems
 OS security functions
o Separation
o Memory protection, access control
 Trusted OS
o MAC, DAC, trusted path, TCB, etc.
 NGSCB
o Technical issues
o Criticisms

Conclusion 976
Crystal Ball
 Cryptography
o Well-established field
o Don’t expect major changes
o But some systems will be broken
o ECC is a major “growth” area
o Quantum crypto may prove worthwhile…
o …but for now it’s mostly (all?) hype

Conclusion 977
Crystal Ball
 Authentication
o Passwords will continue to be a problem
o Biometrics should become more widely used
o Smartcard/tokens will be used more
 Authorization
o ACLs, etc., well-established areas
o CAPTCHA’s interesting new topic
o IDS is a very hot topic

Conclusion 978
Crystal Ball
 Protocols are challenging
 Difficult to get protocols right
 Protocol development often haphazard
o “Kerckhoffs’ Principle” for protocols?
o Would it help?
 Protocols will continue to be a source of
subtle problem

Conclusion 979
Crystal Ball
 Software is a huge security problem today
o Buffer overflows are on the decline…
o …but race condition attacks might increase
 Virus writers are getting smarter
o Botnets
o Polymorphic, metamorphic, sophisticated attacks, …
o Future of malware detection?
 Malware will continue to be a BIG problem
Conclusion 980
Crystal Ball
 Other software issues
o Reverse engineering will not go away
o Secure development will remain hard
o Open source is not a panacea
 OS issues
o NGSCB (or similar) might change things…
o …but, for better or for worse?

Conclusion 981
The Bottom Line
 Security knowledge is needed today…
 …and it will be needed in the future
 Necessary to understand technical issues
o The focus of this class
 But technical knowledge is not enough
o Human nature, legal issues, business issues, ...
o As with anything, experience is helpful

Conclusion 982
A True Story
 The names have been changed…
 “Bob” took my information security class
 Bob then got an intern position
o At a major company that does lots of security
 One meeting, an important customer asked
o “Why do we need signed certificates?”
o “After all, they cost money!”
 The silence was deafening
Conclusion 983
A True Story
 Bob’s boss remembered that Bob had taken
a security class
o So he asked Bob, the lowly intern, to answer
o Bob mentioned man-in-the-middle attack on SSL
 Customer wanted to hear more
o So, Bob explained MiM attack in some detail
 The next day, “Bob the lowly intern” became
“Bob the fulltime employee”

Conclusion 984
Appendix

Appendix 985
Appendix
 Networking basics
o Protocol stack, layers, etc.
 Math basics
o Modular arithmetic
o Permutations
o Probability
o Linear algebra

Appendix 986
Networking Basics

There are three kinds of death in this world.


There's heart death, there's brain death, and there's being off the network.
 Guy Almes

Appendix 987
Network
 Includes
o Computers
o Servers
o Routers
o Wireless devices
o Etc.
 Purpose is to
transmit data

Appendix 988
Network Edge
 Network edge
includes…
 …Hosts
o Computers
o Laptops
o Servers
o Cell phones
o Etc., etc.

Appendix 989
Network Core

 Network core
consists of
o Interconnected
mesh of routers
 Purpose is to
move data from
host to host

Appendix 990
Packet Switched Network
 Telephone network is/was circuit switched
o For each call, a dedicated circuit established
o Dedicated bandwidth
 Modern data networks are packet switched
o Data is chopped up into discrete packets
o Packets are transmitted independently
o No dedicated circuit is established
+ More efficient bandwidth usage
- But more complex than circuit switched
Appendix 991
Network Protocols
 Study of networking focused on protocols
 Networking protocols precisely specify
“communication rules”
 Details are given in RFCs
o RFC is essentially an Internet standard
 Stateless protocols do not “remember”
 Stateful protocols do “remember”
 Many security problems related to state
o E.g., DoS is a problem with stateful protocols
Appendix 992
Protocol Stack
 Application layer protocols user
o HTTP, FTP, SMTP, etc. application space

 Transport layer protocols transport


o TCP, UDP OS
 Network layer protocols network
o IP, routing protocols
link
 Link layer protocols NIC
card
o Ethernet, PPP physical
 Physical layer
Appendix 993
Layering in Action
router
data application application data
transport transport
network network network
link link link
host physical
host
physical physical

 At source, data goes “down” the protocol stack


 Each router processes packet “up” to network layer
o That’s where routing info lives
 Router then passes packet down the protocol stack
 Destination processes packet up to application layer
o That’s where the application data lives
Appendix 994
Encapsulation data X

 X = application data at source application


 As X goes down protocol stack, each
transport
layer adds header information:
o Application layer: (H, X) network
o Transport layer: (H, (H, X))
o Network layer: (H, (H, (H, X))) link
o Link layer: (H, (H, (H, (H, X))))
physical
 Header has info required by layer
 Note that app data is on the “inside” packet
(H,(H,(H,(H,X))))
Appendix 995
Application Layer
 Applications
o For example, Web browsing, email, P2P, etc.
o Applications run on hosts
o To hosts, network details should be transparent
 Application layer protocols
o HTTP, SMTP, IMAP, Gnutella, etc., etc.
 Protocol is only one part of an application
o For example, HTTP only a part of web browsing

Appendix 996
Client-Server Model
 Client
o “speaks first”
 Server
o responds to client’s request
 Hosts are clients or servers
 Example: Web browsing
o You are the client (request web page)
o Web server is the server

Appendix 997
Peer-to-Peer Paradigm
 Hosts act as clients and servers
 For example, when sharing music
o You are client when requesting a file
o You are a server when someone
downloads a file from you
 In P2P, how does client find server?
o Many different P2P models for this

Appendix 998
HTTP Example
HTTP request

HTTP response

 HTTP  HyperText Transfer Protocol


 Client (you) requests a web page
 Server responds to your request

Appendix 999
cookie
Web Cookies
initial
session

Cookie
database
cookie
later
session

 HTTP is stateless  cookies used to add state


 Initially, cookie sent from server to browser
 Browser manages cookie, sends it to server
 Server uses cookie database to “remember” you
Appendix 1000
Web Cookies
 Web cookies used for…
o Shopping carts, recommendations, etc.
o A very (very) weak form of
authentication
 Privacy concerns
o Web site can learn a lot about you
o Multiple web sites could learn even more

Appendix 1001
SMTP
 SMTP used to deliver email from sender to
recipient’s mail server
 Then POP3, IMAP or HTTP (Web mail)
used to get messages from server
 As with many application protocols, SMTP
commands are human readable

Sender Recipient
SMTP SMTP
POP3

Appendix 1002
Spoofed email with SMTP
User types the red lines:
> telnet eniac.cs.sjsu.edu 25
220 eniac.sjsu.edu
HELO ca.gov
250 Hello ca.gov, pleased to meet you
MAIL FROM: <arnold@ca.gov>
250 arnold@ca.gov... Sender ok
RCPT TO: <stamp@cs.sjsu.edu>
250 stamp@cs.sjsu.edu ... Recipient ok
DATA
354 Enter mail, end with "." on a line by itself
It is my pleasure to inform you that you
are terminated
.
250 Message accepted for delivery
QUIT
221 eniac.sjsu.edu closing connection

Appendix 1003
Application Layer
 DNS  Domain Name Service
o Convert human-friendly names such as
www.google.com into 32-bit IP address
o A distributed hierarchical database
 Only 13 “root” DNS server clusters
o Essentially, a single point of failure for Internet
o Attacks on root servers have succeeded…
o …but, attacks did not last long enough (yet)

Appendix 1004
Transport Layer
 The network layer offers unreliable, “best
effort” delivery of packets
 Any improved service must be provided by
the hosts
 Transport layer: 2 protocols of interest
o TCP  more service, more overhead
o UDP  less service, less overhead
 TCP and UDP run on hosts, not routers

Appendix 1005
TCP
 TCP assures that packets…
o Arrive at destination
o Are processed in order
o Are not sent too fast for receiver: flow control
 TCP also attempts to provide…
o Network-wide congestion control
 TCP is connection-oriented
o TCP contacts server before sending data
o Orderly setup and take down of “connection”
o But no true connection, only logical “connection”
Appendix 1006
TCP Header
bits
0 8 16 24 31

Source Port Destination Port


Sequence Number
Acknowledgement Number
Offset reserved U A P R S F Window
Checksum Urgent Pointer
Options Padding
Data (variable length)

 Source and destination port


 Sequence number
 Flags (ACK, SYN, RST, etc.)
 Header usually 20 bytes (if no options)
Appendix 1007
TCP Three-Way Handshake
SYN request

SYN-ACK

ACK (and data)

 SYN  synchronization requested


 SYN-ACK  acknowledge SYN request
 ACK  acknowledge SYN-ACK (send data)
 Then TCP “connection” established
o Connection terminated by FIN or RST
Appendix 1008
Denial of Service Attack
 The TCP 3-way handshake makes denial of
service (DoS) attacks possible
 Whenever SYN packet is received, server
remembers this “half-open” connection
o Remembering consumes resources
o Too many half-open connections and server’s
resources will be exhausted, and then…
o …server can’t respond to legitimate connections
 This occurs because TCP is stateful
Appendix 1009
UDP
 UDP is minimalist, “no frills” service
o No assurance that packets arrive
o No assurance packets are in order, etc., etc.
 Why does UDP exist?
o More efficient (header only 8 bytes)
o No flow control to slow down sender
o No congestion control to slow down sender
 If packets sent too fast, will be dropped
o Either at intermediate router or at destination
o But in some apps this may be OK (audio/video)
Appendix 1010
Network Layer
 Core of network/Internet
o Interconnected mesh of routers
 Purpose of network layer
o Route packets through this mesh
 Network layer protocol of interest is IP
o Follows a best effort approach
 IP runs in every host and every router
 Routers also run routing protocols
o Used to determine the path to send packets
o Routing protocols: RIP, OSPF, BGP, …
Appendix 1011
IP Addresses
 IP address is 32 bits
 Every host has an IP address
 Big problem  Not enough IP addresses!
o Lots of tricks used to extend address space
 IP addresses given in dotted decimal notation
o For example: 195.72.180.27
o Each number is between 0 and 255
 Usually, a host’s IP address can change
Appendix 1012
Socket
 Each host has a 32 bit IP address
 But, many processes can run on one host
o E.g., you can browse web, send email at same time
 How to distinguish processes on a host?
 Each process has a 16 bit port number
o Numbers below 1024 are “well-known” ports
(HTTP is port 80, POP3 is port 110, etc.)
o Port numbers above 1024 are dynamic (as needed)
 IP address + port number = socket
o Socket uniquely identifies process, Internet-wide
Appendix 1013
Network Address Translation
 Network Address Translation (NAT)
o Trick to extend IP address space
 Use one IP address (different port
numbers) for multiple hosts
o “Translates” outside IP address (based
on port number) to inside IP address

Appendix 1014
NAT-less Example

source 11.0.0.1:1025
destination 12.0.0.1:80

source 12.0.0.1:80
destination 11.0.0.1:1025
Web
server Alice
IP: 12.0.0.1 IP: 11.0.0.1
Port: 80 Port: 1025

Appendix 1015
NAT Example

src 11.0.0.1:4000 src 10.0.0.1:1025


dest 12.0.0.1:80 dest 12.0.0.1:80

src 12.0.0.1:80 src 12.0.0.1:80


dest 11.0.0.1:4000 dest 10.0.0.1:1025
Web
server Firewall Alice
IP: 12.0.0.1 IP: 11.0.0.1 IP: 10.0.0.1
NAT Table
4000 10.0.0.1:1025

Appendix 1016
NAT: The Last Word
 Advantage(s)?
o Extends IP address space
o One (or a few) IP address(es) can be
shared by many users
 Disadvantage(s)?
o End-to-end security is more difficult
o Might make IPSec less effective
(IPSec discussed in Chapter 10)
Appendix 1017
IP Header

 IP header has necessary info for routers


o E.g., source and destination IP addresses
 Time to live (TTL) limits number of “hops”
o So packets can’t circulate forever
 Fragmentation information (see next slide)
Appendix 1018
IP Fragmentation
fragmented

re-assembled

 Each link limits maximum size of packets


 If packet is too big, router fragments it
 Re-assembly occurs at destination

Appendix 1019
IP Fragmentation
 One packet becomes multiple packets
 Packets reassembled at destination
o Prevents multiple fragmentation/reassemble
 Fragmentation is a security issue…
o Fragments may obscure real purpose of packet
o Fragments can overlap when reassembled
o Must reassemble packet to fully understand it
o Lots of work for firewalls, for example

Appendix 1020
IPv6
 Current version of IP is IPv4
 IPv6 is a “new-and-improved” version of IP
 IPv6 is “bigger and better” than IPv4
o Bigger addresses: 128 bits
o Better security: IPSec
 How to migrate from IPv4 to IPv6?
o Unfortunately, nobody thought about that…
 So IPv6 has not really taken hold (yet?)
Appendix 1021
Link Layer
 Link layer sends
packet from one
node to next
 Links can be
different
o Wired
o Wireless
o Ethernet
o Point-to-point…

Appendix 1022
Link Layer
 On host, implemented in adapter:
Network Interface Card (NIC)
o Ethernet card, wireless 802.11 card, etc.
o NIC is “semi-autonomous” device
 NIC is (mostly) out of host’s control
o Implements both link and physical layers

Appendix 1023
Ethernet
 Ethernet is a multiple access protocol
 Many hosts access a shared media
o On a local area network, or LAN
 With multiple access, packets can “collide”
o Data is corrupted and packets must be resent
 How to efficiently deal with collisions in
distributed environment?
o Many possibilities, ethernet is most popular
 We won’t discuss details here…
Appendix 1024
Link Layer Addressing
 IP addresses live at network layer
 Link layer also needs addresses  Why?
o MAC address (LAN address, physical address)
 MAC address
o 48 bits, globally unique
o Used to forward packets over one link
 Analogy…
o IP address is like your home address
o MAC address is like a social security number
Appendix 1025
ARP
 Address Resolution Protocol (ARP)
 Used by link layer  given IP address, find
corresponding MAC address
 Each host has ARP table, or ARP cache
o Generated automatically
o Entries expire after some time (about 20 min)
o ARP used to find ARP table entries

Appendix 1026
ARP
 ARP is stateless
 ARP can send request and receive reply
 Reply msgs used to fill/update ARP cache

IP: 111.111.111.001 IP: 111.111.111.002

LAN
MAC: AA-AA-AA-AA-AA-AA MAC: BB-BB-BB-BB-BB-BB

111.111.111.002 BB-BB-BB-BB-BB-BB 111.111.111.001 AA-AA-AA-AA-AA-AA


Alice’s ARP cache Bob’s ARP cache

Appendix 1027
ARP Cache Poisoning
 ARP is stateless, so…
 Accept “reply”, even if no request sent

Trudy 111.111.111.003
CC-CC-CC-CC-CC-CC

ARP “reply” ARP “reply”


111.111.111.002 111.111.111.001
CC-CC-CC-CC-CC-CC CC-CC-CC-CC-CC-CC

111.111.111.001
LAN 111.111.111.002
AA-AA-AA-AA-AA-AA BB-BB-BB-BB-BB-BB

111.111.111.002 CC-CC-CC-CC-CC-CC
BB-BB-BB-BB-BB-BB 111.111.111.001 AA-AA-AA-AA-AA-AA
CC-CC-CC-CC-CC-CC

Alice’s ARP cache Bob’s ARP cache

 Host CC-CC-CC-CC-CC-CC is man-in-the-middle


Appendix 1028
Math Basics

7/5ths of all people don’t understand fractions.


 Anonymous

Appendix 1029
Modular Arithmetic

Appendix 1030
Clock Arithmetic
 For integers x and n, “x mod n” is the
remainder when we compute x  n
o We can also say “x modulo n”
 Examples 0
o 33 mod 6 = 3 1
5
o 33 mod 5 = 3
o 7 mod 6 = 1 number “line”
o 51 mod 17 = 0 mod 6
o 17 mod 6 = 5
4 2

3
Appendix 1031
Modular Addition
 Notation and fun facts
o 7 mod 6 = 1
o 7 = 13 = 1 mod 6
o ((a mod n) + (b mod n)) mod n = (a + b) mod n
o ((a mod n)(b mod n)) mod n = ab mod n
 Addition Examples
o 3 + 5 = 2 mod 6
o 2 + 4 = 0 mod 6
o 3 + 3 = 0 mod 6
o (7 + 12) mod 6 = 19 mod 6 = 1 mod 6
o (7 + 12) mod 6 = (1 + 0) mod 6 = 1 mod 6

Appendix 1032
Modular Multiplication
 Multiplication Examples
o 3  4 = 0 mod 6
o 2  4 = 2 mod 6
o 5  5 = 1 mod 6
o (7  4) mod 6 = 28 mod 6 = 4 mod 6
o (7  4) mod 6 = (1  4) mod 6 = 4 mod 6

Appendix 1033
Modular Inverses
 Additive inverse of x mod n, denoted –
x mod n, is the number that must be
added to x to get 0 mod n
o -2 mod 6 = 4, since 2 + 4 = 0 mod 6
 Multiplicative inverse of x mod n,
denoted x-1 mod n, is the number that
must be multiplied by x to get 1 mod n
o 3-1 mod 7 = 5, since 3  5 = 1 mod 7

Appendix 1034
Modular Arithmetic Quiz
 Q: What is -3 mod 6?
 A: 3
 Q: What is -1 mod 6?
 A: 5
 Q: What is 5-1 mod 6?
 A: 5
 Q: What is 2-1 mod 6?
 A: No number works!
 Multiplicative inverse might not exist

Appendix 1035
Relative Primality
x and y are relatively prime if they
have no common factor other than 1
 x-1 mod y exists only when x and y are
relatively prime
 If it exists, x-1 mod y is easy to
compute using Euclidean Algorithm
o We won’t do the computation here
o But, an efficient algorithm exists
Appendix 1036
Totient Function
 (n) is “the number of numbers less than n
that are relatively prime to n”
o Here, “numbers” are positive integers
 Examples
o (4) = 2 since 4 is relatively prime to 3 and 1
o (5) = 4 since 5 is relatively prime to 1,2,3,4
o (12) = 4
o (p) = p-1 if p is prime
o (pq) = (p-1)(q-1) if p and q prime

Appendix 1037
Permutations

Appendix 1038
Permutation Definition
 Let S be a set
 A permutation of S is an ordered list
of the elements of S
o Each element of S appears exactly once
 Suppose S = {0,1,2,…,n-1}
o Then the number of perms is…
o n(n-1)(n-2)    (2)(1) = n!

Appendix 1039
Permutation Example
 Let S = {0,1,2,3}
 Then there are 24 perms of S
 For example,
o (3,1,2,0) is a perm of S
o (0,2,3,1) is a perm of S, etc.
 Perms are important in cryptography

Appendix 1040
Probability Basics

Appendix 1041
Discrete Probability
 We only require some elementary facts
 Suppose that S={0,1,2,…,N1} is the
set of all possible outcomes
 If each outcome is equally likely, then
the probability of event E  S is
o P(E) = # elements in E / # elements in S

Appendix 1042
Probability Example
 Forexample, suppose we flip 2 coins
 Then S = {hh,ht,th,tt}
o Suppose X = “at least one tail” = {ht,th,tt}
o Then P(X) = 3/4
 Often, it’s easier to compute
o P(X) = 1  P(complement of X)

Appendix 1043
Complement
 Again, suppose we flip 2 coins
 Let S = {hh,ht,th,tt}
o Suppose X = “at least one tail” = {ht,th,tt}
o Complement of X is “no tails” = {hh}
 Then
o P(X) = 1  P(comp. of X) = 1  1/4 = 3/4
 We make use of this trick often!

Appendix 1044
Linear Algebra Basics

Appendix 1045
Vectors and Dot Product
 Let  be the set of real numbers
 Then v  n is a vector of n elements
 For example
o v = [v1,v2,v3,v4] = [2,1, 3.2, 7]  4
 The dot product of u,v  n is
o u  v = u1v1 + u2v2 +… + unvn

Appendix 1046
Matrix
A matrix is an n x m array
 For example, the matrix A is 2 x 3

 Theelement in row i column j is aij


 We can multiply a matrix by a number

Appendix 1047
Matrix Addition
 We can add matrices of the same size

 We can also multiply matrices, but this


is not so obvious
 We do not simply multiply the elements

Appendix 1048
Matrix Multiplication
 Suppose A is m x n and B is s x t
 Then C=AB is only defined if n=s, in
which case C is m x t
 Why?
 The element cij is the dot product of
row i of A with column j of B

Appendix 1049
Matrix Multiply Example
 Suppose

 Then

 And AB is undefined
Appendix 1050
Matrix Multiply Useful Fact
 Consider AU = B where A is a matrix and U
and B are column vectors
 Let a1,a2,…,an be columns of A and
u1,u2,…,un the elements of U
 Then B = u1a1 + u2a2 + … + unan

Example:
[ 31 45] [ 26 ] = 2[ ]
3
1
+ 6 [ ]
4
5
= [ 30
32
]

Appendix 1051
Identity Matrix
A matrix is square if it has an equal
number of rows and columns
 For square matrices, the identity
matrix I is the multiplicative identity
o AI = IA = A
 The 3 x 3 identity matrix is

Appendix 1052
Block Matricies
 Block matrices are matrices of matrices
 For example

 We can do arithmetic with block matrices


 Block matrix multiplication works if
individual matrix dimensions “match”

Appendix 1053
Block Matrix Mutliplication
 Block matrices multiplication example
 For matrices

 We have

 Where X = U+CT and Y = AU+BT


Appendix 1054
Linear Independence
 Vectors u,v  n linearly independent
if au + bv = 0 implies a=b=0
 For example,

 Are linearly independent

Appendix 1055
Linear Independence
 Linear independence can be extended
to more than 2 vectors
 If vectors are linearly independent,
then none of them can be written as a
linear combination of the others
o None of the independent vectors is a
sum of multiples of the other vectors

Appendix 1056