
Hash Pointers and Data Structures

Cryptographically Secure Hash Functions
• Property 1: Deterministic
– No matter how many times you pass a particular input through a hash function, you
will always get the same result.
• Property 2: Quick Computation
– Hash function should be capable of returning the hash of an input quickly.
• Property 3: Pre-Image Resistance
– Given H(A) it is infeasible to determine A, where A is the input and H(A) is the output hash.
• Property 4: Small Changes in the Input Change the Hash
– Even a small change in the input produces a completely different hash.
• Property 5: Collision Resistant
– It is infeasible to find two different inputs A and B whose hashes H(A) and H(B) are
equal. Collisions exist, but nobody can find one in practice.
• Property 6: Puzzle Friendly
– For every output “Y”, if k is chosen from a distribution with high min-entropy it is infeasible to
find an input x such that H(k|x) = Y.
The same input always produces the same hash (always 256 bits in size),
and slightly different inputs produce very different hashes (also 256 bits
in size).
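A minimal sketch of Properties 1 and 4, assuming SHA-256 as the hash function (the slides mention a 256-bit output but do not name the function). The helper names and sample inputs are illustrative only.

```python
import hashlib

def sha256_digest(data: bytes) -> bytes:
    """Return the 32-byte (256-bit) SHA-256 digest of data."""
    return hashlib.sha256(data).digest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

first = sha256_digest(b"hello world")
again = sha256_digest(b"hello world")   # same input
close = sha256_digest(b"hello worle")   # last character changed

print(first == again)                   # deterministic: same input, same hash
print(len(first) * 8)                   # output size in bits
print(differing_bits(first, close))     # typically close to half of the 256 bits
```

The last line usually prints a number near 128: a one-character change flips about half of the output bits.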
Pointers
Hash Pointers
• Hash pointer is:
– Pointer to where some info/data is stored, and
– (Cryptographic) hash of the info
Hash Pointers

What can you do with a hash pointer?


• Retrieve or get back the info/data
• Verify that the info/data hasn’t changed
• What else?
Use hash pointers to build data
structures!
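As a rough sketch of the idea, a hash pointer can be modelled as an address plus a digest of what was stored there. The `storage` dictionary, the `HashPointer` class and the function names below are hypothetical stand-ins for real memory or disk, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical "memory": maps an address to the bytes stored there.
storage = {}

@dataclass(frozen=True)
class HashPointer:
    address: int    # where the data is stored
    digest: bytes   # cryptographic hash of the data at the time of pointing

def store(address: int, data: bytes) -> HashPointer:
    """Store data and return a hash pointer to it."""
    storage[address] = data
    return HashPointer(address, hashlib.sha256(data).digest())

def retrieve(ptr: HashPointer) -> bytes:
    """Get the data back, like following an ordinary pointer."""
    return storage[ptr.address]

def is_unchanged(ptr: HashPointer) -> bool:
    """Verify that the stored data still matches the remembered hash."""
    return hashlib.sha256(retrieve(ptr)).digest() == ptr.digest

ptr = store(0, b"some info")
print(is_unchanged(ptr))        # data is untouched
storage[0] = b"some inf0"       # someone modifies the stored data
print(is_unchanged(ptr))        # the hash pointer exposes the change
```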
Applications of
Hash Pointer Data Structures
• Blockchain
• Merkle tree
Block Chains
• A block header contains:
– Version: The block version number.
– Time: the current timestamp.
– The current difficulty target.
– Hash of the previous block.
– Nonce.
– Merkle root (the root hash of the block's Merkle tree).
Block Chains
• What is a Block Chain?
– Linked list with hash pointers
• What is it used for?
– Tamper-evident log or register
Tamper-evident Log

• An attacker wants to tamper with one block of the chain, say block 1.
• The attacker changes the content of block 1. Because the hash function is
collision resistant, he cannot find different data with the same hash as the
original, so the hash of the modified block changes too.
• To keep others from noticing the inconsistency, he must also change the
hash pointer to that block, which is stored in the next block, block 2.
• Now the content of block 2 has changed, so to keep the story consistent
the hash pointer in block 3 must be changed as well.
• Finally, the attacker reaches the hash pointer to the last block of the
blockchain. This is a roadblock for him, because we keep and remember
that hash pointer.
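The tampering story can be sketched in a few lines. This is a simplified model, not Bitcoin's block format: each block is a dictionary holding only `prev_hash` and `data`, SHA-256 is an assumed hash, and `GENESIS_PREV`, `build_chain` and `is_consistent` are illustrative names.

```python
import hashlib

GENESIS_PREV = b"\x00" * 32  # placeholder "previous hash" for the first block

def block_hash(block):
    """Hash a block's contents: its hash pointer to the previous block plus its data."""
    return hashlib.sha256(block["prev_hash"] + block["data"]).digest()

def build_chain(items):
    """Link the items into a chain; return the chain and the head hash to remember."""
    chain, prev = [], GENESIS_PREV
    for data in items:
        block = {"prev_hash": prev, "data": data}
        chain.append(block)
        prev = block_hash(block)
    return chain, prev

def is_consistent(chain, head_hash):
    """Check every hash pointer, ending with the head hash we remembered."""
    prev = GENESIS_PREV
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        prev = block_hash(block)
    return prev == head_hash

chain, head_hash = build_chain([b"block 0", b"block 1", b"block 2", b"block 3"])
print(is_consistent(chain, head_hash))        # True: untouched chain

# The attacker edits block 1, then patches every later hash pointer.
chain[1]["data"] = b"block 1 (tampered)"
prev = block_hash(chain[1])
for block in chain[2:]:
    block["prev_hash"] = prev
    prev = block_hash(block)
forged_head = prev

print(forged_head == head_hash)               # False: the head hash no longer matches
print(is_consistent(chain, head_hash))        # False: the remembered head exposes the edit
```

The internal pointers of the patched chain agree with each other; only the head hash that we stored separately gives the tampering away.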
Conclusion on Tamper-evident Log
• If the adversary wants to tamper with data anywhere
in this entire chain, then to keep the story
consistent he has to tamper with every hash
pointer from that block up to the head of the list. He
ultimately runs into a roadblock, because he
won't be able to tamper with the head of the list.
• So we can build a block chain like this containing as
many blocks as we want, going back to some special
block at the beginning of the list, which we might call
the genesis block. That is a tamper-evident log
built out of the block chain.
Merkle tree
• Another useful hash pointer data structure is the Merkle tree.
• A Merkle tree is a data structure used for efficiently verifying the
integrity of large sets of data.
• Binary tree with hash pointers!
[Figure: a binary Merkle tree. The root holds two hash pointers H( ); each level below doubles the number of hash pointers (2, 4, 8), and the bottom level points to the 8 data blocks.]


Features of Merkle Tree
• Tamper evident
Just like a blockchain, we only need to remember the hash pointer at
the root (top-level node). We can then traverse down to any leaf data
block to check whether it is in the tree and whether it has been tampered with.
• Traversal efficiency
To verify a data block, we only need to traverse the path from the root
to the leaf where the data is. So the complexity is O(log n), which is
much more efficient than the O(n) of a linked-list blockchain.
• Non-membership proof
If the Merkle tree is sorted, we can prove that a given data item is not in the
tree: if the items just before and just after it are both in the tree and are
consecutive leaves, there is no space between them, which proves
that the given item is not in the tree.
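A membership proof can be sketched as follows. This is a simplified illustration that assumes SHA-256 and a number of leaves that is a power of two, and it omits refinements such as separating leaf hashes from internal-node hashes. All function names are illustrative.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return all tree levels: level 0 holds the leaf hashes, the last level holds the root.
    Assumes len(leaves) is a power of two."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def make_proof(levels, index):
    """Collect the sibling hash at each level on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1                       # the other child of the same parent
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify_proof(root, leaf, proof):
    """Recompute the root from the leaf and the O(log n) sibling hashes."""
    digest = h(leaf)
    for sibling, sibling_is_left in proof:
        digest = h(sibling + digest) if sibling_is_left else h(digest + sibling)
    return digest == root

leaves = [b"data %d" % i for i in range(8)]
levels = build_levels(leaves)
root = levels[-1][0]                              # the only value the verifier must remember
proof = make_proof(levels, 5)

print(len(proof))                                 # 3 sibling hashes for 8 leaves (log2 of 8)
print(verify_proof(root, leaves[5], proof))       # True
print(verify_proof(root, b"forged data", proof))  # False
```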
Merkle tree
• Advantages:
– Tree holds many items, but we just need to
remember the root hash
– Proving membership of a data block in the
tree is easy
– Only need to show O(log n) items
– In other words, membership verification
in O(log n) time/space
• How to prove non-membership?
– Sorted Merkle trees: order the leaves of the
tree in some fashion, say
lexicographically, numerically, etc.
– Verify membership of the data before and
after the missing one!
– Non-membership verification also takes
O(log n) time/space
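The non-membership argument can be sketched under the same assumptions as a plain Merkle tree (SHA-256, a power-of-two number of leaves), with the leaves sorted lexicographically. The verifier checks two membership proofs and recovers each leaf's position from the left/right directions in its proof, so it can confirm the two leaves are adjacent. Names such as `prove_absent` and `verify_absent` are illustrative.

```python
import bisect
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Level 0 holds leaf hashes; the last level holds the root (power-of-two leaf count)."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def make_proof(levels, index):
    """Sibling hashes from leaf to root, each flagged True if the sibling is on the left."""
    proof = []
    for level in levels[:-1]:
        proof.append((level[index ^ 1], (index ^ 1) < index))
        index //= 2
    return proof

def verify_proof(root, leaf, proof):
    digest = h(leaf)
    for sibling, sibling_is_left in proof:
        digest = h(sibling + digest) if sibling_is_left else h(digest + sibling)
    return digest == root

def index_from_proof(proof):
    """A left-hand sibling at depth d means bit d of the leaf's index is 1."""
    return sum(1 << d for d, (_, is_left) in enumerate(proof) if is_left)

def prove_absent(sorted_leaves, levels, item):
    """Return (leaf, proof) for the neighbours on each side of the missing item.
    Assumes the item is absent and falls strictly between two existing leaves."""
    i = bisect.bisect_left(sorted_leaves, item)
    return ((sorted_leaves[i - 1], make_proof(levels, i - 1)),
            (sorted_leaves[i], make_proof(levels, i)))

def verify_absent(root, item, before, after):
    (lo_leaf, lo_proof), (hi_leaf, hi_proof) = before, after
    return (verify_proof(root, lo_leaf, lo_proof)
            and verify_proof(root, hi_leaf, hi_proof)
            and index_from_proof(hi_proof) == index_from_proof(lo_proof) + 1
            and lo_leaf < item < hi_leaf)

leaves = sorted([b"apple", b"banana", b"cherry", b"date",
                 b"fig", b"grape", b"kiwi", b"lemon"])
levels = build_levels(leaves)
root = levels[-1][0]

before, after = prove_absent(leaves, levels, b"elderberry")
print(verify_absent(root, b"elderberry", before, after))   # True: no room between date and fig
print(verify_absent(root, b"fig", before, after))          # False: fig is not strictly between them
```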
More generally ...

Hash pointers can be used in any pointer-based data structure
that has no cycles
Digital Signatures & Public Keys as Identities
Digital Signatures
• Second cryptographic primitive (in addition to Hash
functions) that we will need to build
cryptocurrencies (and bitcoins)
• What properties do we need from digital
signatures? The same ones we need from
handwritten signatures:
– Only you can sign, but anyone can verify
– Signature is tied to a particular document and can’t be
cut-and-pasted to another document (unforgeability)
Digital Signatures APIs
1. (sk, pk) := generateKeys(keysize)
sk: secret signing key
pk: public verification key

2. sig := sign(sk, message)

(generateKeys and sign can be randomized algorithms)

3. isValid := verify(pk, message, sig)

(verify is a deterministic algorithm)
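To make the three-call API concrete using only hash functions, here is a sketch of a Lamport one-time signature. This is a swapped-in scheme chosen for illustration: it is not ECDSA and not what Bitcoin uses, each key pair may safely sign only one message, and `generate_keys` takes no `keysize` parameter. In this particular scheme key generation is randomized and signing is deterministic.

```python
import hashlib
import secrets

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def generate_keys():
    """sk: 256 pairs of random secrets; pk: the hashes of those secrets."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(H(zero), H(one)) for zero, one in sk]
    return sk, pk

def message_bits(message: bytes):
    """Hash the message first, so messages of any length map to 256 bits."""
    digest = H(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def sign(sk, message: bytes):
    """Reveal one secret from each pair, selected by the bits of Hash(message)."""
    return [sk[i][bit] for i, bit in enumerate(message_bits(message))]

def verify(pk, message: bytes, sig) -> bool:
    """Anyone holding pk can check that each revealed secret hashes to the right value."""
    bits = message_bits(message)
    return len(sig) == 256 and all(
        H(secret) == pk[i][bit] for i, (secret, bit) in enumerate(zip(sig, bits))
    )

sk, pk = generate_keys()
sig = sign(sk, b"pay 1 coin to Bob")

print(verify(pk, b"pay 1 coin to Bob", sig))     # True: a valid signature verifies
print(verify(pk, b"pay 100 coins to Bob", sig))  # False: signature is tied to the message
```

Reusing `sk` for a second message would reveal more of the secrets and weaken the scheme, which is why it is called a one-time signature.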
Digital Signatures Requirements
1. Valid signatures must always verify correctly
• i.e., verify(pk, message, sign(sk, message)) == true
• Basic property for signatures to be useful!

2. Signatures should be existentially unforgeable (can’t forge signatures)
• i.e., an adversary who knows pk and gets to see
signatures on messages of his choice still can’t
produce a verifiable signature on another message
• Can be formalized by means of the unforgeability
game described next
Unforgeability game
• Participants: an adversary who claims that he can forge
signatures and a challenger who will test this claim
• generateKeys is run to produce a key pair: the secret key is given to the
challenger and the public key to the adversary
• The attacker may obtain signatures on documents of
his choice, for as long as he wants, as long as the number of
requests is plausible
• After that, the attacker picks some message M for which he has not
previously seen a signature, and attempts to forge a signature on it
• The challenger runs the verify algorithm to determine whether the
signature produced by the attacker is a valid signature on M
– If it successfully verifies, the attacker wins the game
Digital Signature with Hash
Practical stuff...
• Algorithms are randomized
– Need a good source of randomness
• Limit on message size
– Fix: use Hash(message) rather than message
• Fun trick: sign a hash pointer
– Signature “covers” the whole structure
• How to sign the entire Block chain?
– Sign the hash pointer to the head block!
– This signature “covers” the whole block chain structure
Digital Signatures Used by Bitcoin
• Bitcoin uses the Elliptic Curve Digital Signature Algorithm (ECDSA)
standard
– ECDSA is a US Government standard

• Bitcoin uses ECDSA over the standard elliptic curve secp256k1
– This curve is rarely used outside Bitcoin
– Provides 128 bits of security (equivalent to performing 2^128 symmetric
encryptions)
– Private key – 256 bits
– Public key compressed – 257 bits
– Message to be signed – 256 bits
– Signature – 512 bits

• Good randomness is essential for ECDSA
– If you foul this up in generateKeys() or sign(), you have probably leaked your
private key
Identities in a cryptocurrency
• Can we use a public key pk, as generated before
by generateKeys(keysize), as an identity?

• For example, if you see a signature sig such that
verify(pk, msg, sig)==true, think of it as: pk
says, “[msg]”.

• But pk by itself cannot be used as an identity!
To “speak for” pk, you must know the matching
secret key sk
Identities in a cryptocurrency
How to make a new identity?
• Create a new, random key-pair (sk, pk)
– pk is the public “name” you can use [usually
better to use H(pk)]
– sk lets you “speak for” the identity
• You control the identity pk, because only you know
sk
• Even if pk “looks random” that’s fine, nobody needs
to know your real identity for the cryptocurrency
application
• Just like when spending an actual currency note
Identities in a cryptocurrency
Decentralized identity management
• Anybody can make a new identity at any time;
make as many as you want!
• No central point of coordination
• These identities are called “addresses” in
Bitcoin
Identities in a cryptocurrency
Privacy
• Addresses are not directly connected to a
real-world identity
• But an observer can link together an address’s
activity over time and make inferences
