Universal Classes of Hash Functions

(extended abstract)
J. Lawrence Carter and Mark N. Wegman
IBM Thomas J. Watson Research Center
Yorktown Heights, New York 10598
Abstract:
This paper gives an input independent average linear time algorithm for storage and retrieval on keys. The algorithm makes a random choice of hash function from a suitable class of hash functions. Given any sequence of inputs the expected time (averaging over all functions in the class) to store and retrieve elements is linear in the length of the sequence. The number of references to the data base required by the algorithm for any input is extremely close to the theoretical minimum for any possible hash function with randomly distributed inputs. We present three suitable classes of hash functions which also may be evaluated rapidly. The ability to analyze the cost of storage and retrieval without worrying about the distribution of the input allows as corollaries improvements on the bounds of several algorithms.
Introduction:
One may view different inputs to a program as elements from a class of problems. The answer given by the program is, hopefully, a correct solution to the problem. Ordinarily, when one talks about the average performance of a program, one averages over the class of problems the program can solve. Gill [2], Rabin [7], Strassen and Solovay [9] have used a different approach on some classes of problems. They suggest that the program randomly choose an algorithm from the class of algorithms to solve the problem. They are able to bound the average performance of the class of algorithms for the worst case input. This average on the worst case can be better than the performance of any known single algorithm on its worst case. Some of the problems which this approach overcomes are the following:

1) Classical analysis (averaging over the class of inputs) must make assumptions about the distribution of the inputs. These assumptions may not hold in certain applications.

2) A consequence of (1) is that one cannot classically analyze the average performance of a subroutine independently of the main routine, since the main routine may skew the distribution of data.

3) If the program is presented with a worst-case input, there is no way to avoid the resulting poor performance. However, if one had a class of algorithms to choose from and was able to realize that a particular algorithm was running slowly on a given input, then it might be possible to choose a different algorithm.

In this paper, we apply these notions to hashing for storage and retrieval, and suggest that a class of hash functions be used. We show that if the class of functions is chosen properly, then the average performance of the program on any input will be nearly as good as if a single function, chosen with knowledge of the input, were used. We present several classes of hash functions which ensure that every sample chosen from the input space will be distributed evenly by enough of the functions to
compensate for the poor performance of the algorithm when an unlucky choice of function is made.

A brief outline of our paper follows. After introducing some notation, we define a property of a class of functions: universal$_2$. We show that any class of functions that is universal$_2$ has the property that given any sample, a randomly chosen member of that class will be expected to distribute the sample evenly. We exhibit several universal$_2$ classes of functions which can be evaluated easily. Finally we give several examples of the use of these functions.

Notation:
If $S$ is a set, $|S|$ will denote the number of elements in $S$. $\lceil x \rceil$ means the least integer $\ge x$. $Z_n$ will represent the integers mod $n$. All hash functions map a set $A$ into a set $B$. We will always assume $|A| > |B|$. $A$ is sometimes called the set of possible keys, and $B$ the set of indices. If $f$ is a hash function and $x,y \in A$, we let $\delta_f(x,y)$ be 1 if $x \ne y$ and $f(x) = f(y)$, and 0 otherwise. Thus, $\delta_f(x,y)$ is 1 if $x$ and $y$ are distinct elements of $A$ which map to the same value under $f$. If $x$ and $y$ are bit strings, then $x \oplus y$ is the exclusive-or of $x$ and $y$.

If $f$, $x$ or $y$ is replaced in $\delta_f(x,y)$ by a set, we sum over all the elements in the set. Thus, if $H$ is a collection of hash functions, $x \in A$ and $S \subset A$, then $\delta_H(x,S)$ means

$$\sum_{f \in H} \sum_{y \in S} \delta_f(x,y).$$

Notice that the order of summation does not matter.

Properties of Universal Classes:
Let $H$ be a class of functions from $A$ to $B$. We say that $H$ is universal$_2$ if for all $x,y$ in $A$, $\delta_H(x,y) \le |H|/|B|$. That is, $H$ is universal$_2$ if no pair of distinct keys are mapped into the same index by more than one $|B|$th of the functions. Proposition 1 shows that this bound on $\delta_H(x,y)$ is tight when $|A|$ is much larger than $|B|$. The second proposition follows almost immediately from the definition of universal$_2$.

Proposition 1: Given any collection $H$ of hash functions (not necessarily universal$_2$), there exist $x,y \in A$ such that

$$\delta_H(x,y) > \frac{|H|}{|B|} - \frac{|H|}{|A|}.$$

Proof (sketch): Let $a = |A|$ and $b = |B|$. A counting argument shows that for each $f \in H$, $\delta_f(A,A) \ge \frac{a^2}{b} - a$. Thus, $\delta_H(A,A) \ge a^2 |H| (1/b - 1/a)$. Therefore, by the pigeon hole principle, there exist $x,y \in A$ such that $\delta_H(x,y) > |H|(1/b - 1/a)$. (The $a$ pairs with $x = y$ contribute nothing to the sum, so the largest term strictly exceeds the average over all $a^2$ pairs.) []

Proposition 2: Let $x$ be any element of $A$ and $S$ any subset of $A$. Let $f$ be a function chosen randomly from a universal$_2$ class of functions (with equal probabilities on the functions). Then the expected number of elements of $S$ that $x$ collides with, i.e. $\delta_f(x,S)$, is $\le |S|/|B|$.

Proof:

$$\text{Mean value of } \delta_f(x,S) = \frac{1}{|H|} \sum_{f \in H} \delta_f(x,S) = \frac{1}{|H|} \sum_{y \in S} \delta_H(x,y) \quad \text{(by notation)}$$

$$\le \frac{1}{|H|} \sum_{y \in S} \frac{|H|}{|B|} \quad \text{(by def. of universal}_2\text{)} \quad = \frac{|S|}{|B|}.$$

[]

In addition to being useful later, Proposition 2 has some direct applications. For instance, an optical character reader postprocessing system is described in [8]. This system is designed to check if a word $x$ is a member of a set of valid words $S$. The set $\{ f(y) \mid y \in S \}$ is stored in memory. To test whether $x$ is in $S$, a check is made to see if $f(x)$ is in the stored set. Since $f(y)$ is generally shorter than $y$, a considerable amount of space was saved. However, there is a chance of error; if $f(x) = f(y)$ and $y \in S$, then $x$ may erroneously be accepted as valid. Proposition 2 gives a bound on the probability of error when $f$ is chosen from a class of universal$_2$ functions.
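As a small illustration of the definition of universal$_2$ and of Proposition 2, the following Python sketch checks both by brute force. The helper names (`delta`, `delta_H`, `is_universal2`, `mean_collisions`) are ours rather than the paper's, and the toy class, all 16 functions from a 4-element $A$ to a 2-element $B$, is an assumption chosen only because it is small enough to enumerate.

```python
from fractions import Fraction
from itertools import product


def delta(f, x, y):
    """delta_f(x, y): 1 if x and y are distinct and collide under f, else 0."""
    return int(x != y and f[x] == f[y])


def delta_H(H, x, y):
    """delta_H(x, y): number of functions in H under which x and y collide."""
    return sum(delta(f, x, y) for f in H)


def is_universal2(H, A, B):
    """Check delta_H(x, y) <= |H| / |B| for every pair x, y in A."""
    bound = Fraction(len(H), len(B))
    return all(delta_H(H, x, y) <= bound for x in A for y in A)


def mean_collisions(H, x, S):
    """Mean of delta_f(x, S) when f is drawn uniformly from H."""
    total = sum(delta(f, x, y) for f in H for y in S)
    return Fraction(total, len(H))


# Toy setting (hypothetical): keys 0..3, indices 0..1.
A = range(4)
B = range(2)

# Every function from A to B, each stored as a dict key -> index.
ALL_FUNCTIONS = [dict(zip(A, values)) for values in product(B, repeat=len(A))]

# A deliberately bad "class": one function sending every key to index 0.
CONSTANT_ONLY = [{x: 0 for x in A}]
```

In this toy setting the class of all functions meets the bound with equality for distinct keys, while the single constant function does not, which is the distinction the definition is meant to capture.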
We are interested in the cost of using these functions in storage and retrieval operations. Given a sequence $R$ of requests (insertions or retrievals) to some data base, and a hash function $f$, we define the cost of $R$ with respect to $f$, $C(f,R)$, to be the sum of the costs of the individual requests. The cost of an individual request referring to an element $x$ is one plus the number of distinct previously inserted $y$s for which $f(x) = f(y)$.

This cost function reflects the worst case cost of inserting or finding elements in a storage and retrieval scheme in which each element of $B$ is associated with a linked list, and an element $x$ is stored in the list associated with $f(x)$ (see [1], pages 111-113). Other collision resolution schemes would have other cost functions associated with them. For example, if the keys with the same index were stored in a balanced tree, the corresponding cost function would be smaller.

The following theorem gives a nice bound on the expected cost of using a universal$_2$ class of hash functions with the linked list method for resolving collisions.

Proposition 3: Let $R$ be a sequence of $r$ requests which includes $k$ insertions. Suppose $H$ is a universal$_2$ class of hash functions. Then if we choose $f$ at random from $H$, the expected cost $C(f,R)$ is $\le r(1 + k/|B|)$.

Proof: The expected cost of $R$ is the sum of the expected costs of the individual requests. Proposition 2 and the definition of cost tell us that an individual request has expected cost no greater than $1 + k/|B|$. []

A special case of this proposition is that if $k$ is roughly the size of $B$ then the expected cost is $2r$. Notice that this linear bound holds for any sequence of requests, not just for the "average" request. For many applications, there is an upper bound on the number of elements to be stored and hence, $B$ can be chosen appropriately. If there is no known upper bound it is possible to dynamically choose a size, and rehash when that choice proves to be too small. Rehashing can be done in linear time, and even in real-time.

We can show that the expected cost (averaging over the hash functions) of any request is virtually the same as the expected cost (averaging over the possible requests) of any single hash function when applied to a random request after random insertions have been made. The reason is as follows: Let $a = |A|$ and $b = |B|$. The counting argument cited in Proposition 1 implies that if $f$ is any hash function and $x$ and $y$ are chosen at random from $A$, then the expectation of $\delta_f(x,y)$ is $\ge (1/b - 1/a)$. It follows that the expectation of $\delta_f(x,S)$ is $\ge |S|(1/b - 1/a)$, where $S$ is the random subset of $A$ which has been previously stored. Thus, the cost of the request is at least $1 + |S|(1/b - 1/a)$. When $A$ is much larger than $B$ (which will be the case in most applications of hashing), this is virtually the same as the cost of a request cited in the proof of Proposition 3.

It is also possible to bound the probability that given a sequence of requests $R$, the performance of a randomly chosen function will be worse than tolerable on $R$. Since we know that $C(f,R)$ must be at least $r$, we can conclude that when $k$ is roughly the size of $|B|$, the probability that $C(f,R) > t \cdot r$ is less than $1/(t-1)$. For some classes of hash functions (such as the last class mentioned in this paper), it is possible to derive a bound on the standard deviation or higher moments of the cost of a randomly chosen function on a particular $R$. This allows us to get a much better estimate of the probability that $C(f,R)$ will be large.

Some universal$_2$ classes:
The first class of universal$_2$ hash functions we present is suitable for applications where the bit
strings which represent the keys can conveniently be multiplied by the computer.

Suppose $A = \{0,1,\ldots,a-1\}$ and $B = \{0,1,\ldots,b-1\}$. Let $p$ be a prime with $p \ge a$. Let $g$ be any function from $Z_p$ to $B$ which, as closely as possible, maps the same number of elements of $Z_p$ into each element of $B$. Formally, we require $|\{ y \in Z_p \mid g(y) = z \}| \le \lceil p/b \rceil$ for all $z \in B$. A natural choice for $g$ is the residue modulo $b$. When $b = 2^k$ for some $k$, this amounts to taking the last $k$ bits in the binary representation of $y$.

Let $m$ and $n$ be elements of $Z_p$ with $m \ne 0$. We define $h_{m,n}: A \to Z_p$ by $h_{m,n}(x) = (mx + n) \bmod p$. Now define $f_{m,n}(x) = g(h_{m,n}(x))$. The class $H$ is the set $\{ f_{m,n} \mid m,n \in Z_p,\ m \ne 0 \}$. If desired, $p$ can be chosen so the mod $p$ operation can be calculated without a division.

The following lemma is useful in proving that this class is universal$_2$.

Lemma: When $H$ is defined as above, then for any $x,y \in A$ with $x \ne y$, $\delta_H(x,y)$ equals the number of ordered pairs $(r,s)$ with $r,s \in Z_p$, $r \ne s$ and $g(r) = g(s)$.

Proof: There is a natural correspondence between the functions $h_{m,n}$ and the ordered pairs $(r,s)$ where $r,s \in Z_p$ and $r \ne s$. Specifically, we identify the function $h_{m,n}$ by the ordered pair $(h_{m,n}(x), h_{m,n}(y))$. Since $m \ne 0$, $h_{m,n}(x) \ne h_{m,n}(y)$. This correspondence is one-to-one and onto since the linear equations $xm + n \equiv r \pmod p$ and $ym + n \equiv s \pmod p$ have a unique solution for $m$ and $n$ in the field $Z_p$. If $(r,s)$ is the pair $(h_{m,n}(x), h_{m,n}(y))$, then $f_{m,n}(x) = f_{m,n}(y)$ if and only if $g(r) = g(s)$. Thus, $\delta_H(x,y)$ is the number of such pairs. []

Proposition 4: The class $H$ defined above is universal$_2$.

Proof: Let $n_i$ be the number of elements in $\{ t \in Z_p \mid g(t) = i \}$. The restriction on $g$ is that for each $i$, $n_i \le \lceil p/b \rceil$. Since $p$ and $b$ are integers, this implies that $n_i \le \frac{p-1}{b} + 1$. Thus, for a given $r$, the number of choices for $s$ such that $r \ne s$ but $g(r) = g(s)$ is $\le \frac{p-1}{b}$. Since there are $p$ choices for $r$, $\frac{p(p-1)}{b} \ge$ the number of ordered pairs $(r,s)$ satisfying the condition in the lemma $= \delta_H(x,y)$. Since $|H| = p(p-1)$, and recalling that for $x = y$, $\delta_H(x,y) = 0$, this shows that $H$ is universal$_2$. []

Frequently, algorithms are analyzed making the assumption that multiplication takes unit time. The number of multiplications is said to be the cost of the algorithm. This model is appropriate when there are no operations which can be done an unbounded number of times for each multiplication. When a universal$_2$ class of hash functions is used, then the number of memory references per request can be bounded when averaged over all functions in the class as in Proposition 3 (assuming $k$ is not unbounded with respect to $|B|$). Therefore the model is appropriate, and under it the hash functions in the class given above may be applied in constant time. Thus in this model, for any sequence of requests, the algorithm takes average time linear in the number of requests.

It may seem that the addition of $n$ in the class of functions given above plays an unimportant role. This is only partly true. Suppose for $m \in Z_p$ we define $h_m(x) = (mx) \bmod p$, and as before define $f_m(x)$ as $g(h_m(x))$. Let $H = \{ f_m \mid m \in Z_p,\ m \ne 0 \}$. It can be shown that this class of functions comes within a factor of two of being universal$_2$, that is, for any $x$ and $y$, $\delta_H(x,y) \le 2|H|/|B|$. On the other hand, this bound cannot be improved significantly. For instance, let $b = |B|$, and choose $k$ so that $p = kb + k + 1$ is prime (there will be infinitely many such $k$'s). Let $g(x)$ be the function $x \pmod b$. Let $x = 1$ and $y = b + 1$. Then the $2k$ functions $f_1, f_2, \ldots, f_k, f_{p-k}, f_{p-k+1}, \ldots, f_{p-1}$ each map $x$ and $y$ to the same bin. Thus, $\delta_H(x,y) = 2k$ while

$$\frac{|H|}{|B|} = \frac{p-1}{b} = \frac{kb + k}{b} = \left(1 + \frac{1}{b}\right) k.$$
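The claims above about the class $\{f_{m,n}\}$, the offset-free class $\{f_m\}$ and the cost bound of Proposition 3 can be checked exhaustively for small parameters. The following Python sketch does so under stated assumptions: $g$ is taken to be the residue modulo $b$, the values $p = 11$ and $b = 4$ (so $k = 2$ in $p = kb + k + 1$) are illustrative choices of ours, and all function names are hypothetical.

```python
from fractions import Fraction
import random

P, B = 11, 4  # illustrative prime p and table size b (11 = k*b + k + 1 with k = 2)


def f(m, n, x, p=P, b=B):
    """f_{m,n}(x) = g((m*x + n) mod p), taking g as the residue modulo b."""
    return ((m * x + n) % p) % b


def full_class(p=P):
    """Parameters (m, n) of every member of H: m != 0, n arbitrary."""
    return [(m, n) for m in range(1, p) for n in range(p)]


def no_offset_class(p=P):
    """Parameters of the variant without the additive term n."""
    return [(m, 0) for m in range(1, p)]


def delta_H(x, y, params, p=P, b=B):
    """Number of functions in the class under which distinct x, y collide."""
    if x == y:
        return 0
    return sum(1 for m, n in params if f(m, n, x, p, b) == f(m, n, y, p, b))


def cost(m, n, requests, p=P, b=B):
    """C(f, R): each request costs 1 plus the number of distinct
    previously inserted keys that share its index."""
    stored = set()
    total = 0
    for op, x in requests:
        index = f(m, n, x, p, b)
        total += 1 + sum(1 for y in stored
                         if y != x and f(m, n, y, p, b) == index)
        if op == "insert":
            stored.add(x)
    return total


def expected_cost(requests, params, p=P, b=B):
    """Exact mean of C(f, R) over a uniformly chosen member of the class."""
    total = sum(cost(m, n, requests, p, b) for m, n in params)
    return Fraction(total, len(params))


def random_member(p=P, rng=random):
    """Draw (m, n) uniformly, as the algorithm would at start-up."""
    return rng.randrange(1, p), rng.randrange(p)
```

With these values the full class has 20 colliding functions out of 110 for every distinct pair, within the bound $110/4 = 27.5$, while the offset-free class has 4 collisions for $x = 1$, $y = 5$ against a bound of $10/4 = 2.5$, matching the $2k$ of the text.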
The universal$_2$ class of functions given above may not be convenient when the keys are too long to be multiplied using a single machine instruction. However, the next proposition gives a method of extending a class of functions for long keys.

Proposition 5: Suppose $|B|$ is a power of two and $H$ is a class of functions from $A$ to $B$ with the property that for each $i \in B$ and distinct $x,y \in A$,

$$|\{ f \in H \mid f(x) \oplus f(y) = i \}| = \frac{|H|}{|B|}.$$

Recall that $\oplus$ is exclusive-or. Then we can define a universal$_2$ class of hash functions from $A \times A$ to $B$ as follows. For $f,g \in H$, define $h_{f,g}((x,y)) = f(x) \oplus g(y)$. Then this new collection of hash functions $J = \{ h_{f,g} \mid f,g \in H \}$ is universal$_2$ and also satisfies the condition of this proposition.

The proof is quite similar to the proof of Proposition 6, and therefore omitted. Proposition 5 can be applied repeatedly to extend the functions to arbitrarily long keys. If the functions in $H$ can be applied in constant time, then the time required to compute an extended function is proportional to the length of the key.

The proposition does not quite apply to the universal$_2$ class of functions given earlier both because $|H|$ is not a power of 2 (so $|H|/|B|$ cannot be an integer) and because the number of functions for which $f(x) \oplus f(y) = 0$, i.e. $\delta_H(x,y)$, is actually less than $|H|/|B|$. Both of these differences add small factors to $\delta_J(x,y)$ which barely prevent $J$ from being universal$_2$. The percentage contributed by these small factors decreases asymptotically to 0 as $p$ is increased. More details will be given in a future paper.

The following is a class of functions which do not require multiplication, which may be better for many applications. Suppose $A$ can be viewed as the set of $i$-digit numbers written in base $a$, and $B$ as the set of binary numbers of length $j$. Then $|A| = a^i$ and $|B| = 2^j$. Let $M$ be the class of arrays of length $ia$, each of whose elements are bit strings of length $j$. For $m \in M$, let $m(k)$ be the bit string which is the $k$th element of $m$, and for $x \in A$, let $x_k$ be the $k$th digit of $x$ written in base $a$. We define

$$f_m(x) = m(x_1 + 1) \oplus m(x_1 + x_2 + 2) \oplus \cdots \oplus m\Big(\sum_{k=1}^{i} x_k + i\Big).$$

The class $H$ is the set $\{ f_m \mid m \in M \}$.

Another way of defining this class is to give a program for applying a function to an input $x$.

    dcl m(ia) bit(j) init(random);
    dcl x(i) digits base a;
    dcl value bit(j);
    disp := 0;
    value := 0;
    for k := 1 to i do begin
        disp := disp + x(k) + 1;
        value := value ⊕ m(disp);
    end;
    return (value);

Proposition 6: The class $H$ defined above is universal$_2$.

Proof: For $x$ and $y$ in $A$, suppose $f_m(x)$ is the exclusive-or of the rows $r_1, r_2, \ldots, r_s$ of $m$, and $f_m(y)$ is $r_{s+1} \oplus \cdots \oplus r_t$. Notice that $f_m(x) = f_m(y)$ if and only if $r_1 \oplus \cdots \oplus r_t = 0$. Assuming $x \ne y$, there will be some $k$ such that $r_k$ is involved in the calculation of only one of $f_m(x)$ or $f_m(y)$. Then $f_m(x)$ will equal $f_m(y)$ if and only if $r_k$ is the exclusive-or of the other $r_i$'s. Since there are $2^j = |B|$ possibilities for that row, $x$ and $y$ will collide for one $|B|$th of the possible functions $f_m$. Thus, the class of all $f_m$'s is universal$_2$. []

For a given $B$, each function in $H$ takes time linear in the length of the key. In addition, we can more accurately describe the distribution of costs of a particular sequence of requests under the different functions. For instance, Markowsky [6] has shown that for any sequence $R$ of $r$ requests with $k$ insertions, and any positive $t$, the probability that

$$C(f,R) - r\Big(1 + \frac{k}{|B|}\Big) > r \cdot t$$

is less than $\frac{k}{t^2 |B|}$ and also less than $\frac{7k}{t^3 |B|}$.
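The table look-up class can be tried out directly. The following Python sketch mirrors the program given for $f_m$ and checks the collision count of Proposition 6 by enumerating every table for very small parameters. It is a sketch under assumptions of ours: bit strings are represented as Python integers, slot 0 of each table is unused padding so that indices are 1-based as in the text, and all function names are hypothetical.

```python
import random
from itertools import product


def random_table(i, a, j, rng=random):
    """An array m of i*a random j-bit strings (as ints).
    Slot 0 is padding so that m[1] .. m[i*a] match the 1-based text."""
    return [0] + [rng.getrandbits(j) for _ in range(i * a)]


def table_hash(m, digits):
    """f_m(x) for a key given as its sequence of base-a digits x_1 .. x_i."""
    disp = 0
    value = 0
    for d in digits:
        disp += d + 1      # running index: x_1 + ... + x_k + k
        value ^= m[disp]   # exclusive-or in the selected row
    return value


def all_tables(i, a, j):
    """Every possible table, for exhaustive checks on tiny parameters."""
    return [[0] + list(cells)
            for cells in product(range(2 ** j), repeat=i * a)]


def colliding_tables(x, y, tables):
    """Number of tables m for which f_m(x) == f_m(y)."""
    return sum(1 for m in tables if table_hash(m, x) == table_hash(m, y))
```

For $i = 2$, $a = 2$, $j = 1$ there are 16 tables, and each pair of distinct keys collides under exactly 8 of them, i.e. one $|B|$th of the class, as the proof of Proposition 6 predicts.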
[...] importance:
One of the practical values of a class of universal$_2$ functions is that we know that there are many acceptable functions in the class. Programmers sometimes spend a considerable amount of time refining hash functions for applications where it is critical that a uniform distribution be achieved ([5], p. 508-513). Simply choosing a single hash function randomly from such a class gives a high expectation that a uniform distribution will be achieved. Furthermore, if the function is changed each time the program is run, then we can be sure of good performance averaged over all runs.

The theoretical importance is that it allows one to get a good bound on the average performance of an algorithm which uses hashing. [...] This may be difficult because it is necessary that the expected input set not be biased in such a way as to make the hash function perform poorly.

Rabin has developed an algorithm which finds the nearest neighbors of a collection of points in a plane, given the coordinates of the points [7]. The algorithm involves making a random choice of [...] points, and it uses hashing. The problem with an ordinary hashing scheme is that the algorithm [...] might bias the information being stored and retrieved towards those cases that are distributed unevenly by the particular hash function being used. If one also randomly chooses the hash function from a universal$_2$ class, then the expected running time of the algorithm will always be linear in the number of points.

In [3] and [4] an algorithm is suggested for multiplying sparse polynomials, using hashing. We can strengthen the results of these papers. Let two polynomials, $P$ and $Q$, have $n$ and $m$ non-zero terms respectively. If multiplies and adds are viewed as taking constant time, then given any two polynomials $P$ and $Q$, we can multiply them in average time $O(n \cdot m)$. Let $CP_1, CP_2, \ldots, CP_n$ be the coefficients of the $n$ terms of $P$. Let $EP_1, EP_2, \ldots, EP_n$ be the exponents of those terms. Let $CQ_i$ and $EQ_i$ stand for the same quantities of $Q$. The following algorithm will [...] using a universal$_2$ class of hash functions. Retrieve and Store have the [...]. Their first argument is a key, and the second is the value stored or retrieved. If a value has not been stored previously for a given key, a retrieve will have a zero value.

    Begin
      Choose a hash function;
      For i := 1 to n do
        For j := 1 to m do
          Begin
            Coefficient := CP_i * CQ_j;
            Retrieve (EP_i + EQ_j, k);
            Store (EP_i + EQ_j, Coefficient + k);
          End;
      Print all keys and values which have been stored;
    End;

Since addition and multiplication are viewed as taking constant time, the first class of functions we presented would seem appropriate for this analysis.

Future research:
There are a number of areas which can be investigated, such as:

1) Improve the bounds cited here on the probability that a particular function from the table look-up class will perform poorly on a particular input.

2) Suppose the store and retrieve algorithm is changed so that the last element retrieved is moved to the first position on the list. Can a smaller bound on the probability that the cost function exceeds a specified value be derived?

3) Extend the analysis to other storage and retrieval algorithms which involve hashing, such as double hashing and open addressing.

4) Generalize the definition of universal$_2$ to universal$_n$ to consider the action of the class of functions on any collection of $n$ elements of $A$. Determine if one can obtain improved results with such a stronger assumption. (One definition of universal$_k$ implies that the expected number of keys from a $k$-element set mapping into a given element of $B$ would be binomially distributed.)

5) When should one decide that a particular function is a poor choice and it would be worth the effort to choose a new function and rehash?

Acknowledgements:
We would like to thank Ashok Chandra for helping suggest and formulate the problem; Dave Glickman for suggesting we examine the table look-up technique; Hania Gajewska and David Smith for suggesting revisions in an earlier manuscript; Walter Rosenbaum for discussions about a practical use of this work; and George Markowsky for help in understanding the distribution of performance of the class obtained by table look-up.

References:
[1] Aho, A.V., Hopcroft, J.E., and Ullman, J.D., The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, Mass. (1974).
[2] Gill, J.T., III, Computational Complexity of Probabilistic Turing Machines, Proceedings of the Sixth ACM Symposium on the Theory of Computing, May, 1976, Seattle, Washington, p. 91-95.
[3] Goto, E., and Kanada, Y., Hashing Lemmas on Time Complexities with Applications to Formula Manipulation, Proceedings of the 1976 ACM Symposium on Symbolic and Algebraic Computation, August, 1976, Yorktown Heights, New York, p. 149-153.
[4] Gustavson, F., and Yun, D.Y.Y., Arithmetic Complexity of Unordered or Sparse Polynomials, Proceedings of the 1976 ACM Symposium on Symbolic and Algebraic Computation, August, 1976, Yorktown Heights, New York, p. 154-159.
[5] Knuth, D.E., Sorting and Searching, Addison-Wesley, Reading, Mass. (1973).
[6] Markowsky, G., private communication.
[7] Rabin, M.O., Probabilistic algorithms, Proceedings of Symposium on New Directions and Recent Results in Algorithms and Complexity (1976 - to appear).
[8] Rosenbaum, W.S., and Hilliard, J.J., Multifont OCR postprocessing system, IBM Journal of Research and Development (July 1975).
[9] Strassen, V., and Solovay, R., A fast Monte-Carlo test for primality, SIAM Journal on Computing (to appear).
