Search engines use a data structure called an (inverted) index to answer user queries. For web search engines this index is too large to fit on a single machine and has to be broken into subindices that are stored on many machines. Consequently, each user query is converted into many requests, each requiring access to a single subindex. The search engine has complete freedom in building these subindices and in assigning them to machines. Usually the subindices are created to fulfill the following conditions: (A) All subindices need the same amount of space, and about 4-10 subindices fit on a machine. (B) Over a long enough period of time (such as a day), the computational load that a typical sequence of requests generates on the subindices is roughly balanced. (C) Individual requests have time-outs of a few seconds, i.e., if a request cannot be answered within a few seconds it is terminated. Thus the computational load of an individual request is a tiny constant.

This problem differs from the load balancing problems studied in the past for two reasons. (1) Prior work assumed either that every file is stored on every machine (identical machine model) or that the adversary controls, for each individual request, on which machines the request can be placed (restricted assignment model). In our model the file assignment is under the control of the algorithm, but has to be fixed once, at the beginning of the request sequence. (2) Many of the lower bounds in prior work depend on the fact that individual requests can be large, i.e., a single request can significantly increase the load of a machine. In our setting this cannot happen.
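
To make the model concrete, here is a minimal Python sketch of the setting described above. It is an illustration of the model, not an algorithm from this work: the subindex-to-machine assignment is fixed once, before any requests arrive, and each request then adds one tiny unit of load to the machine holding its subindex. The round-robin assignment, the machine counts, and the uniform request distribution are all illustrative assumptions.

    import random
    from collections import defaultdict

    NUM_MACHINES = 4
    SUBINDICES_PER_MACHINE = 6  # condition (A): roughly 4-10 subindices fit per machine
    NUM_SUBINDICES = NUM_MACHINES * SUBINDICES_PER_MACHINE

    def fixed_assignment():
        """Assign each subindex to one machine, once, before the request
        sequence starts (round-robin here, purely for illustration)."""
        return {s: s % NUM_MACHINES for s in range(NUM_SUBINDICES)}

    def serve(requests, assignment):
        """Each request names a single subindex and costs one unit of load
        on the machine storing it; per condition (C), requests are tiny,
        so a single request can never blow up a machine's load."""
        load = defaultdict(int)
        for subindex in requests:
            load[assignment[subindex]] += 1
        return load

    if __name__ == "__main__":
        assignment = fixed_assignment()
        # Condition (B): over a long sequence, the load generated on the
        # subindices is roughly balanced (modeled here as uniform draws).
        requests = [random.randrange(NUM_SUBINDICES) for _ in range(10_000)]
        load = serve(requests, assignment)
        print("max machine load:", max(load.values()))
        print("avg machine load:", sum(load.values()) / NUM_MACHINES)

The point of the sketch is the order of commitments: the assignment is chosen before the request sequence is seen, unlike the identical machine model (where every file is everywhere) or the restricted assignment model (where the adversary dictates the feasible machines per request).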