The Google Legacy
Chapter Three: Google Technology
Google’s technology has emerged from a series of continuous improvements, or what Japanese management consultants call kaizen. Each Google technical change may be inconsequential to the average user of Google. But when taken as a whole, Google’s “technological advantage” comes from Google’s incremental innovations, clever adaptations of research-computing concepts, and Byzantine tweaks to Linux. Some day, a historian of technology will be able to identify, from the hundreds of improvements that Google has engineered in the last nine years, one or two that stand with PageRank as of major importance. Critics of Google will see that the company has grafted onto its core technology processes drawn from many different sources.

To illustrate, the structure of Google’s data centers and the messages passed to and from these data centers are in many ways a variant of grid computing.
Google’s ability to read data from many computers simultaneously is reminiscent of BitTorrent’s technology.
Google’s use of commodity or “white box” hardware in its data centers is an indication of Google’s hacker ethos. The use of memory and discs to store multiple copies of data comes from the frontiers of computing.

Google’s approach to technology, then, is eclectic and in many ways represents a building block approach to large-scale systems. Google benefits from that eclecticism in several ways. First, Google’s computational framework delivers sizzling performance from low-cost hardware. Second, Google worked around the bottlenecks of such operating systems as Solaris, Windows Advanced Server, and off-the-shelf Linux. Third, Google took good programming ideas from other languages, implementing new functions and libraries to eliminate most of the manual coding required to parallelise an application across Google’s servers.
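To make the third point concrete, the sketch below shows the general idea of a library that hides the work of parallelising a job. It is an illustration only, not Google's code: the helper name run_parallel, the two user functions, and the use of local threads in place of servers are all assumptions made for this example. The programmer writes one function that processes a single record and one that combines results; the helper handles the fan-out and the regrouping.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_parallel(records, map_fn, reduce_fn, workers=4):
    """Hypothetical helper: apply map_fn to every record in parallel,
    group the emitted (key, value) pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    # Assumption: local threads stand in for servers. A real distributed
    # system would ship each record to a different machine.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for pairs in pool.map(map_fn, records):
            for key, value in pairs:
                grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


def count_words(line):
    # Map step: emit (word, 1) for each word in one line of text.
    return [(word.lower(), 1) for word in line.split()]


def total(word, counts):
    # Reduce step: sum the counts collected for one word.
    return sum(counts)


if __name__ == "__main__":
    lines = [
        "Google builds clusters",
        "clusters of commodity servers",
        "Google servers",
    ]
    print(run_parallel(lines, count_words, total))
```

The application code (count_words and total) contains no threading or networking logic at all. That separation is the benefit the paragraph above describes: the parallel plumbing is written once, in the library, rather than by hand in every application.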
According to Jeff Dean, one of Google’s senior engineers, “Google engineering is sort of chaotic.”
This is neither surprising nor necessarily a negative. The Googleplex is a toy box for engineers and programmers. The tools are sophisticated. The challenging problems and the quality of one's peers make Google “the place to be” for the best and brightest technical talent in the world. The nature of creativity, combined with Google’s approach to innovation, makes it difficult to predict the next big thing from Google.

Before we review selected parts of Google’s technology in somewhat more detail, the diagram “Google’s Computing Framework” provides an overview of the Googleplex and some of its technologies. These will be touched upon in this section.
4. Grid computing is the application of resources from many computers in a network to a single problem or application. Google uses grid-like technology in its distributed computing system.
5. BitTorrent is a peer-to-peer file distribution tool written by programmer Bram Cohen in 2001. The reference implementation is written in Python and is released under the MIT License.
6. Google has anywhere from 100,000 to 165,000 or more servers. Servers are organized into clusters. Clusters may reside within one rack or across multiple racks of servers. Some Google functions are distributed across data centers.
7. From Dr Dean’s speech at the University of Washington in October 2003. See http://www.uwtv.org/programs/displayevent.asp?rid=2459.