The Importance of Web Search Technology, Innovation and Business Model in Explaining Google's Success

Conference Paper in Espacios · March 2009

The Importance of Web Search Technology, Innovation and
Business Model in Explaining Google’s Success
- A Healthy Disregard for the Impossible

LUÍS F. A. GUEDES
School of Economics, Management, and Accounting, University of Sao Paulo
Av. Prof. Luciano Gualberto, 908, 05508-900, São Paulo, Brazil

EDUARDO VASCONCELLOS
University of Sao Paulo, Department of Business Administration at FEA/USP
Av. Prof. Luciano Gualberto, 908, 05508-900, São Paulo, Brazil

LILIANA VASCONCELLOS
University of Sao Paulo, Department of Business Administration at FEA/USP
Av. Prof. Luciano Gualberto, 908, 05508-900, São Paulo, Brazil

MOACIR MIRANDA DE OLIVEIRA JR
University of Sao Paulo, Department of Business Administration at FEA/USP
Av. Prof. Luciano Gualberto, 908, 05508-900, São Paulo, Brazil

Abstract
Among fast-growing technology-based companies, Google has shown outstanding performance in the last few years. Its revenue stream, based on a virtually infinitely scalable search advertising model, relies ultimately on the performance of the search engine. The main advertising product is the sale of high-margin, pay-per-click, content-sensitive advertisements on its own site, a product called AdWords. As its search results (there were about 4.8 billion queries in the U.S. in October 2008) become faster, more relevant and more accurate, the Google site will attract more interest and, one may expect, advertising will become more effective in terms of return on investment for clients. Besides its distinctive hardware capacity and expertise, Google has a strong culture and a suitable management style, both largely responsible for attracting a high-quality workforce.
The methodology used in this study was based on a literature review, published sources, interviews with senior Google employees and a case study. The bibliographical research emphasized the following subjects: the technological base of web search engines, innovation management, management, and business models. The case study was conducted at the Brazilian branch of Google. The outcome of this research led us to two main conclusions. The technological infrastructure assembled by Google since its early years and the incremental innovations on the search engine form a base on which the entire business depends. This solid base not only supports the search engine itself, but also makes available to software engineers, statistics experts and business developers an unprecedented computational capacity and an almost infinite amount of data related to the hundreds of thousands of queries that Google “sees” every day. This unusual environment led to the development of many other related applications, such as Google Maps, Gmail and Google Apps. On the other hand, it was clear that the web search business has no switching costs for the end user and that business continuity for Google relies on the sophistication, speed and accuracy of the search engine. The mechanism for continuously improving performance is complex and demands the company’s greatest effort. PageRank™, hypertext-matching analysis, MapReduce, and cloud computing are examples of highly complex technologies that boost the firm’s competitiveness and add inestimable value to the company. The learning-curve effect on the core technology and the company’s huge technological infrastructure can both be considered high entry barriers, a deterrent to potential competitors.

Keywords: Technology Management; Innovation; Cloud Computing; Google

Introduction
Among fast-growing technology-based companies, Google has shown outstanding performance in the last few years. Its revenue stream, based on an infinitely scalable search advertising model, nevertheless depends mostly on the accuracy of the search engine itself. The main advertising product is the sale of high-margin pay-per-click advertisements on its own site, a product called AdWords. As the results of the millions of daily queries (there were about 4.8 billion in the U.S. in October 2008) become faster and more accurate, the Google site will attract more interest and, one may expect, advertising will become more effective in terms of return on investment for advertisers.
On the other hand, the challenges faced by technology-based companies are well known: “Researches indicate that only about 36% of new technology companies survive for four years, with an even lower proportion still operating after five years.” (ZHU, 2008). Song et al (2008) found that only 21.9% of the 11,259 new technology ventures established between 1991 and 2000 in the United States survived after five years.
This increases the significance of understanding the factors that influence business success, especially for businesses based on innovative technologies. Given that context, the purpose of this research is to analyze the importance of web search technology, innovation, and business model for Google’s key performance indicators. Considering that “Google’s innovations in Internet technology have produced the world’s top search engine.” (ZHU, 2008), it is relevant to understand how the present success of this multibillion-dollar company is connected to its innovative search engine, powered by a huge and complex technological infrastructure. It is important to point out that, although Google offers many products based on different technologies, this study focuses on the search engine technology and its link to the performance factors desired by users as one of the key factors of Google’s success. The business model and the importance of the innovation culture are also discussed.
The main results obtained were summarized and organized under six topics: first, some considerations about the success of technology-based firms are presented, based on a literature review; then the research methods and Google’s history are described; subsequently, the discussion of Google’s success is presented; and finally, the concluding remarks and references are set out.

Factors Influencing Success of Technology-based Firms


Company success can be “defined as a higher level of commercial and financial return” (WENSLEY, 1997). Table 1 presents examples of key performance indicators based on the literature review.

Table 1 – Key Firm Performance Indicators

Key performance indicator | Author | Metrics
Public image | Bae et al (2000) | Brand value; company value; stock price and evolution
Sales growth | Batt (2002); Delaney et al (1996); Banker et al (1996); Roberts (1992) | Sales; stock price, stock variation in time
Market share | Delaney et al (1996) | Local market share (USA); international expansion
Productivity | Hit et al (2001); Guthrie (2001) | Revenues per employee
Turnover | Arthur (1994); Guthrie (2001); Batt (2002) | Best Place to Work ranking
Product quality | Varma et al (1999); Bae et al (2000); Delaney (1996) | Market share; active clients
Profitability | Roberts (1992) | Profit and margin

Source: authors

Explaining company success is a continuous challenge. According to Wensley (1997):


As a result of decades of research, it is now generally accepted that the number of
factors which account for business performance are so many that it is almost
impossible for any single study to come up with a variable which accounts for more
than 10% of the variance in, say, ROI.

Also, “Several different groups of factors, combined in different ways, may produce equally successful companies, each one successful for fundamentally different reasons.” (ROBERTS, 1992).
Despite these limitations, understanding the factors behind firm success can help improve corporate management, since leaders can adapt other companies’ experience to their own organizational context. For this reason, many studies address the subject. Song et al (2008), for example, analyzed eight homogeneous significant success factors for new technology ventures: (1) supply chain integration; (2) market scope; (3) firm age; (4) size of founding team; (5) financial resources; (6) founders' marketing experience; (7) founders' industry experience; and (8) existence of patent protection.

Roberts (1992) discusses the strategic actions influencing corporate success in post-founding technology-based companies: marketing orientation (market interactions and marketing organization and practices), managerial orientation (managerial skills acquisition, problem focus), and financing (subsequent financing). Additionally, Albers and Clement (2007) tested the impact of marketing strategies and chosen business models on revenue and profitability in a survey of 147 e-businesses. The results confirm the importance of business models for reaching profitability and the effect of marketing strategy on revenue (ALBERS; CLEMENT, 2007).

Research Methods
The case study method was chosen as the most suitable instrument for studying the importance of technology in explaining Google’s success. As a research limitation, it is important to point out that the purpose of the case study method is not to represent a population, only to represent the chosen case (STAKE, 1994; SCANDURA & WILLIAMS, 2000).
The primary research data were obtained through interviews with two Google executives (the CEO for Latin America and the Communication Director of Google Brazil) and one employee from the Financial Department. The data were complemented by company documents and other publications.

Google’s History
The first web search services were based on online directories, similar to catalogs. This solution proved inefficient as the content available on the Internet became more complex and extensive. In this context, in 1997, Sergey Brin and Larry Page, students at Stanford, began developing their own search engine algorithm and index, which would later be called PageRank™.
Developed without a business plan, the search engine based on PageRank™ was first made available as an academic project for use by Stanford’s students, teachers and employees. The entrepreneurs’ initial strategy was to develop web search software and then sell it to a specialized website. With support from Stanford’s technology licensing office, Brin and Page tried to sell the search engine and the advertising method to other companies, but none was interested. Eventually, they started their own company to operate and improve the search tool and the online advertising mechanism.
In its 1998 special edition Top 100 Web Sites, PC Magazine reported that Google “has an uncanny knack for returning extremely relevant results”, recognized it as the best search engine and placed it at the top of its list of all websites.
After receiving investments of US$ 25 million from two venture capital firms in 1999, Google became in 2000 the largest search engine on the web, with more than 1 million unique URLs (uniform resource locators). In late 2008, this number approached 1 trillion. Google was then available in 26 languages, including Chinese, Portuguese, Dutch, Norwegian and Danish.
In 2001, Google hired Eric Schmidt as chairman to succeed Sergey Brin, bringing in executive experience and signaling to the market an improvement in the company’s top management. Also in 2001, a data mining analysis performed by Google’s analysts produced an interesting finding: around 60% of searches originated outside the United States, but only 5% of revenues came from international sources. This prompted a review of the growth strategy, now focused on Marketing and Sales instead of Engineering, which led the firm to open new offices in Tokyo, Hamburg, Sydney, Dublin, London and São Paulo. Figure 1 shows the evolution of international revenues, indicating that in 2008 international revenues exceeded U.S. revenues for the first time (GOOGLE, 2009b).
Figure 1 – Google USA versus International Revenues
[Chart: Google International Revenues Evolution. Share of total revenues, USA / International: 2001: 82% / 18%; 2002: 78% / 22%; 2003: 71% / 29%; 2004: 66% / 34%; 2005: 61% / 39%; 2006: 57% / 43%; 2007: 53% / 47%; 2008: 49% / 51%.]

Source: authors (based on GOOGLE, 2008)

In the following years revenues grew rapidly and several products were released. The charts below show the evolution of revenue and number of employees (Figure 2) and advertising revenue as a share of Google’s total revenue (Figure 3) since 2001. Substantial and consistent growth was observed during the period analyzed, even though revenue remained based almost entirely on selling online advertising.
Figure 2 – Evolution of Revenues and Total Number of Employees
[Chart: Google Revenues and Employees, 2001 to 2008. Left axis: employees (0 to 25,000); right axis: revenues in billion US$ (0 to 25). Revenue data labels: 0.09, 0.44, 1.47, 3.19, 6.14, 10.60, 16.59, 21.80. Employee data labels: 1,628; 3,021; 5,680; 10,674; 16,805; 20,222.]

Source: authors (raw data compiled from GOOGLE INVESTOR RELATIONS, 2009)

Figure 3 – Google Advertising Revenue Representativeness
[Chart: Google Advertising Revenue over Total Revenue, 2001 to 2008. Data labels range from 77% to 99%.]

Source: authors (raw data compiled from GOOGLE INVESTOR RELATIONS, 2009)

Web Search Technology and Google’s Success


Google’s success is due to a non-trivial set of factors and circumstances, but a few aspects deserve attention: the web search technology, the innovation capability, the technological infrastructure, the business model, the leadership style, the organizational culture and the strategic choices made over time. Although many relevant factors contribute to Google’s success, the purpose of this paper is to analyze the role of web search technology, innovation, and business model in this process.
To do so, we present a framework based on the research results (Figure 4). The proposed outline indicates how core web search technology and incremental innovations allowed Google to respond to clients’ web search needs (competitiveness criteria) and, consequently, to boost company performance (outcome). Given their importance, management and business model aspects are also part of the framework. Each of the components that influence Google’s outcome is then presented: core web search technology, incremental innovations on the web search engine, management and business model, and competitiveness criteria for the web search business.
Figure 4 – Web search technology and Google’s Success
[Diagram with the following boxes:
- Core web search technology: PageRank; Hypertext matching; Cloud computing; MapReduce
- Incremental innovations: Multi-language support; Spelling check; Google Suggest; Universal search
- Competitiveness criteria (web search): Response time to a query; Results accuracy; Results relevance
- Management and Business Model
- Outcome: Company value; Brand value; Profit and margin; Sales growth; Stock price and evolution; International expansion; Market share]

Source: authors

Core web search technology


Google’s web search engine was the very first technology developed by founders Larry Page and Sergey Brin (when they were at Stanford). The application has therefore benefited from a considerable learning-curve effect. The excellence of this technology boosts and supports the company’s main revenue source, online advertising. One third of the company’s workforce is engaged in improving the online advertising products (mainly Google AdSense and Google AdWords), commercializing them, contacting clients, and controlling their processes (GOOGLE, 2007).
The 2008 financial statements indicate that 97% of the US$21B in revenues came from the advertising business (GOOGLE, 2009b). Google, though, has more than a dozen other products, such as the well-known Gmail, Google Maps, and Google Earth, and others less familiar, such as Google X, Google Code Search and Google Mars. Even Internet market specialists would have trouble naming all Google products, such is the regularity with which the company releases new ones, either on existing platforms (Google Desktop Search, Google Scholar and Google Books share a common software core) or on brand new platforms (like Android, the cell phone operating system released in 2008).
To perform well, a search engine must cope with site administrators who are sometimes malicious. Since the web has billions of pages, a prominent position on a results page is essential for good traffic. This is even more true for sites that carry online advertising or offer e-commerce. An entire industry has grown up to offer site administrators strategies and tactics intended to improve a web site’s position on a results page (from Google, Yahoo! or others). This type of consultancy is called SEO (Search Engine Optimization).
From the point of view of the search engine, once someone identifies the logic that drives the order of results, that order can be manipulated. A good web search engine resists manipulation and is driven only by content relevance. Google engineers spend a considerable amount of resources masking the logic that drives the order of results and the two-line snippets that describe each search result, so that the results are less susceptible to manipulation. Google’s core search engine technologies are PageRank™, cloud computing, hypertext matching and MapReduce. We describe below some high-level features of these technologies.

PageRankTM

The Web is, by definition, a complex, chaotic and growing collection of information and content. In the early days of the Internet, finding a small piece of information required a non-trivial method and expertise. In this context, specialized web search engine sites emerged to help users find, or at least get closer to, what they needed (POULTER, 1997).
Although there are other means of reaching webpages such as link-following and
knowing or guessing universal resource locators (URLs), search engines are by far the
most prominent means, especially for conducting initial exploration of a particular
interest (INTRONA & NISSENBAUM, 2000).

Given the explosive growth in the number of web pages and the volume of content on the Internet (news, videos, photos, images, maps, blogs, social networks and so on), a typical web search returns a considerable number of results (a search for “good pizza in São Paulo” at Google Brazil returned 652,000 web sites). Therefore, the order of appearance on the search results page becomes extremely relevant both to the user (who wants precise information) and to the site (which wants to draw attention to its content).
The directory service (a collection of similar pages, manually indexed) was the first attempt to organize web content. There were directories with thousands of categories (such as the 590,000 of the Open Directory Project, www.dmoz.org) and others for specific audiences (such as Medical World Search, www.mwsearch.com). Search engines emerged when the amount of web content grew so dramatically that directory services became unfeasible. The first search engine to use an index of web pages was WebCrawler, released in April 1994 as part of a research project at the University of Washington. WebCrawler was later bought by InfoSpace.
Search engines use small pieces of software called “robots” or “spiders”, whose job is to roam the Internet acquiring as many web pages as possible. This collection of pages is then indexed in a huge database (comprising several billion pages). A typical search engine works with dozens of robots running simultaneously, each able to cover about 50 million web pages (CENDON, 2001). Because web pages are dynamic (their content changes over time), robots commonly revisit their set of pages once a month in an attempt to refresh the indexed database. For the most popular sites on the web (such as YouTube, Wikipedia, and MySpace), page updates and broken links are identified on a daily basis. As soon as the pages are acquired, indexing software searches each page for relevant terms and builds indexes that aim to speed up the search process and provide the basis for a relevance weighting approach: if a certain term appears repeatedly on a web page or is visually emphasized, for instance, the term receives a weight reflecting that this particular page contains information on the subject. If a term is not included in the index, it will not be found by the search engine. That is why the indexing logic is so important to the process. The indexing and weight attribution criteria strongly influence search engine performance.
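The indexing and weighting idea described above can be illustrated with a minimal Python sketch. It is not the implementation of Google or of any real engine: the stop-word list, the page structure (a title and a body) and the boost of 3 for title terms are assumptions chosen only to show how an index maps each term to weighted pages, and why a term absent from the index cannot be found.

```python
from collections import defaultdict

# Illustrative stop-word list (an assumption, not a real engine's list).
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def build_index(pages):
    """Map each term to {url: weight}; title terms count 3x (assumed boost)."""
    index = defaultdict(dict)
    for url, page in pages.items():
        for field, boost in (("title", 3.0), ("body", 1.0)):
            for word in page[field].lower().split():
                term = word.strip(".,;:!?\"'()")
                if term and term not in STOP_WORDS:
                    index[term][url] = index[term].get(url, 0.0) + boost
    return index


def search(index, term):
    """Return the URLs containing the term, highest weight first."""
    hits = index.get(term.lower(), {})
    return sorted(hits, key=lambda url: (-hits[url], url))


# Hypothetical pages used only for the demonstration.
pages = {
    "pizza.example": {
        "title": "Good pizza in São Paulo",
        "body": "The best pizza and pasta in town.",
    },
    "news.example": {
        "title": "City news",
        "body": "A new pizza place opened in the city.",
    },
}
index = build_index(pages)
print(search(index, "pizza"))  # pages ordered by accumulated weight
print(search(index, "sushi"))  # term never indexed, so nothing is found
```

In this toy index, "pizza" weighs 4.0 on the first page (once in the title, once in the body) and 1.0 on the second, so the first page is listed first.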
Most search engines index all relevant terms that appear on a web page (excluding prepositions and other common words). This method leads to an explosive growth in database size, making search processing-intensive. A possible alternative is to index only the most relevant terms, such as those in the header, those repeated several times, those in a larger font size or those with some kind of visual emphasis. Although these incremental innovations improved on the previous criteria, they were not as effective as needed, since the algorithms did not perform any analysis of the content and reliability of the web pages. They simply counted words and tried to assign relevance to them. Moreover, it was not hard for programmers to exploit this logic and artificially boost their sites’ positions on the search results page.
One of the first activities of the then start-up Google was the development of a web page search and indexing logic that would give users a set of answers weighted by their relevance to the search keywords provided. The PageRank™ algorithm was the first step of this journey. One of the key characteristics of this mechanism is that the ranking criteria are free from human intervention and unaffected by commercial interests. The answers are given by an algorithm, or mathematical recipe, that combines two sources of information to weigh the importance of a given web page: one based on the intrinsic relevance of the page’s content and another based on the relevance of the page relative to similar pages. Each analyzed page receives a score, the PageRank™ of the web page. In general, the higher the score, the more relevant the page. According to Robinson (2004), “PageRank™ of a web page is a measure of how popular this page is as a function of the web’s inlinks and outlinks”.
The assumption that underlies the PageRank™ method is that, on the Internet, popularity means trustworthiness. The relative importance of a web page is given by an equation that takes into account all the incoming and outgoing links of the page. Incoming links are weighted by the relevance of the page they come from (and this relevance is measured by that page’s PageRank™). Following this logic, the more numerous and relevant the links pointing to a web page, the more relevant this page will be as a content provider. The weight of an outgoing link is reduced in proportion to the total number of outgoing links on the same page. A simplified form of the PageRank™ equation is:

$$PR(a) = \sum_{u \in B_a} \frac{PR(u)}{L(u)}$$

Where:
- PR(a): PageRank™ of webpage a
- PR(u): PageRank™ of webpage u
- L(u): number of outbound links from page u
- B_a: set of all pages linking to page a
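As an illustration, the simplified equation can be evaluated by repeated substitution: start from equal scores and recompute every PR(a) from the current PR(u) values until the scores settle. The Python sketch below assumes a hypothetical three-page web in which every page has at least one outgoing link. The function name, the page labels and the fixed number of iterations are illustrative choices, and the damping factor used in the published PageRank algorithm is deliberately left out to match the simplified form.

```python
def simplified_pagerank(links, iterations=100):
    """Iterate PR(a) = sum of PR(u) / L(u) over the pages u that link to a.

    links maps each page to the list of pages it links to. Every page must
    have at least one outgoing link and every target must be a key of links.
    """
    pages = list(links)
    rank = {page: 1.0 / len(pages) for page in pages}  # equal starting scores
    for _ in range(iterations):
        new_rank = {page: 0.0 for page in pages}
        for u, outlinks in links.items():
            share = rank[u] / len(outlinks)  # PR(u) / L(u)
            for a in outlinks:
                new_rank[a] += share  # u "votes" for each page it links to
        rank = new_rank
    return rank


# Hypothetical web: A links to B and C, B links to C, C links to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = simplified_pagerank(links)
for page, score in sorted(ranks.items()):
    print(page, round(score, 3))
```

For this graph the scores settle at about 0.4 for A and C and 0.2 for B: C collects the whole vote of B plus half the vote of A, and passes all of its own score on to A.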

Figure 5 shows a simplified scheme that illustrates PageRank™. The size of each box represents the relevance of a web page. Pages with more incoming links have a higher PageRank™ than those with fewer links pointing to them. “Votes” from pages with a higher PageRank™ carry more weight than votes from pages with a low PageRank™.
Figure 5 – PageRank weighted “voting” system
[Diagram: boxes of different sizes labelled www.amazon.com, www.yahoo.com, www.wikipedia.org and www.orkut.com.]

Source: authors

Although the core concept is straightforward, the number of calculations Google performs to determine a page’s PageRank™ demands extraordinary computational power, all the more so given the gigantic size of the web itself. The PageRank™ algorithm uses sophisticated mathematical tools to execute all the calculations in a single step, in an effort to minimize execution time and processing impact (ALTMAN & TENNENHOLTZ, 2005). PageRank™ has many other details that are beyond the scope of this paper.

Hypertext-matching analysis

Identifying the most important web pages in a given context does not necessarily mean that a page is the best answer for the user, though. To improve accuracy, Google combines PageRank™ with a sophisticated text-matching technique to find pages that are both important and relevant to a given search (GOOGLE, 2009).
Despite its importance for the search engine as a whole, PageRank™ is far from being the only mechanism used to sort results. Google’s hypertext-matching analysis algorithm thoroughly analyzes the content of each webpage and collects hundreds of variables in order to determine how relevant that webpage is to the search terms entered by the user.
Various measures related to the occurrence of the query in the webpage are considered, but Google does not publicize them, since the relevance ranking algorithm is one of the most significant differentiators among search engines. “Google uses a number of factors to rank search results including standard IR measures, proximity, anchor text (text of links pointing to web pages), and PageRank™” (PAGE et al, 1998). Among these factors, Google uses font size, color, and the position of the term on the webpage as keys to the relevance of a webpage to a subject. For example, when two words that match a multi-word search appear together, the webpage will most likely be more relevant than if they were far apart. Text in a larger font is likewise treated as more relevant than text in a smaller font, and text in a different color or appearing earlier in the page is most likely more important than plain black text or text appearing later in the page (GOOGLE, 2009; PAGE et al, 1998; CHO et al, 1998).
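Two of the signals mentioned above, the proximity of the query words and how early they appear, can be illustrated with a small Python sketch. The scoring formula is an assumption made for this example only (Google's actual factors and weights are not public): one point at most for a term appearing at the very start of the page, plus one point at most for each pair of adjacent query words found side by side.

```python
def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w]


def proximity_score(page_text, query):
    """Toy relevance score combining term position and term proximity."""
    words = tokenize(page_text)
    terms = tokenize(query)
    positions = [[i for i, w in enumerate(words) if w == t] for t in terms]
    if not terms or any(not p for p in positions):
        return 0.0  # at least one query word is missing from the page
    # Position signal: the earlier the first match, the larger the bonus.
    earliest = min(p[0] for p in positions)
    score = 1.0 / (1 + earliest)
    # Proximity signal: adjacent query words that sit close together score more.
    for left, right in zip(positions, positions[1:]):
        gap = min(abs(i - j) for i in left for j in right)
        score += 1.0 / max(1, gap)
    return score


# Hypothetical page texts used only for the demonstration.
query = "good pizza"
close_page = "Good pizza near the station."
far_page = "The station is good, and far from any pizza."
unrelated_page = "Fresh pasta daily."

for text in (close_page, far_page, unrelated_page):
    print(round(proximity_score(text, query), 2), text)
```

With these inputs the page where the two words are adjacent and at the start scores 2.0, the page where they are five words apart scores 0.45, and the page missing the words scores 0.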
Cloud computing

The underlying concept of cloud computing dates back to 1960 and was first suggested by the computer scientist John McCarthy. He argued that computing could someday be provided as a public utility, like electricity and water (BUYYA et al, 2007). This scenario would parallel the evolution of electricity a century ago, when farmers shut down their own generators and started to buy power from much more efficient industrial utilities. Although the concept is old, interest in cloud computing has emerged only recently, as shown in the figure below.
Figure 7 – Rising Interest in Cloud Computing
[Chart: Cloud computing average search volume, monthly from September 2007 to December 2008, indexed to a September 2007 baseline of 1; vertical axis from 0 to 10.]

Source: authors (raw data from Google Trends)

As it stands today, cloud computing can be defined as an infrastructure that comprises both a huge hardware platform, made up of a collection of interconnected computers, and a complex software application that manages these resources. Google’s cloud, for example, is a network of hundreds of thousands of interconnected cheap servers, each not much more powerful than a standard PC. Google’s cloud stores stunning amounts of data, including copies of the entire World Wide Web, and has the aggregate capacity of a huge supercomputer. This particular set of hardware gives Google the ability to respond to many millions of daily queries in a fraction of a second and, likewise, makes it possible to use computationally intensive search and retrieval algorithms (GOOGLE, 2007). Google’s choice to invest in hardware resulted in the development of a competence that may be considered a significant competitive advantage over other web companies (STROSS, 2008).
A cloud, however, is more than a collection of computing resources, because it provides a mechanism to manage its own resources. A cloud computing hardware platform can be programmed to dynamically handle new provisioning requests, reconfigure systems, rebalance workloads, and monitor errors and performance. A simple scheme of a cloud computing arrangement is shown below.
Figure 8 – Cloud Computing Simplified Architecture
[Diagram: a cloud offering massive and flexible storage capacity and supercomputer-like processing capabilities, with the attributes storage, security, performance and reliability, reached through the Internet by end users and business applications.]

Source: authors

Figure 8 illustrates that both business and non-business users can access files and use software that are not physically on their machines. Files and software are in the cloud, distributed among several servers, typically in more than one physical location. For strategic reasons, Google does not disclose the location or number of its data centers, but it is taken for granted that they are spread over several locations and operate with redundancy.
Cloud computing also describes a set of applications developed to be accessible to users all over the world through the Internet. These cloud applications use large data centers and powerful servers that host Web applications and Web services. Anyone with a suitable Internet connection and a standard browser can access a cloud application (CHAPEL, 2008).
Google has used cloud computing since the beginning of the company. The main reason back then was to implement a cheap hardware design that could run a highly machine-intensive task in just a few seconds. Google’s founders started to build their own servers from cheap parts meant for personal computers. They wanted to save money, and they realized that a network of computers would search the Web more efficiently than the viable alternatives (HANSELL & MARKOFF, 2006). Today, Google’s cloud has the same characteristics as other cloud arrangements, among them the property of never getting old. When an individual server dies or is severely damaged, the maintenance crew pulls it out and replaces it with a new, faster and more modern one. This means that the cloud regenerates as it grows and virtually never becomes inoperative.
Together, Google’s thousands of machines form a massive and cost-effective supercomputer, optimized to perform one big task, namely to find, sort and extract web-based information as quickly and accurately as possible (GOOGLE, 2009).
Given those characteristics, the use of cloud computing offers Google not only huge processing power (a typical search takes less than 0.5 seconds) but also a business opportunity (renting out its capacity and processing power) and a way to speed up new product development (via an unparalleled environment for running simulations).
MapReduce

When a user types in a query, the search terms are looked up in an index, and the results are then returned
from a separate set of document servers (which provide preview “snippets” of matching pages from
Google's copies of the web), along with related advertisements, which are returned from yet another set
of servers. To sustain the high performance the business demands, Google relies on a highly strategic
piece of software developed in-house, called MapReduce. This is a programming model that was developed in
early 2003 to make use of large distributed systems. The software categorizes key pieces of information,
distributes them across its server farm of PCs, and then eliminates irrelevant data. MapReduce divides each
complex task into small ones and runs them simultaneously on thousands of computers in the cloud. In a
fraction of a second, each computer comes back with its own piece of the result, and MapReduce
quickly consolidates the data (DEAN & GHEMAWAT, 2004; STIBEL, 2008).
Then, with the help of PageRank™ and hypertext-matching analysis, the page of search results is
assembled. Google manages to do this highly complex and demanding task cheaply, in less than a
second, using computers built from inexpensive, off-the-shelf components linked together in a
reliable and speedy way, and powered by software developed by Google itself (GOOGLE, 2009).
Google uses MapReduce intensively, since it makes it possible to write a program and run it efficiently on
thousands of machines, greatly speeding up the development and prototyping cycle. It was designed to
assist programmers who have no experience with large-scale distributed systems, “since it hides the
details of parallelization, fault tolerance, locality optimization, and load balancing” (DEAN &
GHEMAWAT, 2004).
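The division of a task into small map steps followed by a consolidating reduce step can be illustrated with a minimal sketch. It is not Google's implementation: the function names (map_words, shuffle, reduce_counts, map_reduce) are hypothetical, the task is a simple word count, and a local thread pool stands in for the thousands of machines of a real cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_words(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(mapped_chunks):
    """Group the intermediate values by key between the two steps."""
    groups = defaultdict(list)
    for chunk in mapped_chunks:
        for key, value in chunk:
            groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: consolidate all values emitted for one key."""
    return key, sum(values)


def map_reduce(documents, workers=4):
    # Assumption: threads play the role of the worker machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_words, documents))
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["the web is large", "the index is fast", "search the web"]
    print(map_reduce(docs))
```

The programmer only writes the map and reduce functions; how the work is split and gathered is left to the surrounding framework, which is the property the quotation above emphasizes.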
There is a wide range of applications that use MapReduce within Google (GOOGLE, 2009c):
• Clustering problems for the Google News and Froogle products
• Extracting data to produce reports of popular queries (e.g. Google Zeitgeist and Google Trends)
• Extracting properties of Web pages for new experiments and products (e.g. extraction of
geographical locations from a large corpus of Web pages for localized search)
• Processing of satellite imagery data

Incremental innovations on the web search engine


Building a web search engine that is increasingly accurate, relevant and fast is one of Google's major
concerns and consumes a significant part of the company's resources. Under the vice president of
Engineering there is a group of mathematicians, software experts, and engineers focused on constantly
improving the quality of the search engine. During 2007, this team was responsible for more than
450 improvements (GOOGLE, 2009c).
There are many incremental innovations in the search engine architecture that Google does not disclose,
due to the sensitive nature of the information. But a few of them are public, and they give us a
glimpse of the effort and energy that the company devotes to this task. We describe below
some of these public innovations.
Innovations to deal with 40 languages. There are many differences that must be taken into account
regarding the language in which the web search service is being offered. Even the same language,
spoken in different locations, may be tricky to deal with. See the example of “camião” (truck) in Portugal
and “caminhão” in Brazil. Not only the language matters, but the user's location does too.
The question of fitting the web search engine to the several countries where Google is available has grown in
importance in the last few years, once the main issues were already being addressed and the international
revenues, as we saw, had become more and more relevant. In order to adapt the search engine
thoroughly for each of the 40 languages in which Google is available, the
company assembled an international network of employees, based in its 20 offices around the world.
This network counts on native speakers of virtually all major languages (from Abkhazian to Zulu). Some
of the most significant incremental innovations designed to deal with such language diversity are shown
below (GOOGLE, 2009c):

Table 2 – Search Engine Incremental Innovations to Support Google’s International Expansion

Spell corrections: Google uses a dictionary that was built and is maintained based on statistical analysis of the several billion pages it has indexed. This multilingual dictionary is the base that supports spell corrections and other features. This tool is in constant improvement to cope with the ever-changing nature of a language. No standard dictionary would offer such a service.

Stemming: There are some word variations that, despite being spelled differently, have the same meaning. Such is the case of the Polish word for "movie" (film) and the variants "filmów", "filmu", and "filmie". Searches performed with any of these words will return common results.

Influence of user geography: A search for “Páginas Amarillas” may result in a multiplicity of web pages, since there are more than a dozen Spanish-speaking countries. Using a technique to identify the user's geographical region, Google assigns more relevance to local pages.

Diacritical marks: Diacritical signs in some languages are used to support correct pronunciation (cases in which Google does not take them into account) or may give the word a totally different meaning. In the latter case Google recognizes them and uses these marks as a parameter in the search and classification routine.

Punctuation marks: The use and the meaning of punctuation marks vary widely among languages. Knowing the details of each of these variations is fundamental for search accuracy. Abbreviation formats, use of acronyms and ways to express dates are among the main variations that Google recognizes.

Non-Latin scripts: Languages that use non-Latin characters are generally supported by dual-function keyboards. If a user forgets to hit the key that changes the keyboard language, Google will recognize the equivalent keys, as if the user were typing in his or her own language. The same applies to the phonemes of a non-Latin language, such as Greek. For example, "Bank of Attica" entered as Greeks speak it (trapeza attikhs) returns good results for "Τράπεζα Αττικής".

Google Suggest: Online suggestion of a word or sequence of words in a query is meant to speed up typing. Google, based on its large database of queries, offers autofill options.

Source: authors
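Two of the ideas in the table, ignoring diacritical marks and correcting spelling against a dictionary built from word statistics, can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names (fold_diacritics, build_dictionary, edits1, correct) are hypothetical, the corpus is a toy list of strings, and the single-edit search is a well-known textbook technique, not a description of Google's actual method.

```python
import unicodedata
from collections import Counter


def fold_diacritics(text):
    """Strip combining marks so that accented and plain spellings match."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_dictionary(pages):
    """Count word frequencies over a toy corpus of indexed pages."""
    return Counter(word for page in pages for word in page.lower().split())


def edits1(word, alphabet):
    """All strings one deletion, swap, replacement or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in alphabet]
    inserts = [a + c + b for a, b in splits for c in alphabet]
    return set(deletes + swaps + replaces + inserts)


def correct(word, dictionary):
    """Return the most frequent known word within one edit of `word`."""
    if word in dictionary:
        return word
    alphabet = sorted({ch for known in dictionary for ch in known})
    candidates = [w for w in edits1(word, alphabet) if w in dictionary]
    return max(candidates, key=dictionary.get) if candidates else word


if __name__ == "__main__":
    pages = ["search engine", "web search results", "search the web"]
    dictionary = build_dictionary(pages)
    print(correct("serach", dictionary))
    print(fold_diacritics("Páginas Amarillas"))
```

Because the dictionary is derived from the indexed pages themselves, it follows the language as it is actually written. Folding diacritics is a choice to apply only where the marks do not change the meaning of the word, as the table notes.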

How to probe user experience. Google has a team whose mission is to understand as closely as possible
how users experience the services. This team spends considerable time in the field, observing consumer
behavior and habits regarding web search. The data collected in the field offer a unique source of information
and insights that can dramatically improve user experience.
Another way to probe user experience is to release some experiments to just a small group of
selected users and collect their opinions. This group can be formed by some regular users and some
experts, such as authors of specialized blogs, specialized journalists and advanced users.
Universal search. The concept of universal search aims to bring together the several forms of content that
Google has access to. Thus, instead of addressing only web pages, the results page also covers news,
maps, films, books, blogs and other media that are indexed by Google. By doing so, the company
intends to deliver more relevant answers to the query and fulfill user demand.

Management and Business Model


From the functionality-oriented consumer perspective, a good managerial structure is one that produces
useful and innovative products while keeping production and transaction costs low. As Google users tend
to be driven by the performance of the search engine itself, we can consider that innovation, brand
management, and efficient operation are among the most relevant managerial goals.
The ultimate test for any management team is how consistently it can improve company
performance over the long term, rather than how fast it can grow the company's top line in the short
term (HAMEL, 2001). In times such as ours, when ambiguity and change are inexorable, the quest for
long-term business sustainability demands a considerable capacity for rapid strategic adaptation.
Google has an organizational culture that not only fosters the constant search for excellence in the main
businesses (search and advertising), but also gives concrete incentives to engineers and computer
scientists to work on new off-budget ventures related to the company's open-ended mission: to organize
the world's knowledge. The organizational structure itself is flat and non-hierarchical in the traditional
sense. As in other knowledge-intensive companies, at Google the hierarchy of a project is given not by
paycheck level, but by the specific knowledge that each participant brings to solving
the immediate problems. Dedicated groups of three or four people are very common, and they are
assembled and disbanded without strict managerial control. This fluid hierarchy helps the company to
engage in high-level debates and cultivates an innovative environment. An idea must stand on its own
merits, regardless of the rank of its proposer. This is how unconventional ideas (such
as mapping the surface of Mars) get a chance to see the light of day. Although Google is a big
company, the relationship between senior managers and “ground floor” engineers is more like that between
classmates than the one found at a traditional large multinational company. Besides that, the level of formalism,
mainly in product development, is not very high. As Marissa Mayer (Vice President of Search Product
and User Experience at Google) has declared:

We still don't do very high-definition product specs. If you write a 70-page document
that says this is the product you're supposed to build, you actually push the creativity
out with process. You don't want to push that creativity out of the product. The
consensus-driven approach where the team works together to build a vision around
what they're building and still leaves enough room for each member of the team to
participate creatively, is really inspiring and yields us some of the best outcomes we've
had (BUSINESSWEEK, 2006).

Another aspect that contributed to Google's success was its innovative business model. At the end of the
nineties the established search websites, such as Yahoo!, AOL and Ask.com, were investing in offering
complementary content on their webpages (news, weather forecasts, stock prices, sports, culture, fashion
and others). This strategy aimed to capture user attention and to increase the frequency of
visits, and indirectly to raise the value of online advertising and increase the website's credibility as a
source of relevant information.
Google's strategy was totally different from that of its competitors, starting with a minimalist homepage,
without pop-ups or content (there is only the company logo, which changes according to important world
events, and a box in which to enter the user's search keywords). Also, the communication strategy for new
products is based on spreading beta versions of products (STROSS, 2008) through an informal network of
technology journalists, employees and blog authors. Google even has its own official blog, targeted at
early adopters (GOOGLE, 2009d). Because Google relies on advertising revenue, it can provide most of
its products for free, relying on users to pass along knowledge to friends, family and colleagues.
Figure 9 below briefly illustrates the evolution and transformation of Google's business model during the 10
years of the firm's existence.
Figure 9 – Google’s Business Model Evolution

1998 • Portal for web search only, no commercial use
2000 • Starts to commercialize AdWords, a text-based advertisement linked to search results
2002 • Licenses search engine and sponsored links to AOL, Yahoo! – money stream starts to flow
2003 • Global presence. Finds a way to insert relevant ads into any web content – online ad business is a huge success
2005 • Increases the flow of new web-based products, such as Gmail, Desktop Search, Scholar, Maps, Earth, Blog Search…
2008 • Launches Chrome (a web browser), Google TV Ads, Checkout (a web payment service) and Android (a mobile phone operating system). Intends to go far beyond the Internet

Source: authors (based on GOOGLE, 2009e)

Competitiveness Criteria for the Web Search Business


Besides the innovative business model and its well-designed fit to customer expectations, Google has a
distinctive technological infrastructure which, we believe, is the single most important characteristic
giving the company its solid competitive position. This infrastructure aims to deliver operational
excellence to the search engine, focusing mainly on three performance dimensions: accuracy, celerity,
and relevance. In the words of Larry Page (GOOGLE, 2009a), a perfect search engine “understands
exactly what you mean and gives you back exactly what you want”.
The continuous improvement of the search engine is a huge and complex job, which
consumes a third of all company resources (GOOGLE, 2007). Table 2 shows how these key
performance dimensions unfold from the client's perspective.

Table 2 – Search Engine Key Performance Dimensions

Key Performance Dimension: Brief description

Accuracy:
• Precision of the search results, given the user input
• Multiple-item queries must return pages taking into account the proximity of the terms, not only the terms themselves
• The location of the user matters to the answer
• The search mechanism must take into account diacritical marks and spelling errors

Celerity:
• The results page must be assembled quickly
• Typical response time must be around a few seconds

Relevance:
• Reliable sources of content must be well graded in the results page
• Results must be free from spam (empty pages, pages with advertisement only and pages with content that does not match the description of the page)
• Results must be as comprehensive as possible (web pages, maps, figures, videos, etc.)

Source: authors

Each of the core search engine technologies addresses one or more key performance dimensions. The
result is bigger than the sum of the parts: a solid web application, suitable for a wide range of users,
reliable, trustworthy and not biased by economic interests. Figure 11 below shows how the core web search
software addresses each of the key performance dimensions.

Figure 11 – Web Search Core Technologies and Key Performance Dimensions

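[Figure: matrix relating the key performance dimensions (rows: Accuracy, Celerity, Relevance) to Google's web search core technologies (columns: PageRank, Hypertext matching, Cloud computing, MapReduce). Accuracy is marked against all four technologies; Celerity and Relevance are marked against two each.]
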
Source: authors

The importance of these core technologies is not restricted to web search: they go beyond it and facilitate
the development of several new products. This is the case of Cloud Computing helping to put together
Google Docs, or PageRank lending relevance to image search. Figure 12 shows some of the recent
innovations in Google web search and the core technologies involved in the process.
Figure 12 – Web Search Core Technologies and some Incremental Innovations on Web Search Engine

[Figure: matrix relating the key performance dimensions (rows: Accuracy, Celerity, Relevance) to Google's web search core technologies (columns: PageRank, Hypertext matching, Cloud computing, MapReduce) and to incremental innovations in Google web search tools (columns: Spell check, Google Suggest, Synonym search, Google Maps, Universal search, Geographical context, Safe Search, Google Scholar)]

Source: authors
Final Remarks
There is an extensive set of variables that have driven Google to its present success, starting with the
genius of its founders, and going through the surroundings in which the company grew, its
distinctive culture based on knowledge, the company's strong principles and great technical capacity,
among other factors. This set of characteristics helped to build a nontrivial business environment and
to attract a talented workforce.
We show below some of the most relevant drivers of Google's development:
- Scalable and reliable infrastructure
o Cloud computing gives Google the ability to scale its operations indefinitely, supporting other
computationally intensive products besides web search itself, such as Google Maps, Google Earth, and
AdSense.
o Cloud computing also improves hardware reliability (since the system does not depend on a single
machine) and allows the hardware to evolve smoothly.
- Innovation and evolution based on data
o Employees are encouraged to develop their own projects related to the company's values, and knowledge
plays a significant role in project hierarchy.
o Analysts have a huge database from which considerable knowledge about user behavior and
product usage can be extracted. Using this powerful tool helps Google to gain insights and to foster an
analytical, fact-based culture.
- Fast prototype-to-product cycle
o Huge technological infrastructure
o Large base of collaborative and enthusiastic users

Table 3 shows the key performance indicators of the firm and Google's present status in each of them.

Table 3 – Key Performance Indicators at Google

Key performance indicator: Google

Public image:
• Brand value: US$25.6B (2008). According to the Interbrand “Best Global Brands” report, Google has been among the top 5 most valuable brands in the world since 2006.
• Company value: Google is the 56th biggest company in the world, according to the Financial Times “Global 500 2008” report, with a market value of more than US$116 billion (as of Feb 6th, 2009).
• Stock price and evolution: since 2004 the stock has outperformed Microsoft, Yahoo!, IBM, the S&P 500 and the Nasdaq 100 (see Figure 12 below)

Sales growth:
• 2008 sales revenues were double those of 2006 and reached US$21B
• Q4'08 Y/Y growth: 18%

Market share:
• Local market share (USA): Google had 64% of total US web searches in Dec-08, according to Nielsen Online. Yahoo!, its closest competitor, had 17% of US web searches, almost four times less than Google.
• International expansion: in 2008, the majority of the revenue came from abroad. The company has offices in 20 countries and its search engine is available in 160 countries and 40 languages.

Productivity:
• Revenues per employee have grown steadily since 2001, at a compound rate of almost 20% a year.

Turnover:
• The “100 Best Companies to Work for in USA” report (FORTUNE 500, 2009) from GREAT PLACE TO WORK put Google at the top of the list for 2007 and 2008, and ranked it #4 in 2009
• There were 777,427 job applications in 2008 and no voluntary turnover at Google USA.

Profitability:
• Google's 2008 net income was about US$4.3B, representing 19% of revenues. Yahoo!'s net income in 2007 was US$600M (9% of revenues).

Source: authors
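Some of the comparisons in Table 3 rest on simple arithmetic that can be checked directly. The sketch below uses only figures quoted in the table; the helper names (ratio, annual_growth, compound) are hypothetical, and the seven-year span for revenue per employee (2001 to 2008) and the rounded 20% rate are assumptions made for illustration.

```python
def ratio(a, b):
    """How many times larger a is than b."""
    return a / b


def annual_growth(multiple, years):
    """Constant yearly growth rate that yields `multiple` after `years`."""
    return multiple ** (1 / years) - 1


def compound(rate, years):
    """Total multiple reached after growing at `rate` for `years`."""
    return (1 + rate) ** years


if __name__ == "__main__":
    # Market share: Google 64% vs Yahoo! 17% of US web searches (Dec-08).
    print(f"share ratio: {ratio(64, 17):.2f}x")
    # Revenues doubled between 2006 and 2008, i.e. over two years.
    print(f"implied yearly revenue growth: {annual_growth(2, 2):.1%}")
    # Assumption: 20% a year over the seven years from 2001 to 2008.
    print(f"implied revenue-per-employee multiple: {compound(0.20, 7):.2f}x")
```

The first figure is consistent with the "almost four times" statement, and the second shows that doubling revenues in two years corresponds to a yearly growth of roughly 41%.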

Figure 12 – Cumulative Total Stockholder Return

[Figure: line chart comparing the cumulative total stockholder return of Google, Yahoo!, Microsoft and the S&P 500]

Source: BusinessWeek

References
ALBERS, Sönke; CLEMENT, Michel. Analyzing the Success Drivers of e-Business Companies. IEEE
Transactions on Engineering Management, 54, 2, p. 301-314, May-07.
ALTMAN, Alon; TENNENHOLTZ, Moshe. Ranking systems: The PageRank axioms. Proceedings of
the 6th ACM conference on Electronic commerce, 2005. Available at:
http://iew3.technion.ac.il/~moshet/pagerank.pdf. Access: 21st Jan-09.
ARTHUR, J. B. Effects of human resource systems on manufacturing performance and turnover.
Academy of Management Journal, 37, p. 670-687. 1994.
BAE, J.; LAWLER, J. Organizational and HRM strategies in Korea: impact on firm performance in an
emerging economy. Academy of Management Journal, 43, p. 502-517, 2000.
BANKER, Rajiv; LEE, Seok-Young; POTTER, Gordon. A Field Study of the Impact of a Performance-Based
Incentive Plan. Journal of Accounting and Economics, 21, p. 195-226. 1996.
BATT, R. Managing customer services: human resource practices, quit rates and sales growth. Academy
of Management Journal, 45, p. 587-597, 2002.
BRIN, Sergey; PAGE, Lawrence. The anatomy of a large-scale hypertextual web search engine. WWW7:
Proceedings of the seventh international conference on World Wide Web, Apr. 1998.
BUSINESSWEEK ON LINE. Inside Google's New-Product Process. Jun-06. Available at:
http://www.businessweek.com/technology/content/jun2006/tc20060629_411177.htm. Access:
28th Jan. 2009.
BUYYA, Rajkumar; YEO, Chee Shin; VENUGOPAL, Srikumar. Market-Oriented Cloud Computing:
Vision, Hype, and Reality for Delivering IT Services as Computing Utilities. Proceedings of the
10th IEEE International Conference on High Performance Computing and Communications, Sept.
2008, Dalian, China.
CENDON, Beatriz Valadares. Ferramentas de busca na Web. Ciência da Informação. Brasília, 30, 1,
Apr. 2001. Available at: http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0100-19652001000100006&lng=en&nrm=iso. Access: 19th Jan. 2009.
CHAPPELL, David. A Short Introduction to Cloud Platforms. Available at:
http://www.davidchappell.com/CloudPlatforms--Chappell.pdf. Access: 10th Jan. 2009.
CHO, Junghoo ; GARCIA-MOLINA, Hector; PAGE, Lawrence. Efficient crawling through URL
ordering. WWW7: Proceedings of the seventh international conference on World Wide Web 7,
Apr. 1998.
CORREA, Fernando Ribeiro. 30 PHDs working together and 6000+ Linux Servers are the power behind
Google leadership. Larry Page interview to the site OLINUX, published on 22nd. Jan. 2000.
Available at: http://olinux.uol.com.br/artigos/230/print_preview.html. Access: 28th Jan. 2009.
DEAN, Jeffrey; GHEMAWAT, Sanjay. MapReduce: Simplified Data Processing on Large Clusters.
Sixth Symposium on Operating System Design and Implementation, San Francisco, CA, Dec.
2004.
DELANEY, J. M.; HUSELID, M. A. The impact of human resource management practices on
perceptions of organizational performance. Academy of Management Journal, 39, 949-969, 1996.
FORTUNE 500. 100 Best Companies to Work For, 2009. Available at:
http://money.cnn.com/magazines/fortune/bestcompanies/2009/snapshots/4.html . Access: 25th Jan
2009.
GARCIA-MUINA, Fernando E.; NAVAS-LOPEZ, Jose E. Explaining and measuring success in new
business: The effect of technological capabilities on firm results. Technovation., 27, 1,
Amsterdam, p. 30, Jan/Feb 2007.
GOOGLE, INC. Annual Report 2007, 2009a. Available at: http://www.google.com Access: Jan. 20th
2009.
GOOGLE, INC. Financial tables, 2009b. Available at: http://investor.google.com/fin_data.html Access:
Jan. 24th 2009.
GOOGLE, INC. Official Google Blog, 2009c. Available at: http://googleblog.blogspot.com Access: Jan.
12th 2009.
GOOGLE, INC. Google Research Blog, 2009d. Available at: http://googleresearch.blogspot.com Access:
12th Jan. 2009.
GOOGLE, INC. Google 10th Birthday, 2009e. Available at: http://www.google.com/tenthbirthday/#start
Access: 12th Jan. 2009.
GOOGLE, INC. Q4 2008 Quarterly Earnings Summary, 2008. Available at:
http://investor.google.com/pdf/2008Q4_google_earnings_slides.pdf Access: 12th Jan. 2009.
GOOGLE INVESTOR RELATIONS, 2009. Available at http://investor.google.com Access: 12th Jan.
2009.
GUTHRIE, J. P. High-involvement work practices, turnover, and productivity: evidence from New
Zealand. Academy of Management Journal, 44, p. 180-190. 2001.
HAMEL, Gary. Revolution vs. Evolution: You Need Both. Harvard Business Review, p. 150-158,
Spring, 2001.
HANSELL, Saul; MARKOFF, John. A Search Engine That's Becoming an Inventor. The New York
Times, July 3rd, 2006.
INTRONA, Lucas; NISSENBAUM, Helen. Defining the Web: The Politics of Search Engines.
Computer, 33, 1, p. 54-62, Jan. 2000.
MENEFEE, Michael L; PARNELL, John A. Factors Associated with Success and Failure Among Firms
in High Technology Environments: A Research Note. Journal of Applied Management and
Entrepreneurship, 12, 4, p. 60-73, Oct 2007.
PAGE, Lawrence; BRIN, Sergey; MOTWANI, Rajeev; WINOGRAD, Terry. The PageRank Citation
Ranking: Bringing Order to the Web. Technical Report, Stanford University. Stanford, CA, 1998.
POULTER, Alan. The design of World Wide Web search engines: a critical review. Program, 31, 2, p
131-145, Apr. 1997.
ROBERTS, Edward B. The Success of High-Technology Firms: Early Technological and Marketing
Influences. Interfaces, 22, 4, July-August 1992, p.3-12.
ROBINSON, Sara. The Ongoing Search for Efficient Web Search Algorithms. SIAM News, 37, 9, Nov.
2004.
SCANDURA, Terri; WILLIAMS, Ethlyn A. Research methodology in management: Current practices,
trends, and implications for future research, Academy of Management Journal, Mississippi State,
43, 6, p. 1248-1264, Dec. 2000.
SONG, Michael et al. Success Factors in New Ventures: A Meta-analysis. The Journal of Product
Innovation Management, 25, 1, New York, p. 7, Jan 2008.
STAKE, Robert E. Case Studies. In: DENZIN, Norman K.; LINCOLN, Yvonna S. Handbook of
Qualitative Research, London: Sage Publications, p.236-247, 1994.
STROSS, Randall. Planet Google: One Company's Audacious Plan To Organize Everything We Know,
Free Press, 2008.
VARMA, A.; BEATTY, R. W.; SCHNEIER, C. E.; ULRICH, D. High performance worksystems:
Exciting discovery or passing fad? Human Resource Planning, 22, p. 26-38. 1999.
WENSLEY, Robin. Explaining Success: the Rule of Ten Percent and the Example of Market Share.
Business Strategy Review, 8, 1, p. 63 -70, 1997.
ZHU, Yunxia. New Technology Ventures: What Factors Lead to Success? Academy of Management
Perspectives – Research Briefs, p. 108-109, May 2008.
