Google Scholar

Article  in  The Charleston Advisor · January 2011


DOI: 10.5260/chara.12.3.36


1 author:

Amy Hoseth
Colorado State University

All content following this page was uploaded by Amy Hoseth on 16 April 2015.



36   Advisor Reviews  /  The Charleston Advisor  /  January 2011 www.charlestonco.com

Advisor Reviews––Standard Review

Google Scholar
doi:10.5260/chara.12.3.36
Date of Review: November 9, 2010
Composite Score: ★★ 1/2
Reviewed by: Amy Hoseth
Assistant Professor/Liaison Librarian
Morgan Library
Colorado State University
1019 Campus Delivery
Fort Collins, CO  80523
<amy.hoseth@colostate.edu>

Abstract
Google Scholar is an internet-based search engine designed to locate scholarly information, including peer-reviewed articles, theses, books, preprints, abstracts, and court opinions from academic publishers, professional societies, online repositories, universities, and other Web sites. This review looks at the strengths and weaknesses of this search engine to assist librarians in making informed decisions about the use of this tool.

Pricing Options
Free access via any Web browser.

Product Description
Google Scholar is an internet-based search engine designed to locate scholarly information, including peer-reviewed articles, theses, books, preprints, abstracts, and court opinions from academic publishers, professional societies, online repositories, universities, and other Web sites. Results are returned in a relevance-ranked format. Google Scholar is free on the Web; institutions whose holdings are available via a link resolver and/or WorldCat can opt to link patrons to those resources as part of their Google Scholar search results.

Critical Evaluation

Google Scholar: The Good, the Bad, and the Ugly
Since its launch in 2004, Google Scholar has firmly established itself as a critical resource for those conducting academic research. Bolstered by its hard-to-beat pricing (free) and its broad, interdisciplinary coverage, Google Scholar is now included as a resource on many library Web sites and taught to students. Certainly, Google Scholar is a solid entrant into the world of scholarly research and offers both students and serious researchers a highly accessible, easy-to-use research tool. However, this promising tool is not without significant flaws. As William Badke noted in a June 2009 article, “Google Scholar is, in essence, a large, academic metasearch tool. As such, it carries all the promise and frustrations of metasearch––with additional frustrations” (Badke 2009, 48).

Google Scholar’s initial launch was met with a mixture of skepticism and support, and since then it has been the subject of numerous articles, studies, and reviews. More than five years later, the product still wears a “beta” label and evidence indicates that programmers continue to make changes to Google Scholar behind the scenes. Today it has been estimated that Google Scholar participates with approximately 2,900 scholarly publishers and includes more than 10 million items from Google Book Search (Jacsó 2010, 176–177), although there is no authoritative information on potential overlap between Google Scholar, Google Books, and regular Google. Google’s ongoing refusal to provide discrete information about the size and scope of its database makes exact quantitative analysis next to impossible.

As with other Google products, Google Scholar relies primarily on keyword searching to return relevant results. The exact algorithm that makes these searches possible is unknown. An Advanced Scholar Search option allows users to perform somewhat more sophisticated queries (searching by author name, for example), although the product’s lack of a controlled vocabulary, unpredictable handling of Boolean operators, and incompatibility with standard database search options such as word truncation continue to challenge more experienced researchers. And, as will be explored later, the decision by Google developers to rely on their own parsers and “smart crawlers” rather than publisher-supplied metadata has led to significant errors in the database.

Since most database administrators and librarians are familiar with Google Scholar at this point, this review will highlight those elements of the product that are positive (“The Good”), negative (“The Bad”), and particularly problematic (“The Ugly”) at this point in time, more than five years after the product was launched.

The Good
Perhaps the best elements of Google Scholar are those inherent to its mission and purpose: the product is free, and it provides researchers with a way to search for academic citations. As is the case with many Open Access publications, Google Scholar can also help researchers find items that are freely available in full text. Google Scholar requires no login and can be accessed from any computer with an internet connection.

Coverage
Google Scholar’s coverage of journals and books has expanded significantly since it was launched: the coverage of books is supported by the Google Book Search project, which is ongoing and allows users to search within the full text of digitized monographs. In addition, many more scholarly publishers appear to be cooperating with Google Scholar now as compared to when the service first launched, including major players such as Elsevier and the American Chemical Society. Google Scholar pulls information from publishers and their Web sites as well as from abstracting and indexing (A&I) databases. In late 2010, new research by Xiaotian Chen reports that “Google Scholar is able to retrieve any scholarly journal article record from all
the publicly accessible Web sites and from subscription-based databases it is allowed to crawl” (Chen 2010, 221). Chen’s research also indicates that the turnaround time between the date new articles are published and the date they are indexed by Google Scholar has dropped to approximately nine days.

Google Scholar has enhanced its coverage still further by including a significant number of patents, legal documents, and court cases. The service enables users to search and read opinions for U.S. state appellate and Supreme Court cases since 1950, U.S. federal district, appellate, tax, and bankruptcy courts since 1923, and U.S. Supreme Court cases since 1791.

Geographic and Linguistic Expansion
Google Scholar has greatly improved and expanded the amount of content it includes from other countries and from publications written in languages other than English. A 2010 study found that, among a random sample of non-English journal articles, the coverage rate by Google Scholar was 100 percent (Chen 2010, 225). Because most scholarly databases emphasize anglophone sources (in particular those from the U.S., Canada, and the U.K.), Google’s geographic expansion and linguistic additions are noteworthy.

Links to Local Content
The addition of the Library Links and Library Search tools to Google Scholar is another feature worth highlighting. Those libraries that make full-text access available to researchers via a link resolver can opt in to Google Scholar’s Library Links feature, which will display an additional link within records to direct users back to the library’s servers and then to the item itself in full text when available. Library Search provides a similar service for participating libraries whose collections are indexed in OCLC’s Open WorldCat; clicking on the Library Search link takes users to the WorldCat system, where they can find specific titles in area libraries.

Bibliographic Citation Support and Alerts
Like many other scholarly databases, Google Scholar supports bibliographic exporting to a number of citation tools as well as the creation of alerts to inform researchers about articles that have been newly added to the Google Scholar database. The bibliographic exporting feature supports EndNote, RefWorks, and several other tools. And while the Alert function does not guarantee that the articles to which you are directed are recently published (rather, they may be articles that have simply been newly indexed by Google Scholar), this is still a useful feature.

Searching Within Citing Articles
In July 2010 Google Scholar added the option to search within citing articles for additional terms. After running a search, users can click on the Cited By link beneath an article to see a list of other articles that have cited the original work. By entering additional search terms and clicking on Search Within Articles, users can sort and sift through large numbers of citations to find information on more specific topics. For example, a search for John F. Nash, Jr.’s classic 1950 paper, “The bargaining problem,” indicates that it has been cited, according to Google Scholar, more than 4,000 times. The Search Within Articles feature allows users to navigate through those thousands of citing papers by using other keywords (such as economics or political science) to refine those results.

Finally, Google Scholar remains a useful resource to identify articles where only a partial or incomplete citation has been found (a good “port in the storm” when other databases are not helpful) and a broad research supplement to interdisciplinary and cross-disciplinary searches.

Google Scholar Review Scores
Composite: ★★ 1/2
The maximum number of stars in each category is 5.

Content: ★★★
Expanded coverage of journals and books is a plus, but coverage gaps and ambiguous content are problematic. Problems with illiteracy and innumeracy compromise the integrity of many records.

User Interface/Searchability: ★★
Google Scholar’s Advanced Scholar Search options are not advanced enough for serious researchers; the tool offers limited options for sorting and limiting searches.

Pricing: N/A

Contract Options: N/A

Contact Information
Google
1600 Amphitheatre Parkway
Mountain View, California  94043
Phone: (650) 253-0000
Fax: (650) 618-1499
E-mail: <info@google.com>
URL: <http://www.google.com>
URL: <http://scholar.google.com/>

The Bad
Unfortunately, the good points of Google Scholar are not strong enough to outweigh the many problems, both “bad” and “ugly,” affecting the search tool. For example, while Google’s simple search interface has many fans and imitators, the relatively limited advanced search options in Google Scholar and its complete lack of controlled vocabulary frustrate experienced searchers and result in noisy searches that are almost impossible to narrow down. Other problems also exist.

Relevancy Ranking
The default ranking for Google Scholar results is by relevancy, rather than by date as is generally the case in academic databases. So, for example, a simple search for “mountain pine beetle” returns a book from 1985 as the very first result. Unfortunately, Google Scholar offers limited options for reordering and limiting the results set. Users may incorporate Advanced Search features to focus on articles from a certain date range or use pull-down menus on the results page to limit their searches to articles published since a certain year––neither of which is a particularly elegant or effective way to sort. Google continues to provide no information on how articles are weighted or how relevancy is determined.

Numerical Errors
Innumeracy creates a significant number of errors and problems in Google Scholar. Some of these numerical challenges are painfully obvious. For example, searching Google Scholar for the term “the”––the most frequently used word in the English language––returns approximately 8.55 million results. Adding the word “a”––another common English word––should logically result in more results. But searching for “the OR a” instead returns just 7.68 million hits.

This illogical situation was explored by Jacsó, who contends, “The enhancement of the content [in Google Scholar] has not been matched by improvements in the software” (Jacsó 2008, 107). Beyond concerns about innumeracy, this simple test also raises questions about how well (or whether) Google Scholar handles simple Boolean searching.

Inflated Citation Counts
Because the developers of Google Scholar did not use publisher-supplied metadata, there are a number of errors in the database. One of the more egregious is the inclusion of both master records and citation records for individual articles. This quirk produces multiple hits for the same article and inflates citation counts, making it nearly impossible to evaluate scholarly productivity by using Google Scholar. So, for example, a search for the article, “Song recognition without identification: When people cannot ‘name that tune’ but can recognize it as familiar,” by Bogdan Kostic and Anne M. Cleary, returns seven versions, including two that are simply citations without links to full-text options.

Coverage Confusion
While it is impossible to know exactly what sources Google Scholar includes, researchers have studied the issue numerous times in the years since its launch. Early research indicated that there were significant gaps in the full-text indexing of many important serial and Open Access publications (Mayr 2008, 97); that Google Scholar’s coverage of Open Access and scientific and medical literature was fairly strong, but that it was much weaker in other academic areas, including the social sciences, humanities, and business (Neuhaus, 138); and that there were lengthy delays between an article’s publication and its indexing in Google Scholar. While Chen’s recent research indicates that these areas have improved significantly in the intervening years (Chen, 221), Google remains closed-mouthed about the extent of its coverage––prompting scholars to comment that “Google Scholar could render future [studies] unnecessary and obsolete, simply by sharing a detailed description of its content collection methodology” (Neuhaus, 139).

Full-text Access
While the addition of the Library Links feature to Google Scholar was a positive development, it is not without some issues. Google Scholar commonly includes links to British Library Direct (BL Direct) beneath the articles themselves. Google has partnered with BL Direct since 2006 to provide fee-based access to articles found online via Google Scholar. The BL Direct link gets prime real estate on the results page and is often provided for articles that are also freely available online, such as those accessible via PubMed. It remains up to the savvy searcher to realize he can customize Google Scholar preferences to include Library Links and that he can access some articles freely online or via a local library instead of purchasing them via BL Direct. Google Scholar’s lack of reliance on publisher metadata also means that, even when users click on Library Links, full bibliographic content may not transfer from Google Scholar to an individual institution’s link resolver.

The Ugly

Ambiguous Content
Perhaps the most serious problem with Google Scholar is that, unlike users of scholarly databases, users of Google Scholar have no idea what they are searching. “What does Google Scholar point to, cover, and index? These questions, as numerous authors have noted, have neither been made clear by Google Scholar nor by its creator Anurag Acharya” (Neuhaus et al., 128). As has been mentioned earlier, we have no definitive information on what sources Google crawls or how often it updates its database. Google is “almost ridiculously [rigid] when it comes to publishing full details of the scientific journals it crawls to generate its database, or to revealing details of how often those journals are updated” (Winder, 10). Until Google Scholar is more forthcoming about exactly what it indexes, it will be difficult to take it seriously as an important academic resource.

Ghost Authors
Another critical error introduced to Google Scholar by the developers’ decision not to use publisher metadata is poor author name information. These “ghost authors” often take their names from other fields in the document, resulting in clearly erroneous author names such as P Login (for Please Login) or A Registered (for Already Registered).

This problem has received significant coverage in the literature (see Jacsó, 2009, among others); it appears that, as these errors have been spotted, reported, and published, Google’s developers have retroactively cleaned up the database. However, other errors remain. For example, a search in early November 2010 returned an article ostensibly written by “F Policy.” The actual article, titled “Fiscal policy, legislature size, and political parties: Evidence from state and local governments in the first half of the 20th century,” was written by Thomas W. Gilligan and John G. Matsusaka. These errors significantly compromise users’ ability to consult Google Scholar as a source for determining scholarly productivity.

Publication Date Errors
Erroneous publication years are yet another problem with Google Scholar. Conducting an Advanced Scholar Search and limiting the
date range to articles published between 2012 and 2025, for example, returns more than 1,700 articles, all with problematic dates of publication. A casual review of these articles indicates that Google Scholar is creating bad dates from page numbers, volume and issue numbers, and other sets of numerical data. This is another example of how programming errors have compromised the overall quality of the database and hampered the ability of users to search for relevant content.

Conclusion
At this time, Google Scholar still appears full of potential, particularly for researchers who are conducting broad, interdisciplinary searches and who can benefit from a free online search tool. However, the tool still raises serious concerns for those who are familiar with more sophisticated and comprehensive search techniques due to significant search interface limitations and uncertainty regarding exactly what it indexes. Google Scholar remains, as its “beta” label indicates, a work in progress.

Contract Provisions
No contract required. Freely available at <http://scholar.google.com>.

Authentication
None required. Libraries that have implemented a link resolver can sign up for Google Scholar’s Library Links program, which includes a link to full text (when available) at the user’s home institution next to each item in the results list. Users must customize preferences to see the links. IP authentication is handled at the local level. Similarly, libraries that include their holdings in OCLC’s Open WorldCat can participate in Google Scholar’s Library Search option, which provides links to local library holdings when possible.

References
Badke, William. “Google Scholar and the Researcher.” Online (Weston, Conn.) 33, no. 3 (2009): 47–49.
Chen, Xiaotian. “Google Scholar’s Dramatic Coverage Improvement Five Years after Debut.” Serials Review 36, no. 4 (2010): 221–226.
Howland, Jared L., Thomas C. Wright, Rebecca A. Boughan, and Brian C. Roberts. “How Scholarly Is Google Scholar? A Comparison to Library Databases.” College and Research Libraries 70, no. 3 (2009): 227–234.
Jacsó, Péter. “Google Scholar’s Ghost Authors.” Library Journal 134, no. 18 (2009): 26–27.
———. “Google Scholar Revisited.” Online Information Review 32, no. 1 (2008): 102–114.
———. “Metadata Mega Mess in Google Scholar.” Online Information Review 34, no. 1 (2010): 175–191.
Mayr, Philipp, and Anne-Kathrin Walter. “An Exploratory Study of Google Scholar.” Online Information Review 31, no. 6 (2007): 814–30.
———. “Studying Journal Coverage in Google Scholar.” Journal of Library Administration 47, no. 1 (2008): 81–99.
Neuhaus, Chris, Ellen Neuhaus, Alan Asher, and Clint Wrede. “The Depth and Breadth of Google Scholar: An Empirical Study.” Portal: Libraries and the Academy 6, no. 2 (2006): 127–141.
Walters, William H. “Google Scholar Search Performance: Comparative Recall and Precision.” Portal: Libraries and the Academy 9, no. 1 (2009): 5–24.
Wilson, Virginia. “A Content Analysis of Google Scholar: Coverage Varies by Discipline and by Database.” Evidence Based Library and Information Practice 2, no. 1 (2007): 134–136.
Winder, Davey. “The Struggle for Scholarly Search.” Information World Review 244 (2008): 10–11.

About the Author
Amy Hoseth is an Assistant Professor and Liaison Librarian at the Colorado State University Libraries in Fort Collins. She holds an M.L.S. from the University of Maryland at College Park and a B.A. in history from Drake University in Des Moines, Iowa. Before joining the faculty at CSU she worked at the Association of Research Libraries in Washington, D.C. as a communications coordinator for the LibQUAL+ assessment instrument.
