http://eup.sagepub.com/
Published by:
http://www.sagepublications.com
Downloaded from eup.sagepub.com at Biblioteca de la Universitat Pompeu Fabra on July 25, 2014
Measuring Interest Group Influence Using Quantitative Text Analysis
Heike Klüver
University of Mannheim, Germany
European Union Politics
Volume 10 (4): 535–549
DOI: 10.1177/1465116509346782
© The Author(s), 2009.
Reprints and Permissions: http://www.sagepub.co.uk/journalsPermissions.nav
ABSTRACT
KEY WORDS: influence; interest groups; quantitative text analysis; Wordfish; WORDSCORES
Introduction
Textual data are arguably the most widely available source of evidence on
political processes. Content analysis was developed to make systematic use
of this rich data source. Political documents have a great potential to reveal
information about the policy positions of their authors: texts can be analysed
as many times as one wishes and they provide information about policy
positions at a specific point in time. Research on political parties has long
dealt with the measurement of policy positions and has developed three
major content analysis techniques for extracting policy positions from party
manifestos: hand-coding, WORDSCORES and Wordfish. Hand-coding is usually
associated with a high degree of validity but low reliability. WORDSCORES and
Wordfish, by contrast, offer a high degree of reliability but are often
criticized for a lack of validity. Hence, the validity of WORDSCORES and
Wordfish will be tested by comparing them with hand-coding.
WORDSCORES
A major step forward was undertaken by Laver et al. (2003): they developed
a fully automated text analysis programme for measuring policy positions.
By comparing the relative frequencies of words in ‘reference texts’ (documents
for which policy positions on predefined policy dimensions are known)
with relative frequencies in ‘virgin texts’ (unknown policy positions), one can
calculate the probability, $P_{wr}$, that one is reading a particular reference
text r given a specific word w. So it is assumed that each word provides a little
piece of information about which of the reference texts the virgin text most
closely resembles. Since the policy positions of the reference texts, $A_{rd}$,
are known, one can use the probabilities, $P_{wr}$, together with the reference
values, $A_{rd}$, to produce a score, $S_{wd}$, for each word w on dimension d.
Then the relative frequency of each virgin text word as a proportion of the
total number of words in the text, $F_{wv}$, is computed. The policy position
raw score, $S_{vd}$, of any virgin text is then the mean dimension score,
$S_{wd}$, of all the scored words that it contains, weighted by the frequency
of the scored words, $F_{wv}$. In order to compare the scores of the virgin
texts directly with those of the reference texts, these raw scores are finally
transformed into $S^{*}_{vd}$ (see Laver et al., 2003).1 Confidence intervals
are obtained by estimating the variance, $V_{vd}$, of the individual word
scores around the text’s mean score.
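The scoring steps described above can be illustrated with a minimal Python sketch. It is not the WORDSCORES software itself: the function names, the toy token lists and the reference positions of –1 and 1 are assumptions made for illustration, relative frequencies of the virgin text are computed over scored words only (also an assumption), and the final rescaling of the raw scores is left out.

```python
from collections import Counter


def relative_freqs(tokens):
    """Relative frequency of each word in a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def word_scores(reference_texts, reference_positions):
    """S_wd = sum over r of P_wr * A_rd, with P_wr = F_wr / sum over r of F_wr."""
    freqs = {r: relative_freqs(toks) for r, toks in reference_texts.items()}
    vocab = set().union(*(f.keys() for f in freqs.values()))
    scores = {}
    for w in vocab:
        total = sum(f.get(w, 0.0) for f in freqs.values())
        scores[w] = sum(
            (f.get(w, 0.0) / total) * reference_positions[r]
            for r, f in freqs.items()
        )
    return scores


def score_virgin_text(tokens, scores):
    """Raw score S_vd and variance V_vd, using scored words only."""
    scored = [t for t in tokens if t in scores]
    f_wv = relative_freqs(scored)
    s_vd = sum(f * scores[w] for w, f in f_wv.items())
    v_vd = sum(f * (scores[w] - s_vd) ** 2 for w, f in f_wv.items())
    return s_vd, v_vd


# Hypothetical toy data: two reference texts with assumed positions -1 and 1.
reference_texts = {
    "pro": ["green", "green", "tax"],
    "anti": ["cost", "cost", "tax"],
}
reference_positions = {"pro": -1.0, "anti": 1.0}
scores = word_scores(reference_texts, reference_positions)
raw_score, variance = score_virgin_text(
    ["green", "tax", "cost", "cost", "unknown"], scores
)
```

In this toy case the word "tax" appears equally often in both reference texts and therefore receives a score of 0, while "unknown" is ignored because it never occurs in a reference text.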
Wordfish
Wordfish assumes that word counts are drawn from a Poisson distribution
(Slapin and Proksch, 2008):

$$y_{ij} \sim \mathrm{Poisson}(\lambda_{ij}), \qquad \lambda_{ij} = \exp(\alpha_i + \psi_j + \beta_j \omega_i)$$

$y_{ij}$ is the count of word j in text i. $\alpha$ is a set of text effects that
control for the length of the documents. $\psi$ is a set of word fixed effects
that control for the fact that some words, such as articles or prepositions, are
generally used more frequently than other words. $\beta_j$ is an estimate of a
word-specific weight capturing the importance of word j in discriminating
between policy positions and $\omega_i$ is the estimate of actor i’s policy
position. The entire right-hand side of the equation is estimated by an
expectation maximization algorithm (see Slapin and Proksch, 2008). In order to
identify the model, $\alpha_1$ and the mean of all policy positions of actors
are set to 0 and the standard deviation of all policy positions is set to 1.
Confidence intervals are obtained using a parametric bootstrap.
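To make the role of the identification constraints concrete, the following Python sketch evaluates the Poisson log-likelihood of the Wordfish model (Slapin and Proksch, 2008) and rescales a set of parameters so that the first text effect is 0 and the positions have mean 0 and standard deviation 1. It does not implement the estimation algorithm; the function names, the toy counts and the parameter values are hypothetical, and the use of the population standard deviation is an assumption.

```python
import math


def log_likelihood(counts, alpha, psi, beta, omega):
    """Poisson log-likelihood with log(lambda_ij) = alpha_i + psi_j + beta_j * omega_i."""
    ll = 0.0
    for i, row in enumerate(counts):
        for j, y in enumerate(row):
            eta = alpha[i] + psi[j] + beta[j] * omega[i]
            ll += y * eta - math.exp(eta) - math.lgamma(y + 1)
    return ll


def identify(alpha, psi, beta, omega):
    """Impose alpha_1 = 0, mean(omega) = 0, sd(omega) = 1 without changing lambda."""
    n = len(omega)
    mean = sum(omega) / n
    sd = math.sqrt(sum((o - mean) ** 2 for o in omega) / n)
    a1 = alpha[0]
    new_omega = [(o - mean) / sd for o in omega]
    new_beta = [b * sd for b in beta]                 # compensates the rescaling
    new_psi = [p + b * mean + a1 for p, b in zip(psi, beta)]  # absorbs both shifts
    new_alpha = [a - a1 for a in alpha]
    return new_alpha, new_psi, new_beta, new_omega


# Hypothetical toy data: 3 texts (rows) by 3 words (columns).
counts = [[3, 0, 2], [1, 4, 0], [0, 2, 5]]
alpha = [0.2, -0.1, 0.3]
psi = [0.5, 0.1, -0.2]
beta = [1.0, -0.5, 0.3]
omega = [-1.2, 0.4, 2.0]

ll_before = log_likelihood(counts, alpha, psi, beta, omega)
alpha_id, psi_id, beta_id, omega_id = identify(alpha, psi, beta, omega)
ll_after = log_likelihood(counts, alpha_id, psi_id, beta_id, omega_id)
```

The log-likelihood is the same before and after the transformation, which shows why the constraints are needed: without them, infinitely many parameter sets fit the data equally well.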
Research design
In this section, I explain in detail which texts I used and what issue I selected
for the case study. In order to analyse interest group influence, I concentrated
on the policy formulation phase since a Commission proposal is the basis for
further debate between the Council and the European Parliament in the first
pillar. The policy positions of interest groups were extracted from their
submissions in an online consultation. Interest groups are consulted on a draft
before the final policy proposal is decided upon. Although the
submissions may reflect ‘strategic’ rather than ‘true’ policy positions, this
should not constitute a problem for two reasons. First, only transmitted policy
positions – even if they over- or understate the ‘true’ ideal policy positions –
are taken into account by the Commission and, therefore, constitute the
basis for the influence measurement. Second, it is unlikely that there is a
systematic variation of strategically over- or understating preferences across
group types so that the revealed policy position can be taken as a proxy for
the true policy position.3
For the analysis of the Commission’s policy position, press releases
accompanying the communication and the adoption of the proposal are used.
In theory, one could also use the communication and the proposal directly.
This is, however, associated with a problem of comparability: whereas the
communication is written as a continuous political text, the proposal consists
of the explanatory memorandum, the preamble and the actual regulation.
Thus, these texts employ a very different lexicon and cannot be compared
directly using computer-based content analysis (Laver et al., 2003: 315).4
In order to test different text analysis approaches, I selected the online
consultation concerning the reduction in CO2 emissions from cars. On 7
February 2007, the European Commission proposed a legislative framework
to reduce CO2 emissions from cars to 120g/km in 2012. The Commission
called for improvements in vehicle technology, tyres and air-conditioning
systems as well as for a greater use of biofuels. Furthermore, fiscal measures,
improved consumer information and a code of good practice were suggested.
The Commission then launched a public online consultation, which ran from
7 February until 15 July 2007 and was open to anyone interested in this issue.
The Commission adopted its final proposal in December 2007. The policy
positions of the Commission and the interest groups are measured on a single
‘pro environmental control’ and ‘anti environmental control’ policy dimen-
sion. Being located at the ‘pro environmental control’ end of the policy scale
implies that interest groups support the framework suggested by the
Commission and might even go beyond the proposed measures. Interest
groups located at the ‘anti environmental control’ end of the policy scale are
against the measures proposed by the Commission.
This issue was selected for various reasons. First, a wide variety of
interest groups took part in this consultation and one can, therefore, assume
a broad range of policy positions. I classified the groups into four classes:
Analysis
Hand-coding
First, a hand-coded analysis largely based on the design of the CMP was
performed. Drawing on in-depth reading of the Commission and interest
group texts, a classification scheme with 41 categories was developed (see
Table 1): 20 categories were classified as ‘pro environmental control’ and 20
as ‘anti environmental control’. All statements that could not be allocated to
one of these categories were grouped into an ‘others’ category. The units of
analysis are natural sentences. Each sentence was allocated to at least one of
the specified categories. The pro/anti environmental control scale was
produced according to the CMP procedure. First, the percentages of pro and
anti environmental control categories in the total number of coded statements
per text were calculated. Then, the pro percentage was subtracted from
the anti percentage. Negative scores represent pro environmental control
positions and positive scores represent anti environmental control positions.
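The construction of the scale can be illustrated with a short Python sketch. The function name and the labels "pro", "anti" and "other" are assumptions made for illustration; the input is one label per coded statement, and the later transformation of the estimates is not reproduced here.

```python
def environmental_control_score(statement_codes):
    """Anti percentage minus pro percentage among all coded statements of a text."""
    total = len(statement_codes)
    pro = sum(1 for code in statement_codes if code == "pro")
    anti = sum(1 for code in statement_codes if code == "anti")
    return 100.0 * anti / total - 100.0 * pro / total


# Hypothetical text with five coded statements: 20% anti minus 60% pro.
example_score = environmental_control_score(["pro", "pro", "pro", "anti", "other"])
```

The example text receives a negative score and would therefore be placed on the ‘pro environmental control’ side of the scale.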
Figure 1 plots the policy estimates obtained using this classification
scheme. In order to guarantee comparability with the other content analysis
approaches, the estimates were transformed. All Traditional Industry Groups
are located closer to the ‘anti environmental control’ end of the policy scale
than the European Commission. All Alternative Industry Groups are located
closer to the ‘pro environmental control’ side of the policy scale than the
Commission. Four of the Environmental Groups are located closer to the ‘pro
environmental control’ side of the policy scale than the Commission and two
(WWF, RSPB) are located in between the two Commission positions. The
Commission moved from a policy position of –1.17 to a policy position of
–0.55, so it clearly moved towards the Traditional Automobile Industry.
[Table 1: Classification scheme for the hand-coding. Columns: overall category; environmental control, pro; environmental control, anti.]
Wordfish
In a second step, I analysed the documents using Wordfish. Since all docu-
ments discuss only the Commission initiative for reducing CO2 emissions
from cars, one can assume uni-dimensionality and, thus, the complete texts
were used for the analysis. As recommended, the documents were edited
before the analysis: bullet points, hyphens, group names, contact details and
enumerations were removed from the documents. Then, the spelling and
grammar check of Microsoft Word was used to identify and correct mistakes.
Using the program jfreq, stop words (extremely common words), numbers
and currencies were removed from the documents and the words were
stemmed and transformed into lowercase. Finally, all stems that were
mentioned in only one text were removed, so that 1397 stems remain
for the analysis. Figure 2 shows the results of the analysis.

[Figure 1: Policy position estimates obtained from hand-coding. Horizontal axis: policy position, from Pro (–2) to Anti (2). Legend: Alternative Industry, Traditional Industry, Commission, Environmental Groups, Others.]
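The filtering steps can be sketched in Python as follows. This is not the jfreq program: the function name, the stop word list and the toy documents are hypothetical, stemming is left out, and numbers and currency amounts are dropped with a simple alphabetic check.

```python
from collections import Counter


def build_term_matrix(docs, stop_words):
    """Lowercase, drop stop words and non-alphabetic tokens, keep words used in 2+ texts."""
    cleaned = {
        name: [t.lower() for t in tokens
               if t.isalpha() and t.lower() not in stop_words]
        for name, tokens in docs.items()
    }
    # Document frequency: in how many texts does each word occur?
    doc_freq = Counter()
    for tokens in cleaned.values():
        doc_freq.update(set(tokens))
    kept = sorted(w for w, n in doc_freq.items() if n > 1)
    matrix = {}
    for name, tokens in cleaned.items():
        counts = Counter(tokens)
        matrix[name] = [counts[w] for w in kept]
    return kept, matrix


# Hypothetical toy corpus of three tokenized texts.
docs = {
    "A": ["Emissions", "cars", "the", "120"],
    "B": ["emissions", "tax", "the"],
    "C": ["cars", "biofuels"],
}
vocabulary, term_matrix = build_term_matrix(docs, stop_words={"the"})
```

Words that occur in a single text only ("tax" and "biofuels" in the toy corpus) are removed, so each remaining column of the matrix carries information about at least two texts.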
Most of the groups representing the Traditional Automobile Industry are
located closer to the ‘anti environmental control’ end of the policy scale than
the European Commission. Only RAI is located closer to the ‘pro environmental
control’ side than both Commission positions, and SMMT is located between
the two Commission positions. By contrast, all Environmental and Alternative
Industry groups are located closer to the ‘pro environmental control’ side of the
policy scale than the Commission. The Commission moved from 0.50 to a
policy position of 0.93 towards the Traditional Automobile Industry at the
‘anti environmental control’ end of the policy scale. This shift is statistically
significant since there is no overlap of confidence intervals.
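The overlap criterion can be sketched in Python as follows. The point estimates 0.50 and 0.93 are taken from the text, but everything else is assumed: the draws come from a normal distribution with a hypothetical standard deviation of 0.05 and merely stand in for the replicates of a parametric bootstrap, which would instead simulate new word counts from the fitted model and re-estimate it.

```python
import random
import statistics


def percentile_interval(draws, level=0.90):
    """Percentile interval from replicate estimates (0.90 -> 5th and 95th percentiles)."""
    cuts = statistics.quantiles(draws, n=100)  # cuts[k - 1] is the k-th percentile
    tail = round((1 - level) / 2 * 100)
    return cuts[tail - 1], cuts[100 - tail - 1]


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


rng = random.Random(0)
HYPOTHETICAL_SD = 0.05  # assumption, not reported in the text
draws_1 = [rng.gauss(0.50, HYPOTHETICAL_SD) for _ in range(1000)]
draws_2 = [rng.gauss(0.93, HYPOTHETICAL_SD) for _ in range(1000)]

ci_1 = percentile_interval(draws_1)
ci_2 = percentile_interval(draws_2)
shift_is_distinguishable = not intervals_overlap(ci_1, ci_2)
```

Non-overlapping intervals are a conservative criterion: two estimates can differ significantly even when their intervals overlap slightly.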
I then compared the Wordfish estimates with the results of the hand-
coding. Figure 3 plots the estimates together with a fitted regression line.8
The estimates correlate highly (r = .70, p < .001) and therefore largely cross-
validate each other. However, whereas both methods predict a clear move
towards the ‘anti environmental control’ end of the policy scale, hand-coding
sees the Commission closer to the ‘pro’ end of the policy scale. This differ-
ence could be due to the dichotomous categorization: a sentence is allocated
either to a ‘pro’ or to an ‘anti’ environmental control category. This leads to
[Figure 2: Wordfish policy position estimates with 90% confidence intervals. Horizontal axis: policy position, from Pro (–2) to Anti (2). Legend: Alternative Industry, Traditional Industry.]

[Figure 3: Wordfish estimates (vertical axis) plotted against hand-coding estimates (horizontal axis), both from Pro (–2) to Anti (2), with a fitted regression line. Legend: Alternative Industry, Traditional Industry, Commission.]
WORDSCORES
[Figure 4: Wordfish estimates (vertical axis, from Pro (–2) to Anti (2)) plotted against WORDSCORES estimates (horizontal axis, from Pro (–6) to Anti (4)). Legend: Alternative Industry, Traditional Industry, Commission.]
the ‘pro’ environmental control side of the policy scale. Thus, it can be
concluded that the Wordfish results are largely validated by the WORD-
SCORES estimates.
WORDSCORES was then used to test the hand-coding policy positions
estimates. Reference values were obtained from the hand-coding estimates of
the most extreme documents. The VDA text serves as a reference text for the
‘anti’ environmental control end of the policy spectrum and the documents
by ADTS, AVERE, AEGPL (European Liquefied Petroleum Gas Association),
EBB (European Biodiesel Board) and AVELE are used for the ‘pro’ environmental
control positions. This time, five documents were collapsed because
AVELE and AVERE submitted comments that are word-for-word identical except
for two sentences. Thus, if one of these texts were treated as a reference
and the other one as a virgin text, this would lead to an extreme score for the
virgin text. Figure 5 plots the policy position estimates derived from both
methods, together with a fitted regression line.10 The estimates correlate quite
highly (r = .53, p < .05). However, there is a lot of random noise: only 28%
of the variance of the hand-coding estimates can be explained by the WORD-
SCORES estimates. In conclusion, WORDSCORES strongly confirms the
Wordfish results whereas hand-coding is validated to only a medium degree.
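The 28% figure follows from squaring the reported correlation, since in a regression with a single predictor the share of explained variance equals r squared. The short Python sketch below reproduces this arithmetic for the two correlations reported in the text; the function names are chosen for illustration, and `pearson_r` is included only to show how r would be computed from two sets of estimates.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists of estimates."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def variance_explained(r):
    """Share of variance explained in a bivariate regression (r squared)."""
    return r ** 2


wordscores_vs_hand_coding = variance_explained(0.53)  # about 0.28
wordfish_vs_hand_coding = variance_explained(0.70)    # about 0.49
```

By the same calculation, the correlation of .70 between the Wordfish and hand-coding estimates corresponds to roughly half of the variance being shared.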
[Figure 5: Hand-coding estimates (vertical axis, from Pro (–2) to Anti (2)) plotted against WORDSCORES estimates (horizontal axis, from Pro (–6) to Anti (4)), with a fitted regression line. Legend: Alternative Industry, Traditional Industry, Commission.]
Conclusion
The aim of this article was to illustrate the usefulness of text analysis for the
measurement of interest group influence. Interest group influence can be
measured by comparing the policy preferences of interest groups with the
final policy output. The measurement of preferences, however, still consti-
tutes a big problem. This article therefore examined the applicability of
content analysis for the measurement of policy positions of interest groups.
A case study was carried out in order to compare three content analysis
approaches: hand-coding, WORDSCORES and Wordfish. The policy position
estimates correlate highly and, therefore, largely cross-validate each other.
Hence, in theory, all three approaches are applicable to the study of interest
group influence. However, one has to keep in mind that each approach has
advantages but also disadvantages.
The big advantage of hand-coding is the in-depth knowledge of the content
of the submissions and the high validity of the measurement. However, the
reliability of the results is relatively low compared with computerized content
analysis (Mikhaylov et al., 2008). Furthermore, hand-coding is very labour
intensive and time consuming. Finally, political issues may sometimes be
highly technical so that it might be difficult for researchers to understand the
content, develop a classification scheme and allocate the text units to categories.
Notes
References
Baumgartner, Frank R., Jeffrey M. Berry, Marie Hojnacki, David C. Kimball and
Beth Leech (2009) Lobbying and Policy Change: Who Wins, Who Loses, and Why.
Chicago: University of Chicago Press.
Benoit, Kenneth and Michael Laver (2003) ‘Extracting Policy Positions from
Political Texts Using Phrases As Data: A Research Note’, paper presented at
the annual meeting of the Midwest Political Science Association, 3–6 April,
Chicago.
Benoit, Kenneth, Michael Laver, Christine Arnold, Paul Pennings and Madeleine
O. Hosli (2005) ‘Measuring National Delegate Positions at the Convention on
the Future of Europe Using Computerized Word Scoring’, European Union
Politics 6(3): 291–313.
Budge, Ian and Judith Bara (2001) ‘Introduction: Content Analysis and Political
Texts’, in Ian Budge, Hans-Dieter Klingemann, Andrea Volkens, Judith Bara
and Eric Tanenbaum (eds) Mapping Policy Preferences: Estimates for Parties,
Electors, and Governments 1945–1998, pp. 1–16. Oxford: Oxford University Press.
Budge, Ian, Hans-Dieter Klingemann, Andrea Volkens, Judith Bara and Eric
Tanenbaum (2001) Mapping Policy Preferences: Estimates for Parties, Electors, and
Governments 1945–1998. Oxford: Oxford University Press.
Dür, Andreas (2008) ‘Measuring Interest Group Influence in the EU: A Note on
Methodology’, European Union Politics 9(4): 559–76.
Dür, Andreas and Dirk de Bièvre (2007) ‘Inclusion without Influence? NGOs in
European Trade Policy’, Journal of Public Policy 27(1): 79–101.
Klingemann, Hans-Dieter, Andrea Volkens, Judith Bara, Ian Budge and Michael
D. McDonald (2006) Mapping Policy Preferences II: Estimates for Parties, Electors,
and Governments in Eastern Europe, European Union, and OECD 1990–2003.
Oxford: Oxford University Press.
Laver, Michael, Kenneth Benoit and John Garry (2003) ‘Extracting Policy Positions
from Political Texts Using Words as Data’, American Political Science Review
97(2): 311–31.
Lowe, Will (2008) ‘Understanding Wordscores’, Political Analysis 16(4): 356–71.
Mahoney, Christine (2007) ‘Lobbying Success in the United States and the
European Union’, Journal of Public Policy 27(1): 35–56.
Mahoney, Christine (2008) Brussels versus the Beltway: Advocacy in the United States
and the European Union. Washington, DC: Georgetown University Press.
Michalowitz, Irina (2007) ‘What Determines Influence? Assessing Conditions for
Decision-Making Influence of Interest Groups in the EU’, Journal of European
Public Policy 14(1): 132–51.
Mikhaylov, Slava, Michael Laver and Kenneth Benoit (2008) ‘Coder Reliability and
Misclassification in Comparative Manifesto Project Codings’, paper presented
at the annual meeting of the Midwest Political Science Association, 3–6 April,
Chicago.
Pappi, Franz U. and Christian H. C. A. Henning (1999) ‘The Organization of
Influence on the EC’s Common Agricultural Policy: A Network Approach’,
European Journal of Political Research 36(2): 257–81.
Proksch, Sven-Oliver and Jonathan B. Slapin (2008) ‘WORDFISH: Scaling Software
for Estimating Political Positions from Texts, Version 1.2’, URL (consulted
Sept. 2008): http://www.wordfish.org.
Schneider, Gerald, Daniel Finke and Konstantin Baltz (2007) ‘With a Little Help
from Your State – Interest Intermediation in the Pre-Negotiations of EU Legis-
lation’, Journal of European Public Policy 14(3): 444–59.
Slapin, Jonathan B. and Sven-Oliver Proksch (2008) ‘A Scaling Model for Estimat-
ing Time-Series Party Positions from Texts’, American Journal of Political Science
52(3): 705–22.
Woll, Cornelia (2007) ‘Leading the Dance? Power and Political Resources of
Business Lobbyists’, Journal of Public Policy 27(1): 57–78.