A Thesis
Submitted to the Faculty of Mercyhurst College
In Partial Fulfillment of the Requirements for
The Degree of
MASTER OF SCIENCE
IN
APPLIED INTELLIGENCE
Submitted By:
ANDREW D. BRASFIELD
Certificate of Approval:
_______________________________________
Kristan J. Wheaton
Assistant Professor
Department of Intelligence Studies
_______________________________________
James G. Breckenridge
Chair/Assistant Professor
Department of Intelligence Studies
________________________________________
Phillip J. Belfiore
Vice President
Office of Academic Affairs
May 2009
Copyright © 2009 by Andrew D. Brasfield
All rights reserved.
DEDICATION
This work is dedicated to Melody and Dharma
for being patient with my busy schedule during the last two years.
ACKNOWLEDGEMENTS
First, I would like to thank Professor Kris Wheaton for his guidance and advice during
I also would like to thank Professor James Breckenridge for taking the role of my
secondary reader.
I also owe thanks to Professor Stephen Marrin for helping me obtain various documents
I would also like to thank Kristine Pollard for her technical assistance during this process,
and; without whom, I would not have been able to begin this process last summer.
I would also like to thank Hemangini Deshmukh for assisting in applying statistical
Lastly, I would like to thank Travis Senor for his assistance while conducting the
experiment.
FORECASTING ACCURACY AND COGNITIVE BIAS IN THE
ANALYSIS OF COMPETING HYPOTHESES
By
Andrew D. Brasfield
in the United States Intelligence Community to aid qualitative analysis. Taking into
consideration what previous studies found, an experiment was conducted testing the
which hinder the analytical process. The findings of the experiment suggest ACH can
such as confirmation bias, and is almost certain to encourage analysts to use more
information and apply it more appropriately. However, the results suggest that ACH may
be less effective for an analytical problem where the objective probabilities of each
hypothesis are nearly equal. Given these findings, future studies should focus less on the
question of ACH’s general efficacy, but instead should aim to expand our understanding
TABLE OF CONTENTS
COPYRIGHT PAGE
DEDICATION
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
CHAPTER
1 INTRODUCTION
2 LITERATURE REVIEW
    Key Terms
    The Debate: Structured V. Unstructured Methods
    Structured Methods in Intelligence
    Analysis of Competing Hypotheses
    Strengths & Weaknesses
    Previous Studies
    Hypotheses
3 METHODOLOGY
    Research Design
    Participants
    Procedures
    Control Group
    Experimental Group
    Data Analysis
4 RESULTS
    Accuracy
    Mindsets
    Confirmation Bias
    Other Findings of Interest
    Summary of Results
5 CONCLUSION
BIBLIOGRAPHY
APPENDICES
LIST OF TABLES
LIST OF FIGURES
Figure 3.2 Group Comparison by Class Year
INTRODUCTION
weapons of mass destruction (WMD), it is clear that the United States Intelligence
Community could improve the process it uses to reach analytic judgments. Traditionally,
such judgments are reached through intuitive thinking. However, one of the
Regarding Weapons of Mass Destruction was that “the [intelligence] community must
develop and integrate into regular use new tools to assist analysts in filtering and
correlating the vast quantities of information that threaten to overwhelm the analytic
process.”1 This statement represents the growing belief that structured methods can help
the United States Intelligence Community’s analytic capabilities reach the quality and
In this structured technique, the scientific method is incorporated into the analytic process
by weighing multiple hypotheses in a matrix, evaluating all evidence for and against
each, and determining the likelihood of all possibilities by trying to disprove hypotheses.2
Researchers have found that this methodology can help "analysts overcome cognitive
1
United States Government - Commission on the Intelligence Capabilities of the United States
Regarding Weapons of Mass Destruction. Report to the President of the United States, (Washington D.C.,
2005), 402. <http://www.wmd.gov/report/> (Accessed 22 January 2009)
2
Richards J. Heuer, Jr., “Limits of Intelligence Analysis.” Orbis (Winter 2005): 92.
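The disproof-oriented weighing of hypotheses described above can be sketched in a few lines of Python. This is a minimal illustration only, not Heuer's own procedure or any official ACH software: the hypothesis labels, evidence items, and the "C"/"I"/"N" rating codes are hypothetical, and ranking hypotheses by a simple count of inconsistent evidence is a simplifying assumption.

```python
# Hypothetical ACH-style matrix: each evidence item is rated against every
# hypothesis as "C" (consistent), "I" (inconsistent), or "N" (not applicable).
# All labels and ratings are invented purely for illustration.
MATRIX = {
    "E1": {"H1": "C", "H2": "I", "H3": "C"},
    "E2": {"H1": "C", "H2": "C", "H3": "C"},
    "E3": {"H1": "I", "H2": "I", "H3": "C"},
    "E4": {"H1": "C", "H2": "I", "H3": "N"},
}


def inconsistency_counts(matrix):
    """Count, for each hypothesis, how many evidence items argue against it."""
    counts = {}
    for ratings in matrix.values():
        for hypothesis, rating in ratings.items():
            counts.setdefault(hypothesis, 0)
            if rating == "I":
                counts[hypothesis] += 1
    return counts


def rank_hypotheses(matrix):
    """Order hypotheses from least to most contradicted by the evidence."""
    counts = inconsistency_counts(matrix)
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(counts, key=lambda h: (counts[h], h))


if __name__ == "__main__":
    print(inconsistency_counts(MATRIX))
    print(rank_hypotheses(MATRIX))
```

The sketch ranks hypotheses by how much evidence contradicts them rather than by how much supports them, which mirrors the emphasis on trying to disprove hypotheses. A real analysis would also weigh the credibility and relevance of each item, which this example deliberately leaves out.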
The primary benefit is the added element of the scientific method. This, in theory,
improves the quality and accuracy of analysis by imposing structure onto our limited, and
make the analytic process and end product easier to critique and evaluate. This is
important for both analysts and their supervisors so that mistakes and successes can more
easily be identified and understood for the improvement of future efforts. Likewise, in the
Despite these potential benefits, there are some obstacles to the use of structured
methods in the US Intelligence Community. First, although there are over 200 analytic
methods available to intelligence analysts, exposure to these methods has been minimal.4
Because of this, it is likely most analysts in the US Intelligence Community are unaware
of the existence of methods that could aid their work, let alone have received training that
and skeptical of, if not hostile to, the notion of structured methods. One researcher notes
that this attitude is partly justified by the lack of empirical evidence suggesting
structured methods can improve intelligence analysis.5 According to Dr. Rob Johnston in
3
Diane Chido and Richard M. Seward, Jr., eds. Structured Analysis of Competing Hypotheses: Theory and
Application (Mercyhurst College Institute of Intelligence Studies Press, 2006), 48.
4
Rob Johnston, “Integrating Methodologists into Teams of Substantive Experts,” Studies in Intelligence
Vol. 47, No. 1: 65.
5
Stephen Marrin, “Intelligence Analysis: Structured Methods or Intuition?” American Intelligence Journal
25, no. 1 (Summer 2007): 10.
The principal difficulty lies not in developing the methods themselves, but
in articulating those methods for the purpose of testing and validating
them and then testing their effectiveness throughout the community. In the
long view, developing the science of intelligence analysis is easy; what is
difficult is changing the perception of the analytic practitioners and
managers and, in turn, modifying the culture of tradecraft.6
Hopefully, the quantitative data derived from this experimental study will offer insights
into the utility of structured methods and ACH specifically and challenge commonly held
Taking into account that previous studies on ACH have yielded mixed and
inconclusive results, the purpose of this study is to add to the small number of such
studies and shed further light on ACH’s utility and efficacy with intelligence analysis
phenomena that hinder our ability to think clearly and accurately. From the quantitative
data I collect, I hope to gain insight regarding the methodology’s usefulness for analysts
Unfortunately, there are some limitations to this study. These limitations pertain
experimental conditions that are not ideal but impossible to avoid with the given
resources. While ACH offers numerous potential benefits to analysis, such as those
related to hypothesis generation and its use in a team environment, the primary goals of
6
Rob Johnston, Analytic Culture in the US Intelligence Community (Washington D.C.: Center for the Study
of Intelligence, 2005), 20-21.
this experiment are to test the methodology’s accuracy and its ability to mitigate
necessary sacrifice.
experimental study such as this one would be US Intelligence Community analysts who
are specifically experienced with ACH. Participants with these qualifications would
likely provide higher quality and more valid results. Although all participants using ACH
will have had some experience with the methodology, this study did not have access to a
This study is organized as follows. First, the researcher will review
the existing body of literature pertinent to the topic, including important terms of
reference, the debate on the use of structured methods, as well as current and past use of
such methods in the US Intelligence Community. Next, the researcher will explain the
methodology for the experiment and the subsequent results. Finally, the researcher will
offer his final interpretation of the experiment results and postulate their implications for
LITERATURE REVIEW
To fully understand the purpose and place of this study and its experiment, it is
necessary to review important concepts and debates relevant to the use of structured
analytical techniques in the US Intelligence Community. First, this chapter will define
and discuss key terms such as intelligence, structured methods, and intuition. Next, this
chapter will attempt to summarize the debate on the use of structured and unstructured
analytical methods from a variety of perspectives. These will include views from
cognitive psychology, experts from within the US Intelligence Community, and empirical
studies on the topic. Furthermore, a general description of the use of structured methods
in the US Intelligence Community will follow. This will include subsections on current
use, explanations for the non-use of structured methods, and finally an in-depth discourse
on ACH itself. This study’s hypotheses will emerge from the intersection of all these
elements.
Key Terms
While the definition of intelligence has been debated for some time, several key
characteristics are clear. Mark Lowenthal, in his book, Intelligence: From Secrets to
“requested, collected, analyzed, and provided to policy makers…”7 While this common
definition is accurate, it is missing a very important element that is integral to the purpose
7
Mark M. Lowenthal, Intelligence: From Secrets to Policy (Washington D.C.: CQ Press,
2006), 9.
conflict.”8 Therefore, the ultimate purpose of intelligence analysis is estimating the nature
of current and future events. That is, using information to clarify the likelihood or nature
From these concepts comes the Mercyhurst College Institute for Intelligence
Studies (MCIIS) definition of intelligence, which incorporates all of the above concepts
focused externally, designed to reduce the level of uncertainty for a decision maker using
information derived from all sources.”9 While the debate continues and this definition is
not definitive, it will suffice in laying the intellectual groundwork for this research.
intelligence analysis breaks down topics and ideas that are difficult to quantify into
unstructured methods.
One former CIA analyst, Stephen Marrin, defines structured analytic methods as
external observers.”11 From this, it is apparent that the key features of a structured
analytic method are that it is systematic in nature and is externalized from the human
mind - typically in some visual format. This suggests that inherent in any systematic
8
Robert. M. Clark, Intelligence Analysis: A Target-Centric Approach (Washington D.C.: CQ Press, 2007),
8.
9
Diane Chido, et al., 9.
10
Robert D. Folker Jr. Intelligence Analysis in Theater Joint Intelligence Centers: An Experiment in
Applying Structured Methods (Washington D.C.: Joint Military Intelligence College, Occasional Paper #7,
2000), 5; citing Robert M. Clark, Intelligence Analysis: Estimation and Prediction (Baltimore: American
Literary Press, Inc., 1996), 30.
11
Marrin, 7.
method of analysis is the spirit of the scientific method, defined as “principles and
procedures for the systematic pursuit of knowledge involving the recognition and
formulation of a problem, the collection of data through observation and experiment, and
critical component of intelligence. Although much reform within the US national security
intelligence, Folker states that “the root cause of many critical intelligence failures has
been analytical failure,” citing examples such as the North Korean invasion of South
Korea in 1950, the Tet Offensive in Vietnam, the fall of the Shah of Iran, and the
However, the need to improve the analytic process is not unknown within the US
Government. As early as the 1940s and through the Cold War, numerous government
improve the analytic process and production of estimates.14 More recently, the US
Commission on the Roles and Capabilities of the United States Intelligence Community
expertise among the analytical pool.”15 Amidst these recommendations, there is much
12
Merriam-Webster’s Collegiate Dictionary, 11th ed., s.v. “scientific method.”
13
Folker, 3-4.
14
Congressional Research Service Report for Congress, Proposals for Intelligence Reorganization, 1949-
2004. 2004, 6; United States Government, A Review of the Intelligence Community, (The Schlesinger
Report) (1971), 44.
15
United States Government - U.S. Commission on the Roles and Capabilities of the United States
Intelligence Community, Preparing for the 21st Century: An Appraisal of U.S. Intelligence (Washington,
D.C., 1996), 83.
debate within the US Intelligence Community on how to improve analysis and whether or
There has been a longstanding debate inside and outside of the US Intelligence
Community over the use of structured and unstructured methods for analysis and decision
making. On one side are those who believe intuitive thinking is sufficient for problem
solving and that scientific methods are inadequate when addressing the same problems.
On the other side of the debate are those who argue that structured and scientific methods
can supplement intuitive thinking and improve its quality. This debate begins with
cognitive psychology and understanding how the simplest and most basic human thought
cognition are inherent and can be detrimental to critical thinking. Specifically, the
research of Daniel Kahneman and Amos Tversky suggests that intuitive thinking can be
thought of as the mind’s shortcut mechanism to aid quick decision making. That is,
succession and assimilating that into a succinct explanation of the information being
perceived. Despite its utility in situations requiring this ability, such as deciding whether
to run from a perceived threat or stand and fight, these simplified and more efficient
cognitive processes are also inherently subject to a higher number of judgmental errors.16
These judgmental errors are believed to be caused by cognitive biases, defined as “mental
16
Amos Tversky and Daniel Kahneman, “Judgment Under Uncertainty: Heuristics and Biases,” Science
185, no. 4157, pp. 1124-1131 (1974), JSTOR (accessed March 15, 2009), 1124.
Powers and Perils, David G. Myers elaborates on these specific advantages and
judgmental errors which result from intuitive thinking. The simple advantage offered by
intuition is the ability to quickly and efficiently process large quantities of information.18
In Blink: The Power of Thinking Without Thinking, Malcolm Gladwell argues for
our ability to use this, which he calls “thin-slicing.”19 Gladwell not only advocates the use
of intuitive thinking, but also argues that it can be just as effective as, if not superior to,
real-life examples that seemingly demonstrate the efficacy of intuition, as well as the
findings of some scientific studies. However, his own discussion on the fallibility of
While speed and efficiency are two advantages of intuitive thinking, inherent
limitations in human cognition are its Achilles’ heel. Summing up the research of Herbert
Simon, Richards J. Heuer, Jr. explains the use of mindsets in human cognition:
inclination,” and as “a fixed state of mind,” serve a good purpose for the most part.21
17
Richards J. Heuer, Jr., Psychology of Intelligence Analysis (Washington D.C.: CIA Center for the Study
of Intelligence, 1999), 111.
18
David G. Myers, Intuition: Its Powers and Perils (New Haven: Yale University Press, 2002), 3-5.
19
Malcolm Gladwell, Blink: The Power of Thinking Without Thinking (New York: Back Bay Books/Little,
Brown and Company, 2007), 23.
20
Heuer, “Limits,” 78; citing Herbert Simon, Models of Man (New York: John Wiley & Sons, 1957).
21
Heuer, “Limits,” 86; Merriam-Webster’s Collegiate Dictionary, 11th ed., s.v. “mindset.”
new information quickly and efficiently by using an existing mental framework based on
However, these rigid mindsets sometimes betray our judgment because they do not adapt
well when new information challenges strongly held beliefs and preconceptions.22 One
former CIA analyst, Stanley Feder, specifically identifies mindsets as being “a major
explanatory, confirmation bias is defined as the tendency “for people to seek information
and cues that confirm the tentatively held hypothesis or belief, and not seek (or discount),
those that support an opposite conclusion or belief.”24 A relevant example of this was the
2003 to seek evidence confirming the established belief that Saddam Hussein had
There is a plethora of other cognitive biases that also plague intuitive thinking in
intelligence. These biases can manifest themselves in research strategy, perception, and
memory. One of the major criticisms of intuitive thinking is that it has the tendency to
identify the first plausible or reasonable hypothesis and seek evidence that supports this
hypothesis, known as “satisficing.”26 The problem with this method is that often the same
22
Heuer, “Limits,” 76, 81, 83, 86.
23
Stanley A. Feder. “Forecasting for Policy Making in the Post-Cold War Period,” Annual Review of
Political Science Vol. 5. (2002): 113.
24
Christopher D. Wickens and Justin G. Hollands, Engineering Psychology and Human Performance, 3rd
ed. (Upper Saddle River, NJ: Prentice Hall, 2000), 312.
25
United States Government. Commission on the Intelligence Capabilities of the United States
Regarding Weapons of Mass Destruction. Report to the President of the United States, 31 March 2005,
(Washington D.C.), p 162. <http://www.wmd.gov/report/>
26
Heuer, “Psychology,” 44.
evidence is also consistent with any number of alternative hypotheses. Given this, an
analyst risks fooling himself into thinking he has identified the most likely hypothesis,
while remaining unaware that he is overlooking other valid, and possibly more likely, alternatives. Also
among these biases is vividness bias, which is the tendency for vivid evidence to have greater
influence on our thinking than less vivid evidence, regardless of its true value.27 Another
common cognitive bias found in intuitive thinking is availability bias, which is the
tendency for people to estimate the likelihood of an event largely based on how many
relevant past instances they can recall and how easily they come to mind.28 These are
only a few among many cognitive biases that can hinder human cognition.
dependent on the absence of these biases.29 This opens an important question regarding
the utility of intuition in intelligence analysis. That is, if the efficacy of intuitive thinking
is dependent on the absence of such biases, then how prominent are these in human
cognition? Specifically, if these biases are prominent and difficult to willfully bypass, this
would suggest that intuition alone is ineffective when dealing with high-risk analytic
decision making. This is where Gladwell’s argument unravels because these biases are
pervasive and difficult to avoid in intuitive thinking. Heuer likens these biases to “optical
illusions in that the error remains compelling even when one is fully
aware of its nature. Awareness of the bias, by itself, does not produce
27
Ibid, 116.
28
Amos Tversky and Daniel Kahneman, “Availability: A Heuristic for Judging Frequency
and Probability,” Cognitive Psychology, 5 (1973): 207-232.
29
Gladwell, 72-76.
30
Heuer, “Psychology,” 112.
intuition with his book, Think: Why Crucial Decisions Can’t Be Made in the Blink of an
Eye, pointing out that many of the examples he gives are misleading or out of context.
Among these is the case of a museum which purchased what was assumed to be an
authentic Greek statue for its collection. From the start, various experts felt something
was wrong with the statue and these intuitive impressions subsequently led to the
discovery that it was a forgery. LeGault correctly points out that these initial impressions
were not really the work of pure intuition, but resulted from observers’ expertise and
States national security and intelligence infrastructure,32 the use of structured methods has
been “debated in analytic circles for decades.”33 According to Folker, “At the heart of this
believe that many factors in a given analytic problem are too complex and abstract to be
incorporated into methods that are rigid and scientific.35 Hence, as Folker sums up, this side
argues that the most effective qualitative analysis “is an intuitive process based on
31
Michael R. LeGault, Think: Why Crucial Decisions Can’t Be Made in the Blink of an Eye (New York:
Threshold Editions, 2006), 8-10.
32
Marrin, 9.
33
Ibid, 8.
34
Folker, 6.
35
Folker, 6-7, citing Richard K. Betts, “Surprise, Scholasticism, and Strategy: A Review of Ariel Levite’s
Intelligence and Strategic Surprises (New York: Columbia University Press, 1987),” International Studies
Quarterly 33, no. 3 (September 1989): 338.
instinct, education, and experience.”36 Even those who acknowledge structured methods
can improve analysis contend such improvements would be so minute that resources
Folker states, “there is also a concern that the artist [analyst] will fall in love with his art
and be reluctant to change it even in the face of new evidence. The more scientific and
objective approach encourages the analyst to be an honest broker and not an advocate.”39
These proponents argue that while subject-matter expertise has its utility, this also
heuristics, which can manifest themselves as cognitive biases.40 Heuer further makes the
case for the use of structured methods when he points out that “the circumstances
under which accurate perception is most difficult are exactly the circumstances under
situations on the basis of incomplete, ambiguous, and often conflicting information that is
empirical insight that argues for the utility of structured methods in some circumstances.
While serving as a political analyst at the CIA, Feder used one particular structured
quantitative method to forecast more than 1200 international events.42 During this time,
36
Folker, 7; citing Tom Czerwinski, ed. Coping with the Bounds: Speculations in Nonlinearity in Military
Affairs (Washington: National Defense University, 1998), 139.
37
Folker, 9.
38
Ibid, 10.
39
Ibid, 10.
40
Johnston, “Integrating Methodologists Into Teams of Substantive Experts,” 65.
41
Heuer, “Limits,” 78-79.
42
Feder, “Forecasting,” 118-119.
he found that the structured method, when “compared with conventional intelligence
analyses…had more precise forecasts without sacrificing accuracy.”43 Feder also claims
that another specific structured method used at the CIA “helped avoid analytic traps and
improved the quality of analyses by making it possible to forecast specific outcomes and
the political dynamics leading to them.”44 Also, while this method did not increase
forecasting accuracy over intuitive analysis, it did provide more nuanced results.45
methods of thinking were found to be correlated to better judgment. In his book, Expert
Political Judgment, the author aims to define indicators of good judgment, concluding,
“What experts think matters far less than how they think.”46 Tetlock uses a concept first
illustrated by Isaiah Berlin in “The Hedgehog and the Fox” from The Proper Study of
Mankind:
If we want realistic odds on what will happen next…we are better off
turning to experts who embody the intellectual traits of Isaiah Berlin’s
prototypical fox – those who ‘know many little things,’ draw from an
eclectic array of traditions, and accept ambiguity and contradiction as
inevitable features of life – than we are turning to Berlin’s hedgehogs –
those who ‘know one big thing,’ toil devotedly within one tradition, and
reach for formulaic solutions to ill-defined problems.47
In his research, Tetlock analyzed and compared the forecasts of human participants and
43
Ibid, 119.
44
Stanley A. Feder, “FACTIONS and Policons: New Ways to Analyze Politics.” Inside the CIA’s Private
World: Declassified Articles from the Agency’s Internal Journal, 1955-1992, ed. H. Bradford Westerfield
(New Haven: Yale University Press, 1995), 275.
45
Feder, “FACTIONS,” 275.
46
Philip E. Tetlock, Expert Political Judgment (Princeton: Princeton University Press, 2005), 2.
47
Tetlock, 2.
48
Ibid, 49-51.
experts and amateurs, all of whom used intuitive thinking.49 These groups made predictions
on the short and long-term futures of economic, political, and national security policies of
numerous countries.50 Examining the quantitative results, Tetlock discovered that human
noticed a level of consistency in some forecasters that clearly was not the result of
chance.51
participants’ backgrounds, belief systems, and cognitive style - how they think. The data
showed that level of education and professional experience had no correlation to better
and their forecasting accuracy. The questionnaire revealed two dominant cognitive styles:
Berlin’s fox and hedgehog.53 Statistical analysis revealed that having a fox-type
When the participants first created their forecasts, they included commentaries
explaining their thought process.55 From this information, Tetlock made numerous
generalizations about why foxes were able to forecast more accurately. These
include that foxes are reluctant to view problems through an established, rigid
framework; more cautious to explain current and future events through overly simplistic
49
Ibid, 54.
50
Ibid, 49.
51
Ibid, 7.
52
Ibid, 68.
53
Tetlock, 72-75.
54
Ibid, 78-80.
55
Ibid, 88.
looping evidence; are more emotionally neutral; and are more likely to integrate
dissonant viewpoints into their analyses.56 Interestingly, these traits are also common
effective than cognitive styles that bear resemblance to structured methods because these
Proponents of intuitive analysis make valid points about the power of intuition
and the inherent limitations of structured methods in intelligence. That is, intuitive
thinking is naturally the basis of all analysis. Also, information used in intelligence
analysis problems will sometimes not fit easily into the rigid framework of a structured
method. On the other hand, proponents of structured methods make valid points about the
potential benefits of using such methods to aid intuitive analysis. That is, structured
thinking can improve both accuracy and nuance by mitigating the effects of cognitive
bias and other judgmental errors. The research and experimentation of Tetlock and others
exclusively one or the other, but instead “a combination of both intuition and scientific
methods.”58 Both styles of thinking have their strengths and weaknesses, and nothing
suggests they could not supplement each other. While this question still deserves future
research and debate, the “either/or proposition”59 may not be the most progressive
question to ask. Instead, the more appropriate question might be when are structured
56
Ibid, 88-92, 100-107.
57
Ibid, 117-118.
58
Folker, 13.
59
Ibid, 13.
methods appropriate? Hopefully, experiments such as this one will advance our
According to Dr. Rob Johnston from the CIA Center for the Study of Intelligence,
intelligence analysts currently have access to over 200 structured analytic methods.60
Despite this, intuition appears to be the predominant style of analysis within the IC and
most experts agree that structured methods are generally unused. Specifically, one expert,
Stephen Marrin, suggests the use of structured methods is mostly limited to analysts who
are required to use a very specific methodology for a very specific purpose, such as
assertions, revealing only one analyst who claimed to routinely use a structured analytic
method.62
There are several reasons why structured methods are not widely used in the US
Intelligence Community. The primary reason for the non-use of structured methods is an
analytic culture predisposed to intuitive thinking. Specifically, Feder states that this
culture views analysts primarily as writers and summarizers of information, rather than
“methodologists” who tinker with scientific tools.63 Whether or not organizational culture
is a key factor, Folker states that in general, “most people instinctively prefer intuitive,
60
Johnston, “Integrating Methodologists Into Teams of Substantive Experts,” 65.
61
Marrin, 9.
62
Folker, 11.
63
Feder, “Forecasting,” 119.
64
Folker, 2.
Furthermore, according to Heuer, given the purpose and nature of their work, intelligence
methods to political analysts at the CIA in the 1970s, Heuer recalls that responses to the
underpinning of this skepticism, as discussed earlier, is the belief that structured methods
proponents have argued the case for structured methods, few experiments have been
Inadequate education regarding the use of structured methods is also to blame for
their non-use. Unlike many professions that have established cadres of specialists in
methodology, this is not the case with the US Intelligence Community. That is, exposure
are heavily preoccupied with their own area of expertise.69 This work environment,
understandably, does not encourage busy analysts to spend time experimenting with new
65
Folker, 14; partly citing Morgan D. Jones, The Thinker’s Toolkit, 8.
66
Heuer, Adapting Academic Methods and Models to Government Needs: The CIA Experience (Carlisle
Barracks: Strategic Studies Institute, 1978), 7.
67
Ibid, 5.
68
Marrin, 10.
69
Johnston, “Integrating Methodologists Into Teams of Substantive Experts,” 64-65.
analytical techniques. This is even more the case with more complex methods, such as
Bayesian analysis.70
improve intelligence analysis. According to the creator of the method, Richards J. Heuer,
Jr., ACH “requires an analyst to explicitly identify all the reasonable alternatives and
have them compete against each other for the analyst’s favor, rather than evaluating their
plausibility one at a time.”71 Heuer’s ACH is an eight-step process, each with a specific
70
Folker, 8; citing Captain David Lawrence Graves, USAF, Bayesian Analysis Methods for Threat
Prediction, MSSI Thesis (Washington: Defense Intelligence College, July 1993), second page of Abstract.
71
Heuer, “Psychology,” 95.
72
These are taken directly from Heuer’s eight-step ACH process as cited. Heuer, 97. A more detailed
discussion of these eight steps can be found in Chapter Eight of “Psychology.”
The first step of ACH is simply to identify all possible hypotheses, which Heuer
order to benefit from different perspectives and to reduce the likelihood that a plausible
hypothesis will not be identified.74 According to Heuer, there is no ideal number of
hypotheses for any given problem, but the number should increase relative to the level of
uncertainty.75
Figure 2.1 - Example ACH matrix from Psychology of Intelligence Analysis
While identifying hypotheses, an emphasis is placed on distinguishing
evidence in contrast to a
disproved hypothesis,
an unproven hypothesis simply because it lacks supporting evidence. Doing so can result
possible supporting evidence exists but has not been found yet.76
73
Heuer, “Psychology,” 95.
74
Ibid, 97-98.
75
Heuer “Psychology,” 98.
76
Ibid.
The next step requires listing all pertinent evidence and arguments for and against
each hypothesis. This list is not limited to hard evidence but also includes assumptions
and logical deductions about the topic. These are incorporated into the structured process
because they will often have a strong influence on an analyst’s final thoughts. After
creating the list, an analyst asks himself several questions which will help identify
additional evidence that might be needed. For each hypothesis, what evidence should an
analyst expect to be seeing or not seeing if it were true? Also, the analyst considers how
the absence of evidence could be an indicator itself.77 For example, in the case of possible
military attack, “the steps the adversary has not taken to ready his forces for attack may
be more significant than the observable steps that have been taken.”78
After the analyst is confident that all relevant evidence has been collected, step
three in the process requires constructing a matrix with the hypotheses listed across the top
and all evidence listed down the side. From this point, the analyst works across the matrix
irrelevant to that hypothesis and makes an appropriate notation for future reference. This
process is repeated for each piece of evidence until all cells in the matrix are filled. A
second objective in step three is to evaluate the diagnosticity of each piece of evidence.
That is, to evaluate its usefulness as an indicator for each hypothesis. Heuer uses a
patient is stricken with, a high temperature does not have high diagnosticity because
that symptom would apply to any number of illnesses. In the case of an ACH matrix,
77
Heuer, “Psychology,” 99; Diane Chido, et al., 39-40.
78
Heuer, “Psychology,” 99.
In the next step of the process, Heuer advises that the set of hypotheses should be
reevaluated for potential changes. After examining the evidence as it relates to each
Heuer, this is essential because the nuances of each hypothesis will greatly affect how it
is analyzed. Additionally, evidence from step three found to have no diagnostic value is
whole and tentative conclusions are formed about the likelihood of each. The analyst
works down the matrix one hypothesis at a time, trying to disprove each with the
an analyst is systematically narrowing down the possibilities until the most likely ones
are clear. The hypothesis with the least inconsistent evidence against it is viewed as the
most likely possibility.81 However, Heuer warns that ACH is not meant to be the absolute
analytic solution to any problem: “the matrix serves only as an aid to thinking and
and hypotheses and identification of those few items that really swing your judgment on
the issue.”82 In the end, the analyst must make the final call.
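The matrix steps described above can be illustrated with a minimal sketch. It assumes a simplified three-value rating scheme (consistent, inconsistent, not applicable) and an unweighted count of inconsistent evidence; the class name, method names, hypotheses, and evidence are all hypothetical and are not taken from Heuer or from the PARC software.

```python
from dataclasses import dataclass, field

# Ratings an analyst can assign to one piece of evidence under one hypothesis.
# This three-value scheme is an assumption made for illustration.
CONSISTENT, INCONSISTENT, NOT_APPLICABLE = "C", "I", "N/A"


@dataclass
class AchMatrix:
    hypotheses: list[str]
    # Maps each piece of evidence to its ratings, one rating per hypothesis.
    rows: dict[str, list[str]] = field(default_factory=dict)

    def add_evidence(self, evidence: str, ratings: list[str]) -> None:
        if len(ratings) != len(self.hypotheses):
            raise ValueError("one rating is required per hypothesis")
        self.rows[evidence] = ratings

    def is_diagnostic(self, evidence: str) -> bool:
        # Evidence rated identically under every hypothesis cannot help
        # discriminate between them, so it has no diagnostic value.
        return len(set(self.rows[evidence])) > 1

    def inconsistency_counts(self) -> dict[str, int]:
        # Count the inconsistent ratings in each hypothesis column.
        counts = {h: 0 for h in self.hypotheses}
        for ratings in self.rows.values():
            for hypothesis, rating in zip(self.hypotheses, ratings):
                if rating == INCONSISTENT:
                    counts[hypothesis] += 1
        return counts

    def least_inconsistent(self) -> list[str]:
        # Hypotheses with the fewest inconsistent ratings (ties are all returned).
        counts = self.inconsistency_counts()
        fewest = min(counts.values())
        return [h for h, n in counts.items() if n == fewest]


# Hypothetical example: the hypotheses and evidence are invented for illustration.
matrix = AchMatrix(["H1: attack is imminent", "H2: no attack is planned"])
matrix.add_evidence("Forces remain in garrison", [INCONSISTENT, CONSISTENT])
matrix.add_evidence("Hostile public rhetoric", [CONSISTENT, CONSISTENT])
matrix.add_evidence("No reserve call-up observed", [INCONSISTENT, CONSISTENT])
```

In this sketch, the "Hostile public rhetoric" row is rated the same under both hypotheses and so carries no diagnostic value, while the count of inconsistent ratings singles out the second hypothesis. As Heuer cautions, such a tally is only an aid to thinking; the analyst still makes the final judgment.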
Before finalizing the conclusion, the analyst questions the integrity of key pieces
of evidence and the repercussions if those linchpins turned out to be false, deceptive, or
79
Heuer, “Psychology,” 100-102.
80
Heuer, “Psychology,” 103.
81
Heuer, “Psychology,” 103-104.
82
Ibid, 105.
misunderstood. Finally, when reporting conclusions, the analyst discusses the likelihood
of alternative possibilities and identifies circumstances which may indicate events are
biases such as satisficing. The ACH process is a structured, systematic methodology for
identifying all the possibilities and evidence, and determining the relation between all
design of the ACH matrix illuminates evidence and hypotheses side by side, acting as an
analytic “audit trail,” for any supervisory analyst or decision maker to take advantage of.
This allows an analyst to visually explain his or her thought process, and also
as well as its strengths. The main weakness of ACH is that it can be time consuming.
While an analyst is often under time constraints, filling out an ACH matrix can be
tedious.86 However, several computer software companies, such as the Palo Alto
83
Ibid, 105-107.
84
Kristan J. Wheaton, D.E. Chido, and McManis and Monsalve Associates, “Structured Analysis of
Competing Hypotheses: Improving a Tested Intelligence Methodology” Competitive Intelligence
Magazine, November-December 2006, http://www.mcmanis-monsalve.com/assets/publications/intelligence-methodology-1-07-chido.pdf (accessed 14 June 2008).
85
Marrin, 7.
86
Kristan Wheaton, et al., 13.
Research Center (PARC), have developed programs which automate the ACH
process.87 While ACH can still be a lengthy process, these computer programs have
events, making it limited to being “only a snapshot in time.”88 As analysts are under time
constraints, they must force themselves to stop adding evidence into the matrix and begin
biases. More studies are necessary because only a limited number have been conducted
so far. Additionally, testing ACH under varying conditions will help shed light on how
In his study, conducted in conjunction with the Joint Military Intelligence College
(JMIC), Folker tested the accuracy of hypothesis testing, a structured method nearly
synonymous with Heuer’s ACH. The researcher measured this by comparing the
accuracy of two groups: one using hypothesis testing and one using an unstructured,
87
Palo Alto Research Center, “ACH2.0 Download Page,” http://www2.parc.com/istl/projects/ach/ach.html
(accessed August 19, 2008).
88
Diane Chido, et al., 50.
89
Ibid.
90
Folker, 29.
intuitive approach to the same two intelligence scenarios.91 The experimental group
performed slightly better in the first scenario using hypothesis testing, but the difference
was not statistically significant. However, the difference between control and
participants using hypothesis testing performed better than those using intuitive
analysis.92
Folker also notes that many experimental group participants “had difficulty
identifying all of the possible hypotheses and determining the consistency of each piece
that the effectiveness of structured methods depends heavily on the type of problem and
the training of each analyst. However, he concludes that an adequately trained analyst
confirmation bias and the anchoring effect. They define the anchoring effect as the
“tendency to resist change after an initial hypothesis is formed.”95 The study compared
groups working on the same intelligence problem: one group with ACH and one group
91
Ibid, 15.
92
Ibid, 29.
93
Ibid, 30.
94
Folker, 33.
95
B. Cheikes et al., Confirmation Bias in Complex Analyses (Bedford, MA: MITRE, 2004), 9.
without. They found ACH users were just as susceptible to confirmation biases as non-
ACH users, except in special circumstances. ACH did not help mitigate an anchoring
effect, but the researchers admit this result is unreliable due to testing conditions.96 A
pattern of evidence distortion was present in both ACH and non-ACH groups but this is
weighting effect was present in the study and ACH helped mitigate this, but only with
users less experienced in intelligence analysis.98 The researchers’ final conclusion is that
In 2004, Jean Scholtz conducted an evaluation of ACH with six Naval Reservists,
who used both intuitive analysis and ACH to solve different intelligence problems. All
participants were tasked with one of two intelligence problems, using intuitive analysis for the
first and ACH for the second. After completing both problems, Scholtz administered a
questionnaire to all participants regarding their experience with ACH. The answers from
these questionnaires were overwhelmingly positive toward ACH. Among the answers
provided by participants were that they felt ACH improved their analysis, it was easy to
use, and they would be inclined to use it in the future.100 The quantitative data suggested
that ACH helps users consider more hypotheses and incorporate more evidence.101
classroom at the Naval Postgraduate School (NPS). Pirolli split students at the NPS into
96
Ibid, 9.
97
Ibid, 12.
98
Ibid, iii.
99
B.A. Cheikes, et al., 16.
100
Jean Scholtz, Analysis of Competing Hypotheses Evaluation (Gaithersburg, MD: National Institute of
Standards and Technology, 2004), 1.
101
Ibid, 12.
two groups: those analyzing a problem using ACH on paper, and those using computer-
assisted ACH. In his final paper, Assisting People to Become Independent Learners in
the Analysis of Intelligence, Pirolli concluded there was little difference in ACH used on
Hypotheses
Taking into consideration the purpose and purported benefits of ACH, as well as
previous literature and studies pertinent to the subject, I developed a series of testable
hypotheses. My first hypothesis is that participants using ACH will, as a group, produce
more accurate forecasts regarding the assigned task than those using intuitive analysis.
The second hypothesis is that evidence of cognitive biases and mindsets will be more
prevalent among those using intuitive analysis, but less so among those using ACH
102
Peter Pirolli, Assisting People to Become Independent Learners in the Analysis of Intelligence
(Palo Alto Research Center, Calif.: Office of Naval Research, 2006), 63.
103
Ibid.
METHODOLOGY
Research Design
This experiment was designed with a control and experimental group and
conducted over the course of two weeks in October 2008. Both groups were tasked to
forecast the result of the 2008 Washington State gubernatorial election, which occurred
to use ACH to structure their analysis. Also, participants were organized into control and
could be measured between groups. Furthermore, the use of evidence among all
participants would be used to ascertain the presence and effects of confirmation bias.
down session to complete a task, this experiment gave participants a full week to
complete the assignment at their own convenience and they were given freedom to
collect any open source information which they viewed as relevant to the tasking. I
structured the experiment in this way to create a less artificial environment for
participants and one more similar to that in which most intelligence analysts work.
Participants
students from the Mercyhurst College Institute for Intelligence Studies (MCIIS). There
were a total of 70 students who participated in the experiment, with 38 in the control
years within each group was nearly even, except for a higher number of first year
graduate students in the control group and a higher number of second year graduate
students in the experimental group (See Figure 3.2). I placed nearly all first year graduate
students in the control group because they lacked experience in ACH at the time. I placed
most second year graduate students in the experimental group in order to even out the
Although I did not require all participants to use ACH in their tasking, I did
require that all participants had used the methodology at least once before participating in
this experiment (first year graduate students being an exception). This was done mostly
academic coursework. The exclusion of freshmen students also likely ensured an overall
In total, noticeably more students reported an affiliation with the
Republican Party than with the Democratic Party (See Figure 3.3). In the control group, the proportion of
Republicans to Democrats was around 1.5:1. In the experimental group, this proportion
was nearly 2:1. Although an even number of Republicans and Democrats in both groups
would have been ideal, the circumstances surrounding participant recruitment did not
Figure 3.3
Procedures
I spent two weeks prior to conducting the experiment visiting classes to recruit
research was on, the time and work required, and the benefits for those who participated.
The primary benefit offered was that some professors were willing to assign extra credit
the experiment, I handed out and collected signup sheets from those who were interested
(See Appendix A). The sign-up sheets requested contact information, class year, political
affiliation, and preference for four different time slots to participate in the experiment.
After collecting signup sheets and finishing recruitment, I e-mailed all students with their
assigned time slot for the experiment. I assigned the time slots myself rather than
methods,” rather than ACH. All students who participated had used ACH at least once
through coursework in the Intelligence Studies program and were familiar with the
methodology’s purpose of mitigating cognitive bias. If I had emphasized the use of the
methodology while recruiting, it might have ruined the integrity of the experiment’s
At the beginning of each tasking session, I handed out the Consent Form for each
participant to sign and return to me (See Appendix B). This Consent Form explained the
purpose of the experiment, what participation entailed, that there were no anticipated
dangers or harmful effects associated with participating, and that they may discontinue
participation at any time without penalty. After collecting Consent Forms, I handed out
experiment packets containing their tasking, answer sheet, and other relevant information
(See Appendix C). I reviewed the packet with them, explained their tasking, what was
expected during their participation, and discussed other issues related to successful
reliability.
At the end of the tasking session, participants were instructed on procedures for
returning their answer sheets for the experiment. Over the course of the next week and a
half, I, along with a colleague who offered his assistance, collected answer sheets from
participants who finished the experiment. Upon returning their answer sheet, participants
thanked students for participating, explained the purpose of the experiment in further
detail, as well as how this research would contribute to the body of academic work in
their field (See Appendix D). There were two different post-experiment surveys given to
participants, one for the control and one for the experimental (see Appendix E). The
surveys asked questions related to how much time and work was spent on the experiment,
estimated difficulty, as well as their understanding of the assigned task. The survey for
the experimental group also included questions about their understanding of ACH. The
purpose of these surveys was that, if the experiment was not successful, I would have
Control Group
After attending the tasking session, control group participants had a full week
from that date to complete their assigned task. This task was to assume the role of a
political analyst working for a fictional news company and forecast the result of the
upcoming 2008 Washington State gubernatorial election. The two hypotheses implicitly
● The incumbent governor, Christine Gregoire (D), will win the election.
Participants received some basic background information about the election and its
candidates, and were encouraged to use all available open source information, but were
The answer sheet also included a place to further explain their analytical findings, but this
was not required. The words of estimative probability (WEP) used in the experiment
were primarily based on those used by the National Intelligence Council (See Figure 3.4).
However, there were some slight modifications to accommodate the needs of the
experiment. First, the most central expression of likelihood, “even chance,” was removed.
The research design of this experiment required an analytical problem where the
likelihood of both hypotheses was so similar that, in this case, politically oriented
mindsets could tip participants’ forecast. Because the result of the election would be
difficult to call, I knew that a high number of participants would be tempted to select a
would have likely skewed the results because a high number of participants would have
supplied an answer useless to the research question. The second modification was adding
a level of likelihood between “likely” and “almost certain,” as well as its negative
equivalent on the opposite end of the scale. This is more similar to the scale of WEP used
by the students at Mercyhurst and I also felt this was more appropriate for the topic being
analyzed (See Figure 3.5). Although the Washington State gubernatorial election was
expected to be very close, I felt some participants still might desire to indicate a level of
indication of overall source reliability. Although participants were already familiar with the concept of
source reliability, their tasking sheet included a short explanation. For analytic
Lastly, I provided control group participants with suggestions for beginning their
Washington State politics and links to related resources. Additionally, since MCIIS
students are not familiar with forecasting domestic elections, I provided a list of types of
evidence that could be useful indicators for the result of a gubernatorial election (See
Appendix C).
Experimental Group
Tasking for the experimental group was identical to the control group except that
participants were required to use the Palo Alto Research Center (PARC) ACH 2.0
software to create an ACH matrix for their analyses. They were instructed to print out this
matrix and return it along with their answer sheet. During their tasking session, I
Data Analysis
The primary question of this research is whether or not ACH increases forecasting
accuracy. I sought to answer this question simply by comparing the control and
experimental groups to see if there was a significant difference between the accuracy of
their forecasts. The secondary question is whether or not ACH helps mitigate the effects
of cognitive bias and mindsets in users. If the results yield discernible patterns in
evidence only in favor of their forecasted candidate, this would suggest the presence of
confirmation bias, specifically. If such patterns existed in the control group but were less
pronounced or non-existent in the experimental group, this would suggest ACH helps
All data pertaining to the above research questions was tested for statistical
significance using a program called Statistical Package for the Social Sciences (SPSS).
Derived from a series of mathematical formulas and tests, statistical significance reflects the
likelihood that a difference as large as the one observed between control and experimental group data
would arise from mere coincidence. The SPSS tests for all data sets were set at a 5 percent (.05)
threshold for statistical significance. That is, to achieve statistical significance, the chance
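As an illustration of how a between-group comparison of forecasting accuracy can be checked against the .05 threshold, the following is a minimal sketch of a Pearson chi-square test on a 2x2 table. The choice of test is an assumption; the thesis does not state which SPSS procedure was run. The counts are hypothetical and are not the experiment's data, and the function name is invented for this example.

```python
import math


def chi_square_2x2(correct_a: int, wrong_a: int, correct_b: int, wrong_b: int) -> tuple[float, float]:
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    Assumes every row and column total is non-zero.
    """
    observed = [[correct_a, wrong_a], [correct_b, wrong_b]]
    row_totals = [sum(row) for row in observed]
    col_totals = [correct_a + correct_b, wrong_a + wrong_b]
    total = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if group membership and accuracy were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected

    # With one degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts, NOT the thesis data: 23 of 38 correct vs. 22 of 32 correct.
statistic, p_value = chi_square_2x2(23, 15, 22, 10)
significant = p_value < 0.05  # the 5 percent threshold described above
```

With these illustrative counts the P-value falls well above .05, so a difference of this size between groups of this size would not be declared statistically significant.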
RESULTS
Accuracy
At the end of the 2008 Washington State Gubernatorial Election, the incumbent
Democrat, Christine Gregoire (D), defeated the Republican challenger, Dino Rossi (R),
by a margin of 6.4 percentage points.104 After compiling and analyzing the results,105 I
found that accuracy improved from the control to experimental group by 9 percentage
the eventual winner, Gregoire (See Figure 4.1). Accuracy in the experimental group
improved slightly with 70 percent of participants forecasting Gregoire (D) as the winner.
Figure 4.1
Statistical testing found that the data on accuracy is not statistically significant,
having a P-value of .421 (See Appendix F). While this testing does not definitively
invalidate these experiment results, it does raise some doubt about their validity. Other
factors that could have prevented statistical significance are the small sample size and
between the control and experimental groups in such an experiment should not be that
great. Although many criticisms of the human thought process are valid, intuitive analysis
is not obsolete. For an experiment like this one, a structured method should only improve
overall forecasting accuracy incrementally since intuitive analysis is, for the most part, an
effective method itself. Additionally, if and when cognitive bias affects an analyst’s
intuitive thought process, structured methods such as ACH can aid as a countermeasure.
In other words, a structured method will not improve the analysis of all users. In sum, the
improvement of the group using ACH should not be discounted because it is modest.
This difference is expected and still supports the notion that ACH can improve analysis.
Mindsets
favor of the candidate associated with their own political affiliation. However, if ACH
helps mitigate this, this tendency should be less prominent. For example, if forecasts
among Republicans are significantly more in favor of Rossi (R) in the control group, but
more in sync with the actual winner of the election in the experimental group, this would
suggest that ACH helped mitigate the effect in that group. The same should hold true for
Democratic participants. However, interpreting the results will be subject to the winner of
the election. In this case, such a mindset among Democrats will be more difficult to
identify and evaluate because the Democratic candidate won. Data comparing forecasts
between Democrats and Republicans in the control and experimental groups is depicted
in Figure 4.2.
Gregoire (D) compared to Rossi (R) was strongly in favor of Gregoire and remained
nearly identical from the control to experimental group. While this might suggest the
effects of a mindset were prevalent in both groups, it is more likely this appears to be the
case not because of the influence of an actual mindset, but because Democrats
Figure 4.2
ability to estimate the number of Democrats whose forecasts were subject to a mindset.
This hypothetical number of Democrats is likely hiding somewhere among the total
more discernible results. In the control group, the proportion of forecasts between
candidates was nearly equal, with only a 4 percentage point margin favoring Gregoire (D).
However, this proportion changed dramatically in the experimental group with the
margin expanding to 36 percentage points. This suggests it is likely that ACH helped
likely that Republicans’ thought process in the control group was heavily influenced by
their political leanings and preference for the Republican candidate, while ACH mitigated
incorrectly in favor of Rossi, they displayed better calibration than their counterparts in
the control group. That is, they were arguably less wrong. Tetlock defines calibration as
“the degree to which subjective probabilities [analytic estimate] are aligned with
objective probabilities.”106 Although their estimate was wrong, their matrices generally
indicated a lower level of likelihood than that of the control group analyses. Of the 32
percent of Republicans who still got it wrong with ACH, the methodology arguably
brought them closer to forecasting correctly than those in the control group.
Like the dataset on accuracy, this data did not meet the standard for statistical
significance, having P-values of .973 and .291 for Democrats and Republicans,
respectively (See Appendix F). However, also like the dataset on accuracy, this is likely
attributable to the even smaller sample size. Breaking down participants into Democrats
and Republicans in the control and experimental groups essentially cut the sample size of
consider appropriate standards for significance with different types of research. Although
the threshold for statistical significance was set at the general standard (p=.05), it is
106
Tetlock, 47.
the statistical results for mindsets among Republicans would not even satisfy an
acceptable standard for exploratory research (.10), having a P-value of .291 is still
notable for its proximity.107 Also, this P-value essentially says that a difference this large would arise
by chance alone only about 29 percent of the time if ACH had no effect, suggesting that further research, with
Confirmation Bias
clearly reveals confirmation bias among participants in the control group. As discussed
earlier, confirmation bias is the tendency “for people to seek information and cues that
Figure 4.3
confirm the tentatively held hypothesis or belief, and not seek (or discount), those that
107
David G. Garson, Guide to Writing Empirical Papers, Theses, Dissertations (New York: Marcel
Dekker, Inc., 2002), 199.
108
Wickens and Hollands, 312.
80 percent of all participants in the control group provided evidence in their answer
sheets that entirely supported their forecasted candidate.109 On the other hand, only 9
percent of experimental group participants exhibited this behavior. The ACH matrices of
these participants show that both hypotheses were considered with varying proportions of
data, with the P-value being reported as .000, that is, less than .0005 (see Figure 4.4). In other words, according to the
calculations of the SPSS program, there is a less than one-in-a-thousand chance that the results for
confirmation bias can be attributed to coincidence. This data suggests ACH tremendously
creating their estimate reveals a staggering difference and suggests something about the
109
This data excludes eight outliers. These outliers were participants who did not provide any evidence
whatsoever along with their estimative statement.
Table 4.1
Group          Avg. # of pieces of evidence used
Control        2.9
Experimental   10.1
ability of ACH to encourage users to seek out and use more information (see Table 4.1).
In the control group, participants used on average less than 3 pieces of evidence for their
analysis. On the other hand, participants in the experimental group used on average 10
intuitive analysis and one of the strengths of ACH. One flaw of intuitive analysis is that
the human thought process is constrained by the inability to process more than a handful
of individual pieces of information at a time.110 Given this, analysts will often make a
judgment unaware that they are using an inadequate amount of information. On the other
hand, a structured method such as ACH allows a user to visualize all the information at
the same time. This will not only increase accuracy by allowing the user to better
understand the relationship of all the evidence, but also makes it easier for an analyst to
participants using intuitive analysis included fewer pieces of evidence in their analysis
because, using cognition alone, they were far less likely to identify information gaps
and also maintained a false sense of confidence in their collection before making a
forecast. For those using ACH, on the other hand, the matrix aided in both identifying
information gaps and dispelling any false sense of confidence regarding the amount of
evidence used.
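The two evidence measures discussed in this section, the share of participants whose evidence entirely supported their own forecast and the average number of pieces of evidence used, can be sketched as follows. The record layout, field names, and values are hypothetical and invented for illustration; they are not the experiment's data. Excluding participants who supplied no evidence follows footnote 109 for the first measure, and applying the same exclusion to the average is an assumption.

```python
from statistics import mean

# Each participant: the forecast candidate and, for every piece of evidence
# cited, the candidate that piece of evidence favors. All values are invented.
participants = [
    {"forecast": "Gregoire", "evidence_favors": ["Gregoire", "Gregoire", "Gregoire"]},
    {"forecast": "Rossi", "evidence_favors": ["Rossi", "Gregoire", "Rossi", "Rossi"]},
    {"forecast": "Gregoire", "evidence_favors": []},  # no evidence supplied
]


def is_one_sided(participant: dict) -> bool:
    """True when every cited piece of evidence favors the forecast candidate."""
    return all(side == participant["forecast"] for side in participant["evidence_favors"])


# Participants who supplied no evidence are treated as outliers and excluded.
with_evidence = [p for p in participants if p["evidence_favors"]]

# Share of remaining participants citing only evidence for their own forecast.
share_one_sided = sum(is_one_sided(p) for p in with_evidence) / len(with_evidence)

# Average number of pieces of evidence per remaining participant.
average_pieces = mean(len(p["evidence_favors"]) for p in with_evidence)
```

In this invented sample, one of the two participants who supplied evidence cited only supporting evidence, and the average is 3.5 pieces per participant.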
There were no discernible patterns in the words used to describe the estimative
probability assigned to the results (the WEPs) among the control and experimental groups
related to cognitive bias. As can be seen in Figure 4.5, participants in both groups
overwhelmingly used “likely” as the WEP in their estimative statement. I expected this
110
George A. Miller, “The Magical Number Seven—Plus or Minus Two: Some Limits on our Capacity for
Processing Information,” The Psychological Review, Vol. 63, No. 2 (March 1956): 1-12.
result because of the close nature of the election. Average analytic confidence among
both groups was very close, with the control group averaging 6.1 on a scale of 10 and the
experimental group averaging 5.9. Analyst assessments of source reliability were very
similar among both groups and sub-groups within, with an overwhelming number of
scale. This consistency likely has less to do with the method and more to do with the
Figure 4.5
Summary of Results
The findings discussed in this section suggest that ACH is modestly effective for
improving accuracy and very effective at reducing the effects of mindsets and cognitive
bias in intelligence analysis. ACH slightly improved accuracy among users in the
politically oriented mindset regarding the Republican candidate. This was not the case
with Democrats, but this was likely because the Democratic candidate won the election,
hindering the ability to discern any difference between the control and experimental
groups. Regarding the use of evidence, ACH users incorporated substantially more
evidence into their analysis and applied it more appropriately. Specifically, a tendency
among nearly all control group participants to only incorporate evidence in favor of their
CONCLUSION
The main purpose of this study was to ascertain whether or not ACH is effective
for estimation and forecasting in intelligence analysis. The secondary purpose was to
determine whether or not the methodology is effective for mitigating cognitive bias and
other phenomena detrimental to intelligence analysis. While most of these results are not
definitive, they all support the notion that ACH can improve intelligence analysis.
The results of this experiment revealed that ACH improved forecasting accuracy,
but only modestly. With the exception of one component of Folker’s experiment, where
ACH/hypothesis testing performed drastically better than intuition, the minute difference
in accuracy between the control and experimental groups in this study is consistent with
A common variable in both these experiments was that the objective likelihoods
of the given hypotheses were very close. On the other hand, in the component of Folker’s
experiment where ACH/hypothesis testing performed drastically better, it was clear that
one of the given hypotheses was much more likely than the others.111 This suggests,
perhaps, that ACH is less effective with those problems where the objective probabilities
of each hypothesis are roughly equal and more so when they are slightly more uneven.
This inference helps us identify when ACH is most appropriate to use. In this
case, the results on accuracy have shed light on the utility of the methodology with
problems subject to varying objective probabilities among the given hypotheses. This
experiment and previous ones already suggest that ACH is less useful where those
probabilities are roughly equal. On the other end of the spectrum, when those
specific, the accumulated data suggests ACH may only be effective where the objective
probability of the most likely hypothesis is at least 10-15 percentage points above the
next most likely hypothesis. Such a probabilistic “distance” should allow the rough tool
111
These facts are derived from observing Folker’s a priori evaluation of the intelligence scenarios and given
evidence.
that ACH is (compared to more refined statistical measurements) to distinguish the more
likely hypothesis from the less likely ones. On the other hand, as the objective probability
of the most likely hypothesis rises more than 30-45 percentage points above the next most likely
hypothesis, ACH or, indeed, any structured method will become increasingly
unnecessary. The differences between the two hypotheses will be “visible to the naked
eye,” in a manner of speaking. The graph in Figure 5.1 demonstrates this concept for a
Figure 5.1
said, this suggestion may well provide avenues for future research into the utility of
ACH. Given this idea, a number of future experiments could be designed to shed further
light on ACH’s utility in varying circumstances. A subsequent experiment could test the
methodology’s utility with two hypotheses when the objective probabilities are more
hypotheses. The analytic problem in this experiment contained only two hypotheses;
however, future experiments could test ACH against a problem with more than two
ACH also appeared to mitigate the effects of politically oriented mindsets among
some participants; however, this is uncertain because of the conditions for measuring
such an effect. Overall, I was surprised that the difference was not more
pronounced. I confidently expected, given an analytic problem
with close objective probabilities for each hypothesis, that politically oriented mindsets
would be present and would tip the balance in many participants’ forecasts. This
appeared to be the case with Republican participants, but at a far smaller magnitude than
expected. Anecdotally, I feel that the disparity in evidence used by participants was partly
For future tests like this one, an overall larger sample size would also be
beneficial since these tests required breaking down participants further into subsets
within each group, creating even smaller data sets and decreasing their reliability. This
suggestion is not meant to cast doubt on the interpretation that ACH helped mitigate the
why this tendency was less evident than expected. The influence of mindsets was present,
but the researcher believes a similar test with a larger sample size would have likely
Confirmation bias was clearly evident among those using intuitive analysis in the
control group. On the other hand, the near absence of this bias in the experimental group
suggests that ACH substantially reduced it. This finding is
unique and unlike previous studies in several ways. First, the method of measuring and
discerning such an effect is vastly different from that of Cheikes et al. Rather than
focusing on evidence distortion for discerning the presence of confirmation bias, the
researcher derived his conclusion solely from the comparative use of evidence and how it
related to analysts’ forecasts. This is more in line with Wickens and Holland’s
definition of confirmation bias, which emphasizes the idea of seeking and incorporating
unfavorable to a preferred hypothesis. Lastly, the substantial difference between the two
groups is also unlike any other finding on ACH and confirmation bias. This difference
demonstrates that ACH is excellent for encouraging analysts to incorporate and weigh a
Overall, the differences in evidence among those using intuitive analysis and
those using ACH were staggering, not just in how the evidence was used but even
in the amount of evidence used. ACH users incorporated a significantly higher
average number of pieces of evidence. This demonstrates that their analyses were overall
accountability derived from the use of structured methods. For every participant using
ACH, I can easily check every piece of evidence they used as well as how that evidence
contributed to their final conclusion. This was somewhat the case with the intuitive
thinkers, most of whom listed the evidence they used. However, their lists are nowhere as
One possible flaw in this study which might have prevented more definitive
results was the varying evidence used among participants. While allowing participants to
collect their own information led to its own insights such as the finding on confirmation
bias, this created a less than ideal environment for comparing some results among users.
For example, did some of the experimental group participants forecast incorrectly
because using ACH was ineffective or because their research led to incorrect or
evidence it contains.112 While this aspect of the methodology created some interesting and
valid results, it unfortunately creates some level of uncertainty about other results.
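The dependence of an ACH result on the evidence entered can be illustrated with a minimal sketch. The hypotheses H1 and H2, the evidence items E1 through E3, and their ratings are hypothetical placeholders, not data from this experiment, and the simple tally of inconsistent ratings is a simplified stand-in for the methodology rather than the procedure or tool participants used.

```python
# Hypothetical ACH matrix: every name and rating below is invented for illustration.
# Ratings: "C" = consistent, "I" = inconsistent, "N" = neutral or not applicable.
HYPOTHESES = ("H1", "H2")

FULL_MATRIX = {
    "E1": {"H1": "C", "H2": "I"},
    "E2": {"H1": "I", "H2": "C"},
    "E3": {"H1": "C", "H2": "I"},
}


def inconsistency_scores(matrix):
    """Count the 'I' ratings recorded under each hypothesis."""
    scores = {hypothesis: 0 for hypothesis in HYPOTHESES}
    for ratings in matrix.values():
        for hypothesis, rating in ratings.items():
            if rating == "I":
                scores[hypothesis] += 1
    return scores


def least_inconsistent(matrix):
    """Return the hypothesis (or tied hypotheses) with the fewest 'I' ratings."""
    scores = inconsistency_scores(matrix)
    fewest = min(scores.values())
    return [hypothesis for hypothesis in HYPOTHESES if scores[hypothesis] == fewest]


# An analyst who collected all three items versus one who found only E2.
partial_matrix = {"E2": FULL_MATRIX["E2"]}

print("All evidence:", inconsistency_scores(FULL_MATRIX), least_inconsistent(FULL_MATRIX))
print("Only E2:     ", inconsistency_scores(partial_matrix), least_inconsistent(partial_matrix))
```

In this sketch the analyst with all three items favors H1, while the analyst who found only E2 favors H2, even though both applied the same procedure. Because each rating is recorded, the matrix also shows exactly which items drove each conclusion.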
participants with a base set of evidence, but like this experiment, allow them within their
given period of participation to seek out additional information. Providing a base set of
evidence would help control for the varying evidence used among participants but still
maintain conditions conducive to testing for mindsets and confirmation bias. Also, this
base set of evidence would act as a benchmark to compare to any additional information
participants collect – improving the ability to measure confirmation bias. However future
studies on ACH are structured, it will benefit our understanding of the methodology for it
The results of this experiment support my hypotheses that ACH can improve
forecasting accuracy and that it aids in mitigating biases and other cognitive phenomena.
However, these results are far from definitive, and more research is needed that validates these
findings and tests ACH in varying conditions. Doing so will continue to expand our
understanding of the methodology and support efforts to improve the United States’
112 Heuer, “Psychology,” 109.
methods which can improve their analysis. These analysts already have access to over
200 analytic methods – ACH being one of them. Taking into consideration both the need
for the use of such methods and the demonstrated ability of ACH to improve analysis,
there is no reason not to take advantage of structured methods when
appropriate. Hence, the last step to improving intelligence analysis with structured
methods is innovative analysts willing to incorporate these tested methods into their daily
work. In answering the research question, I hope these findings promote the use of
structured methods that can improve the overall quality of intelligence analysis in the US
Intelligence Community.
BIBLIOGRAPHY
Cheikes, B.A., et al., Confirmation Bias in Complex Analyses. Technical Report No.
MTR 04B0000017. (Bedford, MA: MITRE, 2004).
Chido, Diane and Richard M. Seward, Jr., eds. Structured Analysis of Competing
Feder, Stanley A. “FACTIONS and Policons: New Ways to Analyze Politics.” Inside
the CIA’s Private World: Declassified Articles from the Agency’s Internal
Journal 1955-1992, ed. H. Bradford Westerfield. New Haven: Yale University
Press, 1995.
Feder, Stanley A. “Forecasting for Policy Making in the Post-Cold War Period.” Annual
Review of Political Science Vol. 5. (2002): 113-119.
Folker, Jr., Robert D. Intelligence Analysis in Theater Joint Intelligence
Centers: An Experiment in Applying Structured Methods. Washington D.C.: Joint
Military Intelligence College, Occasional Paper #7, 2000.
Garson, David G. Guide to Writing Empirical Papers, Theses, Dissertations. New York:
Marcel Dekker, Inc., 2002.
Gladwell, Malcolm. Blink: The Power of Thinking Without Thinking. New York: Back
Bay Books/Little, Brown and Company, 2007.
Heuer, Jr., Richards J. Adapting Academic Methods and Models to Governmental Needs:
The CIA Experience. Carlisle Barracks: Strategic Studies Institute, 1978.
Heuer, Jr., Richards J. “Limits of Intelligence Analysis.” Orbis, Winter 2005, 75-94.
Heuer, Jr., Richards J. Psychology of Intelligence Analysis. Washington D.C.: CIA Center
for the Study of Intelligence, 1999.
Johnston, Rob. Analytic Culture in the US Intelligence Community: An Ethnographic
Study. Washington D.C.: Center for the Study of Intelligence, 2005.
LeGault, Michael R. Think: Why Crucial Decisions Can’t Be Made in the Blink of an
Eye. New York: Threshold Editions, 2006.
Miller, George A. “The Magical Number Seven—Plus or Minus Two: Some Limits on
our Capacity for Processing Information.” The Psychological Review, Vol. 63, No. 2
(March 1956): 1-12.
Myers, David G. Intuition: Its Powers and Perils. New Haven: Yale University Press,
2002.
Tversky, Amos and Daniel Kahneman. “Availability: A Heuristic for Judging Frequency
and Probability.” Cognitive Psychology 5 (1973), 207-232.
Tversky, Amos and Daniel Kahneman. “Judgment Under Uncertainty: Heuristics and
Biases.” Science 185, no. 4157 (1974). JSTOR (accessed March 15, 2009).
United States Government - U.S. Commission on the Roles and Capabilities of the
United States Intelligence Community, Preparing for the 21st Century: An
Appraisal of U.S. Intelligence. Washington, D.C., 1996.
Wheaton, Kristan J., D.E. Chido, and McManis and Monsalve Associates. “Structured
Analysis of Competing Hypotheses: Improving a Tested Intelligence
Methodology.” Competitive Intelligence Magazine, November-December 2006.
http://www.mcmanis-monsalve.com/assets/publications/intelligence-methodology-1-07-chido.pdf
(accessed 14 June 2008).
APPENDICES
Structured Methods
Experiment
Sign-Up Form
Name:
Class Year:
Phone Number:
E-mail Address:
_________________________________ __________________
Signature Date
_________________________________ __________________
If you have any further questions about analytic methodology or this research
you can contact me at abrasf87@mercyhurst.edu.
You are a high-profile political analyst working for News Corporation X. You have been tasked to
forecast the winner of the 2008 Washington State Gubernatorial election, which will be decided
on November 4, 2008. To complete your task, use all available open source information. The
main candidates in this race are Christine Gregoire (D) and Dino Rossi (R). This will be a rematch
from the previous Washington State Gubernatorial election, which was hotly contested and
controversial. Your supervisor gave you a full week to prepare your forecast.
Use the National Intelligence Council (NIC) Words of Estimative Probability (WEP) as an indicator
of your forecast:
Example Forecast:
It is [WEP] that [Candidate Name] Will Win the 2008 Washington State Gubernatorial Election.
Record your final answers on the provided answer sheet. This answer sheet includes spaces for
your final estimate (WEP), Source Reliability, Analytic Confidence, and a short explanation of how
the evidence and subsequent analysis led to your final forecast. Please return all of the described
materials to the experiment administrator by the due date in order to receive extra credit from
your professor.
Important Information:
Source Reliability:
Source Reliability reflects the accuracy and reliability of a particular source over time.
Sources with high reliability have been proven to be accurate and consistently reliable.
Sources with low reliability lack the accuracy and proven track record commensurate with
more reliable sources.
o Rate source reliability as low, medium, or high.
Analytic Confidence:
Analytic Confidence reflects the level of confidence an analyst has in his or her estimates
and analyses. It is not the same as using words of estimative probability, which indicate
likelihood. It is possible for an analyst to suggest an event is virtually certain based on
the available evidence, yet have a low amount of confidence in that forecast due to a
variety of factors or vice versa.
o To assess analytic confidence, mark your rating on the line given on the answer sheet.
The far left represents the lowest level of confidence while the far right represents absolute
confidence in your analytic judgment.
You are a high-profile political analyst working for News Corporation Y. You have been tasked to
forecast the winner of the 2008 Washington State Gubernatorial election, which will be decided
on November 4, 2008. To complete your task, use all available open source information. Also,
use ACH to structure your analysis. The main candidates in this race are Christine Gregoire
(D) and Dino Rossi (R). This will be a rematch from the previous Washington State Gubernatorial
election, which was hotly contested and controversial. Your supervisor gave you a full week
to prepare your forecast.
Use the National Intelligence Council (NIC) Words of Estimative Probability (WEP) as an indicator
of your forecast:
Example Forecast:
It is [WEP] that [Candidate Name] Will Win the 2008 Washington State Gubernatorial Election.
Record your final answers on the provided answer sheet. This answer sheet includes spaces for
your final estimate (WEP), Source Reliability, Analytic Confidence, and a short explanation of how
the evidence and subsequent analysis led to your final forecast. Also include a print out of your
ACH matrix when returning the above materials. Please return all of the described materials to
the experiment administrator by the due date in order to receive extra credit from your professor.
Important Information:
Source Reliability:
Source Reliability reflects the accuracy and reliability of a particular source over time.
Sources with high reliability have been proven to be accurate and consistently reliable.
Sources with low reliability lack the accuracy and proven track record commensurate with
more reliable sources.
o Rate source reliability as low, medium, or high.
Analytic Confidence:
Analytic Confidence reflects the level of confidence an analyst has in his or her estimates
and analyses. It is not the same as using words of estimative probability, which indicate
likelihood. It is possible for an analyst to suggest an event is virtually certain based on
the available evidence, yet have a low amount of confidence in that forecast due to a
variety of factors or vice versa.
o To assess analytic confidence, mark your rating on the line given on the answer sheet.
The far left represents the lowest level of confidence while the far right represents absolute
confidence in your analytic judgment.
Answer Sheet
NAME:
FORECAST:
ANALYTIC CONFIDENCE:
[Rating line for marking analytic confidence; each end is labeled with a level of confidence]
Starting point:
http://www.politics1.com/wa.htm
Google/Google News
● Incumbent/challenger popularity
● Election Polls
● Campaign spending
● Local issues relevant to the election
● Party issues
● National party support of incumbent/challenger
● Local economy
● State voting trends
● Voter registration
● Past elections
● Candidate debates
*This is not a list of required evidence to collect, but types of evidence that could be an indicator
for an election.
Participation Debriefing
Thank you for participating in this research process. I appreciate your contribution and willingness to
support the student research process.
The purpose of this study was to determine how well ACH mitigates cognitive bias and how accurate the
methodology is for forecasting in intelligence analysis, compared to unstructured methods. Only a handful
of experimental studies have been conducted on ACH, and this research hopes to contribute to the growing
body of literature on structured analytical methods. The experiment you participated in was designed to
test ACH’s capabilities against an unstructured method. Specifically, participants were organized into
experimental and control groups by political affiliation so that factors of interest could be measured.
As the US Intelligence Community faces recent intelligence failures, the use of advanced analytical
techniques will enhance the community’s quality of analysis and benefit US national security.
If you have any further questions about the Analysis of Competing Hypotheses or this research you can
contact me at abrasf87@mercyhurst.edu.
Follow-Up Questionnaire
Control Group
Thanks for your participation! Please take a few moments to answer the following questions. Your
feedback is greatly appreciated. Your response to these questions will NOT affect whether or not you
receive extra credit.
1. How much time did you spend working on the assigned task (hours)?
2. Why did you agree to participate in the experiment? (extra credit, other, etc.)
3. Do you feel you understood the assigned task as explained at the instruction session?
4. Were you able to find adequate open source information about the topic?
5. Please rate the level of difficulty in finding open source information related to the
topic:
1=Very difficult 5=Very Easy
1 2 3 4 5
6. Please provide any additional comments you may have about the Analysis of
Competing Hypotheses, the assigned task, or any other part of this experiment.
Follow-Up Questionnaire
ACH Group
Thanks for your participation! Please take a few moments to answer the following questions. Your
feedback is greatly appreciated. Your response to these questions will NOT affect whether or not you
receive extra credit.
1. How much time did you spend working on the assigned task (hours)?
2. Why did you agree to participate in the experiment? (extra credit, other)
3. Do you feel you understood the assigned task as explained at the instruction session?
4. Were you able to find adequate open source information about the topic?
5. Please rate the level of difficulty in finding open source information related to the
topic:
1=Very difficult 5=Very Easy
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
9. Please provide any additional comments you may have about the Analysis of
Competing Hypotheses, the assigned task, or any other part of this experiment.
Accuracy
Group Statistics
                    Group          N    Mean     Std. Deviation   Std. Error Mean
Forecast            Control        38   1.3947   .49536           .08036
                    Experimental   30   1.3000   .46609           .08510
Mindsets – Democrats
Ranks
Test Statistics
Mindsets – Republicans
Ranks
Test Statistics
Confirmation Bias
Group Statistics
                    Group          N    Mean     Std. Deviation   Std. Error Mean
Confirmation Bias   Control        30   1.2000   .40684           .07428
                    Experimental   32   1.9063   .29614           .05235
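The Std. Error Mean column in the Group Statistics tables can be reproduced from N and the standard deviation as SD / sqrt(N). The sketch below recomputes it for each row and also derives an unequal-variance (Welch) t statistic from the same summary figures. The choice of Welch's statistic is an assumption made for illustration; this appendix does not state which test produced the reported results, and the function names are hypothetical.

```python
import math

# Summary rows copied from the Group Statistics tables:
# (measure, group) -> (N, mean, std. deviation, reported std. error of the mean)
GROUP_STATS = {
    ("Forecast", "Control"): (38, 1.3947, 0.49536, 0.08036),
    ("Forecast", "Experimental"): (30, 1.3000, 0.46609, 0.08510),
    ("Confirmation Bias", "Control"): (30, 1.2000, 0.40684, 0.07428),
    ("Confirmation Bias", "Experimental"): (32, 1.9063, 0.29614, 0.05235),
}


def standard_error(n, std_dev):
    """Standard error of the mean: SD divided by the square root of N."""
    return std_dev / math.sqrt(n)


def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unequal-variance t statistic from summary data (illustrative assumption)."""
    se_difference = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se_difference


for (measure, group), (n, mean, sd, reported) in GROUP_STATS.items():
    computed = standard_error(n, sd)
    print(f"{measure:18s} {group:13s} SE = {computed:.5f} (reported {reported:.5f})")

for measure in ("Forecast", "Confirmation Bias"):
    n1, m1, s1, _ = GROUP_STATS[(measure, "Control")]
    n2, m2, s2, _ = GROUP_STATS[(measure, "Experimental")]
    print(f"{measure}: Welch t (control minus experimental) = {welch_t(n1, m1, s1, n2, m2, s2):.3f}")
```

The recomputed standard errors agree with the reported column to within rounding, which is a quick consistency check on the transcribed tables.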