THE FORECASTING ACCURACY AND EFFECTIVENESS OF COMPLEXITY MANAGER

LINDY-JO SMART

A Thesis Submitted to the Faculty of Mercyhurst College In Partial Fulfillment of the Requirements for The Degree of MASTER OF SCIENCE IN APPLIED INTELLIGENCE

DEPARTMENT OF INTELLIGENCE STUDIES
MERCYHURST COLLEGE
ERIE, PENNSYLVANIA
APRIL 2011

DEPARTMENT OF INTELLIGENCE STUDIES
MERCYHURST COLLEGE
ERIE, PENNSYLVANIA

THE FORECASTING ACCURACY AND EFFECTIVENESS OF COMPLEXITY MANAGER

A Thesis Submitted to the Faculty of Mercyhurst College In Partial Fulfillment of the Requirements for The Degree of MASTER OF SCIENCE IN APPLIED INTELLIGENCE

Submitted By:
LINDY-JO SMART

Certificate of Approval:

___________________________________
Kristan J. Wheaton
Associate Professor
Department of Intelligence Studies

___________________________________
William J. Welch
Instructor
Department of Intelligence Studies

___________________________________
Phillip J. Belfiore
Vice President
Office of Academic Affairs

April 2011

Copyright © 2011 by Lindy-Jo Smart All rights reserved.


ACKNOWLEDGEMENTS First, I would like to thank Kris Wheaton for his incredible guidance and patience through this process and for always having the time to sit down and work through challenges. I would like to thank Bill Welch for taking on the role as my secondary reader. I would like to thank Hema Deshmukh for helping me complete all statistics in this thesis and for her patience throughout the process. I would like to thank Richards Heuer for his personal correspondence throughout the experiment creation process. I would like to thank all faculty members in the Intelligence Department at Mercyhurst College for their dedication, guidance, and for offering such a rewarding challenge that is the Intelligence Studies graduate program. I would also like to thank my friends and family for their continual support, encouragement, and patience with me the past two years.


ABSTRACT OF THE THESIS

The Forecasting Accuracy and Effectiveness of Complexity Manager
A Critical Examination

By Lindy-Jo Smart
Master of Science in Applied Intelligence
Mercyhurst College, 2011
Associate Professor Kristan J. Wheaton, Chair

The purpose of this study was to assess the forecasting accuracy and effectiveness of the structured analytic technique Complexity Manager. The study included an experiment in which Mercyhurst College Intelligence Studies graduate and undergraduate students were placed into small groups to assess an intelligence problem and forecast using either intuition or the structured analytic technique Complexity Manager. Data were collected using a researcher-created Forecasting Answering Sheet that recorded the variables each group considered and a researcher-created questionnaire that captured individual responses to the study. Students who used Complexity Manager spent significantly more time working with their groups than the groups that used intuition alone. Students who used intuition alone generated a greater number of variables. However, there was no connection between the generation of variables and forecasting accuracy; the experiment groups produced more accurate forecasts. The results of the study show that the use of Complexity Manager may increase forecasting accuracy: three out of 24 control groups forecasted accurately, while six out of 23 experiment groups forecasted accurately. The use of this structured analytic technique therefore increased collaboration and produced more accurate results than intuition alone. To produce more statistically sound results, further studies will need a higher level of participation.
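As a rough illustration of the sample-size point above, the reported group-level counts (3 of 24 control groups and 6 of 23 experiment groups forecasting accurately) can be compared using a small-sample test such as Fisher's exact test. The sketch below is illustrative only, assumes Python with scipy is available, and is not the analysis performed in the thesis.

```python
# A minimal sketch: comparing group-level forecasting accuracy with Fisher's
# exact test. The counts come from the abstract (3/24 control, 6/23 experiment
# groups accurate); the choice of test is illustrative, not the thesis's own
# analysis.
from scipy.stats import fisher_exact

control_accurate, control_total = 3, 24
experiment_accurate, experiment_total = 6, 23

# 2x2 contingency table: rows = group type, columns = accurate / not accurate
table = [
    [control_accurate, control_total - control_accurate],
    [experiment_accurate, experiment_total - experiment_accurate],
]

odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"Control accuracy:    {control_accurate / control_total:.0%}")
print(f"Experiment accuracy: {experiment_accurate / experiment_total:.0%}")
print(f"Fisher's exact test: odds ratio = {odds_ratio:.2f}, p = {p_value:.3f}")
# With samples this small, a doubling of the accuracy rate can still fail to
# reach conventional significance, which is why larger follow-up studies are
# suggested.
```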


TABLE OF CONTENTS

LIST OF FIGURES
CHAPTER 1: INTRODUCTION
CHAPTER 2: LITERATURE REVIEW
    Intelligence Failures
    Cognitive Bias
    Unstructured and Structured Techniques
    Collaboration
    Complexity Manager
    Hypothesis
CHAPTER 3: METHODOLOGY
    Setting
    Participants
    Intervention and Materials
    Measurement Instruments
    Data Collection and Procedures
    Data Analysis
CHAPTER 4: RESULTS
    Survey Responses
    Group Analytic Confidence
    Group Source Reliability
    Variables
    Forecasting Accuracy
    Quality of Variables
CHAPTER 5: CONCLUSIONS
    Discussion
    Limitations
    Recommendations for Further Research
    Conclusions
REFERENCES
APPENDICES
    Appendix A
    Appendix B
    Appendix C
    Appendix D
    Appendix E
    Appendix F
    Appendix G
    Appendix H
    Appendix I
    Appendix J
    Appendix K


LIST OF FIGURES

Figure 2.1  Complexity Manager: Cross-Impact Matrix
Figure 3.1  Number of Participants Per Academic Class
Figure 4.1  Source Reliability Per Group
Figure 4.2  Forecasting Per Group
Figure 5.1  Analytic Confidence Per Group




CHAPTER 1: INTRODUCTION
The entire project cost $40,000 and consisted of 1,200 experiments in a 14-month period ("Edison Gets the Bright Light Right," 2009). The team searched the world, testing materials from beard hair to fishing line to bamboo. Over 40,000 pages of notes were taken ("Edison's Lightbulb at The Franklin Institute," 2011). Then, in 1879, after testing over 1,600 materials, Thomas Edison and his associates found a filament that would burn for 15 hours. Edison stated, "I tested no fewer than 6,000 vegetable growths, and ransacked the world for the most suitable filament material" ("Edison's Lightbulb at The Franklin Institute," 2011). By 1880, Edison had produced a bulb that could last for 1,500 hours and was then placed on the market ("Light Bulb History – Invention of the Light Bulb," 2007).

The success of Edison's invention, as with other scientific findings, is evident because it produces tangible and, in this case, visual results. The longer a filament burned, the more successful it was. Failure was depicted through nothingness: a filament that didn't burn for any noticeable length of time. When considering Edison's filament experiments compared to experiments in other fields of study, are the results just as definite and stunning? In the context of intelligence, the two couldn't be more different. If intelligence is successful, nothing may happen. But if intelligence fails, the results could be catastrophic. Intelligence is not something that can be tested within the confines of a vacuum bulb, nor can thousands of possible solutions be tested before the problem can be resolved. But it can and must be improved somehow. One improvement can come through the use of structured analytic techniques. However, though structured techniques exist, not all analysts use them because their effectiveness has not been proven and constraints further hinder use.

Therefore, the testing of these methods is needed to increase their use within the Intelligence Community (IC) and determine the validity of each method. In "Assessing the Tradecraft of Intelligence Analysis," Gregory F. Treverton and C. Bryan Gabbard define structured analytic techniques as "technologies, products, or processes that will help the analyst in three ways…searching for and dealing with data… building and testing hypotheses…and third, in communicating more easily both with those who will help them do their work" (2008, p. 18). Though these techniques are created to assist analysts, there is no consensus on the need for or value of them (Treverton & Gabbard, 2008). Some analysts willingly use them while others prefer not to. Therefore, even though a large number of structured analytic techniques are available to analysts, without proper training and understanding, the techniques are useless.

Another major issue is that many of these structured analytic techniques have yet to be tested. Richards Heuer states that the concept of structured analytic techniques began in the 1980s when Jack Davis began teaching and writing about "alternative analysis" (Heuer & Pherson, 2010, p. 8). Even though the concept has been around for almost thirty years, it remains largely untested. Therefore, without proper training in the use of structured analytic techniques and without proven results to give the methods authority, the use of structured analytic techniques will remain limited.

Richards Heuer states that all structured analytic techniques provide the same benefit: they guide communication among analysts who need to share evidence, provide alternative perspectives, and discuss the significance of evidence (2009). Structured analytic techniques involve a step-by-step process that externalizes an analyst's thoughts so that those thoughts and ideas can be reviewed and discussed at each step of the process. The techniques provide structure to individual thought processes and to the interaction among collaborators to help generate hypotheses and mitigate cognitive limitations (Heuer, 2009).

Heuer suggests the evaluation of structured analytic techniques because the only testing the IC has done is through experience and through a small number of colleges and universities that offer intelligence courses (Heuer, 2009). He further states that there is no systematic program established for evaluating and validating the techniques (Heuer & Pherson, 2010). To resolve this issue, Heuer suggests conducting experiments with analysts using the technique to analyze typical intelligence issues (Heuer & Pherson, 2010). Heuer states that the most effective approach to evaluating the "techniques is to look at the purpose for which a technique is being used, and then to determine whether or not it actually achieves that purpose, or if there is some better way" (2009, p. 5).

Structured analytic techniques are needed to mitigate limitations such as organizational and individual bias and to decrease the number and negative effects of intelligence failures. Though structured analytic techniques may be an effective way to mitigate these issues, few have been tested, including Complexity Manager, the subject of this study. Therefore, testing Complexity Manager would increase the validity of structured analytic techniques, particularly this specific technique. The purpose of this study was to conduct an experiment to test the effectiveness of the methodology Complexity Manager according to Richards Heuer's recommendation of using intelligence analysts to analyze a typical intelligence issue. This thesis is one of thousands of filaments that need to be tested to ensure that structured analytic techniques are serving their purpose: to give the visible, tangible results needed in the intelligence field of study.


CHAPTER 2: LITERATURE REVIEW
Since the creation of the Intelligence Community (IC), issues surrounding the validity and soundness of its forecasts have surfaced due to intelligence failures. The IC has taken steps to proactively avoid further significant intelligence failures. However, debates exist both about the causes of an intelligence failure and about how preventable it actually is. Further, the methods used for reaching a forecast are not universally agreed upon; opinions on whether it is more effective to use structured analytic techniques rather than intuition alone vary among analysts. Using structured analytic techniques gives the analyst greater confidence in their forecast, especially when managing a complex situation with multiple hypotheses. However, the very techniques that the analyst uses have not been proven effective. One reason is that, because structured analytic techniques assist the analyst in forecasting the outcome of future events, it is difficult to state with absolute certainty that the technique is effective. In other words, the technique may have been used properly, but the forecast may still have been wrong. Regardless, the testing of structured analytic techniques is essential not only for validating the method itself, but for the validity of the IC's ability to forecast. Therefore, the testing of Richards Heuer's Complexity Manager is an initial step toward assessing the strength of this structured analytic technique.

The literature will address four areas related to the testing of Richards Heuer's technique, Complexity Manager. The first section will address research related to intelligence failures and will be followed in the second section by a proposed cause of them, cognitive bias. The third section will focus on research about the use of unstructured versus structured analytic techniques.

Finally, the fourth section will discuss research related to the use of collaboration and its connection to Complexity Manager.

Intelligence Failures

For the purpose of this thesis, "intelligence" will be defined according to the Mercyhurst College Institute for Intelligence Studies (MCIIS) definition created by Kristan Wheaton and Michael Beerbower, which states that intelligence is "a process focused externally, designed to reduce the level of uncertainty for a decision maker using information derived from all sources" (2006, p. 319). For the purpose of this thesis, "intelligence failure" will be defined according to Gustavo Diaz's definition from "Methodological Approaches to the Concept of Intelligence Failure," which states that an intelligence failure is "the failure of the intelligence process and the failure of decision makers to respond to accurate intelligence" (Diaz, 2005, p. 2). Diaz cites both Mark Lowenthal and Abram N. Shulsky for the creation of this definition. Mark Lowenthal's definition emphasizes that an intelligence failure is a failing in the intelligence cycle: "the inability of one or more parts of the intelligence process (collection, evaluation, analysis, production, dissemination) to produce timely, accurate intelligence on an issue or event of importance to national interest" (Lowenthal, 1985, p. 51, as cited by Diaz, 2005). Shulsky's definition acknowledges the connection between intelligence and policy: "a misunderstanding of the situation that leads a government to take actions that are inappropriate and counterproductive to own interests" (Shulsky, 1991, p. 51, as cited by Diaz, 2005).

The discussion of intelligence failures serves not only to understand why they occurred, but also to determine to what extent they can be minimized in the future. There is a lack of consensus regarding the cause of intelligence failures. Some researchers state that analytic failure causes intelligence failures while others state that they are caused by the decision maker's potential misunderstanding of the intelligence product given to them.

In other words, an intelligence failure can be caused either by faulty analysis or by miscommunication with the decision maker. Regardless of the cause, researchers often agree that intelligence failures are inevitable.

In "Methodological Approaches to the Concept of Intelligence Failure," Gustavo Diaz claims that intelligence failures are inevitable because of unavoidable limitations. He suggests that there are two schools of thought for intelligence failure: the traditional point of view and the optimistic point of view. The traditional point of view believes that policymakers are responsible for the failures because they do not listen to the analysis or they misinterpret it (Diaz, 2005). The optimistic point of view believes that intelligence can be improved through the use of technology and that failures can be reduced by the use of new techniques (Diaz, 2005). Diaz suggests a third, alternative approach that captures both: there is no single source of blame for intelligence failures. An intelligence failure, like any human activity, is inevitable because failures and imperfections are normal. Intelligence cannot always give the same result even in the same environment because there are always factors that cannot be controlled (Diaz, 2005). Diaz states that accidents in complex systems, such as a country's national security, are inevitable because it is impossible to account for all failures. Not only is it impossible to account for every factor, but there are limitations to the amount and relevance of data collection and the reliability of sources. Also, funding limits the amount of resources available to fight the threats and leads to a need to prioritize them.

Richard Betts, in "Analysis, War, and Decision: Why Intelligence Failures are Inevitable," notes barriers that are inherent to the nature of intelligence, including ambiguity of data, ambivalence of judgment due to conflicting data, and useless reforms in response to previous intelligence failures (1978). Therefore, the inability of intelligence to be infallible and its intrinsic ties to decision making make intelligence failures inevitable. Betts further states that if decision makers had more time, then intelligence failures would not occur because problems could be resolved, just as academic issues are resolved (Betts, 1978). Because time will always be a main concern and a reason for needing intelligence analysis, Betts concludes by suggesting a "tolerance for disaster" (Betts, 1978, p. 89).

Intelligence failures can be inevitable due to the nature of intelligence work, or by human nature. According to Simon Pope and Audun Josang, co-authors of "Analysis of Competing Hypotheses using Subjective Logic," intelligence failures, or errors in general, are due to problems with framing, resistance to change, risk aversion, limitations of short-term memory, and cognitive bias. These issues can negatively affect intelligence, especially where issue outcomes appear similar (Pope & Josang, n.d.). Because of the inevitability of error with human reason and past intelligence failures, the researchers conclude that to continue to rely solely on intuition would be irresponsible. As the authors state, "management of intelligence analysis should encourage the application of products that allow clear delineation of assumptions and chains of inference" (Pope & Josang, p. 2). The inevitability of intelligence failures due to cognitive bias will be discussed in more detail after the discussion of intelligence failures.

The inevitability of intelligence failures may not only be natural, but may be a necessary part of intelligence. Stephen Marrin, a former CIA analyst, suggests in "Preventing Intelligence Failures by Learning from the Past" that intelligence failures occur as a trade-off to another action that could have caused future, unavoidable failures.

Also, imperfections in the intelligence process are the result of unavoidable structural tradeoffs. He states, for example, that any changes that would have been made to prevent September 11, 2001 would have caused other unavoidable future failures (Marrin, 2004). Therefore, the only way to make improvements is to understand that everything has tradeoffs and either work to minimize them or find new ways of doing things that move beyond the tradeoffs. An intelligence failure often implies a negative impact on U.S. national security; however, failures occur every day in varying degrees. Though these failures occur on a daily basis, it is not until the information is applied to a high-profile situation that it becomes known as an intelligence failure (Marrin, 2004). Marrin also makes the point that though intelligence failures are becoming more public through investigations, successes are often not discussed to avoid losing sources and methods (Marrin, 2004). Consequently, in the public view, failures outnumber successes; the degree of success is not known. If intelligence failures are inevitable, then the misconception that failures outnumber successes may also be a necessary part of intelligence to maintain source confidentiality.

John Hollister Hedley states in "Learning from Intelligence Failures" that, from the United States' perspective, anything that catches the U.S. by surprise or was unintended is then seen as an intelligence failure. Aside from the perception of intelligence failure, Hedley also notes that failure is inevitable because analysts must be willing to take risks to do their job well; even when information is incomplete, inaccurate, or contradictory, a decision must be made (Hedley, 2005). Even under these circumstances, and though it is impossible to learn how to prevent something inevitable like intelligence failures, the ratio of success to failure could be improved (Hedley, 2005).

Therefore, to improve the ratio, structures and methods could be applied to increase the likelihood of success.

In Intelligence Analysis in Theater Joint Intelligence Centers: An Experiment in Applying Structured Methods, Master Sergeant Robert D. Folker, Jr. states that the root cause of intelligence failures is analytic failure: the lack of analysis of the collected raw data. He states that regardless of what analysts believe about intelligence failures, it is the opinion of the decision maker that matters most; if the decision maker doesn't believe the intelligence is accurate, then it will not be useful. Therefore, improvements should focus on improving accuracy and producing timely and useful products (Folker, 2000). Regardless of how inevitable failures are, Folker emphasizes a need for improvements in the quality of analysts' work so decision makers can make quality decisions.

The literature reviewed on intelligence failures stated that failures are caused by cognitive bias, analytic failure, or the decision maker's misunderstanding of the intelligence product they are given. Whatever the cause, the IC agrees that intelligence failures are inevitable because failure and imperfection are inevitable and normal. Also, the nature of intelligence requires judgments to be made on time-sensitive issues, so tradeoffs must be made. Though failure is inevitable, it is not justifiable or excusable to make no attempt to prevent or reduce the severity of its consequences. Rather, the better intelligence failures and their causes are understood, the more likely it is that their effects can be lessened, especially the factors that are within the analyst's control, including an understanding of their own cognitive bias and its ramifications on their intelligence products.

Cognitive Bias

In "Fixing the Problem of Analytic Mind-Sets: Alternative Analysis," Roger George describes cognitive bias as a mindset from which both the analyst and the decision maker develop a series of expectations based on past events and draw their own conclusions. As both are presented with new data, they either validate it because it is consistent with earlier data or they disregard it if it does not fit into the pattern. As new events occur, data consistent with earlier patterns of belief are more likely to be accepted as valid, while data that conflict with expectations lack precedent. It is human nature for individuals to "perceive what they expect to perceive," making the mindset unavoidable (George, 2004, p. 387). Initially, the mindset can help create experts for data collection; however, eventually the mindset will make the experts obsolete as they are unable to accept or process new information or changing events.

Heuer identifies cognitive bias as a mental error that is consistent and predictable. In Jack Davis' introduction to Heuer's Psychology of Intelligence Analysis, Davis identifies three factors that Heuer recognizes as the cognitive challenges that analysts face: the mind cannot effectively deal with uncertainty; even if the analyst has an increased awareness of their biases, this does little to help them deal effectively with uncertainty; and tools and techniques help the analyst apply higher levels of critical thinking and improve analysis on complex issues, especially when information is incomplete or deceptive (Davis, 1999). In the chapter "Thinking About Thinking," Heuer notes that weakness and bias are inherent in the human thinking process. However, they can be alleviated by the analyst's conscious application of tools and techniques (Heuer, 1999).

Though bias is present, techniques can help mitigate its effects. Patterns are necessary for analysts to know what to look for and what is important. These patterns then form the analyst's mindset and create their perception of the world (Davis, 1999). Mindsets are unavoidable, and objectivity is achieved only by making assumptions as clear as possible so that when others view the analysis, they can assess its validity. Cognitive bias may be unavoidable, but overt acknowledgment reduces its negative effect on intelligence analysis.

Cognitive biases can develop within individuals and collectively in the organization they work for. David W. Robson states in "Cognitive Rigidity: Methods to Overcome It" that organizations develop mental models that serve as the basis for belief within the organization. Like an individual's mindset, these mental models can be difficult to overcome (Robson, n.d.). This cognitive rigidity can lead to reliance on hypotheses that purely reinforce what the organization believes to be true and valued. Just as an expert's judgment can become obsolete, cognitive rigidity within an organization can ultimately lead it to dismiss radical alternatives to its approach and restrict its ability to change over time, even when change is necessary. Organizations often struggle with cognitive rigidity because, by its very nature, it is undetectable (Robson, n.d.). Robson notes that this is especially true for organizations that handle complex problems and forecast possible outcomes. The danger of cognitive rigidity lies in the experience of the organization; the more experienced the organization, the more susceptible it is to being set in its mental model. Over time, the organization's cognitive framework becomes self-reinforcing as it accepts only data that confirms the core assumptions it was built on. For organizations that deliver actionable intelligence, these frameworks consequently influence the estimation of probability, may negatively influence the intended solution, and can possibly lead to an intelligence failure (Robson, n.d.).

Rob Johnston identifies two general types of bias in "Integrating Methodologists into Teams of Substantive Experts": pattern bias and heuristic bias. Pattern bias, more commonly known as confirmation bias, is looking for evidence that confirms instead of rejects a hypothesis, while heuristic bias uses inappropriate guidelines or rules to make predictions (Johnston, 2005). Johnston looks at how each affects experts. He states that unreliable expert forecasts are often caused by both pattern and heuristic bias. Becoming an expert requires years of viewing the world through a particular lens; however, because of these biases, it can also lead to poor intelligence. Johnston states that intelligence analysis is like other complex tasks that demand expertise to solve complex problems; the more complex the task, the longer it takes to build the necessary expertise. However, this level of expertise paradoxically makes expert forecasts unreliable. Johnston notes that experts outperform novices with pattern recognition and problem solving, but expert predictions are "seldom as accurate as Bayesian probabilities" (2005, p. 57). Johnston attributes this to cognitive bias and time constraints. Experts are effective, but only to the extent that their bias does not affect the quality of their analysis.

Johnston also discussed bias in Analytic Culture in the US Intelligence Community: An Ethnographic Study. Johnston conducted a series of 439 interviews, focus groups, and other forms of participation with members of the IC (2005). The purpose of the work was to identify and describe elements that negatively affect the IC. Within this study, he focused a section of his work on confirmation bias. Johnston found through interviews and observation that confirmation bias was the most prevalent form of bias in the study (Johnston, 2005, p. 21).

For example, when Johnston asked the participants to describe their work process, they responded that the initial step in investigating an intelligence issue was to search the previous literature. The issue that Johnston notes with this is that such searches can quickly lead to unintentional searching for confirming information. Therefore, the evidence collection could quickly become a search that only confirmed the analyst's own thoughts and assumptions.

A weakness of Johnston's ethnographic study on confirmation bias may be the presence of his own bias. Johnston chose to include only four quotations from interviews with analysts during his discussion on confirmation bias. All answers that he provides in the body of his results conclude the same thing: initial searches are done by reading previous products and past research. However, with 439 participants in the study, it is doubtful that only four participants answered this question, and it is even less likely that all 439 participants only discussed literature searches. A more comprehensive look at confirmation bias within the IC would have been to measure the responses in a more quantitative form to see exactly where the bias stems from. Otherwise, it appears that Johnston chose those quotes only to make his point that analysts most often use literature searches to begin the analysis process, instead of presenting all the responses.

To test cognitive bias, researchers Brant A. Cheikies, Mark J. Brown, Paul E. Lehner, and Leonard Adelman assessed the effectiveness of a structured analytic technique in their study "Confirmation Bias in Complex Analyses." The researchers believe that most studies of confirmation bias involve abstract, unrealistic experiments that do not mirror the complex analysis tasks managed by the IC. Therefore, the purpose of this study was to recreate a study of an actual event, using techniques to assess the presence of confirmation and anchoring bias and to test whether the structured analytic technique Analysis of Competing Hypotheses (ACH) successfully reduces them.

For their study, the researchers define an anchoring effect as a "tendency to resist change after an initial hypothesis is formed" (Cheikies et al., 2004, p. 9). The researchers define confirmation bias as the "tendency to seek confirming evidence and/or bias the assessment of available evidence in a direction that supports preferred hypotheses" (Cheikies et al., 2004, p. 10). This is the first recorded experiment that looked to test ACH's ability to minimize confirmation bias (Cheikies et al., 2004). The researchers replicated a study by Tolcott, Marvin, and Lehner, conducted in 1989, to obtain the same confirmation bias results. Therefore, they could then test ACH's ability to mitigate confirmation bias using the previous study as a control.

For the study, the researchers used 24 employees from a research firm. The participants averaged 9.5 years of intelligence analysis experience. All participants were emailed 60 pieces of evidence regarding the USS Iowa explosion that occurred in April 1989, including three hypothesized causes of the explosion: Hypothesis 1 (H1), an inexperienced rammerman inadvertently caused the explosion; Hypothesis 2 (H2), friction ignited the powder; and Hypothesis 3 (H3), the gun captain placed an incendiary device. The experiment group was given the same information but was also given an ACH tutorial. To test for confirmation and anchoring bias, the researchers had H1 and H3 receive the most confirming evidence in the first two rounds while H2 received the least confirming evidence. Also, H1 and H3 were constructed to be the easiest to visualize. To analyze the results, two analyses of variance (ANOVA) were performed on the participants' confidence ratings. The results showed that as the participants assessed new evidence, their assessments were greatly affected by the beliefs they held at the time the evidence was given. The evidence that confirmed the participants' current belief was given more weight than the disconfirming evidence (Cheikies et al., 2004).

In this study, ACH reduced confirmation bias, but only for the participants who did not have professional analysis experience. The study was able to show that an anchoring effect was present and also that ACH was able to minimize an analyst's tendency toward confirmation bias. The researchers were effective at establishing an anchoring effect because they built on a successful study that had previously done the same thing. However, the researchers' use of participants who were not experienced in intelligence analysis is a weakness of the study. Though all participants were interested in analysis, only 12 had analysis experience. The varying abilities of the participants call into question the validity of the results because those not trained or experienced in analysis would not only be less aware of the purpose of weighing criteria and the use of a structured analytic technique, but also of the presence of cognitive bias.
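To make the analysis step described above more concrete, the sketch below shows one way confidence ratings could be compared across conditions with a one-way analysis of variance in Python. The ratings, group sizes, and two-condition layout are hypothetical placeholders; the published study's actual factors and data are not reproduced here.

```python
# A minimal sketch of an ANOVA on confidence ratings, in the spirit of the
# Cheikies et al. analysis. All values below are hypothetical placeholders.
from scipy.stats import f_oneway

# Hypothetical final-round confidence ratings (0-100) in the initially favored
# hypothesis, one list per condition.
no_ach_ratings = [85, 90, 78, 88, 92, 81, 87, 79, 90, 84, 86, 91]
ach_ratings = [70, 65, 72, 60, 75, 68, 66, 74, 71, 63, 69, 73]

f_stat, p_value = f_oneway(no_ach_ratings, ach_ratings)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# Lower mean confidence in the ACH condition, together with a significant F
# statistic, would be consistent with ACH tempering the tendency to lock onto
# an early hypothesis; the real study's design was more involved than this
# two-group toy example.
```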

In his Applied Intelligence Master's Thesis, "Forecasting Accuracy and Cognitive Bias in the Analysis of Competing Hypotheses," Andrew Brasfield looked to further investigate the inconclusive and varying results from previous studies of the structured analytic technique Analysis of Competing Hypotheses (ACH) (2009). Specifically, Brasfield looked at ACH's goals of increased forecasting accuracy and decreased cognitive bias. Seventy undergraduate and graduate Intelligence Studies student participants were divided into control and experiment groups and further divided into groups based on political affiliation to detect the presence of a pre-existing mindset regarding the topic, the 2008 Washington State gubernatorial election. The two possible outcomes were that either the incumbent governor or the challenger would win the election. The participants had access to open source material to gather evidence for their forecast. The control group used an intuitive process and the experiment group used ACH to structure their analysis. The participants were given a full week to complete the assignment.

Brasfield tested accuracy by comparing the results of the control group to the results of the experiment group. To test for cognitive bias, Brasfield looked to see if there was a pattern between the participants' party affiliation and their forecasts. Also, for the experiment group, Brasfield used the evidence in the participants' ACH that overwhelmingly supported a particular party affiliation; this detected the presence of cognitive bias. The results of the election showed that the incumbent won. The results of the study showed that the experiment group was 9 percent more accurate than the control group; 70 percent of the participants in the experiment group forecasted the winner and 61 percent of the participants in the control group forecasted the winner (Brasfield, 2009). Brasfield states that structured analytic techniques "should only improve overall forecasting accuracy incrementally since intuitive analysis is, for the most part, an effective method itself" (Brasfield, 2009, p. 39). Therefore, though the improvement is only minor, the findings show that ACH does improve analysis. Regarding cognitive bias, ACH appeared to mitigate bias of party affiliation among Republicans but not Democrats. Brasfield notes that this may be due to the Democratic candidate winning the election. For the experiment group using ACH, participants used more evidence and applied it appropriately. Nearly all control group participants used evidence that only supported the candidate they forecasted to win, suggesting confirmation bias in the control group (Brasfield, 2009). Therefore, ACH does appear to mitigate this bias.
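The 9-point difference reported by Brasfield can be put in rough statistical context with a two-proportion comparison. The sketch below assumes an even 35/35 split of the 70 participants and hypothetical correct-forecast counts approximating the reported 70 percent and 61 percent; it is an illustration, not Brasfield's own analysis.

```python
# A minimal sketch of a two-proportion z-test on accuracy rates like those
# Brasfield reports (roughly 70% vs. 61%). The equal group sizes and the exact
# counts are assumptions made for illustration only.
from math import sqrt
from scipy.stats import norm

n_exp, n_ctl = 35, 35        # assumed even split of the 70 participants
x_exp, x_ctl = 25, 21        # hypothetical correct forecasts (~71% and ~60%)

p_exp, p_ctl = x_exp / n_exp, x_ctl / n_ctl
p_pool = (x_exp + x_ctl) / (n_exp + n_ctl)
se = sqrt(p_pool * (1 - p_pool) * (1 / n_exp + 1 / n_ctl))
z = (p_exp - p_ctl) / se
p_value = 2 * norm.sf(abs(z))          # two-sided p-value

print(f"z = {z:.2f}, p = {p_value:.3f}")
# With roughly 35 participants per group, a difference of this size is unlikely
# to reach conventional significance on its own, which fits Brasfield's
# description of the improvement as incremental.
```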

As intelligence failures are inevitable, so is the cognitive bias that contributes to them. Cognitive bias within the IC is most prevalent in the form of confirmation bias, in which analysts and decision makers seek information that conforms most closely to the data they currently have. Within organizations, this takes the shape of confirming the beliefs and values that the organization already holds. Because intelligence relies on judgment, an analyst's cognitive bias can affect the accuracy of the forecast and the decision maker's cognitive bias can affect the action they take. The biases of both are a major contributor to whether intelligence succeeds or fails. Cognitive bias is a mindset that either rejects or confirms information based on previous experiences. By consciously recognizing cognitive bias, the analyst is taking the first proactive measure to mitigate their contribution to an intelligence failure. However, the analyst must do more than just recognize their bias; they must take the next step to distance their bias from their forecast, either by using intuition or structured analytic techniques.

Unstructured and Structured Techniques

The use of structured analytic techniques for effective forecasting and decision making is debated within the IC. One side believes unstructured, intuitive thinking is effective and comes with experience working in the field. The other side believes that the use of structured analytic techniques organizes and manages complex situations more effectively than intuition alone. The argument for both, however, recognizes a need to overcome bias and the need for a process to effectively aid in strategic decision making. Because bias is inherent and present in every decision that is made, the analyst must make a conscious application of analytic techniques outside of their own mind to help reduce the effects of bias. However, intuition is also a powerful tool that can be used alone or in conjunction with analytic techniques.

Intuition

David Meyers, a professor of psychology at Hope College, describes the intuitive process in his book, Intuition: Its Powers and Perils. He states that memory is not a single, unified system. Rather, it is "two systems operating in tandem" (Meyers, 2002, p. 22). Implicit memory, or procedural memory, is learning how to do something, whereas explicit memory, or declarative memory, is being able to state what and how something is known. Meyers offers an example: as infants, we learn reactions and skills used throughout our lives; however, we cannot explicitly recall anything from our first three years (2002). This phenomenon continues throughout our lifetime. Though we may not explicitly recall much of our past, we implicitly and intuitively remember skills.

Beyond a basic idea of intuition, Meyers also discusses intuitive expertise. Compared to novices, experts know more through learned expertise. Meyers describes William Chase and Herbert Simon's chess expert study. The researchers found that the chess experts could reproduce the board layout after looking at the board for only 5 seconds and could also perceive the board in clusters of positions that they were familiar with. Therefore, the experts could intuitively play 5 to 10 seconds a move without compromising their level of performance. Through this example, Meyers relays that experts are able to recognize cues that enable them to access information they have stored in their memories; expert knowledge is more organized and, therefore, more efficiently accessible (Meyers, 2002). Experts see large, meaningful patterns while novices are only able to see the pieces. Another difference between experts and novices is that experts define problems more specifically. Meyers does note, however, that expertise is discerning. Expertise is within a particular field and scope for each individual (Meyers, 2002).

Though intuition allows us to access experiences and apply them efficiently, it does have its drawbacks. Meyers recognizes three forms of bias that intuition is prone to: hindsight bias, self-serving bias, and overconfidence bias (Meyers, 2002). Hindsight bias occurs when events and problems seem obvious in retrospect; once the outcome is known, it is impossible to revert to the previous state of mind (Meyers, 2002). In other words, we can easily assume in hindsight that we know and knew more than we actually did. Another disadvantage of intuition is a self-serving bias. Meyers states that in past experiments, people more readily accepted credit for successes but attributed failure to external factors or "impossible situations" (Meyers, 2002, p. 94). A third drawback to intuition is an overconfidence bias that can surface from judgments of past knowledge in estimates of "current knowledge and future behavior" (Meyers, 2002, p. 98). Overconfidence is then sustained by seeking information that will confirm decisions (Meyers, 2002). Intuition can be a powerful tool, especially when quick decisions need to be made. However, intuition is subject to bias, and an individual's expertise is limited in scope and personal ability. Because of this, it is necessary for analysts to remain cognizant of intuition's possible disadvantages and to use tools that most effectively make use of their intuition.

Naresh Khatri and H. Alvin, in "Role of Intuition in Strategic Decision Making," look to fill the gap in the field of research on the role that intuition serves in decision making. At the time of their study, the researchers state, there were only a few scholarly works on intuition and even less research conducted in the field. The researchers define intuition as a "sophisticated form of reasoning based on 'chunking' that an expert hones over years of job-specific experience...in problem-solving and is founded upon a solid and complete grasp of the details of the business" (Khatri & Alvin, p. 4).

Khatri and Alvin (n.d.) state that intuition can be developed through exposure to and experience with complex problems, especially when a mentor guides the process. They note six important properties of intuition. The first is that intuition is a subconscious drawing on experiences. The second is that intuition is complex because it can handle more complex systems than the conscious mind (Parikh, 1994, as cited by Khatri & Alvin, p. 5); the rational mind thinks more linearly, while intuition can overcome those limitations. A third property is that the process of intuition is quick. Intuition can recall a number of experiences in a short period of time, compressing years of learned behavior into seconds (Isenberg, 1984, as cited by Khatri & Alvin, p. 5). A fourth property of intuition is that it is not emotion. Intuition does not come from emotion; rather, emotions such as anger or fear cloud the subtle intuitive signals. The fifth property is that intuition is not bias. The researchers state that there are two sides to the bias debate over intuition. The first is that cognitive psychology research states decision making "is fraught with cognitive bias" (Khatri & Alvin, p. 6). However, another body of research suggests that intuition is not necessarily biased but "uncannily accurate" (Khatri & Alvin, p. 6). The researchers' line of reasoning follows that the cognitive process used for valid judgments is the same one that generates the biased ones; therefore, "intuitive synthesis suffers from biases or errors, so does rational analysis" (Khatri & Alvin, p. 6). Finally, intuition is part of all decisions. It is used in all decisions, even decisions based on concrete facts. As the researchers note, "at the very least, a forecaster has to use intuition in gathering and interpreting data and in deciding which unusual future events might influence the outcome" (Goldberg, 1973, as cited by Khatri & Alvin).

The researchers show through these properties that intuition is not an irrational process, because it is based on a deep understanding rooted in years of experience that surfaces to help make quick decisions. Khatri and Alvin state that strategic decisions are characterized by incomplete knowledge and that decision makers cannot rely solely on formulas to solve problems, so a deeper understanding of intuition is necessary. The authors note that intuition should not be viewed as the opposite of quantitative analysis, nor as a claim that analysis should not be used. Rather, "the need to understand and use intuition exists because few strategic business decisions have the benefit of complete, accurate, and timely information" (Khatri & Alvin, p. 8).

Khatri and Alvin surveyed senior managers in the computer, banking, and utility industries in the Northeastern United States and found that intuitive processes play a strong role in decision making within each respective industry. The industries were chosen based on their environments: the computer industry is the least stable, the banking industry is moderately stable, and the electric and gas companies are the most stable but the least competitive of the three. Khatri and Alvin acknowledged the effect the size of the organization has on its culture; small organizations tend to "use more of informal/intuitive decision making and less of formal analysis than large organizations" (Khatri & Alvin, p. 14). The researchers narrowed their scope by sampling organizations that fell within a specified sales volume range. For the scope of the study, organizations in the computer and utility industries all had sales of over $10 million, and the nine banks ranged from $50 million to $350 million in assets. The researchers used both subjective and objective indicators to measure performance.

The researchers had a response rate of 68 percent, or 281 individuals from 221 companies. The industry mean scores were examined using the Newman-Keuls procedure and a hierarchical regression analysis. The results showed that the computer industry uses a higher level of intuitive synthesis than banks, and the banking industry uses a higher level of synthesis than the utility industry. The researchers' three indicators of intuition were judgment, experience, and gut feeling. Each of these varied according to industry. They found that managers in banks and computer companies use more judgment and rely on their previous experience more than the utility company managers do. Managers of computer companies rely on gut feelings significantly more than bank or utility managers. Therefore, Khatri and Alvin found that intuition is used more in unstable environments than in stable ones. Due to their findings, the researchers suggest that intuition be used for strategic decision making in less stable environments and cautiously in more stable environments.

Khatri and Alvin note that their geographic range, sample size, and choice of industries were limitations of their study. The Northeast was the geographic area studied, and its economy can vary widely not only regionally, but nationally. They note that further research should draw from large sample sizes of varying industries. Another limitation of the study may be the researchers' use of indicators and definition of a "stable" environment. The researchers noted that the indicators were subjective. Without making the indicators as objective as possible, it becomes increasingly difficult to use the indicators as a standard for comparing data across industries. Also, Khatri and Alvin state that the use of intuition is more effective in an unstable environment. However, without a clear definition of what a "stable" versus an "unstable" environment is, the appropriate use of intuition for a specified environment may not be properly determined.

Structured Analytic Techniques

The National Commission on Terrorist Attacks Upon the United States, or the 9/11 Commission, was created to evaluate and report on the causes relating to the terrorist attacks of September 11, 2001 (Grimmet, 2004). The Commission also reported on the evidence collected by all related government agencies about what was known surrounding the attacks, and then reported the findings, conclusions, and recommendations to the President and Congress on what proactive measures could be taken against terrorist threats in the future (Grimmet, 2004). Throughout the document, there are repeated recommendations stressing the necessity of information sharing, not only throughout United States agencies but through international efforts. The Commission recommended information sharing procedures that would create a trusted information network balancing security and information sharing (Grimmet, 2004). Also present throughout is an emphasis on improved analysis. A key recommendation of the Joint Inquiry of the House and Senate Intelligence Committees stated that the IC should increase the depth and quality of its domestic intelligence collection and analysis (Grimmet, 2004). The committees also suggested an "information fusion center" where all-source terrorism analysis could be improved in both quality and focus (Grimmet, 2004). The committees further stated that the IC should "implement and fully utilize data mining and other advanced analytical tools, consistent with applicable law" (Grimmet, 2004, p. 20). By stating this, the committees recognized the value in using structured analytic techniques to improve intelligence. Therefore, the use of analytic techniques is necessary within the IC, especially when working directly with terrorism.

Folker states in Intelligence Analysis in Theater Joint Intelligence Centers: An Experiment in Applying Structured Methods that a debate exists between unstructured and structured analytic techniques. It is a difference in thinking of intelligence as either an art form or a science. The researcher states that only a small number of analysts occasionally use structured analytic techniques when working with qualitative data; most instead rely on unstructured methods (Folker, 2000). In the context of this experiment, structured analytic techniques are defined as "various techniques used singly or in combination to separate and logically organize the constituent elements of a problem to enhance analysis and decision making" (Folker, 2000, p. 5). Folker states that advocates of unstructured methods feel that intuition is more effective because structured analytic techniques too narrowly define the intelligence problem and ignore other important pieces of information (Folker, 2000). Those who use structured analytic techniques claim that the results are more comprehensive and accurate. The methods can be applied to a broad range of issues to assist the analyst and increase objectivity. The emphasis of structured analytic techniques is not to replace the intuition of the analyst, but to implement a logical framework that capitalizes on intuition, experience, and subjective judgment (Folker, 2000). However, no evidence exists for either side; at the time of the experiment, Folker states, there had not been a study done to adequately assess whether the use of structured analytic techniques actually improves qualitative analysis (2000). Folker points to the advantages of structured analytic techniques when he states:

A structured methodology provides a demonstrable means to reach a conclusion. Even if it can be proven that, in a given circumstance, both intuitive and scientific approaches provide the same degree of accuracy, structured methods have significant and unique value in that they can be easily taught to other analysts as a way to structure and balance their analysis. It is difficult, if not impossible, to teach an intelligence analyst how to conduct accurate intuitive analysis. Intuition comes with experience. (2000, p. 14)

The ability of structured analytic techniques to be taught and replicated is shown to be a clear advantage over intuition, which is learned through personal experience. Folker states that even though structured analytic techniques have advantages, they are not used because of time constraints, a sense of increased accountability, and because there is no proof that they will actually improve analysis.

Analysts are faced with an ever-increasing amount of qualitative data that is used for solving intelligence problems. In order to use the data in a more objective way, Folker designed an experiment to test the effectiveness of a structured analytic technique and its ability to improve qualitative analysis. This was accomplished by comparing the analytic conclusions drawn from two groups: those that used intuition and those that used a structured analytic technique. The participants' answers were then scored as correct or incorrect and compared statistically to determine which group performed better. There were 26 total participants in this study: 13 in the control group and 13 in the experiment group. The low participation level was taken into account, and Fisher's Exact Probability Test was used to determine "statistical significance for the hypotheses and for the influence of the controlled factors (rank, experience, education, and branch of service)" (Folker, 2000, p. 16). The participants completed a questionnaire to give demographics and identify any prior training and experience they had.

Both groups were given a map, the same two scenarios, and an answer sheet. All were given one hour to complete the first scenario and 30 minutes to complete the second scenario. Both scenarios were built using extensive testing and were based on actual events. The results indicated that the use of structured analytic techniques improved qualitative analysis and that the controlled factors did not seem to affect the results. Folker noted that the time constraint for learning the methodology was a limiting factor. Folker allotted one hour for teaching Analysis of Competing Hypotheses (ACH) and stated that the complexity of the scenarios may have affected the results. Therefore, for future studies, it is necessary either to use experienced analysts of varying degrees who are familiar with the methodology, or to allot more time for learning it.

In "The Evolution of Structured Analytic Techniques," Heuer states that structured analytic techniques are "enablers of collaboration"; the techniques are the process by which effective collaboration occurs (Heuer, 2009). Structured analytic techniques and collaboration should be developed together for the success of both. Heuer states that there is a need for evaluating the effectiveness of structured techniques beyond just experience; each needs to be tested. Heuer states that testing the accuracy of a methodology is difficult because it assumes that the accuracy of intelligence can be measured. Also, testing for accuracy is problematic when most intelligence questions are probabilistic (2009). Heuer notes that this would require a large number of experiments to acquire a distinguishable comparison between the accuracy of one technique and another. Further, the number of participants that would be needed for these would be unrealistically high. Heuer states that the most feasible and effective approach for evaluating a technique is to look at the purpose for which the technique is being used and then determine whether it achieves that purpose or if there is a better way to achieve it; simple empirical experiments can be created to test these.

Rather than a debate between the use of unstructured versus structured analytic techniques, it is possible to view the two sides as existing at either end of a spectrum, with both being necessary and useful for decision making and forecasting. The emphasis of structured analytic techniques is not to replace intuition but to create support for the analyst's intuition, experience, and subjective judgment; analytic tools increase objectivity. As shown through the literature, intuition is especially effective for highly experienced professionals, and analytic tools help analysts attain a higher level of critical thinking and improve analysis on complex issues. This is necessary in intelligence, especially when information can be deceptive or datasets can be incomplete. In sum, structured analytic techniques can be taught and intuition cannot. Though both are valuable, the use of structured analytic techniques can increase objectivity, especially for the novice, who may have a less developed intuition than an experienced analyst.

Collaboration

Collaboration may be the first step toward reducing confirmation bias. As Johnston stated in Analytic Culture in the US Intelligence Community: An Ethnographic Study, analysts often use literature searches as the initial step for assessing an intelligence problem. However, collaboration could generate multiple hypotheses that a literature search would miss. Though collaboration could alleviate issues with confirmation bias, problems with group dynamics could hinder the generation of multiple hypotheses.

In The Wisdom of Crowds, James Surowiecki proposes the use of groups not only to generate more ideas, but to increase the quality of the decisions made.

affirms that groups make far better judgments than individuals, even experts. Surowiecki states that experts are important, but their scope is narrow and limited; there is no evidence to support that someone can be an expert at something broad like decision making or policy (2009). Instead, groups of cognitively diverse individuals make better forecasts than the most skilled decision maker (Surowiecki, 2009). Surowiecki furthers the argument that cognitively diverse groups are important by stating that a diverse group of people with varying levels of knowledge and insight is more capable of making major decisions than “one or two people, no matter how smart those people are” (Surowiecki, 2004, p. 31). When making decisions, it is best to have a cognitively diverse group (Surowiecki, 2009). This is because diversity adds perspective and it also eliminates or weakens destructive group decision-making characteristics such as overly influential members. In other words, conscientious selection helps to prevent dominant personalities from taking over the group. Surowiecki cites James Shanteau, one of the leading thinkers on the nature of expertise, to back up his claim. Shanteau asserts that many studies have found individual expert judgment to be inconsistent with that of other experts in the same field of study (Surowiecki, 2009). Also, experts are prone to what Meyers would call the “overconfidence bias”; experts, like anyone else whose judgments are not calibrated, often overestimate the likelihood that their decisions are correct. In other words, being an expert does not necessarily mean accurate decision-making. Experts should be integrated into a group to make them the most effective they can be. Surowiecki uses scientific research as an example to show the effectiveness of collaboration. He states that because scientists collaborate and openly share their data with others, the scientific community’s knowledge continues to grow and solve complex

3 problems (Surowiecki, 2009). Collaboration not only improves research, it also fully utilizes experts’ abilities. Individual judgment is not as accurate or consistent as a cognitively diverse group. Therefore, diverse groups are needed for sound decision making. Wesley Shrum, in his work, “Collaborationism” discusses the motivations and purpose of collaboration. Shrum states that collaboration should not be generalized because it occurs in many forms across a wide range of disciplines. For example, some disciplines require collaboration while others can easily opt not to use it. Shrum questions what motivates individuals and groups to collaborate when their field does not necessarily require it and what those motivators are. The most common motivation is resources such as technology and funding. By collaborating, individuals are more likely to have access to resources they need. Others are motivated by bettering their discipline through strategic efforts. In other words, if a less established discipline collaborates with a well established discipline, the less established discipline will gain legitimacy (Shrum, n.d.). A third motivation is to gain information from other disciplines to solve complex problems. Shrum (n.d.) states that cross discipline collaboration is increasing. A major issue that Shrum sees in current collaboration is that it is often technology-based. Collaboration is designed to “produce knowledge later rather than now” (Shrum, p. 19). The collaboration isn’t being used to solve problems or produce results at that present moment, but to create things such as databases to be used at a later point in time. Shrum states that the knowledge produced later may not even involve the same individuals that originally collaborated to create it. This is a problem because the farther all disciplines move away from the “interactivity of collaboration…the farther we move from the essential phenomenon that the idea of collaboration centrally entails:

4 people working together with common objectives” (Shrum, p.19). Shrum looks at collaboration in a realistic sense that individuals are using collaboration to get what they need out of it as individuals and can abandon the process at any point they feel it is no longer useful. Collaboration is being used for individual instant gratification rather than strategic pursuits. By analyzing collaboration in its current state, Shrum is able to identify the benefits and issues surrounding modern collaboration. In his study, “Processing Complexity in Networks: A Study of Informal Collaboration and its Effect on Organizational Success” Ramiro Berardo seeks to identify how individual organizations “active in fragmented policy arenas” are able to achieve their goals through collaborations and what motivates collaboration (Berardo, 2009, p. 521). The basis of the study is the resource exchange premise that individuals or organizations rarely have enough resources to pursue their goals; therefore, they must exchange resources. The more resources the individual or organization is able to acquire, the more likely they will be able to achieve their goals. Berardo states that it is through the expansion of connections that the individuals or organizations will be most successful. It is not only the number of connections but more importantly, the way that the collaborations are connected to others in the network. Berardo studied a multi-organizational project that was addressing water-related problems in southwest Florida. Berardo used data collected from the applicants that were part of the project that would determine whether or not the applicant received funding. The data also contained detailed information about the nature of work the applicant did. Then, using this information, Berardo contacted the 92 applicants that worked on the project through a semi-structured phone survey. The information provided through the survey gave Berardo the names of other organizations that participated in the project as

5 “providers of resources that could be used to strengthen the application” (Berardo, 2009, p. 527). The data was then put into a matrix that contained information about the organizations and their common relationships (Berardo, 2009). It then showed the pattern of participation of organizations in the project. The main organization controlled 50 percent of the budget and other organizations became a part of the project through an application process, hoping to obtain funding from the main organization. The applicants had diverse backgrounds ranging from financial to legal to technical expertise. Therefore, both the funder and the applicants benefited because the funder received knowledge from the expertise of the applicants and the applicants received funding (Berardo, 2009). Berardo explains that all involved in this process, from the main organization to the experts, were part of an informal collaboration and that these types of collaborations are becoming more common (2009). The results of the study showed that a larger number of partners increase the likelihood of a project to be funded and organizations that are most active and take a leadership role are most likely to be funded. This study confirmed the resource exchange theory that the more partners, the more resources available to improve quality. Berardo found that the leading organization is most successful when its partners collaborate with each other in other projects. Also, once the collaboration gets to a certain number, over seven, the likelihood of getting funding for the project declines. This is because it creates a level of unmanageable complexity for the main funding organization (Berardo, 2009). Berardo states, “there is a limit to the benefits of engaging in collaborative activities with more and more partners, and that limit is given by the increasing complexity that an organization is likely to face when the partners provide large amounts of nonredundant resources” (Berardo, 2009, p. 535).

7 A weakness with the study was that it looked at only one collaborative effort. Therefore, future studies would need to confirm the results by looking at other types of collaborations, ranging in size and areas of expertise. When thinking about this study in terms of collaboration within the intelligence community, factors may differ from the results of this study. For example, agencies may not be collaborating to mutually benefit because of funding. Therefore, incentives to collaborate may be different within the IC than through nonprofit organizations or for-profit companies. The question is, then, what is the incentive to collaborate within the IC when the resource may not be as straightforward as funding? In other words, what would be the incentive for an expert working in the for-profit sector to collaborate with an analyst? This may be why the individual analyst lacks the motivation to collaborate and mandates for collaboration are necessary in the field. In their study, “A Structural Analysis of Collaboration between European Research Institutes,” researchers Bart Thijs and Wolfgang Glänzel investigate the influence the research profile has on an institute’s collaborative trends. Thijs and Glänzel note that there is extensive research on the collaborative patterns of nations, institutes, and individuals with most of them finding a positive correlation between collaboration and scientific productivity (2010). The researchers aimed to provide a more micro look at collaborative behavior; focusing less on nations or institutions as a whole, but instead looking at the research institute and its international and domestic collaborations. The researchers classified a research institute by its area of expertise in order to establish what other types of research institutes it collaborated with and why. The researchers then looked to find the group that, according to its research profile, was the most preferred partner for collaboration.

Thijs and Glänzel used data from the Web of Science database of Thomson Reuters and limited their scope to include only articles, letters, or reviews indexed between 2003 and 2005. The documents were classified into a subject category system, dividing them into eight different groups: Biology (BIO), Agriculture (AGR), a group of institutions with a multidisciplinary focus (MDS), Earth and space sciences (GSS), Technical and natural sciences (TNS), Chemistry (CHE), General medicine (GRM), and Specialized medicine (SPM) (Thijs & Glänzel, 2010). The researchers found that institutions from the multidisciplinary group are most likely to be partners. Also, groups that are more closely related are more likely to collaborate than the other groups: for example, biology with agriculture; technical and natural sciences with chemistry; and general medicine with specialized medicine (Thijs & Glänzel, 2010). Aside from showing what the collaboration strengths are within the sciences, this study shows that collaborations are usually strongest within a certain field or focus. Also, instead of the multidisciplinary group collaborating with a more specialized group, it partners with others like itself; rather than seeking the expertise of a particular field for collaboration, it seeks similarly broad partners. Blaskovich looked into group dynamics in her research, “Exploring the Effect of Distance: An Experimental Investigation of Virtual Collaboration, Social Loafing, and Group Decisions.” Global businesses use technology to collaborate virtually with a dispersed workforce. Past studies have shown that virtual groups have improved brainstorming capabilities and produce more thorough analysis (Blaskovich, 2008). While virtual collaboration (VC) has potential benefits, it may be counterproductive, resulting in social loafing: “the tendency for individuals to reduce their effort toward a

10 group task, resulting in sub-optimal outcomes” (Latane, et al., 1979, as cited by Blaskovich, 2008). Social loafing has been considered a contribution to poor group performance, but it is a critical problem intensified by VC. In her study, participants were grouped into teams and given a hypothetical situation. They were to be management accountants responsible for the company’s resources for information technology investments. The groups were asked to give one of two recommendations: “(1) expend resources to invest in the internal development of a new technology system (insource) or (2) use the resources to contract with a third-party outsourcing company (outsource)” (Blaskovich, 2008, p. 33). The groups were given a data set with mixed evidence as their source of information. A total of 279 undergraduate and graduate students were placed randomly into groups of three. The control groups worked face-to-face in a conference room and the experiment group worked from individual computers in separate rooms; the VC group used text-chat as their form of communication. To measure the communication of the groups, Blaskovich had the groups continually update their recommendations as new pieces of evidence were introduced. The recommendation pattern of the face-to-face groups moved toward the outsourcing option regardless of the evidence order. However, the VC groups were dependent on the order of the evidence. Therefore, Blaskovich concluded that group recommendations were influenced by the mode and order of the evidence introduced. The groups made their decisions and submitted them through the designated group recorder. Then, the face-to-face group members were moved to separate computers. The VC members logged-off of their chat session and all completed a questionnaire about the experiment (Blaskovich, 2008).

Social loafing was recorded as being present according to time spent on the task, the participants’ ability to recall information about the task, and self-reported evidence about their personal effort (Blaskovich, 2008). The face-to-face groups spent an average of 20.6 minutes on the task while the VC groups spent an average of 22.0 minutes (Blaskovich, 2008). The accuracy score for the participants’ recall ability was 8.28 items on the test for the face-to-face groups and 7.93 for the VC groups. For the self-reporting, the VC groups perceived their efforts and level of participation to be lower than the face-to-face groups (Blaskovich, 2008). Blaskovich concludes that VC causes group performance to decline and that social loafing may be a contributing factor to this. Also, the VC group decisions may be of poorer quality than those of face-to-face groups because their judgments were influenced by the order of evidence instead of the quality of evidence (Blaskovich, 2008). Blaskovich’s research shows that virtual collaboration should be used cautiously if the virtual group is making a decision or recommendation. Collaboration has been shown to be beneficial for brainstorming, especially when a diverse group of experts contributes. However, Blaskovich’s study raises the issue of exactly what the advantages and disadvantages of VC are. Also, applied to the IC, this study raises the issue of what level of collaboration may be appropriate virtually; whether VC is effective at brainstorming and whether decisions or recommendations made by VC participants should be considered reliable. The Office of the Director of National Intelligence (ODNI) created the report “United States Intelligence Community: Information Sharing Strategy,” which discusses the increased need for information sharing, especially after September 11, 2001. The “need to know” culture that was formed during the Cold War now impedes the

12 Intelligence Community’s ability to respond properly to terroristic threats (ODNI, 2008). Therefore, the IC needs to move towards a “need to share” mindset; a more collaborative approach to properly uncover the threats it now faces (ODNI, 2008). The report stresses that “information sharing is a behavior and not a technology” (ODNI, 2008, p. 3). Information sharing has to take place within the community; it has to happen through effective communication and not just through the availability of new technology. The Office of the Director of National Intelligence supports the transformation of the IC culture to emphasize information sharing. However, it recognizes the difficulty that would come with overhauling the entire culture and mindset of the IC. The new environment that the ODNI proposes could include the same information, but it would make the information available to all authorized agencies that would benefit from the collaborative analysis (ODNI, 2008). ODNI’s vision and model stresses a “responsibility to provide” that would promote greater collaboration in the IC and to its stakeholders. Ultimately, the report is stating that the IC has a responsibility to improve communication and collaboration to effectively manage new threats. Douglas Hart and Steven Simon’s “Thinking Straight and Talking Straight: Problems of Intelligence Analysis” discusses the need for structured arguments and dialogues in intelligence. Hart and Simon note that the 9/11 Commission Report, the 9/11 Joint Congressional Inquiry Report, and other reports have cited that a lack of collaboration is one of the causes for more recent intelligence failures (Hart & Simon, 2006). Hart and Simon propose that dialogues encourage analysts from different backgrounds to develop common definitions and understandings to decrease potential misunderstandings. Communication also encourages the exchange of different viewpoints

13 to reduce confirmation bias. Through the use of communication and brainstorming, conversations evolve into critical thinking sessions for both the individual and the group: Critical thinking can be enabled by collaboration, especially when it involves compiling, evaluating, and combining multi-disciplinary perspectives on complex problems. Effective collaboration; however, is possible only when analysts can generate and evaluate alternative and competing positions, views, hypotheses and ideas (Hart & Simon, 2006, p. 51). The authors view collaboration as necessary and effective; however, the authors state that documents like the National Intelligence Estimates (NIE) seem to discourage collaboration between individuals and agencies: “Enforced consensus relegating alternative assessments to footnotes…has been a disincentive to collaboration…in addition, collaboration and sharing generally require extra work that competes with time spent on individual assignments” (Hart & Simon, 2006, p. 51). Less time is being spent on collaboration because individual assignments are the priority. Researchers Jessica G. Turnley and Laura A. McNamara address collaboration issues in “An Ethnographic Study of Culture and Collaborative Technology in the Intelligence Community.” The goal of the study was to research improvements in intelligence analysis that could be implemented through methods that effectively merged sources and analysis through multi-agency teams. The researchers conducted their ethnographic study at two intelligence agencies located at three different sites to address the question: “What does collaboration mean in the analytic environment, and what is the role of technology in supporting collaboration” (Turnley & McNamara, p. 2). The research was conducted through interviews of analysts and through group and daily work routine observations. The researchers visited three sites. Two sites were

14 within the same agency which the researchers called Intelligence Agency One (IA-1). This agency focused on strategic intelligence. The third site was an agency that developed software tools for tactical intelligence. The researchers called this site Intelligence Agency Two (IA-2). One researcher spent five and one-half weeks observing and interviewing analysts at IA-1. She collected data through 30 interviews and forty hours of observation. At IA-2, the other researcher spent 20 hours becoming familiar with the site and organization and spent 40 hours interviewing and observing operations. In the sites the researchers studied, the word “collaboration” is intrinsically tied to information, hierarchy, and power in the IC. Therefore, the analyst’s ability to collaborate was only effective if the collaboration did not have a negative impact on the investments of the individual within the organization. The structure of IA-1 was noted to be hierarchical with each analyst given a specific area of responsibility and subject focus. The researcher noted collaboration issues at IA-1 because of the hierarchy structure. Participants’ responses about issues with collaboration were placed into these five categories: introversion, a feeling of ownership over subject matter, privilege of individual effort over group effort for rewards, organizational knowledge, and overclassification of information. IA-2, the site responsible for information management technology to produce tactical intelligence, also had issues with collaboration but for a different reason. The issues at this site stemmed from multiple companies working together, but having different agendas for their participation in the contract. An even bigger issue was defining ownership of the technology used for collaboration. At this site, the analysts could call up an inquiry and multiple resources from multiple sensors are displayed on a single platform. This is an issue to the organization because it defines who owns, controls, and

15 manages the data. Due to power struggles or fear of diminished confidentiality of sources, certain sensors may refuse to give up necessary information and the collaboration can be stalled or stopped. This research was effective at showing how collaboration may already be used, but that the organization’s culture greatly affects the use of it. The limitation of this study was the lack of facilities the researchers were able to visit. Because they only visited three sites, two of which operated under the same agency, they had a less comprehensive view of the IC and its use of collaboration. In “Small Group Processes for Intelligence Analysis,” Heuer discusses the role of collaboration in the production of quality intelligence products and the elements needed for successful collaboration. Heuer states that intelligence analysis is requiring more of a group effort rather than an individual effort (Heuer, 2008). This is because intelligence products need input from multiple agencies and from subject matter experts outside of their field. Collaboration is also encouraged within agencies that have multiple locations and can work online together to save time and travel costs. However, there are issues within groups that can be counterproductive. Individuals can be late to the group’s sessions or may be unprepared. The groups may be dominated by certain types of individuals which prevents others from speaking up or allowing for full generation of ideas. Also, the positions that the individual holds in the agency can affect the group’s performance. For example, top level professionals are often less likely to express dissent for fear of retribution or even embarrassment (Heuer, 2008). Group dynamics play an important role in the effectiveness of collaboration. To avoid these issues or to mitigate them, Heuer suggests the use of small, diverse groups of analysts that openly share ideas and an increased use of structured analytic

techniques (2008). Using analysts from multiple agencies will broaden perspectives, “leading to more rigorous analysis” (Heuer, 2008, p. 16). Structured analytic techniques can give structure to individual thoughts and the interaction between analysts. By using structured techniques, analysts are providing group members with a written example of their thought process; this can then be compared and critiqued by the other members (Heuer, 2008). Heuer states that each step of the structured analytic technique process induces more divergent and novel discussion than collaboration alone (2008). Analysts should not only use tools but should also collaborate with other analysts or subject matter experts to make sure personal, individual cognitive bias is not affecting the product and to generate multiple hypotheses. When describing the need for collaboration in the form of subject matter experts, Heuer stated that expertise is needed because the methodology itself does not solve the problem. The combination of expertise and methodology “is always needed” because it is the methodology that guides the expertise (R. Heuer, personal communication, June 2010). Collaboration also allows those with diverse backgrounds from various fields to apply their expertise to the intelligence problem. In other words, more brainstorming identifies more hypotheses than a literature search alone could provide. With collaboration, communication and dialogue evolve into critical thinking for both the individual and the group. The use of collaboration and structured analytic techniques has gained the attention of the IC when considering ways to minimize the frequency of intelligence failures. Also, it is necessary for individual analysts and organizations to be aware of the presence of cognitive bias and to take safeguards to avoid its negative effects on intelligence analysis. There exists little evidence of the effectiveness of structured analytic techniques or

17 intuition and both need to be empirically tested for the validity of the methods and for managing complex situations within the IC. Complexity Manager According to Richards Heuer, the origin of his idea for Complexity Manager goes back “over 30 years ago” when a future forecasting technique called Cross-Impact Analysis was tested at the CIA (R. Heuer, personal communication, September 2010). Heuer recalls taking a group of analysts through the development of a cross-impact matrix, used in Complexity Manager, and was inspired by the technique’s effectiveness as “a learning experience for all the analysts to develop a group understanding of each of the relationships” (R. Heuer, personal communication, September, 2010). Taking this experience, along with a broad understanding of how increasingly complex the world has become, Heuer looked to create a technique that dealt with this new level of complexity while still allowing ease of use to the analyst. Heuer states that research organizations often deal with complexity by developing complex models that are expensive and take a lot of time (R. Heuer, personal communication, September 2010). However, much of the benefit from such modeling comes in the early stages when identifying the variables, rating their level of significance, and understanding the interactions between each. As Heuer states: “that [variable identification and interaction] is easy to do and can be sufficient enough to generate new insights, and that is what I tried to achieve with Complexity Manager” (R. Heuer, personal communication, September, 2010). By using Complexity Manager, the analyst is breaking down the complex system into its smallest component parts before moving forward to analyze the entire system. By doing so, the analyst can understand potential outcomes and

unintended side effects of a potential course of action (R. Heuer, personal communication, June 2010). Complexity Manager is a structured analytic technique that also makes use of collaboration to brainstorm multiple hypotheses for a complex issue. Therefore, if proven effective, Complexity Manager would help to further decrease an analyst’s contributions to intelligence failures by limiting the influence of cognitive bias. This would be alleviated through the use of collaboration and through the process of using this structured technique. Complexity Manager combines the advantages of both structured analytic techniques and collaboration through small teams of subject matter experts. Complexity Manager “is a simplified approach to understanding complex systems—the kind of systems in which many variables are related to each other and may be changing over time” (Heuer & Pherson, 2010, p. 269). Complexity Manager, as a decision support tool, helps to organize all options and relevant variables in one matrix. It also provides an analyst with a framework for understanding and forecasting decisions that a leader, group, or country is likely to make as well as their goals and preferences. Complexity Manager is most useful at helping the analyst to identify the variables that are most significantly influencing a decision. As Heuer states, Complexity Manager “enables analysts to find a best possible answer by organizing in a systematic manner the jumble of information about many relevant variables” (Heuer & Pherson, 2010, p. 273). Complexity Manager is an eight-step process. The following are Richards Heuer’s directions for use of the structured analytic technique:

1. Define the problem
2. Identify and list relevant variables
3. Create a Cross-Impact Matrix
4. Assess the interaction between each pair of variables
5. Analyze direct impacts
6. Analyze loops and indirect impacts
7. Draw conclusions
8. Conduct an opportunity analysis (Heuer & Pherson, 2010, pp. 273-277).

For a more detailed description of each of the eight steps, consult Heuer and Pherson’s Structured Analytic Techniques for Intelligence Analysis. Below, Figure 2.1 shows the Cross-Impact Matrix that is used for recording the nature of the relationships between all the variables (Heuer & Pherson, 2010, p. 273). Heuer recognizes that the Cross-Impact Matrix includes the same initial steps that are required to build a computer model or simulation (Heuer & Pherson, 2010, p. 272). Therefore, when an analyst does not have the time or budget to build a social network analysis or use the Systems Dynamics approach, they can gain the same benefits using Complexity Manager through: “identification of the relevant variables or actions, analysis of all the interactions between them, and assignment of rough weights or other values to each variables or interaction” (Heuer & Pherson, 2010, p. 272).

Figure 2.1. The Cross-Impact Matrix is used to assess interactions between variables.
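As a rough illustration of how a Cross-Impact Matrix can be represented and worked with (roughly steps 2 through 5 of the technique), the short Python sketch below encodes a handful of variables and their pairwise impact ratings. The variable names and the simple -2 to +2 numeric scoring scheme are illustrative assumptions only; they are not drawn from this study, and Heuer and Pherson describe the ratings qualitatively rather than numerically.

import numpy as np

# Hypothetical variables for a generic forecasting problem (step 2).
variables = ["Political will", "Security situation", "External funding", "Logistics"]

# Cross-impact matrix (step 3): impact[i][j] is the assumed effect of
# variables[i] on variables[j], rated here from -2 (strong negative) to
# +2 (strong positive). The diagonal is 0 because a variable is not
# rated against itself. All values are invented for illustration.
impact = np.array([
    [ 0,  1,  2,  1],
    [-2,  0, -1, -2],
    [ 1,  0,  0,  2],
    [ 0,  1,  0,  0],
])

# Step 5 (analyze direct impacts): row sums suggest which variables exert
# the most direct influence; column sums suggest which are most influenced.
exerted = impact.sum(axis=1)
received = impact.sum(axis=0)

for name, out_score, in_score in zip(variables, exerted, received):
    print(f"{name:<20} exerts {int(out_score):+d}, receives {int(in_score):+d}")

Even this minimal tabulation reflects the point made above: most of the analytic benefit comes from naming the variables and judging their interactions, not from elaborate modeling of the system.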

In theory, Complexity Manager is able to mitigate cognitive bias through the use of both small group collaboration and a structured analytic technique. However, there is no research proving the effectiveness of this technique, nor is there literature on it being used in the field. When used in the appropriate context, Complexity Manager may be an effective tool for reducing the risks of intelligence failures caused by cognitive bias. However, unless it is tested or used in the field by analysts, this will not be known. Therefore, it is necessary to test the effectiveness of Complexity Manager through teams of analysts.

Hypothesis Assessing Complexity Manager as an intelligence analysis tool, I have developed four testable hypotheses. My first hypothesis is that the groups using Complexity Manager will have a higher level of confidence in their forecast than those using intuition alone. My second hypothesis is that analysts using Complexity Manager will produce higher quality variables than those using intuition alone. My third hypothesis is that the groups using Complexity Manager will identify more variables than those that used intuition alone. My fourth hypothesis is that those using Complexity Manager will produce more accurate forecasts than those that used intuition alone.


CHAPTER 3: METHODOLOGY
The purpose of this study is to assess the forecasting accuracy and effectiveness of Complexity Manager, a structured technique. For the validity of structured techniques and for all tools and techniques used by professionals in the Intelligence Community, it is necessary to evaluate effectiveness through multiple experiments. This study is one of many evaluations needed for that purpose. The following research questions were addressed in this study:

1. Do analysts who use Complexity Manager have a higher level of confidence than those who use intuition alone?
2. Do analysts who use Complexity Manager have higher quality variables assessed before delivering their forecast than those who use intuition alone?
3. Do analysts who use Complexity Manager have a higher number of variables assessed before delivering their forecast than those who use intuition alone?
4. Do analysts who use Complexity Manager produce a more accurate forecast than those who use intuition alone?

The study was designed to compare a structured analytic technique, Complexity Manager, to intuition alone when forecasting. If Complexity Manager is effective, the advantages of its use would be shown through the data collected. Data was collected through standardized questionnaires and forms created by the researcher. Pre- and post-intervention data were collected and analyzed through statistics and descriptive analysis of results between the control and experiment groups.


Setting The research was conducted at Mercyhurst College in Erie, Pennsylvania. The researcher recruited Intelligence Studies students through classroom visits across the campus, and the intervention was completed in the computer labs at the Intelligence Studies building. Classroom visits were conducted two weeks before the intervention to allow the students to plan for the intervention and to maximize the number of sign-ups for the researcher. The intervention was conducted in the computer labs at the Intelligence Studies building because the department supports such endeavors and the researcher could reserve the two computer labs exclusively for the purpose of the study. This controlled environment allowed each student to utilize a computer for pre-intervention collection. Both computer labs were equipped with a projector so the researcher could present a tutorial to the experiment group on how to use Complexity Manager. Participants To ensure that this study was conducted in an ethical manner, the researcher submitted the study to Mercyhurst College’s Institutional Review Board and obtained permission before starting the study. A copy of the consent form and all related documents can be found in the appendix of this thesis. The participants were selected through purposive sampling to meet the needs and criteria of the study. The participants were restricted to undergraduate and graduate level Intelligence Studies students only because of their understanding of the Intelligence process and the need for analysts to evaluate Complexity Manager. Freshman Intelligence Studies students were able to participate even though they had very limited experience in the field because the researcher created groups with each freshman paired with an

2 undergraduate upperclassman or second year graduate student. This maximized the sampling size for the study and varied the level of expertise for each group. Figure 3.1 shows the distribution of students according to their academic class. There were 56 females and 106 males for a total of 162 participants. Freshman through second year graduate students participated in the study: 43 freshmen; 37 sophomores; 28 juniors; 18 seniors; 26 first year graduate students; and 10 second year graduate students. When completing initial sign-ups for the experiment, the researcher requested that the participants disclose information about their education. For the undergraduates, because they were all intelligence majors, the researcher requested that the participants list any minors they may have. For the graduate students, the researcher requested that these participants list their undergraduate major and minor if applicable. This was done to show the range of expertise of the participants. Not all undergraduate intelligence students had minors; 28 of the 126 had a declared minor. Undergraduate intelligence students that participated in the study had the following minors: Russian Studies, Business Intelligence, Business Administration, History, Criminal Justice, Criminal Psychology, Philosophy, Psychology, Political Science, Spanish, Computer Systems, and Asian Studies. The 36 graduate students that participated in the study disclosed the following undergraduate majors: Intelligence, Social and Political Thought, History, English, Political Science, Spanish, International Affairs, Russian Studies, Psychology, Telecommunication and Business, International Business, Security and Intelligence, Biochemistry, Mathematics, French, Forensics, Sociology, Social Work, and Criminal Justice. 17 of the 36 graduate students had the following minors at the undergraduate
Figure 3.1. Number of participants per academic class.

level: Science and Technology, International Affairs, Life Sciences, Political Science, Mandarin, Spanish, Russian Studies, French, Middle Eastern Studies, Asian Pacific Studies, Economics, Public Policy, and East Asian Studies. Intervention and Materials The independent variable for this intervention was the experiment group’s use of Complexity Manager, and the dependent variables were the accuracy of the groups’ forecasts as well as the number and quality of the variables that the groups produced. The researcher first consulted Richards Heuer and Randy Pherson’s book Structured Analytic Techniques for Intelligence Analysis because it contained the step-by-step procedure for using Complexity Manager. The researcher then created forms based on the procedure and on further instruction received through email correspondence with Richards Heuer. The methodology form was created to replicate the step-by-step procedure while allowing for participants’ maximum understanding of the methodology in a short period of time; each step of the procedure and the instructions guiding the participant were put on separate pages (See Appendix F). Each form created was used to collect data directed towards the research questions, and all forms were approved by the Institutional Review Board at Mercyhurst College. Measurement Instruments The researcher collected data through the use of a post-intervention questionnaire, through assessment of the groups’ forecasting accuracy, and by the number and quality of the variables documented by the groups. Questionnaire Answers The questionnaire for the control group contained nine questions. Four of the questions, Questions 1-4, asked for the amount of time and the number of variables that

2 the individual contributed compared to the amount of time and the number of variables that the group produced. Questions 1-4 asked for quantitative amounts that could be compared to other individuals and other groups. Four questions, Questions 5-8, asked the participant to rank their knowledge of the intelligence issue, the clarity of instructions, the availability of open source information, and the helpfulness of working in teams for the assigned task. Questions 5-8 asked the participants to rank their experience on a scale of one to five. The final question asked for general comments about the experiment. The questionnaire for the experiment group consisted of thirteen questions. Questions 1-9 were identical to the control group questions. Questions 10-13 were specific to the use of Complexity Manager including: the usefulness of Complexity Manager for assessing significant variables, understanding of Complexity Manager before the experiment, understanding of Complexity Manager after the experiment, and if the participant would use Complexity Manager for future tasks. All questions allowed for space for the participant to comment further if they wanted to do so. (See Appendix H and I for both questionnaires).

Forecasting Accuracy All participants were tasked with forecasting whether the vote for the Sudan Referendum set for January 9, 2011, would occur as scheduled or if it would be delayed. The use of an actual event allowed for definite results to compare the groups’ forecasts against. (See Appendix G for the forecasting worksheet.) Number of Variables Along with forecasting if the Sudan Referendum would occur on the set date, the participants were also tasked with identifying the variables that were most influential for

2 deciding the course of the Sudan Referendum. The researcher calculated the number of variables that the control group produced compared to the experiment group to assess if Complexity Manager aided in the production of an increased number of variables considered. Quality of Variables The quality of the variables recorded by the control group was qualitatively compared to the experiment group. The researcher assessed quality by visually comparing the thoroughness and comprehensiveness of the control versus the experiment groups’ variables. Data Collection and Procedures Pre-Intervention From October 11, 2010, to October 19, 2010, the researcher visited eleven Intelligence Studies classes. Recruitment occurred at the beginning of the class period. The researcher handed out sign-up sheets requesting general information that included: name, email address, undergraduate minor if applicable, and a ranking for preferred days to participate in the study for November 1, 2010, to November 4, 2010. The form also requested graduate students to include their undergraduate major and minor, if applicable. Also, because many of the second year graduate students did not have a class during the week of recruitment, emails requesting participation for the experiment were sent to only second year graduate students. By October 19, 2010, 239 participants had volunteered to participate. All undergraduate and first year graduate professors offered extra credit to those students that participated in the study. The researcher then entered all the sign-up data into a spreadsheet and organized the participants into one of the four dates, November 1 through 4, 2010, with nearly all

2 the participants receiving their first-ranked choice. Once the participants were organized into days, the researcher then organized the participants into groups; each group had three members. All groups had at least one freshman assigned to each group. On October 25, 2010, the researcher emailed the participants to let them know their assigned date and time. From October 25, 2010, to October 31, 2010, participants that were unable to participate emailed the researcher; at this point, 17 participants withdrew from the study. From November 1 to November 4, 2010, the researcher then emailed the participants the morning of their assigned date and time to remind them that the experiment was to occur that evening. Intervention At the beginning of the intervention, participants were asked to sit with their assigned group. At the front of the room was a list of all the participants organized into groups of three and four. The groups all had a number and the participants were to sit at the computers with corresponding numbers. All documentation the participants would need was placed at the computers before the intervention began. First, after all participants were seated, they signed a consent form. After all participants completed this, the researcher gave instructions for the intervention. Both documents are located in Appendix C and D. After all forms that would be used for the intervention were explained, the researcher then addressed the participants who had group members missing. Those that did not have a full group of three were asked to come to the front of the room so they could be moved into another group. This instruction and reconfiguration of groups took 10 minutes. For the control group, the next step was to begin collection. The groups were given a list of possible sources they could use to begin their collection process and the

groups independently divided the workload. Please see Appendix E for this document. After an hour of collection, the groups reconvened to brainstorm possible variables and to give their forecast on a Forecasting Answer Sheet the researcher created. Please see Appendix G for this document. Before the experiment group began collection, the researcher gave a brief PowerPoint tutorial on how to use Complexity Manager. The researcher then described a packet that was created for the groups to work through the methodology step-by-step. The participants were also given the directions as written by Richards Heuer and Randy Pherson in their book, Structured Analytic Techniques for Intelligence Analysis. This tutorial and explanation took 10 minutes. The groups were then given an hour for collection and given the same list of possible sources as the control group. After an hour of collection, the groups reconvened to brainstorm possible variables using Complexity Manager. The experiment group participants then gave their forecast on the Forecasting Answer Sheet the researcher created. Please see Appendix E for the methodology packet. All participants were given two and one-half hours to complete the experiment. Post-Intervention The post-intervention period included completing the questionnaire described in the Measurement Instruments section. The students were also given a debriefing statement describing the purpose of the experiment. Please see Appendix J for the debriefing statement. On November 4, 2010, the researcher emailed the names of all the students who participated in the study to the professors who offered extra credit. Data Analysis Descriptive and inferential statistics were used for data analysis of the survey responses, group analytical confidence, group source reliability, and the number of

variables both the control and experiment groups considered. The data was subdivided for analysis purposes, and Statistical Package for the Social Sciences (SPSS) software was used to identify the mean and standard deviation for the control and experiment groups. An independent-samples t test was used to compare the mean scores of the control and experiment data sets and to identify any significant differences between them. The survey questions compared between the control and experiment groups are: individual amount of time spent working in the study; group amount of time spent working in the study; previous knowledge of the Sudan Referendum before beginning the study; clarity of instructions; availability of open source materials; and how helpful it was to work in teams. The variable categories compared by count between the control and experiment groups are: economic, social, political, geographic, military, and technology, as well as the total number of variables for each group. The quality of the variables was analyzed descriptively and assessed for content.
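These comparisons were run in SPSS; as a rough, non-authoritative sketch of the same kind of analysis, the snippet below uses Python with SciPy. The minute values are invented placeholders rather than the study’s data, and the use of Welch’s t test (which does not assume equal variances) and of the Mann-Whitney test as the small-sample alternative reported in the Results chapter are assumptions about reasonable choices, not statements of the exact SPSS procedures used.

import numpy as np
from scipy import stats

# Hypothetical minutes spent working as a group (placeholders, not study data).
control_minutes = [35, 40, 50, 30, 45, 60, 25, 40]
experiment_minutes = [70, 85, 60, 90, 75, 65, 80, 95]

# Mean and sample standard deviation (ddof=1) for each group.
for label, data in [("control", control_minutes), ("experiment", experiment_minutes)]:
    print(f"{label}: M = {np.mean(data):.2f}, SD = {np.std(data, ddof=1):.2f}")

# Independent-samples t test (Welch's version) comparing the two group means.
t_stat, p_value = stats.ttest_ind(control_minutes, experiment_minutes, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")

# Nonparametric alternative for small samples where normality is doubtful.
u_stat, p_mw = stats.mannwhitneyu(control_minutes, experiment_minutes, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_mw:.3f}")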


CHAPTER 4: RESULTS
The results will be presented in order of reference from the Methods section of this study: survey responses, group analytical confidence, group source reliability, the number of variables the control and experiment groups considered, and the forecasting accuracy of both groups. This will be followed by the descriptive analysis of the quality of variables. Please see Appendix K for complete SPSS data. Survey Responses Surveys were distributed to each individual after their group completed and returned their forecasting answer sheet to the researcher. Time Surveys asked each individual to state the approximate amount of time they spent working individually and the amount of time that they spent working with their groups. 80 control group members and 65 experiment group members answered the survey question regarding the amount of time they each spent working individually. The control group’s individual amount of time spent ranged from 20 minutes to 110 minutes. The experiment group’s individual amount of time ranged from 15 minutes to 120 minutes. Using SPSS software, the results showed that there is no difference between the control and experiment group for the individual amount of time spent working, t (142) = -.797, p (.455) > (α = 0.05). 80 control group members and 67 experiment group members answered the question regarding the amount of time they spent working as a group. The control group amount of time spent working together ranged from 5 to 90 minutes. The experiment group amount of time spent working together ranged from 25 to 150 minutes. Using

SPSS software, the results showed that there is a difference between the control and experiment groups for the amount of time spent working together, t (145) = -7.71, p (0.00) < (α = 0.05). The experiment group had a greater mean (M = 74.1045 minutes, SD = 30.30058) than the control group (M = 40.4375 minutes, SD = 20.71802). The experiment group spent more time working in their groups than the control group did. Knowledge of Sudan Referendum To gauge the understanding of the subject matter used for the intervention, the researcher asked the students to state their knowledge of the Sudan Referendum prior to beginning the study. The students were given a scale ranging from 1 to 5, with 1 indicating that the individual had little knowledge of the Sudan Referendum and 5 indicating that the individual had great knowledge about the Sudan Referendum. 83 control group members and 67 experiment group members answered the survey question regarding their prior knowledge of the Sudan Referendum. Using SPSS software, the results showed that there was no difference between the control and experiment group’s knowledge of the Sudan Referendum, t (115.143) = -1.699, p (0.092) > (α = 0.05). The experiment group had a slightly greater mean (M = 1.7463, SD = 1.17219) than the control group (M = 1.4578, SD = .83083). Clarity of Instructions To gauge the participants’ perception of the clarity of the researcher’s instructions, the researcher asked the students to rate it on a scale of 1 to 5. 1 indicated little clarity and 5 indicated that the directions were entirely clear. 83 control group members and 67 experiment group members answered the survey question regarding the clarity of instruction. Using SPSS software, the results showed that there is a difference between the control and experiment group’s perception

of the clarity of instructions provided by the researcher, t (140.859) = 6.098, p (0.000) < (α = 0.05). The control group had a greater mean (M = 4.3614, SD = .77426) than the experiment group (M = 3.5821, SD = .78140). The control group perceived the instructions to be clearer compared to the experiment group. This is likely due to the differences in the directions between the control group and the experiment group. The control group had more straightforward instructions: collaborate with the team to come up with a forecast. The experiment group’s task was more ambiguous with the added instruction of learning and using Complexity Manager. Though the process of the experiment was explained to both groups, the experiment group may have perceived the instructions to be less clear because of the added complexity of learning and applying a structured analytic technique. Open Source Availability To gauge the participants’ perception of the availability of open source information regarding the Sudan Referendum, the researcher asked the students to rate it on a scale of 1 to 5. 1 indicated little availability and 5 indicated an abundance of open source information regarding the Sudan Referendum. 83 control group members and 67 experiment group members answered the survey question regarding the availability of open source materials. Using SPSS software, the results showed that there is no difference between the control and experiment group’s perception of the availability of open source materials regarding the Sudan Referendum, t (138.980) = -0.914, p (0.362) > (α = 0.05). The experiment group had a slightly greater mean (M = 4.2985, SD = .79801) than the control group (M = 4.1807, SD = .76739). Team Helpfulness

Individuals were placed into groups of 3 or 4 to complete a team forecast. To gauge the participants’ perception of how helpful it was to work in teams for the study, the researcher asked the students to rate it on a scale of 1 to 5. 1 indicated that working in teams was not helpful and 5 indicated that it was very helpful to work in teams. 83 control group members and 67 experiment group members answered the survey question regarding the helpfulness of teamwork. Using SPSS software, the results showed that there is no difference between the control and experiment group’s perception of the helpfulness of working in teams for this study, t (115.648) = 1.175, p (0.242) > (α = 0.05). The control group had a slightly greater mean (M = 4.4819, SD = .70471) than the experiment group (M = 4.3134, SD = .98794). Both the control group and the experiment group found that working in a team was helpful. Initially, it would seem that those who use a structured analytic technique would value teamwork more because it consciously facilitates collaboration. However, the academic major may have overshadowed this and played a larger role in the participants’ perception of team helpfulness. The Intelligence Studies major at Mercyhurst College values and draws heavily on the use of groups to facilitate learning and collaboration. Therefore, all students likely came into the experiment with the mindset that teamwork adds value and validity to the forecast. Another factor when considering the shared perception of team helpfulness for both the control and experiment groups is the nature of the task. The amount of learning that had to be done would have been very difficult for one person to do in a two and one-half hour timeframe. Therefore, a team would likely be a welcome solution to the workload regardless of whether a structured analytic technique was used. A third factor is a varied level of individual experience with analysis. 53% of the participants in the control group were freshmen or sophomores and 39% of

participants in the experiment group were freshmen or sophomores. Collectively, freshmen and sophomores accounted for 46% of the total participants. Therefore, it is likely that many of the freshmen and sophomores valued working on a team with more experienced upperclassmen. Group Analytic Confidence On the group forecasting answer sheet, the researcher requested that the groups give their analytic confidence for their forecast regarding the Sudan Referendum. The participants were to gauge their confidence with “High” being the most confident and “Low” being of the lowest confidence. 24 control groups and 23 experiment groups gauged their analytic confidence. Normality assumptions were not satisfied because the sample size was small, less than 30, so the Mann-Whitney test was used. The results showed that there is no difference between the control and experiment groups’ analytic confidence, p (0.458) > (α = 0.05). The implications of this finding will be explored in more detail in the Conclusions chapter. Group Source Reliability On the group forecasting answer sheet, the researcher requested that the groups give their source reliability for their forecast regarding the Sudan Referendum. The participants were to gauge their confidence with “High” being the most confident in the sources used for forecasting and “Low” being of the lowest confidence. 24 control groups and 23 experiment groups gauged their source reliability. Normality assumptions were not satisfied because the sample size was small, less than 30, so the Mann-Whitney test was used. The results showed that there is no difference between the control and experiment groups’ source reliability, p (0.914) > (α = 0.05).

Figure 4.1. Source reliability per group.

Figure 4.1 shows that the majority of the control and experiment groups had medium reliability in sources. No group indicated that they had low source reliability. Variables Individuals were placed into groups of three or four, 24 control groups and 23 experiment groups, and were asked to give a team forecast that included a list of the variables that were used in considering their group forecast. The researcher created categories of variables for the groups’ consideration that include: economic, social, political, geographic, military, and technology. The variables were examined by the researcher through both statistics and descriptive analysis, recognizing that it is not only the quantity but also the quality of the variables that makes accurate forecasts. Implications of all the variable findings will be explored in more detail in the Conclusions chapter. Economic Variables Using SPSS software, the results showed that there is a difference between the control and experiment group’s number of economic variables considered, t (34.545) = 4.476, p (0.000) < (α = 0.05). The control group had a greater mean (M = 2.9583, SD = 1.26763) than the experiment group (M = 1.6522, SD = .64728). Social Variables Normality was not satisfied for the experiment group, so the Mann-Whitney test for independent samples was used. The results showed that there is a difference between the control and experiment groups’ number of social variables considered, p (0.000) < (α = 0.05). The experiment group produced 34% fewer variables than the control group. Political Variables

Political Variables
Using SPSS software, the results showed that there is a difference between the control and experiment groups' number of political variables considered, t (36.222) = 4.608, p (0.000) < (α = 0.05). The control group had a greater mean (M = 3.3333, SD = 1.43456) than the experiment group (M = 1.7826, SD = .79524).

Geographic Variables
Normality was not satisfied for the experiment group, so the Mann-Whitney test for independent samples was used. The results showed that there is a difference between the control and experiment groups' number of geographic variables considered, p (0.000) < (α = 0.05). The experiment group produced 48% fewer variables than the control group.

Military Variables
Using SPSS software, the results showed that there is a difference between the control and experiment groups' number of military variables considered, t (42.888) = 7.178, p (0.000) < (α = 0.05). The control group had a greater mean (M = 2.8750, SD = .89988) than the experiment group (M = 1.2174, SD = .67126).

Technology Variables
Using SPSS software, the results showed that there is a difference between the control and experiment groups' number of technology variables considered, t (31.176) = 3.519, p (0.001) < (α = 0.05). The control group had a greater mean (M = 2.3750, SD = 1.43898) than the experiment group (M = 1.1739, SD = .83406).

Total Variables
Using SPSS software, the results showed that there is a difference between the control and experiment groups' number of total variables considered, t (39.195) = 8.295, p (0.000) < (α = 0.05). The control group had a significantly greater mean (M = 17.1250, SD = 4.08936) than the experiment group (M = 8.8696, SD = 2.59903). Again, implications regarding all variables considered can be found in the Conclusions chapter.

Forecasting Accuracy
24 control groups and 23 experiment groups forecasted whether the vote for the Sudan Referendum would occur on January 9, 2011, or whether it would be delayed. On January 9, 2011, the voting process did begin as scheduled (Ross, 2011). 3 of the 24 control groups and 6 of the 23 experiment groups accurately forecasted the event. Using SPSS software, it was determined that there was no statistical difference between the control and experiment groups' ability to forecast accurately (p-value = 0.2367) > (α = 0.05). Although assumptions of normality were not satisfied due to the small sample size, the raw data do show that twice as many experiment groups accurately forecasted the event (see Figure 4.2). 19 of the 24 control groups and 16 of the 23 experiment groups inaccurately forecasted the event. Using SPSS software, it was determined that there was no difference between the control and experiment groups' inaccurate forecasts (p-value = 0.4505) > (α = 0.05). Three groups' forecasts were not included in the statistical testing: one control group and one experiment group did not give a forecast, and one control group forecasted that the chances were even.
Figure 4.2. Forecasting per group.
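The thesis does not state which SPSS procedure produced the accuracy p-values above. Fisher's exact test is one common choice for a 2x2 table with cell counts this small; the sketch below applies that assumed test to the counts reported in the text (3 accurate versus 19 inaccurate control forecasts, and 6 accurate versus 16 inaccurate experiment forecasts) and may not reproduce the reported values exactly.

# Hypothetical re-analysis of the forecasting-accuracy counts reported above.
# The thesis does not specify the SPSS test used; Fisher's exact test is assumed
# here, so the resulting p-value may differ from the 0.2367 reported in the text.
from scipy import stats

#        accurate  inaccurate
table = [[3, 19],   # control groups (counts from the text)
         [6, 16]]   # experiment groups (counts from the text)

odds_ratio, p_value = stats.fisher_exact(table, alternative="two-sided")
print(f"Fisher's exact: odds ratio = {odds_ratio:.3f}, p = {p_value:.3f}")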

Quality of Variables
Because quantity may not reflect the quality of information, the researcher also descriptively analyzed the written variables completed by the 24 control groups and 23 experiment groups. Quality, according to Merriam-Webster's Collegiate Dictionary, is defined as "a degree of excellence, superiority in kind" (n.d.). In Structured Analytic Techniques for Intelligence Analysis, Heuer and Pherson discuss a three-step approach to the evaluation of structured analytic techniques. In this evaluation, they note that quality of analysis is not restricted to accuracy alone. Heuer and Pherson suggest that quality of analysis is measured by "clarity of presentation, transparency in how the conclusion was reached, and construction of an audit trail for subsequent review" (2010, p. 317). Considering the definition of quality and Heuer and Pherson's measure of quality of analysis, the researcher defined quality variables as "variables that are superior as shown through clarity of presentation and transparency in how conclusions were reached." The researcher did not include "construction of an audit trail" because both the control and the experiment groups were asked to write out the variables considered; this instruction required both groups to leave an audit trail of their variables and significant findings. Using the above definition of quality, the researcher found that quality variables were presented in two ways: completeness of the description and specificity.

Completeness of the Description
Both the control and the experiment groups consistently cited similar variables for consideration when forecasting. Both consistently spoke of border disputes, issues involving oil rights, and ethnic tensions. However, the teams in the control group routinely used full sentences, while only one team in the experiment group did so. Though this does not increase the validity of the data, it does show the completeness of the team's thought; it showed clarity of presentation. The teams that used complete sentences were also able to show cause and effect, and the ability to show cause and effect allowed for transparency in how the analysts arrived at their conclusions.

For example, one team in the control group writes, "Southern Sudan's economy is mostly comprised from oil revenue that it receives from the North. However, because the north has ceased paying its share of oil revenues in foreign currency, turmoil internally will likely result." One team in the experiment group writes on the same topic, "Share of oil revenues." Both groups are stating that oil revenues are a variable to consider when forecasting whether the Sudan Referendum will occur on the scheduled date, but the team in the control group conveyed why oil revenues would affect the possibility of delay.

Specificity
The completeness of the description allowed for more specific variables to be considered. The teams in the control group more frequently cited specific pieces of evidence for consideration, while the experiment group used broader concepts. For example, one team in the control group writes, "The Northern government has control over the TV and radio signals, and only allows broadcasts that are in line with their policies." One team in the experiment group writes, "Low tech capabilities affect other key variables." The team in the control group is citing specifics: the Northern government has control over certain technologies in the country. The team in the experiment group is stating that a low level of technical capability in Sudan affects other variables. In this control group example, as in the greater part of the control group's responses, the team is not only stating a specific piece of evidence but also showing the split between north and south Sudan, the reason for the Sudan Referendum. In the experiment group example, the team is stating that there is a connection between a low level of technology and the other variables; however, this broad generalization does not allow for an understanding of the urgency of the technological issues in Sudan and how they could affect the possible delay of the referendum.

Two factors could influence the result of variable specificity. The first is possibly a lack of experience with the method and the topic. By using broad topics such as "political pressure" and "security issues," the students were decreasing the complexity of the task by equating the variables to things they were more familiar with. "Political pressure" and "security issues" are more familiar to the students than the specifics of the Sudan Referendum. Therefore, by broadening the variables, the students could more easily use the cross-impact matrix, assessing how politics affected security rather than how one specific instance in Sudan affected another. The other factor could be a lack of clarity. While the control group may have cited specific instances for their variables, the experiment group may have placed those specific instances into broader categories, which they identified as variables. Therefore, clarity may be lacking: is a variable a specific example ("The Northern government has control over the TV and radio signals, and only allows broadcasts that are in line with their policies") or a broadened understanding of specific instances ("Low tech capabilities affect other key variables")? A definition of "variable" in the context of Complexity Manager may help to clarify what exactly is needed for the structured technique to be most effective.


CHAPTER 5: CONCLUSION
Throughout the IC's history, intelligence failures have driven reform in organizational structure, information sharing, and the use of structured analytic techniques. However, if the use of structured analytic techniques is the solution to decreasing the possibility or severity of future intelligence failures, then structured analytic techniques should be tested to ensure each is a valid means of reducing such risks. Though encouraged, testing of the techniques is limited. The intent of this study was to test one structured analytic technique and offer further avenues for testing. The purpose of this study was to assess the validity of the structured method, Complexity Manager. To do so, the researcher designed an experiment comparing the use of Complexity Manager against intuition alone. The research was conducted at Mercyhurst College's Intelligence Studies Program with participants from every academic class level, from freshmen to second-year graduate students.

Discussion
Do analysts that use Complexity Manager have a higher level of confidence than those that used intuition alone?
Analytic confidence is based on the use of a structured analytic technique, source reliability, source corroboration, level of expertise on the subject, amount of collaboration, task complexity, and time pressure (Peterson, 2008). The results of the survey questions asking groups to rate their level of analytic confidence show that those that used Complexity Manager did not have a higher level of confidence than those that used intuition alone. Furthermore, the results from the other survey questions that identify components of analytic confidence confirm that the experiment group did not experience a higher level of confidence than the control group (see Figure 5.1).

Figure 5.1. Analytic confidence per group.

The control group and the experiment group showed no difference in source reliability, level of expertise on the subject matter, or amount of collaboration, as shown through "team helpfulness." Assessing task complexity and time pressure, both the control group and the experiment group were given the same task with the same amount of time. However, the experiment group (M = 74.1045 minutes, SD = 30.30058) spent a greater amount of time working than the control group (M = 40.4375 minutes, SD = 20.71802). The additional time that the experiment groups spent working together may have been related to the teams' need to learn and then use Complexity Manager. This finding suggests that using a structured analytic technique may not increase analytic confidence, but may better calibrate the analyst. The analysts could have lacked confidence not only in their ability to use the structured analytic technique, but also in the analysis that it helped to produce. Task complexity was high and there was a time constraint of 2.5 hours; however, the majority of both the control and experiment groups reported medium analytic confidence. Nine groups gave low analytic confidence: three control and six experiment. Overall, the experiment group had a lower confidence level but greater forecasting accuracy. This suggests that there may not be a connection between analytic confidence and the use of a structured analytic technique, and that analytic confidence may have no bearing on forecasting accuracy. In other words, having high analytic confidence may not mean that the analyst is more likely to forecast accurately. In summary, this finding suggests that using a structured analytic technique could assist the analyst in assessing their own analytic confidence, but does not improve their overall analytic confidence. Further studies yielding a higher number of group or individual analytic confidence ratings would be needed to confirm this statistically.

Do analysts that use Complexity Manager have higher quality variables assessed before delivering their forecast than those that used intuition alone?
After a descriptive analysis comparing the control group to the experiment group, the researcher concluded that those that used Complexity Manager did not have higher quality variables than those that used intuition alone. The teams that used Complexity Manager often used short, broad generalizations, while those that used intuition alone wrote out complete sentences that identified specific points of conflict between north and south Sudan. These complete sentences allowed the control group to fully explain the cause and effect of each variable. The experiment group spent a greater amount of time working together than the control group, yet the quality of their variables was lower than that of the control group. Two factors may have influenced the style of reporting. The experiment group was given an answer sheet for their forecast along with a methodology packet, a step-by-step guide to using Complexity Manager on which they could complete the steps directly; the control group was given only an answer sheet. The first factor the researcher considered was redundancy: having to write the variables twice may have influenced the experiment group to write only short, broad generalizations on their answer sheet. However, the methodology packets reveal the same statements. This leads to a second factor that may have influenced the style of writing. For Complexity Manager, the experiment group was tasked with completing a cross-impact matrix, and to do so, the variables first had to be listed in the left-hand column. The teams may have judged the lines too short to include whole sentences and therefore transferred only those abbreviated statements onto the answer sheet. Though this may account for the length of the statements, it does not account for why the experiment group often used broader concepts, such as referencing oil refineries, while the control group used more specific statements, such as stating where the refineries were and why they were a source of conflict.

Do analysts that use Complexity Manager have a higher number of variables assessed before delivering their forecast than those that used intuition alone?
As shown through the intervention results, analysts that used Complexity Manager did not assess a higher number of variables before delivering their forecast. The control group assessed a greater number of variables in every category. The control group also had a greater mean total number of variables (M = 17.1250, SD = 4.08936) than the experiment group (M = 8.8696, SD = 2.59903). The number of variables assessed did not translate into more accurate forecasts; quantity had no bearing on quality. Increasing the pieces of evidence could easily bias the analyst into thinking that the more evidence found, the more likely it would be that the Sudan Referendum would not occur on the scheduled date. Though the experiment group assessed fewer variables, it produced a greater number of accurate forecasts. This suggests that the experiment group weighed the significance of each variable rather than totaling the pieces of evidence confirming the likelihood of one event over another. Therefore, using a structured analytic technique may have helped decrease analyst bias when forecasting.

Do analysts that use Complexity Manager produce a more accurate forecast than those that use intuition alone?
There was no statistical difference between the control and experiment groups for producing more accurate forecasts. This may have been due to the small sample size of both the control and experiment groups.

However, although a p-value of 0.2367 falls short of the 0.05 threshold for statistical significance, it does not rule out a genuine difference that a larger sample might detect. Furthermore, looking at the raw data, 6 out of 23 experiment groups had accurate forecasts while only 3 out of 24 control groups had accurate forecasts. Both of these points suggest that Complexity Manager assisted the experiment group in producing more accurate forecasts. The analytic confidence findings showed that 3 control groups and 6 experiment groups gave a low confidence rating. The forecasting accuracy findings also showed that 3 control groups and 6 experiment groups produced accurate forecasts. However, only one experiment group that forecasted accurately gave a low confidence rating; 8 of the 9 accurate forecasts were accompanied by a medium analytic confidence rating. Two further conclusions can be drawn from the forecasting accuracy of the experiment group. The first is that forecasting accuracy may be connected to collaboration, a required process within the Complexity Manager technique. The experiment group spent more time working collaboratively than the control group (experiment group mean = 74.1045 minutes; control group mean = 40.4375 minutes) and also produced more accurate forecasts. The second is that there appears to be no connection between the number of variables assessed and forecasting accuracy; the control group recorded a greater number of variables than the experiment group but did not forecast more accurately. Shannon Ferrucci recorded similar findings in her 2009 Master's thesis, "Explicit Conceptual Models: Synthesizing Divergent and Convergent Thinking." When assessing the size of the conceptual models that participants produced in her study, Ferrucci found that though the experimental group's conceptual models were larger than the control group's models, the control group forecasted better than the experiment group (Ferrucci, 2009).

Ferrucci suggested that the large number of concepts that the experiment group created in their models caused confusion and decreased their ability to identify the most relevant information for completing their forecast (2009). As in Ferrucci's experiment, the large number of variables that the control group created may have overwhelmed the analysts and made it more difficult to select the most relevant variables affecting a possible delay of the Sudan Referendum.
Another factor that may account for the control groups' lower forecasting accuracy could be a connection between the number of variables assessed and cognitive bias. Robert Katter, John Montgomery, and John Thompson found in their 1979 study, "Cognitive Processes in Intelligence Analysis: A Descriptive Model and Review of Literature," that intelligence is conceptually driven rather than data driven. This understanding is important because it shows how analysts arrive at their conclusions. An analyst's forecast is not pure data; instead, it is the product of how the analyst interprets that data after moving through their cognitive model (Katter et al., 1979). The purpose of the cognitive model is to account for the analyst's "inputs," with input meaning stimuli from the external world or what is in internal memory (Katter et al., 1979). The model has three parts, summarized below:
1. The individual's initial processing of outside information is conducted automatically in less than a second. The new information is then automatically compared with information already stored in memory. When even a gross match is found, the new information that matches existing memory patterns is stored.
2. New information that does not fit into existing memory patterns can be automatically ignored or viewed as irrelevant or uninteresting.

3. The central cognitive function consists of a continuous "Compare/Construct" cycle that modifies memory storage. Three types of information modification are sensory information filtering, memory information consolidation, and memory access interference.
In this study of Complexity Manager, the control group had nothing in place to force themselves to be more cognizant of their cognitive models as they recorded their variables for forecasting. Therefore, without an external regulator such as a structured analytic technique, the control groups' forecasts were more negatively affected by their cognitive models, taking the form of cognitive bias.

Limitations
One limitation of the study was the sample size. Although 162 individuals participated, the number of forecasts was limited because the individuals were grouped into teams of three and four. Therefore, instead of 162 forecasts, only 47 were given. Time constraints were also a limitation. The study was conducted during a two-and-one-half-hour time period, which did not allow for full development or understanding of the issue. The researcher intentionally chose this time period to maximize the number of participants and decrease the number of drop-outs; the participants were students with time constraints due to other classes and obligations. A third limitation was the participants' level of expertise. The participants were students with limited knowledge of the field and very limited knowledge of both Complexity Manager and the intelligence topic, the Sudan Referendum.
Three other limitations of this study relate directly to the implementation of the intervention. The first was the amount of time that may be appropriate for learning not only Complexity Manager but any structured analytic technique. One of the main considerations in restricting the intervention to 2.5 hours was maximizing the sample size.

The participants, being students, had many other obligations, and if the researcher had asked them to commit to a longer intervention, the sample size might have decreased significantly. However, the time restriction may not have allowed for proper understanding and absorption of Complexity Manager. Besides the time restriction on learning the structured analytic technique, deciding when to provide the Complexity Manager tutorial was a limitation. The researcher gave the tutorial directly after giving the tasking for the analysis. This was done to allow the groups to work at their own pace and complete their analysis earlier if they desired. However, this turned into a limitation because it overloaded the participants with information; collection may have suffered because the students were more concerned with understanding the technique. The third limitation is the timing of the intervention. Mercyhurst College operates on a trimester system, with each term lasting ten weeks, and the researcher gave the intervention during the eighth week of the term. Though students may have attended to earn extra credit knowing the end of the term was near, the timing may also have negatively affected the intervention. Participants dropped out of the intervention because they had other obligations such as team meetings and projects. Those that attended may have worked more quickly through it than if it had been held earlier in the term, when they had fewer pressing obligations to manage. If the technique had been presented separately and more thoroughly, and if the intervention had taken place earlier in the term, the results might have more accurately reflected the purpose of the study.

Recommendations for Future Research
Based on the results of this study, there are several recommendations for future research. Limitations concerning time constraints for training the participants in Complexity Manager could be mitigated or eliminated if the training were separate from the intervention.

Not only could understanding increase, but this would also allow the participants to analyze more complex issues that have multiple outcomes, a major function of Complexity Manager. Along with taking more time to train the participants, having participants that are trained in a particular area of expertise could also improve the intervention. Therefore, the second recommendation is either to include professionals with more experience in the field or to choose a topic within an area of expertise that would be more familiar to all participants. The participants in this intervention had to familiarize themselves with a topic that was largely unknown to most of them and also learn a new structured analytic technique. Having participants that are subject matter experts, or that have a higher level of expertise, could assist them in more fully utilizing Complexity Manager to explore potential outcomes and unintended side effects of a potential course of action. In other words, involving subject matter experts and working through issues with multiple possible outcomes are two major components of Complexity Manager that could be tested in further studies. The researcher of this intervention focused on variables, analyst confidence, and the forecasting accuracy of Complexity Manager. The third recommendation is to compare Complexity Manager to another well-tested structured technique. The researcher compared intuition to the use of Complexity Manager, which showed that those that used an intuitive process produced a greater number of specific variables than those that used Complexity Manager; however, those that used Complexity Manager worked significantly longer in their groups. Having both the control and experiment groups use a structured analytic method could eliminate the dramatic difference in time spent between the control and experiment groups and isolate the question: "Is Complexity Manager more effective than other techniques at assisting analysts in brainstorming the variables that impact a complex issue?"

The fourth recommendation is to obtain a higher number of participants or forecasts for the study. Collaboration was necessary for Complexity Manager; therefore, the researcher organized the participants into groups of three to four, which significantly reduced the number of forecasts. Increased participation, or a method that would allow for individual forecasting, would minimize this limitation. Further studies using these recommendations could more fully assess the validity of Complexity Manager.

Conclusions
Complexity Manager originated as a way for analysts to develop a group understanding of the relationships within a complex system. Heuer created Complexity Manager to help analysts generate new insights through variable identification and interaction in order to understand the potential outcomes and unintended side effects of a potential course of action. Heuer states that Complexity Manager is useful for identifying the variables that most significantly influence the decision at hand and enables the analyst to find the best possible answer to an intelligence question by organizing information within the structured technique. The dynamics of the variable interactions were not examined in this study because the experiment focused on how effective Complexity Manager is at variable identification and its correlation with forecasting accuracy, or the best possible answer to an intelligence question. The results show that those that used Complexity Manager identified fewer and less specific variables, but produced a higher number of accurate forecasts than those that did not use Complexity Manager.

Therefore, it can be concluded that Complexity Manager is effective at identifying variables that lead to more accurate forecasts than using intuition alone. Using Complexity Manager did not assist participants in identifying more variables than those that worked intuitively. This may suggest that those that used intuition actually used a divergent process, brainstorming, before forecasting, or that Complexity Manager is not effective at assisting analysts in drawing out a higher volume of or higher quality variables. However, the increased number of variables recorded did not translate into increased forecasting accuracy. The use of teams greatly reduced the number of forecasts, which in turn reduced the sample size available for extracting statistically significant results. Nevertheless, the results of this study suggest that Complexity Manager increases forecasting accuracy. Collaboration is a necessary part of Complexity Manager; the teams must brainstorm variables and work through the matrix together. The increased amount of time spent collaborating and following the steps of the structured analytic technique appears to have increased the number of accurate forecasts in the experiment group. This experiment showed that Complexity Manager's strongest abilities include effective collaboration, possible improvement in analytic confidence calibration, and assistance in increasing forecasting accuracy. An area for improvement would be a stronger definition of what a variable is in the context of Complexity Manager: is it a specific event that could be a catalyst for other events? Is it broader? Is a single variable composed of multiple significant events categorized under one general umbrella? Or is it a combination of both? Thinking of the entire IC, how should analysts balance single significant or seemingly insignificant events against more general trends?

Final Thoughts
The testing of the effectiveness of any one analytic technique, at this point in time, seems secondary to gathering empirical evidence regarding the collective benefits and abilities that structured analytic techniques offer. Instead of testing techniques one at a time, this researcher recommends testing two techniques against each other or two techniques against intuition alone. This would increase the number of techniques tested and would keep the control of intuition in place. More importantly, comparing techniques against each other could help reveal emerging patterns in the strengths and weaknesses that may overlap across all structured analytic techniques, which would improve all of them. Though this is only one study assessing the effectiveness and forecasting accuracy of one structured analytic technique, it did produce quantitative results suggesting that structured techniques decrease bias and increase forecasting accuracy. One by one, experiments and results such as these add to the validity of each structured analytic technique and strengthen the Intelligence Community as a whole.


REFERENCES
Berardo, R. (2009). Processing complexity in networks: a study of informal collaboration and its effects on organizational success. Policy Studies Journal, 37(3), 521-539. Retrieved September 24, 2010, from Academic Search Complete. doi: 10.1111/j.1541-0072.2009.00326.x
Betts, R. (1978). Analysis, war, and decision: why intelligence failures are inevitable. World Politics, 31(1), 69-89. Retrieved June 20, 2010, from http://www.jstor.org/stable/2009967.
Blaskovich, J.L. (2008). Exploring the effect of distance: an experimental investigation of virtual collaboration, social loafing, and group decisions. Journal of Information Systems, 22(1), 27-46. Retrieved September 3, 2010, from Academic Search Complete.
Brasfield, A.D. (2009). Forecasting accuracy and cognitive bias in the analysis of competing hypotheses (Unpublished master's thesis). Mercyhurst College, Erie, PA.
Cheikies, B. A., Brown, M. J., Lehner, P.E., & Adelman, L. (October 2004). Confirmation bias in complex analyses, 1-16. Retrieved June 13, 2010, from http://www.mitre.org/work/tech_papers/tech_papers_04/04_0985/04_0985.pdf.
Davis, J. (1999). Improving intelligence analysis at CIA: Dick Heuer's contribution to intelligence analysis. In Heuer, R., Jr. (1999). Psychology of Intelligence Analysis, Center for the Study of Intelligence: Central Intelligence Agency, xiii-xxv. Retrieved May 31, 2010, from https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-analysis/PsychofIntelNew.pdf.
Diaz, G. (January 2005). Methodological approaches to the concept of intelligence failure. UNISCI Discussion Papers, Number 7, 1-16. Retrieved July 31, 2010, from http://revistas.ucm.es/cps/16962206/articulos/UNIS0505130003A.PDF.
Edison's Lightbulb at The Franklin Institute. (2011). Retrieved April 1, 2011, from http://www.fi.edu/learn/sci-tech/edison-lightbulb/edison-lightbulb.php?cts=electricity.
Folker, R. D., Jr. (2000). Intelligence Analysis in Theater Joint Intelligence Centers: An Experiment in Applying Structured Methods. Occasional Paper Number Seven, 1-45. Retrieved June 13, 2010, from http://www.fas.org/irp/eprint/folker.pdf.
Ferrucci, S. (2009). Explicit conceptual models: synthesizing divergent and convergent thinking (Unpublished master's thesis). Mercyhurst College, Erie, PA.


George, R. Z. (2004). Fixing the problem of analytical mind-sets: alternative analysis. International Journal of Intelligence and Counterintelligence, 17(3), 385-404. doi: 10.1080/08850600490446727.
Goodman, M. (2003). 9/11: The failure of strategic intelligence. Intelligence and National Security, 18(2), 59-71. doi: 10.1080/02684520310001688871.
Grimmet, R.F. (2004). Terrorism: key recommendations of the 9/11 commission and recent major commissions and inquiries (Congressional Research Service). Washington, DC. Retrieved September 3, 2010, from http://www.au.af.mil/au/awc/awcgate/crs/rl32519.pdf.
Hart, D. & Simon, S. (2006). Thinking straight and talking straight: problems of intelligence analysis. Survival, 48(1), 35-59. doi: 10.1080/00396330600594231.
Hedley, J. (2005). Learning from intelligence failures. International Journal of Intelligence and Counterintelligence, 18(3), 436. doi: 10.1080/08850600590945416.
Heuer, R., Jr. (2009). The evolution of structured analytic techniques. Presentation to the National Academy of Science, National Research Council Committee on Behavioral and Social Science Research to Improve Intelligence Analysis for National Security. Washington, D.C.
Heuer, R., Jr. (1999). Psychology of Intelligence Analysis. Center for the Study of Intelligence: Central Intelligence Agency, 1-183. Retrieved May 31, 2010, from https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-analysis/PsychofIntelNew.pdf.
Heuer, R. J., Jr. (2008). Small group processes for intelligence analysis, 1-38. Retrieved September 16, 2010, from http://www.pherson.org/Library/H11.pdf.
Heuer, R. J. Jr. & Pherson, R. (2010). Structured Analytic Techniques for Intelligence Analysis. Washington, D.C.: CQ Press.
Johnston, R. (2005). Analytic culture in the US Intelligence Community: an ethnographic study. Retrieved July 31, 2010, from http://www.au.af.mil/au/awc/awcgate/cia/analytic_culture.pdf.
Johnston, R. Integrating methodologists into teams of substantive experts. Studies in Intelligence, 47(1), 57-65. Retrieved June 13, 2010, from http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA525552&Location=U2&doc=GetTRDoc.pdf.
Katter, R., Montgomery, C., & Thompson, J. (1979). Cognitive processes in intelligence analysis: a descriptive model and review of the literature (Technical Report 445). Arlington: US Army Intelligence and Security Command.


Khatri, N. & Alvin, H. Role of intuition in strategic decision making, 1-38. Retrieved May 31, 2010, from http://www3.ntu.edu.sg/nbs/sabre/working_papers/01-97.pdf.
Lefebvre, S. J. (2003). A look at intelligence analysis. Retrieved from http://webzoom.freewebs.com/swnmia/A%20Look%20At%20Intelligence%20Analysis.pdf.
Light Bulb History - Invention of the Light Bulb. (2007). Retrieved April 1, 2011, from http://www.ideafinder.com/history/inventions/lightbulb.htm.
Marrin, S. (2004). Preventing intelligence failures by learning from the past. International Journal of Intelligence and Counterintelligence, 17(4), 655-672. Retrieved June 20, 2010, from http://dx.doi.org/10.1080/08850600490496452.
Myers, D.G. (2002). Intuition: Its Powers and Perils. London: Yale University Press.
National Commission on Terrorist Attacks. (2004). The 9/11 commission report. Washington, DC: US Government Printing Office. Retrieved October 15, 2010, from http://govinfo.library.unt.edu/911/report/911Report.pdf.
October 21, 1879: Edison Gets the Bright Light Right. This Day In Tech. Wired.com. (2009). Retrieved April 1, 2011, from http://www.wired.com/thisdayintech/2009/10/1021edison-light-bulb/.
The Office of the Director of National Intelligence. (2008). United States Intelligence Community Information Sharing Strategy (ODNI Publication No. A218084). Washington, DC. Retrieved September 3, 2010, from http://www.dni.gov/reports/IC_Information_Sharing_Strategy.pdf.
Pope, S. & Josang, A. Analysis of Competing Hypotheses using subjective logic. 10th International Command and Control Research and Technology Symposium: The Future of C2 Decisionmaking and Cognitive Analysis. Retrieved May 31, 2010, from http://www.cs.umd.edu/hcil/VASTcontest06/paper126.pdf.
Quality. (n.d.). In Merriam-Webster's collegiate dictionary. Retrieved from http://www.merriam-webster.com/dictionary/quality.
RAND National Security Research Division. (2008). Assessing the tradecraft of intelligence analysis. Santa Monica: Treverton, G. & Gabbard, C.B.
Ross, W. (2011, January 11). Southern Sudan votes on independence. BBC. Retrieved from http://www.bbc.co.uk/news/world-africa-12144675.
Robson, D.W. Cognitive rigidity: methods to overcome it. Retrieved May 31, 2010, from https://analysis.mitre.org/proceedings/Final_Papers_Files/40_Camera_Ready_Paper.pdf.

Shrum, W. Collaborationism. Retrieved September 24, 2010, from http://worldsci.net/papers.htm#Collaboration.
Surowiecki, J. (2004). The Wisdom of Crowds. New York: Anchor Books.
Thijs, B. & Glänzel, W. (2010). A structural analysis of collaboration between European research institutes. Research Evaluation, 19(1), 55-65. doi: 10.3152/095820210X492486.
Thomas Edison's biography: Edison Invents! Smithsonian Lemelson Center. (n.d.). Retrieved April 1, 2011, from http://invention.smithsonian.org/centerpieces/edison/000_story_02.asp.
Turnley, J. G. & McNamara, L. An ethnographic study of culture and collaborative technology in the intelligence community. Sandia National Laboratory, 1-21. Retrieved May 31, 2010, from http://est.sandia.gov/consequence/docs/JICRD.pdf.
Wheaton, K. J. & Beerbower, M. T. (2006). Towards a new definition of intelligence. Stanford Law & Policy Review, 17(2), 319-330. Retrieved September 13, 2010, from LexisNexis.


Appendix A: IRB Approval


Appendix B: Structured Methods Experiment Sign Up

Name:
Class Year:
E-mail Address:
Undergraduate Minor (If Applicable):
Graduate Student's Undergraduate Major (If Applicable):
Graduate Student's Undergraduate Minor (If Applicable):

Please select a ranked preference for the dates (Rank Session Preference: 1 = Highest, 4 = Lowest):
Monday, November 1, 2010: 6 pm _____
Tuesday, November 2, 2010: 6 pm _____
Wednesday, November 3, 2010: 6 pm _____
Thursday, November 4, 2010: 6 pm _____

Upon completion, please return this form to Lindy Smart.

Appendix C: Participation Consent Form

You have been invited to participate in a study about forecasting in intelligence analysis. Your participation in the experiment involves the following: team assignments, a team evaluation of a designated subject, and returning the completed forms to the researcher of the experiment. Teams will be given one hour for collection and then will reconvene for up to an hour and a half to put the analysis together and give a team forecast. Your name will only be used to notify professors of your participation in order for them to assign extra credit. There are no foreseeable risks or discomforts associated with your participation in this study. Participation is voluntary and you have the right to opt out of the study at any time for any reason without penalty.

I, ____________________________, acknowledge that my involvement in this research is voluntary and agree to submit my data for the purpose of this research.

_________________________________    __________________
Signature                             Date

_________________________________    __________________
Printed Name                          Class

Name(s) of professors offering extra credit: ____________________________________
Researcher's Signature: ___________________________________________________

If you have any further questions about forecasting or this research, you can contact me.

Research at Mercyhurst College which involves human participants is overseen by the Institutional Review Board. Questions or problems regarding your rights as a participant should be addressed to Tim Harvey; Institutional Review Board Chair; Mercyhurst College; 501 East 38th Street; Erie, Pennsylvania 16546-0001; Telephone (814) 824.3372.

Lindy Smart, Applied Intelligence Master's Student, Mercyhurst College
Kristan Wheaton, Research Advisor, Mercyhurst College


Appendix D: Forecasting Thesis Experiment Instructions

You are an analyst working at the Embassy of the United States of America in Sudan. You have been tasked with forecasting whether the vote for the Sudan Referendum set for January 9, 2011 will occur as scheduled or if it will be delayed. You are also to identify the variables that are most influential in deciding the course of the Sudan Referendum. The state-level high committees responsible for organizing the referendum expect delays, but the United Nations is committed to conducting it on time. You and your teammates will be assigned areas of expertise in the economic, political, military, social, technological, and geographic areas of Sudan. You will use open source information to complete your task. You will be given a list of sources to use as a starting point for collection. Teams will be given one hour for collection and then will reconvene for up to an hour and a half to put the analysis together and give a team forecast.

Researcher Contact: Lindy Smart

Appendix E: Starting Point for Collection


Possible Sources:
http://www.usip.org/
http://www.state.gov/
http://www.bloomberg.com/
http://www.pbs.org/newshour/
http://www.alertnet.org/
http://www.reuters.com/
http://www.hrw.org/
http://news.yahoo.com/
http://www.washingtontimes.com/news/
http://allafrica.com/
http://www.bbc.co.uk/news/world/africa/
http://www.janes.com/
http://w3.nexis.com/new/ (Must have your username and password)
http://merlin.mercyhurst.edu/ (Databases available through the Mercyhurst Library)

Appendix F: Structured Methods Experiment Methodology


1. State the problem to be analyzed, including the time period to be covered by the analysis:
___________________________________________________________________
___________________________________________________________________

2. Brainstorming list of relevant variables:

Economic:

Political:

Social:

Technology:

Military:


Geographic:

3. List the variables in the Cross-Impact Matrix. Put the most important variables at the top. (*The matrix is not limited to 10 variables and may not contain 10 variables.)

[Cross-Impact Matrix: a grid with rows and columns both labeled A through J.]
Reading the Matrix: The cells in each row show the impact of the variable represented by that row on each of the variables listed across the top of the matrix. The cells in each column show the impact of each variable listed down the left side of the matrix on the variable represented by the column.

Direction and magnitude of the impact (in the original matrix, the size of the sign shows the strength of the impact):
+ Strong positive impact      - Strong negative impact
+ Medium positive impact      - Medium negative impact
+ Weak positive impact        - Weak negative impact

Use plus and minus signs to show whether the variable being analyzed has a positive or negative impact on the paired variable. The size of the plus or minus sign signifies the strength of the impact on a three-point scale. 3=strong, 2=medium, 1=weak If the variable being analyzed has no impact on the paired variable, the cell is left empty. If a variable might change in a way that could reverse the direction of its impact, from positive to negative or vice versa, this is shown by using both a plus and a minus sign.

Please note: The size of the matrix above does not reflect the actual size of the matrix given to students. Students received a matrix that fit a page in its entirety.

DIRECTIONS FOR COMPLETING THE CROSS-IMPACT MATRIX
3. As a team, assess the interaction between each pair of variables and enter the results into the relevant cells of the matrix. For each pair of variables, ask the question: Does this variable impact the paired variable in a manner that will increase or decrease the impact or influence of that variable?

a. When entering ratings in the matrix, it is best to take one variable at a time, first going down the column and then working across the row. The variables will be evaluated twice; for example, the impact of variable A on variable B and the impact of variable B on variable A.
b. After rating each pair of variables, and before doing further analysis, consider pruning the matrix to eliminate variables that are unlikely to have a significant effect on the outcome.
c. Measure the relative significance of each variable by adding up the weighted values in each row and column. Record the totals in each row and column.
   i. The sum of the weights in each row is a measure of each variable's impact on the system as a whole.
   ii. The sum of the weights in each column is a measure of how much each variable is affected by all the other variables.
   iii. Those variables most impacted by the other variables should be monitored as potential indicators of the direction in which events are moving or as potential sources of unintended consequences.
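Step c above (summing the weighted values in each row and column to rank variables) can be illustrated with a short script. The sketch below is not part of the original handout: it assumes the matrix cells are recorded as signed integers from -3 to +3 (0 or absent for an empty cell), matching the three-point scale in the directions, and it sums signed values; whether signed values or their magnitudes should be summed is an assumption here.

# Hypothetical cross-impact ratings: (row_variable, column_variable) -> signed weight.
# Weights follow the handout's three-point scale: +/-3 strong, +/-2 medium, +/-1 weak.
from typing import Dict, Tuple

ratings: Dict[Tuple[str, str], int] = {
    ("A", "B"): 3,   # A strongly increases B
    ("B", "A"): -2,  # B moderately decreases A
    ("A", "C"): 1,
    ("C", "B"): -1,
}
variables = ["A", "B", "C"]

# Row total: the variable's impact on the system as a whole.
row_totals = {v: sum(w for (r, _), w in ratings.items() if r == v) for v in variables}
# Column total: how much the variable is affected by all the other variables.
col_totals = {v: sum(w for (_, c), w in ratings.items() if c == v) for v in variables}

for v in variables:
    print(f"Variable {v}: impact on system = {row_totals[v]}, affected by others = {col_totals[v]}")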

4. Write about the impact of each variable, starting with variable A. (Use the following pages to write out your answers.) a. Describe the variable further if clarification is necessary (For example, if one of the variables you identified is “Weak Government Officials” then use this space to write exactly what you meant. You may want to include names, party affiliations, and examples of why the officials are “weak”). Variable A 1. Describe the variable further if clarification is necessary:


2. Identify all the variables that impact on Variable A with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact: Variables that impact Variable A: (Shown in the COLUMNS) a. How strong is it and how certain? b. When might these impacts be observed? c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable A has an impact with a rating of 2 or 3 (Medium or Strong Effect): Variables on which Variable A has an impact: (Shown in the ROWS) a. How strong is it and how certain? b. Identify and discuss the potentially good or bad side effects of these impacts. Good side effects: Bad side effects:

Variable B 1. Describe the variable further if clarification is necessary: 2. Identify all the variables that impact on Variable B with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact: Variables that impact Variable B: (Shown in the COLUMNS) a. How strong is it and how certain?


b. When might these impacts be observed? c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable B has an impact with a rating of 2 or 3 (Medium or Strong Effect): Variables on which Variable B has an impact: (Shown in the ROWS) a. How strong is it and how certain? b. Identify and discuss the potentially good or bad side effects of these impacts. Good side effects: Bad side effects:

Variable C 1. Describe the variable further if clarification is necessary: 2. Identify all the variables that impact on Variable C with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact: Variables that impact Variable C: (Shown in the COLUMNS) a. How strong is it and how certain? b. When might these impacts be observed?


c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable C has an impact with a rating of 2 or 3 (Medium or Strong Effect): Variables on which Variable C has an impact: (Shown in the ROWS) a. How strong is it and how certain? b. Identify and discuss the potentially good or bad side effects of these impacts. Good side effects: Bad side effects:

Variable D 1. Describe the variable further if clarification is necessary: 2. Identify all the variables that impact on Variable D with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact: Variables that impact Variable D: (Shown in the COLUMNS) a. How strong is it and how certain? b. When might these impacts be observed?

c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable D has an impact with a rating of 2 or 3 (Medium or Strong Effect): Variables on which Variable D has an impact: (Shown in the ROWS) a. How strong is it and how certain? b. Identify and discuss the potentially good or bad side effects of these impacts. Good side effects: Bad side effects:

Variable E: 1. Describe the variable further if clarification is necessary: 2. Identify all the variables that impact on Variable E with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact: Variables that impact Variable E: (Shown in the COLUMNS) a. How strong is it and how certain? b. When might these impacts be observed? c. Will the impacts be felt only in certain conditions?


3. Identify and discuss all variables on which Variable E has an impact with a rating of 2 or 3 (Medium or Strong Effect): Variables on which Variable E has an impact: (Shown in the ROWS) a. How strong is it and how certain? b. Identify and discuss the potentially good or bad side effects of these impacts. Good side effects: Bad side effects:

Variable F: 1. Describe the variable further if clarification is necessary: 2. Identify all the variables that impact on Variable F with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact: Variables that impact Variable F: (Shown in the COLUMNS) a. How strong is it and how certain? b. When might these impacts be observed? c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable F has an impact with a rating of 2 or 3 (Medium or Strong Effect): Variables on which Variable F has an impact: (Shown in the ROWS) a. How strong is it and how certain? b. Identify and discuss the potentially good or bad side effects of these impacts. Good side effects: Bad side effects:

Variable G: 1. Describe the variable further if clarification is necessary: 2. Identify all the variables that impact on Variable G with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact: Variables that impact Variable G: (Shown in the COLUMNS) a. How strong is it and how certain? b. When might these impacts be observed? c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable G has an impact with a rating of 2 or 3 (Medium or Strong Effect): Variables on which Variable G has an impact: (Shown in the ROWS) a. How strong is it and how certain? b. Identify and discuss the potentially good or bad side effects of these impacts. Good side effects: Bad side effects:

Variable H: 1. Describe the variable further if clarification is necessary: 2. Identify all the variables that impact on Variable H with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact: Variables that impact Variable H: (Shown in the COLUMNS) a. How strong is it and how certain? b. When might these impacts be observed? c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable H has an impact with a rating of 2 or 3 (Medium or Strong Effect):


Variables on which Variable H has an impact: (Shown in the ROWS) a. How strong is it and how certain? b. Identify and discuss the potentially good or bad side effects of these impacts. Good side effects: Bad side effects:

Variable I: 1. Describe the variable further if clarification is necessary: 2. Identify all the variables that impact on Variable I with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact: Variables that impact Variable I: (Shown in the COLUMNS) a. How strong is it and how certain? b. When might these impacts be observed? c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable I has an impact with a rating of 2 or 3 (Medium or Strong Effect): Variables on which Variable I has an impact: (Shown in the ROWS)


a. How strong is it and how certain? b. Identify and discuss the potentially good or bad side effects of these impacts. Good side effects: Bad side effects:

Variable J: 1. Describe the variable further if clarification is necessary: 2. Identify all the variables that impact on Variable J with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact: Variables that impact Variable J: (Shown in the COLUMNS) a. How strong is it and how certain? b. When might these impacts be observed? c. Will the impacts be felt only in certain conditions?

3. Identify and discuss all variables on which Variable J has an impact with a rating of 2 or 3 (Medium or Strong Effect): Variables on which Variable J has an impact: (Shown in the ROWS) a. How strong is it and how certain?


b. Identify and discuss the potentially good or bad side effects of these impacts. Good side effects: Bad side effects:

6. Analyze loops and indirect impacts:
a. Identify any feedback loops (a brief illustrative sketch follows the fill-in blanks below).
b. Determine if the variables are static or dynamic.
   i. Static: Static variables are expected to remain more or less unchanged during the period covered by the analysis.
   ii. Dynamic: Dynamic variables are changing or have the potential to change.
c. Determine if the dynamic variables are either predictable or unpredictable.
   i. Predictable: Predictable change includes established trends or established policies that are in the process of being implemented.
   ii. Unpredictable: Unpredictable change may be a change in leadership or an unexpected change in policy or available resources.

Feedback loops:

Static Variables:

Dynamic-Predictable:

Dynamic-Unpredictable:
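The "identify any feedback loops" instruction above can be illustrated computationally. The sketch below is not part of the original handout: the edges dictionary and the find_cycles function are hypothetical, and the sketch simply treats any nonzero matrix cell (row to column) as a directed edge and reports cycles, which correspond to feedback loops.

# Illustrative sketch (not part of the original handout): detecting feedback loops
# in a cross-impact matrix. Any nonzero cell (row -> column) is a directed edge;
# a cycle in that graph is a feedback loop. The edge data below is hypothetical.
from typing import Dict, List

edges: Dict[str, List[str]] = {
    "A": ["B"],   # A impacts B
    "B": ["C"],   # B impacts C
    "C": ["A"],   # C impacts A, so A -> B -> C -> A is a feedback loop
    "D": ["B"],
}

def find_cycles(graph: Dict[str, List[str]]) -> List[List[str]]:
    # Depth-first search; the same loop may be reported once for each variable it contains.
    cycles: List[List[str]] = []
    def dfs(node: str, path: List[str]) -> None:
        for nxt in graph.get(node, []):
            if nxt in path:
                cycles.append(path[path.index(nxt):] + [nxt])
            else:
                dfs(nxt, path + [nxt])
    for start in graph:
        dfs(start, [start])
    return cycles

for loop in find_cycles(edges):
    print(" -> ".join(loop))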


7. Draw conclusions: Using data about the individual variables assembled in Steps 5 and 6, draw conclusions about the system as a whole.
a. What is the most likely outcome, or what changes might be anticipated during the specified time period?
b. What are the driving forces behind the outcome?
c. What things could happen to cause a different outcome?
d. What desirable or undesirable side effects should be anticipated?


Appendix G: Forecasting Thesis Experiment Answer Sheet

Names and Corresponding Professor Offering Extra Credit:
Name ____________________   Prof. Offering Extra Credit: ____________________
Name ____________________   Prof. Offering Extra Credit: ____________________
Name ____________________   Prof. Offering Extra Credit: ____________________
Name ____________________   Prof. Offering Extra Credit: ____________________

Forecast:

Variable(s) considered:

Economic:

Social:

Political: Geographic:

Military:


Technological:

Source Reliability (circle one):   LOW   MEDIUM   HIGH

Analytic Confidence (circle one):   LOW   MEDIUM   HIGH

Appendix H: Follow-Up Questionnaire: Control Group

Thanks for your participation! Please take a few moments to answer the following questions. Your feedback is greatly appreciated.
1. Individual amount of time spent on the assigned task? _____________
2. Amount of time spent working with your group? _____________
3. Number of variables that you contributed to the group? _____________
4. Total number of variables that the group considered before forecasting? _____________
5. Please rate your level of knowledge of the Sudan Referendum before the experiment with 1 being no knowledge at all and 5 being a very thorough understanding.
1   2   3   4   5

6. Please rate the clarity of the instructions for the task with 1 being not clear at all and 5 being very clear.
   1     2     3     4     5

7. Please rate the availability of open source information with 1 being very difficult to find and 5 being very easily found.
   1     2     3     4     5

8. Please rate how helpful it was to work in teams for this task with 1 being not helpful at all and 5 being very helpful.
   1     2     3     4     5

9. Please provide any additional comments you may have about the experiment:

Appendix I: Follow-Up Questionnaire: Experiment Group

Thanks for your participation! Please take a few moments to answer the following questions. Your feedback is greatly appreciated.
1. Individual amount of time spent on the assigned task? _____________
2. Amount of time spent working with your group? _____________
3. Number of variables that you contributed to the group? _____________
4. Total number of variables that the group considered before forecasting? _____________
5. Please rate the clarity of the instructions for the task with 1 being not clear at all and 5 being very clear.
   1     2     3     4     5

6. Please rate the availability of open source information with 1 being very difficult to find and 5 being very easily found.
   1     2     3     4     5

7. Please rate how helpful it was to work in teams for this task with 1 being not helpful at all and 5 being very helpful.
   1     2     3     4     5

8. Please rate your level of knowledge of the Sudan Referendum before the experiment with 1 being no knowledge at all and 5 being a very thorough understanding.
   1     2     3     4     5

9. Please rate how helpful Complexity Manager was for assessing significant variables before forecasting the assigned tasks with 1 being not helpful at all and 5 being very helpful.
   1     2     3     4     5
   Comment:
10. Please rate your level of understanding of Complexity Manager before the experiment with 1 being no understanding at all and 5 being a very thorough understanding.
   1     2     3     4     5
   Comment:
11. Please rate your level of understanding of Complexity Manager after the experiment with 1 being no understanding and 5 being a very thorough understanding.
   1     2     3     4     5
   Comment:
12. Would you use Complexity Manager for future tasks? (circle one)   Yes   No


   Comments:
13. Please provide any additional comments you may have about Complexity Manager or the experiment overall:

Appendix J: Complexity Manager Participant Debriefing

Thank you for participating in this research. I appreciate your contribution and willingness to support the student research process. The purpose of this study is to determine the forecasting accuracy of Complexity Manager compared to unstructured methods. The experiment was designed to test whether Complexity Manager helped analysts forecast more accurately than the intuitive process alone. This is the first experiment conducted on Complexity Manager, and it adds to the existing body of experiments on the effectiveness and accuracy of structured methodologies. Participants were assigned at random, and both the control group and the experiment group were placed into groups of three to simulate the subject matter expert collaboration required for this methodology. The results of this experiment will be given to Mr. Richards Heuer, the creator of Complexity Manager.

If you have any further questions about Complexity Manager or this research you can contact me.


Appendix K: SPSS Testing

Time in Minutes: Individual

Case Processing Summary (Time in Minutes)
Group        Valid N (%)     Missing N (%)   Total N (%)
Control      80 (100.0%)     0 (.0%)         80 (100.0%)
Experiment   64 (100.0%)     0 (.0%)         64 (100.0%)

Independent Samples Test (Time in Minutes)
Levene's Test for Equality of Variances: F = 19.425, Sig. = .000
Equal variances assumed:      t = -.797, df = 142,    Sig. (2-tailed) = .427, Mean Difference = -3.35938
Equal variances not assumed:  t = -.750, df = 92.938, Sig. (2-tailed) = .455, Mean Difference = -3.35938
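As a cross-check on tables such as the one above, the same pair of tests SPSS reports (Levene's test followed by Student's and Welch's t-tests) could be reproduced with a short script. This is a sketch only: the two data lists are hypothetical placeholders, since the raw per-participant times are not reproduced in this appendix.

# Minimal sketch of the tests SPSS reports above: Levene's test for equality
# of variances, then Student's and Welch's t-tests. The data lists below are
# hypothetical placeholders, not the study's raw data.
from scipy import stats

control_minutes = [30, 45, 60, 20, 50, 35]      # hypothetical
experiment_minutes = [40, 55, 35, 65, 45, 70]   # hypothetical

# SPSS centers Levene's test on the mean, so center='mean' is used here.
levene_f, levene_p = stats.levene(control_minutes, experiment_minutes, center='mean')
t_equal, p_equal = stats.ttest_ind(control_minutes, experiment_minutes, equal_var=True)
t_welch, p_welch = stats.ttest_ind(control_minutes, experiment_minutes, equal_var=False)

print(f"Levene's test:               F = {levene_f:.3f}, p = {levene_p:.3f}")
print(f"Equal variances assumed:     t = {t_equal:.3f}, p = {p_equal:.3f}")
print(f"Equal variances not assumed: t = {t_welch:.3f}, p = {p_welch:.3f}")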

Time in Minutes: Group

Group Statistics (Time in Minutes for Group)
Group        N    Mean      Std. Deviation   Std. Error Mean
Control      80   40.4375   20.71802         2.31635
Experiment   67   74.1045   30.30058         3.70181

Independent Samples Test (Time in Minutes for Group)
Levene's Test for Equality of Variances: F = 14.017, Sig. = .000
Equal variances assumed:      t = -7.963, df = 145,     Sig. (2-tailed) = .000, Mean Difference = -33.66698
Equal variances not assumed:  t = -7.710, df = 113.292, Sig. (2-tailed) = .000, Mean Difference = -33.66698

Analytic Confidence

Source Reliability


Survey Questions

Group Statistics
Variable                  Group        N    Mean     Std. Deviation   Std. Error Mean
Knowledge of Sudan        Control      83   1.4578   .83083           .09120
                          Experiment   67   1.7463   1.17219          .14321
Clarity of Instructions   Control      83   4.3614   .77426           .08499
                          Experiment   67   3.5821   .78140           .09546
Open Source               Control      83   4.1807   .76739           .08423
                          Experiment   67   4.2985   .79801           .09749
Team Help                 Control      83   4.4819   .70471           .07735
                          Experiment   67   4.3134   .98794           .12070

Independent Samples Test
Knowledge of Sudan:       Levene's F = 10.211, Sig. = .002; equal variances assumed: t = -1.760, df = 148, Sig. (2-tailed) = .080; not assumed: t = -1.699, df = 115.143, Sig. (2-tailed) = .092
Clarity of Instructions:  Levene's F = .011, Sig. = .917; equal variances assumed: t = 6.104, df = 148, Sig. (2-tailed) = .000; not assumed: t = 6.098, df = 140.859, Sig. (2-tailed) = .000
Open Source:              Levene's F = .044, Sig. = .834; equal variances assumed: t = -.918, df = 148, Sig. (2-tailed) = .360; not assumed: t = -.914, df = 138.980, Sig. (2-tailed) = .362
Team Help:                Levene's F = 6.172, Sig. = .014; equal variances assumed: t = 1.217, df = 148, Sig. (2-tailed) = .225; not assumed: t = 1.175, df = 115.648, Sig. (2-tailed) = .242


Variables

Economic Variables

Group Statistics (Economic Variable)
Group        N    Mean     Std. Deviation   Std. Error Mean
Control      24   2.9583   1.26763          .25875
Experiment   23   1.6522   .64728           .13497

Independent Samples Test (Economic Variable)
Levene's Test for Equality of Variances: F = 9.613, Sig. = .003
Equal variances assumed:      t = 4.419, df = 45,     Sig. (2-tailed) = .000
Equal variances not assumed:  t = 4.476, df = 34.545, Sig. (2-tailed) = .000


Social Variables

Political Variables

Group Statistics (Political Variable)
Group        N    Mean     Std. Deviation   Std. Error Mean
Control      24   3.3333   1.43456          .29283
Experiment   23   1.7826   .79524           .16582

Independent Samples Test (Political Variable)
Levene's Test for Equality of Variances: F = 4.010, Sig. = .051
Equal variances assumed:      t = 4.555, df = 45,     Sig. (2-tailed) = .000
Equal variances not assumed:  t = 4.608, df = 36.222, Sig. (2-tailed) = .000

Geographic Variables

Military Variables

Group Statistics (Military Variable)
Group        N    Mean     Std. Deviation   Std. Error Mean
Control      24   2.8750   .89988           .18369
Experiment   23   1.2174   .67126           .13997

Independent Samples Test (Military Variable)
Levene's Test for Equality of Variances: F = 3.229, Sig. = .079
Equal variances assumed:      t = 7.133, df = 45,     Sig. (2-tailed) = .000
Equal variances not assumed:  t = 7.178, df = 42.488, Sig. (2-tailed) = .000

Technology Variables

Group Statistics (Technology Variable)
Group        N    Mean     Std. Deviation   Std. Error Mean
Control      24   2.3750   1.43898          .29373
Experiment   23   1.1739   .83406           .17391

Independent Samples Test (Technology Variable)
Levene's Test for Equality of Variances: F = 9.883, Sig. = .003
Equal variances assumed: t = 3.481, df = 45, Sig. (2-tailed) = .001


Total Variables

Group Statistics (Total Variables)
Group        N    Mean      Std. Deviation   Std. Error Mean
Control      24   17.1250   4.08936          .83474
Experiment   23   8.8696    2.59903          .54193

Independent Samples Test (Total Variables)
Levene's Test for Equality of Variances: F = 6.132, Sig. = .017
Equal variances assumed:      t = 8.219, df = 45,     Sig. (2-tailed) = .000
Equal variances not assumed:  t = 8.295, df = 39.195, Sig. (2-tailed) = .000


Forecasting Accuracy

Data
Hypothesized Difference: 0
Level of Significance: 0.05
Group 1 Number of Successes: 3, Sample Size: 24
Group 2 Number of Successes: 6, Sample Size: 23

Intermediate Calculations
Group 1 Proportion: 0.125
Group 2 Proportion: 0.260869565
Difference in Two Proportions: -0.135869565
Average Proportion: 0.191489362
Z Test Statistic: -1.183389197

Two-Tail Test
Lower Critical Value: -1.959963985
Upper Critical Value: 1.959963985
p-Value: 0.236654937
Do not reject the null hypothesis.


Data
Hypothesized Difference: 0
Level of Significance: 0.05
Group 1 Number of Successes: 19, Sample Size: 24
Group 2 Number of Successes: 16, Sample Size: 23

Intermediate Calculations
Group 1 Proportion: 0.791666667
Group 2 Proportion: 0.695652174
Difference in Two Proportions: 0.096014493
Average Proportion: 0.744680851
Z Test Statistic: 0.754624002

Two-Tail Test
Lower Critical Value: -1.959963985
Upper Critical Value: 1.959963985
p-Value: 0.450474618
Do not reject the null hypothesis.
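The two-proportion z-tests above can be verified with a short script that re-derives the intermediate calculations from the success counts. The sketch below uses the counts from the first table (3 of 24 versus 6 of 23 groups forecasting accurately) and only the Python standard library.

# Two-proportion z-test, mirroring the intermediate calculations shown above.
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2, alpha=0.05):
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)                        # average proportion
    se = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))   # pooled standard error
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))         # two-tailed p-value
    critical = NormalDist().inv_cdf(1 - alpha / 2)       # +/- critical value
    return z, p_value, critical

# Forecasting accuracy: 3 of 24 control groups vs. 6 of 23 experiment groups.
z, p, crit = two_proportion_z_test(3, 24, 6, 23)
print(f"Z = {z:.4f}, p = {p:.4f}, critical = +/-{crit:.4f}")
# Prints Z = -1.1834, p = 0.2367, critical = +/-1.9600, matching the first table.

Running the same function with the second table's counts (19 of 24 versus 16 of 23) reproduces Z = 0.7546 and p = 0.4505.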