THINKING AND REASONING, 2001, 7 (1), 103–118


Studying judgement: General issues

Nigel Harvey

University College London

The previous papers raise a number of issues. How should we develop task typologies both to separate judgement from related cognitive tasks and to classify tasks within the judgement domain? Are there grounds for selecting between models of judgement when empirical tests fail to do so? What techniques can be used to find out more about the cognitive processes underlying judgement behaviour? I discuss these issues and give a brief assessment of the current state of play in this rapidly changing area.

I shall discuss five general issues. The first two are prompted by Maule’s observation that there is a lack of clarity about how judgement should be distinguished first from decision making and then from problem solving. I shall consider whether it is possible to make these distinctions clearer. The third issue arises from Funke’s analysis of the properties that can be used to define his area of judgement research and to classify tasks within that domain. I ask whether his approach to task typology could be extended to cover the judgement area as a whole. The fourth issue relates to the different positions taken by Dhami and Harries and by Smith, McKenna, Pattison, and Waylen with respect to use of psychological plausibility and transparency as criteria for selecting between models of judgement. I argue that there are problems associated with applying these criteria, but they are not ones that we have to face because possible means for selecting between models on empirical grounds are far from exhausted. The last issue concerns the process-tracing methodologies discussed by Harte and Koele. I suggest that use of these techniques would benefit from being more informed by recent developments in other areas of cognitive psychology.

Correspondence should be addressed to Nigel Harvey, Department of Psychology, University College London, Gower Street, London WC1E 6BT, UK. Tel: +44 (0)20 7679 5387; Fax: +44 (0)20 7436 4276; email: n.harvey@ucl.ac.uk

I should like to thank Clare Harries, Ilan Yaniv, and Mandeep Dhami for their insightful comments on an earlier draft of this paper.

© 2001 Psychology Press Ltd
http://www.tandf.co.uk/journals/pp/13546783.html
DOI: 10.1080/13546780042000064


DISTINGUISHING JUDGEMENT FROM DECISION MAKING

Any attempt to distinguish judgement and decision making is, to some extent, arbitrary and unlikely to satisfy everyone working in these fields. Goldstein and Hogarth (1997) have, however, argued that there are separate programmes of research into the two areas. Those interested in decision making are influenced by economists’ and statisticians’ research into how decisions ought to be made. They focus on how people decide on a course of action when outcomes of different courses of action are uncertain and when goals are in conflict. In contrast, those interested in judgement have been influenced mainly by research on perception (e.g., Brunswik, 1956). They are concerned primarily with how probabilistic environmental cues to some criterion variable and fallible cognitive processing of those cues result in estimates or predictions for that variable.

To say that the two areas are researched by different people working within different research traditions and addressing rather different issues still begs the question of exactly what judgement and decision making are and how they are related. It is fair to say that judgements are usually regarded as assessments, estimates, or predictions that can provide input to decision making in much the same way that perception can provide input to action. To make a decision, beliefs and desires must be integrated: judgements about the likelihoods of various possible states of the world under the different options must be combined with judgements about the desirability of those various states. Decisions may be poor either because the judgements on which they depend are inaccurate or because beliefs and desires are combined inappropriately.

Is there some operational way of distinguishing judgements from decisions? In his paper, Maule (this issue) calls for an explicit task typology. This would require a rule-based scheme for categorising tasks into either judgements or decisions. What task features could form such a scheme? Judgements vary in accuracy whereas decisions vary in optimality. This is not to say that judges focus only on the correspondence that their estimates have with reality while ignoring their coherence. They may check that their judgements are coherent: for example, they may confirm that their probability estimates for an exhaustive set of events sum to one. However, the purpose of this is to advance their primary aim of maximising accuracy. Similarly, decision makers may be concerned with the accuracy of the judgements on which their decisions rely. However, the purpose of this is to advance their primary aim of maximising returns.

In summary, judgements are assessed in terms of how accurate they are whereas decisions are assessed in terms of their potential consequences. Put in another way, decisions have consequences; judgements have no direct consequences but they can have indirect ones via the decisions that they inform. This suggests that all one has to do to change judgements into decisions is to add consequences to different types of outcome.

To see whether this is plausible, let us consider a couple of examples. The first concerns applications of signal detection theory to perception (Green & Swets, 1966) and judgement (Ferrell & McGoey, 1980; Swets, 2000). As long as criterion placement is influenced only by signal base rate, most psychologists would accept that the task under study is one of judgement. However, criterion placement can also be influenced by the positive payoffs associated with hits and correct rejections or by the negative payoffs associated with misses and false alarms. When this occurs, most psychologists would recognise the task as one of decision making rather than mere judgement. With this in mind, consider Swets’ (2000) example of doctors reading mammograms to determine whether or not they are indicative of breast cancer. A doctor just taking account of breast cancer base rates in order to maximise the number of correct diagnoses would be making a set of categorical judgements. A doctor who also takes into account the anguish that false alarms cause patients, the discomfort produced by biopsies, the cost of tissue sample tests, and so on would be making a set of diagnostic decisions.

For a second example, consider someone who is employed by a firm that manufactures consumer products and who makes judgemental forecasts for sales volumes of those products. If this person’s sole aim is to maximise forecast accuracy, it is fair to say that they are performing a judgement task. However, they may have other implicit aims. Sales targets are typically derived from forecasts, and staff often receive larger bonuses for exceeding targets by a greater amount. Thus there is an advantage to sales staff of underforecasting. In contrast, stock managers may be penalised if they have not ordered sufficient components to make up the products. For them, there is an advantage of overforecasting. Forecasters who take these asymmetric loss functions into account turn the forecasting task from one of judgement into one of decision making. This can cause problems when those affected by the forecasts do not share the forecaster’s loss function.

These examples suggest that it may be possible to establish the rule-based scheme for categorisation that is required by an explicit task typology: judgements do not take consequences into account and aim to maximise accuracy, whereas decisions do take consequences into account and aim to maximise returns. However, it will occasionally lead to tasks’ being characterised in ways that do not match most researchers’ intuitions. For example, an experimenter studying judgement may wish to provide participants with an incentive for being accurate. However, instituting a performance-related pay scheme would turn the task from one of judgement to one of decision making.
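The effect of attaching an asymmetric loss function of the sort described in the forecasting example can be made concrete with a small simulation. It is a minimal sketch: the numbers and the piecewise-linear loss are invented for illustration and are not taken from any of the papers under discussion. When over-forecasting is penalised more heavily than under-forecasting, the response that minimises expected loss is no longer the response that maximises accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predictive distribution for next month's sales volume.
possible_sales = rng.normal(loc=1000, scale=100, size=50_000)

def asymmetric_loss(forecast, outcomes, over_cost=3.0, under_cost=1.0):
    """Piecewise-linear loss: over-forecasting costs three times as much per
    unit as under-forecasting (a stock manager would reverse the weights)."""
    error = forecast - outcomes
    return np.where(error > 0, over_cost * error, -under_cost * error)

candidates = np.arange(800, 1201)
expected_loss = [asymmetric_loss(f, possible_sales).mean() for f in candidates]

print(f"Accuracy-maximising forecast (mean): {possible_sales.mean():.0f}")
print(f"Loss-minimising forecast:            {candidates[int(np.argmin(expected_loss))]}")
```

With these particular penalties the loss-minimising forecast falls well below the mean of the predictive distribution (close to its 25th percentile): once consequences are attached, the task is no longer simply one of producing the most accurate estimate.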

It is unlikely that researchers themselves spontaneously use a rule-based scheme to classify tasks. According to prototype theory (e.g., Rosch, 1978), concepts are mentally represented by an “ideal” or “average” prototype and instances are categorised according to the prototype to which they are most similar. According to exemplar models of categorisation (e.g., Medin & Schaffer, 1978; Nosofsky, 1986), instances are classified according to their similarity not to prototypes but to sets of exemplars that are accepted as members of candidate categories. Both prototype and exemplar models imply that researchers categorise tasks by using information about how they previously categorised other tasks. As experience of different researchers is unlikely to be the same, they will vary somewhat in how they categorise tasks.
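To see how the two accounts differ in practice, here is a minimal sketch. The feature coding, the similarity function, and the stored tasks are invented for illustration; the published models are considerably more elaborate.

```python
import numpy as np

# Toy feature vectors for previously categorised tasks (hypothetical features,
# e.g. "payoffs attached?" and "single correct answer?", coded between 0 and 1).
judgement_exemplars = np.array([[0.1, 0.2], [0.2, 0.1], [0.0, 0.3]])
decision_exemplars = np.array([[0.9, 0.8], [0.8, 0.9], [1.0, 0.7]])

def similarity(a, b, c=2.0):
    """Exponential-decay similarity of the kind used in exemplar models."""
    return np.exp(-c * np.linalg.norm(a - b))

def classify_by_prototype(task):
    # Prototype view: compare the task to the average member of each category.
    protos = {"judgement": judgement_exemplars.mean(axis=0),
              "decision": decision_exemplars.mean(axis=0)}
    return max(protos, key=lambda k: similarity(task, protos[k]))

def classify_by_exemplars(task):
    # Exemplar view: sum similarity to every stored member of each category.
    sims = {"judgement": sum(similarity(task, e) for e in judgement_exemplars),
            "decision": sum(similarity(task, e) for e in decision_exemplars)}
    return max(sims, key=sims.get)

new_task = np.array([0.3, 0.4])
print(classify_by_prototype(new_task), classify_by_exemplars(new_task))
```

Because the exemplar route depends on whichever previously categorised tasks happen to be stored, two researchers with different histories can classify the same new task differently, which is the point made above.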

DISTINGUISHING PROBLEM SOLVING FROM JUDGEMENT AND DECISION MAKING

Researchers are also likely to use prototypes or exemplars rather than rules to distinguish problem solving tasks from judgement and decision making tasks. However, we can still ask whether there is some rule-based scheme that would form the basis for an explicit task typology of the sort that Maule calls for.

In problem solving tasks, the solution, once found, is effective. There is no noise or random element in the system capable of preventing the solution from working. The solution to a crossword or other problem is just as effective each time it is implemented. Decision making tasks are different. Even when an optimal decision exists, there is no guarantee that it will be effective. This is because randomness in the system can perturb the outcome away from what would be expected in the long run. If the average journey time to the airport is less by train than by bus, and journey time is the only consideration, then the optimal decision is to go by train rather than by bus. However, signalling problems may delay trains on the day of travel and make them slower than buses. Optimal decisions are unlikely to have the same effectiveness each time that they are implemented.
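A brief simulation makes the train example concrete. The journey-time distributions and the delay probability are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_days = 100_000

# Invented journey-time model: the train is faster on average, but occasional
# signalling problems add a large delay; the bus is slower but more stable.
delay = rng.binomial(1, 0.15, n_days) * rng.normal(30, 10, n_days)
train = rng.normal(40, 5, n_days) + delay
bus = rng.normal(55, 8, n_days)

print(f"Mean journey time - train: {train.mean():.1f} min, bus: {bus.mean():.1f} min")
print(f"Days on which the 'optimal' train choice was slower: {(train > bus).mean():.1%}")
```

Even though the train has the lower mean journey time, it turns out to be the slower option on an appreciable proportion of days: choosing optimally does not guarantee the better outcome on any particular occasion.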

This rather simplistic way of distinguishing problem solving from decision making is not altogether satisfactory. Even very simple systems without a random element can produce unpredictably chaotic behaviour (May, 1986). Adding a purely random element to them can make that behaviour simpler (Crutchfield, Farmer, & Huberman, 1982) and can hardly be said to change the task of controlling them from one of problem solving to one of decision making.

Perhaps we could say that people make decisions in unpredictable environments but solve problems in predictable ones. Unpredictability could arise either from randomness or from deterministic chaos. There are at least two problems with this. First, chaos appears and disappears in non-linear systems as control parameters are altered. Thus, someone attempting to control a system by changing a control parameter may change their task from one of decision making to one of problem solving and back again many times while performing it. Task typologies should be more robust than this. Second, unpredictability is in the eye of the beholder. It means that observers cannot extract patterns that fully account for the behaviour in question (cf. Ayton, Hunt, & Wright, 1989). Even behaviour of a deterministic linear system may initially be unpredictable but, after a great deal of practice, may become fully predictable. Would we really want to say that the task had changed from one of decision making to one of problem solving? I do not think so: by the time someone had produced this transformation, performance would be perfect and there would be no problem to solve.

It is very difficult to discern any feature crucial to distinguishing the experiments of those working on “complex problem solving” (Frensch & Funke, 1995a) from the experiments of those working on dynamic decision making (e.g., Gibson, Fichman, & Plaut, 1997; Kleinmuntz & Thomas, 1987; Sterman, 1989). Indeed, Dörner argues that the “complex problem solver permanently elaborates on her goals and constructs hypotheses about the (unknown) structure of the domain. He or she makes decisions and needs to control the results”, while Huber suggests that complex problem solving “is the task of optimising one or more target variables of a system by a series of decisions” (Frensch & Funke, 1995b, p. 14). Thus, it appears that to those working in the area, complex problem solving is seen as a type of decision making. Those working in judgement and decision making are unlikely to disagree.

Perhaps many of the tasks studied by those interested in complex problem solving are better characterised as dynamic judgement tasks (cf. Cooksey, 1996, Chapter 8). This is because the goal is usually to bring a variable into a target range and to keep it there. Consequences of failure and how they relate to degree and direction of error are rarely made explicit. The aim is to maximise accuracy rather than financial returns. There are, however, a number of exceptions to this generalisation that can be more properly described as dynamic decision making tasks (e.g., Huber, 1995).

DISTINGUISHING BETWEEN DIFFERENT TYPES OF JUDGEMENT TASK

Psychologists working on skilled behaviour have found it useful to develop various task typologies (e.g., Fleishman & Quaintance, 1984). They have found that different typologies may be useful for different purposes: a classification based on observed behaviours may be useful for training and instruction of skill; one based on the abilities demanded by tasks may be useful for predicting transfer; one based on how lay people describe tasks may be useful for career guidance.

The previous sections contain suggestions for distinguishing judgement from decision making and both of these from problem solving. They represent an attempt to develop a coarse-grained task typology. Recently, those working on judgement have become more interested in producing finer-grained task typologies (e.g., Funke, 1995; Hammond, 1996, 2000) that distinguish between different types of judgement. There are good reasons for this. As Funke (this issue, p. 74) points out in his paper, research into dynamic tasks has used “a broad number of systems, each of them programmed in a unique manner, not easy to change for experimental purposes, and not at all comparable to other programs”. As there is no way of relating research on one system to research on another, it is difficult to summarise results from previous studies into a coherent body of knowledge that can be used to predict how people will behave in tasks that have not yet been examined. Arguably, this problem is much more severe in the study of “naturalistic decision making” (Zsambok & Klein, 1997). Hammond (2000, p. 27) has made a similar point with respect to studies of the effect of stress on judgement: “the topic has been studied with a variety of theories, methods, and hypotheses that have never been organised into a coherent body of knowledge—nor can they be. No doubt many excellent studies have been carried out, but the lack of a general organising principle defeats any attempt to make use of this body of work.”

How should we go about developing a task typology for judgement? Both Funke (1995) and Hammond (2000) recognise that performance must be a joint function of task properties and cognitive (person) properties but they have different approaches to developing a task typology. Funke (1995) adopts an operational approach: he defines different types of task in terms of situation and system factors. Hammond (1996, 2000) takes a more subject-based approach: different tasks are defined in terms of the type of cognitive processing that they elicit. I shall discuss these approaches in turn before commenting on their compatibility.

Funke (1995) includes type of task (i.e., what participants are told to do), level of stress (e.g., time pressure), individual vs group judgements, task transparency, and information presentation format as situation factors. He suggests that these can all be characterised as contextual in nature. However, it appears difficult to define them more precisely than by saying that they are task factors that are not system factors. Thus, in his paper, Funke (this issue, p. 73) argues that (in)transparency and multiple goals, two task factors identified by Dörner (1980), are part of the “presentation to a subject, but they are not features of the computer-simulated system, like connectivity and dynamics”. This separation of task factors into situation and system factors appears unambiguous and may prove useful. However, its potential for helping us to create a coherent body of knowledge about judgement would be enhanced if an exhaustive list of situation factors could be agreed. It would then be possible to specify how any judgement task related to any other. In his paper, Funke explicitly identifies dynamics and connectivity as system factors and implies that linearity is also an important feature of systems. It may be that number of variables (rather than “complexity”) should also be included as a system factor. Studies of multiple-cue probability learning suggest that the number of cues and the intercorrelations (connections) between them have different effects on judgement performance (Cooksey, 1996). Would defining system factors in this way (i.e., as system properties) be helpful in developing a task typology that could be used to organise our knowledge about judgement?
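One way to picture what such a scheme might buy us is to record each studied task as an explicit set of situation and system factors, so that any two tasks can be compared factor by factor. The sketch below is purely illustrative: the particular fields and the two example tasks are my own, not an agreed list.

```python
from dataclasses import dataclass, asdict

@dataclass
class JudgementTask:
    """Illustrative description of a task by situation and system factors."""
    # Situation factors: features of the presentation, not of the system itself
    instructions: str
    time_pressure: bool
    group_judgement: bool
    transparent: bool
    # System factors: properties of the (simulated) system
    n_variables: int
    connectivity: str          # e.g. "low" or "high"
    dynamic: bool
    linear: bool

task_a = JudgementTask("bring output into a target range", False, False, False,
                       2, "low", True, False)
task_b = JudgementTask("forecast next period's demand", True, False, True,
                       6, "high", True, True)

# Relating one study to another reduces to listing the factors on which they differ.
a, b = asdict(task_a), asdict(task_b)
differences = {name: (a[name], b[name]) for name in a if a[name] != b[name]}
print(differences)
```

An agreed and exhaustive list of factors would, in effect, fix the fields of such a record; relating one study to another would then amount to listing the factors on which their tasks differ.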


One problem is that the difficulty of controlling a system depends on its behaviour and its behaviour can depend on how it is parameterised. For example, consider a very simple system such as a logistic map: X_{t+1} = A·X_t·(1 – X_t), where X is a variable between 0 and 1 that has to be set within some target range and A is a parameter under control of participants. This system has a fixed set of properties (dynamic, non-linear, single variable, low connectivity) yet the difficulty of the task depends on the initial setting of A: when A is initially low, the system is stable and easy to control; when A is initially at an intermediate value, the system oscillates and is harder to control; when A is initially high, the system behaves chaotically and is extremely difficult to control (Harvey, Koehler, & Ayton, 1997).

Although the range of behaviours that a system can exhibit depends on its properties, it is the system behaviour rather than its properties that most directly influences task difficulty. Thus, perhaps we should consider whether system factors should be defined in terms of system behaviours rather than system properties. This is, in fact, an approach that has been previously taken by Funke (1995): he included autonomous movement by the system and feedback delay as behaviours that should be included as system factors.
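A few iterations of the logistic map above show how sharply its behaviour, and hence the difficulty of keeping X in a target range, changes with A. This is a minimal sketch: the starting value and the particular parameter values are merely representative of the low, intermediate, and high ranges just described, and the control task itself is not implemented.

```python
import numpy as np

def logistic_trajectory(a, x0=0.4, steps=12):
    """Iterate X_{t+1} = A * X_t * (1 - X_t) from a starting value x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(a * xs[-1] * (1 - xs[-1]))
    return np.round(xs, 3)

for a, label in [(2.8, "low A: settles to a stable point"),
                 (3.3, "intermediate A: oscillates"),
                 (3.9, "high A: chaotic")]:
    print(f"A = {a} ({label}):\n  {logistic_trajectory(a)}")
```

The system's listed properties are identical in all three runs; only its behaviour, and with it the controller's task, changes.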

Hammond (1996, p. 180) argues that “cognitive tasks can be ordered on a continuum with regard to their capacity to induce intuition, quasirationality, or analytical cognition”. Like others before him (e.g., Kahneman & Chajczyk, 1983; MacLeod & Dunbar, 1988), Hammond suggests that there is a continuum between intuitive (automatic) cognition and analytic (controlled) cognition. What he refers to as “quasirationality” is a form of cognition that involves both intuitive and analytic elements. His theory is that judgement performance will be optimal when the type of cognitive processing matches the type of processing that the tasks naturally elicit. Such a match will normally occur. Sometimes, however, the nature of the task will change quickly, the match will be disrupted, and performance will suffer (Hammond, 2000). Here we are concerned not with this ecologically based theory (which certainly requires testing) but with the task typology on which it relies.

Although Hammond (1996) characterises tasks in what appears to be a subject-based manner (i.e., the type of cognition that they induce in people), the way that he determines where a task lies on the continuum between analysis-inducing and intuition-inducing depends very much on the sort of task properties identified by Funke (1995; this issue). For example, Hammond, Hamm, Grassia, and Pearson (1997) use eight task factors, most of which would be characterised by Funke as system factors. They argue that these factors have implications for the type of cognition elicited by the task. For example, a large number of cues, linearity, and high system noise tend to elicit intuitive cognition, whereas fewer cues, non-linearity, and low system noise tend to elicit analytic cognition. Tasks are assessed on each of the eight factors using a 10-point scale and the results then combined into a single task continuum index. Given that judges’ models of cognitive processing can be ascertained, this index can then be used to make predictions about performance levels.

Hammond’s (1996) typology has some limitations. First, many psychologists would argue that the type of processing that a task naturally elicits depends both on how familiar the performer is with it (Fitts & Posner, 1967; Stanley, Mathews, Buss, & Kotler-Cope, 1989) and on the cognitive style induced by the type of education and training common within the performer’s culture (Yates, Lee, & Bush, 1997). Second, Hammond et al.’s (1997) list of task factors would need to be extended before it could be applied to the sort of dynamic tasks that are the focus of Funke’s research. These problems aside, Hammond et al. (1997) have shown that it is possible to use a set of task factors to structure knowledge in a way that allows predictions to be made about how tasks that have not been previously studied will be performed. Without some organising principle derived from current knowledge, the number of possible permutations of different levels of a large number of task factors would make this difficult to accomplish. Hammond’s organising principle is based on cognitive processing modes but organising principles based on other criteria could be useful for developing task taxonomies for other purposes (cf. Fleishman & Quaintance, 1984).

SELECTING BETWEEN MODELS OF JUDGEMENT

In their paper, Dhami and Harries report that the two models of judgement that they considered fitted their data almost equally well. Like Gigerenzer and Goldstein (1996), they suggest that psychological plausibility of models and the ease with which they can be understood (transparency) should also influence model acceptability. This would enable them to act as “tie-breakers” when data fail to distinguish between the alternatives. Let us consider these two criteria in turn.

By psychological plausibility, Dhami and Harries (this issue, p. 21) mean the degree to which a model’s assumptions are consistent with what else is known about human psychology. For example, they point out that advocates of fast-and-frugal heuristics regard regression models as psychologically implausible because they see them as “incompatible with the fact that the human mind is characterised by limited cognitive processing capacity (e.g., … Miller, 1956)”. This is, of course, exactly the argument made by those who had earlier proposed other models based on a different type of heuristic processing. For example, in their preface, Kahneman, Slovic, and Tversky (1982, p. xii) write: “Bruner and Simon were both concerned with strategies of simplification that reduce the complexity of judgment tasks, to make them tractable for the kind of mind that people happen to have. Much of the work that we have included in this book was motivated by the same concerns.” Smith, McKenna, Pattison, and Waylen (this issue) are right to suggest that we should exercise caution before accepting this argument.


In making assessments of psychological plausibility, it is only too easy to focus on certain findings while ignoring others. For example, Dhami and Harries point to Miller’s (1956) work that identified a limit (seven items plus or minus two) that constrains performance in categorisation and digit-span tasks. However, with practice, this limit is easily exceeded. Chase and Ericsson (1982) showed that people can acquire digit spans of over 70 items, and Ericsson and Faivre (1988) demonstrated that they can develop an ability to place unidimensional stimuli into 21 different categories. Furthermore, many psychologists would argue that it is becoming increasingly apparent that expertise in cognitive skills is based on storage of and search through many separate previously experienced exemplars (e.g., Kruschke, 1992; Nosofsky & Palmeri, 1997; Perruchet & Pacteau, 1990) rather than on use of simple rules extracted from these exemplars (e.g., Reber, 1989). They have come to this conclusion despite the fact that use of simple rules would reduce storage and search demands and, hence, if we have limited processing capacity, would be more psychologically plausible. Indeed, we could say that their work has redefined what can be considered to be psychologically plausible.

What about using transparency as a criterion for selecting between models? Dhami and Harries argue that transparent models are more useful because they can be employed to train people to make better judgements. Berry and Broadbent’s (1984) results cast some doubt on this claim. In their dynamic judgement task, they found that simple instructions about how to perform well improved people’s ability to answer questions about how to perform well without changing their level of performance. In general, we must treat any claim about the usefulness of models as a matter that requires empirical testing rather than as a priori grounds for accepting those models.

Dhami and Harries do not suggest that there is any relationship between the transparency and the correctness of models. However, it is clear that nature is not adapted to the way that conscious processing takes place in the human brain. Einstein’s physics provides a better account of the data than Newton’s does. However, it is much less transparent. In fact, people find even Newton’s account counter-intuitive. The naïve physical theories that they actually use are easy to understand and simple to communicate but do not match reality (Champagne, Klopfer, & Anderson, 1980; McCloskey, 1983). Observations such as this have led Wolpert (1992) to argue for a negative relationship between model transparency and correctness. Anyone accepting his arguments should be less likely to accept more transparent models as correct. If they assume that incorrect models are less likely to be useful than correct ones, they may also be less likely to accept more transparent models as useful. However, I suspect that most of us, while having some sympathy with Wolpert’s views, would prefer to remain sceptical.

If neither psychological plausibility nor transparency can be used as criteria to select between models when data fail to do so, what can be done? Parsimony is recognised as an important characteristic of scientific theories but, in practice, it can be very difficult to assess.

We should rely on the fact that, as long as models do not possess some hidden formal equivalence, the search for empirical ways of selecting between them has some chance of success. There is no reason to suppose that an experimental means of determining whether fast-and-frugal or regression models are more appropriate will not be found.

It is worth looking back at published findings as well as forward to new types of paradigm. For example, Einhorn (1972) asked doctors to code biopsies of 193 patients with Hodgkin’s disease and, on the basis of these, to make an overall rating of the severity of each one’s condition. These global assessments failed to predict the longevity of patients. However, when the biopsy variables that the doctors had coded were entered into multiple regression models, they did predict patients’ survival. This finding is easily explicable within the multiple-regression approach: people could assess the different predictor variables but could not take weights into account when integrating information into global assessments (Dawes, 1979). It is less clear how it should be interpreted within the fast-and-frugal framework. There is no problem with the ability to weight the predictor variables: that would be needed in order to decide which is best. But why could people not then make global assessments? Presumably all they had to do was select the predictor with the highest weight and to use it; no information integration is necessary within the fast-and-frugal framework.
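The contrast can be illustrated with simulated data (invented for this purpose, not Einhorn's): a criterion is predicted first by a weighted combination of all the coded cues and then by the single most valid cue used on its own, which is close to what the argument above suggests a fast-and-frugal judge would do.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Invented data: three coded biopsy-style cues and a criterion built from them.
cues = rng.normal(size=(n, 3))
true_weights = np.array([0.6, 0.3, 0.1])
criterion = cues @ true_weights + rng.normal(scale=0.5, size=n)

def r(pred, crit):
    return np.corrcoef(pred, crit)[0, 1]

# Multiple regression: weight and integrate all cues.
X = np.column_stack([np.ones(n), cues])
beta, *_ = np.linalg.lstsq(X, criterion, rcond=None)
regression_pred = X @ beta

# Single-best-cue strategy: use only the most valid cue, unweighted.
validities = [abs(r(cues[:, j], criterion)) for j in range(3)]
best = int(np.argmax(validities))
single_cue_pred = cues[:, best]

print(f"Combined cues:  r = {r(regression_pred, criterion):.2f}")
print(f"Best cue alone: r = {r(single_cue_pred, criterion):.2f}")
```

In this simulation the combined model predicts the criterion better than the best single cue, but the single cue still predicts it reasonably well; that only sharpens the puzzle of why the doctors' global assessments failed to predict longevity at all.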

IDENTIFYING COGNITIVE PROCESSES UNDERLYING JUDGEMENT AND DECISION MAKING

Judgement and decision making is one of many areas in which psychologists are keen to identify the cognitive processes responsible for task performance. As far as I know, it is the only one in which efforts towards this end are known as process tracing. Harte and Koele (this issue) focus on the two main process tracing techniques: information boards and protocol analysis. I shall make a few comments about these methodologies before briefly mentioning two alternatives that are more commonly used in other areas of cognitive psychology.

Harte and Koele analyse most of the problems inherent in the use of information boards to track cognitive processes. However, there is one that they do not cover. What is to stop people collecting all the information that is available? Furthermore, if they know that they can collect it all without penalty, why should they bother to collect it in any particular order? Information collection is itself a decision, although this is often recognised only when it is expensive. From a normative perspective, it should take place when the expected gain that it produces exceeds the expected cost that it incurs (Edwards, 1965). In fact, people do not behave normatively when making information-collection decisions (Connolly & Gilani, 1982). Either they aim to maximise their expected gains but do so very poorly or else they collect information for some other reason (Harvey & Bolger, in press).
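The normative rule is easy to state as a worked example. The payoffs, prior, and cue accuracy below are invented for illustration; this is not Edwards' own formulation.

```python
# Worked example of the normative rule: buy information only when the expected
# improvement in the decision exceeds its cost (all numbers invented).
p_state = 0.6           # prior probability that state A holds
payoff_correct = 100.0  # payoff for acting on the true state
payoff_wrong = 0.0
accuracy = 0.8          # probability that the purchased cue reports the true state
cost = 15.0

# Without the cue: act on the prior, i.e. always choose A.
ev_without = p_state * payoff_correct + (1 - p_state) * payoff_wrong

# With the cue: follow its report (here the cue is accurate enough to be worth following).
ev_with = accuracy * payoff_correct + (1 - accuracy) * payoff_wrong

expected_gain = ev_with - ev_without
print(f"Expected gain from the cue: {expected_gain:.1f}, cost: {cost:.1f}")
print("Collect it" if expected_gain > cost else "Do not collect it")
```

Here the cue is worth buying because the expected gain (20) exceeds its cost (15); with a cheaper or more diagnostic cue the margin would widen.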

One could argue that, as information costs nothing in information-board experiments, it should be collected as long as it produces some benefit, however small. So what can we infer when someone fails to collect information that could plausibly help them to make a better decision? It may be that they are taking account of implicit costs (time, effort) that are inherent in the experiment. It may be that they are poor at making information-collection decisions in a normative manner. It may be that they have reasons for collecting (and not collecting) information that are not easily explained in normative terms. Before using information collection as a process-tracing tool to uncover the cognitive processes subserving judgement and decision making, it would be useful to know more about the cognitive processes on which information collection itself depends.

Harte and Koele also provide a valuable discussion of the use of concurrent verbal protocols and post-task verbal reports. They point out that it is difficult to determine the veridicality of concurrent protocols. According to Ericsson and Simon’s (1993) theory, protocols will be veridical when they rely on the contents of short-term memory (focal attention) and when those contents are verbalisable. Unfortunately, this just passes the buck: instead of asking whether concurrent verbal protocols from a task are veridical, we now have to ask whether the contents of short-term memory during task performance are verbalisable. How are we supposed to find this out?

We should be able to say a lot more about the veridicality of post-task verbal reports. This is because it is an issue that has been studied by many of the cognitive psychologists interested in implicit learning. Typically, they train people to perform well in, say, a dynamic decision-making task of the sort that Funke describes in his paper. Then they ask them questions about performing the task. In an early study, Berry and Broadbent (1984) found that practice improved people’s ability to make decisions without changing their ability to make post-task verbal reports whereas verbal instructions about how to perform the task improved performance on the post-task verbal reports without affecting how decisions were made during the task. Such findings should lead us to query the veridicality of post-task verbal reports. However, Shanks and St John (1994) suggested that, in many studies, the post-task questions were not designed to tap exactly the same information that people had acquired while practising the task. They argue that when the post-task questionnaire is appropriately designed, answers show that people do have access to how they performed the task. However, Shanks and St John (1994) have been criticised in their turn (Cleeremans, 1997; Dienes & Altmann, 1997). At present, there is very little agreement in this area: views range between those who argue that we cannot access any information used in performing these tasks (e.g., Lewicki, Czyzewska, & Hill, 1997) to those who argue that we have full access to it (e.g., St John & Shanks, 1997).

Until greater agreement emerges, it might be wise to be circumspect in using post-task verbal reports as a process-tracing technique—at least for the sort of judgement task where improvement comes with practice rather than instruction.

Harte and Koele mention use of reaction time as an aid to process tracing in the context of their discussion of the methodology of information boards. Certainly reaction times, in conjunction with methods that have been devised for their interpretation (Donders, 1868/1969; Sternberg, 1969), have long been used by psychologists interested in identifying the processing stages involved in the performance of a wide variety of tasks. Despite problems (e.g., McClelland, 1979; Townsend, 1974), there is some consensus that they have proved to be a valuable tool for this purpose (Sanders, 1990). Curiously, their use has not been widespread within the study of judgement and decision making. However, this may be changing: a number of those researching into the calibration of probability judgements have recently used reaction times to clarify the cognitive processes involved in this task (e.g., Baranski & Petrusic, 1998; Juslin & Olsson, 1999; Wright & Ayton, 1988).

In other areas of cognitive psychology, neuropsychological studies with brain-damaged patients have been used in attempts to identify processes subserving task performance (e.g., Shallice, 1988). For example, one of the patterns sought in data is a double dissociation between deficits exhibited by different patients. The argument is that if one patient can no longer perform task A while still being able to perform task B, but another patient can no longer perform task B while still being able to perform task A, then we have some justification for concluding that performance of the two tasks depends on different processes. (Other patterns in data would provide better evidence that processes are independent [Dunn & Kirsner, 1988] but the methods required to reveal them are more complex and so they are rarely sought.) Use of neuropsychological methods to study judgement and decision making is even rarer than use of reaction times but, recently, attempts have been made to identify different processes subserving decision making (e.g., Bechara, Damasio, Damasio, & Lee, 1999; Rogers et al., 1999) and to separate these from other cognitive processes (Bechara, Damasio, Tranel, & Anderson, 1998). However, judgement researchers have serious concerns about the design of the tasks that are used in such studies: “the judgment tasks used by neuroscientists…are generally useless from the point of view of a judgment researcher because they fail to consider the task parameters essential from their point of view” (Hammond, 2000, p. 27).

IMPLICATIONS

Resolving general issues of task typology and methodology should help the study of judgement to become better organised and to advance more quickly. However, these issues are not easily settled. In practice, this does not seem to matter much.

The area is already advancing apace. It is less than 20 years since Lichtenstein, Fischhoff, and Phillips (1982, p. 333) could write that a “striking aspect of the literature [on probability judgement] is its ‘dust-bowl empiricism’. Psychological theory is often absent, either as motivation for research or as explanation of the results.” Since that time, a plethora of interesting models, many of them computational in nature, have been developed and tested with experiments and simulations (for reviews see Harvey, 1997; McClelland & Bolger, 1994). Similar developments have taken place in other areas of judgement (e.g., Gigerenzer, Todd, & the ABC Research Group, 1999). This work will flourish whether or not general issues of the sort that I have been discussing are resolved and may well contribute vicariously to their resolution.
REFERENCES

Ayton, P., Hunt, A., & Wright, G. (1989). Psychological conceptions of randomness. Journal of Behavioral Decision Making, 2, 221–238.
Baranski, J.V., & Petrusic, W.M. (1998). Probing the locus of confidence judgments: Experiments on the time to determine confidence. Journal of Experimental Psychology: Human Perception and Performance, 24, 929–945.
Bechara, A., Damasio, H., Damasio, A.R., & Lee, G.P. (1999). Different contributions of the human amygdala and ventromedial prefrontal cortex to decision-making. The Journal of Neuroscience, 19, 5473–5481.
Bechara, A., Damasio, H., Tranel, D., & Anderson, S.W. (1998). Dissociation of working memory from decision making within the human prefrontal cortex. The Journal of Neuroscience, 18, 428–437.
Berry, D.C., & Broadbent, D.E. (1984). On the relationship between task performance and associated verbalisable knowledge. Quarterly Journal of Experimental Psychology, 36, 209–231.
Brunswik, E. (1956). Perception and the representative design of psychological experiments (2nd ed.). Berkeley & Los Angeles: University of California Press.
Champagne, A.B., Klopfer, L.E., & Anderson, J.H. (1980). Factors influencing the learning of classical mechanics. American Journal of Physics, 48, 1074–1079.
Chase, W.G., & Ericsson, K.A. (1982). Skill and working memory. In G.H. Bower (Ed.), The psychology of learning and motivation (Vol. 16, pp. 1–58). New York: Academic Press.
Cleeremans, A. (1997). Principles for implicit learning. In D.C. Berry (Ed.), How implicit is implicit learning? (pp. 195–234). Oxford: Oxford University Press.
Connolly, T., & Gilani, N. (1982). Information search in judgment tasks: A regression model and some preliminary findings. Organizational Behavior and Human Performance, 30, 330–350.
Cooksey, R.W. (1996). Judgment analysis: Theory, methods, and applications. New York: Academic Press.
Crutchfield, J.P., Farmer, J.P., & Huberman, B.A. (1982). Fluctuations and simple chaotic dynamics. Physics Reports, 92, 45–82.
Dawes, R.M. (1979). The robust beauty of improper linear models in decision-making. American Psychologist, 34, 571–582.
Dhami, M.K., & Harries, C. (this issue). Fast and frugal versus regression models of human judgement. Thinking and Reasoning, 7, 5–27.
Dienes, Z., & Altmann, G. (1997). Transfer of implicit knowledge across domains: How implicit and how abstract? In D.C. Berry (Ed.), How implicit is implicit learning? (pp. 107–123). Oxford: Oxford University Press.
Donders, F.C. (1969). On the speed of mental processes. In W.G. Koster (Ed. & Trans.), Attention and performance II. Special Issue. Acta Psychologica, 30, 412–431 [originally published in 1868].
Dörner, D. (1980). On the difficulty people have in dealing with complexity. Simulation and Games, 11, 87–106.
Dunn, J.C., & Kirsner, K. (1988). Discovering functionally independent mental processes: The principle of reversed association. Psychological Review, 95, 91–101.
Edwards, W. (1965). Optimal strategies for seeking information. Journal of Mathematical Psychology, 2, 312–329.
Einhorn, H.J. (1972). Expert measurement and mechanical combination. Organizational Behavior and Human Performance, 7, 86–106.
Ericsson, K.A., & Faivre, L. (1988). What’s exceptional about exceptional abilities? In L. Kobler, & D. Fein (Eds.), The exceptional brain: Neuropsychology of talent and special abilities (pp. 436–473). New York: Guilford Press.
Ericsson, K.A., & Simon, H.A. (1993). Protocol analysis: Verbal reports as data (Revised ed.). Cambridge, MA: The MIT Press.
Ferrell, W.R., & McGoey, P.J. (1980). A model of calibration for subjective probabilities. Organizational Behavior and Human Performance, 26, 32–53.
Fitts, P.M., & Posner, M.I. (1967). Human performance. Belmont, CA: Brooks/Cole.
Fleishman, E.A., & Quaintance, M.K. (1984). Taxonomies of human performance: The description of human tasks. Orlando, FL: Academic Press.
Frensch, P.A., & Funke, J. (1995a). Complex problem solving: The European perspective. Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Frensch, P.A., & Funke, J. (1995b). Definitions, traditions and a general framework for understanding complex problem solving. In P.A. Frensch, & J. Funke (Eds.), Complex problem solving: The European perspective (pp. 3–25). Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Funke, J. (1995). Experimental research on complex problem solving. In P.A. Frensch, & J. Funke (Eds.), Complex problem solving: The European perspective (pp. 243–268). Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Funke, J. (this issue). Dynamic systems as tools for analysing human judgement. Thinking and Reasoning, 7, 69–89.
Gibson, F.P., Fichman, M., & Plaut, D.C. (1997). Learning in dynamic decision tasks: Computational model and empirical evidence. Organizational Behavior and Human Decision Processes, 71, 1–35.
Gigerenzer, G., & Goldstein, D.G. (1996). Reasoning the fast and frugal way: Models of bounded rationality. Psychological Review, 103, 650–669.
Gigerenzer, G., Todd, P.M., & the ABC Research Group (1999). Simple heuristics that make us smart. Oxford: Oxford University Press.
Goldstein, W.M., & Hogarth, R.M. (1997). Judgment and decision research: Some historical context. In W.M. Goldstein, & R.M. Hogarth (Eds.), Research on judgment and decision making: Comments, connections, and controversies (pp. 3–65). Cambridge: Cambridge University Press.
Green, D.M., & Swets, J.A. (1966). Signal detection theory and psychophysics. New York: Wiley.
Hammond, K.R. (1996). Human judgment and social policy: Irreducible uncertainty, inevitable error, unavoidable injustice. Oxford: Oxford University Press.
Hammond, K.R. (2000). Judgments under stress. Oxford: Oxford University Press.
Hammond, K.R., Hamm, R.M., Grassia, J., & Pearson, T. (1997). Direct comparison of the efficacy of intuitive and analytical cognition in expert judgement. In W.M. Goldstein, & R.M. Hogarth (Eds.), Research on judgment and decision making: Comments, connections, and controversies (pp. 144–180). Cambridge: Cambridge University Press.


Harte, J.M., & Koele, P. (this issue). Modelling and describing human judgement processes: The multiattribute evaluation case. Thinking and Reasoning, 7, 29–49.
Harvey, N. (1997). Confidence in judgment. Trends in Cognitive Sciences, 1, 78–82.
Harvey, N., & Bolger, F. (in press). Collecting information: Optimising outcomes, screening options or facilitating discrimination? Quarterly Journal of Experimental Psychology, 54(1).
Harvey, N., Koehler, D.J., & Ayton, P. (1997). Judgments of decision effectiveness: Actor–observer differences in overconfidence. Organizational Behavior and Human Decision Processes, 70, 267–282.
Huber, O. (1995). Complex problem solving as multistage decision making. In P.A. Frensch, & J. Funke (Eds.), Complex problem solving: The European perspective (pp. 151–173). Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Juslin, P., & Olsson, H. (1999). Computational models of subjective probability calibration. In P. Juslin, & H. Montgomery (Eds.), Judgment and decision making: Neo-Brunswikian and process-tracing perspectives (pp. 67–95). Mahwah, NJ: Lawrence Erlbaum Associates Inc.
Kahneman, D., & Chajczyk, D. (1983). Tests of the automaticity of reading: Dilution of Stroop effects by color-irrelevant stimuli. Journal of Experimental Psychology: Human Perception and Performance, 9, 497–509.
Kahneman, D., Slovic, P., & Tversky, A. (Eds.). (1982). Judgment under uncertainty: Heuristics and biases. Cambridge: Cambridge University Press.
Kleinmuntz, D.N., & Thomas, J.B. (1987). The value of action and inference in dynamic decision making. Organizational Behavior and Human Decision Processes, 39, 341–364.
Kruschke, J.K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22–44.
Lewicki, P., Czyzewska, M., & Hill, T. (1997). Nonconscious information processing and personality. In D.C. Berry (Ed.), How implicit is implicit learning? (pp. 48–72). Oxford: Oxford University Press.
Lichtenstein, S., Fischhoff, B., & Phillips, L.D. (1982). Calibration of probabilities: The state of the art to 1980. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 306–334). Cambridge: Cambridge University Press.
Maule, A.J. (this issue). Studying judgement: Some comments and suggestions for further research. Thinking and Reasoning, 7, 91–102.
May, R.M. (1986). Simple mathematical models with very complicated dynamics. Nature, 261, 459–467.
McClelland, A.G.R., & Bolger, F. (1994). The calibration of subjective probability: Theories and models 1980–1993. In G. Wright, & P. Ayton (Eds.), Subjective probability (pp. 453–482). Chichester: Wiley.
McClelland, J.L. (1979). On the time relations of mental processes: An examination of systems of processes in cascade. Psychological Review, 86, 287–330.
McCloskey, M. (1983). Intuitive physics. Scientific American, 24, 122–130.
MacLeod, C.M., & Dunbar, K. (1988). Training and Stroop-like interference: Evidence for a continuum of automaticity. Journal of Experimental Psychology: Learning, Memory and Cognition, 14, 126–135.
Medin, D.L., & Schaffer, M.M. (1978). Context theory of classification learning. Psychological Review, 85, 207–238.
Miller, G.A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81–93.
Nosofsky, R.M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39–57.
Nosofsky, R.M., & Palmeri, T.J. (1997). An exemplar-based random walk model of speeded classification. Psychological Review, 104, 266–300.


Perruchet, P., & Pacteau, C. (1990). Synthetic grammar learning: Implicit rule abstraction or explicit fragmentary knowledge? Journal of Experimental Psychology: General, 119, 264–275.
Reber, A.S. (1989). Implicit learning and tacit knowledge. Journal of Experimental Psychology: General, 118, 219–235.
Rogers, R.D., Owen, A.M., Middleton, H.C., Williams, E.J., Pickard, J.P., Sahakian, B.J., & Robbins, T.W. (1999). Choosing between small, likely rewards and large, unlikely rewards activates inferior and orbital prefrontal cortex. The Journal of Neuroscience, 20, 9029–9038.
Rosch, E. (1978). Principles of categorisation. In E. Rosch, & B. Lloyd (Eds.), Cognition and categorisation (pp. 27–84). Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Sanders, A.F. (1990). Issues and trends in the debate on discrete versus continuous processing of information. Acta Psychologica, 74, 123–167.
Shallice, T. (1988). From neuropsychology to mental structure. Cambridge: Cambridge University Press.
Shanks, D.R., & St John, M. (1994). Characteristics of dissociable learning systems. Behavioral and Brain Sciences, 17, 367–395.
Smith, P.T., McKenna, F., Pattison, C., & Waylen, A. (this issue). Structural equation modelling of human judgement. Thinking and Reasoning, 7, 51–68.
Stanley, W.B., Mathews, R., Buss, R., & Kotler-Cope, S. (1989). Insight without awareness: On the interaction of verbalisation, instruction and practice on a simulated process control task. Quarterly Journal of Experimental Psychology, 41, 553–577.
Sterman, J.D. (1989). Misperception of feedback in dynamic decision making. Organizational Behavior and Human Decision Processes, 71, 1–35.
Sternberg, S. (1969). The discovery of processing stages: Extensions of Donders’ method. In W.G. Koster (Ed.), Attention and performance II. Special Issue. Acta Psychologica, 30, 276–315.
St John, M.F., & Shanks, D.R. (1997). Implicit learning from an information processing standpoint. In D.C. Berry (Ed.), How implicit is implicit learning? (pp. 162–194). Oxford: Oxford University Press.
Swets, J.A. (2000). Enhancing diagnostic decisions. In T. Connolly, H.R. Arkes, & K.R. Hammond (Eds.), Judgment and decision making: An interdisciplinary reader (2nd ed., pp. 66–81). Cambridge: Cambridge University Press.
Townsend, J.T. (1974). Issues and models concerning the processing of a finite number of inputs. In B.H. Kantowitz (Ed.), Tutorials in performance and cognition (pp. 133–185). Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Wolpert, L. (1992). The unnatural nature of science. London: Faber and Faber.
Wright, G., & Ayton, P. (1988). Decision time, subjective probability, and task difficulty. Memory and Cognition, 16, 176–185.
Yates, J.F., Lee, J.-W., & Bush, G.G. (1997). General knowledge overconfidence: Cross-national variation, response style, and ‘reality’. Organizational Behavior and Human Decision Processes, 70, 87–94.
Zsambok, C.E., & Klein, G. (1997). Naturalistic decision making. Mahwah, NJ: Lawrence Erlbaum Associates Inc.
