
**EVALUATION AND DECISION MODELS:**

a critical perspective


Denis Bouyssou ESSEC

Thierry Marchant Ghent University

Marc Pirlot SMRO, Faculté Polytechnique de Mons

Patrice Perny LIP6, Université Paris VI

Alexis Tsoukiàs LAMSADE - CNRS, Université Paris Dauphine

Philippe Vincke SMG - ISRO, Université Libre de Bruxelles

KLUWER ACADEMIC PUBLISHERS Boston/London/Dordrecht

Contents

1 Introduction
  1.1 Motivations
  1.2 Audience
  1.3 Structure
  1.4 Outline
  1.5 Who are the authors?
  1.6 Conventions
  1.7 Acknowledgements

2 Choosing on the basis of several opinions
  2.1 Analysis of some voting systems
    2.1.1 Uninominal election
    2.1.2 Election by rankings
    2.1.3 Some theoretical results
  2.2 Modelling the preferences of a voter
    2.2.1 Rankings
    2.2.2 Fuzzy relations
    2.2.3 Other models
  2.3 The voting process
    2.3.1 Definition of the set of candidates
    2.3.2 Definition of the set of the voters
    2.3.3 Choice of the aggregation method
  2.4 Social choice and multiple criteria decision support
  2.5 Conclusions

3 Building and aggregating evaluations
  3.1 Introduction
    3.1.1 Motivation
    3.1.2 Evaluating students in Universities
  3.2 Grading students in a given course
    3.2.1 What is a grade?
    3.2.2 The grading process
    3.2.3 Interpreting grades
    3.2.4 Why use grades?
  3.3 Aggregating grades
    3.3.1 Rules for aggregating grades
    3.3.2 Aggregating grades using a weighted average
  3.4 Conclusions

4 Constructing measures
  4.1 The human development index
    4.1.1 Scale Normalisation
    4.1.2 Compensation
    4.1.3 Dimension independence
    4.1.4 Scale construction
    4.1.5 Statistical aspects
  4.2 Air quality index
    4.2.1 Monotonicity
    4.2.2 Non compensation
    4.2.3 Meaningfulness
  4.3 The decathlon score
    4.3.1 Role of the decathlon score
  4.4 Indicators and multiple criteria decision support
  4.5 Conclusions

5 Assessing competing projects
  5.1 Introduction
  5.2 The principles of CBA
    5.2.1 Choosing between investment projects in private firms
    5.2.2 From Corporate Finance to CBA
    5.2.3 Theoretical foundations
  5.3 Some examples in transportation studies
    5.3.1 Prevision of traffic
    5.3.2 Time gains
    5.3.3 Security gains
    5.3.4 Other effects and remarks
  5.4 Conclusions

6 Comparing on several attributes
  6.1 Thierry's choice
    6.1.1 Description of the case
    6.1.2 Reasoning with preferences
  6.2 The weighted sum
    6.2.1 Transforming the evaluations
    6.2.2 Using the weighted sum on the case
    6.2.3 Is the resulting ranking reliable?
    6.2.4 The difficulties of a proper usage of the weighted sum
    6.2.5 Conclusions
  6.3 The additive value model
    6.3.1 Direct methods for determining single-attribute value functions
    6.3.2 AHP and Saaty's eigenvalue method
    6.3.3 An indirect method for assessing single-attribute value functions and trade-offs
    6.3.4 Conclusions
  6.4 Outranking methods
    6.4.1 Condorcet-like procedures in decision analysis
    6.4.2 A simple outranking method
    6.4.3 Using ELECTRE I on the case
    6.4.4 Main features and problems of elementary outranking approaches
    6.4.5 Advanced outranking methods: from thresholding towards valued relations
  6.5 General conclusion

7 Deciding automatically
  7.1 Introduction
  7.2 A System with Explicit Decision Rules
    7.2.1 Designing a decision system for automatic watering
    7.2.2 Linking symbolic and numerical representations
    7.2.3 Interpreting input labels as scalars
    7.2.4 Interpreting input labels as intervals
    7.2.5 Interpreting input labels as fuzzy intervals
    7.2.6 Interpreting output labels as (fuzzy) intervals
    7.2.7 A didactic example
  7.3 A System with Implicit Decision Rules
    7.3.1 Controlling the quality of biscuits during baking
    7.3.2 Automatising human decisions by learning from examples
  7.4 An hybrid approach for automatic decision-making
  7.5 Conclusion

8 Dealing with uncertainty
  8.1 Introduction
  8.2 The context
  8.3 The model
    8.3.1 The set of actions
    8.3.2 The set of criteria
    8.3.3 Uncertainties and scenarios
    8.3.4 The temporal dimension
    8.3.5 Summary of the model
  8.4 The expected value approach
    8.4.1 Some comments on the previous approach
  8.5 The expected utility approach
    8.5.1 Some comments on the expected utility approach
  8.6 The approach applied in this case: first step
    8.6.1 Comment on the first step
  8.7 The approach applied in this case: second step
  8.8 Conclusions

9 Supporting decisions
  9.1 Preliminaries
  9.2 The Decision Process
  9.3 Decision Support
    9.3.1 Problem Formulation
    9.3.2 The Evaluation Model
    9.3.3 The final recommendation
  9.4 Conclusions

10 Conclusion
  10.1 Formal methods are all around us
  10.2 What have we learned?
  10.3 What can be expected?

Appendix A
Appendix B
Bibliography
Index

1 INTRODUCTION

1.1 Motivations

Deciding is a very complex and difficult task. Some people even argue that our ability to make decisions in complex situations is the main feature that distinguishes us from animals (it is also common to say that laughing is the main difference). Nevertheless, it quite often happens that we do not know or we are not sure what to decide and, in many instances, we resort to a technique that helps making decisions, i.e. a decision support technique: an informal one–we toss a coin, we consult an expert, we visit an astrologer, we ask an oracle, we think–or, when the task is too complex or the interests at stake are too important, a formal one. Although informal decision support techniques can be of interest, in this book we will focus on formal ones. Among the latter, we find some well-known decision support techniques: cost-benefit analysis, multiple criteria decision analysis, decision trees, . . . But there are many other ones, sometimes not presented as decision support techniques. Let us cite but a few examples.

• When the director of a school must decide whether a given student will pass or fail, he usually asks each teacher to assess the merits of the student by means of a grade. The director then sums the grades and compares the result to a threshold.

• When a bank must decide whether a given client will obtain a credit or not, a technique, called credit scoring, is often used.

• When the mayor of a city decides to temporarily forbid car traffic in a city because of air pollution, he probably takes the value of some indicators, e.g. the air quality index, into account.

• Groups or committees must also make decisions. In order to do so, they often use voting procedures.

All these formal techniques are what we call (formal) decision and evaluation models, i.e. sets of explicit and well-defined rules to collect, assess and process information in order to be able to make recommendations in decision and/or evaluation processes. They are so widespread that almost no one can pretend he is not using or suffering the consequences of one of them.
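The director's rule in the first example–sum the grades and compare the total to a threshold–is already a complete formal model. A minimal Python sketch makes this concrete (the student names, grades and the threshold of 50 are invented for illustration):

```python
# The pass/fail rule described above: sum each student's grades
# and compare the total to a threshold. All numbers are made up.

def passes(grades, threshold):
    """Return True when the summed grades reach the threshold."""
    return sum(grades) >= threshold

students = {
    "Anna": [12, 14, 9, 16],   # one grade per course, out of 20
    "Boris": [8, 7, 11, 10],
}

for name, grades in students.items():
    verdict = "pass" if passes(grades, 50) else "fail"
    print(name, sum(grades), verdict)   # Anna 51 pass, Boris 36 fail
```

Even such a tiny model embodies contestable choices–why a sum rather than, say, a minimum, and why this threshold?–which is exactly the kind of question this book raises.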

These models–probably because of their formal character–inspire respect and trust: they look scientific. But are they really well founded? Do they perform as well as we want? Can we safely rely on them when we have to make important decisions? If you guessed that the answers are not simple, you are right: this book is more than 200 pages long.

That is why we try to look at formal decision and evaluation models with a critical eye in this book. This is not really new: most decision models have had contenders for a long time and, for each one, there is probably a lot of criticism. None of the evaluation and decision models that we examined is perfect or the best; they all suffer limitations and, for each of them, we can find situations in which it will perform very poorly.

Do we want to contend all models at the same time? Definitely not! Our conviction is that there cannot be a best decision or evaluation model–this has been proved in some contexts (e.g. in voting) and seems empirically correct in other contexts–but we are convinced as well that formal evaluation and decision models are useful in many circumstances, and here is why:

• Formal models provide explicit and, to a large extent, unambiguous representations of a given problem. They offer a common language for communicating about the problem. They are therefore particularly well suited for facilitating communication among the actors of a decision or evaluation process.

• Formal models require that the decision maker makes a substantial effort to structure his perception or representation of the problem. This effort can only be beneficial as it forces the decision maker to think harder and deeper about his problem.

• Once a formal model has been established, a battery of formal techniques (often implemented on a computer) becomes available for drawing any kind of conclusion that can be drawn from the model. For example, hundreds of what-if questions can be answered in a flash. This can be of great help if we want to devise robust recommendations.

For all these reasons (complexity, importance of the interests at stake, usefulness, popularity), plus the fact that formal models lend themselves easily to criticism, we think that it is important to deepen our understanding of evaluation and decision models and encourage their users to think more thoroughly about them. Our aim with this book is to foster reflection and critical thinking among all individuals utilising decision and evaluation models.

1.2 Audience

Most of us are confronted with formal evaluation and decision models. Very often, we use them without even thinking about it. This book is intended for the aware or enlightened practitioner, for anyone who uses decision or evaluation models–whether it be for research or for applications–and is willing to question his practice, to have a deeper understanding of what he does. We have tried to keep mathematics and formalism at a very low level so that, hopefully, most of the material will be accessible to the not mathematically-inclined readers.

1.3 Structure

There are so many decision and evaluation models that it would be impossible to deal with all of them within a single book. We decided to present seven examples of such models. Some examples have been chosen because they correspond to decision models that everyone has experienced and can understand easily (student grades and voting). We chose some models because they are not often perceived as decision or evaluation models (student grades, indicators and rule based control). The other examples (cost-benefit analysis, multiple criteria decision support and choice under uncertainty) correspond to well identified and popular evaluation and decision models. Each example is presented in a chapter (Chapters 2 to 8), almost independent of the other chapters. As will become apparent later, most of these models rely on similar kinds of principles. These examples, chosen in a wide variety of domains, will hopefully allow the reader to grasp these principles. Each of these seven chapters ends with a conclusion, placing what has been discussed in a broader context and indicating links with other chapters. A rich bibliography will allow the interested reader to locate the more technical literature easily.

Chapter 9 is somewhat different from the seven previous ones: it does not focus on a decision model but presents a real world application. The aim of this chapter is to emphasise the importance of the decision aiding process (the context of the problem, the role of the analyst, the position of the actors and their interactions, . . . ), to show that many difficulties arise there as well and that a coherence between the decision aiding process and the formal model is necessary.

1.4 Outline

Chapter 2 is devoted to the problem of voting. We present a sequence of twelve short examples, each one illustrating a problem that arises with a particular voting method. We begin with simple methods based on pairwise comparisons and we end up with the Borda method, each one trying to outdo the previous one but suffering its own weaknesses. Then we turn to the way voters' preferences are modelled. We also explore some issues that are often neglected: who is going to vote? Who are the candidates? These questions are difficult and we show that they are important: the construction of the set of voters and the set of candidates, as well as the choice of a voting method, must be considered as part of the voting process. Finally, after showing the analogy between voting and multiple criteria decision support, and although the goal of this book is not to overwhelm the reader with theory, we informally present two theorems (Arrow and Gibbard-Satterthwaite) that in one way or another explain why we encountered so many difficulties in our twelve examples.

After examining voting, we turn in Chapter 3 to another very familiar topic for the reader: students' marks or grades. We use this familiar topic to discuss operations such as evaluating a performance and aggregating evaluations. Students are assessed in a huge variety of ways in different countries and schools; this seems to indicate that assessing students might not be trivial. Marks are used for different purposes (e.g. ranking the students, deciding whether a student is allowed to begin the next level of study, deciding whether a student gets a degree, . . . ).

In Chapter 4, three particular indicators are considered: the Human Development Index (used by the United Nations), the ATMO index (an air pollution indicator used by the French government) and the decathlon score. An indicator is a measure but, often, it is also a tool for controlling or managing (in a broad sense). We present a few examples illustrating some problems occurring with indicators and we assert that some difficulties are the consequences of the fact that the role of an indicator is often manifold and not well defined.

Cost-benefit analysis (CBA), the subject of Chapter 5, is a decision aiding method that is extremely popular among economists. Following the CBA approach, a project should only be undertaken when its benefits outweigh its costs. First we present the principles of CBA and its theoretical foundations; we clarify some of the hypotheses at the heart of CBA and criticise the relevance of these hypotheses in some decision aiding processes. Then, using an example in transportation studies, we illustrate some difficulties encountered with CBA.

In Chapter 6, we present some difficulties that arise when one wants to choose from or rank a set of alternatives considered from different viewpoints, using a well documented example. We examine several aggregation methods that lead to a value function on the set of alternatives, namely the weighted sum, the sum of utilities (direct and indirect assessment) and AHP (the Analytic Hierarchy Process). Then we turn to the so called outranking methods. Some of these methods can be used even when the data are not very rich or precise. The price we pay for this is that the results provided by these methods are not rich either, in the sense that the conclusions that can be drawn regarding a decision are not clear-cut.

Chapter 7 is dedicated to the study of automatic decision systems. These systems concern the execution of repetitive decision tasks and the great majority of them are based on more or less explicit decision rules aimed towards reflecting the usual decision policy of humans. Three examples are presented: the first one concerns the control of an automatic watering system while the others are about the control of a food process. The first two examples describe decision systems based on explicit decision rules; the third one addresses the case of implicit decision rules. The goal of this chapter is to show the interest of some formal tools (e.g. fuzzy sets) to model decision rules but also to clarify some problems arising when simulating the rules.

The goal of Chapter 8 is to raise some questions about the modelling of uncertainty. We present a real-life problem concerning the planning of electricity production. This problem is characterised by many different uncertainties: for example, the price of oil or the electricity demand in 20 years time. This problem is classically described by using a decision tree and solved with an expected utility approach. After recalling some well known criticisms directed against this

approach, we present the approach that has been used by the team that "solved" this problem. The relevance of probabilities is criticised and other modelling tools, such as belief functions, fuzzy set theory and possibility theory, are briefly mentioned. Some of the drawbacks of this approach are discussed as well.

Convinced that there is more to decision aiding than just number crunching, we devote the last chapter to the description of a real world decision aiding process that took place in a large Italian company a few years ago. It concerns the evaluation of offers following a call for tenders for a GIS (Geographical Information System) acquisition. Some important elements, such as the participating actors, the problem formulation, the construction of the criteria, . . . , deserve greater consideration. One should ideally never consider these elements separately from the aggregation process because they can impact the whole decision process and even the way the aggregation procedure behaves.

1.5 Who are the authors?

The authors of this book are European academics working in six different universities, in France and in Belgium. They teach in engineering, business, mathematics, computer science and psychology schools. Their background is quite varied as well: mathematics, economics, engineering, law and geology, but they are all active in decision support and more particularly in multiple criteria decision support. The authors are very active in theoretical research on the foundations of decision aiding, mainly from an axiomatic point of view, but have been involved in a variety of applications ranging from software evaluation to the location of a nuclear repository. Among their special interests are preference modelling, fuzzy logic, aggregation techniques, social choice theory, artificial intelligence, problem structuring, measurement theory, operations research, etc. Five of the six authors of the present volume presented their thoughts on the past and the objectives of future research in multiple criteria decision support in the Manifesto of the new MCDA era (Bouyssou, Perny, Pirlot, Tsoukiàs and Vincke 1993). In spite of the large number of co-authors, this book is not a collection of papers; it is a joint work.

1.6 Conventions

To refer to a decision maker, a voter or an individual whose sex is not determined, we decided not to use the politically correct "he/she" but just "he" in order to make the text easy to read. The same applies for "his/her". The fact that all of the authors are male has nothing to do with this choice. None of the authors is a native English speaker. Therefore, even if we did our best to write in correct English, the reader should not be surprised to find

some mistakes or inelegant expressions. We beg the reader's leniency for any incorrectness that might remain. The adopted spelling is the British and not the American one.

1.7 Acknowledgements

We are ggreatly indebted to our collEague friend Philippe Fortemps \cite{Fortemps99} /////////. Without him and his knowledge of LaTeX, this book would look like this paragraph.%\newline

The authors also wish to thank J.-L. Ottinger, who contributed to Chapter 8, H. Mélot, who laid out the complex diagrams of that chapter, and Stefano Abruzzini, who gave us a number of references concerning indicators. Chapter 6 is based on a report by Sébastien Clément written to fulfil the requirements of a course on multiple criteria decision support. A large part of Chapter 9 uses material already published in (Paschetta and Tsoukiàs 1999).

A special thank goes to Marjorie and Diane Gassner, who had the patience to read and correct our continental approximation of the English language, and to François Glineur, who helped in solving a great number of LaTeX problems.

We thank Gary Folven from Kluwer Academic Publishers for his constant support during the preparation of this manuscript.

2 CHOOSING ON THE BASIS OF SEVERAL OPINIONS: THE EXAMPLE OF VOTING

Voting is easy! You've voted hundreds of times in committees. Is there much to say about voting? Well, just think about the way heads of state or members of parliament are elected in Australia, France, the UK, . . .

United Kingdom's members of parliament. The territory of the UK is divided into about 650 constituencies. One representative is elected in each constituency. Each voter chooses one of the candidates in his constituency. The winner is the candidate that received the most votes. Note that the winner does not have to win an overall majority of votes.

France's president. Each voter chooses one of the candidates. If one candidate has been chosen by more than 50 % of the voters, he is elected. Otherwise a second stage is organised. During the second stage, only two candidates remain: those with the highest scores. Once more, each voter chooses one of the candidates. The winner is the candidate that has been chosen by more voters than the other one.

France's members of parliament. As in the UK, the French territory is divided into single-seat constituencies. In a constituency, each voter chooses one of the candidates. If one candidate receives more than 50 % of the votes, he is elected. Otherwise a second stage is organised. During the second stage, all candidates that were chosen by more than 12.5 % of the registered voters may compete. Each voter chooses one of the candidates, and the winner is the candidate that is chosen by more voters than any other one.

Australia's members of parliament. The territory is divided into single-seat constituencies called divisions. In a division, each voter is asked to rank all candidates: he puts a 1 next to his preferred candidate, a 2 next to his second preferred candidate, then a 3, and so on until his least preferred candidate. Then the ballot papers are sorted according to the first preference votes. If a candidate has more than 50 % of the ballot papers, he is elected. Otherwise, the candidate that received fewer papers than any other is eliminated and the corresponding ballot papers are transferred to the candidates that got a 2 on these papers. Once more, if a candidate has more than 50 % of the ballot papers, he is elected. Otherwise, the candidate that received fewer papers than any other is eliminated and the corresponding ballot papers are transferred to the candidates that got a 3 on these papers, and so on. In the worst case, this process ends when all but two candidates are eliminated because, unless they are tied, one of the candidates necessarily has more than 50 % of the papers. Note that, as far as we know, the case of a tie is seldom considered in electoral laws.
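The Australian counting rule is explicit enough to be simulated. Here is a minimal Python sketch; the five-voter profile is invented for illustration, and ties among last-placed candidates are broken arbitrarily, whereas a real electoral law would have to settle them:

```python
from collections import Counter

def alternative_vote(ballots):
    """Australian-style counting: each ballot ranks all candidates.
    Repeatedly eliminate the candidate with fewest first preferences,
    transferring his papers, until one candidate holds more than 50 %
    of the ballot papers."""
    ballots = [list(b) for b in ballots]
    while True:
        firsts = Counter(b[0] for b in ballots)
        leader, count = firsts.most_common(1)[0]
        if count * 2 > len(ballots):
            return leader
        loser = min(firsts, key=firsts.get)  # fewest papers; ties broken arbitrarily
        for b in ballots:
            b.remove(loser)  # papers pass to the next name on each ballot

# Invented profile: five voters, each ranking a, b, c from best to worst.
winner = alternative_vote([("a", "b", "c"), ("a", "c", "b"), ("b", "c", "a"),
                           ("c", "b", "a"), ("c", "b", "a")])
print(winner)  # c: a leads on first preferences, but b's papers transfer to c
```

Notice that the candidate with the most first-preference votes does not necessarily win once transfers are taken into account, which is precisely why the systems above differ.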

Canada's members of parliament and prime minister. Every five years, the Canadian parliament is elected as follows. The territory is divided into about 270 constituencies called counties. In each county, each party can present one candidate. Each voter chooses one candidate. The winner in a county is the candidate that is chosen by more voters than any other one. He is thus the county's representative in the parliament. The leader of the party that has the most representatives becomes prime minister.

In spite of its apparent simplicity, if you take a closer look at voting, you will be amazed by the incredible complexity of the subject. The diversity of the methods applied in practice probably reflects some underlying complexity. Indeed, thousands of papers have been devoted to the problem of voting (Kelly 1991) and our guess is that many more are to come. Those interested in voting methods and the way they are applied in various countries will find valuable information in Farrell (1997) and Nurmi (1987).

Note also that voting is not instantaneous. It is not just counting the votes and performing some mathematical operation to find the winner. It is a process that begins when somebody decides that a vote should occur (or even earlier) and ends when the winner begins his mandate (or even later). In most cases, the aggregation remains a difficult task.

Our aim in this chapter is, on the one hand, to show that many difficult and interesting problems arise in voting and, on the other hand, to convince the reader that a formal study of voting might be enlightening. We do this through the use of small and classical examples.

This chapter is organised as follows. In Section 1, we make the following basic assumption: each voter's preferences can accurately be represented by a ranking of all candidates from best to worse, without ties. Then we show some problems occurring when aggregating the rankings, using classical voting systems such as those applied in France or the United Kingdom. In Section 2, we consider other preference models than the linear ranking of Section 1. Some models are poorer in information but more realistic; some are richer and less realistic. In Section 3, we change the focus and try to examine voting in a much broader context. In Section 4, we discuss the analogy with multiple criteria decision support. The chapter ends with a conclusion.

2.1 Analysis of some voting systems

From now on, we will distinguish between the election—the process by which the voters express their preferences about a set of candidates—and the aggregation method—the process used to extract the best candidate or a ranking of the candidates from the result of the election.

2.1.1 Uninominal election

Let us recall the assumption that we mentioned earlier and that will hold throughout Section 1: each voter sincerely (or naively) reports his preferences. Each voter, consciously or not, ranks all candidates from best to worst, without ties, and, in all uninominal systems we are aware of, each voter votes for one candidate only; we shall assume that each voter votes for the candidate that he ranks in first position. For example, suppose that a voter prefers candidate a to b and b to c (in short aPbPc); he votes for a. We are now ready to present a first example that illustrates a difficulty in voting.

Example 1 (Dictatorship of majority). Let {a, b, . . . , z} be a set of 26 candidates for a 100 voters election; the election is uninominal and the aggregation method is simple majority. Suppose that 51 voters have preferences aPbPcP . . . PyPz and 49 voters have preferences zPbPcP . . . PyPa. It is clear that 51 voters will vote for a while 49 vote for z. Thus a has an absolute majority and a wins. But is a really a good candidate? Almost half of the voters perceive a as the worst one, whereas candidate b seems to be a good candidate for everyone: b could be a good compromise. As shown by this example, a uninominal election combined with the majority rule allows a dictatorship of majority and doesn't favour a compromise. A possible way to avoid this problem might be to ask the voters to provide their whole ranking instead of only their preferred candidate; this will be discussed later. Let us first continue with some strange problems arising when using a uninominal election.

Example 2 (Respect of majority in the British system). The voting system in the United Kingdom is plurality voting: the candidate with the most votes is elected, even without an absolute majority. Let {a, b, c} be the set of candidates for a 21 voters election. Suppose that 10 voters have preferences aPbPc, 6 voters have preferences bPcPa and 5 voters have preferences cPbPa. Then a (resp. b and c) obtains 10 votes (resp. 6 and 5) and a is chosen. Nevertheless, an absolute majority of voters prefers any other candidate to a (11 out of 21 voters prefer b and c to a). In many cases, the outcome of a uninominal election might thus be different from what a majority of voters wanted.
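To make the arithmetic of Example 2 easy to check, here is a minimal Python sketch (the function names are ours, not the book's; rankings are lists, best candidate first):

```python
from collections import Counter

# Example 2: 10 voters aPbPc, 6 voters bPcPa, 5 voters cPbPa.
profile = ([["a", "b", "c"]] * 10 +
           [["b", "c", "a"]] * 6 +
           [["c", "b", "a"]] * 5)

def plurality_winner(profile):
    """British system: each voter votes for his top candidate; most votes win."""
    return Counter(r[0] for r in profile).most_common(1)[0][0]

def majority_prefers(profile, x, y):
    """True if an absolute majority of voters ranks x before y."""
    return sum(r.index(x) < r.index(y) for r in profile) * 2 > len(profile)

print(plurality_winner(profile))            # a
print(majority_prefers(profile, "b", "a"))  # True: 11 of 21 prefer b to a
print(majority_prefers(profile, "c", "a"))  # True: 11 of 21 prefer c to a
```

The plurality winner a is thus beaten by every other candidate in a pairwise majority contest, which is the anomaly the example points out.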

Let us see, using the same example, whether such a problem would be avoided by the two-stage French system. With the French system, as no candidate has an absolute majority, a second stage is run between the two candidates with the most votes, a and b. In the second stage, a obtains 10 votes and b, 11 votes, so that candidate b is elected. This time, none of the beaten candidates (a and c) is preferred to b by a majority of voters. Nonetheless, we cannot conclude that the two-stage French system is superior to the British system from this point of view, as shown by the following example.

Example 3 (Respect of majority in the two-stage French system). Let {a, b, c, d} be the set of candidates for a 21 voters election. Suppose that 10 voters have preferences bPaPcPd, 6 voters have preferences cPaPdPb and 5 voters have preferences aPdPbPc. After the first stage, as no candidate has an absolute majority, a second stage is run between candidates b and c. Candidate b easily wins, with 15 out of 21 votes, though an absolute majority (11/21) of voters prefer a and d to b.

With such a system, it is not always interesting or efficient to sincerely report one's preferences; some voters might thus be tempted not to do so, as shown in the next example.

Example 4 (Manipulation in the two-stage French system). Let us continue with the example used above. Suppose that the six voters having preferences cPaPdPb decide not to be sincere and vote for a instead of c. Then candidate a wins after the first stage because there is an absolute majority for him (11/21). If they had been sincere (as in the previous example), b would have been elected. Thus, casting a non sincere vote is useful for those 6 voters, as they prefer a to b. Such a system, one that may encourage voters to falsely report their preferences, is called manipulable. Because it is not necessary to be a mathematician to figure out such problems, manipulation is a real possibility in practice. This is not the only weakness of the French system, as attested by the three following examples.
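The manipulation in Example 4 can be replayed mechanically. The sketch below is ours and assumes the usual French rule: a candidate ranked first by an absolute majority wins outright, otherwise the top two meet in a runoff.

```python
from collections import Counter

def french_winner(profile):
    """Two-stage system with the absolute-majority shortcut (a sketch;
    first-round ties are broken arbitrarily here)."""
    (x, nx), (y, _) = Counter(r[0] for r in profile).most_common(2)
    if nx * 2 > len(profile):          # absolute majority in the first stage
        return x
    wins_x = sum(r.index(x) < r.index(y) for r in profile)
    return x if wins_x * 2 > len(profile) else y

# Example 3: sincere ballots.
sincere = ([["b", "a", "c", "d"]] * 10 +
           [["c", "a", "d", "b"]] * 6 +
           [["a", "d", "b", "c"]] * 5)
print(french_winner(sincere))    # b

# Example 4: the six cPaPdPb voters report a first instead of c.
insincere = ([["b", "a", "c", "d"]] * 10 +
             [["a", "c", "d", "b"]] * 6 +
             [["a", "d", "b", "c"]] * 5)
print(french_winner(insincere))  # a, which those six voters prefer to b
```

The six insincere voters turn the winner from b into a, exactly as in the text.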

Example 5 (Monotonicity in the two-stage French system). Let {a, b, c} be the set of candidates for a 17 voters election. A few days before the election, the results of a survey are as follows:

6 voters have preferences aPbPc,
5 voters have preferences cPaPb,
4 voters have preferences bPcPa,
and 2 voters have preferences bPaPc.

With the French system, a second stage would be run between a and b, and a would be chosen, obtaining 11 out of 17 votes. Suppose that candidate a, in order to increase his lead over b and to lessen the likelihood of a defeat, decides to strengthen his electoral campaign against b. Suppose also that the survey did exactly reveal the preferences of the voters and that the campaign has the right effect on the last two voters, who now rank a first. Hence we observe the following preferences:

8 voters have preferences aPbPc,
5 voters have preferences cPaPb,
and 4 voters have preferences bPcPa.

After the first stage, b is eliminated, due to the campaign of a. The second stage opposes a to c and c wins, obtaining 9 votes. Candidate a thought that his campaign would be beneficial. He was wrong: contrary to all expectations, it made him lose the election. Such a method is called non monotonic because an improvement of a candidate's position in some of the voters' preferences can lead to a deterioration of his position after the aggregation. It is clear that, with such a system, it is not always interesting or efficient to sincerely report one's preferences. You will note in the next example that some manipulations can be very simple.

Example 6 (Participation in the two-stage French system). Let {a, b, c} be the set of candidates for an 11 voters election. Suppose that 4 voters have preferences aPbPc, 4 voters have preferences cPbPa and 3 voters have preferences bPcPa. Using the French system, a second stage should oppose a to c, and c should win the election, obtaining 7 out of 11 votes. Suppose that 2 of the 4 first voters (with preferences aPbPc) decide not to vote because c, the worst candidate according to them, is going to win anyway. What will happen? There will be only 9 voters. After the first stage, a is eliminated; in the second stage, b wins against c, obtaining 5 out of 9 votes. Thus candidate c loses while b wins. Our two lazy voters can be proud of their abstention since they prefer b to c. Clearly, such a method does not encourage participation.
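The monotonicity failure of Example 5 can be checked with a small runoff sketch of ours (no absolute majority arises here, so the shortcut is omitted):

```python
from collections import Counter

def runoff_winner(profile):
    """Two-stage system: the two top plurality scorers meet in a second
    round decided by majority (a sketch; ties broken arbitrarily)."""
    (x, _), (y, _) = Counter(r[0] for r in profile).most_common(2)
    wins_x = sum(r.index(x) < r.index(y) for r in profile)
    return x if wins_x * 2 > len(profile) else y

# Example 5, before a's campaign.
before = ([["a", "b", "c"]] * 6 + [["c", "a", "b"]] * 5 +
          [["b", "c", "a"]] * 4 + [["b", "a", "c"]] * 2)
# After the campaign, the two bPaPc voters rank a first.
after = ([["a", "b", "c"]] * 8 + [["c", "a", "b"]] * 5 +
         [["b", "c", "a"]] * 4)

print(runoff_winner(before))  # a
print(runoff_winner(after))   # c: a's position improved, yet a now loses
```

Raising a in two ballots changes the second-round opponent from b to c, which is precisely the non-monotonic behaviour described above.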

Example 7 (Separability in the two-stage French system). Let {a, b, c} be the set of candidates for a 26 voters election. The voters are located in two different areas: countryside and town. Suppose that the 13 voters located in the countryside have the following preferences:

4 voters have preferences aPbPc,
3 voters have preferences bPaPc,
3 voters have preferences cPaPb,
and 3 voters have preferences cPbPa.

Suppose that the 13 voters located in the town have the following preferences:

4 voters have preferences aPbPc,
3 voters have preferences bPaPc,
3 voters have preferences bPcPa,
and 3 voters have preferences cPaPb.

If an election is organised in the countryside, candidates a and c will go to the second stage and a will be chosen, obtaining 7 votes. Suppose now that an election is organised in the town: a will defeat b in the second stage, obtaining 7 votes. Thus a is the winner in both areas, and naturally we expect a to be the winner in a global election. But it is easy to observe that in the global election (26 voters) a is defeated during the first stage; b then wins against c and is elected. Such a method is called non separable.

The previous examples showed that, when there are more than 2 candidates, it is not an easy task to imagine a system that would behave as expected. If there are only two candidates, the British system (uninominal and one-stage) is equivalent to all other systems and it suffers none of the above mentioned problems (May 1952). Thus we might be tempted by a generalisation of the British system (restricted to 2 candidates): we arbitrarily choose two of the candidates and use the British system to select one; the winner is opposed (using the British system) to a new arbitrarily chosen candidate; and so on until no more candidates remain. This would require n − 1 votes between 2 candidates. Let us note that such sequential voting is very common in different parliaments: the different amendments to a bill are considered one by one in a predefined sequence, the first one being opposed to the status quo, the second one to the winner, and so on. Unfortunately, this method rests on an arbitrary decision (the order of the contests, or agenda) and suffers severe drawbacks, as shown by the following example.

Example 8 (Influence of the agenda in sequential voting). Let {a, b, c} be the set of candidates for a 3 voters election. Suppose that

1 voter has preferences aPbPc,
1 voter has preferences bPcPa,
and 1 voter has preferences cPaPb.

The 3 candidates will be considered two by two in the following order or agenda: a and b first, then c. During the first vote, a is opposed to b and a wins with an absolute majority (2 votes against 1). Then a is opposed to c, and c defeats a with an absolute majority; thus c is elected. If the agenda is b and c first, b wins against c but a will then defeat b in the second stage, so a is elected. If the agenda is a and c first, it is easy to see that c defeats a and is then opposed to b; b wins and is elected. Hence, in this example, any candidate can be elected and the outcome depends completely on the agenda. Clearly, such a method lacks neutrality: it doesn't treat all candidates in a symmetric way. Candidates (or amendments) appearing at the end of the agenda are more likely to be elected than those at the beginning.
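Both pathologies just described, non-separability (Example 7) and agenda dependence (Example 8), are easy to reproduce; the sketch below uses our own helper functions under the same conventions as before.

```python
from collections import Counter

def runoff_winner(profile):
    """Two-stage system sketch: top two plurality scorers, then majority."""
    (x, _), (y, _) = Counter(r[0] for r in profile).most_common(2)
    wins_x = sum(r.index(x) < r.index(y) for r in profile)
    return x if wins_x * 2 > len(profile) else y

def sequential_vote(profile, agenda):
    """Candidates meet pairwise in the order of the agenda; each round's
    majority winner faces the next candidate."""
    winner = agenda[0]
    for challenger in agenda[1:]:
        wins = sum(r.index(winner) < r.index(challenger) for r in profile)
        winner = winner if wins * 2 > len(profile) else challenger
    return winner

# Example 7: a wins in each area but b wins in the merged electorate.
countryside = ([["a", "b", "c"]] * 4 + [["b", "a", "c"]] * 3 +
               [["c", "a", "b"]] * 3 + [["c", "b", "a"]] * 3)
town = ([["a", "b", "c"]] * 4 + [["b", "a", "c"]] * 3 +
        [["b", "c", "a"]] * 3 + [["c", "a", "b"]] * 3)
print(runoff_winner(countryside), runoff_winner(town))  # a a
print(runoff_winner(countryside + town))                # b

# Example 8: with a cyclic profile, the agenda decides everything.
cycle = [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]]
for agenda in (["a", "b", "c"], ["b", "c", "a"], ["a", "c", "b"]):
    print(sequential_vote(cycle, agenda))  # c, then a, then b
```

Each of the three agendas elects a different candidate from the same three ballots.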

Sequential voting suffers from an even more disturbing drawback, as shown by the following example.

Example 9 (Violation of unanimity in sequential voting). Let {a, b, c, d} be the set of candidates for a 3 voters election. Suppose that

1 voter has preferences bPaPdPc,
1 voter has preferences cPbPaPd,
and 1 voter has preferences aPdPcPb.

Consider the following agenda: a and b first, then c and finally d. Candidate a is defeated by b during the first vote; candidate c wins the second vote and d is finally elected, though all voters unanimously prefer a to d.

Why not try to palliate the many problems encountered so far by asking voters to explicitly rank the candidates? Up to now, we have assumed that the voters are able to rank all candidates from best to worst without ties, but the only information that we collected was the best candidate. This idea, though interesting, close to the concept of democracy and hence very appealing, will lead us to many other pitfalls that we discuss just below.

2.1.2 Election by rankings

In this kind of election, each voter provides a ranking without ties of the candidates. Hence the task of the aggregation method is to extract from all these rankings the best candidate, or a ranking of the candidates reflecting the preferences of the voters as much as possible. At the end of the 18th century, two aggregation methods for election by rankings appeared in France: one was proposed by Borda, the other by Condorcet. Their methods are still at the heart of many scientists' concerns and, although other methods have been proposed, it is worth noting that many methods are variants of the Borda and Condorcet methods.

The Condorcet method

Condorcet (1785) suggests to compare all candidates pairwise, in the following way. A candidate a is preferred to b if and only if the number of voters ranking a before b is larger than the number of voters ranking b before a. In case of tie, candidates a and b are indifferent. A candidate that is preferred to all other candidates is called a (Condorcet) winner. In other words, a winner is a candidate that, opposed to each of the n − 1 other candidates, wins by a majority. It can be shown that there is never more than one Condorcet winner. Note that both the British and the two-stage French methods are different from the Condorcet method: in example 2, candidate a is elected by the British method but b is the Condorcet winner; in example 3, a is the Condorcet winner although b is chosen by the French method.
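The pairwise-contest principle just defined is straightforward to implement; here is a small sketch of ours, applied to the profile of example 2:

```python
def condorcet_winner(profile):
    """Return the candidate who beats every other one in a pairwise
    majority contest, or None if no such candidate exists."""
    candidates = profile[0]
    for x in candidates:
        if all(sum(r.index(x) < r.index(y) for r in profile) * 2 > len(profile)
               for y in candidates if y != x):
            return x
    return None

# Example 2: 10 voters aPbPc, 6 voters bPcPa, 5 voters cPbPa.
profile = ([["a", "b", "c"]] * 10 +
           [["b", "c", "a"]] * 6 +
           [["c", "b", "a"]] * 5)
print(condorcet_winner(profile))  # b, though plurality elects a

# Example 8's cyclic profile has no Condorcet winner at all.
cycle = [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]]
print(condorcet_winner(cycle))    # None
```

The second call already anticipates the Condorcet paradox discussed below: with a cyclic profile, the method elects nobody.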

Unfortunately, there are cases, called Condorcet paradoxes, where there is no Condorcet winner. Consider example 8: a is preferred to b, b is preferred to c and c is preferred to a. No candidate is preferred to all others, so the Condorcet method fails to elect a candidate. Let us remark that this cannot happen with the French and British systems. Many methods have been designed that elect the Condorcet winner if he exists and choose a candidate in any case (Fishburn 1977, Nurmi 1987). One might think that example 8 is very bizarre and very unlikely to happen. Unfortunately, it isn't: if you consider an election with 25 voters and 11 candidates, the probability of such a paradox is significantly high, as it is approximately 1/2 (Gehrlein 1983), and the more candidates or voters, the higher the probability of such a paradox. Note that, in order to obtain this result, all rankings are supposed to have the same probability; such a hypothesis is clearly questionable (Gehrlein 1983).

Furthermore, even when a Condorcet winner exists, the majority principle underlying the Condorcet method (the candidate that beats all other candidates in a pairwise contest is the winner), although it seems very natural, might be questioned, as shown by the following example taken from Fishburn (1977).

Example 10 (Critique of the majority principle). Let {a, b, c, d, e, f, g, x, y} be a set of 9 candidates for a 101 voters election. Suppose that

19 voters have preferences yPaPbPcPdPePfPgPx,
21 voters have preferences ePfPgPxPyPaPbPcPd,
10 voters have preferences fPxPyPaPbPcPdPePg,
10 voters have preferences ePxPyPaPbPcPdPfPg,
10 voters have preferences gPxPyPaPbPcPdPePf,
and 31 voters have preferences yPaPbPcPdPxPePfPg.

Candidate x wins against every other candidate with a majority of 51 votes; thus x is the Condorcet winner. But let us focus on the candidates x and y and summarise their results in Table 2.1.

k    1    2    3    4    5    6    7    8    9
x    0   30    0   21    0   31    0    0   19
y   50    0   30    0   21    0    0    0    0

Table 2.1: Number of voters who rank the candidate in k-th place in their preferences

In view of Table 2.1, it seems that y should be elected: y is ranked first by half of the voters and never below fifth place, while x is ranked last by 19 voters.

The Borda method

Borda (1781) proposed to use the following aggregation method. In each voter's preference, each candidate has a rank: 1 for the first candidate in the ranking, 2 for the second, and so on, up to n for the last. Compute the Borda score of each candidate, i.e. the sum for all voters of that candidate's rank. Then choose the candidate with the lowest Borda score.
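The Borda rule reduces to one line of rank arithmetic. The sketch below is ours and runs on the three-voter profile used in example 11 below (2 voters with bPaPcPd, 1 voter with aPcPdPb):

```python
def borda_scores(profile):
    """Borda: each voter gives rank 1 to his best candidate, 2 to the
    next, and so on; the candidate with the lowest total score wins."""
    return {x: sum(r.index(x) + 1 for r in profile) for x in profile[0]}

profile = [["b", "a", "c", "d"]] * 2 + [["a", "c", "d", "b"]]
scores = borda_scores(profile)
print(scores)                        # {'b': 6, 'a': 5, 'c': 8, 'd': 11}
print(min(scores, key=scores.get))   # a
```

The scores reproduce the arithmetic spelled out in example 11 (a: 2×2 + 1×1 = 5, b: 2×1 + 1×4 = 6, and so on).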

Note that there can be several candidates with the lowest Borda score. They are then considered as equivalent, and the Borda method does not tell us which one to choose. But the likelihood of such an indifference is rather small and decreases as the number of candidates or voters increases. For example, for 3 candidates and 2 voters, the probability of all candidates being tied is 1/3 (note that, once again, we supposed that all rankings have the same probability); for 3 candidates and 50 voters, it is less than 1 %. Note also that the Borda method not only allows to choose one candidate but also to rank them all (by increasing Borda scores).

Example 11 (Comparison of the Borda and Condorcet methods). Let {a, b, c, d} be the set of candidates for a 3 voters election. Suppose that 2 voters have preferences bPaPcPd and 1 voter has preferences aPcPdPb. The Borda score of a is 5 = 2×2 + 1×1; for b, it is 6 = 2×1 + 1×4; candidates c and d receive 8 and 11. The candidate with the lowest Borda score is a; thus a is the winner for the Borda method. Using the Condorcet method, the conclusion is different: b is the Condorcet winner. Thus, when a Condorcet winner exists, he is not always chosen by the Borda method. Nevertheless, it can be shown that the Borda method never chooses a Condorcet loser, i.e. a candidate that is beaten by all other candidates by an absolute majority (contrary to the British system, see example 2).

Suppose now that candidates c and d decide not to compete because they are almost sure to lose, while the voters keep the same preferences on {a, b}. With the Borda method, the new winner is b: b now defeats a just because c and d dropped out. Thus, the fact that a defeats or is defeated by b depends upon the presence of other candidates. This can be a problem, as the set of the candidates is not always fixed: it can vary because candidates withdraw, because new solutions emerge during discussions, or because feasible solutions become infeasible and the converse. With the Condorcet method, b remains the winner, and it can be shown that this is always the case: if a candidate is a Condorcet winner, then he is still a Condorcet winner after the elimination of some candidates.

Example 12 (Borda and the independence of irrelevant alternatives). Let {a, b, c} be the set of candidates for a 2 voters election. Suppose that 1 voter has preferences aPcPb and 1 voter has preferences bPaPc. The candidate with the lowest Borda score is a. Now consider a new election where the alternatives and voters are identical but the voters changed their preferences about c: 1 voter has preferences aPbPc and 1 voter has preferences bPcPa. It turns out that b now has the lowest Borda score. Yet none of the two voters changed his opinion about the pair {a, b}: the first (resp. second) voter prefers a (resp. b) in both cases. Only the relative position of c changed, and this was enough to turn b into a winner and a into a loser. One says that the Borda method does not satisfy the independence of irrelevant alternatives. This can be seen as a shortcoming of the Borda method.
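The withdrawal effect in example 11 can be verified directly; the helper below is our own restatement of the Borda rule:

```python
def borda_winner(profile):
    """Lowest total rank wins (Borda); a sketch ignoring possible ties."""
    scores = {x: sum(r.index(x) + 1 for r in profile) for x in profile[0]}
    return min(scores, key=scores.get)

full = [["b", "a", "c", "d"]] * 2 + [["a", "c", "d", "b"]]
# Candidates c and d withdraw; each voter keeps his relative ranking of a, b.
reduced = [[x for x in r if x in ("a", "b")] for r in full]

print(borda_winner(full))     # a
print(borda_winner(reduced))  # b: the a/b outcome flipped, yet no voter
                              # changed his mind about a versus b
```

Running the same experiment with a Condorcet winner instead would leave b the winner in both elections, as the text notes.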

2.1.3 Some theoretical results

We could go on and on with examples showing that any method you can think of suffers severe problems. But we think it is time to stop, for at least two reasons. First, it is not very constructive; second, each example is related to a particular method, hence this approach lacks generality. We should find a way to answer questions like:

• Do non manipulable methods exist?
• Is it possible for a non separable method to satisfy unanimity?
• . . .

A more general (and thus theoretic) approach is needed. In another book, in preparation, we will follow such a general approach; in the present volume, we try to present various problems arising in evaluation and decision models in an informal way and to show the need for formal methods. Nevertheless, we cannot resist the desire to present now, in an informal way, some of the most famous results of social choice theory.

Arrow's theorem

Arrow (1963) was interested in the aggregation of rankings, possibly with ties, into a ranking, possibly with ties. He examined the methods verifying the following properties.

Universal domain. Whatever the rankings provided by the voters, the method must yield an overall ranking of the candidates. This property implies that the aggregation method must be applicable to all cases; it rules out methods that would impose some restrictions on the preferences of the voters.

Transitivity. The result of the aggregation must always be a ranking, possibly with ties; we will call this ranking the overall ranking. This implies that, if aPb and bPc in the overall ranking, then aPc in the overall ranking. Example 8 showed that the Condorcet method doesn't verify transitivity: a is preferred to b, b is preferred to c and c is preferred to a.

Unanimity. If all voters are unanimous about a pair of candidates, e.g. if all voters rank a before b, then a must be ranked before b in the overall ranking. This property is often called the Pareto condition. It can be shown that the Condorcet method satisfies it. This seems quite reasonable, but example 9 showed that some commonly used aggregation methods fail to respect unanimity.

Independence. The relative position of two candidates in the overall ranking depends only on their relative positions in the individual preferences; the other alternatives are considered as irrelevant with respect to that pair. This property is often called independence of irrelevant alternatives. Note that we observed in example 12 that the Borda method violates the independence property.

Non-dictatorship. None of the voters can systematically impose his preferences on the other ones. This rules out aggregation methods such that the overall ranking is always identical to the preference ranking of a given voter. This may be seen as a minimal requirement for a democratic method.

These five conditions allow to state Arrow's celebrated theorem.

Theorem 2.1 (Arrow) When the number of candidates is at least 3, there exists no aggregation method satisfying simultaneously the properties of universal domain, transitivity, unanimity, independence and non-dictatorship.

Note that Arrow's theorem uses only five conditions that are quite weak (at least at first glance); yet the result is powerful. To a large extent, this theorem explains why we encountered so many difficulties when trying to find a satisfying aggregation method. And if, in addition to these five conditions, we wish to find a method satisfying neutrality, monotonicity, separability, non-manipulability, . . . , we face an even more puzzling problem.

For example, let us observe that the Borda method satisfies the universal domain, transitivity, unanimity and non-dictatorship properties. Therefore, as a consequence of theorem 2.1, we can deduce that it cannot satisfy the independence condition. What about the Condorcet method? It satisfies the universal domain, unanimity, independence and non-dictatorship properties. Hence it cannot verify transitivity (see example 8).

Gibbard-Satterthwaite's theorem

Gibbard (Gibbard 1973) and Satterthwaite (Satterthwaite 1975) were very interested in the (non-)manipulability of aggregation methods, especially those leading to the election of a unique candidate. Informally, a method is non-manipulable if, in no case, a voter can improve the result of the election by not reporting his true preferences. They proved the following result.

Theorem 2.2 (Gibbard-Satterthwaite) When the number of candidates is larger than two, there exists no aggregation method satisfying simultaneously the properties of universal domain, non-manipulability and non-dictatorship.

Example 4 concerning the two-stage French system can be revisited bearing in mind theorem 2.2: the French system satisfies universal domain and non-dictatorship; therefore, it is not surprising that it is manipulable.

Many other impossibility results can be found in the literature, but this is not the place to review them. Besides impossibility results, many characterisations are available. A characterisation of a given aggregation method is a set of properties simultaneously satisfied by only that method. These results help to understand the fundamental principles of a method and to compare different methods.

At the beginning of this chapter, we decided to focus on elections of a unique candidate. Some voting systems lead to the election of several candidates and aim towards achieving a kind of proportional representation. One might think that those systems are the solution to our problems. In fact, they are not: those systems raise as many questions (perhaps more) as the ones we considered (Balinski and Young 1982). Furthermore, suppose that a parliament has been elected using proportional representation. This parliament will have to vote on many different issues and, very often, only one candidate or law or project will have to be chosen.

2.2 Modelling the preferences of a voter

Let us consider the assumption that we made in Section 1: the preferences of each voter can accurately be represented by a ranking of all candidates from best to worst, without ties. We all know that this is not always realistic. In some cases, a voter is unable to state if he prefers a to b or the converse because he thinks that both candidates are of equal value; in other cases, he cannot rank some candidates at all; in still other cases, he is able to rank them but another kind of modelling of his preferences would be more accurate. In this section, we list different cases in which our initial assumption is not valid.

2.2.1 Rankings

To model the preferences of a voter, we can use a ranking without ties. This model corresponds to the assumption of Section 1. It implies that, when you present a pair of candidates (a, b) to a voter, he is always able to tell if he prefers a to b or the converse, and that preference is transitive: if he prefers a to b and b to c, he necessarily prefers a to c (transitivity of preference).

Indifference: rankings with ties

In some cases, there are several candidates that a voter cannot rank, just because he considers them as equivalent; those candidates are tied. We then need to model his preferences by a ranking with ties. For each pair of candidates (a, b), we now have "a is preferred to b", the converse, or "a is indifferent to b" (which is equivalent to "b is indifferent to a"). Suppose, for example, that a voter prefers a to b, that he is indifferent between b and c and, finally, that he prefers a, b and c to d. We can model his preferences by a ranking with ties. A graphic representation of this model is given in Fig. 2.1, where an arrow between two candidates (e.g. a and b) means that a is preferred to b and a line between them means that a is indifferent to b.

[Figure 2.1: A complete pre-order. Arrows implied by transitivity are not represented.]

Note that, in a ranking with ties, preference still is transitive, and indifference also is transitive: if a voter is indifferent between a and b and between b and c, he is also indifferent between a and c.

Incomparability: partial rankings

It can also occur that a voter is unable to rank the candidates, not because he thinks that some of them are equivalent but because he cannot compare some of them. There can be several reasons for this; of course, the following list is not exhaustive.

Poor information. Suppose that a voter must compare two candidates a and b about which he knows almost nothing, except that their names are a and b and that they are candidates. It is very likely that one is better than the other but, as he doesn't know which one, he is better off not stating any preferences about them. If he is forced to express his preferences by means of a ranking with ties, he will probably rank a and b tied rather than ranking one above the other. But this would not really reflect his preferences, because he has no reason to consider that they are equivalent.

Conflicting information. Suppose that a voter has to compare two candidates a and b about which he knows a lot. He might be embarrassed when asked to tell which candidate he prefers because, in some respects, a is far better than b but, in other respects, b is far better than a. And he does not know how to balance the pros and cons, or he does not want to do so for the moment; it is difficult to say. Such a voter cannot declare that he prefers a to b nor the converse.

Confidential information. Suppose that your mother invited you and your wife for dinner. At the end of the meal, your mother says "I have never eaten such a good pie! Does NameOfYourWife prepare it as well as I do?" No matter what your preference is, you would probably be very embarrassed to answer. And your answer is very likely to be "Well, I like both but I cannot compare them." Such situations are very common in real life, where people do not tell the truth, all the truth and nothing but the truth about their preferences.

We therefore need to introduce a new model in which voters are allowed to express incomparabilities. In this model, when comparing two candidates a and b, four situations can arise:

1. a is preferred to b,
2. b is preferred to a,
3. a is indifferent to b, or
4. a and b are incomparable.

If we keep the transitivity of preference (and indifference), the structure we obtain is called a partial ranking.
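The four situations of a partial ranking can be represented very directly; the sketch below is ours, with an illustrative set of judgements (the pair data is not from the book):

```python
# A partial ranking sketch: strict preferences as ordered pairs,
# indifferences as unordered pairs; anything else is incomparable.
strict = {("a", "b"), ("a", "c")}      # illustrative: a beats b and c
ties = {frozenset(("b", "c"))}         # b and c are judged equivalent

def compare(x, y):
    """Classify the pair (x, y) into one of the four situations."""
    if (x, y) in strict:
        return "preferred"
    if (y, x) in strict:
        return "dispreferred"
    if frozenset((x, y)) in ties:
        return "indifferent"
    return "incomparable"

print(compare("a", "b"))  # preferred
print(compare("b", "c"))  # indifferent
print(compare("a", "d"))  # incomparable: d was never compared to a
```

Unlike a ranking with ties, this representation lets the voter simply stay silent about a pair, which is exactly what the "poor information" and "conflicting information" cases call for.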

Consider now a voter who is indifferent between a and b as well as between b and c. Because of the transitivity of indifference, he is necessarily indifferent between a and c. Is this what we want? We are going to borrow a small example from Luce (1956) to show that transitivity of indifference should be dropped, at least in some cases.

Example 13 (Transitivity and coffee: semiorders). Let us suppose that I present two cups of coffee to a voter: one cup without sugar, the other one with one grain of sugar. Let us also suppose that he likes his coffee with sugar. If I ask him which cup he prefers, he will tell me that he is indifferent, because he is not able to detect one grain of sugar. I will then present him a cup with one grain and another with two, then two grains and three grains, and so on until nine hundred ninety nine and one thousand grains. The voter will always be indifferent between the two cups that I present to him, because they differ by just one grain of sugar. Because of the transitivity of indifference, he must then also be indifferent between a cup without sugar and a cup with one thousand grains (2 full spoons). But if I ask him which one he prefers, he will choose the cup with one thousand grains. Thus transitivity of indifference is violated. A possible objection to this experiment is that the voter will be tired before he reaches the cup with one thousand grains; furthermore (and this is more serious), the coffee will be cold and he hates that.

Consequently, a ranking with ties cannot faithfully model the preferences of our coffee drinker. There is a structure that keeps transitivity of preference and drops it for indifference: it is called a semiorder, and it can model the preferences of our coffee drinker. For details about semiorders, see Pirlot and Vincke (1997).

Example 14 (Transitivity and ponies: more semiorders). Do we need semiorders only when a voter cannot distinguish between two very similar objects? The following example, adapted from Armstrong (1939), will give the answer. Suppose that you ask your child to choose between two presents for his birthday: a pony and a blue bicycle. As he likes both of them equally, he will say he is indifferent. Suppose now that you present him a third candidate: a red bicycle with a small bell. He will probably tell you that he prefers the red bicycle to the blue one. "So, you prefer the red bicycle to the pony, is that right?", you would say if you considered indifference as transitive. However, it is obvious that the child can still be indifferent between the pony and the red bicycle, even though he perfectly distinguishes the three presents.
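The coffee example amounts to a threshold model, which is one standard way to obtain a semiorder: preference holds only when the difference exceeds a perception threshold. A minimal sketch of ours (the threshold of 50 grains is an arbitrary illustrative value):

```python
# Semiorder sketch: cups are compared by grains of sugar, and a difference
# is perceptible only beyond an illustrative threshold of 50 grains.
THRESHOLD = 50

def prefers(x, y):
    """The drinker prefers the sweeter cup only if the difference is
    perceptible; otherwise the two cups are indifferent."""
    return x > y + THRESHOLD

def indifferent(x, y):
    return not prefers(x, y) and not prefers(y, x)

print(indifferent(0, 1))                                 # True
print(all(indifferent(n, n + 1) for n in range(1000)))   # True, each step
print(prefers(1000, 0))   # True: so indifference is not transitive
```

Preference stays transitive in this model, but chaining a thousand imperceptible indifferences does not yield indifference between the extreme cups, exactly as in Luce's story.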

[Figure 2.2: The pony vs bicycles semiorder.]

Other binary relations

Rankings with or without ties, partial rankings and semiorders are all binary relations. Many other families of binary relations have been considered in the literature in order to formally model the preferences of individuals as faithfully as possible (e.g. Abbas, Fishburn 1988, Fishburn 1991, Roubens and Vincke 1985, Pirlot and Vincke 1996). Note that even the transitivity of strict preference can be questioned on the basis of empirical observations (e.g. Tversky 1969, Sen 1997). Let us now focus on another kind of mathematical structure used to model the preferences of a voter.

2.2.2 Fuzzy relations

Fuzzy relations can be used to model preferences in at least two very different situations.

Fuzzy relations and uncertainty

When a voter is asked to express his preferences by means of a binary relation, he has to examine each pair (a, b) of candidates and choose "a is preferred to b", "b is preferred to a", "a is indifferent to b" or "a and b are incomparable" (if indifference and incomparability are allowed). In fact, reality is more subtle: when facing a question like "do you prefer a to b?", a voter might hesitate. It is easy to imagine situations where a voter would like to say "perhaps", and it is just a step further to imagine situations where he would hesitate with various degrees of confidence: almost yes but not completely sure; perhaps; perhaps but more on the side of yes; perhaps but more on the side of no; . . . There can be many reasons for his hesitations.

• He does not have full knowledge about the candidates. For example, in a legislative election, a voter does not necessarily know what the position of all candidates is regarding a particular issue.

• He does have full knowledge about the candidates, but not about some events that might occur in the future and affect the way he compares the candidates. For example, again in a legislative election, a voter might ideally know everything about all candidates and prefer candidate a to candidate b because there is just one thing that he disapproves of in the policy of b: his position about a particular issue. But he does not know if, during the forthcoming mandate, the representatives will have to vote on that issue; if no such vote is to occur, he might prefer b to a.

• He does not fully know his preferences.

In such cases, we might model the voter's preferences by a fuzzy relation: for each ordered pair of candidates, a number between 0 and 1 represents his answer to the question "is a preferred to b?". If he feels that "a is preferred to b" is definitely true, he answers 1; if he feels that it is definitely false, he answers 0; for intermediate situations, he chooses intermediate numbers ("almost yes" could be 0.9, "perhaps" could be 0.5, and so on). A value of 0 would correspond to no preference at all.
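In code, such a fuzzy relation is just a mapping from ordered pairs to degrees in [0, 1]. The sketch below is ours, with illustrative degrees (the assignment of numbers to pairs is an assumption, not taken from the book's figure), together with the classical alpha-cut that recovers a crisp relation:

```python
# A fuzzy preference relation on {a, b, c}: degrees are illustrative.
mu = {("a", "b"): 0.8, ("b", "a"): 0.3,
      ("b", "c"): 0.6, ("c", "b"): 0.4,
      ("a", "c"): 1.0, ("c", "a"): 0.0}

def alpha_cut(mu, alpha):
    """The crisp relation containing the pairs whose degree reaches alpha."""
    return {pair for pair, degree in mu.items() if degree >= alpha}

# Raising alpha keeps only the more confident preference statements.
print(sorted(alpha_cut(mu, 0.7)))  # [('a', 'b'), ('a', 'c')]
print(sorted(alpha_cut(mu, 0.5)))  # [('a', 'b'), ('a', 'c'), ('b', 'c')]
```

The alpha-cut is one common way to exploit such degrees: it turns "perhaps" answers back into an ordinary binary relation once a confidence level is fixed.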

Fuzzy relations and preference intensity In some cases. You perfectly know the two options (budget.g. For example. time to completion.8 0. perhaps could be 0.9. For example. a and b) is the answer of the voter to the question “is a preferred to b”. . he will tend to express faint diﬀerences in his judgement. but by numbers. he answers 1.5 for (a. Can you tell which one you will choose ? What will you enjoy more ? To play tennis or to let your children play in the playground ? These three cases can be seen as three facets of a single problem. You have to vote. If he feels that “a is preferred to b” is deﬁnitely true. . For intermediate situations. This is due to the fact that preference is not a clear-cut concept.3: A fuzzy relation Note that. • He does not fully know his preferences. .3 a 1.22 CHAPTER 2. plan.3 where a number on the arrow between two candidates (e. A typical fuzzy relation on three candidates is illustrated by Fig. but because the concept of preference is vague and not well deﬁned. when a voter is asked to tell if he prefers a to b. In these cases. We might then model his preferences by a fuzzy relation and choose 0. the problem faced by the voter is no longer uncertainty but risk. CHOOSING ON THE BASIS OF SEVERAL OPINIONS In the other case.8 for (c. 2. probabilities of preference might be assigned to each pair. A value of 0 would correspond to no preference. d). 0. If he feels that “a is preferred to b” is deﬁnitely false. he chooses intermediate numbers. a voter might say “I deﬁnitely prefer a to b but not as much as I prefer c to d”. The voter is uncertain about the ﬁnal consequences of his choice. no longer by yes or no.0 0. 0. You will have access to both facilities under the same conditions. in some cases.0 c Figure 2. The voter must still answer the above mentioned question (do you prefer a to b ?). b) and 0.5 and almost yes. Fuzzy relations can be used to model such preferences.4 b 0. 
You like tennis and your children would love that playground. he answers 0. In such cases.6 0. not because he is uncertain about his judgement. a probability distribution on the possible consequences is assumed to exist. ). There are two options: a tennis court or a playground. Suppose that the community in which you live has decided to build a new recreational facility. he might prefer b to a because there is just one thing that he disapproves of the policy of b: his position about that particular issue. .
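Such a fuzzy relation is easy to represent in software. The sketch below is ours, not the book's, and its numbers are merely illustrative (they are not those of Fig. 2.3): the degrees are stored in a dictionary, and an ordinary crisp relation is recovered by cutting at a threshold.

```python
# A fuzzy preference relation as a dictionary mapping ordered pairs of
# candidates to a degree in [0, 1]: the voter's graded answer to the
# question "is x preferred to y ?" (values invented for illustration).
fuzzy_pref = {
    ("a", "b"): 0.8, ("b", "a"): 0.4,
    ("a", "c"): 1.0, ("c", "a"): 0.0,
    ("b", "c"): 0.9, ("c", "b"): 0.3,
}

def degree(x, y, relation):
    """Degree to which x is preferred to y (0.0 for unlisted pairs)."""
    return relation.get((x, y), 0.0)

# A crisp ("yes or no") relation is recovered by cutting at a threshold:
crisp = sorted(pair for pair, d in fuzzy_pref.items() if d > 0.5)
print(crisp)  # [('a', 'b'), ('a', 'c'), ('b', 'c')]
```

Cutting at different thresholds yields different crisp relations, which is one way of seeing how much information the graded answers add.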

Note that, in many cases, uncertainty and vagueness are probably simultaneously present. For a thorough review of fuzzy preference modelling, see (Perny and Roubens 1998).

2.2.3 Other models

Many other models can be conceived or have been described in the literature. An important one is the utilitarian one: a voter assigns to each candidate a number (the utility of the candidate). Consequently, if the utilities of a and b are respectively 50 and 40, the implication is that a is preferred to b. In addition, if the utilities of c and d are respectively 30 and 10, it implies that the preference between c and d is twice as large as the preference between a and b. The position of a candidate with respect to any other candidate is a function only of the utilities of the two candidates.

Another important model is used in approval voting (Brams and Fishburn 1982). In this voting system, every voter votes for as many candidates as he wants or approves. Hence, the preferences of a voter are modelled by a partition of the set of candidates into two subsets: a subset of approved candidates and a subset of disapproved candidates. Approval voting received a lot of attention during the last twenty years and has been adopted by a number of committees.

We will not continue our list of preference models any further. Our aim was just to give a small overview of the many problems that can arise when trying to model the preferences of a voter. We encountered many problems in Section 2.1, where we were using complete orders to model the preferences of the voters. We then examined alternative models. Is it easier to aggregate individual preferences modelled by means of complete pre-orders, semiorders, fuzzy relations, ... ? Unfortunately, the answer is no. Many examples, similar to those in Section 2.1, can be built to demonstrate this (Sen 1986, Salles, Barrett and Pattanaik 1992). Nevertheless, there is an important issue that we still must address: the voting process itself.

2.3 The voting process

Until now, we considered only modelling the preferences of a voter and aggregating the preferences of several voters. But voting is much more than that. In this section, we consider a few points that are included in the voting process, even if they are often left aside in the literature.

2.3.1 Definition of the set of candidates

Who is going to define the candidates or alternatives that will be submitted to a vote ? All the voters, some of them or one of them ? In some cases, e.g. presidential elections, the candidates are voters that become candidates on a voluntary basis. Nevertheless, there are often some rules: not everyone can be a candidate. Who should fix these rules and how ? There is an even more fundamental question: who should decide that voting should occur, on what issue, according to which

rules ? All these questions received different answers in different countries and committees. This may indicate that they are far from trivial. There is no universal answer.

Let us now be more pragmatic. The board of directors of a company asks the executive committee to prepare a report on the future investment strategies. A vote on the proposed strategies will be held during the next board of directors meeting. How should the executive committee prepare its report ? Should they include all strategies, even infeasible ones ? If infeasible ones are to be avoided, who should decide that they are infeasible ? To find all feasible strategies might be prohibitively resource and time consuming; suppose then that the executive committee decides to explore only some strategies. A more or less arbitrary selection needs to be made. There is no systematic way, no formal method to do that. Creativity and imagination are needed during this process. And one can never be sure that all feasible strategies have been explored. Even if the members of the committee make this selection in a perfectly honest way, the selection will partly condition the outcome of the vote.

2.3.2 Definition of the set of the voters

Who is going to vote ? Citizens, men and women, men, white men, noble people, rich people, experts who have some knowledge about the discussed problem, one representative for each faction, a number of representatives proportional to the size of that faction, ... Different groups, past or present, have answered this question differently. Some references or more details on this topic can be found in (Sen 1997, Suzumura 1999).

2.3.3 Choice of the aggregation method

Even the choice of the aggregation method can be considered as part of the voting process for, in some cases, it can have far reaching consequences on the outcome of the process. Remember example 11 in which we showed that, for some aggregation methods, the relative ranking of two candidates depends on the presence (or absence) of some other candidates. Furthermore, some studies show that an individual can prefer a to b or b to a depending on the presence or absence of some other candidate (Sen 1997).

Finally, let us look at different democracies. Consider two countries, A and B: A is ruled by a dictator; B is a democracy. Suppose that each time a policy is chosen by voting in B, the dictator of A applies the same policy in his country. Hence, all governmental decisions are the same in A and B. The only difference is that the people in A do not vote: their benevolent dictator decides alone, without voting. In what country would you prefer to live ? I guess you would choose B, unless you are the dictator. And you would probably choose B even if the decisions taken in B were a little bit worse than the decisions taken in A. What we value in B is freedom of choice. Hence, the aggregation method is at least as important as the result of the vote.
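The dependence of the relative ranking of two candidates on the presence of a third one, recalled above, is easy to reproduce with the Borda count. The five-voter profile below is our own invented illustration, not an example taken from this chapter:

```python
# With the Borda count, each candidate scores (n - 1 - position) points
# per ballot. Removing candidate c reverses the ranking of a and b.
def borda(profile):
    """profile: list of rankings (best candidate first); returns scores."""
    n = len(profile[0])
    scores = {}
    for ranking in profile:
        for pos, cand in enumerate(ranking):
            scores[cand] = scores.get(cand, 0) + (n - 1 - pos)
    return scores

with_c = [["b", "a", "c"]] * 3 + [["a", "c", "b"]] * 2
without_c = [[x for x in r if x != "c"] for r in with_c]

print(borda(with_c))     # {'b': 6, 'a': 7, 'c': 2} -> a is ranked above b
print(borda(without_c))  # {'b': 3, 'a': 2}         -> the ranking is reversed
```

Note that no ballot changed between the two tallies; only the set of candidates did.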

In this chapter, we only discussed elections in which only one candidate must be chosen (single-seat constituencies, prime ministers or presidents). However, it is often the case that several candidates must be chosen: for example, in Belgium and Germany, several representatives are elected in each constituency so as to achieve a proportional representation. Besides, such cases are common outside politics. A committee that must select projects from a list often selects several ones. An investor usually invests in a portfolio of stocks. A human resources manager chooses amongst the candidates those that will form an efficient team.

2.4 Social choice and multiple criteria decision support

2.4.1 Analogies

There is an interesting analogy between voting and multiple criteria decision support. Let us be more explicit. In a large part of the literature on voting, in social choice theory, most papers consider an entity, called group or society, that has to choose a candidate from a set of candidates. This entity consists of individuals and, for some reasons, the choice made by this entity must reflect in some way the opinion of the individuals, according to rules that can vary largely in different groups. Of course, the individuals often have conflicting views about the candidates. In multiple criteria decision support, an entity, called decision-maker, wants to choose an alternative from a set of alternatives, available according to the resources. The decision-maker is often assumed to be an individual, a person. To make his choice, the decision maker takes several viewpoints called criteria into account. These criteria are often conflicting: for example, according to a criterion, a given alternative is the best one while, according to another criterion, other alternatives are better. Hence, the preferences of an individual play the same role, in social choice, as the preferences along a single viewpoint or criterion in multiple criteria decision support. The collective or social preferences, in social choice, and the global or multiple criteria preferences, in multiple criteria decision support, can be compared in the same way. Replace criteria by voters, alternatives by candidates and you get it. In fact, the comparison can be extended to the processes of voting and decision-making.

The main interest of this analogy lies in the fact that voting has been studied for a long time. The seminal works by Borda (1781), Condorcet (1785), and Arrow (1963) have led to an important stream of research in the 20th century. Hence we have a huge amount of results on voting at our disposal for use in multiple criteria decision support and, indeed, this similarity has widely been used (see e.g. Arrow and Raynaud 1986, Vansnick 1986).

However, in multiple criteria decision support, the decision process is much broader than just the extraction, by some aggregation method, of the best alternative from a performance tableau. The very beginning of the process, i.e. the problem definition, is a crucial step. When a decision maker enters a decision process, he has no clearly defined problem. He just feels unsatisfied with his current situation. He then tries to structure his

view of the situation: to put labels on different entities, to look for relationships between entities, and so on. Brainstorming and other techniques promoting and stimulating creativity have been developed to support this step. Some authors, mainly in the United Kingdom, have developed methods to help decision-makers to better structure their problem (Rosenhead 1989, Daellenbach 1994). Finally he obtains a "problem". It is a description, in formal language or not, as one can find in books, of the current situation. It usually contains a description of the reasons for which that situation is not satisfying, and it contains an implicit description of the potential solutions to the problem. That is, the problem statement contains information that allows to recognise if a given action or course of actions is a potential solution or not. The problem statement must not be too broad, otherwise anything can be a solution and the decision-maker is not helped. On the contrary, if the statement is too narrow, some actions are not recognised as potential solutions even if they would be good ones.

When the problem has been stated, the decision-maker has a problem, but no solution. He must construct the set of alternatives, like the candidates set in social choice. The criteria, like the voters, are not given in a decision process. The decision-maker needs to identify all the viewpoints that are relevant with respect to his problem. He then must define a set of criteria that reflect all relevant viewpoints and that fulfill some conditions (see e.g. Roy (1996) and Keeney and Raiffa (1976)). There must not be several criteria reflecting the same viewpoint. All criteria should be independent, except if the aggregation method to be used thereafter allows dependence between criteria. Depending on the aggregation method, the scales corresponding to the criteria must have some properties. And so on.

Last but not least, the aggregation method itself must be chosen by the analyst and/or the decision-maker. It is hard to imagine how an aggregation procedure could be scientifically proven to be the best one; in this domain, common sense is of very little help. The decision-maker must thus make a choice. He should choose the procedure that satisfies some properties he judges important, the one he can understand, the one he trusts.

2.5 Conclusions

In this chapter, we have shown that the operation of voting is far from simple. In the first section, using small examples describing very simple situations, we found that intuition and common sense are not sufficient to avoid the many traps that await us when using aggregation procedures. We also presented two theoretical results indicating that there is no hope of finding a perfect voting procedure. So, if we still want to use a voting procedure – this seems hardly avoidable – we must accept to use an imperfect one. But this does not mean that we can use any procedure in any circumstance and any way. Some features of a voting procedure may be highly desirable in a given context while not so important in another one. The flaws of a particular procedure are probably less damageable in some instances than in others. Therefore, for each voting context, we have to choose the procedure that best matches our

needs. And, when we have made this choice, we must be aware that this match is not perfect and that we must use the procedure in such a way that the risk of facing a problematic situation is kept as low as possible.

In Section 2.2, we found that even the input of voting procedures – the preferences of the voters – are not simple things. Many different models for preferences exist and can be used in aggregation procedures. The choice of a particular model (ranking with ties, fuzzy relations, ...) is itself arbitrary: nothing in the "problem" tells us what model to use. When we feed our aggregation procedures with preferences, the data are not data: these are not given. They are constructed in some more or less arbitrary way; they are imperfect and arbitrary models, just like student grades, indicators, ... This shows that what is usually considered as data is not really data. Finally, in Section 2.3, we showed that the voting process itself is highly complex.

Voting procedures are decision models: they are decision models devoted to the special case where a decision must be taken by a group of voters, and they are mainly concerned with the case of a finite and small set of alternatives. This peculiarity doesn't make voting procedures very different from other decision and evaluation models: multiple criteria decision support (this has already been discussed in Section 2.4), cost-benefit analysis, ... As you will see in the following chapters, most decision models suffer the same kind of problems that we have met in this chapter: there is no perfect aggregation procedure, the data are not given, and the decision models are too narrow, in the sense that they do not take into account the fact that decision support occurs in a human process (the decision making process) and in a complex environment.


3 BUILDING AND AGGREGATING EVALUATIONS: THE EXAMPLE OF GRADING STUDENTS

3.1 Introduction

3.1.1 Motivation

In chapter 2, we tried to show that "voting", although being a familiar activity to almost everyone, raises many important and difficult questions that are closely connected to the subject of this book. Our main objective in this chapter is similar. The authors of this book spend part of their time evaluating the performance of students through grading several kinds of work, an activity that you may also be familiar with. The purpose of this chapter is to build upon this shared experience. This will allow us to discuss, based on simple and familiar situations, what is meant by "evaluating a performance" and "aggregating evaluations", both activities being central to most evaluation and decision models.

Although the entire chapter is based on the example of grading students, it should be stressed that "grades" are often used in contexts unrelated to the evaluation of the performance of students: employees are often graded by their employers, products are routinely tested and graded by consumer organisations, experts are used to rate the feasibility or the riskiness of projects, etc. The findings of this chapter are therefore not limited to the realm of a classroom.

We all share the – more or less pleasant – experience of having received "grades" in order to evaluate our academic performances. The authors of this book have studied in four different European countries (Belgium, France, Greece and Italy) and obtained degrees in different disciplines (Maths, Operational Research, Computer Science, Management, Physics, Geology) and in different Universities. We were not overly astonished to discover that the rules that governed the way our performances were assessed were quite different. As with voting systems, there is much variance across countries in the way "education" is organised: curricula, grading scales, rules for aggregating grades and granting degrees, etc. are seldom similar from place to place (for information on the systems used in the European Union see www.eurydice.org). This diversity is even increased by the fact that each "instructor" (a word that we shall use to mean the person in charge of evaluating students) has generally developed his own policy and habits. We were perhaps more surprised to realise that

although we all teach similar courses in comparable institutions, our "grading policies" were quite different, even after having accounted for the fact that these policies are partly contingent upon the rules governing our respective institutions. Such diversity might indicate that evaluating students is an activity that is perhaps more complex than it appears at first sight.

3.1.2 Evaluating students in Universities

We shall restrict our attention in this chapter to education programmes with which we are familiar. Our general framework will be that of a programme at University level in which students have to take a number of "courses" or "credits". In each course the performance of students is graded. These grades are then collected and form the basis of a decision to be taken about each student. Quite often the various grades are "summarised", we shall say "aggregated" or "amalgamated", in some way before a decision is taken. Depending on the programme, this decision may take various forms, e.g. success or failure, success or failure with possible additional information such as distinctions, ranks or average grades, or success or failure with the possibility of a differed decision (e.g. the degree is not granted immediately but there is still a possibility of obtaining it).

In what follows, we shall implicitly have in mind the type of programmes in which we teach (Mathematics, Operational Research, Computer Science, Engineering) that are centred around disciplines which, at least at first sight, seem to raise less "evaluation problems" than if we were concerned with, say, Philosophy, Music or Sports. Dealing only with "technically-oriented" programmes at University level will clearly not allow us to cover the immense literature that has been developed in Education Science on the evaluation of the performance of students. For good accounts in English, we refer to Airaisian (1991), Cardinet (1986), Davis (1993), Lindheim, Morris and Fitz-Gibbon (1987), McLean and Lockwood (1996), Moom (1997) and Speck (1998). Note that in Continental Europe, the Piagetian influence, different institutional constraints and the popularity of the classic book by Piéron (1963) have led to a somewhat different school of thought, see Bonboir (1972), de Ketele (1982), de Landsheere (1980), Merle (1996) and Noizet and Caverini (1978). As we shall see, this restriction will however allow us to raise several important issues concerning the evaluation and the aggregation of performances.

Two types of questions prove to be central for our purposes:

• how to evaluate the performance of students in a given "course", what is the meaning of the resulting "grades" and how to interpret them?

• how to combine the various grades obtained by a student in order to arrive at an overall evaluation of his academic performance?

These two sets of questions structure this chapter into sections.
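To fix ideas on what such an aggregation may look like, here is a minimal sketch. It is entirely ours: the courses, weights, grades and the pass threshold of 10 on a 0-20 scale are invented, and a plain weighted average is only one of many conceivable rules.

```python
# Aggregating the grades of one student (0-20 scale) into an average,
# then turning that average into a success/failure decision.
grades = {"Maths": 13, "OR": 9, "Physics": 15}      # invented grades
weights = {"Maths": 50, "OR": 30, "Physics": 20}    # invented weights, in %

average = sum(weights[c] * grades[c] for c in grades) / 100
decision = "success" if average >= 10 else "failure"
print(average, decision)  # 12.2 success
```

Note that the OR grade is below 10 and is compensated by the other courses; whether such compensations are acceptable is exactly the kind of question this chapter examines.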

3.2 Grading students in a given course

Most of you have probably been in the situation of an "instructor" having to attribute grades to students. Although this is clearly a very important task, many instructors share the view that this is far from being the easiest and most pleasant part of their jobs. We shall try here to give some hints on the process that leads to the attribution of a grade as well as on some of its pitfalls and difficulties.

3.2.1 What is a grade?

We shall understand a grade as an evaluation of the performance of a student in a given course, i.e. an indication of the level to which a student has fulfilled the objectives of the course. This very general definition calls for some remarks.

1. A grade should always be interpreted in connection with the objectives of a course. Although it may appear obvious, this implies a precise statement of the objectives of the course in the syllabus, a condition that is unfortunately not always perfectly met.

2. All grades do not have a similar function. Whereas usually the final grade of a course in Universities mainly has a "certification" role, intermediate grades, on which the final grade may be partly based, have a more complex role that is often both "certificative" and "formative". For example, the result of a mid-term exam is included in the final grade but is also meant to be a signal to a student indicating his strengths and weaknesses.

3. Although this is less obvious in Universities than in elementary schools, it should be noticed that grades are not only a signal sent by the instructor to each of his students. They have many other potential important "users": other students using them to evaluate their position in the class, other instructors judging your severity and/or performance, employers looking for all possible information on an applicant for a job, parents watching over their child, administrations evaluating the performance of programmes.

Thus, it appears that a grade is a complex "object" with multiple functions (see Chatel 1994, Laska and Juarez 1992, Lysne 1984, McLean and Lockwood 1996). Interpreting it necessarily calls for a study of the process that leads to its attribution.

3.2.2 The grading process

What is graded and how? The types of work that are graded, the scale used for grading and the way of amalgamating these grades may vary in significant ways for similar types of courses. We shall come back to that point in section 3.3.
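Since grades expressed on a scale are eventually aggregated, the conventional choice of a scale is not entirely innocuous. The sketch below is ours (both numeric encodings of the letter scale and the grades are invented): two defensible encodings of the same letter grades rank two students in opposite orders once averages are taken.

```python
# Two numeric encodings of the same F-to-A letter scale: a linear one,
# and one that rewards excellence more strongly (both invented).
linear = {"F": 0, "E": 1, "D": 2, "C": 3, "B": 4, "A": 5}
convex = {"F": 0, "E": 1, "D": 2, "C": 3, "B": 5, "A": 10}

def average(letters, encoding):
    return sum(encoding[g] for g in letters) / len(letters)

student1 = ["A", "F"]  # brilliant in one course, fails the other
student2 = ["C", "C"]  # uniformly average

print(average(student1, linear), average(student2, linear))  # 2.5 3.0
print(average(student1, convex), average(student2, convex))  # 5.0 3.0
```

Under the linear encoding student2 comes out ahead; under the other, student1 does, although every letter grade is unchanged.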

1. The number and type of work graded may vary a lot: final exam, mid-term exam, exercises, case-studies or even "class participation". Some courses are evaluated on the basis of a single exam. In most courses the final grade is based on grades attributed to multiple tests. Furthermore, the way these various grades are aggregated is diverse: simple weighted average, imposition of a minimal grade at the final exam, grade only based on exams with group work (e.g. case-studies or exercises) counting as a bonus, etc.

2. There are many possible types of exams. They may be written or oral; they may be open-book or closed-book. Their content for similar courses may vary from multiple choice questions to exercises, case-studies or essays. Their duration may vary (45 minute exams are not uncommon in some countries whereas they may last up to 8 hours in some French programmes).

3. The scale that is used for grading students is usually imposed by the programme. Numerical scales are often used in Continental Europe, with varying bounds and orientations: 0-20 (in France or Belgium), 0-30 (in Italy), 6-1 (in Germany and parts of Switzerland), 0-100 (in some Universities). American and Asian institutions often use a letter scale, e.g. E to A or F to A (an overview of grading policies and practices in the USA can be found in Riley, Checca, Singer and Worthington 1994). Most of us would agree that the choice of a particular scale is mainly conventional: obviously we would not want to conclude from this that Italian instructors have come to develop much more sensitive instruments for evaluating performance than German ones, or that the evaluation process is in general more "precise" in Europe than it is in the USA. It should however be noted that since grades are often aggregated at some point, such choices might not be totally without consequences, for reasons to be explained later.

4. Some instructors use "raw" grades; others modify the "raw" grades in some way, e.g. by standardising them, before aggregating and/or releasing them.

Preparing and grading a written exam

Within a given institution, suppose that you have to prepare and grade a written, closed-book exam. We shall take the example of an exam for an "Introduction to Operational Research (OR)" course, with the aim of giving students a basic understanding of the modelling process in OR and an elementary mastering of some basic techniques, including Linear Programming (LP), Integer Programming and Network models (Simplex Algorithm, Branch and Bound, elementary Network Algorithms).

1. Preparing a subject. All instructors know that preparing the subject of an exam is a difficult and time consuming task. Many different choices interfere with such a task. Is the subject of adequate difficulty? Does it contain enough questions to cover all parts of the programme?

Do all the questions clearly relate to one or several of the announced objectives of the course? Will it allow to discriminate between students? Is there a good balance between modelling and computational skills? What should the respective parts of closed vs. open questions be?

2. Preparing a marking scale. The preparation of the marking scale for a given subject is also of utmost importance. Will the marking scale include a bonus for work showing good communication skills and/or will misspellings be penalised? How to deal with computational errors? How to deal with computational errors that lead to inconsistent results? How to deal with computational errors influencing the answers to several questions? How to judge an LP model in which the decision variables are incompletely defined? How to judge a model that is only partially correct? How to judge a model which is inconsistent from the point of view of units? A "nice-looking" subject might be impractical in view of the associated marking scale. Although much expertise and/or "rules of thumb" are involved in the preparation of a good subject and its associated marking scale, we are aware of no instructor not having had to revise his judgement after correcting some work and realising his severity, and/or to correct work again after discovering some frequently given half-correct answers that were unanticipated in the marking scale.

3. Grading. A grade evaluates the performance of a student in completing the tasks implied by the subject of the exam and, hopefully, will give an indication of the extent to which a student has met the various objectives of the course (in general an exam is far from dealing with all the aspects that have been dealt with during the course). Although this is debatable, such an evaluation is often thought of as a "measure" of performance. For this kind of "measure" the psychometric literature (see Ebel and Frisbie 1991, Kerlinger 1986, Popham 1981) has traditionally developed at least two desirable criteria. A measure should be:

• reliable, i.e. give similar results when applied several times in similar conditions,

• valid, i.e. measure what was intended to be measured and only that.

Extensive research in Education Science has found that the process of giving grades to students is seldom perfect in these respects (a basic reference remains the classic book of Piéron (1963); Airaisian (1991) and Merle (1996) are good surveys of recent findings). We briefly recall here some of the difficulties that were uncovered.

The crudest reliability test that can be envisaged is to give similar works to correct to several instructors and to record whether or not these works are graded similarly. Such experiments were conducted extensively in various disciplines and at various levels. Not overly surprisingly, most experiments have shown that even in the more "technical" disciplines (Maths, Physics, Grammar), in which it is possible to devise rather detailed marking scales,

34 CHAPTER 3. BUILDING AND AGGREGATING EVALUATIONS

The influence of correction habits
Experience shows that "correction habits" tend to vary from one instructor to another: there is much difference between correctors. On average the difference between the more generous and the more severe correctors on Maths work can be as high as 2 points on a 0-20 scale. Even more strikingly, on some work in Maths the difference can be as high as 9 points on a 0-20 scale (see Piéron 1963). The distribution of grades for similar papers will tend to be highly different according to the corrector. Some of them will tend to give an equal percentage of all grades and will tend to use the whole range of the scale. Some will systematically avoid the extremes of the range and the distribution of their marks will have little variability. Others will tend to give only extreme marks, e.g. arguing that either the basic concepts are understood or they are not. Some are used to giving the lowest possible grade after having spotted a mistake which, in their minds, implies that "nothing has been understood" (e.g. proposing a "non linear LP model"). In other experiments the same correctors are asked to correct a work that they have already corrected earlier. These auto-reliability tests give similar results since in more than 50% of the cases the second grade is "significantly" different from the first one. Although few experiments have been conducted with oral exams, it seems fair to suppose that they are no more reliable than written ones. Other experiments have shown that many extraneous factors may interfere in the process of grading a paper and therefore question the validity of grades. Instructors accustomed to grading papers will not be surprised to note that:
• grades usually show much auto correlation: similar papers handed in by a usually "good" student and by a usually "uninterested" student are likely not to receive similar grades;
• "anchoring effects" are pervasive: it is always better to be corrected after a remarkably poor work than after a perfect one;
• the order in which papers are corrected greatly influences the grades: near the end of a correction task, most correctors are less generous and tend to give grades with a higher variance;
• misspellings and poor hand-writing prove to have a non negligible influence on the grades even when the instructor declares not to take these effects into account or is instructed not to.
In order to cope with such effects, some instructors will tend to standardise the grades before releasing them (the so-called "z-scores"); others will tend to equalise average grades from term to term and/or use a more or less ad hoc procedure.

Defining a grading policy
A syllabus usually contains a section entitled "grading policy". Although instructors do not generally consider it as the most important part of their syllabus, they are aware that it is probably the part that is read first and most attentively by all students.
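The "z-score" standardisation mentioned above (centre and rescale each corrector's grades to a common mean and spread) can be sketched as follows; the target mean of 10 and standard deviation of 3 are our own illustrative choices, not values from the text:

```python
from statistics import mean, pstdev

def standardise(grades, target_mean=10.0, target_sd=3.0):
    """Rescale a list of grades to a chosen mean and standard deviation
    (the 'z-score' idea: centre, reduce, then map onto a common scale)."""
    m, s = mean(grades), pstdev(grades)
    if s == 0:                       # all grades equal: nothing to spread
        return [target_mean] * len(grades)
    return [target_mean + target_sd * (g - m) / s for g in grades]

# Two correctors with different habits grade the same five papers:
severe   = [4, 6, 8, 10, 12]          # wide spread, low mean
generous = [13, 13.5, 14, 14.5, 15]   # little variability, high mean

# After standardisation both distributions have the same mean and spread,
# so the two correctors' grades become directly comparable.
print(standardise(severe))
print(standardise(generous))
```

Note that this makes the grades a purely relative evaluation: as the text remarks, any "absolute" reading of the standardised numbers is lost.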

On top of describing the type of work that will be graded, the nature of exams and the way the various grades will contribute to the determination of the final grade, this section usually describes in detail the process that will lead to the attribution of the grades for the course. Besides useful considerations on "ethics", it usually also contains many "details" that may prove important in order to understand and interpret grades. Among these "details", let us mention:
• the type of preparation and correction of the exams: who will prepare the subject of the exam (the instructor or an outside evaluator)? Will the work be corrected once or more than once (in some Universities all exams are corrected twice)? Will the names of the students be kept secret?
• the possibility of revising a grade: are there formal procedures allowing the students to have their grades reconsidered? Do the students have the possibility of asking for an additional correction? Do the students have the possibility of taking the same course at several moments in the academic year? What are the rules for students who cannot take the exam (e.g. because they are sick)?
• the policy towards late assignments (no late assignment will be graded, minus x points per hour or day);
• the policy towards cheating and other dishonest behaviour (exclusion from the programme, attribution of the lowest possible grade for the course, attribution of the lowest possible grade for the exam).

Determining final grades
The process of the determination of the final grades for a given course can hardly be understood without a clear knowledge of the requirements of the programme in order to obtain the degree. In some programmes students are only required to obtain a "satisfactory grade" (it may or not correspond to the "middle" of the grading scale that is used) for all courses. In others, an "average grade" is computed and this average grade must be over a given limit to obtain the degree. Some programmes attribute different kinds of degrees through the use of "distinctions". Some courses (e.g. "core courses") are sometimes treated apart. In others, a "dissertation" may have to be completed. A grade can hardly be interpreted without a clear knowledge of these rules (note that this sometimes creates serious problems in institutions allowing students pertaining to different programmes with different sets of rules to attend the same courses). The freedom of an instructor in arranging his own grading policy is highly conditioned by this environment. Within a well defined set of rules, however, many degrees of freedom remain. We examine some of them below.

Weights
We mentioned that the final grade for a course was often the combination of several grades obtained throughout the course: mid-term exam, final exam, case-studies, dissertation, etc. The usual way to proceed is to give a (numerical)

weight to each of the work entering into the final grade and to compute a weighted average, more important works receiving higher weights. Although this process is simple and almost universally used, it raises some difficulties that we shall examine in section 3.3. Let us simply mention here that the interpretation of "weights" in such a formula is not obvious. Most instructors would tend to compensate for a very difficult mid-term exam (weight 30%) by preparing a comparatively easier final exam (weight 70%). However, if the final exam is so easy that most students obtain very good grades, the differences in the final grades will be attributable almost exclusively to the mid-term exam although it has a much lower weight than the final exam. The same is true if the final grade combines an exam with a dissertation: since the variance of the grades is likely to be much lower for the dissertation than for the exam, the former may only marginally contribute towards explaining differences in final grades, independently of the weighting scheme. In order to avoid such difficulties, some instructors standardise grades before averaging them. However, it is clear that the more or less arbitrary choice of a particular measure of dispersion (why use the standard deviation and not the inter-quartile range? should we exclude outliers?) may have a crucial influence on the final grades. Furthermore, the manipulation of such "distorted grades" seriously complicates the positioning of students with respect to a "minimal passing grade" since their use amounts to abandoning any idea of "absolute" evaluation in the grades. The use of weighted averages may also give undesirable results since all sorts of compensation effects are at work: an excellent group case-study may, for example, compensate for a very poor exam. Although this might be desirable in some situations, it is not always so. Similarly, weighted averages do not take the progression of the student during the course into account.

Passing a course
In some institutions, you may either "pass" or "fail" a course and the grades obtained in several courses are not averaged. An essential problem for the instructor is then to determine which students are above the "minimal passing grade". Given that an exam only gives partial information about the amount of knowledge of the student, the question boils down to deciding what amount of the programme a student should master in order to obtain a passing grade. When the final grade is based on a single exam, we have seen that it is not easy to build a marking scale; it is even more difficult to conceive a marking scale in connection to what is usually the minimal passing grade according to the culture of the institution. The problem is clearly even more difficult when the final grade results from the aggregation of several grades. It should be noted that the problem of positioning students with respect to a minimal passing grade is more or less identical to positioning them with respect to any other "special grades", e.g. the minimal grade for being able to obtain a "distinction" or to be cited on the "Dean's honour list" or the "Academic Honour Roll".
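The variance effect described under "Weights" above is easy to see on made-up numbers: with a 30%/70% weighting, an easy, low-variance final exam lets the mid-term drive the ranking. All grades below are hypothetical:

```python
def final_grade(mid, fin, w_mid=0.3, w_fin=0.7):
    """Weighted average of a mid-term and a final exam grade (0-20 scale)."""
    return w_mid * mid + w_fin * fin

# The final exam was easy, so its grades bunch up around 18,
# while the harder mid-term spreads the students out.
students = {"a": (6, 18), "b": (12, 17.5), "c": (18, 18.5)}

totals = {name: final_grade(m, f) for name, (m, f) in students.items()}
ranking = sorted(totals, key=totals.get, reverse=True)
print(totals)   # c > b > a: the ranking follows the mid-term order...
print(ranking)  # ...although the mid-term only weighs 30%
```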

3.2.3 Interpreting grades

Grades from other institutions
In view of the complexity of the process that leads to the attribution of a grade, it should not be a surprise that most instructors find it very difficult to interpret grades obtained in another institution. Consider a student joining your programme after having obtained a first degree at another University. Arguing that he has already passed a course in OR with 14 on a 0-20 scale, he wants to have the opportunity to be dispensed from your class. Not aware of the grading policy of the instructor and of the culture and rules of the previous University this student attended, knowing that he obtained 14 offers you little information. The knowledge of his rank in the class may be more useful: if he obtained one of the highest grades this may be a good indication that he has mastered the contents of the course sufficiently. However, if you were to know that the lowest grade was 13 and that 14 is the highest, you would perhaps be tempted to conclude that the difference between 13 and 14 may not be very significant and/or that you should not trust grades that are so generous and exhibit so little variability.

Grades from colleagues
Being able to interpret the grade that a student obtained in your own institution is quite important, at least as soon as some averaging of the grades is performed in order to decide on the attribution of a degree. This task is clearly easier than the preceding one: the grades that are to be interpreted here have been obtained in a similar environment. However, in section 3.2.2 we mentioned that, even within fixed institutional constraints, each instructor still had many degrees of freedom to choose his grading policy. Unless there is a lot of co-ordination between colleagues, they may apply quite different rules, e.g. in dealing with late assignments or in the nature and number of exams. This seriously complicates the interpretation of the profile of grades obtained by a student. It should be observed that there is no clear implication in having obtained a similar grade in two different courses. Is it possible or meaningful to assert that a student is "equally good" in Maths and in Literature? Is it possible to assert that, given the level of the programme, he has satisfied to a greater extent the objectives of the Maths course than the objectives of the Literature course? Our experience as instructors would lead us to answer negatively to such questions even when talking of programmes in which all objectives are very clearly stated.

Interpreting your own grades
In fact, we would like to argue that even this task is not an easy one. The numerical scales used for grades throughout Europe tend to give the impression that grades are "real measures" and that, consequently, these numbers may be manipulated as any other numbers. There are many possible kinds of "measure", however, and having a numerical scale is no guarantee that the numbers on that scale may be manipulated in all possible ways. Therefore, before manipulating numbers supposedly resulting from "measurements" it is always important to try to figure out on which type of scales they have been "measured".

The highest point on the scale
An important feature of all grading scales is that they are bounded above. Unless you admit that knowledge is bounded or, more realistically, that "perfectly fulfilling the objectives of a course" makes clear sense, problems might appear at the upper bound of the scale. Consider two excellent, but not necessarily "equally excellent", students. They cannot obtain more than the perfect grade 20/20. Equality of grades at the top of the scale (or near the top, depending on grading habits) does not necessarily imply equality in performance (after a marking scale is devised it is not exceptional that we would like to give some students more than the maximal grade, e.g. because some bonus is added for particularly clever answers, whereas the computer system of most Universities would definitely reject such grades!). It should be clear that the numerical value attributed to the highest point on the scale is somewhat arbitrary and conventional. At best it seems that grades should be considered as expressed on a ratio scale, i.e. a scale in which the unit of measurement is arbitrary (such scales are frequent in Physics: e.g. length can be measured in meters or inches without loss of information). Saying that Mr. X weighs twice as much as Mr. Y "makes sense" because this assertion is true whether mass is measured in pounds or in kilograms. If grades can be considered as measured on a ratio scale, no loss of information would be incurred using a 0-100 or a 0-10 scale instead of a 0-20 one. However, it should be recognised that this ratio scale is somewhat awkward because it is bounded above.

The lowest point on the scale
It should be clear that the numerical value that is attributed to the lowest point of the scale is no less arbitrary and conventional than was the case for the highest point. In some institutions the lowest grade is reserved for students who did not take the exam. Even when the lowest grade can be obtained by students having taken the exam, some ambiguity remains. Let us notice that using a scale that is bounded below is also problematic: equality of grades at the bottom of the scale clearly does not imply that the students are "equally ignorant". Hence it would seem that a 0-20 scale might be better viewed as an interval scale, i.e. a scale in which both the origin and the unit of measurement are arbitrary (think of temperature scales in Celsius or Fahrenheit). An interval scale allows comparisons of "differences in performance", since changing the unit and origin of measurement clearly preserves such comparisons: on such a scale it makes sense to assert that the difference between 0 and 10 is similar to the difference between 10 and 20, or that the difference between 8 and 10 is twice as large as the difference between 10 and 11. By contrast, saying that the average temperature in city A is twice as high as the average temperature in city B may be true but makes little sense, since the truth value of this assertion clearly depends on whether temperature is measured using the Celsius or the Fahrenheit scale; let us notice that this is true even in Physics. There is nothing easier than to transform grades expressed on a 0-20 scale into grades expressed on a 100-120 scale and this involves no loss of information. "Knowing nothing", i.e. having completely failed to meet any of the objectives of the course, is difficult to define and is certainly contingent upon the level of the course.
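A small numerical check of the scale distinctions just made, using the transformations mentioned in the text (0-20 to 100-120 changes the origin; 0-20 to 0-100 changes only the unit):

```python
def to_100_120(g):
    """Admissible transformation of an interval scale: change of origin."""
    return g + 100

def to_0_100(g):
    """Admissible transformation of a ratio scale: change of unit only."""
    return 5 * g

g = {"x": 8, "y": 10, "z": 11}

# Comparisons of differences survive a change of origin and unit...
before = (g["y"] - g["x"]) / (g["z"] - g["y"])                       # 2.0
after = (to_100_120(g["y"]) - to_100_120(g["x"])) / (
    to_100_120(g["z"]) - to_100_120(g["y"]))                         # still 2.0

# ...but a ratio statement like "y is 25% better than x" survives a
# change of unit, not a change of origin:
ratio_unit = to_0_100(g["y"]) / to_0_100(g["x"])          # 1.25, unchanged
ratio_origin = to_100_120(g["y"]) / to_100_120(g["x"])    # ~1.019, changed
print(before, after, ratio_unit, round(ratio_origin, 3))
```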

(This is all the more true since in many institutions the lowest grade is also granted to students having cheated during the exam, with obviously no guarantee that they are "equally ignorant".)

In between
We already mentioned that, on an interval scale, it makes sense to compare differences in grades. The authors of this book (even if their students should know that they spend a lot of time and energy in grading them!) do not consider that their own grades always allow for such comparisons. A difference of 1 point on a 0-20 scale may well be due only to chance, via the position of the work, the quality of the preceding papers or the time of correction. Therefore, even if grades are expressed on interval scales, in view of the lack of reliability and validity of some aspects of the grading process, two difficulties remain. First, we already mentioned that a lot of care should be taken in manipulating grades that are "close" to the bounds of the scale. If it is clear that an exam is well below the passing grade, few instructors will claim that there is a highly significant difference between 4/20 and 5/20: although the latter exam seems slightly better than the former, the essential idea is that they are both well below the minimal passing grade. Second, in some institutions, some grades are very particular in the sense that they play a particular role in the attribution of the degree. Let us consider a programme in which all grades must be above a minimal passing grade, say 10 on a 0-20 scale, in order to obtain the degree. The gap between 9/20 and 10/20 may then be much more important, since before putting a grade just below the passing grade most instructors usually make sure that they will have good arguments in case of a dispute (some systematically avoid using grades just below the minimal passing grade). In some programmes, not only the minimal passing grade has a special role: some grades may correspond to different possible levels of distinction, others may correspond to a minimal acceptable level below which there is no possibility of compensation with grades obtained in other courses. In between these "special grades" it seems that the reliable information conveyed by grades is mainly ordinal; it might well be possible to assert that small differences in grades that do not cross any special grades may not be significant at all. Some authors have been quite radical in emphasising this point, e.g. Cross (1995) stating that: "[...] we contend that the difficulty of nearly all academic tests is arbitrary and regardless of the scoring method, they provide nothing more than ranking information" (but see French 1993, Vassiloglou and French 1982). At first sight this would seem to be a strong argument in favour of the letter system in use at most American Universities, which only distinguishes between a limited number of classes of grades (usually from F or E to A with, in some institutions, the possibility of adding "+" or "–" to the letters). However, since these letter grades are usually obtained via the manipulation of a distribution of numerical grades of some sort, the distinction between letter grades and numerical grades is not as deep as it appears at first sight. Furthermore, the aggregation of letter grades is often done via a numerical transformation, as we shall see in section 3.3.

Once more, grades appear as complex objects. While they seem to mainly convey ordinal information (with the possibility of the existence of non significant small differences), which is typical of a relative evaluation model, the existence of special grades complicates the situation by introducing some "absolute" elements of evaluation in the model (on the measurement-theoretic interpretation of grades see French 1981, Vassiloglou 1984). The resulting "scale of measurement" is unsurprisingly awkward.

3.2.4 Why use grades?
Some readers, and most notably instructors, may have the impression that we have been overly pessimistic on the quality of the grading process. We suggest to sceptical instructors the following simple experiment. Having prepared an exam, ask some of your colleagues to take it with the following instructions: prepare what you would think to be an exam that would just be acceptable for passing; prepare an exam that would clearly deserve distinction; prepare an exam that is well below the passing grade. Then apply your marking scale to these papers prepared by your colleagues. It would be extremely likely that the resulting grades show some surprises! We would like to mention that the literature in Education Science is even more pessimistic, leading some authors to question the very necessity of using grades (see Sager 1994, Tchudi 1997). The difficulties that we mentioned would be quite problematic if grades were considered as "measures" of performance that we would tend to make more and more "precise" and "objective". We tend to consider grades as an "evaluation model" trying to capture aspects of something that is subject to considerable indetermination: the "performance of students". As is the case with most evaluation models, their use greatly contributes to transforming the "reality" that we would like to "measure". Students cannot be expected to react passively to a grading policy; they will undoubtedly adapt their work and learning practice to what they perceive to be its severity and consequences. Instructors, in turn, are likely to use a grading policy that will depend on their perception of the policy of the Faculty (on these points, see Sabot and Wakeman 1991, Stratton, Myers and King 1994). This is not to say that grades cannot be a useful evaluation model. If these lines have lead some students to consider that grades are useless, we suggest they try to build up an evaluation model that would not use grades without, however, relying too much on arbitrary judgements. This might not be an impossible task, but we do not find it very easy and, at least for the type of programmes in which we teach, none of us would be prepared to abandon grades. Furthermore, as with most evaluation models of this type, since grades are complex objects, aggregating these evaluations will raise even more problems.

AGGREGATING GRADES 41

3.3 Aggregating grades

3.3.1 Rules for aggregating grades
In the previous section, we hope to have convinced the reader that grading a student in a given course is a difficult task and that the result of this process is a complex object. Unfortunately, this is only part of the evaluation process of students enrolled in a given programme. Once they have received a grade in each course, a decision still has to be made about each student. We already mentioned that, depending on the programme, this decision may take different forms: success or failure; success or failure with the additional possibility of partial success (the degree is not granted immediately but there remains a possibility of obtaining it); success or failure with possible additional information, e.g. distinctions, ranks or average grades. Such decisions are usually based on the final grades that have been obtained in each course but may well use some other information, e.g. verbal comments from instructors or extra-academic information linked to the situation of each student. What is required from the students to obtain a degree is generally described in a lengthy and generally opaque set of rules that few instructors—but generally all students—know perfectly (as an interesting exercise we might suggest that you investigate whether you are perfectly aware of the rules that are used in the programmes in which you teach or, if you do not teach, whether you are aware of such rules for the programmes in which your children are enrolled). These rules exhibit such variety that it is obviously impossible to exhaustively examine them here. However, it appears that they are often based on three kinds of principles (see French 1981).

Conjunctive rules
In programmes of this type, students must pass all courses, i.e. obtain a grade above a "minimal passing grade" in each course, in order to obtain the degree. If they fail to do so after a given period of time, they do not obtain the degree. This very simple rule has the immense advantage of avoiding any amalgamation of grades. It is however seldom used as such because:
• it is likely to generate high failure rates;
• it offers no incentive to obtain grades well above the minimal passing grade;
• it does not allow to discriminate (e.g. using several kinds of distinctions) between students obtaining the degree;
• it does not allow to discriminate between grades just below the passing grade and grades well below it.
Most instructors and students generally violently oppose such simple systems since they generate high failure rates and do not promote "academic excellence".
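A minimal sketch of the conjunctive rule just described; the passing threshold of 10 and the "at or above" convention are our own assumptions:

```python
def conjunctive_pass(grades, minimal_passing_grade=10):
    """Conjunctive rule: the degree requires reaching the minimal passing
    grade in *every* course; grades are never amalgamated."""
    return all(g >= minimal_passing_grade for g in grades)

print(conjunctive_pass([12, 14, 10]))  # True: every course passed
print(conjunctive_pass([19, 19, 9]))   # False: one failure, however
                                       # brilliant the other grades
```

Note how the rule is insensitive both to distinctions among passing students and to how far below the threshold a failing grade lies, which is exactly the criticism listed above.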

Weighted averages
In many programmes, the grades of students are aggregated using a simple weighted average. This average grade (the so-called "GPA" in American Universities) is then compared to some standards, e.g. the minimal average grade for obtaining the degree, the minimal average grade for obtaining the degree with a distinction, the minimal average grade for being allowed to stay in the programme, etc. Whereas conjunctive rules do not allow for any kind of compensation between the grades obtained for several courses, all sorts of compensation effects are at work with a weighted average.

Minimal acceptable grades
In order to limit the scope of compensation effects allowed by the use of weighted averages, some programmes include rules involving "minimal acceptable grades" in each course. In such programmes, the final decision is taken on the basis of an average grade, provided that all grades entering this average are above some minimal level. In some programmes, an average grade is computed for each "category" of courses, provided that the grade of each course is above a minimal level, and such average grades per category of courses are then used in a conjunctive fashion.

The rules that are used in the programmes we are aware of often involve a mixture of these three principles. Furthermore, it should be noticed that the final decision concerning a student is very often taken by a committee that has some degree of freedom with respect to the rules and may, for instance, grant the degree to someone who does not meet all the requirements of the programme, e.g. because of serious personal problems. All these rules are based on "grades" and we saw in section 3.2 that the very nature of the grades was highly influenced by these rules. This amounts to aggregating evaluations that are highly influenced by the aggregation rule, which makes aggregation an uneasy task.

3.3.2 Aggregating grades using a weighted average
The purpose of rules for aggregating grades is to know whether the overall performance of a student is satisfactory, taking his various final grades into account. We study below some aspects of the most common aggregation rule for grades: the weighted average (more examples and comments will be found in chapters 4 and 6). Using a weighted average system amounts to assessing the performance of a student by combining his grades using a simple weighting scheme. We shall suppose that all final grades are expressed on similar scales and note gi(a) the final grade for course i obtained by student a, the (positive) weights wi reflecting the "importance" (in "academic" terms and/or in function of the length of the course) of the course for the degree. The weights wi may, without loss of generality, be normalised in such a way that w1 + w2 + ... + wn = 1. The average grade obtained by student a is then computed as

    g(a) = w1 g1(a) + w2 g2(a) + ... + wn gn(a).

Using such a convention, the average grade g(a) will be expressed on a scale having the same bounds as the scale used for the gi(a).
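The weighted average g(a) and the "minimal acceptable grade" variant can be sketched as follows; the thresholds (10 for the average, 7 for the acceptable level) and the example weights are illustrative, not taken from the text:

```python
def weighted_average(grades, weights):
    """g(a) = w1*g1(a) + ... + wn*gn(a), weights normalised to sum to 1."""
    total = sum(weights)
    w = [wi / total for wi in weights]   # normalisation: no loss of generality
    return sum(wi * gi for wi, gi in zip(w, grades))

def passes(grades, weights, minimal_average=10, minimal_acceptable=7):
    """Average rule limited by a 'minimal acceptable grade' in each course:
    no compensation is possible below that level."""
    if any(g < minimal_acceptable for g in grades):
        return False
    return weighted_average(grades, weights) >= minimal_average

grades = [16, 9, 12]
weights = [2, 1, 1]                        # e.g. proportional to course length
print(weighted_average(grades, weights))   # 13.25, on the same 0-20 scale
print(passes(grades, weights))             # True
print(passes([16, 6, 12], weights))        # False: 6 is below the floor,
                                           # although the average would pass
```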

The simplest decision rule consists in comparing g(a) with some standards in order to decide on the attribution of the degree and on possible distinctions. A number of examples will allow us to understand the meaning of this rule better and to emphasise its strengths and weaknesses (we shall suppose throughout this section that students have all been evaluated on the same courses; for the problems that arise when this is not so, see Vassiloglou (1984)).

Example 1
Consider four students enrolled in a degree consisting of two courses. For each course, a final grade between 0 and 20 is allocated. The results are as follows:

      g1   g2
a      5   19
b     20    4
c     11   11
d      4    6

Student c has performed reasonably well in all courses, whereas d has a consistently very poor performance; both a and b are excellent in one course while having a serious problem in the other. Their very low performance in 50% of the courses does not make them good candidates for the degree. Casual introspection suggests that if the students were to be ranked, c should certainly be ranked first and d should be ranked last. Students a and b should be ranked in between, their relative position depending on the relative importance of the two courses. The use of a simple weighted average of grades leads to very different results. Considering that both courses are of equal importance gives the following average grades:

      average grade
a     12
b     12
c     11
d      5

which leads to having both a and b ranked before c. In fact, we can say even more: there is no vector of weights (w, 1−w) that would rank c before both a and b. Ranking c before a implies that 11w + 11(1−w) > 5w + 19(1−w), which leads to w > 8/14; ranking c before b implies 11w + 11(1−w) > 20w + 4(1−w), i.e. w < 7/16 (see figure 3.1; this computation also makes clear that there is no loss of generality in supposing that the weights sum to 1). The use of a simple weighted sum is therefore not in line with the idea of promoting students performing reasonably well in all courses. The exclusive reliance on a weighted average might therefore be an incentive for students to concentrate their efforts on a limited number of courses and benefit from the compensation effects at work with such a rule.
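The impossibility claimed in example 1 can be checked mechanically: scanning weight vectors (w, 1−w) confirms that no weighting ranks c above both a and b, since that would require w > 8/14 and w < 7/16 at the same time:

```python
# Grades of example 1 (0-20 scale, two courses):
grades = {"a": (5, 19), "b": (20, 4), "c": (11, 11), "d": (4, 6)}

def avg(student, w):
    """Weighted average with weight w on course 1 and 1-w on course 2."""
    g1, g2 = grades[student]
    return w * g1 + (1 - w) * g2

# Scan w over [0, 1] in steps of 0.001: c is never strictly above both.
c_first = [w / 1000 for w in range(1001)
           if avg("c", w / 1000) > avg("a", w / 1000)
           and avg("c", w / 1000) > avg("b", w / 1000)]
print(c_first)   # [] : no weight vector ranks c first
```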

[Figure 3.1: Use of a weighted sum for aggregating grades]

It should finally be noticed that the addition of a "minimal acceptable grade" for all courses can decrease but not suppress (unless the minimal acceptable grade is so high that it turns the system into a nearly conjunctive one) the occurrence of such effects. A related consequence of the additivity hypothesis embodied in the use of weighted averages is that it forbids to account for "interaction" between grades, as shown in the following example.

Example 2
Consider four students enrolled in an undergraduate programme consisting in three courses: Physics, Maths and Economics. For each course, a final grade between 0 and 20 is allocated. The results are as follows:

      Physics   Maths   Economics
a       18       12         6
b       18        7        11
c        5       17         8
d        5       12        13

On the basis of these evaluations, it is felt that a should be ranked before b: although a has a low grade in Economics, he has reasonably good grades in both Maths and Physics, which makes him a good candidate for an Engineering programme.
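Example 2's incompatibility (discussed below: ranking a before b forces w2 > w3, while ranking d before c forces w3 > w2) can likewise be confirmed by a grid search over weight vectors:

```python
from itertools import product

# Grades of example 2 (Physics, Maths, Economics):
g = {"a": (18, 12, 6), "b": (18, 7, 11), "c": (5, 17, 8), "d": (5, 12, 13)}

def avg(s, w):
    """Weighted sum of the three grades (scaling of w is irrelevant)."""
    return sum(wi * gi for wi, gi in zip(w, g[s]))

# Enumerate non-negative integer weight vectors summing to 20: none of them
# yields both "a before b" and "d before c".
step = 20
found = [
    (w1, w2, w3)
    for w1, w2 in product(range(step + 1), repeat=2)
    if (w3 := step - w1 - w2) >= 0
    and avg("a", (w1, w2, w3)) > avg("b", (w1, w2, w3))
    and avg("d", (w1, w2, w3)) > avg("c", (w1, w2, w3))
]
print(found)   # [] : the two preferences are incompatible with any weights
```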

Student b is weak in Maths and it seems difficult to recommend him for any programme with a strong formal component (Engineering or Economics). Using a similar type of reasoning, d appears to be a fair candidate for a programme in Economics, whereas c has two low grades and it seems difficult to recommend him for a programme in Engineering or in Economics; therefore d is ranked before c. Although these preferences appear reasonable, they are not compatible with the use of a weighted average in order to aggregate the three grades. It is easy to observe that:
• ranking a before b implies putting more weight on Maths than on Economics (18w1 + 12w2 + 6w3 > 18w1 + 7w2 + 11w3 ⇒ w2 > w3);
• ranking d before c implies putting more weight on Economics than on Maths (5w1 + 17w2 + 8w3 > 5w1 + 12w2 + 13w3 ⇒ w3 > w2),
which is contradictory. In this example it seems that "criteria interact": whereas Maths do not overweigh any other course (see the ranking of d vis-à-vis c), having good grades in both Maths and Physics or in both Maths and Economics is better than having good grades in both Physics and Economics. Such interactions, although not unfrequent, cannot be dealt with using weighted averages; this is another consequence of the additivity hypothesis. Taking such interactions into account calls for the use of more complex aggregation models (see Grabisch 1996).

Example 3
Consider two students enrolled in a degree consisting of two courses. For each course a final grade between 0 and 20 is allocated; both courses have the same weight and the required minimal average grade for the degree is 10. The results are as follows:

      g1   g2
a     11   10
b     12    9

It is clear that both students will receive an identical average grade of 10.5: the difference between 11 and 12 on the first course exactly compensates for the opposite difference on the second course. Both students will obtain the degree, having performed equally well. It is not unreasonable to suppose that, since the minimal required average for the degree is 10, a grade above 10 indicating that a student has satisfactorily met the objectives of the course, this grade will play the role of a "special grade" for the instructors. If this is the case, it might be reasonable to consider that the difference between 10 and 9, which crosses a special grade, is much more significant than the difference between 12 and 11 (it might even be argued that the small difference between 12 and 11 is not significant at all). If 10 is a "special grade", then we

would have good grounds to question the fact that a and b are "equally good". The linearity hypothesis embodied in the use of weighted averages has the inevitable consequence that a difference of one point has a similar meaning wherever on the scale, and therefore does not allow for such considerations.

Example 4
Consider a programme similar to the one envisaged in the previous example. We have the following results for three students:

      g1   g2
a     14   16
b     15   15
c     16   14

All students have an average grade of 15 and they will all receive the degree. Furthermore, these three students will not be distinguished: their equal average grade makes them indifferent. This appears desirable since these three students have very similar profiles of grades. The use of linearity and additivity implies that if a difference of one point on the first grade compensates for an opposite difference on the other grade, then a difference of x points on the first grade will compensate for an opposite difference of −x points on the other grade, whatever the value of x. However, if x is chosen to be large enough this may appear dubious, since it could lead, for instance, to view the following three students as perfectly equivalent with an average grade of 15:

      g1   g2
a     10   20
b     15   15
c     20   10

whereas we already argued that, in such a case, b could well be judged preferable to both a and c even though b is indifferent to a and c, e.g. if the degree comes with the indication of a rank or of an average grade. This is another consequence of the linearity hypothesis embodied in the use of weighted averages.

Example 5
Consider three students enrolled in a degree consisting of three courses. For each course a final grade between 0 and 20 is allocated. All courses have identical importance and the minimal passing grade is 10 on average. The results are as follows:

As argued in section 3. A common “conversion scheme” is the following: A B C D E 4 3 2 1 0 (outstanding or excellent) (very good) (good) (satisfactory) (failure) . Not surprisingly.2). This is hardly compatible with grades that would only be indicators of “ranks” even with some added information (a view that is very compatible with the discussion in section 3. it does not seem that the use of “letter grades”. As shown by the following example. this example shows that a weighted average makes use of the “cardinal properties” of the grades.e. They all end up tied and should all be awarded the degree. This means that the following table might as well reﬂect the results of these three students: g1 11 13 4 g2 4 13 14 g3 12 6 11 a b c since the ranking of students within each course has remained unchanged as well as the position of grades vis-`-vis the minimal passing grade. which is nothing more than a weighted average of grades. we could have favoured any of the three students. only reﬂect the relative rank of the students in the class. instead of numerical ones.6). In this case. the GPA is usually computed by associating a number to each letter grade. AGGREGATING GRADES g1 12 13 5 g2 5 12 13 g3 13 5 12 47 a b c It is clear that all students have an average equal to the minimal passing grade 10. with the possible exception of a few “special grades” such as the minimal passing grade. i. is crucial for the attribution of degrees and the selection of students. Since courses are evaluated on letter scales.3.3. helps much in this respect.2 it might not be unreasonable to consider that ﬁnal grades are only recorded on an ordinal scale. Example 6 In many American Universities the Grade Point Average (GPA). only a b (say the Dean’s nephew) gets an average above 10 and both a and c fail (with respective averages of 9 and 9. Note that using diﬀerent transformations.
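The ordinal-transformation example above can be replayed in code: the regraded marks preserve, within each course, both the ranking of the students and their position with respect to the pass mark, yet the weighted average tells a different story. The tables are those of the text:

```python
# Original and regraded marks (rows: students a, b, c; columns: g1, g2, g3);
# the pass mark is 10 and the degree requires an average of at least 10.
before = {"a": (12, 5, 13), "b": (13, 12, 5), "c": (5, 13, 12)}
after  = {"a": (11, 4, 12), "b": (13, 13, 6), "c": (4, 14, 11)}

def course_ranking(table, course):
    return sorted(table, key=lambda s: table[s][course])

for course in range(3):
    # Within each course the ranking of students is unchanged...
    assert course_ranking(before, course) == course_ranking(after, course)
    # ...and so is each student's position w.r.t. the pass mark.
    for s in before:
        assert (before[s][course] >= 10) == (after[s][course] >= 10)

avg = lambda table, s: sum(table[s]) / 3
# Yet the averages differ: everybody passed before, only b passes after.
assert all(avg(before, s) >= 10 for s in before)
assert [s for s in after if avg(after, s) >= 10] == ["b"]
```

If grades are genuinely ordinal, both tables carry exactly the same information, so an aggregation rule whose output changes between them is exploiting cardinal properties the grades do not have.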

Allowing for the possibility of adding “+” or “–” to the letter grades generally results in a conversion schemes maintaining an equal diﬀerence between two consecutive letter grades. These letter grades are further transformed. BUILDING AND AGGREGATING EVALUATIONS in which the diﬀerence between two consecutive letters is assumed to be equal. This implies using a ﬁrst “conversion scheme” of numbers into letters. First. into a numerical scale and the GPA is computed. indicating the percentage of correct answers to a multiple choice questionnaire).g. This can have a signiﬁcant impact on the ranking of students on the basis of the GPA. The choice of such a scheme is not obvious. the calculation of the GPA is as follows: . Such a practice raises several diﬃculties. These numbers are then converted into letter grades using a ﬁrst conversion scheme. letter grades for a given course are generally obtained on the basis of numerical grades of some sort. a common conversion scheme (that is used in many Universities) is A B C D E 90–100% 80–89% 70–79% 60–69% 0–59% This results in the following letter grades: g1 A C A g2 D C C g3 C B D a b c Supposing the three courses of equal importance and using the conversion scheme of letter grades into numbers given above. the conversion scheme of letters into numbers used to compute the GPA is somewhat arbitrary. Now consider three students evaluated on three courses on a 0-100 scale in the following way: g1 90 79 100 g2 69 79 70 g3 70 89 69 a b c Using an E to A letter scale.48 CHAPTER 3. using a second conversion scheme. Note that when there are no “holes” in the distribution of numerical grades it is possible that a very small (and possibly non signiﬁcant) diﬀerence in numerical grades results in a signiﬁcant diﬀerence in letter grades. To show how this might happen suppose that all courses are ﬁrst evaluated on a 0–100 scale (e. Secondly.

g1  g2  g3  GPA
a    4   1   2  2.33
b    2   2   3  2.33
c    4   2   1  2.33

making the three students equivalent. Now another common (and actually used) scale for converting percentages into letter grades is as follows:

A+  98–100%
A   94–97%
A–  90–93%
B+  87–89%
B   83–86%
B–  80–82%
C+  77–79%
C   73–76%
C–  70–72%
D   60–69%
F   0–59%

This scheme would result in the following letter grades:

    g1  g2  g3
a   A–  D   C–
b   C+  C+  B+
c   A+  C–  D

Maintaining the usual hypothesis of a constant “difference” between two consecutive letter grades we obtain the following conversion scheme:

A+  10
A   9
A–  8
B+  7
B   6
B–  5
C+  4
C   3
C–  2
D   1
F   0
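The two conversion pipelines can be sketched end to end. The numeric marks are the three students of the running example; both conversion schemes are the ones quoted in the text:

```python
# Coarse A..E scheme and fine A+..F scheme, as quoted above.
def to_letter_coarse(p):
    for letter, low in [("A", 90), ("B", 80), ("C", 70), ("D", 60)]:
        if p >= low:
            return letter
    return "E"

def to_letter_fine(p):
    bands = [("A+", 98), ("A", 94), ("A-", 90), ("B+", 87), ("B", 83),
             ("B-", 80), ("C+", 77), ("C", 73), ("C-", 70), ("D", 60)]
    for letter, low in bands:
        if p >= low:
            return letter
    return "F"

COARSE = {"A": 4, "B": 3, "C": 2, "D": 1, "E": 0}
FINE = {"A+": 10, "A": 9, "A-": 8, "B+": 7, "B": 6, "B-": 5,
        "C+": 4, "C": 3, "C-": 2, "D": 1, "F": 0}

marks = {"a": (90, 69, 70), "b": (79, 79, 89), "c": (100, 70, 69)}

def gpa(percents, to_letter, values):
    return sum(values[to_letter(p)] for p in percents) / len(percents)

coarse = {s: round(gpa(m, to_letter_coarse, COARSE), 2) for s, m in marks.items()}
fine = {s: round(gpa(m, to_letter_fine, FINE), 2) for s, m in marks.items()}

assert coarse == {"a": 2.33, "b": 2.33, "c": 2.33}   # all three tie
assert fine["b"] > fine["c"] > fine["a"]             # b now clearly ahead
```

The same raw percentages thus produce a three-way tie under one conversion scheme and a strict ranking under the other, which is exactly the arbitrariness discussed in the text.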

BUILDING AND AGGREGATING EVALUATIONS which leads to the following GPA: g1 8 4 10 g2 1 4 2 g3 2 7 1 GPA 3. For each course a ﬁnal grade between 0 and 20 is allocated.00 4.50 CHAPTER 3. the following table appears even more problematic: g1 13 11 12 g2 12 13 11 g3 11 12 13 a b c since. If all instructors agree that a diﬀerence of one point in their grades (away from 10) should not be considered as signiﬁcant. b is signiﬁcantly better than c and c is . It should be clear that standardisation of the original numerical grades before conversion oﬀers no clear solution to the problem uncovered. a is signiﬁcantly better than b (he has a signiﬁcantly higher grade on g1 while there are no signiﬁcant diﬀerences on the other two grades). it is simply ignored. The results are as follows: g1 13 11 14 g2 12 13 10 g3 11 12 12 a b c All students will receive an average grade of 12 and will all be judged indiﬀerent. He can argue that he should be ranked before b: he has a signiﬁcantly higher grade than b on g1 while there is no signiﬁcant diﬀerence between the other two grades. b (again the Dean’s nephew) gets a clear advantage over a and c.66 5. All courses have the same weight and the minimal passing grade is 10 on average. most often. The situation is the same vis-`-vis c: a has a signiﬁcantly higher grade on g2 and a this is the only signiﬁcant diﬀerence. using the same hypotheses.33 a b c In this case. student a has good grounds to complain. In a similar vein. Example 7 We argued in section 3. The explicit treatment of such imprecision is problematic using a weighted average.2 that small diﬀerences in grades might not be signiﬁcant at all provided they do not involve crossing any “special grade”. Consider the following example in which three students are enrolled in a degree consisting of three courses. while all students clearly obtain a similar average grade.

write syllabi specifying a grading policy. its use may be problematic for aggregating grades. In view of these few examples.3. Aggregation rules using weighted sums will be dealt with again in chapters 4 and 6. • “evaluation operations” are complex and should not be confused with “measurement operations” in Physics. aggregating the grades obtained at the mid-term and the ﬁnal exams) and may be aﬀected by imprecision. uncertainty and/or inaccurate determination. In particular. the properties of these numbers should be examined with care. we discussed the many elements that may obscure the interpretation of grades and argued that the common weighted sum rule to amalgamate them may not be without diﬃculties. using “numbers” may be only a matter of convenience and does not imply that any operation can be meaningfully performed on these numbers. The information to be aggregated may itself be the result of more or less complex aggregation operations (e. we hope to have convinced the reader that although the weighted sum is a very simple and almost universally accepted rule. Although they are very familiar.4 Conclusions We all have been accustomed to seeing our academic performances in courses evaluated through grades and to seeing these grades amalgamated in one way or another in order to judge our “overall performance”. If it is admitted that there is no easy way to evaluate the performance of a student in a given course. 3. prepare exams. .g. • the aggregation of the result of several evaluation models should take the nature of these models into account. Actors are most likely to modify their behaviour in response to the implementation of the model. In particular. we have tried to show that these activities may not be as simple and as unproblematic as they appear to be. this is not overly surprising. When they result in numbers.4. 
CONCLUSIONS 51 signiﬁcantly better than a (the reader will have noticed that this is a variant of the Condorcet paradox mentioned in chapter 2). We would like to emphasise a few simple ideas to be drawn from this example that we should keep in mind when working on diﬀerent evaluation models: • building an evaluation model is a complex task even in simple situations. Even the simplest and most familiar ones may in some cases lead to surprising and undesirable conclusions. there is no reason why there should be an obvious one for an entire programme. • aggregation models should be analysed with care. Most of us routinely grade various kinds of work. We expect such diﬃculties to be present in the other types of evaluation models that will be studied in this book. Since grades are a complex evaluation model. etc. the necessity and feasibility of using rules that completely rank order all students might well be questioned.

Finally, we hope that this brief study of the evaluation procedures of students will also be the occasion for instructors to reflect on their current grading practices. This has surely been the case for the authors.

Dow Jones. consumer price index. Therefore. as it is impossible to consider reality independently of our perception of it. The EU countries with a deﬁcit/GNP ratio lower than 3% will be allowed to enter the EURO. Why are these indicators (often called indices) so powerful ? Probably because it is commonly accepted that they faithfully reﬂect reality. it might be meaningless to consider that reality exists per se (Roy 1990).. physicians per capita. we must accept that an indicator accounts only for some aspects of reality. 1. The World Bank threatens country x to suspend its help if it doesn’t succeed in bringing indicator y to level z. an indicator must be designed so as 53 . . social position index. Violations of human rights are often presented as the main factor. an indicator might only be relevant for the person who constructed it. you could feel that these magic numbers rule the world. .g. poverty index. But it is worth noting that indicators of human rights also exist (see e. hence. Is there one reality. air quality. Hence. One could argue that these particular realities are just particular views of the same reality but. Each person has a particular perception of the world and. This forces us to raise several questions. Note that in many cases. 2. As a consequence. pregnant women and young children should stay indoors. Whatever the answer to the previous question.Q. the decisions of the World Bank to withdraw help are not motivated by economic or ﬁnancial reasons. can we hope that an indicator faithfully reﬂects reality (the reality or a reality) ? Reality is so complex that this is doubtful. a particular reality. several realities or no reality ? Many philosophers nowadays consider that reality is not unique. Today’s air quality is 7: older persons.4 CONSTRUCTING MEASURES: THE EXAMPLE OF INDICATORS Our daily life is ﬁlled with indicators: I. rate of return. Horn (1993)). . GNP. If you read a newspaper.

Are we sure that this indicator indicates what we want it to ? Do the arithmetic operations performed during the computation of the indicator lead to something that makes sense ? Let us now discuss three well known indicators arising in completely diﬀerent areas of our lives in detail: the human development index. . other indicators have sometimes been added to the HDI. [. UNDP proudly reports that The HDI has been used in many countries to rank districts or counties as a guide to identifying those most severely disadvantaged in terms of human development. have used such analysis as a planning tool. . . are the concerns of UNDP itself with respect to the HDI clearly deﬁned ? Why do they need the human development index ? To cut subsidies to nations evolving in the wrong direction ? To share subsidies among the poorest countries (according to what key) ? To put some pressure on the governments performing the worst ? To prove that Western democracies have the best political systems ? 3. page 14. . As an illustration. Suppose that the purpose of an indicator is clearly deﬁned. A composite index. ] The HDI has been used especially when a researcher wants a composite measure of development. ).1 The human development index The human development index measures the average achievements in a country in three basic dimensions of human development–longevity. . the air quality index and the decathlon score. secondary and tertiary enrollment) and real GDP (Gross Domestic Product) per capita expressed in PPP$ (Purchasing Power Parity $). businessmen. Several countries. educational attainment (adult literacy and combined primary. CONSTRUCTING MEASURES to reﬂect those aspects that are relevant with respect to our concerns. economists. 
the Human Development index (HDI) deﬁned by the United Nations Development Programme (UNDP) to measure development (United Nations Development Programme 1997) is used by many diﬀerent people in diﬀerent continents and in diﬀerent areas of activity (politicians. As stated by the United Nations Development Programme (1997). knowledge and a decent standard of living. Furthermore. the HDI thus contains three variables: life expectancy. This clearly shows that many people used the HDI in completely diﬀerent ways.54 CHAPTER 4. For such uses. Can we assume that their concerns are similar ? In the Human development report 1997. such as the Philippines. . 4.

HDI’s precise definition is presented on page 122 of the 1997 Human Development Report. The HDI is a simple average of the life expectancy index, the educational attainment index and the adjusted real GDP per capita (PPP$) index. Here is how each index is computed.

Life Expectancy Index (LEI). This index measures life expectancy at birth. In order to normalise the scale of this index, a minimum value (25 years) and a maximum one (85 years) have been defined. The index is defined as

LEI = (life expectancy at birth − 25) / (85 − 25).

Hence, it is a value between 0 and 1.

Educational Attainment Index (EAI). It is a combination of two other indicators: the Adult Literacy Index (ALI) and the combined primary, secondary and tertiary Enrollment Ratio Index (ERI). The first one is the proportion of literate adults while the second one is the proportion of children in age of primary, secondary or tertiary school that really go to school. The EAI is a weighted average of ALI and ERI, i.e. it is equal to

EAI = (2 ALI + ERI) / 3.

Adjusted real GDP per capita (PPP$) Index (GDPI). This index aims at measuring the income per capita. As the value of one dollar for someone earning $100 is much larger than the value of one dollar for someone earning $100 000, the income is first transformed using Atkinson’s formula (Atkinson 1970). The transformed value of y, W(y), is given by one of the following:

W(y) = y                                          if 0 < y < y*
W(y) = y* + 2[(y − y*)^(1/2)]                     if y* ≤ y < 2y*
W(y) = y* + 2(y*)^(1/2) + 3[(y − 2y*)^(1/3)]      if 2y* ≤ y < 3y*
. . .
W(y) = y* + 2(y*)^(1/2) + 3(y*)^(1/3) + . . .
       + n[(y − (n − 1)y*)^(1/n)]                 if (n − 1)y* ≤ y < ny*

In this formula, y represents the income, W(y) the transformed income and y* is set at $5 835 (PPP$), which was the world average annual income per capita in 1994. Thereafter, the income scale is normalised, using the maximum value of $40 000, the minimum value of $100 and the formula

GDPI = (transformed income − W(100)) / (W(40 000) − W(100)).

Note that W(40 000) = 6 154 and W(100) = 100. Hence, it is a value between 0 and 1.
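The whole computation can be coded directly from these formulas. The Greece figures used below (life expectancy 77.8, ALI 0.967, ERI 0.820, real GDP $11 265) are those quoted in the text:

```python
Y_STAR = 5835.0          # world average income per capita in 1994 (PPP$)

def atkinson(y):
    """Transformed income W(y) from Atkinson's formula."""
    if y < Y_STAR:
        return y
    n = int(y // Y_STAR) + 1            # bracket: (n-1)*y* <= y < n*y*
    w = Y_STAR + sum(k * Y_STAR ** (1 / k) for k in range(2, n))
    return w + n * (y - (n - 1) * Y_STAR) ** (1 / n)

def hdi(life, ali, eri, gdp):
    lei = (life - 25) / (85 - 25)
    eai = (2 * ali + eri) / 3
    gdpi = (atkinson(gdp) - atkinson(100)) / (atkinson(40000) - atkinson(100))
    return (lei + eai + gdpi) / 3

assert atkinson(100) == 100 and round(atkinson(40000)) == 6154
assert round(atkinson(11265)) == 5982        # Greece's adjusted income
assert round(hdi(77.8, 0.967, 0.820, 11265), 3) == 0.923   # Greece's HDI
```

The sketch reproduces the published values: Greece’s adjusted income of $5 982 and HDI of 0.923.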

In the ﬁrst case. Hence.923.5 76.1].8 years.880 + 0.889 for Costa Rica. let’s compute the HDI for Greece (HDR97). Costa Rica is less developed than South Korea while in the second one. EAI = (2 × 0.95 South Korea Costa Rica Table 4. no one would ever have thought that life expectancy could be lower than 25.915 for South Korea and 0.916 for Costa Rica. . 4. We refer to them as HDR97. the range of the index is [0. the 199i HDI (in the 199j report) is an aggregate of data from 199i (for some dimensions) and from earlier years (for other dimensions). At that time. Greece’s real GDP per capita at $11 265 is above y ∗ by less than twice y ∗ .972. HDR97). they could have chosen a much lower value: 20 or 10. the 1997 report does not contain the 1997 data.820. Finally. Hence. the HDI computed in the 97 report is considered by the UNDP as the HDI of 1994.97 . CONSTRUCTING MEASURES Some words about the data and their collection time: the Human Development Report is a yearly publication (since 1990). The value of 25 was chosen for the ﬁrst report (1990).86 GDPI .967 and the ERI is 0. irrespective of the collection year. The choice of these bounds is quite arbitrary. Thus the adjusted real GDP per capita for Greece is $5 982 (PPP$) because 5 982 = 5 835+2(11 265−5 835)1/2 . In this volume. Obviously. the choice of the bounds matters. Indeed. then the HDI is 0.890 for South Korea and 0. then the HDI is 0. Hence GDPI = (5 982−W (100))/(W (40 000)− W (100)) = (5 982 − 100)/(6 154 − 100) = 0.6 (Rwanda.967 + 0.56 CHAPTER 4. we obtain the converse: Costa Rica is more developed than South Korea. The ALI is 0. To illustrate how the HDI works.93 . The likelihood of observing a value smaller than the minimum would have been much smaller. Consider the following example. Suppose that the EAI and GDPI have been computed for South Korea and Costa Rica (HDR97).1 Scale Normalisation To obtain the LEI and the GDPI. But the choice of the bounds is not without consequences. Hence. 
maximum and minimum values have been deﬁned so that.820)/3 = 0. But if the maximum and minimum for life expectancy are set to 80 and 25. LEI = (77. To avoid this problem. We also know the life expectancy at birth for South Korea and Costa Rica (see Table.918. EAI and GDPI for South Korea and Costa Rica (HDR97) are set to 85 and 25. when the lowest observed value was above 35. the lowest observed value is 22.1) If the maximum and minimum for life expectancy life expectancy 71. we use only data from the 1997 Human Development Report. Life expectancy in Greece is 77. Why 25 and 85 years ? Is 25 years the smallest observed value ? No.880. To make things more complicated.8−25)/(85−25) = 0.918 + 0.972)/3 = 0. Greece’s HDI is (0.1. Therefore the LEI is negative for Rwanda.1: Bounds: life expectancy. 4. after normalisation.6 EAI .
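The sensitivity of the ranking to the choice of bounds can be replayed directly. The South Korea / Costa Rica figures below are reconstructed from the garbled Table 4.1 (life expectancy 71.5 vs 76.6; EAI .97 vs .86; GDPI .93 vs .95), so treat them as an illustration rather than official data:

```python
# HDI with configurable life-expectancy bounds; EAI and GDPI are taken as
# already computed, as in Table 4.1 (values reconstructed from the text).
def hdi(life, eai, gdpi, lo=25, hi=85):
    lei = (life - lo) / (hi - lo)
    return (lei + eai + gdpi) / 3

korea = (71.5, 0.97, 0.93)
costa_rica = (76.6, 0.86, 0.95)

# With bounds [25, 85], South Korea comes out ahead...
assert hdi(*korea) > hdi(*costa_rica)
# ...but merely lowering the maximum to 80 reverses the ranking.
assert hdi(*korea, hi=80) < hdi(*costa_rica, hi=80)
```

Nothing about the two countries changed; only the arbitrary goalposts did, yet the comparison flips.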

Hence it amounts to increasing the weight of LEI by the same factor. using other values than 0 and 1.1] and the scale could be normalised. extreme weaknesses should not be compensated. even if other dimensions are good. we should conclude that Gabon and Solomon Islands are at the same development level. The value of one year of life is thus 100.2: Compensation: performances of Gabon and Solomon Islands (HDR97) in spite of the informal analysis we performed on the table. Hence.4. Hence.2 Compensation Consider Table 4. Weaknesses on some dimensions are compensated by strengths on other dimensions.2 where the data for two countries (Gabon and the Solomon Islands. 4. Let us go further with compensation. HDR97) are presented.80] increases the diﬀerence between any two values of LEI by a factor (85−25)/(80−25).9$ (recall that W (40 000) = 6 154). a decrease in life expectancy by one year can be compensated by some increase in adjusted real GDP (income transformed by Atkinson’s formula). this is equivalent to choosing 1 for maximum and 0 for minimum.1 70. For us. As any weakness can be compensated by a strength. apparently.60 . it is not surprising that its position is improved when life expectancy is given more weight (by narrowing its range).1. It is obvious that values 0 and 1 have not been observed and are not likely to be observed in a foreseeable future. the adjusted real GDP must be increased by 0. This is probably desirable. This is also an arbitrary choice.63 .85] to [25.1.9$ (adjusted by Atkinson’s formula). .56 for both Gabon and Solomon Islands.016667. the GDPI must increase by the same amount. A decrease by one year yields a decrease of LEI by 1/(85 − 25) = 0. the HDI is equal to 0. Nevertheless. Hence the range of these scales is narrower than [0.47 real GDP 3 641 2 118 Gabon Solomon Islands Table 4.9 is called the substitution rate between life expectancy and adjusted real GDP. In reality.9$. 
Gabon is slightly better than the Solomon Islands on all dimensions except life expectancy where it is very bad. This problem is due to the fact that we used the usual average to aggregate our data into one number. In our example. a decrease in life expectancy by 2 years can be compensated by an increase in adjusted real GDP by 2 times 100.016667(6 154 − 100)= 100. a decrease in life expectancy by n years can be compensated by an increase in adjusted real GDP by n times 100.8 ALI . this very short life expectancy is clearly a sign of severe underdevelopment. to some extent. Yet. life expectancy 54. THE HUMAN DEVELOPMENT INDEX 57 In fact narrowing the range of life expectancy from [25. Note that. Accordingly. Therefore. Let us compute this increase. Costa Rica performed better than South Korea on life expectancy. To compensate this.9$. The value 100. The Solomon Islands perform quite well on all dimensions. no bounds were ﬁxed for the ALI and the ERI. even by very good performances on other dimensions.62 ERI .
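The substitution rate quoted above follows from a one-line computation: one year of life moves the LEI by 1/60, and the equal-weight average converts that into a fixed slice of the GDPI scale:

```python
# Substitution rate between life expectancy and adjusted income implied by
# the equal-weight average of normalised indices.
W_MAX, W_MIN = 6154, 100        # W(40 000) and W(100) from the text
year_in_lei = 1 / (85 - 25)     # one year moves the LEI by 1/60 ~ 0.016667

# Adjusted-income increase producing the same move on the GDPI scale:
rate = year_in_lei * (W_MAX - W_MIN)
assert round(rate, 1) == 100.9  # ~100.9 adjusted dollars per year of life
```

Because the average is linear, this rate is the same for every country and every level of life expectancy, which is precisely what the text questions.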

w’s low income doesn’t seem to life expectancy 70 70 ALI .35 ERI . On the contrary.58 CHAPTER 4.1.30 for x and 0. you need an increase of the adult literacy index of n times 0.025.65 . a decrease in life expectancy of one year can be compensated by an increase in real GDP of 21 084$. HDR97).025.9$.35 ERI .40 real GDP 500 3 500 x y Table 4. except for life expectancy.4). But the diﬀerences in life .g. as life expectancy is high as well as education. the substitution rate between life expectancy and adult literacy is 0. Furthermore. The adult population is very important and its illiteracy is a severe problem. one might consider that adult literacy is not very important (because there are almost no adults) but income is more important because it improves quality of life in other respects. a decrease of life expectancy by one year can be compensated by an increase in real GDP by 100. poor people’s life expectancy has much less value than that of rich ones. the performance of z on adult literacy is really bad compared to that of w.3. we obtain 0. Hence. it might not be unreasonable to conclude that w is more developed than z. In a country where real GDP is 13 071$ (Cyprus. CONSTRUCTING MEASURES Other substitution rates are easy to compute: e.80 . one could conclude that y is more developed than x.3: Independence: performances of x and y life expectancy. 4. health conditions and life expectancy can be expected to improve rapidly due to a higher income.40 real GDP 500 3 500 w z Table 4. Countries x and y perform equally badly on life expectancy 30 30 ALI . Even if the high income of z is used to foster education. But if we compute the HDI. there is no diﬀerence between x and y on the one hand and w and z on the other hand.3 Dimension independence Consider the example of Table 4. HDR97).56 for z! This should not be a surprise. it will take decades before a signiﬁcant part of the population is literate. 
As life expectancy is very short.4: Independence: performances of w and z be a problem for the quality of life.65 .34 for y. To compensate a decrease of n years of life expectancy. Hence.016667(1 − 0)(3/2)=0. Let us now compare two countries. w and z similar to x and y except that life expectancy is equal to 70 for both w and z (see Table 4. Hence.52 for w and 0. In such conditions. Let us now think in terms of real GDP (not adjusted). Our conclusion is conﬁrmed by the HDI: 0. In a country where real GDP is 700$ (Chad.80 . y is much lower than x on adult literacy but much higher than x on income.

The weight of an object really exists (as far as reality exists). this results in the same increase of the HDI (compared to x and y) for both w and z.1. But there is more to scale construction than scale normalisation. as long as they remain identical. identical performances of by two items (countries or whatever) on one or more dimensions are not relevant for the comparison of these items. it is inherent to sums and averages. dimension independence might not be desirable. 4. When a sum (or an average) is used to aggregate diﬀerent dimensions. Atkinson’s formula reﬂects this. for example. The life expectancy index is the average over the population and for a determined time period of the length of the lives of the individuals in the population. one more dollar is considerable. On the . A country where a part of the population (rural or poor or of some race) dies early and where another part of the population lives until 80 might also have a life expectancy of 50 years. even if they are useful.1. It increases the health budget in such proportions that no more resources are available for other important areas: education. . employment policy.4. the real GDP is adjusted using Atkinson’s formula. 4. cannot reﬂect the variety present in the population. When we compare countries on the basis of life expectancy. If you earn 100 dollars. concerning real GDP. But why choosing y ∗ = $5 835 ? Why choose Atkinson’s formula ? Other formulas and other values for y ∗ would work just as well. . It is well known that averages. This is called dimension independence. . A country where approximately everyone lives until 50 has a life expectancy of 50 years. before normalising this scale. But we saw that this property is not always desirable. education and income. Note that the fact that life expectancy. 
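The dimension-independence property discussed for the pairs (x, y) and (w, z) can be verified numerically. The table values below are approximated from the garbled Tables 4.3 and 4.4; the equality of the two HDI gaps holds for any such common shift in life expectancy:

```python
# x/y and w/z differ only by a life expectancy shared within each pair,
# so the HDI *gap* inside each pair must be identical (values assumed,
# reconstructed from the garbled tables).
W_MAX, W_MIN = 6154, 100

def hdi(life, ali, eri, gdp):
    lei = (life - 25) / 60
    eai = (2 * ali + eri) / 3
    gdpi = (gdp - W_MIN) / (W_MAX - W_MIN)   # gdp < y*, so W(gdp) = gdp here
    return (lei + eai + gdpi) / 3

x, y = hdi(30, 0.65, 0.35, 500), hdi(30, 0.35, 0.40, 3500)
w, z = hdi(70, 0.65, 0.35, 500), hdi(70, 0.35, 0.40, 3500)

assert y > x and z > w                  # the richer, less literate country
                                        # wins in both pairs
assert abs((y - x) - (z - w)) < 1e-12   # identical gaps: the common life
                                        # expectancy (30 vs 70) is irrelevant
```

The sum-based HDI cannot let the shared life expectancy influence the comparison, even though, as argued above, literacy plausibly matters more when people live to 70 than when they die at 30.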
One could argue that improving life expectancy by one year in a country where life expectancy is 30 is a huge achievement while it is a moderate one in a country where life expectancy is 70. THE HUMAN DEVELOPMENT INDEX 59 expectancy between x and w and between y and z are equal. Hence. an arbitrary choice has been made and we could easily build a small example showing that another arbitrary (but defendable) choice would yield a diﬀerent ranking of the countries.1. The goal of this adjustment is obvious: if you earn 40 000 dollars. we already have discussed this topic in Section 4. Once more. Some could even argue that increasing life expectancy above a certain threshold is no longer an improvement.1. we have several measures of the weight of an object and we consider the average as a good estimate of its actual weight. The identical performances can be changed in any direction. For example.1 (Scale Normalisation). adult literacy and enrollment have not been adjusted is also an arbitrary choice. they do not aﬀect the way both items compare to each other.4 Scale construction In a way. Note that this kind of average is quite particular. one more dollar is negligible. It is very diﬀerent from the average that we perform when.5 Statistical aspects Let us consider the four indices of the HDI from a statistical point of view.

contrary, the average of the length of life doesn’t correspond to something real: it is the length of life of a kind of average or ideal human.

The adult literacy index is quite different: it is just the number of literate adults, divided by the total adult population to allow comparisons between countries. Hence one could think it is not an average. In fact it depends on how we interpret it. Consider a variable whose value is 0 for an illiterate adult and 1 for a literate one. Compute the average of this variable over the population and over some time period. What do you get? The adult literacy index! If we consider that an ALI of 0.60 means that 60% of the population is literate, then it is not an average. If we consider that an ALI of 0.60 means that the average literacy level is 60%, then it is an average. And this last interpretation is not more silly than computing a life expectancy index. We can analyse the enrolment ratio index and the adjusted real GDP index in the same way as the ALI, the first one being a proportion and the second one being normalised. They are quantities that are measured at country level but, like averages, they can also be interpreted at individual level.

Until the 19th century, both kinds of averages were called by different names (moyenne proportionnelle for different measures of one object, and valeur commune for different objects, each measured once) and considered as completely different. During the 19th century, the Belgian astronomer and statistician Quetelet (1796–1874) invented the concept of the average human and unified both averages (Desrosières 1995). Quetelet measured the average size of humans, including the liver, heart, spleen and other organs, as if we (the real humans) were imperfect, irregular or noisy copies of that average human, in all dimensions, in the same proportion. What he got was an average human in which it was impossible to fit all its average organs. They were too large!

To convince you that the concept of the average human is quite strange (though possibly useful), consider a country where all inhabitants are right triangles of different sizes and shapes (example borrowed from Warusfel (1961)). To make it easy, let us suppose that there are just two kinds of right triangles (see Fig. 4.1). A statistician wants to measure the average right triangle. In order to do so, he computes the average length of each edge. What he gets is a triangle with edges of length 4, 8 and 9, i.e. a triangle which is not right-angled, for 4² + 8² ≠ 9². The average right triangle is no longer a right triangle! What looks like a right angle is in fact approximately a 91 degrees angle.

Figure 4.1: Two right triangles (edges 3, 4, 5 and 5, 12, 13) and their edge-by-edge average (4, 8, 9)

What about the HDI itself? According to the United Nations Development Programme (1997), it is designed to
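Quetelet’s paradox with the right triangles can be checked in a few lines: averaging a 3-4-5 and a 5-12-13 triangle edge by edge gives a 4-8-9 triangle, which is no longer right-angled.

```python
import math

t1, t2 = (3, 4, 5), (5, 12, 13)
avg = tuple((a + b) / 2 for a, b in zip(t1, t2))
assert avg == (4, 8, 9)

a, b, c = avg
assert a**2 + b**2 != c**2              # 16 + 64 = 80, not 81

# Law of cosines: the 'right' angle of the average triangle is about 90.9 deg.
angle = math.degrees(math.acos((a**2 + b**2 - c**2) / (2 * a * b)))
assert 90 < angle < 91
```

The average of the edges is a perfectly well-defined number, but the object it describes belongs to no inhabitant of the country.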

Furthermore, the HDI contains an index (LEI) which can only be interpreted bearing in mind Quetelet’s average human. Therefore the ALI, GDPI and HDI should be interpreted in this way as well: the HDI somehow describes how developed the average human in a country is.

4.2 Air quality index

Due to the alarming increase in air pollution during the last decades, several governments and international organisations edited some norms concerning pollutants’ concentration in the air (e.g. the Clean Air Act in the US). Usually these norms specify, for each pollutant, a concentration that should not be exceeded. But these norms are just norms and they are often exceeded, mainly in urban areas; furthermore, a good quality air is not guaranteed by norms. Therefore, different monitoring systems have been developed in order to provide governments as well as citizens with some information about air pollution. Two examples of such systems are the Pollutant Standards Index (PSI), developed by the US Environmental Protection Agency ((Ott 1978) or http://www.epa.gov/oar/oaqps/psi.html), and the ATMO index, developed by the French Environment Ministry (http://www-sante.ujf-grenoble.fr/SANTE/paracelse/envirtox/Pollatmo/Surveill/atmo.html). These two indicators are very similar and we will discuss the French ATMO.

The ATMO index is based on the concentration of 4 major pollutants: sulfur dioxide (SO2), nitrogen dioxide (NO2), ozone (O3) and particulate matter (soot, dust, particles). For each pollutant, a sub-index is computed and the final ATMO index is defined as being equal to the largest sub-index. Here is how each sub-index is defined. For each pollutant, the concentration is converted into a number on a scale from 1 to 10. Level 1 corresponds to an air of excellent quality, levels 5 and 6 are just around the EU long term norms, level 8 corresponds to the EU short term norms and 10 indicates hazardous conditions. In the following paragraphs, we discuss some problems arising with the ATMO index.

4.2.1 Monotonicity

Suppose that the sub-indices are as in Table 4.5.

pollutant    CO2   SO2   O3   dust
sub-index     3     3    3     8

Table 4.5: Sub-indices of the ATMO index

The ATMO index is the largest value, that is 8: the air quality is very bad. Suppose now that heavy traffic, the absence of wind and a very sunny day occur. In these conditions, the ozone sub-index increases from 3 to 8. Clearly, this corresponds to a worse air: no pollutant decreased and one of them increased. Naturally, we expect the ATMO index to worsen as well. In fact, the ATMO index does not change: the maximum is still 8.
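Since the ATMO index is simply the maximum of the sub-indices, its behaviour is easy to reproduce. A minimal sketch (the sub-index values are those of Tables 4.5 and 4.6; the second pair of airs anticipates the non-compensation example of the next subsection):

```python
def atmo(subindices):
    """The ATMO index: the largest pollutant sub-index (scale 1..10)."""
    return max(subindices.values())

# Table 4.5: air with one very bad dust sub-index.
before = {"CO2": 3, "SO2": 3, "O3": 3, "dust": 8}
after = dict(before, O3=8)        # ozone worsens from 3 to 8

print(atmo(before), atmo(after))  # 8 8: the index misses a clear worsening

# Table 4.6: air x is excellent except for ozone; air y is mediocre everywhere.
x = {"CO2": 1, "SO2": 1, "O3": 6, "dust": 1}
y = {"CO2": 5, "SO2": 4, "O3": 5, "dust": 5}
print(atmo(x), atmo(y))           # 6 5: x is rated worse than y
```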

In our example, the change is very significant, as the ozone sub-index was almost perfect and became very bad. Thus some changes, even significant ones, are not reflected by the index. Note that if the ozone sub-index then decreases from 8 back to 3, the ATMO index does not change either, though the air quality improves. This shows that the ATMO index is not monotonic: some changes, in both directions, are not reflected by the index.

4.2.2 Non compensation

Let us consider the ATMO index for two different airs (x and y), as described by Table 4.6.

pollutant    CO2   SO2   O3   dust
x             1     1    6     1
y             5     4    5     5

Table 4.6: Sub-indices for x and y

Air x is perfect for all measurements but one: it scores just above the EU long term norm for ozone. Air y is not good on any dimension: it is of average quality on all dimensions and close to the EU long term norms for three dimensions. The ATMO index is 6 for air x and 5 for air y. Hence, the quality of air x is considered to be lower than that of air y, which is probably not better than x. The small weakness of x (6 compared to 5, for ozone) is not compensated by its large strengths (1 compared to 4 or 5, for the other three pollutants): no compensation at all occurs between the different dimensions. Contrary to what we observed with the HDI, where, in the case of human development, the compensation between dimensions was too strong, here we face another extreme: no compensation at all.

4.2.3 Meaningfulness

Let us forget our criticism of the ATMO index and suppose that it works well. Let us come back to the definition of the sub-indices. For a given pollutant, the concentration is measured in µg/m3. The concentration figures are then transformed into numbers between 1 and 10. This is done in an arbitrary way: instead of choosing 5-6 for the EU long term norms and 8 for the short term ones, 6-7 and 9 could have been chosen. The index would work as well, because the relevant information provided by the index is not the figure itself: it is some information about the fact that we are above or below some norms that are related to the effects of the pollutants on health (a somewhat similar situation has been encountered in Chapter 3). But in such a case, the values of today’s and yesterday’s index would be different, say 7 and 4, and 7 is not twice as large as 4. Consider then the statement “Today’s ATMO index (6) is twice as high as yesterday’s index (3)”. What does it mean ? We are going to show that it is meaningless.


The statement “Today’s ATMO index (6) is twice as high as yesterday’s index (3)” would be valid, or meaningful, only in a particular context, depending upon arbitrary choices. Such a statement is said to be meaningless. On the contrary, the statement “Today’s ATMO sub-index for ozone (6) is higher than yesterday’s sub-index for ozone (3)” is meaningful: any reasonable transformation of the concentration figures into numbers between 1 and 10 would lead to the same conclusion, namely that today’s sub-index is higher than yesterday’s one. By “reasonable transformation” we mean a transformation that preserves the order: a concentration cannot be transformed into an index value lower than the index value corresponding to a lower concentration. Concentrations of 110 and 180 µg/m3 can be transformed into 3 and 6, or 4 and 6, or 2 and 4, but not into 4 and 2. More subtle: is the sentence “Today’s ATMO index (6) is larger than yesterday’s ATMO index (3)” meaningful ? In the previous paragraph, we saw that the arbitrariness involved in the construction of the 1 to 10 scale of a sub-index is not a problem when we want to compare two values of the same sub-index. But if we want to compare two values of two different sub-indices, this is no longer true: a value of 3 on a sub-index could be more dangerous for health than a 6 on another sub-index. Of course, the scales have been constructed with care: 5 corresponds to the EU long term norms on all sub-indices and 8 to the short term norms. This is intended to make all sub-indices commensurable, so that comparisons should be meaningful. But can we really assume that a 5 (or the corresponding concentration in µg/m3) is equivalent on two different sub-indices ? Equivalent in what terms ? Some pollutants might have short term effects and other pollutants, long term effects. They can have effects on different parts of the organism. Should we compare the effects in terms of discomfort, mortality after n years, health care costs, . . . ?
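The arbitrariness of the 1 to 10 conversion can be made concrete: two different order-preserving threshold choices agree on which day is worse but disagree on “twice as high”. A sketch, with invented concentration thresholds (all numbers below are illustrative assumptions, not official ATMO values):

```python
def to_subindex(concentration, thresholds):
    """Convert a concentration to a 1..10 sub-index given 9 increasing thresholds."""
    return 1 + sum(concentration >= t for t in thresholds)

# Two equally legitimate, order-preserving threshold choices.
scale_a = [40, 80, 120, 160, 200, 240, 280, 320, 360]
scale_b = [25, 50, 75, 130, 170, 190, 230, 270, 310]

yesterday, today = 90, 210  # concentrations in micrograms per cubic metre

ia_y, ia_t = to_subindex(yesterday, scale_a), to_subindex(today, scale_a)
ib_y, ib_t = to_subindex(yesterday, scale_b), to_subindex(today, scale_b)

print(ia_y, ia_t)  # 3 6: on this scale, today "is twice as high"
print(ib_y, ib_t)  # 4 7: on this scale, it is not
print(ia_t == 2 * ia_y, ib_t == 2 * ib_y)  # True False
```

Both scales rank the two days identically, so ordinal statements are safe; ratio statements depend on the arbitrary choice of thresholds.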

4.3 The decathlon score

The decathlon is a 10-event athletic contest. It consists of 100-meter, 400-meter, and 1 500-meter runs, a 110-meter high hurdles race, the javelin and discus throws, shot put, pole vault, high jump, and long jump. It is usually contested over two or three days; it was introduced as a three-day event at the Olympic Games of Stockholm in 1912. To determine the winner of the competition, a score is computed for each athlete and the athlete with the best score is the winner. This score is the sum of the single-event scores. The single-event scores are not just times and distances: it doesn’t make sense to add the time of a 100-meter run to the time of a 1 500-meter run, and it is even worse to add the time of a run to the length of a jump. This should be obvious for everyone.

Until 1908, the single-event scores were just the rank of an athlete in that event. For example, if an athlete performed the third best high jump, his single-event score for the high jump was 3. The winner was thus the athlete with the lowest overall score. Note that this amounts to using the Borda method (see p.14) to elect the best athlete when there are ten voters and the preferences of each voter are the rankings defined by each event.

[Figure 4.2: decathlon tables for distances: general shape of a convex (left) and a concave (right) table.]

The main problem with these single-event scores is that they very poorly reflect the performances of the athletes. Suppose that an athlete arrived 0.1 second before the next athlete in the 100-meter run. They have ranks i and i + 1, so the difference in the scores that they receive is 1. Suppose now that the delay between these two athletes is 1 second. Their ranks are unchanged. Thus the difference in the scores that they receive is still 1, though a larger difference would be more appropriate.

That is why other tables of single-event scores have been used since 1908 (de Jongh 1992, Zarnowsky 1989). In the tables used after 1908, high scores are associated to good performances (contrary to scores before 1908). Hence, the winner is the athlete that has the highest overall score. Some of these tables (different versions, in use between 1934 and 1962) are based on the idea that improving a performance by some amount (e.g. 5 centimetres in a long jump) is more difficult if the performance is close to the world record. Hence, it deserves more points. The general shape of these tables, for distances, is given in Figure 4.2 (convex table). For times (in runs), the shape is different as an improvement is a decrease in time.

A problem raised by convex tables is the following: if an athlete decides to focus on some events (for example the four kinds of runs) and to do much more training for them than for the other ones, he will have an advantage. He will come closer to the world record for runs and earn many points. At the same time, he will be further away from the world record for the other disciplines, but that will make him lose fewer points as the slope of the curve is more gentle in that direction. The balance will be positive. Thus these tables encourage athletes to focus on some disciplines, which is contrary to the spirit of the decathlon. That is why, since 1962, different concave tables (see Figure 4.2) have been used.
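The weakness of the pre-1908 rank-based scores discussed above is easy to exhibit: the score difference between two athletes is the same whatever the time gap separating them. A minimal sketch with hypothetical 100-meter times:

```python
def rank_scores(times):
    """Pre-1908 single-event scores: 1 for the best time, 2 for the next, ..."""
    order = sorted(times, key=times.get)  # athletes, fastest first
    return {athlete: rank for rank, athlete in enumerate(order, start=1)}

close = {"A": 10.5, "B": 10.6, "C": 11.0}  # A beats B by 0.1 s
wide = {"A": 10.5, "B": 11.5, "C": 12.0}   # A beats B by a full second

s1, s2 = rank_scores(close), rank_scores(wide)
print(s1["B"] - s1["A"], s2["B"] - s2["A"])  # 1 1: the same score difference
```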
These tables strongly encourage the athletes to be excellent in all disciplines. An example of a real table, in use in 1998, is presented in Figure 4.3. Note that a new change occurred: this table is no longer concave. It is almost linear but slightly convex.

[Figure 4.3: a plot of the 1998 score table for the 100 meters run: points (from 400 to 1200) against time (from 9.5 to 13 seconds).]

There are many interesting points to discuss about the decathlon score.

• How are the minimum and maximum values set ? They can highly influence the score, as was shown with the HDI (in Section 4.1.1). Obviously, the maximum value must somehow be related to the world record. But, as everyone knows, world records are objects that athletes like to break.

• Why add single-event scores ? Other operations might work as well. For example, multiplication may favour the athletes that perform equally well in all disciplines. To illustrate this point very simply, consider a 3-event contest where single-event scores are between 0 and 10. An athlete, say x, obtains 8 in all three events. Another one, y, obtains 9, 8 and 7. If we add the scores, x and y obtain the same score: 24. If we multiply the scores, x gets 512 while y loses with 504.

• ...

The point on which we will focus, in this decathlon example, is the role of the indicator.
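The shape of modern tables and the addition-versus-multiplication question can both be illustrated numerically. In the sketch below, the run-scoring formula and its constants are the commonly cited IAAF-style ones for the 100 meters; treat them as an illustrative assumption, not as values taken from this chapter. The 3-event comparison reproduces the x and y example of the text:

```python
from math import prod

def track_points(time_s, a=25.4347, b=18.0, c=1.81):
    """IAAF-style single-event score for a run; the faster the time, the more points.
    Constants are the commonly quoted 100 m values (illustrative assumption)."""
    return int(a * (b - time_s) ** c)

# The table is almost linear but slightly convex: a 0.1 s improvement is worth
# a few more points to a 10 s sprinter than to a 12 s one.
print(track_points(10.0) - track_points(10.1))
print(track_points(12.0) - track_points(12.1))

# Adding versus multiplying single-event scores (3-event example from the text).
x, y = (8, 8, 8), (9, 8, 7)
print(sum(x), sum(y))    # 24 24: a tie under addition
print(prod(x), prod(y))  # 512 504: multiplication favours the even athlete x
```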

4.3.1 Role of the decathlon score

Although one might think that the role of the overall score is clearly to designate the winner, we are going to show that it plays many roles (like student grades, see Chapter 3) and that this is one of the reasons why it changes so often. Of course, one of the roles is to designate the winner and it was probably the only purpose that the first designers of the score had in mind. But we can be quite sure that immediately after the first contest, another role arose: many people probably used the scores to assess the performance of the athletes. One athlete has a score very close to that of the winner and is thus a good athlete. Another one is far from the winner and is consequently not a good athlete. Not much later (after the second competition), a third role appeared: how did the athletes evolve ? This athlete has improved his score, or x has a better score in this contest than the score of y in the previous contest. This kind of comparison is not meaningful: suppose that an athlete wins a contest with a score of 16. In the next contest, he performs very poorly: short jumps, slow runs, short throws. But his main opponents are absent or perform equally poorly. He might still win the contest, and even with a higher score, although his performance is worse than the previous time. After some time, the organisers of decathlons became aware of the second and third roles. It was probably part of the motivation to abandon the sum of ranks and to use convex tables. These tables, to some extent, made the comparisons of scores across athletes and/or competitions meaningful. At the same time, the score found a new role, as a monitoring tool during training: before 1908, the scores could be computed only during competitions, as they were sums of ranks. And it was not long before a wise coach used it as a strategic tool, advising his athlete to focus on some events. For this reason, since 1962, the organisers conferred a new role on the score: to foster excellence in all disciplines. This was achieved by the introduction of concave tables. But it is most likely that the score is still used as a strategic tool, hopefully in a less perverse way. It is worth noting that this new role doesn’t replace any of the previous ones. The score aims at rewarding equal performances in all disciplines but it is also used to assess the performance of an athlete. Even if we consider only these two roles (the other ones could be seen as side effects), it is amazing to see how incompatible they are.

4.4 Indicators and multiple criteria decision support

Classically, in a decision aiding process, a decision-maker wants to rank the elements of the set of alternatives (or to choose the best element). In order to rank, he selects several dimensions (criteria) that seem relevant with respect to his problem. Each alternative is characterised by a performance on each criterion (this is the evaluation matrix or performance tableau). A MCDA method is then used to rank the alternatives, with respect to the preferences of the decision-maker. When an indicator is built, several dimensions are also selected. Each item is characterised by a performance on each dimension. An index that can be used to rank the items is computed. The analogy between a decision support method and an index is obvious: both aim at aggregating multi-dimensional information about a set of objects. But there is a tremendous difference as well: when an indicator is built, it is often the case that there is no clearly defined decision problem, no decision-maker and, a fortiori, no preferences. To cope with this absence of preferences, one could consider that the preferences are those of the potential users of the indicator. To some extent, this is possible because very often the preferences of the users


go in the same direction for each dimension taken separately. For example, for each dimension of the ATMO index, everyone prefers a lower concentration. But it is definitely not reasonable to assume that the global preferences are similar. Furthermore, even if single-dimensional preferences go in the same direction, it does not mean that single-dimensional preferences are identical. Those who are not very sensitive to a pollutant will value a decrease in concentration much more if it occurs at high concentration than at low concentration. On the contrary, sensitive people might value concentration decreases at low and high levels equally.

The relevance of measurement theory

The absence of preferences is crucial. In decision support, many studies and concepts relate to measurement theory. Measurement theory is the theory that studies how we can measure objects (assign a number to an object) so as to reflect a relation on these objects. E.g., how can we assign numbers to physical objects so as to reflect the relation “heavier than” ? That is, how to assign a number (called weight) to each object so that “x’s weight > y’s weight” implies “x is heavier than y” ? Additional properties may be required. For example, in the case of weight measurement, one wishes that the number assigned to x and y taken together be the sum of their individual weights. Another example is that of distance. How to assign numbers to points in space so as to reflect the relation “more distant than” with respect to some reference point ? Contrary to the previous example, this one has several dimensions (usually two or three: x, y or x, y, z, or longitude, latitude and altitude, etc.). Each object (point) is characterised by a performance (co-ordinate) in each dimension and one tries to aggregate these performances into one indicator: the distance to the reference point. This problem is at the core of geometry. Note that the answer is not unique.
Very often the Euclidean distance is chosen (assuming that the shortest path between two points is the straight line). Sometimes, a geodesic distance is more relevant: when you consider points on the earth’s surface, unless you are a mole, the shortest path is no longer a straight line but a curve. In other circumstances, the Manhattan distance is more appropriate: between two points in Manhattan, if you are not flying, the shortest path is neither a straight line nor a smooth curve; it is a succession of perpendicular straight segments. And there are many other distances. As far as physical properties are concerned (larger than, warmer than, faster than, . . . ), the problem is easy: good measurements were carried out in Antiquity without any theory of measurement. But when we consider other kinds of relations, things are more complex. How to assign numbers to people or alternatives so as to reflect the relations “more loveable than”, “preferable to” or “more risky than” ? In such cases, measurement theory can be of great assistance but is insufficient to solve all problems. In decision support, measuring objects with respect to the relation “is preferred to” can be of some help because, once the objects have been measured, it is rather easy to handle numbers. It is often assumed that a preference relation over the alternatives exists but is not well known, and one tries to measure the alternatives
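The non-uniqueness of the distance indicator can be made concrete: the Euclidean, Manhattan and great-circle distances aggregate the same co-ordinates differently. A minimal sketch (the city co-ordinates are rounded and purely indicative):

```python
import math

def euclidean(p, q):
    """Straight-line distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def manhattan(p, q):
    """Distance along perpendicular street segments."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def great_circle_km(p, q, radius_km=6371.0):
    """Shortest distance along the sphere's surface (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))

a, b = (0.0, 0.0), (3.0, 4.0)
print(euclidean(a, b))   # 5.0: straight line
print(manhattan(a, b))   # 7.0: succession of perpendicular segments

# On the earth's surface the relevant path is a curve, not a straight line:
paris, brussels = (48.85, 2.35), (50.85, 4.35)  # rounded (lat, lon) in degrees
print(round(great_circle_km(paris, brussels)))  # about 265 km
```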

so as to discover the preference relation, that is, a pre-existent relation. Measurement theory can therefore be used to build or to analyse a decision support method. On the contrary, many indices are built without the assumption that a relation over the items a priori exists, and without trying to reflect a pre-existent relation: the aim of an index is precisely to build or create a relation over the items. In that case, the preference relation is not assumed to completely exist a priori (at best, some characteristics of the preference relation still exist a priori). Measurement theory loses some of its power when there is no a priori relation to be reflected. Therefore, it seems that, in such a case, measurement theory cannot tell us much about the index. But this does not mean that all indicators are equally good.

An indicator can be considered as a kind of language. It is based on some (more or less arbitrary) conventions and helps us to efficiently communicate about different topics or perform different tasks. By “efficiently”, we mean “more efficiently than without any language”, not necessarily in the most efficient way. If the people that created the decathlon had decided to wait until a sound theory showed them how to designate the winner, it is very likely that no decathlon contest would ever have taken place. Nevertheless, the indicators are not useless.

Indicators and reality

The index does not help to uncover reality: it institutes or settles reality (Desrosières 1995). This is very obvious with the decathlon score. Between 1908 and 1962, the scores were designed to assess the performances and to compare them; the score is considered as the true measure of performance. Any athlete that was not convinced of this had to change his mind and to behave accordingly if he wanted to compete, as one of the most important things for a professional athlete is to win (contrary to the opinion of de Coubertin). This is not particular to the decathlon score. Many governments probably try to exhibit a good HDI for their country in order to keep international subsidies or to legitimise their authority to the population of the country or foreign governments, even if other policies might be more beneficial to the country. Some city councils, willing to attract high salaried residents, claim, among others, to have high air quality. The most efficient way for them to make their claim credible is to exhibit a good ATMO index (or any other index in countries other than France). One might be tempted to reject any indicator that does not reflect reality. But an indicator need not reflect a pre-existing reality: it institutes reality, in some arbitrary way. As any language, it is not always precise and leaves room for ambiguities and contradictions. Ambiguities and contradictions are certainly adequate for poetry; otherwise we could never enjoy things like this:

Mis pasos en esta calle
Resuenan
en otra calle
donde

oigo mis pasos
pasar en esta calle
donde
Sólo es real la niebla 1

or

Wenn ich mich lehn’ an deine Brust,
kommt’s über mich wie Himmelslust,
doch wenn du sprichst: ich liebe dich!
so muss ich weinen bitterlich. 2

1 Octavio Paz, translated by Nims (1990): “My footsteps in this street / Re-echo / in another street / where / I hear my footsteps / passing in this street / where / Nothing is real but the fog”.
2 Heinrich Heine, translated by Louis Untermeyer (van Doren 1928): “And when I lean upon your breast / My soul is soothed with godlike rest; / But when you swear: I love but thee! / Then I must weep–and bitterly.”

But, when it comes to decision-making, ambiguities and contradictions should generally be kept at a minimum.

Back to multiple criteria decision support

In a decision aiding process, preferences are not perfectly known a priori. Preferences can emerge and evolve during the decision aid process; otherwise, it would be very unlikely that any aid would be required. Consequently, relying solely on measurement theory is not possible. But, unlike cases where indicators are built without any decision problem in mind, most decision aiding processes relate to a more or less precisely defined decision problem. Therefore, at least some elements of preferences are present and all models built during the process should reflect them. Most decision aiding processes, like most indicators, probably cannot avoid some arbitrary elements. They can occur at different steps of the process: the choice of an analyst, of the criteria, of the aggregation scheme, to mention a few. When possible, they should be avoided. Here, if some measurement (associating numbers to alternatives) is performed during the aiding process and certain elements of preferences are known for sure, measurement theory can be used to ensure that the model built during the aiding process does not contradict these elements of preferences, that it reflects them and that all sound conclusions that can be drawn from the conjunction of these elements are actually drawn.

4.5 Conclusions

Among evaluation and decision models, indicators are probably more widespread than any other model (this is definitely true if you think of cost-benefit analysis or multiple criteria decision support). Student grades are very popular as well (almost everyone has faced them at some point of his life), but, besides the fact that most people use and/or encounter them, indicators are pervasive in many domains of human activity, contrary to student grades, which are confined to education (note that student grades could be considered as special cases of indicators). Indicators are usually presented as an efficient way to synthesise information. But what do we need information for ? For making decisions ! Indicators are not often thought of as decision support models but, actually, in many circumstances, they are.

In this chapter, we analyzed three different indicators: the human development index, the ATMO (an air quality index) and the decathlon score. On the one hand, all three indicators have been shown to present flaws: they do not always reflect reality or what we consider as reality. This is due to an excess or a lack of compensation, to non monotonicity, to an incapability of dealing with dimension dependence, . . . These problems are not specific to indicators: some of them have already been discussed in Chapter 3 and/or will be met in Chapter 6. On the other hand, we saw that an indicator does not necessarily need to reflect reality or, at least, it does not need to reflect only reality.

5

ASSESSING COMPETING PROJECTS: THE EXAMPLE OF COST-BENEFIT ANALYSIS

5.1 Introduction

Decision-making inevitably implies, at some stage, the allocation of rare resources to some alternatives rather than to others (e.g. deciding how to use one’s income). It is therefore not at all surprising that the question of helping a decision-maker to choose between competing alternatives, projects, courses of action and/or to evaluate them has attracted the attention of economists. Cost-Benefit Analysis (CBA) is a set of techniques that economists have developed for this purpose. It is based on the following simple and apparently inescapable idea: a project should only be undertaken when its “benefits” outweigh its “costs”. CBA is particularly oriented towards the evaluation of public sector projects. Decisions made by governments, public agencies and firms or international organisations are complex and have a huge variety of consequences. Some examples of areas in which CBA has been applied will give a hint of the type of projects that are evaluated:

• Economics: determining investment strategies for developing countries, allocating budgets among agencies, developing an energy policy for a nation (Dinwiddy and Teal 1996, International Atomic Energy Agency 1993, Kirkpatrick and Weiss 1996, Little and Mirlees 1968, Little and Mirlees 1974).

• Environment: establishing pollution standards, creating national parks, approving the human consumption of genetically-modified organisms or irradiated food (Hanley and Spash 1993, Johansson 1993, Toth 1997).

• Health: building new hospitals, buying new diagnosis tools, setting up prevention policies, choosing standard treatments for certain types of illnesses (Folland, Goodman and Stano 1997, Johannesson 1996).

• Transportation: building new roads or motorways, building a high-speed train, reorganising the bus lines in a city (Adler 1987, Schofield 1989, Willis, Garrod and Harvey 1998).

Although CBA has distant origins (see Dupuit 1844), its development has unsurprisingly coincided with the more active involvement of governments in economic affairs that started after the great depression and climaxed after World War II in the 50’s and 60’s. After having started in the USA in the field of Water Resource Management (see Krutilla and Eckstein (1958) for an overview of these pioneering developments), the principles of CBA were soon adopted in other areas and countries, the UK being the first and more active one. A good overview of the early history of CBA can be found in Dasgupta and Pearce (1972). While research on (and applications of) CBA grew at a very fast rate during the 50’s and 60’s, the principles of CBA were entrenched in a series of very influential “manuals for project evaluation” produced by several international organisations (OECD: Little and Mirlees (1968), ONUDI: Dasgupta, Marglin and Sen (1972), World Bank: Adler (1987) and, more recently, Asian Development Bank: Kohli (1993)). Research on CBA is still active and economists have spent considerable time and energy in investigating its foundations and refining the various tools that it requires in practical applications (recent references include Boardman 1996, Brent 1996, Nas 1996). Most economists view CBA as the standard way of evaluating such projects and of supporting public decision-making; numerous examples of practical studies using CBA can easily be found in applied economics journals, e.g. American Journal of Agricultural Economics, Energy Economics, Environment and Planning, Journal of Environmental Economics and Management, Journal of Health Economics, Journal of Policy Analysis and Management, Journal of Public Finance and Public Choice, Journal of Transport Economics and Policy, Land Economics, Pharmaco-Economics, Public Budgeting and Finance, Regional Science and Urban Economics, Water Resources Research. In many countries nowadays, the Law makes it an obligation to evaluate projects using the principles of CBA.

These types of decision are immensely complex. They affect our everyday life and are likely to affect that of our children. It would be impossible to give a fair account of the immense literature on CBA in a few pages. Although somewhat old, two excellent introductory references are Dasgupta and Pearce (1972) and Lesourne (1975). Less ambitiously, we shall try here to:

• give a brief and informal account of the principles underlying CBA;
• give an idea of how these principles are applied in practice;
• give a few hints on the scope and limitations of CBA.

These three objectives structure the rest of this chapter into sections. Since fairly different approaches to these problems have been advocated, it is important to have a clear idea of what CBA is; indeed, if the claim of economists was to be perfectly well-founded, there would be hardly any need for other decision/evaluation models. Our aim, while clearly not being to promote the use of CBA, is not to support the nowadays-fashionable claim (especially among environmentalists) that CBA is an outdated useless technique either. In pointing out what we believe to be some

equipment will have to be replaced) the general case is that the duration of the project is more or less conventionally chosen as the period of time for which it seems reasonable and useful to perform the evaluation. real-world applications imply dividing the duration of the project into time periods of equal length.1 The principles of CBA Choosing between investment projects in private ﬁrms The idea that a project should only be undertaken if its “beneﬁts” outweigh its “costs” is at the heart of CBA. it is the only “consistent” way to support decision/evaluation processes (Boiteux 1994). because after a certain date the Law will change.g. If the very nature of the project may command this choice (e.2. At this stage. the (algebraic) sum a(0) . Although all components of the evaluation vector are expressed in identical monetary units (m. First a time horizon for its evaluation must be chosen. It is of little practical content however unless we deﬁne more precisely what “costs” and “beneﬁts” are and how to evaluate and compare them. a(1). under all circumstances. the evaluation model of the project has the form of an evaluation vector with T +1 components (a(0). Note that these evaluations are relative: they aim at capturing the inﬂuence of the project on the ﬁrm and not its overall situation. Let us denote b(i) (resp. c(i)) the beneﬁts (resp. An investment project may usefully be seen as an operation in which money is spent today (the “costs”). We seek to obtain an evaluation of the amount of cash that is generated by the project during each time period. this amount being the diﬀerence between the “beneﬁts” and the “expenses” generated by the project (including the residual value of the project in the last period). you should enjoy the free lunch and there is hardly any evaluation problem). .2.u. . we only want to give arguments refuting the claim of some economists that. with the hope that this money will produce even more money (the “beneﬁts”) tomorrow. 
A simple starting point is to be found in the literature on Corporate Finance on the choice between “investment projects” in private ﬁrms. In general. Such a task may be more or less easy depending on the nature of the project. a(T )) where 0 conventionally denotes the starting time of the project. A useful way to evaluate such an investment project is the following. the environment of the ﬁrm and the duration of the project.5.). Although a “continuous” evaluation is theoretically possible. the expenses) generated by the project during the ith period of time. . The next step is to try to evaluate the consequences of the project in each of these time periods. 5. This involves some arbitrariness (should we choose years or semesters?) as well as trade-oﬀs between the depth and the complexity of the evaluation model.2 5. This claim may seem so obvious that it need not be discussed any further. Some discussion will therefore prove useful. The net eﬀect of the project in period i is therefore a(i) = b(i) − c(i). . THE PRINCIPLES OF CBA 73 limitations of CBA. some of the components of this vector (most notably a(0)) will be negative (if not. Suppose now that a project is to be evaluated on T time periods of equal length.

Although all components of the evaluation vector are expressed in identical monetary units (m.u.), they are not directly comparable: a(0) is to be received today while a(1) will only be received one time period ahead. Suppose that there is a capital market on which the firm is able to lend or borrow money at a fixed interest rate of r per time period (this market is assumed to be perfect: borrowing and lending will not affect r and are not restricted). If you borrow 1 m.u. for one time period on this market today, you will have to spend (1+r) m.u. in period 1 in order to respect your contract. Similarly, if you know that you will receive 1 m.u. in period 1, you can borrow an amount of 1/(1+r) m.u. now: your revenue of 1 m.u. in period 1 will allow you to reimburse exactly what you have to, i.e. (1/(1+r))(1+r) = 1 m.u. Hence, being sure of receiving 1 m.u. in period 1 corresponds to receiving, here and now, an amount of 1/(1+r) m.u. Using a similar reasoning and taking into account compound interest, receiving 1 m.u. in period i corresponds to an amount of 1/(1+r)^i m.u. now. This is what is called discounting and r is called the discounting rate.

There is a simple way however to summarise the components of the evaluation vector (a(0), a(1), ..., a(T)) using a single number: the sum to be received now that is equivalent to this cash stream via borrowing and lending operations on the capital market. This sum, called the Net Present Value (NPV) of the project, takes into account the costs and the benefits of the project and their dispersion in time; it is given by:

(5.1)    NPV = Σ_{i=0}^{T} a(i)/(1+r)^i = Σ_{i=0}^{T} (b(i) − c(i))/(1+r)^i

If NPV > 0, the cash stream of the project is equivalent to receiving money now: the project makes the firm richer and, thus, should be undertaken. The reverse conclusion obviously holds if NPV < 0; when NPV = 0, the firm is indifferent between undertaking the project or not. This simple reasoning underlies the following well-known rule for choosing between investment projects in Finance: "when projects are independent, choose all projects that have a strictly positive NPV".

In deriving this simple rule, we have made various hypotheses. Most notably:
• a duration for the project was chosen;
• the duration was divided into conveniently chosen time periods of equal length;
• all consequences of the projects were supposed to be adequately modelled as benefits b(i) and costs c(i) expressed in m.u.;
• a perfect capital market was assumed to exist;
• the effect of uncertainty and/or imprecision was neglected;
• other possible constraints were ignored (e.g. projects may be exclusive or synergetic).
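The NPV formula (5.1) is straightforward to compute; in the sketch below both the cash-flow stream and the discounting rate are invented for illustration:

```python
def npv(cash_flows, r):
    """Net Present Value of a stream a(0), ..., a(T) at discounting rate r,
    i.e. the sum of a(i) / (1 + r)**i as in equation (5.1)."""
    return sum(a_i / (1 + r) ** i for i, a_i in enumerate(cash_flows))

# Hypothetical project: spend 100 m.u. now, receive 40 m.u. in each of 4 periods.
project = [-100.0, 40.0, 40.0, 40.0, 40.0]
print(npv(project, 0.10) > 0)  # strictly positive NPV at r = 10%: undertake it
```

Under the rule quoted above, such a project would be retained; note that raising r lowers the NPV of projects whose benefits come late.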

The literature in Finance is replete with extensions of this simple model that allow one to cope with less simplistic hypotheses.

5.2.2 From Corporate Finance to CBA

Although the projects that are usually evaluated using CBA are considerably more complex than the ones we implicitly envisaged in the previous paragraph, CBA may usefully be seen as using a direct extension of the rule used in Finance. The main extensions are the following:
• in CBA "costs" and "benefits" are evaluated from the point of view of "society";
• in CBA "costs" and "benefits" are not necessarily directly expressed in m.u.: they are evaluated in units that are specific to each dimension, and conveniently chosen "prices" are used to convert them into m.u.;
• in CBA the discounting rate has to be chosen from the point of view of "society".

Retaining the spirit of the notations used above, the benefits b(i) and costs c(i) of a project in period i are seen in CBA as vectors with respectively ℓ and ℓ′ components:

b(i) = (b(1,i), b(2,i), ..., b(ℓ,i))   and   c(i) = (c(1,i), c(2,i), ..., c(ℓ′,i))

where b(j,i) (resp. c(k,i)) denotes the "social benefits" (resp. the "social costs") on the jth (resp. kth) dimension generated by the project in period i, evaluated in units that are specific to that dimension. In each period, "costs" and "benefits" are converted into m.u. using suitably chosen "prices". We denote by p(j) (resp. p′(k)) the price of one unit of social benefit on the jth dimension (resp. of one unit of social cost on the kth dimension) expressed in m.u.; for simplicity, and consistently with real-world applications, prices are assumed to be independent of the time period. These prices are used to summarise the vectors b(i) and c(i) into single numbers expressed in m.u., letting:

b(i) = Σ_{j=1}^{ℓ} p(j) b(j,i)   and   c(i) = Σ_{k=1}^{ℓ′} p′(k) c(k,i).
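The conversion of one period's multi-dimensional effects into m.u. can be sketched as follows; the two benefit dimensions and all figures are invented (the unit prices merely echo French values quoted later in this chapter):

```python
# One period's "social benefits" on two hypothetical dimensions:
# hours of time saved and "statistical" deaths avoided.
b_vec = [120.0, 0.5]        # b(1,i), b(2,i), each in its own unit
p = [74.0, 3_600_000.0]     # p(1), p(2): m.u. per unit on each dimension

# Single number b(i) in m.u., as in the summation above.
b_i = sum(p_j * b_j for p_j, b_j in zip(p, b_vec))
print(b_i)  # 1808880.0 m.u.
```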

After this conversion, and having suitably chosen a social discounting rate r, it is possible to apply the standard discounting formula for computing the Net Present Social Value (NPSV) of a project. We have:

(5.2)    NPSV = Σ_{i=0}^{T} (b(i) − c(i))/(1+r)^i = Σ_{i=0}^{T} [Σ_{j=1}^{ℓ} p(j) b(j,i) − Σ_{k=1}^{ℓ′} p′(k) c(k,i)]/(1+r)^i

where b(i) (resp. c(i)) denotes the social benefits (resp. costs) generated by the project in period i converted into m.u. A project for which NPSV > 0 will be interpreted as improving the welfare of society and, thus, should be implemented (in the absence of other constraints).

It should be observed that the difficulties that we mentioned concerning the computation of the NPV are still present here. Extra difficulties are easily seen to emerge:
• how can one evaluate "benefits" and "costs" from a "social point of view"?
• is it always possible to measure the value of "benefits" and "costs" in monetary units and how should the prices be chosen?
• how is the social discount rate chosen?
It is apparent that CBA is a "mono-criterion" approach that uses "money" as a yardstick. Clearly the foundations of such a method and the way of using it in practice deserve to be clarified.

5.2.3 Theoretical foundations

It is obviously impossible to give here a complete account of the vast literature on the foundations of CBA, which has deep roots in Welfare Economics. We would however like to give a hint of why CBA consistently insists on trying to "price out" every effect of a project. The important point here is that CBA conducts project evaluation within an "environment" in which markets are especially important instruments of social co-ordination. This section presents an elementary theoretical model that helps in understanding the foundations of CBA. It may be skipped without loss of continuity.

An elementary theoretical model

Consider a one-period economy in which m individuals consume n goods that are exchanged on markets. Each individual j is supposed to have completely ordered preferences for consumption bundles. These preferences can be conveniently represented using a utility function Uj(qj1, ..., qjn), where qji denotes the quantity of good i consumed by individual j.

Social preferences are supposed to be well-defined in terms of the preferences of the individuals through a "social utility function" (or "social welfare function") W(U1, U2, ..., Um). It is useful to interpret W as representing the preferences of a "planner" regarding the various "social states".

Starting from an initial situation in the economy, consider a "project", interpreted as an external shock to the economy, consisting in a modification of the quantities of goods consumed by each individual. These modifications are supposed to be marginal, i.e. they will not affect the prices of the various goods. The impact of such a shock on social welfare is given by (assuming differentiability):

(5.3)    dW = Σ_{j=1}^{m} Σ_{i=1}^{n} W_j U_ji dq_ji

where W_j = ∂W/∂U_j and U_ji = ∂U_j/∂q_ji. Social welfare will increase following the shock if dW > 0.

The existence of markets for the various goods and the hypothesis that individuals operate on these markets so as to maximise utility ensure that, before the shock, we have, for all individuals j and for all goods i and k:

(5.4)    U_ji / p_i = U_jk / p_k

where p_i denotes the price of the ith good. Having chosen a particular good for numéraire (we shall call that good "money"), this implies that:

(5.5)    U_ji = λ_j p_i

where λ_j can be interpreted as the marginal effect on the utility of individual j of a marginal variation of the consumption of the numéraire good, i.e. as the marginal utility of "income" for individual j. Using 5.5, equation 5.3 can be rewritten as:

(5.6)    dW = Σ_{j=1}^{m} λ_j W_j Σ_{i=1}^{n} p_i dq_ji

In equation 5.6, the coefficient λ_j W_j has a useful interpretation: it represents the increase in social welfare following a marginal increase of the income of individual j. Under the hypothesis that, before the shock, the distribution of income is "optimal" in the society, the conclusion is that the coefficients λ_j W_j are constant over individuals (otherwise income would have been reallocated in favour of individuals for which λ_j W_j is the larger). Under this hypothesis, we may always normalise W in such a way that λ_j W_j = 1, for all j. We therefore rewrite equation 5.6 as:

(5.7)    dW = Σ_{j=1}^{m} Σ_{i=1}^{n} p_i dq_ji

which amounts to saying that the social effects of the shock are measured as the sum over individuals of the variation of their consumption evaluated at market prices (i.e. the so-called consumer surplus). In this simple model, variations of social welfare are therefore conveniently measured in money terms using market prices. Equation 5.7 coincides with the computation of the NPSV when time is not an issue and the effects (costs or benefits) of a project can be expressed in terms of consumption of goods exchanged on markets. In spite of all its limitations, our model allows us to understand, through the simple derivation of equation 5.7, the rationale for trying to price out all effects of a project in order to assess its contribution to social welfare.

Extensions and remarks

The limitations of the elementary model presented above are obvious. The most important ones seem to be the following:
• the model only deals with marginal changes in the economy;
• the model considers a single-period economy without production;
• the economy is closed (no imports or exports) and there is no government (and in particular no taxes);
• the distribution of income was assumed to be optimal.

The general formula for computing the NPSV may be seen as an extension of 5.7 without these restrictions. A detailed treatment of the foundations of CBA without our simplifying hypotheses can be found in Drèze and Stern (1987). Although we shall not enter into details, it should be emphasised that the theoretical foundations of CBA are controversial on some important points.

Returning to CBA, the appropriateness of equation 5.7 and of related formulas is particularly clear in situations that are fairly different from the ones in which CBA is currently used as an evaluation tool. The latter are often characterised by:
• non-marginal changes (think of the construction of a new underground line in a city);
• the presence of numerous public goods for which no market price is available (think of health services or education);
• the presence of numerous externalities (think of the pollution generated by a new motorway);

• markets in which competition is altered in many ways (monopolies, taxes, regulations);
• the overwhelming presence of uncertainty (technological changes, future prices, long term effects of air pollution on health);
• effects that are very unevenly distributed among individuals and raise important equity concerns (think of your reaction if a new airport were to be built close to your second residence in the middle of the countryside);
• effects that are highly complex and may concern a very long period of time (think of a policy for storing used nuclear fuel);
• the difficulty of evaluating some effects in well-defined units (think of the aesthetic value of the countryside) and, thus, of pricing them out.

In spite of these difficulties, CBA still mainly rests on the use of the NPSV (or some of its extensions) to evaluate projects. Economists have indeed developed an incredible variety of tools in order to use the NPSV even in situations in which it would a priori seem difficult to do so. This includes: the determination of prices for "goods" without markets, through contingent valuation techniques or hedonic prices (see Scotchmer 1985, Champ, Loomis, Peterson, Brown and Lucero 1998); the treatment of uncertainty and the consideration of irreversible effects (e.g. through the use of option values); the inclusion of equity considerations in the calculation of the NPSV (Brent 1984); and the determination of an appropriate social discounting rate (useful references on this controversial topic include Harvey 1992, Harvey 1995, Keeler and Cretin 1983, Weitzman 1994). It is impossible to review here the immense literature that these efforts have generated; an overview may be found in Sugden and Williams (1983) and in Zerbe and Dively (1994). We will simply illustrate some of these points in section 5.3.

5.3 Some examples in transportation studies

Public investment in transportation facilities amounts to over 80·10⁹ FRF annually in France (around 14·10⁹ USD or 14·10⁹ €). CBA is presently the standard evaluation technique for such projects; the present practice is based on a number of real-world applications (a useful reference in English is Adler (1987)). It is impossible to give a detailed account of how CBA is currently applied in France for the evaluation of transportation investment projects: this would take an entire book even for a project of moderate importance. In order to illustrate the type of work involved in such studies, we shall only take a few examples (for more details, see Boiteux (1994) and Syndicat des Transports Parisiens (1998)) and, for concreteness, we shall envisage a project consisting in the extension of an underground line in the suburbs of Paris. Effects of such a project are clearly very diverse, leaving direct financial effects aside (construction costs, maintenance costs, exploitation costs), although their evaluation may raise problems. We will concentrate on some of them here.

5.3.1 Prevision of traffic

An inevitable step in all studies of this type is to forecast the modification of the volume and the structure of the traffic that would follow the implementation of the project. Local modifications in the offer of public transportation may have consequences on the traffic in the whole region. Traffic forecast models usually involve highly complex modal choice modules coupled with forecasting and/or simulation techniques. Implementing such forecasting models is obviously an enormous task. Furthermore, such forecasts are usually made at an early stage of development of the project, a stage in which all details (concerning e.g. the tariffing of the new infrastructure or the frequency of the trains) may not be completely decided yet.

Nearly all public transportation firms and governmental agencies in France have developed their own tools for generating traffic forecasts. They differ on many points, e.g. the statistical tools used for modal choice or the segmentation of the population that is used (Boiteux 1994). Unsurprisingly these models lead to very different results. None of them seem to integrate the potential modifications of behaviour of a significant proportion of the population in reaction to the new infrastructure (e.g. by moving away from the centre of the city), whereas such effects are well-known and have proved to be overwhelming in the past. Moreover, all these models forecast the traffic for a period of time that is not too distant from the installation of the new infrastructure. Although this seems reasonable, these forecasts are then more or less mechanically updated (e.g. increased following the observed rate of growth of the traffic in the past few years) in order to obtain figures for all the periods of study. These models are not part of CBA and indicating their limitations should not be seen as a criticism of CBA. Their results, however, form the basis of the evaluation model and their outputs are clearly crucial for the rest of the study.

5.3.2 Time gains

Traffic forecasts are used to evaluate the time that inhabitants of the Paris region would gain with the extension of the metro line. The main "benefits" of such a project consist in "time gains", which are obviously directly related to traffic forecasts (time gains converted into m.u. frequently account for more than 50% of the benefits of these types of projects). In most models time gains are evaluated on the basis of what is called "generalised time", i.e. a measure of time that accounts for elements of (dis)comfort of the journey (e.g. stairs to be climbed, temperature, a more or less crowded environment). As far as we know, much less effort has been devoted to the study of models allowing to convert time into generalised time than to the "price of time" that will be used afterwards. Such evaluations, on top of being technically rather involved, raise some basic difficulties:

• is one minute equal to one minute? Such a question may not be as silly as it seems;

• is one hour worth 60 times one minute? Most models evaluating and pricing out time gains are strictly linear. This is dubious since some gains (e.g. 10 seconds per user-day) might well be considered insignificant; furthermore, the loss of one hour daily for some users may have a much greater impact than 60 losses of 1 minute;
• what is the value of time and how should time gains be converted into monetary units? Should we take the fact that people have different salaries into account? Should we rather use prices based on "stated preferences"? Should we take into account the fact that most surveys using stated preferences have shown that the value of time highly depends on the motive of the journey (being much lower for journeys not connected to work)?

The present practice in the Paris region is to linearly evaluate all (generalised) time gains using the average hourly net salary in the Region (74 FRF/hour in 1994, approximately 13 USD/hour or 13 €/hour). In view of the major uncertainties surrounding the traffic forecasts that are used to compute the time gains, and of the arbitrariness of the "price of time" that is used, it does not seem unfair to consider that such evaluations give, at best, interesting indications.

5.3.3 Security gains

Important benefits of projects in public transportation are "security gains" (hopefully, using the metro is far less risky than driving a car). A first step consists in evaluating, based on traffic forecasts, the gain of security in terms of the number of ("statistical") deaths and serious injuries that would be avoided annually by the project. The following one consists in converting these figures into monetary units through the use of a "price for human life". Although this might not appear as a very pleasant subject of study, economists have developed many different methods for evaluating the value of human life, including methods based on "human capital", revealed preference approaches (including smoking and driving behaviour and wages for activities involving risk, Viscusi 1992), stated preference approaches, the value of life insurance contracts and sums granted by courts following accidents. The following figures are presently used in France (in 1993 FRF; they should be divided by a little less than 6 in order to obtain 1993 USD):

Death             3 600 000 FRF
Serious injury      370 000 FRF
Other injury         79 000 FRF

these figures being based on several stated preference studies (it is not without interest to note that these figures were quite different before 1993, human life being, at that time, valued at 1 866 000 FRF). Using these figures and combining them with statistical information concerning the occurrence of car accidents and their severity leads, in the Paris region, to benefits in terms of security which amount to 0.08 FRF per vehicle-km avoided.
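As a sketch of the two steps just described, the official 1993 prices can be combined with annual numbers of avoided casualties; the prices are the ones quoted above, while the avoided counts are invented:

```python
# Official 1993 prices (FRF) quoted above.
PRICE_FRF = {"death": 3_600_000, "serious_injury": 370_000, "other_injury": 79_000}

# Step 1 (hypothetical output of a traffic/accident model): annual reductions.
avoided = {"death": 2.5, "serious_injury": 14.0, "other_injury": 60.0}

# Step 2: conversion into monetary units.
security_benefit = sum(PRICE_FRF[k] * avoided[k] for k in PRICE_FRF)
print(security_benefit)  # 18920000.0 FRF per year
```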

As is apparent in Syndicat des Transports Parisiens (1998), these studies exhibit incredible variations across techniques and across seemingly similar countries. This explains why, in many medical studies in which "benefits" mainly include lives saved, "cost-effectiveness" analysis is often preferred to CBA, since it does not require pricing out human life (see Johannesson 1995, Weinstein and Stason 1977); besides, pricing out human life raises serious ethical difficulties (Broome 1985). We reproduce below some significant figures for the value of life used in several European countries (this table is adapted from Syndicat des Transports Parisiens 1998; all figures are in 1993 European Currency Units (ECU), one 1993 ECU being approximately one 1993 USD):

Country      Price of human life
Denmark          628 147 ECU
Finland        1 414 200 ECU
France           600 000 ECU
Germany          406 672 ECU
Portugal          78 230 ECU
Spain            100 529 ECU
Sweden           984 940 ECU
UK               935 149 ECU

5.3.4 Other effects and remarks

The inclusion of other effects in the computation of the NPSV of a project in such studies raises difficulties similar to the ones mentioned for time gains and security gains. As is apparent in Syndicat des Transports Parisiens (1998), the prices used to "monetarise" effects like:
• noise,
• local air pollution,
• contribution to the greenhouse effect,
are mainly conventional. Their evaluation is subject to much uncertainty and inaccurate determination. Moreover, the "prices" that are used to convert them into monetary units can be obtained using many different methods leading to significantly different results.

The social discounting rate used for such projects is determined by the government (the "Commissariat Général du Plan"). Presently a rate of 8% is used (note that this rate is about twice as high as the rate commonly used in Germany). A period of evaluation of 30 years is recommended for this type of project.
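With the 8% rate and the recommended 30-year evaluation period, the implicit weighting of time is easy to exhibit:

```python
r = 0.08  # social discounting rate currently used in France (see above)

# Discount factor 1 / (1 + r)**i for each year of the 30-year evaluation period.
weights = [1 / (1 + r) ** i for i in range(31)]
print(round(weights[30], 3))  # a benefit in year 30 counts for about a tenth of one today
```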

The conclusions and recommendations of a recent official report on the evaluation of public transportation projects (Boiteux 1994) stated that:
• although CBA has limitations, it remains the best way to evaluate such projects;
• all effects that can reasonably be monetarised should be included in the computation of the NPSV;
• all other effects should be described verbally; monetarised effects and non-monetarised ones should not be included in a common table that would give the same statute and, implicitly, importance to all, and a multiple criteria presentation would furthermore attribute an unwarranted scientific value to such tables;
• extensive sensitivity analyses should be conducted;
• CBA studies should remain as transparent as possible;
• an independent group of CBA experts should evaluate all important projects;
• all public firms and administrations should use a similar methodology in order to allow meaningful comparisons.

In view of:
• the immense complexity of such evaluation studies,
• the unavoidable elements of uncertainty and inaccurate determination entering the evaluation model,
• the rather unconvincing foundations of CBA for this type of project,
the conclusion that CBA remains the "best" method seems unwarranted.

5.4 Conclusions

CBA is an important decision/evaluation method. CBA has often been criticised on purely ideological grounds, which seems ridiculous; however, the insistence on seeing CBA as a "scientific", "rational" and "objective" evaluation model, all words that are frequently spotted in texts on CBA (Boiteux 1994), seems no more convincing. We would like to note in particular that:
• it has a sound, although limited and controversial on some points, theoretical basis; contrary to many other decision/evaluation methods that are more or less ad hoc, the users of CBA can rely on more than 50 years of theoretical and practical investigations;
• CBA emphasises the fact that decision and/or evaluation methods are not context-free. Having emerged from economics, it is not surprising that markets and prices are viewed as the essential parts of the environment in CBA. More generally, any decision/evaluation method that would claim to be context-free would seem of limited interest to us;

• CBA emphasises the need for consistency in decision-making. It aims at providing simple tools allowing, in a decentralised way, to ensure a minimal consistency between decisions taken by various public bodies. Although other means of evaluation and of social co-ordination (e.g. elections, negotiation, exercise of power) clearly exist, this is worth recalling in view of the popularity of purely financial analyses for public sector projects (Johannesson 1995);
• CBA explicitly acknowledges that the effects of a project may be diverse and that all effects should be taken into account in the model;
• CBA is a formal method of decision/evaluation. As we shall see in chapter 9, formal methods based on an explicit logic can provide invaluable contributions allowing sensitivity analyses, promoting constructive dialogue and pointing out crucial issues. It is the belief and experience of the authors of this book that such methods may have a highly beneficial impact on the treatment of highly complex questions;
• although the implementation of CBA may involve highly complex models (e.g. traffic forecasts), the underlying logic of the method is simple and easily understandable.

We already mentioned that we disagree with the view held by some economists that CBA is the only "rational", "scientific" and "objective" method for helping decision-makers (such views are explicitly or implicitly present in Boiteux (1994) or Mishan (1982)); we strongly recommend Dorfman (1996) as an antidote to this radical position. Having sound theoretical foundations is probably a necessary but insufficient condition to build useful decision/evaluation tools (let alone the "best" ones). We shall stress here why we think that decision/evaluation models should not be confused with CBA:

• supporting decision/evaluation processes involves many more activities than just "evaluation". The determination of the "frontiers" of the study and of the various stakeholders, the modelling of their objectives and the invention of alternatives form an important (we would tend to say a crucial) part of any decision/evaluation support study; "formulation" is a basic activity of any analyst, and CBA offers little help at this stage. Creativity, flexibility and reactivity are essential ingredients of the process. A recurrent theme in OR is that a successful implementation of a model is contingent on many other factors than just the quality of the underlying method, and decision processes do not always seem to be compatible with a too rigid view on what a "good decision/evaluation model" should be. Even worse, too radical an interpretation of CBA might lead (Dorfman 1996) to an excessive attention given to monetarisation, which may be detrimental to an adequate formulation;
• the foundations of CBA are especially strong in situations that are at variance with the usual context of public sector projects:

non-marginal changes, public goods and externalities are indeed pervasive in such projects (see Brekke 1997, Holland 1995, Laslett 1995);

• a decision/evaluation tool will be all the more useful that it lends itself easily to an insertion into a decision process. Decision processes involving public sector projects are usually extremely complex: they last for years and involve many stakeholders generally having conflicting objectives. Merging rather uncontroversial information (e.g. the number of deaths per vehicle-km in a given area) with much more sensible and debatable information (e.g. the price of human life) from the start might not give many opportunities to stakeholders for reaching partial agreements and/or for starting negotiations. This leaves little room for political debate, which might be an incentive for some stakeholders to simply discard CBA;

• CBA is a mono-criterion approach: it tries to summarise the effects of complex projects into a single number. Although this allows to produce outputs in simple terms (the NPSV), it might be argued that the efforts that have to be made in order to monetarise all effects may not always be needed. On the basis of less ambitious methods, it is not unlikely that some projects may be easily discarded and/or that some clearly superior project will emerge. Even when monetarisation is reasonably possible, it may not always be necessary. Furthermore, the complex calculations leading to the NPSV use a huge amount of "data" with varying levels of credibility. This might also result in a model that might not appear transparent enough to be really convincing (Nyborg 1998);

• in CBA the use of "prices" supposedly revealed by markets (most often in "market-like" mechanisms) tends to obscure the, implicit, weighting of the various effects of a project;

• the additive linear structure of the, implicit, aggregation rule used in CBA can be subjected to the familiar criticisms already mentioned in chapters 3 and 4. If there are limits to linearity, CBA offers almost no clue as to where to place these limits, and it would seem to be a heroic hypothesis to suppose that such limits are simply never reached in practice. Probably all users of CBA would agree that an accident killing 10 000 people might result in a dramatic situation in which the "costs" incurred have little relation with the "costs" of 10 000 accidents each resulting in one loss of life (think of a serious nuclear accident compared to "ordinary" car accidents). Similarly, they might be prepared to accept that there may exist air pollution levels above which all mammal life on earth could be endangered and that, although these levels are multiples of those currently manipulated in the evaluation of transportation projects, they may have to be priced out quite differently;

• the implicit position of CBA vis-à-vis distributional considerations is puzzling. Although the possibility of including in the computation of the NPSV individual "weights" (capturing a different impact on social welfare of individual variations of income) exists (Brent 1984), it is hardly ever used in practice. Furthermore, this possibility is at much variance with more subtle views

• the use of a simple "social discounting rate" as a surrogate for taking a clear position on inter-generational equity issues is open to discussion. Even accepting the rather optimistic view of a continuous increase of welfare and of technical innovation, taking decisions today that will have important consequences in 1000 years (think of the storage of used nuclear fuel) while using a method that gives almost no weight to what will happen 60 years from now (1/1.08^60 ≈ 1%) seems debatable (see Harvey 1992, Harvey 1994, Gafni and Birch 1997, Eeckoudt and Gollier 1997, Schieber and Schneider, Weitzman 1994).

• decision/evaluation models can hardly lead to convincing conclusions if elements of uncertainty and inaccurate determination entering the model are not explicitly dealt with. Practical texts on CBA always insist on the need for sensitivity analysis before coming to conclusions and recommendations. Due to the amount of data of varying quality included in the computation of the NPSV, however, sensitivity analysis is often restricted to studying the impact of the variation of a few parameters on the NPSV, one parameter varying at a time. This is rather far from what we could expect in such situations: a true "robustness analysis" should combine simultaneous variations of all parameters in a given domain.

• the very idea that "social preferences" exist is open to question. We showed in chapter 2 that "elections" were not likely to give rise to such a concept, and we doubt that markets are such particular institutions that they always allow to solve or bypass the problem in an undebatable way; a sizeable formal literature bears on equity and distributional considerations (see Fishburn 1984, Fishburn and Sarin 1991, Fishburn and Sarin 1994, Fishburn and Straffin 1989, Weymark 1981). But if "social preferences" are ill-defined, the meaning of the NPSV of a project is far from being obvious. We would argue that it gives, at best, a partial and highly conventional view of the desirability of the project. This is especially true in the context of the evaluation of public sector projects.

These limitations should not be interpreted as implying a condemnation of CBA: it seems hard to think of other forms of social co-ordination that could do much better. We consider them as arguments showing that, in spite of its many qualities, CBA is far from exhausting the activity of supporting decision/evaluation processes (Watson 1981). We are afraid to say that if you disagree on this point, you might find the rest of this book of extremely limited interest. On the other hand, if you expect to discover in the next chapters formal decision/evaluation tools and methodologies that would "solve all problems and avoid all difficulties", you should also realise that your chances of being disappointed are very high.
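The discounting arithmetic invoked above is easy to verify; a minimal sketch, assuming only the 8% yearly rate used in the text (the function is a standard present-value computation, not code from the book):

```python
def discount_factor(rate: float, years: int) -> float:
    """Present value today of 1 unit of benefit received `years` from now."""
    return 1.0 / (1.0 + rate) ** years

weight_year_60 = discount_factor(0.08, 60)
# ~0.0099: at 8% per year, what happens 60 years from now gets about 1%
# of the weight given to the present.
```

The same function shows how brutally the horizon is cut off: at 8%, even year 30 already counts for only about 10% of the present.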

6 COMPARING ON THE BASIS OF SEVERAL ATTRIBUTES: THE EXAMPLE OF MULTIPLE CRITERIA DECISION ANALYSIS

6.1 Thierry's choice

How to choose a car is probably the multiple criteria problem example that has been most frequently used to illustrate the virtues and possible pitfalls of multiple criteria decision aiding methods. The main advantage of this example is that the problem is familiar to most of us (except for one of the authors of this book, who is definitely opposed to owning a car), and it is especially appealing to male decision-makers and analysts for some psychological reason. However, one can object that in many illustrations the problem is too roughly stated to be meaningful. One point should be made very clear: it is unlikely that a car could be universally recognised as the best; this is a consequence of the existence of decision-makers with many different "value systems". The motivations, needs, desires and/or phantasms of the potential buyer of a new or second-hand car can be so diversified that it will be very difficult to establish a list of relevant points of view and build criteria on which everybody would agree, even if one restricts oneself to a segment of the market. The price, for instance, is a very delicate criterion, since the amount of money the buyer is ready to spend clearly depends on his social condition. The relative importance of the criteria also very much depends on the personal characteristics of the buyer: there are various ideal types of car buyers, for instance people who like sportive car driving, or large comfortable cars, or reliable cars, or cars that are cheap to run.

Despite these facts, we have chosen to use the "Choosing a car" example, in a properly defined context, for illustrating the hypotheses underlying various elementary methods for modelling and aggregating evaluations in a decision aiding process. The case is simple enough to allow for a short but complete description; it also offers sufficient potential for reasoning on quite general problems raised by the treatment of multi-dimensional data in view of decision and evaluation. We describe the context of the case below and will invoke it throughout this chapter for illustrating a sample of decision aiding methods.

6.1.1 Description of the case

Our example is adapted from an unpublished report by a Belgian engineering student who describes how he decided which car he would buy; the story dates back to 1993. Our student (call him Thierry), aged 21, is passionate about sportive cars and driving (he has taken lessons in sports car driving and participates in car races). Being a student, he cannot afford to buy either a new car or a luxury second-hand sports car, so he decides to explore the middle range segment: 4 year old cars with powerful engines. His strategy is first to select the make and type of the car on the basis of its characteristics, estimated costs and performances, then to look for such a car in second hand car sale advertisements. This is what he actually did, finding "the rare pearl" about twelve months after he made up his mind as to which car he wanted.

Selecting the alternatives

The initial list of alternatives was selected taking an additional feature into account. Thierry intends to use the car in everyday life and occasionally in competitions; besides, he lives in town and does not have a garage to park the car in at night, so he does not want a car that would be too attractive to thieves. This explains why he discards cars like the VW Golf GTI or the Honda CRX. He thus limits his selection of alternatives to the 14 cars listed in Table 6.1.

 1  Fiat Tipo 2.0 ie 16V
 2  Alfa 33 1.7 16V
 3  Nissan Sunny 2.0 GTI 16
 4  Mazda 323 GRSI
 5  Mitsubishi Colt GTI
 6  Toyota Corolla GTI 16
 7  Honda Civic VTI 16
 8  Opel Astra GSI 16
 9  Ford Escort RS 2000
10  Renault 19 16S
11  Peugeot 309 GTI 16V
12  Peugeot 309 GTI
13  Mitsubishi Galant GTI 16
14  Renault 21 2.0 turbo

Table 6.1: List of the cars selected as alternatives

Selecting the relevant points of view and looking for or constructing indices that reflect the performances of the alternatives for each of the viewpoints often constitutes a long and delicate task; it is moreover a crucial one, since the quality of the modelling will determine the relevance of the model as a decision aiding tool. Many authors have advocated a hierarchical approach to criteria building, each viewpoint being decomposed into sub-points that can be further decomposed.

A thorough analysis of the properties required of the family of criteria selected in any particular context (consistent family: exhaustive, non-redundant and monotonic) can be found in Roy and Bouyssou (1993) (see also Keeney and Raiffa (1976), Saaty (1980) and, for a survey, Bouyssou (1990)). We shall not emphasise the process of selecting viewpoints in this chapter, although it is a matter of importance. It is sufficient to say that Thierry's concerns are very particular and that he accordingly selected five viewpoints, related to cost (criterion 1), performance of the engine (criteria 2 and 3) and safety (criteria 4 and 5).

Evaluating the alternatives

Evaluating the expenses incurred by buying and using a specific car is not as straightforward as it may seem. Thierry evaluates the expenses as the sum of an initial fixed cost and of the expenses resulting from using the car. The fixed costs are the amount paid for buying the car, estimated by the official quotation of the 4-year old vehicle, plus various taxes; the official quotation of second hand vehicles of various ages is published in monthly journals specialised in the benchmarking of cars. The yearly costs involve another tax, insurance and petrol consumption; maintenance costs are considered roughly independent of the car and hence neglected. Petrol consumption is estimated on the basis of three figures that are highly conventional: the number of litres of petrol burned per 100 km is taken from the magazine benchmarks, Thierry somehow estimates his mileage at 12 000 km per year and the price of petrol at 0.9 € per litre (1 €, the European currency unit, is approximately equivalent to 1 USD). Finally, he expects (hopes) to use the car for 4 years. The resale value of the car after 8 years is not taken into account, due to the high risk of accidents resulting from Thierry's offensive driving style.

On the basis of these hypotheses, he gets the estimations of his expenses for using the car during 4 years that are reported in Table 6.2 (Criterion 1 = Cost). Note that the petrol consumption cost, which is estimated with a rather high degree of imprecision, counts for about one third of the total cost. Large variations from the estimation may occur due to several uncertainty and risk factors, such as the actual life-length of the car, the actual mileage per year, the actual selling price (in contrast to the official quotation), etc.; the purchase cost is also highly uncertain. Some of these values may be imprecisely determined: they may be biased when provided by the car manufacturer (the procedures for evaluating petrol consumption are standardised but usually underestimate the actual consumption for everyday use); when provided by specialised journalists in magazines, the procedures for measuring are generally unspecified and might vary, since the cars are not all evaluated by the same person.

For building the other criteria, Thierry has at his disposal a large number of performance indices whose values are to be found in the magazine benchmarks. Thierry's particular interest in sporty cars is reflected in his definition of these criteria. Car performances are evaluated by their acceleration: criterion 2 ("Accel" in Table 6.2) encodes the time (in seconds) needed to cover a distance of one kilometre starting from rest. One could alternatively have taken other indicators, such as the power of the engine or the time needed to reach a speed of 100 km/h or to cover 400 metres, which are also widely available.

The third criterion that Thierry took into consideration is linked with the pick up or suppleness of the engine in urban traffic: cars that are specially prepared for competition may lack suppleness in low operation conditions, which is quite unpleasant in urban traffic; this dimension is considered important since Thierry also intends to use his car in normal traffic. The indicator selected to measure this dimension ("Pick up" in Table 6.2) is the time (in seconds) needed for covering one kilometre when starting in fifth gear at 40 km/h. Again, other indicators could have been chosen (e.g. the torque). This dimension is not independent of the second criterion, since the two are generally positively correlated (powerful engines generally lead to quick response times on both criteria); from the point of view of the user, however, criteria 2 and 3 reflect different requirements and are thus both necessary. For a short discussion about the notions of independence and interaction, the reader is referred to Section 6.2.4 below.

In view of Thierry's particular motivations, only the qualities of braking and of road-holding are of concern to him among the remaining dimensions; they lead to the building of criteria 4 and 5 (resp. "Brakes" and "Road-h" in Table 6.2). In the magazine's evaluation report, several other dimensions are investigated, such as comfort, body, boot, brakes, road-holding behaviour, equipment, finish, maintenance, etc. For each of these, a number of partial aspects are considered: 10 for comfort, 3 for brakes, 4 for road-holding, etc. The 3 or 4 partial aspects of each retained viewpoint are evaluated on an ordinal scale, the levels of which are labelled "serious deficiency", "below average", "average", "above average" and "exceptional". To get an overall indicator of braking quality (and also of road-holding), Thierry re-codes the ordinal levels with integers from 0 to 4 and takes the arithmetic mean of the 3 or 4 numbers.

Nr  Name of cars       Crit1   Crit2  Crit3    Crit4   Crit5
                       Cost    Accel  Pick up  Brakes  Road-h
 1  Fiat Tipo          18 342  30.7   37.2     2.33    3
 2  Alfa 33            15 335  30.2   41.6     2       2.5
 3  Nissan Sunny       16 973  29     34.9     2.66    2.5
 4  Mazda 323          15 460  30.4   35.8     1.66    1.5
 5  Mitsubishi Colt    15 131  29.7   35.6     1.66    1.75
 6  Toyota Corolla     13 841  30.8   36.5     1.33    2
 7  Honda Civic        18 971  28     35.6     2.33    2
 8  Opel Astra         18 319  28.9   35.3     1.66    2
 9  Ford Escort        19 800  29.4   34.7     2       1.75
10  Renault 19         16 966  30     37.7     2.33    3.25
11  Peugeot 309 16V    17 537  28.3   34.8     2.33    2.75
12  Peugeot 309        15 980  29.6   35.3     2.33    2.75
13  Mitsubishi Galant  17 219  30.2   36.9     1.66    1.25
14  Renault 21         21 334  28.9   36.7     2       2.25

Table 6.2: Data of the "choosing a car" problem

This re-coding, with integers from 0 to 4 for the ordinal levels and the arithmetic mean of the 3 or 4 numbers, results in the figures with 2 decimals provided in the last two columns of Table 6.2. Such a transformation of the data is not always innocent; we briefly discuss this point below. These numbers are obviously imprecise, not necessarily because of imprecision in the evaluations, but because of the arbitrary character of the cardinal re-coding of the ordinal information and of its aggregation via an arithmetic mean (postulating implicitly that, in terms of preferences, the 3 components of each viewpoint are equally important and that the levels of each of the three scales are equally spaced). An appreciation (more or less explicit) of their degree of precision and their reliability is intrinsically part of these data: it is clear that not too much confidence should be awarded to the precision of these "evaluations". We shall however consider that these figures reflect, in some way, the behaviour of each car from the corresponding viewpoint. This completes the description of the "data" which, obviously, are not given but selected and elaborated on the basis of the available information.

6.1.2 Reasoning with preferences

In the second part of the presentation of this case, Thierry will provide information about his preferences. In fact, in the relatively simple decision situation he was facing ("no wife, no boss": Thierry decides for himself and the consequences of his decision should not affect him crucially), he was able to make up his mind without using any formal aggregation method. Let us follow his reasoning.

First of all, he built a graphic representation of the data. Many types of representations can be thought of, and popular spreadsheet software offers a large number of graphical options for representing multi-dimensional data. Figure 6.1 shows such a representation. Note that the first 3 criteria have to be minimised while the last 2 must be maximised, and that the evaluations for the various criteria have been re-scaled in view of a better readability of the figure: the values for all criteria have been mapped (linearly) onto intervals of length 2, the first criterion being represented in the [0, 2] interval, the second criterion in the [2, 4] interval, and so on. For each criterion, the lowest evaluation observed for the sample of cars is mapped on the lower bound of the interval while the highest value is represented on the upper bound.

In view of reaching a decision, Thierry first discards the cars whose braking efficiency and road-holding behaviour is definitely unsatisfactory, i.e. car numbers 4, 5, 6, 8, 9 and 13. The reason for such an elimination is that a powerful engine is needless in competition if the chassis is not good enough and does not guarantee good road-holding; obviously, efficient brakes are also needed to keep the risk inherent in competition at a reasonable level. The rules for discarding the above mentioned cars have not been made explicit by Thierry in terms of unattained levels on the corresponding scales. Rules that would restate the set of remaining cars are for instance:

criterion 4 ≥ 2  and  criterion 5 ≥ 2,

with at least one strict inequality.
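The non-innocence of the cardinal re-coding discussed above can be made concrete: equally spaced integer codes and an alternative, unequally spaced coding can order two (hypothetical) braking profiles differently. A minimal sketch, with invented profiles:

```python
LEVELS = ["serious deficiency", "below average", "average",
          "above average", "exceptional"]

def score(labels, coding):
    """Arithmetic mean of the numeric codes of the partial aspects."""
    return sum(coding[l] for l in labels) / len(labels)

equal = {l: i for i, l in enumerate(LEVELS)}     # 0, 1, 2, 3, 4 (Thierry's choice)
skewed = dict(zip(LEVELS, [0, 1, 2, 3, 8]))      # "exceptional" made to stand out

# Two hypothetical braking profiles (3 partial aspects each):
car_a = ["above average", "above average", "above average"]
car_b = ["average", "above average", "exceptional"]

# With equally spaced codes both cars score 3.0 (a tie); with the skewed
# coding, car_b comes out ahead, so the comparison depends on the coding.
```

The "skewed" coding is our illustrative assumption; the point is only that the tie (or even the order) between alternatives is an artefact of a coding that the ordinal labels themselves do not determine.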

Figure 6.1: Performance diagram of all cars along the first three criteria (above, to be minimised) and the last two (below, to be maximised)

Looking at the performances of the remaining cars, those labelled 1, 2 and 10 are further discarded. The set of remaining cars is restated for instance by the rule:

criterion 2 < 30

Finally, the car labelled 14 is eliminated since it is dominated by car number 11. "Dominated by car 11" means that car 11 is at least as good as car 14 on all criteria and better on at least one criterion (here, all of them!). Notice that car number 14 would not have been dominated if other criteria had been taken into consideration, such as comfort or size: this car is indeed bigger and more classy than the other cars in the sample.

The cars left after the above elimination process are those labelled 3, 7, 11 and 12; their performances are shown on Figure 6.2. In these star-diagrams, each car is represented by a pentagon, its values on each criterion having been linearly re-scaled, being mapped on the [1, 3] interval. On each axis, the value 1 corresponds to the lowest value observed on that criterion for one of the cars in the initial set of 14 alternatives, while the value 3 corresponds to the highest such value. The choice of the interval [1, 3] instead of the interval [0, 2] is dictated by the mode of representation: the value "0" plays a special role since it is common to all axes; if an alternative was to receive a 0 value on several criteria, those evaluations would all be represented by the origin, which makes the graph less readable. Although the eliminated cars are no longer displayed, these still constitute reference points in relation to which the selected cars are evaluated. This suggests that the evaluations of the selected cars should not be transformed independently of the values of the cars in the initial set; in this way, Thierry was still able to compare the difference in the performances of two candidate cars on a criterion to typical differences for that criterion in the initial sample.

Thierry did not use the latter diagram (Figure 6.2). He drew the same diagram as in Figure 6.1 instead, after reordering the cars so that the 4 candidate cars were all put on the right of the diagram, as shown in Figure 6.3. On Figure 6.4 we show, for the reader's convenience, a close-up of Figure 6.3 focused on the 4 selected cars only. In interpreting the diagrams, remember that criteria 1, 2 and 3 are to be minimised while the others have to be maximised.

Here are the reasons for Thierry's final decision. He first eliminates car number 12 on the basis of its relative weakness on the second criterion (acceleration). Among the 3 remaining cars, the one he chooses is number 11. Comparing cars 7 and 11, he considers that the cost difference (car 7 is about 1 500 € more expensive) is not balanced by the small advantage on acceleration (.3 second), coupled with a definite disadvantage (.8 second) on suppleness. Comparing cars 3 and 11, Thierry considers that the price difference (about 500 €) is worth the gain (.7 second) on the acceleration criterion.
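The dominance test just used ("at least as good on all criteria and better on at least one") can be written directly; a sketch in which the two evaluation vectors follow Table 6.2, with criteria in the order cost, accel, pick up, brakes, road-h (the first three minimised, the last two maximised):

```python
SENSE = (-1, -1, -1, 1, 1)   # -1: criterion to minimise, +1: to maximise

def dominates(x, y):
    """True if x is at least as good as y everywhere and strictly better somewhere."""
    at_least = all(s * a >= s * b for s, a, b in zip(SENSE, x, y))
    strictly = any(s * a > s * b for s, a, b in zip(SENSE, x, y))
    return at_least and strictly

car11 = (17_537, 28.3, 34.8, 2.33, 2.75)   # Peugeot 309 GTI 16V
car14 = (21_334, 28.9, 36.7, 2.00, 2.25)   # Renault 21

# Here car 11 is strictly better on every criterion, so it dominates car 14.
```

Note that dominance is a deliberately weak, uncontroversial relation: it requires no weights and no inter-criteria trade-offs, which is why it is usually applied before any more debatable aggregation.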

Figure 6.2: Star graph of the performances of the 4 cars left after the elimination process (one pentagon per car: Nissan Sunny 2.0 GTI 16V, Honda Civic VTI 16, Peugeot 309 GTI 16V, Peugeot 309 GTI; axes: crit 1 (cost), crit 2 (accel), crit 3 (supple), crit 4 (brakes), crit 5 (road-h))

Nr  Name of car    Crit1   Crit2  Crit3  Crit4   Crit5
                   Cost    Acc    Pick   Brakes  Road
 3  Nissan Sunny   16 973  29     34.9   2.66    2.5
 7  Honda Civic    18 971  28     35.6   2.33    2
11  Peugeot 16V    17 537  28.3   34.8   2.33    2.75
12  Peugeot        15 980  29.6   35.3   2.33    2.75

Table 6.3: Performances of the 4 candidate cars

Figure 6.3: Performance diagram of all cars; the 4 candidate cars (Nissan (3), Honda (7), Peu16 (11), Peu (12)) stand on the right

Figure 6.4: Detail of Figure 6.3: the 4 cars remaining after initial screening

Comments

Thierry's reasoning process can be analysed as being composed of two steps.

1. The first one is a screening process, in which a number of alternatives are discarded on the basis of the fact that they do not reach aspiration levels on some criteria. Notice that these levels have not been set a priori as minimal levels of satisfaction: they have been set after having examined the whole set of alternatives, to a value that could be described as both desirable and accessible. The rules that have been used for eliminating alternatives have exclusively been combined in conjunctive mode, since an alternative is discarded as soon as it fails to fulfil one of the rules. More sophisticated modes of combination may be envisaged, for instance mixing up conjunctive and disjunctive modes with aspiration levels defined for subsets of criteria (see Fishburn (1978) and Roy and Bouyssou (1993), pp. 264-266). Another elementary method that has been used is the elimination of dominated alternatives (car 11 dominates car 14).

2. In the second step of Thierry's reasoning, subtle considerations come into play on whether the balance of differences in performance between pairs of cars on 2 or 3 criteria results in an advantage to one of the cars in the pair. Criteria 4 and 5 were not invoked at this stage; there are several possible reasons for this: criteria 4 and 5 might be of minor importance, or considered satisfactory once a certain level is reached, or they could be insufficiently discriminating for the considered subset of cars (this is certainly the case for criterion 4), the differences for the set of candidate cars being not large enough to balance the differences on the other criteria.
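The first, screening step can be sketched in a few lines; the rules are those stated above (criterion 4 ≥ 2 and criterion 5 ≥ 2 with at least one strict inequality, then criterion 2 < 30), while the three evaluation vectors are hypothetical, not rows of the case's tables:

```python
def passes_safety(car):
    """Conjunctive rule on brakes and road-holding, at least one strict."""
    brakes, road = car[3], car[4]
    return brakes >= 2 and road >= 2 and (brakes > 2 or road > 2)

def passes_acceleration(car):
    return car[1] < 30   # seconds to cover 1 km from rest

# Hypothetical vectors (cost, accel, pick up, brakes, road-h):
cars = {
    "kept":     (17_000, 28.5, 35.0, 2.33, 2.75),
    "bad_road": (15_000, 29.0, 35.0, 2.00, 1.75),
    "too_slow": (16_000, 30.2, 36.0, 2.33, 2.50),
}
survivors = [name for name, car in cars.items()
             if passes_safety(car) and passes_acceleration(car)]
# survivors == ["kept"]: failing any single rule is enough to be discarded
```

The purely conjunctive combination is visible in the `and` chaining: a disjunctive variant would replace it by `or`, and mixed modes would apply different connectives to different subsets of criteria.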

3. Since criteria 4 and 5 are aggregates and, thus, are not expressed in directly interpretable units, this might also have been a reason for not exploiting them in the final selection.

We have seen that, after the first step consisting in the elimination of unsatisfactory alternatives, the analysis of the remaining four cars has been much more delicate. This kind of reasoning, which involves comparisons of differences in evaluations, is at the heart of the activity of modelling preferences and of aggregating them in order to have an informed decision process. Note that the reasoning is not made on the basis of re-coded values like those used in the graphics: it is better supported by the original scales.

In the simple case we are dealing with here, the small number of alternatives and criteria has allowed Thierry to make up his mind without having to build a formal model of his preferences. This can be viewed as an ex post analysis of the problem, since the decision was actually made well before Thierry became aware of multiple criteria methods. In his ex post justification study, Thierry has in addition tried to derive a ranking of the alternatives that would reflect his preferences. Note also that if Thierry's goal had been to rank order the cars in order of decreasing preference, it is not sure that the kind of reasoning he used for just choosing the best alternative would have fit the bill. In more complex situations (when more alternatives remain after an initial elimination, when more criteria have to be considered, or when a ranking of the alternatives is wanted), it may appear necessary to use tools for modelling preferences. There is another rather frequent circumstance in which more formal methods are mandatory: if the decision-maker is bound to justify his decision to other persons (shareholders, colleagues, . . . ), the evaluation system should be more systematic, for instance being able to cope with new alternatives that could be suggested by the other people.

In the rest of this chapter, we discuss a few formal methods commonly used for aggregating preferences. We report on how Thierry applied some of them to his case and extrapolate on how he could have used the others.

6.2 The weighted sum

When dealing with multi-dimensional evaluations of alternatives, the basic and almost natural (or perhaps cultural?) attitude consists in trying to build a one-dimensional synthesis, which would reflect the value of the alternatives on a synthetic "super scale of evaluation". This attitude is perhaps inherited from school practice, where all the performance evaluations of the pupils have long been (and often still are) summarised in a single figure, a weighted average of their grades in the various subjects. The problems raised by such a practice have been discussed in depth in Chapter 3. We discuss the application of the weighted sum to the car example below, emphasising the very strong hypotheses underlying the use of this type of approach.

Starting from the standard situation of a set of alternatives a ∈ A evaluated on n points of view by a vector g(a) = (g1(a), g2(a), . . . , gn(a)), we consider the value f(a) obtained by linearly combining the components of g:

    f(a) = k1 g1(a) + k2 g2(a) + . . . + kn gn(a)                    (6.1)

Suppose, without loss of generality, that all criteria are to be maximised, i.e. that the larger the value gi(a), the better the alternative a on criterion i (if, on the contrary, gi were to be minimised, substitute gi by -gi or use a negative weight ki). Once the weights ki have been determined, choosing an alternative becomes straightforward: the best alternative is the one associated with the largest value of f. Similarly, a ranking of the alternatives is obtained by ordering them in decreasing order of the value of f. This simple and most commonly used procedure relies however on very strong hypotheses that can seldom be considered plausibly satisfied. These problems appear very clearly when trying to use the weighted sum approach on the car example.

6.2.1 Transforming the evaluations

A look at the evaluations of the cars (see Table 6.2) prompts a remark that was already made when we considered representing the "data" graphically: the ranges of variation on the scales are very heterogeneous, from 13 841 to 21 334 on the cost criterion and from 1.33 to 2.66 on criterion 4. Clearly, asking for values of the weights ki in terms of the relative importance of the criteria, without referring to the scales, would yield absurd results. The usual way out consists in normalising the values on the scales, but there are several manners of doing this. One consists in dividing gi by the largest value on the ith scale, gi,max; alternatively, one might subtract the minimal value gi,min and divide by the range gi,max - gi,min. These normalisations of the original gi functions are respectively denoted g'i and g''i in the following formulae:

    g'i(a) = gi(a) / gi,max                                          (6.2)

    g''i(a) = (gi(a) - gi,min) / (gi,max - gi,min)                   (6.3)

For simplicity, we suppose here that the gi are positive. In the former case, the maximal value of g'i will be 1 while the value 0 is kept fixed, which means that the ratio of the evaluations of any pair a, b of alternatives remains unaltered:

    g'i(a) / g'i(b) = gi(a) / gi(b)                                  (6.4)

This transformation can be advocated when using ratio scales, in which the value 0 plays a special role: statements such as "alternative a is twice as good as b on criterion i" remain valid after transformation. In the case of g''i, the top evaluation will be mapped onto 1 while the bottom one goes onto 0; ratios are not preserved, but ratios of differences of evaluations are.
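Formulae (6.1) and (6.2) are straightforward to implement; a sketch, not code from the study, in which the data are the four candidate cars of Table 6.3 and the weights anticipate those chosen in the next subsection (-1, -2, -1, 0.5, 0.5):

```python
def normalise_by_max(values):
    """Formula (6.2): divide each evaluation by the column maximum."""
    m = max(values)
    return [v / m for v in values]

def rank(table, weights):
    """table: {name: evaluation vector}; returns (names by decreasing f, f)."""
    names = list(table)
    cols = list(zip(*table.values()))                 # one tuple per criterion
    norm = [normalise_by_max(c) for c in cols]        # g_i' per criterion
    f = {n: sum(w * norm[i][j] for i, w in enumerate(weights))
         for j, n in enumerate(names)}                # formula (6.1)
    return sorted(names, key=f.get, reverse=True), f

cars = {
    "Nissan (3)":       (16_973, 29.0, 34.9, 2.66, 2.50),
    "Honda (7)":        (18_971, 28.0, 35.6, 2.33, 2.00),
    "Peugeot 16V (11)": (17_537, 28.3, 34.8, 2.33, 2.75),
    "Peugeot (12)":     (15_980, 29.6, 35.3, 2.33, 2.75),
}
weights = (-1, -2, -1, 0.5, 0.5)   # negative weights for minimised criteria
order, f = rank(cars, weights)
# On this reduced set, car 11 comes out first (cf. Table 6.5 below).
```

Note that the normalisation is computed from the table that is passed in, so the same function applied to the full 14-car sample or to the reduced set yields different g'i values, which is precisely the instability discussed in the next subsection.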

4 since gi.min and gi. −1 (since they have to be minimised). −2. c. to some extent. is suﬃcient to cause a rank reversal between the leading two alternatives. in other words.2.5.2 Using the weighted sum on the case Suppose we consider that 0 plays a special role in all scales and we choose the ﬁrst transformation option. let us consider Table 6.4 in decreasing order of the values of f . Varying the values that are considered imprecisely determined is what is called sensitivity analysis. 6. 6. As can be seen in the last column of Table 6. the diﬀerence in the values of f for those two cars is tiny (less than . Note that the above are not the only possible options for transforming the data. It is likely that by varying the weights slightly from their present value. Of course. the re-scaling of the criteria yields values of gi that are not the same as in Table 6.01) but we have no idea as to whether such a diﬀerence is meaningful. it helps to detect what the stable conclusions in the output of a model are. one could prevent such a drawback. it does not alter the validity of statements like “the diﬀerence between a and b on criterion i is twice the diﬀerence between c and d”. The alternatives are listed in Table 6. This perturbation.e. A set of weights has been chosen which is. one would readily get rank reversals i.5 where the set of alternatives is reduced to the 4 cars remaining after the elimination procedure.2.5) gi (a) − gi (b) gi (a) − gi (b) = gi (c) − gi (d) gi (c) − gi (d) 99 Such a transformation is appropriate for interval scales. The values of the gi ’s that are obtained are shown in Table 6.max depends on the set of alternatives. by using a normalising constant that would not depend on the .2. b. THE WEIGHTED SUM do: for all alternatives a. without any change in the values of the weights. The ﬁrst three criteria receive negative weights namely and respectively −1.max depend on the set of alternatives. d.4.4. Moreover. 
while the last two are given the weight .3 Is the resulting ranking reliable? Weights depend on scaling To illustrate the lack of stability of the ranking obtained. arbitrary but seems compatible with what is known about Thierry’s preferences and priorities.6. all we can do is being very prudent in using such a ranking since the weights were chosen in a rather arbitrary manner. this rough assignment of weights yields car number 3 as ﬁrst choice followed immediately by car number 11 which was actually Thierry’s choice. the ranking is not very stable. this is certainly a crucial activity in a decision aiding process. permutations of alternatives in the order of preference. (6. note also that these transformations depend on the set of alternatives: considering the 14 cars of the initial sample or the 4 cars retained after the ﬁrst elimination would yield substantially diﬀerent results since the values gi.
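An elementary sensitivity check of the kind discussed above, perturbing one weight at a time and watching the leaders, can be sketched as follows; the evaluations of cars 3 and 11 and the column maxima follow Table 6.2, while the 10% perturbation of the acceleration weight is our illustrative choice:

```python
MAXES = (21_334, 30.8, 41.6, 2.66, 3.25)   # column maxima over the 14 cars
CAR3  = (16_973, 29.0, 34.9, 2.66, 2.50)   # Nissan Sunny
CAR11 = (17_537, 28.3, 34.8, 2.33, 2.75)   # Peugeot 309 GTI 16V

def f(car, weights):
    """Weighted sum of the max-normalised evaluations (formulae 6.1-6.2)."""
    return sum(w * v / m for w, v, m in zip(weights, car, MAXES))

base = (-1, -2, -1, 0.5, 0.5)
# With the base weights, car 3 leads car 11 by less than .01, as in Table 6.4.

perturbed = (-1, -2.2, -1, 0.5, 0.5)   # acceleration made 10% more important
# Now car 11, the better accelerator, overtakes car 3: a rank reversal.
```

A true robustness analysis would of course vary all weights (and the imprecise evaluations) simultaneously over plausible domains rather than one parameter at a time.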

Nr  Name of cars       Cost   Accel  Pick   Brak   Road   Value f
    Weights ki         -1     -2     -1     0.5    0.5
 3  Nissan Sunny       0.80   0.94   0.84   1.00   0.77   -2.63
11  Peugeot 16V        0.82   0.92   0.84   0.88   0.85   -2.64
12  Peugeot            0.75   0.96   0.85   0.88   0.85   -2.66
10  Renault 19         0.80   0.97   0.91   0.88   1.00   -2.71
 7  Honda Civic        0.89   0.91   0.86   0.88   0.62   -2.82
 1  Fiat Tipo          0.86   1.00   0.89   0.88   0.92   -2.85
 5  Mitsu Colt         0.71   0.96   0.86   0.62   0.54   -2.91
 2  Alfa 33            0.72   0.98   1.00   0.75   0.77   -2.92
 8  Opel Astra         0.86   0.94   0.85   0.62   0.62   -2.96
 6  Toyota             0.65   1.00   0.88   0.50   0.62   -2.97
 4  Mazda 323          0.72   0.99   0.86   0.62   0.46   -3.02
 9  Ford Escort        0.93   0.95   0.83   0.75   0.54   -3.03
14  Renault 21         1.00   0.94   0.88   0.75   0.69   -3.04
13  Mitsu Galant       0.81   0.98   0.89   0.62   0.38   -3.15

Table 6.4: Normalising then ranking through a weighted sum

Nr  Name of car        Cost   Accel  Pick   Brak   Road   Value f
    Weights ki         -1     -2     -1     0.5    0.5
11  Peugeot 16V        0.92   0.96   0.98   0.88   1.00   -2.876
 3  Nissan Sunny       0.89   0.98   0.98   1.00   0.91   -2.880
12  Peugeot            0.84   1.00   0.99   0.88   1.00   -2.896
 7  Honda Civic        1.00   0.95   1.00   0.88   0.73   -3.090

Table 6.5: Normalising then ranking a reduced set of alternatives

Such a constant could be, for instance, the worst acceptable value on each criterion (a minimal requirement for a performance to be maximised; a maximal level for a cost or, more generally, for a variable to be minimised). Obviously, with such an option, the source of the lack of stability would be the imprecision in the determination of the worst acceptable value.

6.2.4 The difficulties of a proper usage of the weighted sum

The meaning of the weights

What is the exact significance of the weights in the weighted sum model? The weights have a very precise and quantitative meaning: they are trade-offs. So, to compensate for a disadvantage of ki units on criterion j, you need an advantage of kj units on criterion i. An important consequence is that the weights depend on the determination of the unit on each scale. In a weighted sum model that would directly use the evaluations of the alternatives given in Table 6.2, it is clear that the weight of criterion 2 (acceleration time) has to be multiplied by 60 if times are expressed in minutes instead of seconds. Similarly, in order to model the same preferences through a weighted sum of the gi and a weighted sum of the g''i, the weights should be different. Indeed, after transformation, g''i is essentially related to gi by a multiplicative factor κi = 1/(gi,max - gi,min), since

    g''i(a) = gi(a) / (gi,max - gi,min) + λi = κi × gi(a) + λi       (6.6)

where λi is a constant. Additive constants do not matter, since they do not alter the ranking; in a consistent model, the weight of g''i should thus be obtained by dividing the weight ki by κi. Note that both g'i and g''i are independent of the choice of a unit on the original scale; yet they are not identical and, unless gi,min = 0, their weights should differ in a consistent model. In any case, the weights have to be assessed in relation to a particular determination of the evaluations on each scale, and eliciting them in practice is a complex task; they certainly cannot be evaluated without reference to the scales of the criteria.

Conventional codings

Another comment concerns the figures used for evaluating the performances of the cars on criteria 4 and 5. Recall that those were obtained by averaging equally spaced numerical codings of an ordinal scale of evaluation. The obtained figures presumably convey a less quantitative and more conventional meaning than, for instance, acceleration performances measured in seconds in standardisable (if not standardised) trials. These figures however are treated in the weighted sum just like the "more quantitative" ones associated with the first three criteria. Other codings of the ordinal scale might have been envisaged, for instance codings with unequal intervals separating the levels on the ordinal scale, and some of these codings could obviously have changed the ranking. This was implicitly a reason for normalising the evaluations as was done through formulae 6.2 and 6.3. Notice that the above problem has already been discussed in Chapter 4 (Section 4.1).

Up to this point we have considered the influence on the weights of multiplying the evaluations by a positive constant. Note that translating the origin of a scale has no influence on the ranking of the alternatives provided by the weighted sum, since it results in adding a (positive or negative) constant, the same for all alternatives, to f.

There is still a very important observation to be made: all scales used in the model are implicitly considered linear, in the sense that equal differences in values on a criterion result in equal differences in the overall evaluation function f, and this does not depend on the position of the interval of values corresponding to that difference on the scale. In order to speak of linearity, reference to the underlying scale is essential. For instance, the difference between car 12 and car 3 with respect to acceleration is 0.6 seconds, positioned between 29 seconds and 29.6 seconds on the acceleration scale. Does Thierry perceive this difference as almost equally important as the difference of 0.7 between cars 11 and 3, the latter difference being positioned between 28.3 seconds and 29 seconds? It seems rather clear from Thierry's motivations that coming close to a performance of 28 seconds is what matters to him, while cars above 29 seconds are unworthy: car number 12 is finally eliminated because it accelerates too slowly. This means that the gain for passing from 29.6 seconds to 29 seconds has definitely less value for him than a gain of similar amplitude, say from 29 to 28.3 seconds. As will be confirmed in the sequel (see Section 6.3 below), it is very unlikely that Thierry's preferences are correctly modelled by a linear function of the current scales of performance.

Independence or interaction

The next issue is more subtle. In order to use a weighted sum, the viewpoints should be independent, but not in the statistical sense implying that the evaluations of the alternatives should be uncorrelated! They should be independent with respect to preferences. Evaluations of the alternatives for the various points of view taken into consideration by the decision-maker often show correlations; in the car example, indicators of cost, comfort and equipment are likely to be positively correlated. This does not mean that the corresponding points of view are redundant and that one should eliminate some of them: this is because the attributes that are used to reflect these viewpoints, and which may be used for assessing the alternatives for those viewpoints, are often linked by logical or factual interdependencies. One is perfectly entitled to work with attributes that are (even strongly) correlated. That is the first point.

A second point is about independence in the sense of preferences: if two alternatives that share the same profile on a subset of criteria compare in a certain way in terms of overall preferences, their relative position should not be altered when the profile they share on that subset of criteria is substituted by any other common profile. In other words, when the points of view are independent, varying the common profile should not reverse the preferences. On the contrary, a famous example of dependence in the sense of preferences, in a gastronomic context, is the following: the preference for white wine or red wine usually depends on whether you are eating fish or meat. There are relatively simple tests for independence in the sense of preferences, which consist in asking the decision-maker about his preferences on pairs of alternatives that share the same profile for a subset of attributes.
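The independence test just described, asking how pairs of alternatives compare while a profile they share is varied, can be sketched in code. The preference oracle below is a stand-in (a simple additive function, which necessarily passes the test); in practice the answers come from the decision-maker.

```python
# Sketch of a preference-independence test. Alternatives are tuples
# (cost, acceleration, road-holding); `prefers` stands in for the
# decision-maker's answers and is here a toy additive value function.

def prefers(x, y):
    value = lambda a: -1.0 * a[0] - 2.0 * a[1] + 0.5 * a[2]
    return value(x) >= value(y)

def independence_violations(pairs, common_levels, shared_index):
    """For alternatives that differ only outside `shared_index`, substitute
    every level of `common_levels` at that index and record reversals."""
    violations = []
    for (x, y) in pairs:
        answers = set()
        for level in common_levels:
            xs, ys = list(x), list(y)
            xs[shared_index] = ys[shared_index] = level
            answers.add(prefers(tuple(xs), tuple(ys)))
        if len(answers) > 1:          # the comparison flipped for some level
            violations.append((x, y))
    return violations

pairs = [((17.5, 29.0, 2.33), (16.0, 30.2, 2.00))]
print(independence_violations(pairs, common_levels=[1.0, 2.0, 3.0], shared_index=2))
# → [] (an additive value function never reverses on a common profile)
```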

Independence is a necessary condition for the representation of preferences by a weighted sum; it is not a sufficient one, of course. It may prove impossible to model some preferences by means of a weighted sum of the evaluations such as those in Table 6.2 (and even of transformations thereof, such as obtained through formulae like 6.3). This does not mean that no additive model would be suitable, and it does not imply that the preferences are not independent (in the above-defined sense). In the next section we shall study an additive model, more general than the weighted average, in which the evaluations gi may be "re-coded" through using "value functions" ui, provided they satisfy the independence property.

There is a different concept that has been recently implemented for modelling preferences: that of interacting criteria, already discussed in example 2 of Chapter 3. In our case for instance, criteria 2 and 3, respectively acceleration and suppleness, may be thought of as being positively correlated. Suppose that, in the process of modelling the preferences of the decision-maker, he declares that the influence of positively correlated aspects should be dimmed and that conjoint good performances for negatively correlated aspects should be emphasised. With appropriate choices of u2 and u3 it may be possible to take the decision-maker's preferences about positively and negatively correlated aspects into account. If no re-coding is allowed (like in the assessment of students, see Chapter 3), there is a non-additive variant of the weighted average that could help modelling interactions among the criteria: in such a model, the weight of a coalition of criteria may be larger or smaller than the sum of the weights of its components (see Grabisch (1996) for more detail on non-additive averages).

Imprecision and uncertainty

In the above discussion, as well as in the presentation of our example, we have emphasised the many sources of uncertainty (lack of knowledge) and of imprecision that bear on the figures used as input in the weighted sum. Let us summarise some of them:

1. Uncertainty in the evaluation of the cost: the buying price as well as the life-length of a second hand car are not known. This uncertainty can be considered of stochastic nature, and statistical data could help to master, to some extent, such a source of uncertainty; in practice, however, it will generally be very difficult to get sufficient relevant and reliable statistical information for this kind of problems.

2. Imprecision in the measurement of some quantities: for instance, how precise is the measurement of the acceleration? Such an imprecision can be reduced by making the conditions of the measurement as standard as possible, and can then be estimated on the basis of the precision of the measurement apparatus.
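The non-additive variant of the weighted average mentioned above can be illustrated with a small Choquet-integral sketch in the spirit of Grabisch (1996). The capacity figures and the two criteria names are invented for illustration: giving the pair of positively correlated criteria (here acceleration and suppleness) a coalition weight smaller than the sum of its components "dims" their joint influence.

```python
# Sketch of a non-additive (Choquet) average for two interacting criteria.
# The capacity mu assigns the coalition {accel, supple} the weight 1.0,
# which is LESS than mu({accel}) + mu({supple}) = 1.4: the two positively
# correlated criteria are partly redundant.

def choquet(values, mu):
    # values: dict criterion -> evaluation in [0, 1]; mu: dict frozenset -> weight
    order = sorted(values, key=values.get)        # ascending evaluations
    total, previous = 0.0, 0.0
    remaining = set(values)
    for criterion in order:
        v = values[criterion]
        total += (v - previous) * mu[frozenset(remaining)]
        previous = v
        remaining.remove(criterion)
    return total

mu = {frozenset({"accel", "supple"}): 1.0,
      frozenset({"accel"}): 0.7,
      frozenset({"supple"}): 0.7,
      frozenset(): 0.0}
print(choquet({"accel": 0.6, "supple": 0.9}, mu))
```

With these (hypothetical) figures the result is 0.6 × 1.0 + 0.3 × 0.7 = 0.81, below what a weighted average with weights 0.7 and 0.7 would give, which is how conjoint good performances on redundant criteria are dimmed.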

3. Arbitrariness in the coding of non-quantitative data, such as the re-coding of the ordinal scales of appreciation of braking and road-holding behaviour: any re-coding that respects the order of the categories would in principle be acceptable. To master such an imprecision, one could try to build quantitative indicators for the criteria, or try to get additional information on the comparison between differences of levels on the ordinal scale: for instance, is the difference between "below average" and "average" larger than the difference between "above average" and "exceptional"?

4. Imprecision in the determination of the trade-offs (weights ki). In the car problem for instance, the ratios of weights kj/ki must be elicited as conversion rates: a unit for criterion j is worth kj/ki units for criterion i. For this to make sense, the scales must first be re-coded in order that a one-unit difference on a criterion has the same "value" everywhere on the scale (linearisation). These operations are far from obvious and, as a consequence, the imprecision of the linearisation process combines with the inaccuracy in the determination of the weights.

Moreover, contrary to what can (often) be done in physics, there is generally little information on the size of the imprecisions; quite often, there is not even probabilistic information on the accuracy of the evaluations. The usual way out is extensive sensitivity analysis, which could be described as part of the validation of the model. This part of the job is seldom carried out with the required exhaustivity, because it is a delicate task in at least two respects. On the one hand, there are many possible strategies for varying the values of the imprecisely determined parameters; usually, parameters are varied one at a time, which is not sufficient but is possibly tractable. On the other hand, the range in which the parameters must be varied is not even clear, as suggested above.

In view of the previous discussion, there are two main approaches to solve the difficulties raised by the weighted sum:

1. Either one tries to prepare the inputs of the model (linearised evaluations and trade-offs) as carefully as possible, paying permanent attention to reducing imprecision and finishing with extensive sensitivity analysis.

2. Or one takes imprecision into account from the start, by avoiding to exploit precise values when knowing that they are not reliable, and rather working with classes of values and ordered categories.

Making a decision

All these sources of imprecision have an effect on the precision of the determination of the value of f that is almost impossible to quantify. As a consequence, the apparently straightforward decision, choosing the alternative with the highest value of f or ranking the alternatives in decreasing order of the values of f, might be ill-considered, as illustrated above. Once the sensitivity analysis has been performed, one is likely to be faced with several almost equally valuable alternatives: in the car problem, the simple remarks made above strongly suggest that it will be very difficult to discriminate between cars 3 and 11. Note that imprecision may well lie in the link between evaluations and preferences rather than in the evaluations themselves.
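The one-parameter-at-a-time sensitivity analysis mentioned above can be sketched as follows. The normalised evaluations, the base weights and the variation ranges are all hypothetical; the point is that when the top alternative changes inside plausible ranges, the model cannot reliably discriminate between the leading alternatives.

```python
# One-at-a-time sensitivity analysis on hypothetical figures: vary each
# weight over a +/-50% range and record which alternative comes out on top.

def top_alternative(weights, table):
    score = lambda evals: sum(w * g for w, g in zip(weights, evals))
    return max(table, key=lambda name: score(table[name]))

table = {"car 3": (0.62, 0.80, 0.91),    # hypothetical normalised evaluations
         "car 11": (0.50, 0.85, 0.85)}
base = [-1.0, -2.0, 0.5]                 # hypothetical trade-off weights

def sensitivity(base, table, spread=0.5, steps=5):
    leaders = {}
    for i in range(len(base)):
        seen = set()
        for k in range(steps):
            w = list(base)
            w[i] = base[i] * (1 - spread + 2 * spread * k / (steps - 1))
            seen.add(top_alternative(w, table))
        leaders[i] = seen
    return leaders

print(top_alternative(base, table))
print(sensitivity(base, table))
```

Here the base weights rank car 3 first, but each of the three weight ranges contains values for which car 11 takes the lead: exactly the situation where the "straightforward" decision would be ill-considered.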

Note also that detailed preferential information, even extracted from perfectly precise evaluations, may prove rather difficult to elicit. The former option will lead us to the construction of multi-attribute value or utility functions, while the latter leads to the outranking approach; these two approaches will be developed in the sequel.

There is however a whole family of methods that we shall not consider here, the so-called interactive methods (Steuer (1986), Vincke (1992), Teghem (1996)). Such methods are mainly designed for dealing with infinite, and even continuous, sets of alternatives. They implement various strategies for exploring the efficient boundary, i.e. the set of non-dominated solutions: the exploration jumps from one solution to another, guided by the decision-maker, who is asked to tell, for instance, which characteristics of the current solution he would like to see improved. These methods do not lead to an explicit model of the decision-maker's preferences. On the contrary, we have settled on problems with a (small) finite number of alternatives, and we concentrate on obtaining explicit representations of the decision-maker's preferences.

6.2.5 Conclusion

The weighted sum is useful for obtaining a quick and rough draft of an overall evaluation of the alternatives. One should however keep in mind that there are rather restrictive assumptions underlying a proper use of the weighted sum. As a conclusion to this section, we summarise these conditions.

1. Cardinal character of the evaluations on all scales. The evaluations of the alternatives for all criteria are numbers, and these values are used as such, even if they result from the re-coding of ordinal data.

2. Linearity of each scale. Equal differences between values on scale i produce the same effect on the overall evaluation f, whatever the location of the corresponding intervals on the scale (at the bottom, in the middle or at the top of the scale): if alternatives a, b, c, d are such that gi(a) − gi(b) = gi(c) − gi(d) for all i, then f(a) − f(b) = f(c) − f(d).

3. Weights are trade-offs. Weights tell how many units on the scale of criterion i are needed to compensate one unit of criterion j. Weights depend on the scaling of the criteria: transforming the (linearised) scales results in a related transformation of the weights.

4. Preference independence. Criteria do not interact. This property, called preference independence, can be formulated as follows: consider two alternatives that share the same evaluation on at least one criterion, say criterion i; varying the level of that common value on criterion i does not alter the way the two alternatives compare in the overall ranking.

6.3 The additive multi-attribute value model

Our analysis of the weighted sum brought us very close to the requirements for additive multi-attribute value functions. The most common model in multiple criteria decision analysis is a formalisation of the idea that the decision-maker, when making a decision, behaves as if he was trying to maximise a quantity called utility or value (the term "utility" tends nowadays to be used preferably in the context of decision under risk, but we shall use it sometimes for "value"). This postulates that all alternatives may be evaluated on a single "super-scale" reflecting the value system of the decision-maker and his preferences: the alternatives can be "measured", at least partially, in terms of "worth" on a synthetic dimension of value or utility. In other words, if we denote by ≽ the overall preference relation of the decision-maker on the set of alternatives, this relation relates to the values u(a), u(b) of the alternatives in the following way:

a ≽ b iff u(a) ≥ u(b)        (6.7)

As a consequence, the preference relation on the set of alternatives is a complete preorder, i.e. a complete ranking, possibly with ties. The value u(a) usually is a function of the evaluations {gi(a), i = 1, ..., n}. If this function is a linear combination of the gi(a), we get back to the weighted sum. A slightly more general case is the following additive model:

u(a) = Σi=1..n ui(gi(a))        (6.8)

where the function ui (single-attribute value function) is used to re-code the original evaluation gi, in order to linearise it in the sense described in the previous section; the weights ki are incorporated in the ui functions. The additive value function model can thus be viewed as a clever version of the weighted sum, since it allows us to take into account some of the objections against a naive use of it (mainly the second hypothesis in Section 6.2.5). Note however that the imprecision issue is not dealt with inside the model: sensitivity analysis has to be performed in the validation phase, but is neither part of the model nor straightforward in practice.

Much effort has been devoted to characterising various systems of conditions under which the preferences of a decision-maker can be described by means of an additive value function model (see Krantz, Luce, Suppes and Tversky (1971), Chapter 7, and Luce, Krantz, Suppes and Tversky (1990), Chapter 19). Depending on the context, some systems of conditions may be interpretable and tested: it may be possible to ask the decision-maker questions that will determine whether an additive value model is compatible with what can be perceived of his system of preferences. If the preferences of the decision-maker are compatible with an additive value model, a method of elicitation of the ui's may then be used; of course, the elicitation of the partial value functions ui may also be a difficult task. If not, another model should be looked for: a multiplicative model or, more generally, a non-additive one, a non-independent one, a model that takes imprecision more intrinsically into account, etc.
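The model of equations (6.7) and (6.8) can be sketched as follows. The three single-attribute value functions below are invented for illustration (they are not elicited from Thierry); the point is only that the gi are re-coded, possibly non-linearly, before being summed.

```python
# Minimal sketch of the additive value model (6.7)-(6.8).
# Alternatives are tuples (cost, acceleration, road-holding grade);
# the u_i below are hypothetical re-codings, not elicited ones.

u = [lambda cost: -cost / 1000.0,            # u1: cheaper is better
     lambda accel: 10.0 / (accel - 27.0),    # u2: nonlinear re-coding of seconds
     lambda road: 2.0 * road]                # u3: linear in the grade

def value(alternative):
    # equation (6.8): u(a) = sum of the u_i(g_i(a))
    return sum(ui(gi) for ui, gi in zip(u, alternative))

def prefers(a, b):
    # equation (6.7): a is preferred to b iff u(a) >= u(b)
    return value(a) >= value(b)

a, b = (17500, 29.0, 2.33), (16000, 30.2, 2.00)
print(value(a), value(b), prefers(a, b))
```

Because every pair of alternatives is comparable through their values, the relation defined by `prefers` is indeed a complete preorder.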

6.3.1 Direct methods for determining single-attribute value functions

A large number of methods have been proposed to determine the ui's in an additive value function model. There are essentially two families of methods, one based on direct numerical estimations and the other on indifference judgements. For an accessible account of such methods, the reader is referred to von Winterfeldt and Edwards (1986), Chapter 8. We briefly describe the application of a technique of the latter category, relying on what are called dual standard sequences (Krantz et al. (1971), Wakker (1989)), which build a series of equally spaced intervals on the scale of values.

An assessment method based on indifference judgements

Suppose we want to assess the ui's in an additive model for the Cars case; it is assumed that the suitability of such a model for representing the decision-maker's preferences has been established. Consider a pair of criteria, say Cost and Acceleration; we are going to outline a simulated dialogue between an analyst and a decision-maker that could yield an assessment of u1 and u2, the corresponding single-attribute value functions. In view of the set of alternatives selected by Thierry, the ranges of evaluations corresponding to acceptable cars will be the interval from 21 500 € to 13 500 € for the cost, and from 31 to 28 seconds for the acceleration.

First ask the decision-maker to select a "central point" corresponding to medium range evaluations on both criteria: let us start with (17 500, 29.5) as "average" values for cost and acceleration. Note that we start the construction of the sequence from a "central point" instead of taking a "worst point" (see for instance von Winterfeldt and Edwards (1986), pp. 267 sq., for an example starting from a worst point). Also ask the decision-maker to define a unit step on the cost criterion, for instance that of passing from a cost of 17 500 € to 16 500 €. The standard sequence is then constructed by asking which value x1 for the acceleration would make a car costing 16 500 € and accelerating in 29.5 seconds indifferent to a car costing 17 500 € and accelerating in x1 seconds, i.e. (16 500, 29.5) ∼ (17 500, x1), where ∼ denotes "indifferent to". Suppose the answer is 29.2, meaning that, from the chosen starting point, a gain of 0.3 seconds on the acceleration time is worth an increase of 1 000 € in cost. The answer could be explained by the fact that, at the starting level of performance for the acceleration criterion, the decision-maker is quite interested by a gain in acceleration time. Relativising the gains as percentages of the half range from the central to the best values on each scale, this means that the decision-maker is ready to lose 1 000/4 000 = 25% of the potential reduction in cost for gaining 0.3/1.5 = 20% of acceleration time. We will say in the sequel that the parity is equal when the decision-maker agrees to exchange a percentage of the half range on a criterion against an equal percentage on another criterion.

The second step in the construction of the standard sequence consists in asking the decision-maker which value to assign to x2 in order to have (16 500, 29.2) ∼ (17 500, x2); suppose the answer is 28.9. Continuing along the same line would for instance yield the following sequence of indifferences.

(16 500, 28.9) ∼ (17 500, 28.7)
(16 500, 28.7) ∼ (17 500, 28.5)
(16 500, 28.5) ∼ (17 500, 28.3)
(16 500, 28.3) ∼ (17 500, 28.1)

Such a sequence gives the analyst an approximation of the single-attribute value function u2. Figure 6.5 shows the re-coding u2 of the evaluations g2 on the interval [28, 29.5]: there are two linear parts in the graph, one ranging from 28 to 28.9, where the slope is proportional to 1/0.2, and the other valid between 28.9 and 29.5, with a slope proportional to 1/0.3.

[Figure 6.5: Single-attribute value function for the acceleration criterion (half range); the value is plotted against the acceleration, from 28 to 29.5 seconds, as a piecewise linear curve.]

The trade-off between u1 and u2 is easily determined by solving the following equation, which just expresses the initial indifference in the standard sequence, (16 500, 29.5) ∼ (17 500, 29.2):

k1 u1(16 500) + k2 u2(29.5) = k1 u1(17 500) + k2 u2(29.2)

from which we get

k2/k1 = (u1(16 500) − u1(17 500)) / (u2(29.2) − u2(29.5)).

From there, using the same idea, one is able to re-code the scale of the cost criterion into the single-attribute value function u1. Then, considering (for instance) the cost criterion with criteria 3, 4 and 5 in turn, one obtains a re-coding of each gi into a single-attribute value function ui. Note that the above dialogue only covers the half range from 28 to 29.5 seconds, but it is easy to devise a similar procedure for the other half range, from 29.5 to 31 seconds.
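Assuming the sequence reconstructed above, and the usual convention that each elicited step is worth one unit of value, the answers can be turned into a piecewise-linear u2 by interpolation; the slope is steeper where the answers are closer together, which reproduces the two linear parts of Figure 6.5.

```python
# Sketch: from the dual standard sequence to a piecewise-linear u2.
# Acceleration levels judged one cost-unit apart (simulated dialogue above);
# each step is taken to be worth one unit of value.

levels = [29.5, 29.2, 28.9, 28.7, 28.5, 28.3, 28.1]
values = list(range(len(levels)))        # u2(29.5)=0, u2(29.2)=1, ...

def u2(accel):
    # linear interpolation inside the elicited range (levels are decreasing)
    for (g1, v1), (g0, v0) in zip(zip(levels, values), zip(levels[1:], values[1:])):
        if g0 <= accel <= g1:
            return v1 + (g1 - accel) * (v0 - v1) / (g1 - g0)
    raise ValueError("outside elicited range")

print(u2(29.0))
```

With k1 set to 1, the trade-off equation above gives k2 = (u1(16 500) − u1(17 500)) / (u2(29.2) − u2(29.5)) = 1/1 = 1 under this unit-per-step convention.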

If we set k1 to 1, this formula yields k2, and the trade-offs k3, k4 and k5 are obtained similarly. Notice that the re-coding of the original evaluations into value functions results in a formulation in which all criteria have to be maximised (in value).

The above procedure, although rather intuitive and systematic, is also quite complex: the questions are far from easy to answer, and starting from one reference point or another (worst point instead of central point) may result in variations in the assessments. There are however many possibilities for checking for inconsistencies. Assume for instance that a single-attribute value function has been assessed by means of a standard sequence that links its scale to the cost criterion; one may validate this assessment by building a standard sequence that links its scale to another criterion, and compare the two assessments of the same value function obtained in this way. Hopefully they will be consistent; otherwise, some sort of retroaction is required. Note finally that such methods may not be used when the scale on which the assessments are made only has a finite number of degrees instead of being the set of real numbers; at least numerous and densely spaced degrees are needed.

Methods relying on numerical judgements

In another line of methods, simplicity and direct intuition are more praised than scrupulous satisfaction of theoretical requirements, although the theory is not ignored. An example is SMART ("Simple Multi-Attribute Rating Technique"), developed by W. Edwards, which is more a collection of methods than a single one; we just outline here a variant, referring to von Winterfeldt and Edwards (1986), pp. 278 sq., for more details.

In order to re-code, say, the evaluations for the acceleration criterion, one initially fixes two "anchor" points, which may be the extreme values of the evaluations on the set of acceptable cars, here 28 and 31 seconds. On the value scale, the anchor points are associated with the endpoints of a conventional interval of values, for instance 31 with 0 and 28 with 100. Since 29 seconds seems to be the value below which Thierry considers that a car becomes definitely attractive from the acceleration viewpoint, the interval [28, 29] should be assigned a range of values larger than 1/3, its size (in relative terms) on the original scale. Thierry could for instance assign 29 seconds to 50 on the value scale; then 28.5 and 30 could be located respectively at 70 and 10, yielding (with linear interpolation between the specified values) the initial sketch of a value function shown in Figure 6.6(a). This picture can be further improved by asking Thierry to check whether the relative spacings of the locations correctly reflect the strength of his preferences. Thierry might say that obtaining almost the same gain in value from 30 seconds to 29 (a gain of 40) as from 29 to 28 (a gain of 50) is unfair, and he could consequently propose to lower to 40 the value associated with 29 seconds; accordingly, he also lowers to 65 the value of 28.5 seconds. Suppose he is then satisfied with all other differences of values; the final version is drawn in Figure 6.6(b).

A similar work has to be carried out for all criteria, and the weights must then be assessed. The weights are usually derived through direct numerical judgements of relative attribute importance: Thierry would be asked to rank-order the attributes, and an "importance" of 10 could then arbitrarily be assigned to the least important criterion.
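The direct-rating procedure just described can be sketched as follows, using the anchor points (31 s at 0, 28 s at 100) and Thierry's revised locations (29 s at 40, 28.5 s at 65, 30 s at 10) with linear interpolation between them, as in the final version of Figure 6.6(b).

```python
# Direct rating of the acceleration value function, final version:
# anchors plus Thierry's revised intermediate points, interpolated linearly.

points = [(28.0, 100.0), (28.5, 65.0), (29.0, 40.0), (30.0, 10.0), (31.0, 0.0)]

def rate(accel):
    for (g0, v0), (g1, v1) in zip(points, points[1:]):
        if g0 <= accel <= g1:
            return v0 + (accel - g0) * (v1 - v0) / (g1 - g0)
    raise ValueError("outside [28, 31]")

print(rate(29.5))   # → 25.0
```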

[Figure 6.6: Value function for the acceleration criterion: (a) initial sketch; (b) final version, with the initial sketch in dotted line. Both panels plot the value (0 to 100) against the acceleration (28 to 31 seconds).]

The importance of each other criterion would then be assessed in relation to the least important one, directly as an estimation of the ratio of weights. In assessing the relative weights, no reference is made to the underlying scales, a formulation that seems independent of the scalings of the criteria. This is not appropriate, since weights are trade-offs between units on the various value scales and must vary with the scaling: the meaning of one unit of value varies depending on the range of original evaluations (acceleration measured in seconds) that is represented between value 0 and value 100 of the value scale. If we had considered that the acceleration evaluations of admissible cars range from 27 to 32 seconds, instead of from 28 to 31, we would have constructed a value function u'2 with u'2(32) = 0 and u'2(27) = 100. A difference of one unit of value on the scale u2 illustrated in Figure 6.6 then corresponds to a (less-than-unit) difference of (u'2(28) − u'2(31))/100 on the scale u'2, and the weight attached to that criterion must vary in inverse proportion to this factor when passing from u2 to u'2.
It is unlikely that a decision-maker would take the range of evaluations into account when asked to assess weights in terms of the relative "importance" of the criteria; this approach in terms of "importance" can be, and has been, criticised. A way of avoiding these difficulties is to give up the notion of importance, which seems misleading in this context, and to use a technique called swing-weighting: the decision-maker is asked to compare alternatives that "swing" between the worst and the best level for each attribute, in terms of their contribution to the overall value. The argument of simplicity in favour of SMART is then lost, since the questions to be answered are similar, both in difficulty and in spirit, to those raised in the approach based on indifference judgements.
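A minimal swing-weighting sketch follows; the swing ratings are hypothetical judgements, not Thierry's. The decision-maker rates how valuable it is to "swing" each attribute from its worst to its best level, the most valuable swing being rated 100; normalised ratings then serve as the weights attached to the 0-100 value scales.

```python
# Swing-weighting sketch with hypothetical judgements.
# Question: "How valuable is swinging attribute i from its worst to its
# best level, rating the most valuable swing 100?"

swing_ratings = {"cost": 100, "acceleration": 80, "road-holding": 40}

total = sum(swing_ratings.values())
weights = {attr: r / total for attr, r in swing_ratings.items()}
print(weights)
```

Because the question explicitly refers to the worst-to-best ranges, the resulting weights automatically depend on the scaling of each value function, which is exactly what the naive "importance" question fails to capture.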

6.3.2 AHP and Saaty's eigenvalue method

The eigenvalue method for assessing attribute weights and single-attribute value functions is part of a general methodology called the "Analytic Hierarchy Process" (AHP). It consists in structuring the decision problem in a hierarchical manner (as is also advocated for building value functions, for instance in Keeney and Raiffa (1976)), constructing numerical evaluations associated with all levels of the hierarchy, and aggregating them in a specific fashion, formally a weighted sum of single-attribute value functions (see Saaty (1980), Harker and Vargas (1987)). In our case, the top level of the hierarchy is Thierry's goal of finding the best car according to his particular views; the second level consists in the 5 criteria into which his global goal can be decomposed; the last level can be described as the list of potential cars. Thus the hierarchical tree is composed of 1 first level node, 5 second level nodes and 5 times 14 third level nodes, also called leaves.

What we have to determine is the "strength" or priority of each element of a level in relation to its importance for an element of the next level. The assessment may start (as is usually done) from the bottom nodes: all nodes linked to the same parent node are compared pairwise; in our case, this amounts to comparing all cars from the point of view of a criterion, and repeating this for all criteria. The same is then done for all criteria in relation to the top node: the influences of all criteria on the global goal are also compared pairwise. At each level, the pairwise comparison of the nodes in relation to the parent node is done by means of a particular method that allows, to some extent, to detect and correct inconsistencies.

For each pair of nodes a, b, the decision-maker is asked to assess the "priority" of a as compared to the "priority" of b; it is asked, for instance, how much alternative a is preferred to alternative b from a certain point of view. The questions are expressed in terms of "importance" or "preference" or "likelihood" according to the context. The answers may be formulated either on a verbal or a numerical scale; in the sequel we shall concentrate on assessments made directly on the numerical scale. There are five main levels on the verbal scale; the conversion of verbal levels into numerical levels is described in Table 6.6, but 4 intermediary levels, corresponding to the numerical codings 2, 4, 6 and 8, can also be used. The level "Moderate", for instance, corresponds to an alternative that is preferred 3 times more than another, or to a criterion that is 3 times more important than another. The levels of the verbal scale correspond to numbers and are dealt with as such in the computations.

Such an interpretation of the verbal levels has very strong implications: it means that preference, importance and likelihood are considered as perceived on a ratio scale (much like sound intensity); what the decision-maker expresses as a level on the scale is postulated to be the ratio of the values associated with the alternatives or the criteria. In other words, a number f(a) is assumed to be attached to each a and, when comparing a to b, the decision-maker is assumed to give an approximation of the ratio f(a)/f(b). This is indeed Saaty's basic assumption. Let α(a, b) denote the level of preference (or of relative importance) of a over b expressed by the decision-maker; the results of the pairwise comparisons may thus be encoded in a square matrix α. If Saaty's hypotheses are correct, there should be some sort of consistency between the elements of α.

Namely, the following relations should approximately hold:

α(a, b) ≈ 1/α(b, a)        (6.9)

α(a, b) × α(b, c) ≈ α(a, c)        (6.10)

In view of relation (6.9), only one half (roughly) of the matrix has to be elicited, which amounts to answering n(n−1)/2 questions. Moreover, if the α(a, b) are indeed approximations of the ratios f(a)/f(b), all columns of the matrix α should be approximately proportional to f, and some sort of averaging of the columns yields an estimation of f. The pairwise comparisons thus enable one to

1. correct errors made in the estimation of the ratios, and

2. detect departures from the basic hypothesis, in case the columns of α are too far from being proportional.

A test based on statistical considerations allows the user to determine whether the assessments in the pairwise comparison matrix show sufficient agreement with the hypothesis that they are approximations of f(a)/f(b); if the test conclusion is negative, it is recommended either to revise the assessments or to choose another approach, more suitable for the type of data.

Verbal      Equal   Moderate   Strong   Very strong   Extreme
Numeric       1        3          5          7            9

Table 6.6: Conversion of verbal levels into numbers in Saaty's pairwise comparison method; "Moderate" means "3 times more preferred".
for all a.112 Verbal Numeric CHAPTER 6. The pairwise comparisons enable to 1. For instance. c) ≈ α(a. when located in the range . pairwise comparisons of the alternatives must be performed for each criterion.
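The computations behind Saaty's priorities are easy to reproduce. The short Python sketch below is our own illustration, not part of the original analysis: it approximates the principal eigenvector of a positive reciprocal matrix by power iteration, normalised so the priorities sum to 1. The 3 × 3 matrix of comparisons is made up for the example.

```python
# Illustrative sketch of Saaty's eigenvector method (pure Python).
# The reciprocal matrix below is hypothetical, not taken from the case.

def ahp_priorities(matrix, iterations=100):
    """Approximate the principal eigenvector of a positive reciprocal
    matrix by power iteration, normalised so its components sum to 1."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        # multiply the matrix by the current priority vector
        w_new = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(w_new)
        w = [x / total for x in w_new]
    return w

# alpha[i][j] = how many times criterion i is judged more important than j
alpha = [
    [1.0,     3.0, 5.0],
    [1 / 3.0, 1.0, 2.0],
    [1 / 5.0, 0.5, 1.0],
]
priorities = ahp_priorities(alpha)
print([round(p, 3) for p in priorities])
```

For a near-consistent matrix such as this one, the resulting priorities are close to the normalised geometric means of the rows, which is one way of seeing the "averaging of the columns" mentioned above.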

Of course the (ratio) scale of preference on criterion i is not, in general, the scale of the evaluations gi: a transformation (re-scaling) is usually needed to go from evaluations to preferences. A major issue in the assessment of pairwise comparisons, for instance of alternatives in relation to a criterion, is to determine how many times a is preferred to b on criterion i from looking at the evaluations gi(a) and gi(b). For the cost criterion, the question is not easily answered. Car 11 costs approximately 17 500 € and Car 12 costs about 16 000 €. The ratio of these costs, 17 500/16 000, is equal to 1.09375, but this does not necessarily mean that Car 12 is preferred 1.09375 times more than Car 11 on the cost criterion. A decision-maker might very well say that Car 12 is 1.5 times more preferred than Car 11 for the cost criterion, or he could say 2 times or 4 times. All depends on what the decision-maker would consider as the minimum possible cost. For instance, if Car 12 is declared to be 1.5 times more preferred than Car 11 (supposing that the transformation of cost into preference is linear), the zero x of the cost scale would be such that

    (17 500 − x)/(16 000 − x) = 1.5,

i.e. x = 14 500 €. According to Thierry himself, however, the transformation is not linear, since equal ratios corresponding to costs located either below or above 17 500 € do not correspond to equal ratios of preference. This corresponds to the fact that Thierry said he is rather insensitive to cost differences up to about 17 500 €, which is the amount of money he had budgeted for his car. But even in linear parts, the cost evaluation does not measure the preferences directly.

6.3. THE ADDITIVE VALUE MODEL                                              113

The problem is even more crucial for transforming scales such as those on which braking or road-holding are evaluated. For instance, how many times is Car 3 preferred to Car 10 with respect to the braking criterion? In other words, how many times is 2.66 better than (preferred to) 2.33? Similar questions arise for the comparison of importance of criteria.

For the sake of concision, we have restricted our comparisons to a subset of cars, namely the top four cars plus the Renault 19, the Mazda 323 and the Toyota Corolla. We made the assessments directly in numerical terms, taking into account a set of weights that Thierry considered as reflecting his preferences; those weights have been obtained using the Prefcalc software and a method that is discussed in the next section. We discuss the determination of the "weights" ki of the criteria in formula 6.11 below. Our assessments are shown in Table 6.7; by default, the blanks on the diagonal should be interpreted as 1's and the blanks below the diagonal as 1 over the corresponding value above the diagonal.

Once the matrix in Table 6.7 has been filled, several algorithms can be proposed to compute the "priority" of each criterion with respect to the goal symbolised by the top node of the hierarchy (under the hypothesis that the elements of the assessment matrix are approximations of the ratios of those priorities). The most famous algorithm, which was initially proposed by Saaty, consists in computing the eigenvector of the matrix corresponding to the largest eigenvalue (see Harker and Vargas (1987)).

    Relative importance   Cost   Accel   Pick-up   Brakes   Road-h
    Cost                          1.5       2         3        3
    Acceleration                           1.5       2        2
    Pick-up                                          1.5      1.5
    Brakes                                                    1.5
    Road-holding

Table 6.7: Assessment of the comparison of importance for all pairs of criteria. For instance, the number 2 at the intersection of the 1st row and 3rd column means that "Cost" is considered twice as important as "Pick-up".

Applying the eigenvector method to the matrix in Table 6.7, one obtains the following values that reflect the importance of the criteria:

    (.352, .241, .172, .117, .117).

Since eigenvectors are determined up to a multiplicative factor, the vector of priorities is the normalised eigenvector whose components sum up to unity; the special structure of the matrix (a reciprocal matrix) guarantees that all priorities will be positive. Alternative methods for correcting inconsistencies have been elaborated; most of them are based on some sort of least squares criterion or on computing averages (see e.g. Barzilai, Cook and Golany (1987), who argue in favour of a geometric mean and for an interpretation of the "eigenvector method" as a way of "averaging ratios along paths").

This means that the weights are not perceived as very contrasted (the ratio of the highest to the lowest value is about 3). Note that only the lowest degrees of the 1 to 9 scale have been used in Table 6.7; in order to get this sort of gradation of the weights, some comparisons have been assessed by non-integer degrees, which normally are not available on the verbal counterpart of the 1 to 9 scale described in Table 6.6. When the assessments are made through the verbal scale, approximations should thus be made, for instance by saying that cost and acceleration are equally important and substituting 1.5 by 1. Note also that the labelling of the degrees on the verbal scale may be misleading: one would quite naturally qualify the degree to which "Cost" is more important than "Acceleration" as "Moderate", until it is fully realised that "Moderate" means "three times as important"; even using the intermediary level between "Equal" and "Moderate" would still mean "twice as important".

It should also be emphasised that the "eigenvalue method" is not linear. What would have changed if we had scaled the importance differently, for instance assessing the comparisons of importance by degrees twice as large as those in Table 6.7 (except for the 1's, that remain constant)? Would the coefficients of importance have been twice as large? Not at all! The resulting weights would have been much more contrasted, namely:

    (.489, .254, .137, .060, .060).
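This non-linearity is easy to verify numerically. The sketch below uses hypothetical 3-criteria matrices (not Thierry's case): doubling the off-diagonal degrees does not double the weights, it makes them far more contrasted.

```python
# A small check that the eigenvector method is not linear: doubling the
# off-diagonal degrees of a (hypothetical) comparison matrix does not
# double the resulting weights but sharply increases their contrast.

def priorities(m, iters=200):
    """Normalised principal eigenvector by power iteration."""
    n = len(m)
    w = [1.0 / n] * n
    for _ in range(iters):
        w = [sum(m[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]
    return w

base = [
    [1.0,     2.0,     3.0],
    [0.5,     1.0,     1.5],
    [1 / 3.0, 1 / 1.5, 1.0],
]
# same matrix with all off-diagonal degrees doubled
doubled = [
    [1.0,     4.0,     6.0],
    [0.25,    1.0,     3.0],
    [1 / 6.0, 1 / 3.0, 1.0],
]

w1, w2 = priorities(base), priorities(doubled)
print([round(x, 3) for x in w1])
print([round(x, 3) for x in w2])
```

The base matrix is consistent, so the ratio of the largest to the smallest weight is exactly 3; after doubling the degrees, that ratio is well above 6 rather than 6 exactly, illustrating the point made in the text.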

Using the latter set of weights instead of the former would substantially change the values attached to the alternatives through formula 6.11 and might even alter their ordering.

As a further example, we now apply the method to determine the evaluation of the alternatives in terms of preference on the "Acceleration" criterion. Suppose the pairwise comparison matrix has been filled as shown in Table 6.8, in a way that seems consistent with what we know of Thierry's preferences.

[Table 6.8: Pairwise comparisons of preferences of 7 cars on the acceleration criterion (nr 7 Honda Civic, 11 Peugeot 309/16V, 3 Nissan Sunny, 12 Peugeot 309, 10 Renault 19, 4 Mazda 323, 6 Toyota Corolla).]

Applying the eigenvalue method yields the following "priorities" attached to each of the cars in relation to acceleration: (.2694, .2987, .1507, .0934, .0745, .0584, .0548). A picture of the resulting re-scaling of that criterion is provided in Figure 6.6, where the solid line is a linear interpolation of the priorities in the eigenvector.

In AHP, since the assessments of all nodes are made independently, these assessments are made on an absolute scale: there is no degree of freedom in the assessment of the ratios, contrary to the determination of the trade-offs in an additive value model (which may be re-scaled through multiplying them by a positive number); in other words, no transformation is allowed. Notice, by contrast, that the origin is arbitrary in the single-attribute value model: one may add any constant number to the values without changing the ranking of the alternatives (a term equal to the constant number times the trade-off associated with the attribute would just be added to the multi-attribute value function). Similarly, changing the unit on the vertical axis amounts to multiplying ui by a positive number; since trade-offs depend on the scaling of their corresponding single-attribute value function, a transformation of the former must be compensated for by transforming the latter: if ui is multiplied by a positive number, the corresponding trade-off must be divided by the same number, without altering the way in which the alternatives are ordered by the multi-attribute value function. In the multi-attribute value model, the scaling of the single-attribute value functions is thus related to the value of the trade-offs.

A re-scaling of the same criterion had previously been obtained through the construction of a standard sequence (see Figure 6.5). Comparing these scales is not straightforward. In order to compare the two figures, one may transform the value function of Figure 6.5 so that it coincides with the AHP priority on the extreme values of the acceleration half range. Figure 6.7 shows the transformed single-attribute value function superimposed on the graph of the priorities.
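The transformation used to superimpose the two curves is a positive affine rescaling pinned down by two anchor points. The sketch below illustrates the idea on a made-up linear value function for acceleration; the anchor values 0.25 and 0.05 are illustrative, not the actual priorities.

```python
# Affine rescaling of a single-attribute value function so that its
# values at two reference points coincide with given target values
# (e.g. AHP priorities at the extremes of the acceleration half range).
# All numbers are hypothetical, for illustration only.

def affine_rescale(u, x_low, x_high, target_low, target_high):
    """Return x -> a * u(x) + b, with a > 0, matching the two targets."""
    a = (target_high - target_low) / (u(x_high) - u(x_low))
    b = target_low - a * u(x_low)
    return lambda x: a * u(x) + b

u = lambda x: (31.0 - x) / 3.0          # value decreases with acceleration time (s)
v = affine_rescale(u, 28.0, 29.5, 0.25, 0.05)

print(round(v(28.0), 3), round(v(29.5), 3))  # → 0.25 0.05
```

Because the transformation is affine with a > 0, the ordering of the alternatives induced by u is preserved, which is exactly what the text requires of such a re-scaling.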

The linearly transformed single-attribute values of Figure 6.5 are represented by the dotted line on the range from 28 to 29.5 seconds.

[Figure 6.7: Priorities relative to acceleration as obtained through the eigenvector method (solid line) and linearly transformed single-attribute values of Figure 6.5 (dotted line), plotted against acceleration (sec) on the range from 28 to 31 seconds.]

There seems to be a good fit of the two curves, but this is only an example from which no general conclusion can be drawn.

Comments on AHP

Although the models for describing the overall preferences of the decision-maker are identical in multi-attribute value theory and in AHP, this does not mean that applying the respective methodologies of these theories normally yields the same overall evaluation of the alternatives. There are striking differences between the two approaches from the methodological point of view. The ambition of AHP is to help construct evaluations of the alternatives for each viewpoint (in terms of preferences) and of the viewpoints with regard to the overall goal (in terms of importance). These evaluations are claimed to belong to a ratio scale, i.e. to be determined up to a positive multiplicative constant. Since the eigenvalue method yields a particular determination of this constant, and this determination is not taken into account when assessing the relative importance of the various criteria, the evaluations in terms of preference must in fact be considered as if they were made on an absolute scale, a point which has been repeatedly criticised in the literature (see for instance Belton (1986) and Dyer (1990)). This weakness (which can also be blamed on direct rating techniques, as mentioned above) could be corrected by asking the decision-maker about the relative importance of the viewpoints in terms of passing from the least preferred value to the most preferred value on criterion i, compared to a similar change on criterion j (Dyer 1990).

Taking this suggestion into account would however go against one of the basic principles of Saaty's methodology, namely the assumption that the assessments at all levels of the hierarchy can be made along the same procedure and independently of the other levels. That is probably why the original method, although seriously attacked, has remained unchanged.

Besides the fact, already mentioned, that it may be difficult to reliably assess comparisons of preferences or of importance on the standard scale described in Table 6.6, AHP has been criticised in the literature in several other respects. In particular, there is an issue about AHP that has been discussed quite a lot, namely the possibility of rank reversal. Suppose alternative x is removed from the current set and nothing is changed to the pairwise assessments of the remaining alternatives; it may happen that an alternative a among the remaining ones could now be ranked below an alternative b, whilst it was ahead of b in the initial situation. This phenomenon was discussed in Belton and Gear (1983) and Dyer (1990) (see also Harker and Vargas (1987) for a defense of AHP).

6.3.3 An indirect method for assessing single-attribute value functions and trade-offs

Various methods have been conceived in order to avoid direct elicitation of a multi-attribute value function. A class of such methods consists in postulating an additive value model (as described in formulae 6.7 and 6.8) and inferring all together the shapes of all single-attribute value functions and the values of all the trade-offs from declared global preferences on a subset of well-known alternatives. The idea is thus to infer a general preference model from partial holistic information about the decision-maker's preferences. Thierry used a method of disaggregation of preferences described in Jacquet-Lagrèze and Siskos (1982), which computes piece-wise linear single-attribute value functions and is based on linear programming (see also Jacquet-Lagrèze (1990), Vincke (1992)); it is implemented in a software called Prefcalc. More precisely, the software helps to build a function

    u(a) = Σ_{i=1}^{n} ui(gi(a))

such that a ≿ b ⇔ u(a) ≥ u(b).

Without loss of generality, the lowest (resp. highest) value of u is conventionally set to 0 (resp. 1); 0 (resp. 1) is the value of a (fictitious) alternative whose assessment on each criterion would be the worst (resp. best) evaluation attained for the criterion on the current set of alternatives. This fictitious alternative is sometimes called the anti-ideal (resp. ideal) point. In our example, the "anti-ideal" car costs 21 334 €, needs 30.8 seconds to cover 1 km starting from rest and 41.6 seconds starting in fifth gear at 40 km/h; its performances regarding brakes and road-holding are respectively 1.33 and 1.25. The "ideal" car, on the opposite side of the range, costs 13 841 €, needs 28 seconds to cover 1 km starting from rest and 34.7 seconds starting in fifth gear at 40 km/h; its performances regarding brakes and road-holding are respectively 2.66 and 3.25.

[Figure 6.8: Single-attribute value functions computed by means of Prefcalc in the "Choosing a car" problem; the value of the trade-off (Cost .43, Acceleration .23, Pick-up .13, Brakes .1, Road-holding .1) is written in the right upper corner of each box.]

The shape of the single-attribute value function for the cost criterion, for instance, is modelled as follows. The user fixes the number of linear pieces; suppose that you decide to set it to 2 (which is a parsimonious option and the default value proposed in Prefcalc). With two linear pieces, one for each half of the cost range, the single-attribute value function is completely determined by two numbers: the utility value at mid-range and the maximal utility. Those values, say u1.1 and u1.2, are variables of the linear program that Prefcalc writes and solves; more generally, Prefcalc tries to find levels ui.1 and ui.2 for each criterion i which make the additive value function compatible with the declared information. Note that the maximal value of the utility (reached for a cost of 13 841 €) is scaled in such a way that it corresponds to the value of the trade-off associated with the cost criterion (.43 in the example shown in Figure 6.8); the single-attribute value function of the cost could for instance be represented as in Figure 6.8.

The pieces of information on which the formulation of the linear program relies are obtained from the user, who is asked to select a few alternatives that he is familiar with and feels able to rank-order according to his overall preferences. The ordering of these alternatives, which include the fictitious ideal and anti-ideal ones, induces the corresponding order on their overall values and hence generates the constraints of the linear program. If the program is not contradictory, i.e. if an additive value function (with 2-piece piece-wise linear single-attribute value functions) proves compatible with the declared preferences, the system tries to find, among all feasible solutions, the one that maximises the discrimination between the selected alternatives.
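The role of the two numbers that determine such a 2-piece function can be made concrete with a small sketch. Only the ideal and anti-ideal costs below are taken from the case; the breakpoint utilities u_mid and u_max are hypothetical, not Thierry's actual Prefcalc output.

```python
# A sketch of a Prefcalc-style single-attribute value function for cost:
# piece-wise linear with two pieces, determined by the utility at
# mid-range (u_mid) and the maximal utility (u_max, the trade-off).
# u_mid and u_max are hypothetical illustration values.

def two_piece_value(cost, c_best=13841.0, c_worst=21334.0,
                    u_mid=0.30, u_max=0.43):
    """Value of a cost, linear on each half of [c_best, c_worst];
    u(c_best) = u_max, u(mid-range) = u_mid, u(c_worst) = 0."""
    mid = (c_best + c_worst) / 2.0
    if cost <= mid:   # first piece: from u_max down to u_mid
        return u_max + (u_mid - u_max) * (cost - c_best) / (mid - c_best)
    else:             # second piece: from u_mid down to 0
        return u_mid * (c_worst - cost) / (c_worst - mid)

print(round(two_piece_value(13841.0), 2), round(two_piece_value(21334.0), 2))
# → 0.43 0.0
```

Making u_mid and u_max variables of a linear program, with one such pair per criterion, is exactly what gives Prefcalc the freedom to fit the declared ranking.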

If no feasible solution can be found, the system proposes to increase the number of variables of the model, for instance by using a higher number of linear pieces in the description of the single-attribute value functions. This method could be described as a learning process: the system fits the parameters of the model on the basis of partial information about the user's preferences, and the set of alternatives on which the user declares his global preferences may be viewed as a learning set. For more details on the method, the reader is referred to Jacquet-Lagrèze and Siskos (1982) and Vincke (1992).

In his ex post study, Thierry selects five cars, besides the ideal and anti-ideal ones, and ranks them in the following order:

1. Peugeot 309 GTI 16 (Car 11)
2. Nissan Sunny (Car 3)
3. Mitsubishi Galant (Car 13)
4. Ford Escort (Car 9)
5. Renault 21 (Car 14)

This ranking is compatible with an additive value function; such a compatible value function is described in Figure 6.8. Thierry examines this result and makes the following comments. He agrees with many features of the fitted single-attribute value functions, and in particular with:

1. the lack of sensitivity in the price in the range from 13 841 € to 17 576 € (he was a priori estimating his budget at about 17 500 €);

2. the high importance (weight = .23) given to approaching 28 seconds on the "acceleration" criterion (above 29 seconds the car is useless, since a difference of 1 second in acceleration results in the faster car being two car lengths ahead of the slower one at the end of the test);

3. the importance (weight = .13) of getting as close as possible to 34 seconds in the acceleration test starting from 40 km/h (above 38 seconds he agrees that the car loses all attractiveness; the third criterion has a certain importance, although less than the second one);

4. the modelling of the road-holding criterion.

Thierry disagrees, however, with the modelling of the braking criterion, which he considers equally important as road-holding; he declares it the second most important criterion after cost (weight = .43). The car is not only used in competition: it must be pleasant in everyday use, and hence braking matters. He believes that the relative importance of the fourth and fifth criteria should be revised. Thierry then looks at the ranking of the cars according to the computed value function.

The ranking, as well as the multi-attribute value assigned to each car, are given in Table 6.9.

    Rank  Cars                            Value
      1   Peugeot 309/16 (Car 11) *       0.84
      2   Nissan Sunny (Car 3) *          0.68
      3   Renault 19 (Car 10)             0.66
      4   Peugeot 309 (Car 12)            0.65
      5   Honda Civic (Car 7)             0.61
      6   Fiat Tipo (Car 1)               0.54
      7   Opel Astra (Car 8)              0.54
      8   Mitsubishi Colt (Car 5)         0.53
      9   Mazda 323 (Car 4)               0.52
     10   Toyota Corolla (Car 6)          0.50
     11   Alfa 33 (Car 2)                 0.49
     12   Mitsubishi Galant (Car 13) *    0.48
     13   Ford Escort (Car 9) *           0.32
     14   R 21 (Car 14) *                 0.16

Table 6.9: Ranking obtained using Prefcalc. The cars ranked by Thierry are those marked with a *.

Note that Prefcalc normalises the value function so that the ideal alternative is always assigned the value 1; the sum of the maximal values of the single-attribute value functions may be only approximately equal to 1, due of course to the display format of the numbers, with two decimal positions.

Thierry feels that Car 10 (Renault 19) is ranked too high, while Car 7 (Honda Civic) should be in a better position. In view of these observations, he modifies the single-attribute value functions for criteria 4 and 5. For the braking criterion, the utility (0.01) associated with the level 2 remains unchanged, while the utility of the level 2.7 is raised to 0.1. The road-holding criterion is also modified: the value (0.2) associated with the level 3.2 is lowered to 0.1 (see Figure 6.9). Running Prefcalc with the altered value functions returns the ranking in Table 6.10, with the revised multi-attribute value after each car name. After he sees the modified ranking yielded by Prefcalc, Thierry feels that the new ranking is fully satisfactory. He observes that if he had used Prefcalc a few years earlier, he would have made the same choice as he actually did; he considers this a good point as far as Prefcalc is concerned. He finally makes the following comments: "Using Prefcalc has enhanced my understanding of both the data and my own preferences; in particular, I am more conscious of the relative importance I give to the various criteria".

Comments on the method

First, let us emphasise an important psychological aspect of the empirical validation of a method or a tool, which is common in human practice: the fact that previous intuition or previous, more informal, analyses are confirmed by using a tool, here Prefcalc, contributes to raising the level of confidence the user puts in the tool.

[Figure 6.9: Modified single-attribute value functions for the braking and road-holding criteria.]

    Rank  Cars                            Value
      1   Peugeot 309/16 (Car 11) *       0.85
      2   Nissan Sunny (Car 3) *          0.75
      3   Honda Civic (Car 7)             0.66
      4   Peugeot 309 (Car 12)            0.65
      5   Renault 19 (Car 10)             0.61
      6   Opel Astra (Car 8)              0.55
      7   Mitsubishi Colt (Car 5)         0.54
      8   Mazda 323 (Car 4)               0.53
      9   Fiat Tipo (Car 1)               0.51
     10   Toyota Corolla (Car 6)          0.50
     11   Mitsubishi Galant (Car 13) *    0.48
     12   Alfa 33 (Car 2)                 0.47
     13   Ford Escort (Car 9) *           0.32
     14   R 21 (Car 14) *                 0.16

Table 6.10: Modified ranking using Prefcalc. The cars ranked by Thierry are those marked with a *.

Observe that the user may well have a very vague understanding of the method itself. Usually, he simply validates the method by using it to reproduce results that he has confidence in; after such a successful empirical validation step, he will be more prone to use the method in new situations that he does not master that well.

What are the drawbacks and traps of Prefcalc? Obviously, Prefcalc can only be used in cases where the overall preference of the decision-maker can be represented by an additive multi-attribute value function (as described by Equation 6.3); this is not the case when preferences are not transitive or not complete (for arguments supporting the possible observation of non-transitive preferences, see the survey by Fishburn (1991)). There are some additional restrictions due to the fact that the shapes of the single-attribute value functions that can be modelled by Prefcalc are limited to piece-wise linear functions. This is hardly a restriction when dealing with a finite set of alternatives: by adapting the number of linear pieces, one can obtain approximations of any continuous curve that are as accurate as desired. When bounded to a small number of pieces, however, this may be a more serious restriction.

Stability of the ranking

The main problem raised by the use of such a tool is the indetermination of the estimated single-attribute value functions (including the estimation of the trade-offs). If the preferences declared on the set of well-known alternatives are compatible with an additive value model, there will in general be several value functions that can represent these preferences; Prefcalc chooses one such representation, the most discriminating one (in a sense), according to the principles outlined above. Other choices of a model, albeit compatible with the declared preferences on the learning set, may lead to variations in the rankings of the remaining alternatives. It should thus be very clear that, in practice, determining the trade-offs with sufficient accuracy could be both crucial and challenging. Slight variations in the trade-off values can yield rank reversals. In the case under examination, with all trade-offs within ±.02 of their value in Figure 6.8, changes already occur: passing to a set of trade-offs that puts slightly more emphasis on cost and slightly less on performance results in exchanging the positions of the Honda Civic and the Peugeot 309. Note that such a slight change in the trade-offs has an effect on the ranking of the top 4 cars, since these two cars are ranked 3rd and 4th respectively after the change. It is therefore of prime importance to carry out a lot of sensitivity analyses, in order to identify which parts of the result remain reasonably stable.

Dependence on the learning set

In view of the fact that small variations of the trade-offs may even result in changes in the ranking of the top alternatives, one may question the influence of the selection of the learning set. In the present case, the learning set consisted of the alternatives on which Thierry focused after his preliminary analysis (see Table 6.3); the top two alternatives were chosen to be in the learning set and hence are constrained to appear in the correct order in the output of Prefcalc.
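The sensitivity to the trade-off values discussed above is easy to reproduce on toy data. In the sketch below, the single-attribute values and trade-offs are hypothetical (they are not the actual Prefcalc output): two closely ranked cars swap positions when slightly more emphasis is put on cost.

```python
# Hypothetical illustration of rank reversal under small trade-off
# changes in an additive model: two alternatives with close overall
# values swap positions when slightly more weight is put on cost.

def overall(u_values, weights):
    """Additive overall value: sum of trade-off-weighted single values."""
    return sum(w * u for w, u in zip(weights, u_values))

# single-attribute values (made up) on (cost, performance) for two cars
car_a = (0.9, 0.4)   # cheap, slow
car_b = (0.6, 0.7)   # dearer, quick

w1 = (0.45, 0.55)    # performance-leaning trade-offs
w2 = (0.55, 0.45)    # slightly more emphasis on cost

print(overall(car_a, w1) > overall(car_b, w1))  # → False
print(overall(car_a, w2) > overall(car_b, w2))  # → True
```

A shift of 0.1 in the trade-offs is enough here; with values as close as those in Tables 6.9 and 6.10, even smaller perturbations can reorder alternatives, which is why systematic sensitivity analysis is recommended.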

From a general point of view, where information is incomplete, it must be decided how to complement the available facts by some arbitrary default assumptions; the information should then be collected while taking the assumptions made into account. In a learning process, one may typically expect that the decision-maker will naturally choose, as members of the learning set, alternatives that he considers as clearly distinct from one another; the analyst might alternatively instruct the decision-maker to do so. The analyst's instruction of selecting alternatives that are as contrasted as possible is in good agreement with the implementation options: in the case of Prefcalc, one may consider that the option implemented in the mathematical programming model to reduce the indeterminacies (essentially, choosing to maximise the contrast between the evaluations of the alternatives in the learning set) is not aimed at being as insensitive as possible with regard to the selection of a learning set. It should be noted, however, that stability, which may be a desirable property in the perspective of uncovering an objective model of preference measurement, is not necessarily a relevant requirement when the goal is to exploit partial available information. Other options could be experimentally investigated, in order to see whether some could consistently yield more stable evaluations. Some sort of preliminary analysis of the user's preferences can also help to choose the learning set, or to understand the variations in the ranking and the trade-offs a posteriori.

What would have happened if the learning set had been different? Let us take another subset of 5 cars and declare preferences that agree with the ranking validated by Thierry (Table 6.10). When substituting the top 2 cars (Peugeot 309/16V, Nissan Sunny) by Renault 19 and Mitsubishi Colt, two cars in the middle segment of the ranking, the vector of trade-offs is (.53, .06, .08, .08, .25) and the top four in the new ranking are Renault 19 (1), Peugeot 309 (2), Peugeot 309/16V (3) and Nissan Sunny (4). Clearly, stronger emphasis has now been put on cost and safety (brakes and road-holding) and much less on performance (acceleration and pick-up): the Renault 19 is heading the race mainly due to its excellent road-holding, while the Honda Civic is relegated to the 12th position, due to its higher cost and its weakness on road-holding. Further experiments have been performed, reintroducing in turn one of the 4 top cars and removing Renault 19. Three of the former top cars then remain in the top four; one can be relatively satisfied with these results, since the top 3 cars are usually well-ranked. The ranking of the Honda Civic is much more unstable, and it is not difficult to understand why (weakness on road-holding and relatively high cost). Of course, for the rest of the cars, huge variations may appear in their ranking, but one is usually more interested in the top ranked alternatives. Clearly, the value of the trade-offs may depend drastically on the learning set.

6.3.4 Conclusion

This section has been devoted to the construction of a formal model that represents preferences on a numerical scale. Such a model can only be expected to exist when preferences satisfy rather demanding hypotheses. The direct assessment of multi-attribute value functions is a narrow road between the practical problem of obtaining reliable answers to difficult questions and the risks involved in building a model on answers to simpler but ambiguous questions. Indirect methods, based on exploiting partial information and extrapolating it (in a recursive validation process), may help when the information is not available in explicit form; it remains that the quality of the information is crucial and that a lot of it is needed.

The additive multi-attribute value model is rewarding, since it is directly interpretable in terms of decision: once the hypotheses of the model have been accepted or proved valid in a decision context, and provided the process of elicitation of the various parameters of the model has been conducted correctly, the best decision is the one the model values most (provided the imprecisions in the establishment of the model and the uncertainties in the evaluation information allow to discriminate at least between the top alternatives). It thus relies on firm theoretical bases, which is undoubtedly part of the intellectual appeal of the method. The counterpart of the clear-cut character of the conclusions that can be drawn from the model is that establishing the model requires a lot of information of a very precise and particular type. This means that the model may be inadequate not only because the hypotheses might not be fulfilled, but also because the respondents might feel unable to answer the questions, or because their answers might not be reliable. There is at least one additional advantage to theoretically well-founded decision models: such models can be used to legitimate a decision to persons that have not been involved in the decision making process; once established and accepted by the stake-holders, the decision becomes transparent. In the next section, we shall explore a very different formal approach that may be less demanding with regard to the precision of the information, but also provides less conclusive outputs.

6.4 Outranking methods

6.4.1 Condorcet-like procedures in decision analysis

Is there any alternative way of dealing with multiple criteria evaluation in view of a decision than the one described above, which builds a one-dimensional synthetic evaluation on some sort of super-scale? To answer this question (positively), inspiration can be gained from the voting procedures discussed in Chapter 2 (see also Vansnick (1986)). Suppose that each voter expresses his preferences through a complete ranking of the candidates. With Borda's method, each candidate is assigned a rank for each of the voters (rank 1 if the candidate is ranked first by a voter, rank 2 if he is ranked second, and so on); the Borda score of a candidate is the sum of the ranks assigned to him by the voters, and the winner is the candidate with the smallest Borda score.
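Borda's rule as just described can be stated in a few lines. The 3-voter example below is made up for illustration, not data from the case:

```python
# Borda's method: each voter (or criterion) supplies a complete ranking;
# a candidate's Borda score is the sum of the ranks it receives, and
# the winner is the candidate with the smallest score.

def borda_scores(rankings):
    """rankings: list of complete rankings (best candidate first)."""
    candidates = rankings[0]
    scores = {c: 0 for c in candidates}
    for ranking in rankings:
        for rank, candidate in enumerate(ranking, start=1):
            scores[candidate] += rank
    return scores

voters = [
    ["a", "b", "c"],   # voter 1: a best, then b, then c
    ["a", "c", "b"],   # voter 2
    ["b", "a", "c"],   # voter 3
]
print(borda_scores(voters))  # → {'a': 4, 'b': 6, 'c': 8}
```

With points of view in place of voters, the same computation produces the synthetic evaluation of the alternatives discussed in the next paragraph.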

b in the “Choosing a car” problem the smallest Borda score. not taking into account criteria for which a and b are tied. A further step is thus needed in order to exploit this relation in view of the selection of one or several candidates or in view of ranking all the candidates.e. For each pair of cars a and b. the elements of the matrix are integers ranging from 0 to 5. A candidate is declared to be preferred to another according to a majority rule. we count the number of criteria according to which a is at least as good as b. This yields the matrix given in Table 6.11 and 0 to any number smaller than 3 yielding the . Note that we might have alternatively decided to count the criteria for which a is better than b. Condorcet’s method consists of a kind of tournament where all candidates compare in pairwise “contests”. We do this below. OUTRANKING METHODS Cars 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1 5 2 4 3 3 2 3 3 2 4 4 4 3 2 2 3 5 4 1 3 2 3 2 3 4 4 4 2 3 3 1 2 5 1 1 1 1 1 1 2 3 2 0 1 4 2 4 4 5 5 2 4 4 4 3 4 4 2 3 5 2 2 4 1 5 2 4 4 4 2 4 4 1 3 6 3 3 4 3 3 5 4 4 3 3 4 4 2 3 7 3 2 4 1 2 2 5 3 1 2 4 4 1 1 8 2 3 4 2 2 2 3 5 2 3 5 4 2 3 9 3 3 4 1 2 2 4 3 5 3 4 3 1 3 10 2 1 3 2 3 2 3 2 2 5 3 4 1 2 11 2 1 2 1 1 1 2 0 1 3 5 3 1 0 12 2 1 3 1 1 1 2 2 2 2 4 5 0 1 13 2 4 5 4 5 3 4 4 4 4 4 5 5 4 14 3 3 4 2 2 2 4 3 3 3 5 4 1 5 125 Table 6. This method can be seen as a method of construction of a synthetic evaluation of the alternatives in multiple criteria decision analysis.11: Number of criteria in favour of a when compared to b for all pairs of cars a. the points of view corresponding to the voters and the alternatives to the candidates.4. b whether or not there is a (simple) majority of criteria for which a is at least as good as b. all criteria-voters have equal weight and coding by the rank number of the position of the candidate in a voter’s preference looks like a form of evaluation. This idea can of course be transposed in the multiple criteria decision context. 
What we could call the "Condorcet preference relation" is obtained by determining, for each pair of alternatives a and b, whether or not there is a (simple) majority of criteria for which a is at least as good as b. Since there are 5 criteria, the majority is reached as soon as at least 3 criteria favour alternative a when compared to b. The preference matrix is thus obtained by substituting 1 for any number larger than or equal to 3 in Table 6.11 and 0 for any number smaller than 3, yielding the relation described by the 0-1 matrix in Table 6.12.
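The counting and thresholding steps can be sketched as follows; this is a generic illustration, and the 5-criterion evaluations below are hypothetical, not Thierry's data.

```python
def concordance_counts(evaluations):
    """evaluations[a] lists the criterion values of alternative a (all
    criteria to be maximised); returns, for each ordered pair (a, b), the
    number of criteria on which a is at least as good as b."""
    return {a: {b: sum(x >= y for x, y in zip(evaluations[a], evaluations[b]))
                for b in evaluations}
            for a in evaluations}

def majority_relation(counts, threshold):
    """Condorcet preference relation: keep the ordered pairs (a, b)
    supported by at least `threshold` criteria."""
    return {(a, b) for a in counts for b in counts[a]
            if counts[a][b] >= threshold}

# Hypothetical evaluations of three cars on 5 criteria.
evals = {"a": [3, 2, 5, 1, 4], "b": [2, 2, 4, 3, 1], "c": [4, 1, 1, 2, 2]}
counts = concordance_counts(evals)
relation = majority_relation(counts, 3)  # simple majority: 3 criteria out of 5
```

The relation obtained this way is reflexive (every alternative is supported against itself by all 5 criteria) but, as the text goes on to show, it need not be transitive or even acyclic.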

13. To make things appear more clearly. an instance of such a cycle is 1. COMPARING ON SEVERAL ATTRIBUTES 2 1 1 1 0 1 0 1 0 1 1 1 1 0 1 3 0 0 1 0 0 0 0 0 0 0 1 0 0 0 4 0 1 1 1 1 0 1 1 1 1 1 1 0 1 5 0 0 1 0 1 0 1 1 1 0 1 1 0 1 6 1 1 1 1 1 1 1 1 1 1 1 1 0 1 7 1 0 1 0 0 0 1 1 0 1 1 1 0 0 8 0 1 1 0 0 0 1 1 0 1 1 1 0 1 9 1 1 1 0 0 0 1 1 1 1 1 1 0 1 10 0 0 1 0 1 0 1 0 0 1 1 1 0 0 11 0 0 0 0 0 0 0 0 0 1 1 0 0 0 12 0 0 1 0 0 0 0 0 0 0 1 1 0 0 13 0 1 1 1 1 1 1 1 1 1 1 1 1 1 14 1 1 1 0 0 0 1 1 1 1 1 1 0 1 Table 6. .126 Cars 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1 1 0 1 1 1 0 1 1 0 1 1 1 1 0 CHAPTER 6. 12 (preferred to all but 2). . and ﬁnally 3 criteria saying that 13 is at least as good as 1. Obviously it is not straightforward to suggest a good choice on the basis of such a relation since one can ﬁnd 3 criteria (out of 5) saying that 1 is at least as good as 7.12: Condorcet Preference relation for the “Choosing a car”problem. 11. . All cycles in the previous relation disappeared. one might decide to impose more demanding levels of majority in the deﬁnition of a preference relation. 12 (twice). A “1” at the intersection of the a row and the b column means that a is rated not lower than b on at least 3 criteria relation described by the 0-1 matrix in Table 6. 5. 6. then come 10 (5 times) and 7 (6 times). by avoiding cycles as much as possible. 14. 2. How can we possibly obtain something from this matrix in view of our goal of selecting the best car? A closer look at the preference relation reveals that some alternatives are preferred to most others while some to only a few ones. When ranking the alternatives . 1. 8.12. 3. 3 (possibly diﬀerent) criteria saying that 7 is at least as good as 10. the relation is reﬂexive since any alternative is at least as good as itself along all criteria.13. excluding by themselves). 10. Majority rule and cycles It is not immediately apparent that this relation has cycles and even cycles that go through all alternatives. 
Among the former are alternatives 11 (preferred to all), 3 (preferred to all but one), 12 (preferred to all but 2) and 7 (preferred to all but 3). The same alternatives appear as seldom beaten: 3 and 11 (only once, excluding by themselves), then come 10 (5 times) and 7 (6 times).

We might require, for instance, that an alternative be at least as good as another on 4 criteria or more. Note that a criterion counts both in favour of a and in favour of b only if a and b are tied on that criterion. The new preference relation is shown in Table 6.13; all cycles in the previous relation disappeared.

When ranking the alternatives by the number of those they beat (i.e. those they are at least as good as on 4 criteria or more), one sees that 3, 11 and 12 come in the first position (they are preferred to 10 other cars); then there is a big gap, after which come 7, 8 and 10, that beat only 3 other cars. Conversely, in the present case there are two non-beaten cars, 3 and 11; then come 10 and 12 (beaten by one car), while 7 is beaten by 3 cars.

There are at least two radical differences between the approach used here and approaches based on the weighted sum, or on some more sophisticated way of assessing each alternative by a single number that synthesises all the criteria values. One is that all criteria have been considered equally important; it is possible, however, to take information on the relative importance of the criteria into account, as will be seen in Section 6.4.3. The second difference is more in the nature of the type of approach: the most striking point is that the size of the differences in the evaluations of a and b on the criteria does not matter, only the signs of those differences do. In other words, had the available information been rankings of the cars with respect to each criterion (instead of numeric evaluations), the result of the "Condorcet" procedure would have been exactly the same.
Cars   1  2  3  4  5  6  7  8  9 10 11 12 13 14
  1    1  0  0  0  0  0  0  0  0  0  0  0  0  0
  2    0  1  0  1  0  0  0  0  0  0  0  0  1  0
  3    1  1  1  1  1  1  1  1  1  0  0  0  1  1
  4    0  0  0  1  0  0  0  0  0  0  0  0  1  0
  5    0  0  0  1  1  0  0  0  0  0  0  0  1  0
  6    0  0  0  0  0  1  0  0  0  0  0  0  0  0
  7    0  0  0  0  0  0  1  0  1  0  0  0  1  1
  8    0  0  0  1  1  0  0  1  0  0  0  0  1  0
  9    0  0  0  0  0  0  0  0  1  0  0  0  1  0
 10    1  1  0  0  0  0  0  0  0  1  0  0  1  0
 11    1  1  0  1  1  0  1  1  1  0  1  1  1  1
 12    1  1  0  1  1  1  1  1  0  1  0  1  1  1
 13    0  0  0  0  0  0  0  0  0  0  0  0  1  0
 14    0  0  0  0  0  0  0  0  0  0  0  0  0  1

Table 6.13: Condorcet preference relation for the "Choosing a car" problem. A "1" at the intersection of the a row and the b column means that a is rated not lower than b on at least 4 criteria.

We thus see that the simple approach that was used essentially makes the same cars emerge as the methods used so far.

Such issues were discussed extensively in Section 6.2.1; the whole analysis carried out there was aimed at the construction of a multiple criteria value function. The many methods that can be used to build a value function by questioning a decision-maker about his preferences may well fail, however. Without being exhaustive, let us list a few reasons for the possible failure of these methods:

• time pressure may be so intense that there is not enough time available to engage in the lengthy elicitation process of a multiple criteria value function;
• the decision-maker might not know how to answer the questions, or might try to answer but prove inconsistent, or might feel discomfort at being forced to give precise answers where things are vague to him;
• in case of group decision, the analyst may be unable to make the various decision-makers agree on the answers to be given to some of the questions raised in the elicitation process;
• it may be that the importance of the decision to be made does not justify such an effort.

In such cases it may be inappropriate or inefficient to try building a value function, and other approaches may be preferred. Take, for instance, criterion 4 (Brakes). Does the difference between the levels 2.33 and 2.66 have a quantitative meaning? If it does, is this difference, in terms of preferences, more than, less than or equal to the difference between the levels 1.66 and 2? How much would you accept to pay (in terms of criterion 1) to raise the value for criterion 4 from 2.33 to 2.66, or from 1.66 to 2.33? Of course, the questions raised for eliciting value functions are more indirect, but they still require a precise perception of the meaning of the levels on the scale of criterion 4 by the decision-maker; such a perception can only be obtained by having experienced the braking behaviour of specific cars rated at the various levels of the scale.
In the present case, neglecting the size of the differences for a criterion such as cost may appear as misusing the available information, since building a value function implies making any difference in evaluations on a criterion equivalent to some uniquely defined difference on any other criterion. There are, however, at least two considerations that could mitigate this commonsense reaction:

• the assessments of the cars on the cost criterion are rather rough estimations of an expected cost (see Section 6.2.1), in which several hypotheses are implicit, in particular that on average the lifetimes of all the alternatives are equal; supposing that similar hypotheses are made for the other 4 criteria, is it reasonable in those circumstances to rely on precise values of differences of these estimations to select the "best" alternative?
• estimations of cost, even reliable ones, are not necessarily related to preferences on the cost criterion in a simple way.

Suppose, for instance, that all that we know (or that Thierry considers relevant in terms of preferences) about the cost criterion is the ordering of the cars according to their estimated cost, i.e. a relation "... is cheaper than ...". If this were the case, we would have obtained exactly the same matrices as in Tables 6.12 and 6.13. This appears perhaps even better if we consider the more artificial scales associated with criteria 4 and 5 (see Section 6.1.1 concerning the construction of these scales).

Such knowledge cannot, however, be expected from a decision-maker (otherwise there would be no room on the marketplace for all the magazines that evaluate goods in order to help consumers spend their money while making the best choice). Also remember that braking performance has been described by the average of 3 indices evaluating aspects of the cars' braking behaviour; this does not favour a deep intuitive perception of what the levels on that scale may really mean. So, one has to admit that in many cases the definition of the levels on scales is quite far from precise in quantitative terms, and it may be "hygienic" not to use the fallacious power of numbers, in order not to take into account differences that are only due to the irrelevant precision of numbers. This is definitely the option chosen in the methods discussed in the present section. Not that these methods are purely ordinal: differences between levels on a scale are carefully categorised, but usually in a coarse-grained fashion.

6.4.2 A simple outranking method

The Condorcet idea for a voting procedure has been transposed in decision analysis under the name of outranking methods. Such a transposition takes the peculiarities of the decision analysis context into account, in particular the fact that criteria may be perceived as unequally important; additional elements, such as the notion of discordance, have also been added.
The principle of these methods is as follows. Each pair of alternatives is considered in turn, independently of third alternatives; when looking at alternatives a and b, it is claimed that a "outranks" b if there are enough arguments to decide that a is at least as good as b, while there is no essential reason to refute that statement (Roy (1974), cited by Vincke (1992), p. 58). Note that taking strong arguments against declaring a preference into account is typically what is called "discordance", and is original with respect to the simple Condorcet rule. Such an approach has been operationalised through various procedures, particularly the family of ELECTRE methods associated with the name of B. Roy (for an overview of outranking methods, the reader is referred to the books by Vincke (1992) and Roy and Bouyssou (1993)). ELECTRE I is a tool designed to be used in the context of a choice decision problem: it builds up a set of which the best alternative (according to the decision-maker's preferences) should be a member. Let us emphasise that this set cannot be described as the set of best alternatives, not even as a set of good alternatives, but just as a set that contains the "best" alternatives. Below, we discuss an application of ELECTRE I, the simplest of these methods, to Thierry's case; we shall then show how its fundamental ideas can be sophisticated. Our goal is not to make a survey of all outranking methods; we just want to present the basic ideas of such methods and illustrate some problems they may raise.

Transitivity, acyclicity and completeness issues

As a preamble, it may be useful to emphasise the fact that outranking methods (and, more generally, methods based on pairwise comparisons) do not generally yield preferences that are transitive, not even acyclic; this point was already made in Chapter 2 about Condorcet's method. Since the hypotheses of Arrow's theorem can be re-formulated so as to be relevant in the framework of multiple criteria decision analysis (through the correspondence candidate-alternative, voter-criterion), it is no wonder that methods based on comparisons of alternatives by pairs, independently of the other alternatives, will seldom directly yield a ranking of the alternatives: contradictions are reflected either in cycles (a outranks b that outranks c that ... outranks a) or in incomparabilities (neither a outranks b nor the opposite). Once the outranking relation has been constructed, the job of suggesting a decision is thus not straightforward; a phase of exploitation of the outranking relation is needed in order to provide the decision-maker with information more directly interpretable in terms of a decision.

Let us emphasise that the lack of transitivity or of completeness, although raising operational problems, may be viewed not as a weakness but rather as faithfully reflecting preferences as they can be perceived at the end of the study. Defenders of the approach support the idea that forcing preferences to be expressed in the format of a complete ranking is in general too restrictive, and there is experimental evidence that backs their viewpoint (Tversky (1969), Fishburn (1991)). The pairs of alternatives that belong to the outranking relation are normally those between which the preference is established with a high degree of confidence; explicit recognition that some alternatives are incomparable may be an important piece of information for the decision-maker.
The outranking relation should thus be interpreted as what is clear-cut in the preferences of the decision-maker, something like the surest and most stable expression of a complex, vague and evolving object that is named, in this context, "the preferences of the decision-maker". As repeatedly stressed in the writings of B. Roy, one may even doubt that preferences pre-exist the process from which they emerge. The approach could be called constructive: very few hypotheses (such as rationality hypotheses) are made on preferences, and models are built that reflect, to some extent, the way of thinking, the feelings and the values of a decision-maker. In contrast with most artificial intelligence practice, preferences are not simply described through rules extracted from partial information obtained on a learning set: the model of preferences is built explicitly and formally, carefully, prudently and interactively. The analysis of a decision problem is conceived as an informational process that has many features in common with a learning process; the concern is not making a decision but helping a decision-maker to make up his mind, helping him to understand a decision problem while taking his own values into account in the modelling of the decision situation. For more about the constructive approach, including comparisons with the classical normative and descriptive approaches (see Bell, Raiffa and Tversky (1988)), the reader is referred to Roy (1993); see also Bouyssou (1992) and Perny (1992).

In order to propose a set of alternatives of particular interest to the decision-maker, from which the best compromise alternative should emerge, one extracts the kernel of the graph of the outranking relation, after having reduced its cycles: all alternatives in a cycle are considered to be equivalent and are substituted by a unique representative node. In the resulting relation without cycles, the kernel is defined as a subset of alternatives that do not outrank one another and such that each alternative not in the kernel is outranked by at least one alternative in the kernel. In a graph without cycles, a unique kernel always exists; in particular, all non-outranked alternatives belong to the kernel, and an alternative incomparable to all others is always in the kernel. It should be emphasised that the alternatives in the kernel are not necessarily good candidates for selection, since alternatives in the kernel may be beaten by alternatives not in the kernel; rather, the kernel may be viewed as a set of alternatives on which the decision-maker's attention should be focused. Such a two-stage process offers the advantage of a good control on the transformation of the multi-dimensional information into a model of the decision-maker's preferences, including a certain degree of inconsistency and incompleteness.

6.4.3 Using ELECTRE I on the case

We briefly review the principles of the ELECTRE I method. For each pair of alternatives a and b, the so-called concordance index is computed: it measures the strength of the coalition of criteria that support the idea that a is at least as good as b. If all criteria are equally important, the concordance index is proportional to the number of criteria in favour of a as compared to b, as in the Condorcet-like method discussed above; in general, the strength of a coalition is just the sum of the weights associated with the criteria that constitute the coalition (the notion of weights will be discussed below). Another feature that contrasts ELECTRE with pure Condorcet, but also with purely ordinal methods, is that some large differences in evaluation, when in disfavour of a, might be pinpointed as preventing a from outranking b.
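Combining the two ingredients just described, coalition strength and a possible veto, the pairwise test can be sketched as follows; the weights, threshold and veto margins in the example are hypothetical, not values elicited from Thierry.

```python
def outranks(g_a, g_b, weights, conc_threshold, veto_margins):
    """ELECTRE-I-style test (sketch): a outranks b when the coalition of
    criteria for which a is at least as good as b is heavy enough and no
    criterion shows b better than a by more than its veto margin."""
    total = sum(weights)
    concordance = sum(w for x, y, w in zip(g_a, g_b, weights) if x >= y) / total
    vetoed = any(y - x > v for x, y, v in zip(g_a, g_b, veto_margins))
    return concordance >= conc_threshold and not vetoed

# Hypothetical data: 5 criteria, weights proportional to (10, 8, 6, 6, 6).
w = [10, 8, 6, 6, 6]
a, b = [3, 2, 5, 1, 4], [2, 2, 4, 3, 1]
ok = outranks(a, b, w, 0.6, [10, 10, 10, 10, 10])      # no veto triggered
blocked = outranks(a, b, w, 0.6, [10, 10, 10, 1, 10])  # criterion 4 vetoes
```

The second call shows the "pinpointing" effect: the concordant coalition is unchanged, but a single large difference in disfavour of a blocks the outranking.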
The level from which a coalition is judged strong enough is determined by the so-called concordance threshold: in the Condorcet voting method with the simple majority rule, this threshold is just half the number of criteria, and in general one will choose a number above half the sum of the weights of all the criteria. One also checks whether there is any criterion for which b is so much better than a that it would make it meaningless for a to be declared preferred overall to b; if this happens for at least one criterion, one says that there is a veto to the preference of a over b. If the concordance index passes the concordance threshold and there is no veto of b against a, then a outranks b. This process yields a binary relation on the set of alternatives, which may have cycles and be incomplete (neither a outranks b nor the opposite); note also that the outranking relation is not asymmetric in general: it may happen that a outranks b and that b outranks a.
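For an acyclic relation, the unique kernel described earlier can be computed by the simple fixed-point sketch below (a generic illustration, not the book's own algorithm): alternatives that nobody outranks enter the kernel, whatever they outrank is excluded, and the process is repeated on what remains.

```python
def kernel(nodes, arcs):
    """Kernel of an acyclic outranking digraph; arcs contains (a, b)
    whenever a outranks b (self-loops are ignored)."""
    arcs = {(a, b) for (a, b) in arcs if a != b}
    K, undecided = set(), set(nodes)
    while undecided:
        # alternatives not outranked by any still-undecided alternative
        free = {n for n in undecided
                if not any(b == n and a in undecided for (a, b) in arcs)}
        K |= free  # free is non-empty because the graph is acyclic
        beaten = {b for (a, b) in arcs if a in free}
        undecided -= free | beaten
    return K

# Tiny acyclic example: a outranks b and d, b outranks c.
kern = kernel({"a", "b", "c", "d"}, {("a", "b"), ("b", "c"), ("a", "d")})
```

In the example, a is never outranked, so it enters the kernel and excludes b and d; c then enters because only the excluded b outranks it. Note that c is beaten by b, a non-kernel alternative, which illustrates why kernel membership alone does not certify a "good" alternative.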

In order to apply the method to Thierry's case, we successively have to determine:

• weights for the criteria;
• a concordance threshold;
• ordered pairs of evaluations that lead to a veto (and this for every criterion).

Evaluating coalitions of criteria

The concordance index c(a, b), which measures the strength of the coalition of criteria along which a is at least as good as b, may be computed by the formula

(6.12)    c(a, b) = Σ_{i : gi(a) ≥ gi(b)} pi

where the pi's are normalised weights that reflect the relative importance of the criteria, and gi(a) denotes, as usual, the evaluation of alternative a on criterion i (which is assumed to be maximised; if it were to be minimised, the weight pi would be added when the converse inequality holds, i.e. gi(a) ≤ gi(b)). As often as the evaluation of a passes or equals that of b on a criterion, its weight enters (additively) into the weight of the coalition in favour of a; a criterion can count both for a against b and the opposite if and only if gi(a) = gi(b).

It is important to bear in mind how the weights will be used: in this case, to measure the strength of coalitions in pairwise comparisons and to decide on the preference only on the basis of the coalitions. The weights are not trade-offs; they are completely independent of the scales of the criteria. This does not mean, however, that they are independent of the method, nor that one could use values given spontaneously by the decision-maker, or obtained through questioning in terms of "importance", without care.
A practical consequence is that one may question the decision-maker in terms of relative importance of the criteria without reference to the scales on which the evaluations for the various viewpoints are expressed, i.e. without reference to the evaluations, as is done in Saaty's procedure. To be more specific, and to contrast the meaning of these weights with those used in weighted sums, let us first consider the weights suggested by Thierry in Section 6.2.2. Note that these were not obtained through questioning on the relative importance of the criteria, but in the context of the weighted sum, with Thierry bearing re-scaled evaluations in mind: the evaluations on each criterion had been divided by the maximal value gi,max attained for that criterion. Dividing the weights by their sum (= 5) yields the corresponding normalised weights. In the context of outranking, using these weights would lead to an overwhelming predominance of criteria 2 (Acceleration) and 3 (Pick-up), which are moreover linked since they are facets of the cars' performance: with such weights and a concordance threshold of at least .5, it is impossible for a car to be outranked when it is better on criteria 2 and 3, even if all the other criteria are in favour of an opponent. It was never Thierry's intention that, once a car is better on criteria 2 and 3, there would be no need for looking at the other criteria; the whole initial analysis shows, on the contrary, that a fast and powerful car is useless if it is bad on the braking or road-holding criterion. Such a feature of the preference structure could indeed be reflected through the use of vetoes.

Vetoes, however, act only in a negative manner: by removing the outranking of a safe car by a powerful one, not by allowing a safe car to outrank a powerful one. Using Thierry's weights for measuring the strength of coalitions thus does not seem appropriate, since criteria 1 and 2's predominance is too strong (joint weight = .60). If we nevertheless try to use weights linked to the weighted sum, we might consider the weights used with the normalised criteria. Since the weights in a weighted sum depend on the scaling of each criterion, and there is no acknowledged standard scaling, several variants are conceivable. There is another reasonable normalisation of the criteria that does not fix the zero of the scale but rather maps the smallest attained value gi,min onto 0 and the largest value gi,max onto 1; transforming the weights accordingly (i.e. multiplying them by the inverse of the range of the values for the corresponding criterion prior to the transformation), one would obtain a weight vector for which the "safety coalition" (Criteria 4 and 5, weight = .45) becomes more important than the "performance coalition" (Criteria 2 and 3). Now look at the weights obtained through Saaty's questioning procedure in terms of "importance" (see Section 6.2.2): with these values as coefficients of importance, the importance of the "safety coalition" (Criteria 4 and 5) would be negligible (weight = .20), while that of the "performance coalition" (Criteria 2 and 3) would be overwhelming (weight = .59), which Thierry may consider unfair. As an additional conclusion, one may note that the values of the weights vary tremendously depending on the type of normalisation applied. Note that the above weights may nevertheless be appropriate for a weighted sum, because in such a method the weights are multiplied by the evaluations (or re-coded evaluations), while in outranking methods the weights are neither divided by scale ranges nor multiplied by evaluations.
To make this clearer, consider the following reformulation of the condition under which a is preferred to b in the weighted sum model (a similar formulation is straightforward in the additive value model):

(6.13)    a is preferred to b  iff  Σ_{i=1}^{n} ki × (gi(a) − gi(b)) ≥ 0.

If a is slightly better than b on a point of view i, the influence of this fact in the comparison between a and b is reflected by the term ki × (gi(a) − gi(b)), which is presumably small: in a weighted sum, important criteria count for little in pairwise comparisons when the differences between the evaluations of the alternatives are small enough. On the contrary, in outranking methods, when a is better than b on some criterion, the full weight of the criterion counts in favour of a, whether a is slightly or by far better than b. Due to this all-or-nothing character of the weights in ELECTRE I, it makes no sense in principle to use the weights initially provided by Thierry as coefficients measuring the importance of the criteria in an outranking method, and one is inclined to choose less contrasted weights than those examined above. Although procedures have been proposed to elicit such weights (see Mousseau (1993), Roy and Bouyssou (1993)), we will just choose a set of weights in an intuitive manner: let us take weights proportional to (10, 8, 6, 6, 6) as reflecting the relative importance of the criteria. At least the ordering of these values seems to be in agreement with what is known about Thierry's perceptions.
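The contrast expressed by reformulation (6.13) can be seen on a tiny two-criterion example (hypothetical numbers): a small advantage on an "important" criterion contributes almost nothing to the weighted sum.

```python
def weighted_sum_prefers(g_a, g_b, k):
    """Condition (6.13): a is preferred to b in the weighted sum model
    exactly when sum_i k_i * (g_i(a) - g_i(b)) >= 0."""
    return sum(ki * (x - y) for ki, x, y in zip(k, g_a, g_b)) >= 0

# Criterion 1 is 'important' (k = 3) but the difference on it is tiny,
# so the modest difference on criterion 2 (k = 1) decides the comparison.
a, b, k = [1.01, 0.0], [1.00, 0.5], [3, 1]
a_preferred = weighted_sum_prefers(a, b, k)  # 3*0.01 + 1*(-0.5) < 0
```

In an outranking method, by contrast, the full weight 3 of criterion 1 would count in favour of a here, however tiny the difference 0.01 may be.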

Normalising the weight vector yields (.27, .22, .17, .17, .17), after rounding in such a way that the normalised weights sum up to 1. The weights of the three groups of criteria are then rather balanced: .27 for cost, .39 for performance and .34 for safety. The concordance matrix c(a, b) computed with these weights is shown in Table 6.14.

[Table 6.14: Concordance index (rounded to two decimals) for the "Choosing a car" problem]

This tells us something about coalitions that we did not know when all the weights were equal. The "lightest" coalition of three criteria involves criteria 3, 4 and 5 and weighs .51. Then, in increasing order, we have three different coalitions weighing .56 (two of the criteria 3, 4, 5 together with criterion 2), three coalitions weighing .61 (two of the criteria 3, 4, 5 together with criterion 1), and three coalitions weighing .66 (one of the three criteria 3, 4, 5 together with criteria 1 and 2). The "lightest" 4-coalition weighs .73, and there is only one value of the concordance index between .61 and .73, namely .66. Cutting the concordance index at .72, for instance, only keeps the coalitions involving at least 4 criteria and therefore yields the relation in Table 6.13, which we have already looked at.
Determining which coalitions are "strong enough"

At this stage we have to build the concordance relation, i.e. a binary relation obtained through deciding which coalitions in Table 6.14 are strong enough; this is done by selecting a concordance threshold above which we consider that they are. Previous analysis with equal weights (see Section 6.4.1) showed that the relation in Table 6.12, obtained through looking at concordant coalitions involving at least three criteria, had a cycle passing through all alternatives. With the weights we have now chosen, cutting the concordance index at .60 only keeps the 3-coalitions that contain criterion 1, together with the coalitions involving at least 4 criteria; we then obtain a concordance relation with a cycle passing through all alternatives but one, which is Car 3. When we cut above .61, for instance if we set the concordance threshold at .62, there is no longer a cycle; a poorer relation (i.e. with fewer arcs) is obtained when cutting above .66.
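The coalition weights listed above can be enumerated mechanically; the sketch below recomputes the distinct coalition weights for the normalised vector (.27, .22, .17, .17, .17) and confirms which cut levels actually discriminate.

```python
from itertools import combinations

weights = {1: 0.27, 2: 0.22, 3: 0.17, 4: 0.17, 5: 0.17}

coalition_weights = sorted(
    {round(sum(weights[i] for i in coalition), 2)
     for size in range(1, 6)
     for coalition in combinations(weights, size)})
# e.g. {3,4,5} -> 0.51, {1,3,4} -> 0.61, {1,2,3} -> 0.66; the lightest
# 4-coalition {2,3,4,5} weighs 0.73, and no coalition weighs strictly
# between 0.66 and 0.73, so every cut in that interval gives the same relation.
```

Only these values matter when choosing a threshold: two thresholds falling between the same consecutive coalition weights yield identical concordance relations.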

Note that multiplying all the weights by a positive number would yield the same concordance relations, provided the concordance threshold is multiplied by the same factor. In the above presentation the weights sum up to 1; up to a positive scaling factor, the weights in ELECTRE I may thus be considered as being assessed on a ratio scale. Obviously, as the concordance threshold is raised, concordance relations tend to become increasingly poor, i.e. less and less discriminating. In the sequel, we will concentrate on two values of the concordance threshold, .60 and .65, that are on both sides of the borderline separating concordance relations with and without cycles.

Supporting choice or ranking

Before studying discordance and veto, we show how a concordance relation, which is just an outranking relation without veto, can be used for supporting a choice or a ranking in a decision process. Introducing vetoes will just remove arcs from the concordance relation; the operations performed on the outranking relation during the exploitation phase are exactly those that are applied below to the concordance relation.

In view of supporting a choice process, the exploitation procedure of ELECTRE I firstly consists in reducing the cycles, which amounts to considering all alternatives in a cycle as equivalent: they are substituted by a unique representative node. The kernel of the resulting acyclic relation is then searched for, and it is suggested that the kernel contains all the alternatives on which the attention of the decision-maker should be focused. Cutting the concordance relation of Table 6.14 at .60 yields a concordance relation with cycles involving all alternatives but Car 3; there is no simple cycle passing once through all alternatives except Car 3, but one can find non-simple cycles, i.e. cycles passing more than once through some alternatives. Reducing the cycles of this concordance relation results in considering two classes of equivalent alternatives: one class is composed of the single Car 3, while the other class comprises all the other alternatives. Beside the fact that this partition is not very discriminating, it also considers as equivalent alternatives that are not in the same simple cycle. Moreover, the information on how the alternatives compare with respect to all the others is completely lost: for instance Car 12, which beats almost all other alternatives in the cut at .60, would be considered as equivalent to Car 6, which beats almost no other car. Obviously, reducing the cycles involves some drawbacks.
We therefore consider the cut at level .65, which yields the largest acyclic concordance relation that can be obtained; this relation is shown in Table 6.15. Cars 3 and 11 are not outranked, and car 10 is the only alternative that is not outranked either by car 3 or by car 11: the kernel is composed of cars 3, 10 and 11. This seems to be an interesting set in a choice process, in view of the analysis of the problem carried out so far.
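The cycle-reduction step (collapsing each set of mutually reachable alternatives into one representative class) can be sketched as follows; this is a generic reachability-based version, not the procedure of any specific software.

```python
def reduce_cycles(nodes, arcs):
    """Collapse each strongly connected component (set of mutually
    reachable alternatives) into a single class; returns the classes and
    the arcs between distinct classes (an acyclic relation)."""
    reach = {n: {n} for n in nodes}
    changed = True
    while changed:  # fixed point: reach[a] ends up holding everything reachable from a
        changed = False
        for (a, b) in arcs:
            new = reach[b] - reach[a]
            if new:
                reach[a] |= new
                changed = True
    component = {n: frozenset(m for m in nodes
                              if m in reach[n] and n in reach[m])
                 for n in nodes}
    classes = set(component.values())
    class_arcs = {(component[a], component[b])
                  for (a, b) in arcs if component[a] != component[b]}
    return classes, class_arcs

# Cars 1 and 2 form a cycle; collapsing it leaves an acyclic 3-class relation.
classes, class_arcs = reduce_cycles({1, 2, 3, 4},
                                    {(1, 2), (2, 1), (2, 3), (3, 4)})
```

The merged class {1, 2} illustrates the drawback discussed above: once merged, the two cars are treated as equivalent even if one of them beat many more alternatives than the other.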

Rankings of the alternatives may also be obtained from Table 6.15 in a rather simple manner: consider the alternatives either in decreasing order of the number of alternatives they beat in the concordance relation, or in increasing order of the number of alternatives by which they are beaten. This amounts to counting the 1's respectively in the rows and in the columns of Table 6.15, and ranking the alternatives accordingly (we do not count the 1's on the diagonal, since the coalition of criteria saying that an alternative is at least as good as itself always encompasses all the criteria). The corresponding rankings are respectively labelled "A" and "B" in Table 6.16.

Cars   1  2  3  4  5  6  7  8  9 10 11 12 13 14
  1    1  0  0  0  0  0  0  0  0  0  0  0  0  0
  2    0  1  0  1  0  0  0  0  0  0  0  0  1  0
  3    1  1  1  1  1  1  1  1  1  0  0  0  1  1
  4    1  0  0  1  0  0  0  0  0  0  0  0  1  0
  5    1  1  0  1  1  0  0  0  0  1  0  0  1  0
  6    0  0  0  0  0  1  0  0  0  0  0  0  0  0
  7    0  0  0  1  1  1  1  0  1  0  0  0  1  1
  8    1  0  0  1  1  1  0  1  1  0  0  0  1  1
  9    0  0  0  1  1  0  0  0  1  0  0  0  1  0
 10    1  1  0  0  0  0  0  0  0  1  0  0  1  0
 11    1  1  0  1  1  1  1  1  1  0  1  1  1  1
 12    1  1  0  1  1  1  1  1  0  1  0  1  1  1
 13    1  0  0  0  0  0  0  0  0  0  0  0  1  0
 14    0  0  0  0  0  0  1  0  0  0  0  0  1  1

Table 6.15: Concordance relation for the "Choosing a car" problem with weights (.27, .22, .17, .17, .17) and concordance threshold .65

Class   A            B
1       11 (11)      3, 11 (0)
2       3, 12 (10)   12 (1)
3       8 (7)        10 (2)
4       7 (6)        7, 8 (3)
5       5 (5)        9 (4)
6       9, 10 (3)    2, 6, 14 (5)
7       2, 4 (2)     5 (6)
8       13, 14 (1)   1, 4 (8)
9       1, 6 (0)     13 (11)

Table 6.16: Rankings obtained from counting how many alternatives are beaten by (ranking "A") or beat (ranking "B") each alternative in the concordance relation (threshold .65); the numbers between parentheses are the numbers of beaten (resp. beating) alternatives
We observe that the usual group of “good” alternatives form the top two classes of these rankings. 14 (1) 1. 14 (5) 7 2. 6.65 cut corresponds to strong preference. 4 (8) 9 1. one linked with a weak preference threshold. for instance in our case. .
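The counting procedure used to build rankings "A" and "B" can be sketched in a few lines. The following code is our illustration, not the book's; the 4-alternative 0/1 matrix is hypothetical.

```python
# Hypothetical 0/1 concordance relation S on four alternatives:
# S[i][j] = 1 means "i is at least as good as j" at the chosen cut level.
S = [
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
]
n = len(S)

# Ranking "A": count the 1's in each row (alternatives beaten),
# ignoring the diagonal, as the text prescribes.
beats = [sum(S[i]) - 1 for i in range(n)]

# Ranking "B": count the 1's in each column (alternatives beating).
beaten_by = [sum(S[i][j] for i in range(n)) - 1 for j in range(n)]

# Order alternatives: many beaten is good for "A"; few beating is good for "B".
ranking_A = sorted(range(n), key=lambda i: -beats[i])
ranking_B = sorted(range(n), key=lambda i: beaten_by[i])
print(beats, beaten_by)      # [2, 1, 0, 3] [1, 2, 3, 0]
print(ranking_A, ranking_B)  # [3, 0, 1, 2] [3, 0, 1, 2]
```

On this toy matrix the two rankings happen to coincide; as the text notes, nothing guarantees this in general.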

Thresholding

To this point, we treated the assessments of the alternatives as if they were ordinal data: both in the Condorcet-like method and the basic ELECTRE I method (without veto), the size of the interval between the evaluations is not taken into account when deciding that a is overall preferred to b. We could have obtained exactly the same results (kernel or ranking) by working with the orders induced on the set of alternatives by their evaluations on the various criteria. Does this mean that outranking methods are purely ordinal? Not exactly! More sophisticated outranking methods exploit information that is richer than purely ordinal but not as demanding as cardinal. This is done through what we shall call "thresholding".

In view of imprecision in the assessments, and since it is not clear for all criteria that there is a marked preference when the difference |gi(a) − gi(b)| is small, one may be led to consider a non-null threshold to model preference. For instance, it is not likely that Thierry would really mark a preference between cars 3 and 10 on the Cost criterion, since their estimated costs are within 10 € (see Table 6.2). Hence one should be prudent when deciding that a criterion is or is not an argument for saying that a is at least as good as b. Thresholding amounts to identifying intervals on the criteria scales which represent the minimal difference in evaluation above which a particular property holds. Suppose the evaluation gi(b) is given and criterion i is to be maximised; one could ask the decision-maker to tell, ideally for each evaluation gi(b) of each alternative on each criterion, from which value gi(b) + ti(gi(b)) onwards an alternative a will be said to be at least as good as b. In other words, it is reasonable to determine a threshold function ti and say that criterion i is such an argument as soon as gi(a) ≥ gi(b) + ti(gi(b)). Since we examine reasons for saying that a is at least as good as b, not for saying that a is (strictly) better than b, the function ti should be negatively valued; in our case, we have considered that ti(gi(b)) = 0.

Determining such a threshold function is not necessarily an easy task. Things may become simpler if the threshold may be considered constant or proportional to gi(a) (e.g. ti(gi(a)) = .05 × gi(a)). Note that constant thresholds could be used when a scale is "linear", in the sense that equal differences throughout the scale have the same meaning and consequences (see the end of Section 6.2); this is however not a necessary condition, since some differences, but not all, need to be equivalent throughout the scale.

Thresholding is all the more important that, in the above method, the information contained in cutting levels other than the one retained has been totally ignored, although the rankings obtained from them may not be identical. They may even differ significantly, as can be seen when deriving a ranking from the .60 cut by using the method we applied to the .65 cut. In any case, Definition 6.12 of the concordance index is adapted in a straightforward manner (formula (6.14) below) and the method for building an outranking relation remains unchanged.
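A minimal sketch (ours, not the book's) of such a threshold test, with hypothetical evaluations:

```python
# Sketch: criterion i is an argument for "a at least as good as b" when
# g_i(a) >= g_i(b) + t_i(g_i(b)), for a criterion to be maximised.
def supports(g_a, g_b, t=lambda x: 0.0):
    return g_a >= g_b + t(g_b)

# With a null threshold, any tie or advantage counts as support.
print(supports(100.0, 100.0))           # True

# A positive proportional threshold, e.g. t(x) = .05 * x, demands a marked
# advantage before granting support ...
strict = lambda x: 0.05 * x
print(supports(102.0, 100.0, strict))   # False (102 < 105)
print(supports(106.0, 100.0, strict))   # True

# ... while a negative threshold tolerates small disadvantages, in line with
# reading the relation as "at least as good as" rather than "better than".
tolerant = lambda x: -0.05 * x
print(supports(97.0, 100.0, tolerant))  # True (97 >= 95)
```

The same function can encode a constant threshold by ignoring its argument.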

(6.14)    c(a, b) = Σ pi over the criteria i such that gi(a) ≥ gi(b) + ti(gi(b))

Discordance and vetoes

Remember that the principle of the outranking methods consists in examining the validity of the proposition "a outranks b". The concordance index "measures" the arguments in favour of saying so, but there may be arguments strongly against that assertion (discordant criteria). These discordant voices can be viewed as vetoes, just like in the voting context: there is a veto against declaring that a outranks b if b is so much better than a on some criterion that it becomes disputable, or even meaningless, to pretend that a might be better overall than b. To be more precise, a veto threshold on criterion i is in general a function vi encoding a difference in evaluations so big that it would be out of the question to say that a outranks b if

(6.15)    gi(a) > gi(b) + vi(gi(b)) when criterion i is to be minimised, or
(6.16)    gi(a) < gi(b) − vi(gi(b)) when criterion i is to be maximised.

Of course, it may be the case that the function vi is a constant. Let us emphasise that the effect of a veto is quite radical: if a veto threshold is passed on a criterion when comparing two alternatives, then the alternative against which there is a veto, say a, may not outrank the other one, say b. This may result in incomparabilities in the outranking relation if, in addition, b does not outrank a, either because the coalition of criteria stating that b is at least as good as a is not strong enough or because there is also a veto of a against b on another criterion. Note that preference thresholds, that lead to indifference zones, are used together with veto thresholds in a variant of the ELECTRE I method called ELECTRE IS (see Roy and Skalka (1984) or Roy and Bouyssou (1993)). Thresholding is a key tool in the original outranking methods; in particular, it allows one to bypass the necessity of transforming the original evaluations to obtain linear scales.

In our case, in view of Thierry's particular interest in sporty cars, the criterion most likely to yield a veto is acceleration. Although there was no precise indication on setting vetoes in Thierry's preliminary analysis, one might speculate that, on the acceleration criterion, pairs of evaluations such as (28, 29.6) (all evaluations expressed in seconds), and all intervals wider than those listed, lead to a veto (against claiming that the alternative with the higher evaluation could be preferred to the other one since, here, the criterion is to be minimised). If this would seem reasonable, then we would not be far from accepting a constant veto threshold of about 1.5 second.

If we decide that there is a veto with a constant threshold on the acceleration criterion for differences exceeding 1.5 second, it means that a car that accelerates from 0 to 100 km/h in 29.6 seconds (as is the case of Peugeot 309 GTI) could not conceivably outrank a car which does it in 28 (as Honda Civic does), whatever the evaluations on the other criteria might be. Using 1.5 as a veto threshold thus implies that differences of at least 1.5 second, from 28 to 29.5 or from 28.9 to 30.4, have the same consequences in terms of preference. Setting the value of the veto threshold obviously involves some degree of arbitrariness: setting it to 1.5 implies that a car needing 30.4 seconds (like Mazda 323) may not outrank a car that accelerates in 28.9 (like Opel Astra or Renault 21) but might very well outrank a car that accelerates in 29 (like Nissan Sunny) if the performances on the other criteria are superior. Why not set the threshold at 1.4, which would imply that Mazda 323 may not outrank Nissan Sunny? In such cases, detailed investigation is needed in order to decide which setting of the parameter's value is most appropriate. A related facet of using thresholds is that growing differences that are initially not significant brutally crystallise into significant ones as soon as a crisp threshold is passed. Of course, the underlying logic is quite similar to that on which statistical tests are based: conventional levels of significance (like the famous 5% rejection intervals) are widely used to decide whether a hypothesis must be rejected or not. Here as well, methods using thresholds may obviously show discontinuities in their consequences, and that is why sensitivity analysis is even more crucial here than with more classical methods: it must be verified whether small variations around the chosen value of a parameter (such as a veto threshold) do not influence the conclusions in a dramatic manner; if small variations do have a strong influence, the conclusions must be handled with particular care. In order not to be too long, we do not develop the consequences of introducing veto thresholds in our example. It suffices to say that the outranking relation, its kernel and the derived rankings are not dramatically modified in the present case.

6.4.4 Main features and problems of elementary outranking approaches

The ideas behind the methods analysed above may be summarised as follows. For each pair of alternatives (a, b), it is determined whether a outranks b by comparing their evaluations gi(a) and gi(b) on each point of view i. The pairs of evaluations are compared to intervals that can be viewed as typical of classes of ordered pairs of evaluations on each criterion (for instance the classes "indifference", "preference" and "veto"). On the basis of the list of classes to which it belongs for each criterion (its "profile"), the pair (a, b) is declared to be or not to be in the outranking relation.
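The crisp classes just described, and the sensitivity worry that comes with them, can be sketched as follows. The code is ours; the two acceleration figures are those quoted in the text, while the indifference threshold and the class labels are invented for the illustration.

```python
# Sketch (ours): on a minimised criterion, classify the ordered pair (a, b)
# by the difference g(a) - g(b), using crisp thresholds: an indifference
# threshold q and a veto threshold v (values invented for the illustration).
def pair_class(g_a, g_b, q=0.5, v=1.5):
    d = g_a - g_b
    if d <= q:
        return "indifference-or-better"
    if d <= v:
        return "preference for b"
    return "veto against a outranking b"

# Accelerations (seconds) quoted in the text:
honda, peugeot_gti = 28.0, 29.6
print(pair_class(peugeot_gti, honda))          # veto (difference 1.6 > 1.5)

# Discontinuity: a small change of the veto threshold flips the class,
# which is why sensitivity analysis on such parameters matters.
print(pair_class(peugeot_gti, honda, v=1.7))   # preference for b
```

Running the same comparison over a grid of threshold values is a cheap form of the sensitivity analysis advocated in the text.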

. . The various procedures that have been proposed for exploiting the outranking relation (for instance transforming it into a complete ranking) are not above criticism.4. one can hardly expect to get reliable answers when questioning him about the properties of this relation. b) is one of those associated with a particular value of credibility of outranking. Since the decisionmaker has no direct intuition of this object. • thresholds may be used to determine the classes in diﬀerences for preference on each criterion. there are of course rationality requirements for the sets of proﬁles associated with the various values of the credibility index. is then exploited in view of a speciﬁc type of decision problems (choice. • the rules for determining whether a outranks b (eventually to some degree of a credibility index) generally involve weights that describe the relative importance of the criteria. then the outranking of b by a is assigned this value of credibility index. it is especially diﬃcult to justify them rigorously since they operate on an object that has been constructed. the conclusion of the theorem is not necessarily valid and one may hope that there is no criterion playing the role of dictator. this property was satisﬁed in the construction of the outranking relation since outranking is decided by looking in turn at the proﬁles of each pair of alternatives. the smaller or larger diﬀerence in evaluations between alternatives does not matter once a certain threshold is passed. provided diﬀerences gi (a) − gi (b) equal to such thresholds have the same meaning independently of their location on the scale of criterion i (linearity property).e. this credibility index is to be interpreted in logical terms. Since this is an hypothesis of Arrow’s theorem and it is violated. 
it models the degree to which it is true that there are enough arguments in favour of saying that a is better than b while there is no strong reason of refuting this statement (see the deﬁnition of outranking in Section 6. It is supposed to include all the relevant and sure information about preference that could be extracted from the data and the questions answered by the decision-maker. the property of independence of irrelevant alternatives (see Chapter 2 where this property is evoked) is lost. the outranking relation (possibly qualiﬁed with a degree of a credibility index).2). which was discussed in the second . a direct characterisation of the ranking produced by the exploitation of an outranking relation seems out of reach. In the process of deriving a complete ranking from the outranking relation. the outranking relation. The result of the construction. On the other hand. Non-compensation The weights count entirely or not at all in the comparison of two alternatives. Due to their lack of transitivity and acyclicity. ). i. independently of the rest. ranking. . these weights are typically used additively to measure the importance of coalitions of criteria independently of the evaluations of the alternatives. procedures are needed to derive a ranking or a choice set from the outranking relation. This fact.140 CHAPTER 6. COMPARING ON SEVERAL ATTRIBUTES if the proﬁle of the pair (a.

for evaluations of the cost of human losses in various countries)? Other people support the idea that incomparability results from insuﬃcient information. In such a case a and b are said to be incomparable. there are several grades of outranking (weak. say. impeding that outranking be declared. incomparability is more concerned with very contrasted alternatives. for instance. strong in ELECTRE II. Vetoes only have a “negative” action. This may be interpreted in two diﬀerent ways. OUTRANKING METHODS 141 paragraph of this section 6. while incomparable alternatives may be ranked in classes quite far apart. Bouyssou and Vansnick (1986). .3.4.4.3.6. . • rules are invoked to decide which combinations of these classes lead to outranking. The treatment of the two categories is quite diﬀerent in the exploitation phase. Indiﬀerence occurs when alternatives are considered as almost equivalent. . that comparing a Rolls Royce with a small and cheap car proves impossible because the Rolls Royce is incomparably better on many criteria but is also incomparably more expensive. Incomparability and indiﬀerence For some pairs (a. . more generally. outranking is determined on the basis of the proﬁles of performance of the pair only. Bouyssou (1986). It has been argued. ) and rules associate speciﬁc combinations of classes to each grade. incomparability should not be assimilated to indiﬀerence. is sometimes called the non-compensation property of outranking methods. • the diﬀerences between the evaluations of a pair of alternatives for each criterion are categorised in discrete classes delimited by thresholds (preference. . One may advance that some alternatives are too contrasted to be compared. the available information sometimes does not allow to make up one’s mind on whether a is preferred to b or the converse. A large diﬀerence in favour. The reader interested in the non-compensation property is referred to Fishburn (1976). 6. ). 
indiﬀerent alternatives should appear in the same class of a ranking or in neighbouring one. should one prefer a more expensive project with a lower risk or a less expensive one with higher risk (see Chapter 5. . this can occur not only because of the activation of a veto but alternatively because the credibility of both the outranking of a by b and of b by a are not suﬃciently high. veto.4.3. b) it may be the case that neither a outranks b nor the opposite. Section 5. Another example concerns the comparison of projects that involve the risk of loss of human life. . In any case. of a over b on some criterion is of no use to compensate for small diﬀerences in favour of b on many criteria since all that counts for deciding that a outranks b is the list of criteria in favour of a.5 Advanced outranking methods: from thresholding towards valued relations Looking at the variants of the ELECTRE method suggests that there is a general pattern on which they are all built: • alternatives are considered in pairs and eventually.

• specialised procedures are used to exploit the various grades of outranking in view of supporting the decision process.

Defining the classes through thresholding raises the problem of discontinuity alluded to in the previous section. It is thus appealing to work with continuous classes of differences of preference for each criterion, i.e. directly with valued relations: a value cj(a, b) on arc (a, b) models the degree to which alternative a is preferred to alternative b on criterion j. Such degrees are often interpreted in logical fashion, as degrees of credibility of the preference, and the outranking relation is also valued in such a context: each combination of values of the credibility index on the various criteria may be assigned an overall value of the credibility index for outranking. Consider the following formula, which is used in ELECTRE III, a method leading to a valued outranking relation (see Roy and Bouyssou (1993) or Vincke (1992)), to compute the overall degree of credibility S(a, b) of the outranking of b by a:

S(a, b) = c(a, b)    if Dj(a, b) ≤ c(a, b) for all j;
S(a, b) = c(a, b) × ∏ (1 − Dj(a, b)) / (1 − c(a, b))    otherwise, the product being taken over the criteria j such that Dj(a, b) > c(a, b).

In the above formula, Dj(a, b) is a degree of credibility of discordance. We do not enter into the detail of how c(a, b) and the Dj(a, b) can be computed; just remember that they are valued between 0 and 1 and that, when discordance is maximal, there may not be any degree of outranking at all.

The justification of such a formula is mainly heuristic, in the sense that the response of the formula to the variation of some inputs is not counter-intuitive: when discordance raises, outranking decreases; the converse with concordance. This does not mean that the formula is fully justified. The weighted sum also has good heuristic properties at first glance, but deeper investigation shows that the values it yields cannot be trusted as a valid representation of the preferences unless additional information is requested from the decision-maker and used to re-code the original evaluations gj. Our analysis of the weighted sum in section 6.2 has taught us that operations that may appear as natural rely on strong assumptions that suppose very detailed information on the preferences.

Dealing with valued relations, and especially combining "values", raises a question: which operations may be meaningfully (or just reasonably) performed on them? The answer depends, in particular, on the manner in which the indices are elaborated. In the elementary outranking methods (ELECTRE I and II), much care was taken to avoid performing arithmetical operations on the evaluations gi(a): only cuts of the concordance index were considered (which is typically an operation valid for ordinal data), and vetoes were used in a very radical fashion. The formula above, on the contrary, involves operations such as multiplication and division that suppose that concordance and discordance indices are plainly cardinal numbers and not simply labels of ordered categories. This is indeed a strong assumption, that does not seem to us to be supported by the rest of the approach; other formulae might have been chosen with similarly good heuristic behaviour.
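The credibility formula quoted above can be written down directly; the following sketch is ours and the index values are illustrative.

```python
from math import prod

# Sketch of the ELECTRE III credibility formula: S(a,b) = c(a,b) when no
# discordance exceeds the concordance; otherwise c(a,b) is discounted by
# (1 - D_j)/(1 - c) for each criterion j whose discordance D_j exceeds c.
def credibility(c, D):
    strong = [dj for dj in D if dj > c]
    if not strong:
        return c
    return c * prod((1 - dj) / (1 - c) for dj in strong)

# Illustrative indices: concordance .70, one strong discordance .85.
print(credibility(0.70, [0.85, 0.10, 0.0]))   # 0.70 * (0.15/0.30) = 0.35
print(credibility(0.70, [0.10, 0.0]))         # no strong discordance: 0.70
```

Note how the discounting factor vanishes as some Dj approaches 1, so that maximal discordance drives the degree of outranking to zero, as stated in the text.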

No special attention, comparable to what was needed to build value functions from the evaluations, was paid to building concordance and discordance indices. Hence, nothing guarantees that these indices can be combined by means of arithmetic operations to produce an overall index S representative of a degree of credibility of an outranking. That is, if the information content of the c(a, b)'s and the 1 − Dj(a, b)'s just consists in the ordering of their values in the [0, 1] interval, then the former formula is not suitable. Consider for instance the following:

S(a, b) = min{c(a, b), min{1 − Dj(a, b), j = 1, . . . , n}}

Note that the latter formula does not involve arithmetic operations on c(a, b) and the 1 − Dj(a, b)'s but only ordinal operations, namely taking the minimum. This means that transforming c(a, b) and the 1 − Dj(a, b)'s by an increasing transformation of the [0, 1] interval would just amount to transforming the original value of S(a, b) by the same transformation. This is not the case with the former formula. To see the difference between the two formulae, consider the following two cases:

• the concordance index c(a, b) is equal to .40 and there is no discordance (i.e. Dj(a, b) = 0 for all j = 1, . . . , n);
• the concordant coalition weighs .80 but there is a strong discordance on criterion 1: D1(a, b) = .90 while Dj(a, b) = 0 for all j ≠ 1.

For both, the ELECTRE III formula yields a degree of outranking of .40. With the latter formula, the first case yields an outranking degree of .40 as well, but in the second case the degree falls to .10. It is likely that, in some circumstances, a decision-maker might find the latter model more appropriate.

The fact that the value obtained for the outranking degree may involve some degree of arbitrariness did not escape Roy and Bouyssou (1993), who explain (p. 417) that the value of the degree of outranking obtained by a formula like the above should be handled with care. In particular, they advocate that thresholds be used when comparing two such values: the outranking of b by a can be considered more credible than the outranking of d by c only if S(a, b) is significantly larger than S(c, d). We agree with this statement, but unfortunately it seems quite difficult to assign a value to a threshold above which the difference S(a, b) − S(c, d) could be claimed "significant". Obviously, another formula with similar heuristic behaviour might have resulted in quite different outputs.

There are thus two directions that can be followed for taking the objections to the formula of ELECTRE III into account. In the first option, one considers that the meaning of the concordance and discordance degrees is ordinal, and one tries to determine a family of aggregation formulae that fulfil basic requirements, including compatibility with the ordinal character of concordance and discordance; for a survey of possible ways of aggregating preferences into a valued relation, the reader is referred to chapters 2 and 3 of the book edited by Słowiński (1998). The other option consists in revising the way concordance and discordance indices are constructed in order to give them a quantitative meaning that allows arithmetic operations to be used for aggregating them.
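The contrast between the two formulae can be checked numerically on cases like those discussed above; the following comparison is our sketch.

```python
from math import prod

# ELECTRE III-style formula (multiplicative discounting of c by strong
# discordances) versus the min-based, purely ordinal alternative.
def electre3(c, D):
    strong = [dj for dj in D if dj > c]
    return c if not strong else c * prod((1 - dj) / (1 - c) for dj in strong)

def ordinal(c, D):
    # S(a,b) = min{c(a,b), min_j (1 - D_j(a,b))}
    return min([c] + [1 - dj for dj in D])

# Case 1: c = .40, no discordance.  Case 2: c = .80, D1 = .90.
case1 = (0.40, [0.0, 0.0])
case2 = (0.80, [0.90, 0.0])
print(electre3(*case1), electre3(*case2))   # 0.4 and 0.8*(0.1/0.2) = 0.4
print(ordinal(*case1), ordinal(*case2))     # 0.4 versus about 0.1
```

The multiplicative formula cannot tell the two situations apart, while the min-based one lets the strong discordance show through; it is also invariant under any increasing rescaling of the indices, which the multiplicative formula is not.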

This latter option is followed, for instance, in the PROMETHEE methods (see Brans and Vincke (1985) or Vincke (1992)): these methods may be interpreted as aiming towards building a value function on the pairs of alternatives; ideally, this function would represent the overall difference in preference between any two alternatives. The way that this function is constructed in practice, however, leaves the door open to remarks analogous to those addressed to the weighted sum in Section 6.2.

6.5 General conclusion

This long chapter has enabled us to travel through the continent of formal methods of decision analysis; by "formal" we mean those methods relying on an explicit mathematical model of the decision-maker's preferences. We neither looked into all methods, nor did we explore completely those we looked into. There are other continents that have been almost completely ignored, in particular all the methods that do not rely on a formal modelling of the preferences (see for instance the book edited by Rosenhead (1989), in which various approaches are presented for structuring problems in view of facilitating decision making).

On the particular topic of multi-attribute decision analysis, we may summarise our main conclusions as follows:

• Numbers do not always mean what they seem to. Numbers may have an ordinal meaning, in which case it cannot be recommended to perform arithmetic operations on them; they may be evaluations on an interval scale or a ratio scale, and there are appropriate transformations that are allowed for each type of scale. We have also suggested that the significance of a number may be intermediate between ordinal and cardinal; in that case, the interval separating two evaluations might be given an interpretation: one might take into consideration the fact that intervals are e.g. large, medium or small. Even if numeric evaluations actually mean what they seem to, their significance is not immediately in terms of preferences: the interval separating two evaluations must be reinterpreted in terms of difference in preferences. Evaluations may also be imprecise, and knowing that should influence the way they will be handled.

• Preference modelling is specifically the activity that deals with the meaning of the data in a decision context. It makes no sense to manipulate raw evaluations without taking the context into account. Preference modelling does not only take objective information linked with the evaluations or with the data, such as the type of scale or the degree of precision or the degree of certainty, into account; it also incorporates subjective information in relation to the preferences of the decision-maker.

• There are various types of models that can be used in a decision process; all have their strong points and their weak points. There is no best model.

• The (vague) notion of importance of the criteria and its implementation are strongly model-dependent: weights and trade-offs should not be elicited in the same manner depending on the type of model since, e.g., they may or may not depend on the scaling of the criteria.

• A direct consequence of the possibility of using different models is that the outputs may be discordant or even contradictory: in a given decision situation, cars may be ranked in different positions according to the method that is used. We have encountered such a situation several times in the above study. This does not puzzle us too much. First of all, because the observed differences appear more as variants than as contradictions: the various outputs are remarkably consistent and the variants can be explained to some extent. Second, the approaches use different concepts, and the questions the decision-maker has to answer are accordingly expressed in different languages; this of course induces variability. It is sufficient to recall that experiments have shown that there is much variability in the answers of subjects submitted to the same questions at time intervals. This is no wonder, since the information that decision analysis aims at capturing cannot usually be precisely measured.

Does this mean that all methods are acceptable? Not at all. There are several criteria of validity. One is that the method has to be accepted in a particular decision situation: this means that the questions asked to the decision-maker must make sense to him and that he should not be asked for information he is unable to provide in a reliable manner. There are also internal and external consistency criteria that a method should fulfil. Internal consistency implies making explicit the hypotheses under which data form an acceptable input for a method; the method should then perform operations on the input that are compatible with the supposed properties of the input; this in turn induces an output which enjoys particular properties. External consistency consists in checking whether the available information matches the requirements of acceptable inputs and whether the output may help in the decision process.

The choice of a particular approach (including a type of model) should be the result of an evaluation, in a given decision situation, of the chances of being able to elicit the parameters of the corresponding model in a reliable manner; these "chances" obviously depend on several factors, including the type and precision of the available data, the way of thinking of the decision-maker and his knowledge of the problem. Another factor that should be considered for choosing a model is the type of information that is wanted as output: the decision-maker needs different information when he has to rank alternatives, when he has to choose among alternatives, or when he has to assign them to predefined (ordered) categories (we put the latter problem aside in our discussion of the car choosing case). So, in our view, the ideal decision analyst should master several methodologies for building a model. The main goal of the above study was to illustrate the issue of internal and external validity on a few methods in a specific simple problem.

Notice that additional dimensions make the choice and the construction of a model in group decision making even more difficult, involving conflicts and negotiation aspects; the dynamics of such decision processes is by far more complex. Constructing complete formal models in such contexts is not always possible, but it remains that using problem structuring tools (such as cognitive maps) may prove profitable.


CHAPTER 6. COMPARING ON SEVERAL ATTRIBUTES

Besides the above points that are specific to multiple criteria preference models, more general lessons can also be drawn.

• If we consider our trip from the weighted sum to the additive multi-attribute value model in retrospect, we see that much self-confidence, and therefrom much convincing power, can be gained by eliciting conditions under which an approach such as the weighted sum would be legitimate. The analysis is worth the effort because precise concepts (like trade-offs and values) are sculptured through an analysis that also results in methods for eliciting the parameters of the model. Another advantage of theory is to provide us with limits, i.e. conditions under which a model is valid and a method is applicable. From this viewpoint, and although the outranking methods have not been fully characterised, it is worth noticing that their study has recently made theoretical progress (see e.g. Arrow and Raynaud (1986), Bouyssou and Perny (1992), Vincke (1992), Fodor and Roubens (1994), Tsoukiàs and Vincke (1995), Bouyssou (1996), Marchant (1996), Bouyssou and Pirlot (1997), Pirlot (1997)).

• An advantage of formal models that cannot be overemphasised is that they favour communication. In the course of the decision process, the construction of the model requires that pieces of information, knowledge and priorities that are usually implicit or hidden be brought into light and taken into account; also, the choice of the model reflects the type of available information (more or less certain, precise, quantitative). The result is often a synthesis of what is known and what has been learnt about the decision problem in the process of elaborating the model. The fact that a model is formal also allows for some sort of calculations; in particular, testing to what extent the conclusions are stable when the evaluations of imprecise data are varied is possible within formal models. Once a decision has been made, the model does not lose its utility.
It can provide grounds for arguing in favour of or against a decision, and it can be adapted to make ulterior decisions in similar contexts.

• The "decisiveness" of the output depends on the "richness" of the information available. If the knowledge is uncertain, imprecise or simply non-quantitative in nature, it may be difficult to build a very strong model; by "strong", we mean a model that clearly suggests a decision as, for instance, those that produce a ranking of the alternatives. Other models (and especially those based on pairwise comparisons of alternatives and verifying the independence of irrelevant alternatives property) are not able, structurally, to produce a ranking; they may nevertheless be the best possible synthesis of the relevant information in particular decision situations. In any case, even if the model leads to a ranking, the decision is to be taken by the decision-maker, and it is not in general an automatic consequence of the model (due, for instance, to imprecisions in the data that call for a relativisation of the model's prescription). As will be illustrated in greater detail in Chapter 9, the construction of a model is not all of the decision process.

7 DECIDING AUTOMATICALLY: THE EXAMPLE OF RULE-BASED CONTROL

7.1 Introduction

The increasing development of automatic systems in most sectors of human activity (e.g. manufacturing, management, medicine, etc.) has progressively led to involving computers in many tasks traditionally reserved for humans, even the most “strategic” ones such as control, evaluation and decision-making. The main function of automatic decision systems is to act as a substitute for humans (decision-makers, experts) in the execution of repetitive decision tasks. Such systems can be in charge of all or part of the decision process. The main tasks to be performed by automatic decision systems are collecting information (e.g. by sensors), making a diagnosis of the current situation, selecting relevant actions, and executing and controlling these actions. Automatisation of these tasks requires the elaboration of computational models able to simulate human reasoning. Such models are, in many respects, comparable to those involved in the scientific preparation of human decisions. Indeed, deciding automatically is also a matter of representation, evaluation and comparison. For this reason, we introduce and discuss some very simple techniques used to design rule-based decision/control systems. This is one more opportunity for us to address some important issues linked to the descriptive, normative and constructive aspects of mathematical modelling for decision support:

• descriptive aspects: the function of automatic decision systems is, to some extent, to be able to predict, simulate and extrapolate human reasoning and decision-making in an autonomous way. This requires different tasks such as the collection of human expertise, the representation of knowledge, the extraction of rules and the modelling of preferences. For all these activities, the choice of appropriate formal models, symbolic as well as numerical, is crucial in order to describe situations and process information.
• constructive aspects: in most fields of application, there is no completely fixed and well formalised body of knowledge that could be exploited by the analyst responsible for the implementation of a decision system. Valuable information can be obtained from human experts, but this expertise is often very complex and ill-structured, with a lot of “exceptions”. Hence, the formal model handling the core of human skill in decision-making must be constructed by the analyst, in close cooperation with the experts. They must decide together what type of input should be used, what type of output is needed, and what type of consideration should play a role in linking output to input. One must also decide how to link subjective symbolic information (close to the language of the expert) and objective numeric data that can be accessible to the system.

• normative aspects: it is generally not possible to ask the expert to produce an exhaustive list of situations together with their adequate solutions. Usually, this type of information is given only for a sample of typical situations, which implies that only a partial model can be constructed. To be fully efficient, this model must be completed with some general principles and rules used by the expert. In order to extrapolate examples as well as expert decision rules in a reasonable way, there is a need for normative principles putting constraints on inference, so as to decide what can seriously be inferred by the system from any new input. Hence, the analysis of the formal properties of our model is crucial for the validation of the system.

These three points show how the use of formal models and the analysis of the mathematical properties of these models are crucial in automatic decision-making. In this respect, the modelling exercise discussed here is comparable to those treated in the previous chapters concerning human decision-making, but it includes special features due to automatisation (stable pre-existing knowledge and preferences, real-time decision-making, a closed and completely autonomous system, etc.). We present a critical introduction to the use of simple formal tools such as fuzzy sets and rule-based systems to model human knowledge and decision rules. We also make explicit the multiple criteria aggregation problems arising in the implementation of these rules and discuss some important issues linked to rule aggregation. For the sake of illustration, we consider two types of automatic decision systems in this chapter:

• decision systems based on explicit decision rules: such systems are used in practical situations where the decision-maker or the expert is able to make explicit the principles and rules he uses to make a decision.
It is also assumed that these rules constitute a consistent body of knowledge, sufficiently exhaustive to reproduce, predict and explain human decisions. Such systems are illustrated in Section 7.2, where the control of an automatic watering system is discussed, and in Section 7.4, where a decision problem in the context of the automatic control of a food process is briefly presented. In the first case, the decision problem concerns the choice of an appropriate duration for watering, whereas in the second case, it concerns the determination of oven settings aimed at preserving the quality of biscuits.

• decision systems based on implicit decision rules: such systems are used in practical applications for which it is not possible to obtain explicit decision rules. This is very frequent in practice. The main possible reasons for it are the following:

– the decision-maker or the expert is unable to provide sufficiently clear information to construct decision rules, or his expertise is too complex to be simply representable by a consistent set of decision rules;

– the decision-maker or the expert is able to provide a set of decision rules, but these decision rules are not easily expressible using variables that can be observed by the system. A typical example of such a situation occurs in the domain of subjective evaluation (see Grabisch, Guely and Perny 1997), where the quality of a product is defined on the basis of human perception;

– the decision-maker or the expert does not want to reveal his own strategy for making decisions. This can be due to the existence of strategic or confidential information that cannot be revealed, or alternatively because this expertise represents his only competence, making him indispensable to his organisation.

Such systems are illustrated in Section 7.3, also in the context of the automatic control of food processes. We will use the problem of controlling biscuit quality during baking as an illustrative case where numerical decision models based on pattern matching procedures can be used to perform a diagnosis of dysfunction and a regulation of the oven, without any explicit rule.

7.2 A System with Explicit Decision Rules

Automatising human decision-making is often a difficult task because of the complexity of the information involved in human reasoning. In some cases, however, the decision-making process is repetitive and well known, so that automatisation becomes feasible. In this section, we consider an interesting subclass of “easy” problems where human decisions can be explained by a small set of decision rules of the type:

if X is A and Y is B then Z is C

where the variables X and Y are used to describe the current decision context (input variables) and Z is a variable representing the decision (output variable). Whenever X and Y can be automatically observed by the decision system (e.g. using sensors), human skill and experience in problem solving can be approximated and simulated using the fuzzy control approach (see e.g. Nguyen and Sugeno 1998). Such an approach is based on the use of fuzzy sets and multiple criteria aggregation functions. Our purpose is to emphasise the interest, as well as the difficulty, of resorting to such formal notions in real practical examples.

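Rules of this form can be given a direct computational reading. The following minimal sketch (our own illustration, not code from the book; all names are ours) represents such rules and selects those whose premises match the current symbolic description of the inputs:

```python
# Illustrative sketch (not from the book): rules "if X is A and Y is B then Z is C"
# and the selection of the rules whose premises match the current input labels.
from dataclasses import dataclass

@dataclass
class Rule:
    x_label: str   # premise label A for input variable X
    y_label: str   # premise label B for input variable Y
    z_label: str   # conclusion label C for the output variable Z

def active_rules(rules, x_labels, y_labels):
    # A rule fires when both of its premise labels describe the current input.
    return [r for r in rules if r.x_label in x_labels and r.y_label in y_labels]

rules = [Rule("Hot", "Low", "VeryLong"), Rule("Warm", "Low", "Long")]
print([r.z_label for r in active_rules(rules, {"Hot"}, {"Low"})])   # ['VeryLong']
```

With disjoint input labels, at most one rule fires for a given input; with overlapping labels several may fire at once, which is precisely the aggregation issue discussed later in this section.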
7.2.1 Designing a decision system for automatic watering

Let us consider the following case: the owner of a nice estate has the responsibility of watering the family garden, and this task must be performed several times per week. Every evening, the man usually estimates the air temperature and the ground moisture so as to decide the appropriate time required for watering his garden. This amount of time is determined so as to satisfy a twofold objective: on the one hand, he wants to preserve the nice aspect of his garden (especially the dahlias put in by his wife at the beginning of the summer), but on the other hand, he does not want to use too much water for this, preferring to allocate his financial resources to more essential activities. Because this small decision problem is very repetitive, and also because the occasional gardener does not want to delegate the responsibility of the garden to somebody else, he decided to purchase an automatic watering system. The function of this system is first to check, every evening, whether watering is necessary or not, and second to determine automatically the watering time required. The implicit aim of the occasional gardener is to obtain a system that implements the same rules as he does; in his mind, this is the best way to really preserve the current beautiful aspect of the garden. In this case, we need a system able to periodically measure the air temperature and the soil moisture, and a decision module able to determine the appropriate duration of watering, as shown in Figure 7.1.

Figure 7.1: The Decision Module of the Watering System

7.2.2 Linking symbolic and numerical representations

Let t denote the current temperature of the air (in degrees Celsius), and m the moisture of the ground, defined as the water content of the soil. This second quantity, expressed in centigrams per gram (cg/g), corresponds to the ratio:

m = 100 × (x1 − x2) / x2

where x1 is the weight of a soil sample and x2 the weight of the same sample after drying in a low-temperature oven (75–105 °C). Assuming the quantities t and m can be observed automatically, they will constitute the input data of the decision module in charge of the computation of the watering time w (expressed in minutes), which is the sole output of the module. Clearly, w must be defined as a function of the input parameters. Thus, we are looking for a function f such that w = f(t, m) that can simulate the usual decisions of the gardener. Function f must be defined so as to include the subjectivity of

the gardener both in diagnosis steps (evaluation of the current situation) and in decision-making steps (choice of an appropriate action). A common way to achieve this task is to elicit decision rules from the gardener using a very simple language, as close as possible to the natural language used by the gardener to explain his decisions. For example, we can use propositional logic and define rules of the following form:

If T is A and M is B then W is C

where T and M are descriptive variables used for temperature and soil moisture, W is an output variable used to represent the decision, and A, B, C are linguistic values (labels) used to describe temperature, moisture and watering time respectively. For instance, suppose the gardener is able to formulate the following empirical decision rules:

Decision rules provided by the gardener:
R1: if air temperature is Hot and soil moisture is Low then watering time is VeryLong.
R2: if air temperature is Warm and soil moisture is Low then watering time is Long.
R3: if air temperature is Cool and soil moisture is Low then watering time is Long.
R4: if air temperature is Hot and soil moisture is Medium then watering time is Long.
R5: if air temperature is Warm and soil moisture is Medium then watering time is Medium.
R6: if air temperature is Cool and soil moisture is Medium then watering time is Medium.
R7: if air temperature is Hot and soil moisture is High then watering time is Medium.
R8: if air temperature is Warm and soil moisture is High then watering time is Short.
R9: if air temperature is Cool and soil moisture is High then watering time is VeryShort.
R10: if air temperature is Cold then watering time is Zero.

Notice that the elicitation of such rules is usually not straightforward, even if it is the result of a close collaboration with experts in that domain. In some situations, general rules used by experts may appear to be partially inconsistent and must often include explicit exceptions to be fully operational. Indeed, the individual acceptance of each rule is not sufficient to validate the whole set of rules. Even in the case of control rules, where there is no need for chaining inferences (we assume here that the rules directly link inputs (observations) to outputs (decisions)), unsuitable conclusions may appear, resulting from several inferences due to the coexistence of apparently “reasonable” rules. This makes the validation of a set of rules particularly difficult. Hence, structuring the expert knowledge so as to obtain a synthesis of the expert rules in the form of a decision table (a table linking outputs to inputs) requires a significant effort.

We can observe that the decision rules are expressed using only three variables: the air temperature T, the soil moisture M and the watering time W. Moreover, they all take the following form:

either  if T is Ti then W is Wk
or      if T is Ti and M is Mj then W is Wk

The possible labels Ti, Mj and Wk for temperature, moisture and watering time are given by the sets Tlabels, Mlabels and Wlabels respectively:

• Tlabels = {Cold, Cool, Warm, Hot}. These labels can be seen as different words used to specify different areas on the temperature scale.
• Mlabels = {Low, Medium, High}. These labels can be seen as words used to specify different areas on the moisture scale.
• Wlabels = {Zero, VeryShort, Short, Medium, Long, VeryLong}. These labels can be seen as different words used to specify different areas on the time scale.

Using these labels, the rules can be synthesised by the following decision table (see Table 7.1):

Mj \ Ti   Cold         Cool             Warm          Hot
Low       Zero (R10)   Long (R3)        Long (R2)     VeryLong (R1)
Medium    Zero (R10)   Medium (R6)      Medium (R5)   Long (R4)
High      Zero (R10)   VeryShort (R9)   Short (R8)    Medium (R7)

Table 7.1: The decision table of the gardener

This decision table represents a symbolic function F linking Tlabels and Mlabels to Wlabels (Wk = F(Ti, Mj)). Now, the problem is the following: suppose the current air temperature and soil moisture are known; how can a watering time be computed from these sentences, in other words, how can f be defined so as to properly reflect the strategy underlying these rules? Some partial answers could be obtained if we could define a formal relation linking the various labels occurring in the decision rules and the physical quantities observable by the system. We will show alternative approaches that do not require the explicit formulation of decision rules in Section 7.3. Now, assuming that the above set of decision rules has been obtained, we need to produce a numerical translation of function F in order to construct a numerical function f, called the “transfer function”, whose role is to compute a watering time w from any input (t, m). To build such a function, the standard process consists in the following stages:

1. identify the current state (diagnosis) and provide a symbolic description of this state,

2. activate the relevant decision rules for the current state (inference),

3. synthesise the recommendations induced from the rules and derive a numerical output (decision).

The diagnosis stage consists in identifying the current state of the system using numerical measures and describing this state in the language used by the expert to express his decision rules. The inference stage consists of an activation of the rules whose premises match the description of the current state. The decision stage consists of a synthesis of the various conclusions derived from the rules and the selection of the most appropriate action (at this stage, the selected action is precisely defined by numerical output values).

Thus, the definition of the decision function f relies on a symbolic translation of the initial numerical information in the diagnosis stage, a purely symbolic inference implementing the usual decision-making reasoning, and then a numerical translation of the conclusions derived from the rules. In both stages, the subjectivity of the decision-maker is not only expressed in choosing particular decision rules, but also in linking input labels (Tlabels and Mlabels) to observable values chosen on the basis of the temperature and moisture scales. In the decision step, the expert or decision-maker's subjectivity can also be expressed by linking output labels (Wlabels) with elements of the time scale. There are several ways of establishing the symbolic/numeric translation, first in the diagnosis stage and then in the decision stage. Depending on the level of sophistication of the model, symbols can be linked to scalars, intervals or fuzzy sets. In the following subsections, we present the main basic possibilities and discuss the associated representation and aggregation problems.

7.2.3 Interpreting input labels as scalars

A first and simple way of building the symbolic/numerical correspondence is by asking the decision-maker to associate a typical scalar value to each input label used in the rules. A possible way of constructing such tables is to put the expert in various situations, to ask him to qualify each situation with one of the admissible labels, and to measure the observable parameters with gauges so as to make the correspondence. Note that the simplicity of the task is only apparent. The symbolic/numerical translation possibly includes the subjectivity of the decision-maker (perceptions, beliefs, etc.). This is particularly true concerning parameters like “soil moisture”, which are not easily perceived by humans and whose qualification requires an important cognitive effort. Even for apparently simpler notions such as temperature and duration, the expert may be reluctant to make a categorical symbolic/scalar translation. An individual, expert or not, may feel uncomfortable in specifying the scalar translation precisely. If he is nevertheless constrained to produce scalars, he will have to sacrifice a large part of his expertise, and the resulting model may lose much of its relevance to the real situation. We will see later how the difficulty can partly be overcome by the use of non-scalar translations of labels. Let us assume now, for the sake of illustration, that the following numerical information has been provided by the expert (see Tables 7.2, 7.3 and 7.4).

Tlabels     Temperatures (°C)
Cold        10
Cool        20
Warm        25
Hot         30

Table 7.2: Typical temperatures associated to labels Ti

Mlabels     Soil water content (cg/g)
Low         10
Medium      20
High        30

Table 7.3: Typical moisture levels associated to labels Mj

Wlabels     Times (mn)
VeryShort   5
Short       10
Medium      20
Long        35
VeryLong    60

Table 7.4: Typical times associated to labels Wk

Of course, the reliability of the information elicited with such a process is questionable. The analyst must be aware of the share of arbitrariness attached to such a symbolic/numerical translation. He must keep it in mind during the whole construction of the system, and also later when interpreting the outputs of the system.

From the above tables of scalars, the rules allow the following reference points to be constructed:

t    m    w
30   10   60
25   10   35
20   10   35
30   20   35
25   20   20
20   20   20
30   30   20
25   30   10
20   30   5
10   10   0
10   20   0
10   30   0

Table 7.5: Typical reference points

Hence, the “transfer function” f linking the watering time w to the pair (t, m) is known for a finite list of cases and must be extrapolated to the entire range of possible inputs (t, m). This leads to a well-known mathematical problem, since function f must be defined so as to interpolate points of type (t, m, w) where w = f(t, m). Of course, the solution is not unique and some additional assumptions are necessary to define precisely the surface we are looking for. There is no space in this chapter to discuss the relative interest of the various possible interpolation methods that could be used to obtain f. The simplest method is to perform a linear interpolation from the reference points given in Table 7.5. This implies averaging the outputs associated to the reference points located in the neighbourhood of the observed parameters (t, m). For instance, if the observation is (t, m) = (29, 16), the neighbourhood is given by 4 reference points obtained from rules R1, R2, R4 and R5. This yields the points P1 = (30, 10), P2 = (25, 10), P4 = (30, 20) and P5 = (25, 20), with the respective weights 0.32, 0.08, 0.48 and 0.12, the weight ωij of point (xi, yj) being

defined by:

(7.1)    ωij = (1 − |29 − xi| / (30 − 25)) × (1 − |16 − yj| / (20 − 10))

The watering times associated to points P1, P2, P4 and P5 are 60, 35, 35 and 20 respectively, and therefore the final time obtained by a weighted linear aggregation is 41 minutes and 12 seconds. Performing the same approach for any possible input (t, m) leads to a piecewise linear approximation of function f; see Figure 7.2.

Figure 7.2: Approximation of f by linear interpolation

This piecewise linear interpolation method is however not completely satisfactory. First of all, the definition of reference points from the gardener's rules is far from being easy, and other relevant sets of scalar values could be considered as well. As a consequence, the need of interpolating the reference points given in Table 7.5 is itself questionable. Moreover, as mentioned above, no information justifies that function f is linear between the points to be interpolated. Many other interpolation methods could be used as well. Instead of performing an exact interpolation of these points, one can use more sophisticated interpolation methods based on B-spline functions that produce very smooth surfaces with good continuity and locality properties (see e.g. Bartels, Beatty and Barsky 1987), making a non-linear f possible. Alternatively, one may prefer to modify the link between symbols and numerical scales in order to allow symbols to be represented by subsets of plausible numerical values. In this case, reference points are replaced by reference areas in the parameters' space (t, m, w), and the interpolation problem must be reformulated. This point is discussed below.

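The weighted linear aggregation of equation (7.1) can be sketched as follows; this is our illustrative reconstruction of the computation for the observation (t, m) = (29, 16), using the four reference points derived from rules R1, R2, R4 and R5:

```python
# Illustrative sketch (not from the book): bilinear weights of equation (7.1)
# for the observation (t, m) = (29, 16) and its four neighbouring reference points.
def weight(t, m, x, y, t_lo=25.0, t_hi=30.0, m_lo=10.0, m_hi=20.0):
    # Weight of reference point (x, y); reaches 1 when (t, m) = (x, y).
    return (1 - abs(t - x) / (t_hi - t_lo)) * (1 - abs(m - y) / (m_hi - m_lo))

# Reference points (x, y, w) from Table 7.5 (rules R1, R2, R4, R5).
points = [(30, 10, 60), (25, 10, 35), (30, 20, 35), (25, 20, 20)]

t, m = 29, 16
num = sum(weight(t, m, x, y) * w for x, y, w in points)
den = sum(weight(t, m, x, y) for x, y, _ in points)
print(round(num / den, 6))   # 41.2 minutes, i.e. 41 minutes and 12 seconds
```

The four weights (0.32, 0.08, 0.48, 0.12) sum to 1, so the result is a convex combination of the four reference watering times.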
7.2.4 Interpreting input labels as intervals

In the gardener's example, substituting the labels Ti and Mj by scalar values on the temperature and moisture scales has the advantage of simplicity. However, it does not provide a complete solution, since function f is only known for a finite sample of inputs and requires interpolation to be extended to the entire set of possible inputs. Moreover, in many cases, representing the different labels used in the rules by intervals seems preferable. Basically, each label then represents a range of values rather than a single value on a numerical scale. Hence, we can distinguish two cases, depending on whether the intervals associated to the labels partially overlap or not.

Labels represented by disjoint intervals

Suppose that the gardener is able to divide the temperature scale into consecutive intervals, each corresponding to the most plausible values attached to a label Ti. Assuming this is also possible for the moisture scale, let us consider the following intervals:

Tlabels   Temperatures (°C)
Cold      (−∞, 17.5)
Cool      [17.5, 22.5)
Warm      [22.5, 27.5)
Hot       [27.5, +∞)

Table 7.6: Intervals associated to labels Ti

Mlabels   Soil water content (cg/g)
Low       [0, 15)
Medium    [15, 25)
High      [25, 100]

Table 7.7: Intervals associated to labels Mj

If the intervals are defined so as to cover all plausible values, they form a partition of the temperature and moisture scales respectively. Thus, any possible input belongs to at least one interval and can therefore be translated into at least one label: each input (t, m) corresponds to a pair {Ti, Mj}, where Ti (resp. Mj) is the label associated to the interval containing t (resp. m). In this case, there is a unique active rule in Table 7.1 and the conclusion is easy to reach. For example, if (t, m) = (29, 16), then the associated labels are {Hot, Medium} and therefore the only active rule is R4, whose conclusion is “watering time is long”. Thus, if we keep the interpretation of “long” given in Table 7.4, the numerical output is 35.

This process is simple but has serious drawbacks. The granularity of the language used to describe the current state of the system is poor, and many significantly different states are seen as equivalent. This is the case of the two inputs (17.5, 15) and (22.4, 15), which both translate as (Cool, Medium). On the contrary, for some other pairs of inputs that are very similar, the translation diverges. This is the case, for example, of (17.4, 14.9) and (17.5, 15), which respectively give (Cold, Low) and (Cool, Medium).

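With disjoint intervals, the whole chain from numerical input to numerical output can be sketched in a few lines; this is our illustrative reconstruction (not code from the book), using the interval bounds of Tables 7.6 and 7.7, the decision table of Table 7.1 and the scalar translations of Table 7.4:

```python
# Illustrative sketch (not from the book): disjoint-interval translation of the
# inputs, followed by a lookup in the gardener's decision table (Table 7.1).
RULES = {  # (temperature label, moisture label) -> watering-time label
    ("Hot", "Low"): "VeryLong", ("Warm", "Low"): "Long", ("Cool", "Low"): "Long",
    ("Hot", "Medium"): "Long", ("Warm", "Medium"): "Medium",
    ("Cool", "Medium"): "Medium", ("Hot", "High"): "Medium",
    ("Warm", "High"): "Short", ("Cool", "High"): "VeryShort",
}
MINUTES = {"Zero": 0, "VeryShort": 5, "Short": 10,
           "Medium": 20, "Long": 35, "VeryLong": 60}   # Table 7.4

def t_label(t):   # interval bounds of Table 7.6
    if t < 17.5: return "Cold"
    if t < 22.5: return "Cool"
    return "Warm" if t < 27.5 else "Hot"

def m_label(m):   # interval bounds of Table 7.7
    if m < 15: return "Low"
    return "Medium" if m < 25 else "High"

def watering_time(t, m):
    if t_label(t) == "Cold":          # rule R10 ignores the moisture
        return MINUTES["Zero"]
    return MINUTES[RULES[(t_label(t), m_label(m))]]

print(watering_time(29, 16))   # 35: unique active rule R4 (Hot, Medium -> Long)
```

The discontinuity discussed in the text is visible here: `watering_time(17.4, 14.9)` returns 0 while `watering_time(17.5, 15)` returns 20.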
In the first case, rule R10 is activated and a zero watering time is decided, whereas in the second case, rule R6 is activated and a medium watering time is recommended (20 minutes according to Table 7.4). Such discontinuities cannot really be justified and make the output f(t, m) arbitrarily sensitive to the inputs (t, m). This is not suitable because such decision systems are often included in a permanent observation/reaction loop. Suppose for example that several consecutive observations of temperature and moisture in a stable situation yield different values for the parameters t and m due to the imperfection of gauges, and that these variations occur around a point of discontinuity of the system. This can produce alternated sequences of outputs such as Medium, Zero, Medium, Zero, leading to alternate starts and stops of the system, and possibly to dysfunctions.

It is true that narrowing the intervals and multiplying the labels would reduce these drawbacks and refine the granularity of the description, but the number of rules necessary to characterise f would grow significantly with the number of labels. Expressing so many labels and rules requires a very important cognitive effort that cannot reasonably be expected from the expert. Nevertheless, reducing the discontinuity induced by interval boundaries without multiplying labels is possible, as shown below.

Labels represented by overlapping intervals

In order to improve on the previous solution, we have to specify the links between the values of the physical variables describing the system and the symbolic labels used to describe the current state of the system more carefully. A first option for this is allowing for overlap between consecutive intervals. Since it is difficult to separate such intervals with precise boundaries, one can make them partially overlap. This is more realistic, especially because there is no reasonable way of separating “warm” and “hot” with a precise boundary. As an illustration, if Warm and Hot are represented by the intervals [20, 30] and [25, +∞) respectively, then from 20°C to 25°C, Warm is a valid label (a possible source of rule activation) but not Hot; from 25°C to 30°C, both labels are valid; and from 30°C, Hot is valid but not Warm. Thus, in some intermediary areas of the temperature scale, two consecutive labels are associated to a given temperature, reflecting the possible hesitation of the gardener in the choice of a unique label. Typically, 29°C becomes a temperature compatible with the two labels. This progressive transition between the two states warm and hot refines the initial sharp transition from warm to hot by introducing an intermediary state corresponding to a hesitation between the two labels. Note however that measuring a temperature of 29°C possibly allows several rules to be active at the same time. This raises a new problem, since these rules may conclude to diverging recommendations from which a synthesis must be derived. More precisely, the definition of a numerical output can be seen as an aggregation problem, where aggregation is used to interpolate between conflicting rules: any output label (labels Wk in the example) must be translated into numbers, and these numbers must be aggregated to obtain the numerical output of the system (the value of w in the example).

As an illustration, we assume now that the labels are represented by the intervals given in Tables 7.8 and 7.9:

Tlabels   Temperatures (°C)
Cold      (−∞, 20]
Cool      [15, 25]
Warm      [20, 30]
Hot       [25, +∞)

Table 7.8: Intervals associated to labels Ti

Mlabels   Soil water content (cg/g)
Low       [0, 20]
Medium    [10, 30]
High      [20, 100]

Table 7.9: Intervals associated to labels Mj

If the observation of the current situation is t = 29°C and m = 16 cg/g, the relevant labels are {Warm, Hot} for temperature and {Low, Medium} for moisture. These qualitative labels allow some of the gardener's rules to be activated, namely R1, R2, R4 and R5. Therefore, with the observation (t, m) = (29, 16), we can observe 3 conflicting recommendations, and the final decision must be derived from a synthesis of these results. This gives several symbolic values for the watering duration, namely Medium (by R5), Long (by R2, R4) and VeryLong (by R1). Of course, defining what could be a fair synthesis of conflicting qualitative outputs is not an easy task, and deriving a numerical duration from this synthesis is not any easier. A simple idea is to process symbols as numbers. For this, one can link symbolic and numerical information using Table 7.4. We then obtain three different durations, i.e. 20, 35 and 60 minutes, that must be aggregated; for example, one can calculate the arithmetic mean of the 3 outputs.

More generally, for any state (t, m), we can define a weight ω(R) for each decision rule R in the gardener's database B. This weight represents the activity of the rule and, by convention, we set ω(R) = 1 when the decision rule R is activated and ω(R) = 0 otherwise. In the example, we have seen that the active rules are R1, R2, R4 and R5, and therefore ω(R1) = ω(R2) = ω(R4) = ω(R5) = 1, whereas ω(R) = 0 for any other rule R. Let B(α) denote the subset of rules concluding to a watering time α. For any possible value α of w, a weight ω(α) measuring the activity or importance of the set B(α) can be defined as a continuous and increasing function of the quantities ω(R), R ∈ B(α). For example, we can choose:

(7.2)    ω(α) = sup_{R ∈ B(α)} ω(R)

Hence, each watering time activated by at least one rule receives the weight 1 and any other time receives the weight 0. Let us now present in detail the calculation of ω(35). Since 35 (minutes) is the scalar translation of Long, we obtain from the gardener's rules B(35) = {R2, R3, R4}. Hence ω(35) = sup{ω(R2), ω(R3), ω(R4)} = 1.

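The overlapping translation can be sketched as follows (our illustrative reconstruction, not code from the book, using the bounds of Tables 7.8 and 7.9); in contrast with the disjoint case, an input may now receive several labels:

```python
# Illustrative sketch (not from the book): label activation with overlapping
# (closed) intervals, as in Tables 7.8 and 7.9.
T_INTERVALS = {"Cold": (float("-inf"), 20), "Cool": (15, 25),
               "Warm": (20, 30), "Hot": (25, float("inf"))}
M_INTERVALS = {"Low": (0, 20), "Medium": (10, 30), "High": (20, 100)}

def valid_labels(value, intervals):
    # A label is valid when the value lies inside its interval.
    return {lab for lab, (lo, hi) in intervals.items() if lo <= value <= hi}

print(sorted(valid_labels(29, T_INTERVALS)))   # ['Hot', 'Warm']
print(sorted(valid_labels(16, M_INTERVALS)))   # ['Low', 'Medium']
```

Every pair of one valid temperature label and one valid moisture label may activate a rule of Table 7.1, which is why the observation (29, 16) activates R1, R2, R4 and R5 at once.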
In this approach, the final result is obtained through the following sequence:

1. read the current values of input parameters t and m
2. find the symbolic qualifiers that best fit these values
3. detect the decision rules activated by these observations
4. collect the symbolic outputs resulting from the inferences
5. translate symbols into quantitative numerical outputs
6. aggregate these numerical outputs.

Since there is a finite number of rules, there is only a finite number of times activated by the rules in a given state. In order to synthesise these different times, the most popular approach is the "centre of gravity" method, which amounts to performing a weighted sum (see also chapter 6) of all possible times α. Everything works as if each active rule was voting for a time. A first option is to give the same weight to every time supported by at least one active rule:

(7.2)    ω(α) = 1 if at least one active rule recommends α, and ω(α) = 0 for all other α.

From the observation (t, m) = (29, 16), we get ω(20) = 1 thanks to R5, ω(35) = 1 thanks to R2 and R4, and ω(60) = 1 thanks to R1. Another option, taking account of the number of rules supporting each time α, could be:

(7.3)    ω(α) = Σ_{R ∈ B(α)} ω(R)

where B(α) is the set of active rules recommending time α and ω(R) = 1 when rule R is active. This second option gives more importance to a time α supported by several rules than to a time α supported by a single rule: the more a given time is supported by the set of active rules, the more important it becomes in the calculation of the final watering time. Coming back to the example, we now obtain ω(35) = ω(R2) + ω(R4) = 2, whereas the other ω(α) remain unchanged. Formally, the final output is defined by:

(7.4)    w = Σ_α ω(α) α / Σ_α ω(α)

With the observation (t, m) = (29, 16), equations (7.2) and (7.4) yield a watering time of (60 + 35 + 20)/3, i.e. 38 minutes and 20 seconds, whereas equation (7.3) yields w = 0.25 × (60 + 35 + 35 + 20), which amounts to 37 minutes and 30 seconds. In a practical situation, one can easily imagine that the choice of one of these options is not easy to justify. Equation (7.3) could be preferred when the activations of the various rules are independent. On the contrary, when the activation of a subset of rules necessarily implies that another subset of rules is also active, one could prefer resorting to (7.2) so as to avoid possible overweighting due to redundancy in the set of rules. Note that the choice of a weighted sum as final aggregator in equation (7.4), as in the linear interpolation approach used in the previous subsection, is questionable, and one could formulate criticisms similar to those addressed to the weighted average in the previous chapters (especially in chapter 6).
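As a minimal sketch (in Python, not from the book), the two weighting options (7.2) and (7.3) and the centre-of-gravity aggregation (7.4) can be written as follows; the times 60, 35, 35 and 20 are the recommendations of the active rules R1, R2, R4 and R5 in the example above:

```python
def aggregate(times, count_votes):
    """Centre-of-gravity output (7.4) over the times recommended by the
    active rules (one entry per active rule, duplicates allowed)."""
    if count_votes:
        # equation (7.3): a time supported by several rules gets a larger weight
        weights = {}
        for a in times:
            weights[a] = weights.get(a, 0) + 1
    else:
        # equation (7.2): every supported time gets weight 1
        weights = {a: 1 for a in set(times)}
    return sum(w * a for a, w in weights.items()) / sum(weights.values())

recommended = [60, 35, 35, 20]   # outputs of the active rules R1, R2, R4, R5
print(aggregate(recommended, count_votes=False))  # (60 + 35 + 20) / 3 minutes
print(aggregate(recommended, count_votes=True))   # (60 + 70 + 20) / 4 = 37.5 minutes
```

The `count_votes` flag is a hypothetical name introduced here only to switch between the two weighting options discussed in the text.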

This process is perhaps the most elementary way of using a set of symbolic decision rules to build a numerical decision function. It is a simple illustration of the so-called "computing with words" paradigm advocated by Zadeh (see Zadeh 1999). The main advantages of such a process are the following:

• it relies on simple decision rules expressed in a language close to the natural language used by the expert;
• it allows one to define a reasonable decision function allowing numerical outputs to be computed from any possible numerical input;
• if necessary, any decision can be explained very simply: the outputs can always be presented as a compromise between recommendations derived from several of the expert's decision rules.

Nevertheless, it is not easy to describe a continuum of states (characterised by all pairs (t, m) in the gardener example) with a finite number of labels of type (Ti, Mj). This induces arbitrary choices in the description of the current state which could disrupt the diagnosis stage and make the automatic decision process discontinuous, as shown by the following example.

Example (1). Consider two very similar states s1 and s2 characterised by the observations (t, m) = (25, 19.99) and (t, m) = (24.99, 20.01). According to Tables 7.8 and 7.9, state s1 makes valid the labels {Warm, Hot} for temperature and {Low, Medium} for soil moisture. This activates the rules R1, R2, R4 and R5, whose recommendations are VeryLong, Long, Long and Medium respectively. The resulting watering time obtained by equation (7.4) is therefore 38 minutes and 45 seconds. Things are really different for s2, however, despite the close similarity between the states s1 and s2. The valid labels are {Cool, Warm} for temperature and {Medium, High} for soil moisture. This activates the rules R5, R6, R8 and R9, whose recommendations are Medium, Medium, Short and VeryShort respectively. The resulting watering time obtained by equation (7.4) is therefore 13 minutes and 45 seconds. Thus, despite the similarity of the states, there is a significant difference in the watering times computed from the two input vectors. ♦

This criticism is serious. The activations and computations performed for s1 and s2 differ significantly. In the right neighbourhood of this entry (t > 25 and m < 20), the decision rules R1, R2 and R4 are fully active, but this is no longer the case in the left neighbourhood of the point (t < 25 and m > 20), where they are replaced by the rules R6, R8 and R9, thus leading to a much shorter time. This is due to the discontinuity of the transfer function that defines the watering time from the input (t, m). It is true that interpreting labels as intervals does not really prevent discontinuous transfers from inputs to outputs, but the difficulty can partly be overcome. In fact, depending on the choice of the numerical encoding of the labels, the numerical outputs resulting from the decision rules may vary significantly. Since the numerical/symbolic and then symbolic/numerical translations are both sources of arbitrariness, the following question can be raised: why not use numbers directly?

There are two partial answers. First, in many decision contexts, the possibility of justifying decisions is a great advantage, even if each decision considered separately is of marginal importance. Second, the ability of automatic decision systems to simulate human reasoning and to explain decisions by rules is generally seen as an important advantage. This argument often justifies the use of rule-based systems to automatise decision-making.

7.2.5 Interpreting input labels as fuzzy intervals

One step back in the modelling process, there are several ways of improving the process proposed above and of refining the formal relationship between qualitative labels and numerical values. It is not our purpose to cover all possibilities in detail. We only present and discuss some very simple and intuitive ideas used to construct more sophisticated models and tools in this context.

Although this is not crucial in our illustrative example, we can redefine more precisely the relationship between a given label and the numerical scale associated to the label. As an expert, the gardener can easily specify the typical temperatures associated with each label. For example, he could explain that Warm means between 20 and 30 degrees, with 25 as the most plausible value. He can also define areas that are definitely not concerned with each label. More precisely, one can define the relative likelihood of each temperature when the temperature has been qualified as Hot, Warm, Cool or Cold. In this case, each label Ti is defined with fuzzy boundaries and characterised by a function µTi. Thus, each label Ti is represented by a [0, 1]-valued function µTi defined on the temperature scale, in such a way that µTi(t) represents the compatibility degree between temperature t and label Ti. As a convention, we set µTi(t) = 0 when temperature t is not connected to the label Ti, and µTi(t) = 1 when t is perfectly representative of the label Ti. A simple example of such fuzzy labels is represented in Figures 7.3 and 7.4.

Figure 7.3: Fuzzy labels for the air temperature

These fuzzy labels can partially overlap, but they must be defined in such a way that any part of the temperature scale is covered by at least one label. Note that sometimes the fuzzy labels are defined in such a way that the membership values add up to 1 for any possible value of the numerical parameter. This is the case for the labels defined in Figure 7.4, for which we have:

(7.5)    ∀m ≥ 0, µLow(m) + µMedium(m) + µHigh(m) = 1
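For illustration, membership functions satisfying the partition condition (7.5) can be sketched with piecewise-linear labels; the breakpoints 10, 20, 30 and 40 are invented for this sketch and are not read off Figure 7.4:

```python
def mu_low(m):
    if m <= 10: return 1.0
    if m >= 20: return 0.0
    return (20 - m) / 10          # linear transition between 10 and 20

def mu_high(m):
    if m <= 30: return 0.0
    if m >= 40: return 1.0
    return (m - 30) / 10          # linear transition between 30 and 40

def mu_medium(m):
    # defined by complementation, which enforces the partition (7.5)
    return 1.0 - mu_low(m) - mu_high(m)

# (7.5) holds for every moisture value tested
for m in range(0, 51):
    assert abs(mu_low(m) + mu_medium(m) + mu_high(m) - 1.0) < 1e-12
```

With these hypothetical breakpoints, µLow(16) = 0.4 and µMedium(16) = 0.6, which happen to be the degrees used in the numerical example of the text.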

Figure 7.4: Fuzzy labels for the soil moisture

This property (7.5) is the numerical translation of a natural condition requiring that the fuzzy labels Low, Medium and High form a partition of the set of possible moistures. Note however that this property makes sense only when the membership values have a cardinal meaning.

With such fuzzy labels, each decision rule can be activated to a certain degree. More precisely, for any rule Rij of type:

if T is Ti and M is Mj then W is Wk

where Wk = F(Ti, Mj), and for any numerical observation (t, m), the weight (or activation degree) ωij of the rule Rij reflects the importance (or relevance) of the rule in the current situation. This importance depends on the matching of the input (t, m) and the premise (Ti, Mj), i.e. the degree to which the numerical inputs match the premises of the rule. It is therefore natural to state:

(7.6)    ωij = h(µTi(t), µMj(m))

where h is an aggregation function representing the logical "and" used in the rule, e.g. h(x, y) = min(x, y). As a numerical example, consider the gardener's rule R1. The observation (t, m) = (29, 16) leads to µHot(t) = 0.8 and µLow(m) = 0.4: the temperature is Hot to the degree 0.8 and the moisture is Low to the degree 0.4, and therefore the weight of the rule R1 is min(0.8, 0.4) = 0.4. Using this approach for each rule with h = min yields the following activation weights, where the rows correspond to the labels Ti (with their membership degrees) and the columns to the labels Mj (see Table 7.10):

              Low (0.4)    Medium (0.6)   High (0)
Cold (0)      0 (R10)      0 (R10)        0 (R10)
Cool (0)      0 (R3)       0 (R6)         0 (R9)
Warm (0.2)    0.2 (R2)     0.2 (R5)       0 (R8)
Hot (0.8)     0.4 (R1)     0.6 (R4)       0 (R7)

Table 7.10: The weights of the rules when (t, m) = (29, 16)
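The computation of Table 7.10 can be reproduced as a short sketch; only the four rules with a nonzero weight are listed, since h = min forces the weight of every rule with a zero premise degree to 0:

```python
# membership degrees for (t, m) = (29, 16), as given in the text
mu_T = {"Hot": 0.8, "Warm": 0.2, "Cool": 0.0, "Cold": 0.0}
mu_M = {"Low": 0.4, "Medium": 0.6, "High": 0.0}

# premise (Ti, Mj) of each rule, as in the gardener example
rules = {
    "R1": ("Hot", "Low"), "R2": ("Warm", "Low"),
    "R4": ("Hot", "Medium"), "R5": ("Warm", "Medium"),
}

# equation (7.6) with h = min
weights = {r: min(mu_T[Ti], mu_M[Mj]) for r, (Ti, Mj) in rules.items()}
print(weights)  # {'R1': 0.4, 'R2': 0.2, 'R4': 0.6, 'R5': 0.2}
```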

Hence, using equation (7.4), we get

w = (0.4 × 60 + 0.2 × 35 + 0.6 × 35 + 0.2 × 20) / (0.4 + 0.2 + 0.6 + 0.2)

and therefore the watering time is 40 minutes. In the additive formulation characterised by equation (7.4), everything works as if each active rule was voting for one candidate chosen in the set of Wlabels. Here, the activation level of each rule is graduated on the [0, 1] scale and the weights directly reflect the adequacy of the rule in the current situation: the more the premise of the rule matches the current situation, the more important the rule is in the voting process. Note that the definition of an aggregation function yields a compromise solution between the various active decision rules, whose outputs are partially conflicting. These weights depend continuously on the input parameters t and m. This enables a soft control of the output, which can be perfectly illustrated by the example discussed at the end of subsection 7.2.4. If we consider the two neighbour states s1 and s2 introduced in this example, and if we choose h = min in equation (7.6), the resulting activation weights are those given in Tables 7.11 and 7.12.

              Low (0.001)   Medium (0.998)   High (0)
Cold (0)      0 (R10)       0 (R10)          0 (R10)
Cool (0)      0 (R3)        0 (R6)           0 (R9)
Warm (0.998)  0.001 (R2)    0.998 (R5)       0 (R8)
Hot (0.002)   0.001 (R1)    0.002 (R4)       0 (R7)

Table 7.11: The weights of the rules when (t, m) = (25, 19.99)

              Low (0)       Medium (0.999)   High (0.001)
Cold (0)      0 (R10)       0 (R10)          0 (R10)
Cool (0.002)  0 (R3)        0.002 (R6)       0.001 (R9)
Warm (0.998)  0 (R2)        0.998 (R5)       0.001 (R8)
Hot (0)       0 (R1)        0 (R4)           0 (R7)

Table 7.12: The weights of the rules when (t, m) = (24.99, 20.01)

We notice that the activity of each rule does not vary significantly when passing from state s1 to state s2. As a consequence, using equation (7.4) and Table 7.11, we get w(s1) = 20 minutes and 5 seconds as the final output. Similarly, for state s2, the activation weights obtained from equation (7.6) are only slightly different from those for s1, and the final output derived from Table 7.12 using equation (7.4) gives w(s2) = 19 minutes and 58 seconds. This is due to the way the activation weights are defined and used in the process: the membership functions defining the labels have soft variations, and the aggregation function used to derive the final watering time w is also a continuous function of the quantities ω(R) (see equation (7.4)).
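The graded centre of gravity leading to the 40-minute output can be checked directly; this sketch uses the activation degrees of Table 7.10 and the scalar rule outputs 60, 35, 35 and 20 minutes of the example:

```python
# (activation weight, recommended time) for the active rules R1, R2, R4, R5
pairs = [(0.4, 60), (0.2, 35), (0.6, 35), (0.2, 20)]

# equation (7.4) with graded weights
w = sum(wi * a for wi, a in pairs) / sum(wi for wi, _ in pairs)
print(round(w, 6))  # 40.0 minutes
```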

Thus, the quantity w depends continuously on the input parameters t and m. The resulting decision system is more realistic and robust to slight variations of the inputs. More generally, the use of fuzzy labels to interpret input labels has a significant advantage: it makes it possible to define a continuous transformation of numerical input data (temperature, moisture) into the symbolic variables used in the decision rules. This explains the observed improvement with respect to the previous model based on the use of all-or-nothing activation rules. This advantage is due to the use of fuzzy sets and has greatly contributed to the practical success of the fuzzy approach in automatic control (fuzzy control, see e.g. Mamdani 1981, Sugeno 1985, Bouchon 1995, Gacogne 1997, Nguyen and Sugeno 1998).

However, several criticisms can be addressed to the small fuzzy decision module presented above. Among them, let us mention the following:

• the choice h = min in equation (7.6) requires that quantities of type µTi(t) and µMj(m) are commensurate. It also requires comparing the fit of any temperature to any label Ti with the fit of any moisture to any label Mj. This assumption, which is rarely explicit, is very strong because it requires much more than comparing the relative fit of two temperatures (resp. two moistures) to a label Ti (resp. Mj). Moreover, the choice of the min is often justified by the fact that h is used to evaluate a conjunction between the several premises of a given rule (a conjunction of type "temperature is Ti and moisture is Mj"). Note however that the idea of the conjunction is captured by any other t-norm (see for instance Fodor and Roubens (1994)); for example, the product could perhaps replace the min, and the particular choice of the min is not straightforward. This is problematic because this choice is not without consequence on the definition of the watering time;

• the interpretation of the symbolic labels used to describe the outputs of the rules as scalar values is not easy to justify. Why not use a description of these labels as intervals, in the same way as for the input labels?

The last criticism suggests an improvement of the current system: we have to sophisticate the previous construction so as to improve the output processing. This point is discussed in the next subsection.

7.2.6 Interpreting output labels as (fuzzy) intervals

Suppose for example that the Wlabels are no longer described by scalar values but by subsets of the time scale. For instance, the labels Wk could be represented by a set of intervals (overlapping or not), with advantages similar to those mentioned for the input labels Ti and Mj. Paralleling the treatment of symbolic inputs, we can use intervals or fuzzy intervals later in the process so as to continuously link the symbolic outputs of the rules (Wlabels) to the numerical outputs (watering times). In what follows, we assume that the Wlabels are represented by fuzzy intervals of the time scale, by analogy with Mamdani's approach to fuzzy control (Mamdani 1981).

For the sake of illustration, let us consider the labels represented in Figure 7.5.

Figure 7.5: Fuzzy labels for the watering time

For any state (t, m) of the system, the range of relevant watering times is the union of all values compatible with the labels Wk derived from the active rules; each of these labels represents a possible numerical translation of an output obtained by the activation of one or several rules. In the example, for (t, m) = (29, 16), the active rules are R1, R2, R4 and R5, and therefore the Wlabels concerned are "Medium", "Long" and "VeryLong". Hence the set of relevant watering times is [10, 70]. However, all times are not equivalent inside this set: in more nuanced situations, the weight attached to a possible time is a function of the fitness of the times activated to a certain degree by the rules. To be fully considered, a time must be perfectly representative of a label Wk that has been obtained by a fully active rule. More precisely, the weight of any watering time α can be defined by:

(7.7)    ω_{t,m}(α) = sup_{Rij ∈ B} h(µTi(t), µMj(m), µWk(α))

where B represents the set of rules (here the gardener's rules), Rij represents the rule "If T = Ti and M = Mj then W = Wk", and h is a non-decreasing function of its arguments (in Mamdani's approach, h = min). The idea in equation (7.7) is that a watering time α must receive an important weight when there is at least one rule Rij whose premises (Ti, Mj) are valid for the observation (t, m) and whose conclusion Wk is compatible with α. This explains why ω_{t,m}(α) is defined as an increasing function of the quantities µTi(t), µMj(m) and µWk(α). Notice that equation (7.7) is a natural extension of equation (7.2). In order to obtain a precise watering time, we can then use an equation similar to (7.4). However, this equation must be generalised because there may be an infinity of times activated by the rules (e.g. a whole interval). The usual extension of the weighted average to an infinite set of values is given by the following integral:

(7.8)    w = ∫ ω_{t,m}(α) α dα / ∫ ω_{t,m}(α) dα

For example, the observation (t, m) = (29, 16) leads to the function ω_{29,16} represented in Figure 7.6.
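A sketch of the sup-based weighting (7.7) with h = min and of the centroid (7.8) computed by a crude discretisation; the two active rules, their activation degrees and the triangular output labels below are invented for the sketch and are not those of Figure 7.5:

```python
def triangle(a, left, peak, right):
    """Triangular fuzzy label on the time scale."""
    if a <= left or a >= right:
        return 0.0
    if a <= peak:
        return (a - left) / (peak - left)
    return (right - a) / (right - peak)

# (activation degree of the rule, membership function of its output label);
# both degrees and labels are hypothetical
active_rules = [
    (0.6, lambda a: triangle(a, 25, 35, 45)),   # a "Long"-like label
    (0.2, lambda a: triangle(a, 10, 20, 30)),   # a "Medium"-like label
]

def weight(a):
    # equation (7.7) with h = min: best match over the active rules
    return max(min(act, mu(a)) for act, mu in active_rules)

# equation (7.8): centre of gravity of the weight function, by discretisation
step = 0.01
grid = [i * step for i in range(int(70 / step) + 1)]
w = sum(weight(a) * a for a in grid) / sum(weight(a) for a in grid)
```

The resulting time lies between the two label peaks, pulled towards the more strongly activated one.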

Figure 7.6: Weighted times induced by the rules

This integral can be approximated by the following quantity:

(7.9)    w = Σ_i ω_{t,m}(αi) αi / Σ_i ω_{t,m}(αi)

where (αi) is a strictly increasing sequence of times resulting from a fine discretisation of the time scale. In our example, a discretisation with step 0.1 gives a final time of 37 minutes and 32 seconds. This last sophistication meets our objective because it provides a transfer function f with good continuity properties. However, the use of equations (7.7)-(7.9) can be seriously criticised:

• the definition of ω_{t,m}(α) proposed in equation (7.7) from an increasing aggregation function h is not very natural. Indeed, bearing in mind the form of rule Rij, the quantity h(µTi(t), µMj(m), µWk(α)) stands for the numerical translation of the proposition:

(Ti = t and Mj = m) implies Wk = α

In the fields of multi-valued logic and fuzzy set theory, admissible functions used to translate implications are required to be non-increasing with respect to the value of the left hand-side of the implication and non-decreasing with respect to the value of the right hand-side (Fodor and Roubens 1994, Perny and Pomerol 1999). As an example, the value attached to the sentence "A implies B" can be defined by the Lukasiewicz implication min(1 − v(A) + v(B), 1), where v(A) and v(B) are the values of A and B respectively. Thus, resorting to implication operators instead of conjunctions in order to implement an inference via rule Rij also seems legitimate. In our case, the conjoint use of the min operator to interpret the conjunction on the left hand-side and of the Lukasiewicz implication would lead to the following h function:

h(x, y, z) = min(1 − min(x, y) + z, 1)

Note that this function is not increasing in its arguments, contrary to what was required above in the text. This is usual in the fields of fuzzy inference and approximate reasoning, where a formula like (7.7) is used to generalise the so-called "modus ponens" inference rule (Zadeh 1979, Baldwin 1979, Dubois and Prade 1988, Bouchon 1995).
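The h function built from min and the Lukasiewicz implication can be checked to be non-monotonic in its premise arguments (a sketch with arbitrary degrees):

```python
def lukasiewicz_implication(a, b):
    # value of "A implies B" for truth degrees a = v(A), b = v(B)
    return min(1 - a + b, 1)

def h(x, y, z):
    # premises combined by min, then linked to the output degree z
    # by the Lukasiewicz implication: min(1 - min(x, y) + z, 1)
    return lukasiewicz_implication(min(x, y), z)

# raising a premise degree can lower the result: h is not non-decreasing
assert h(0.2, 0.9, 0.3) > h(0.8, 0.9, 0.3)
```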

However, the definition of h is not straightforward and must be justified in the context of the application. A reasonable alternative to min(x, y) could be the Lukasiewicz t-norm max(x + y − 1, 0); one could also discuss the use of the min to interpret a conjunction whereas the Lukasiewicz implication is used to interpret implications. Some general guidelines for choosing a suitable h are given in (Bouchon 1995).

• Equation (7.7) requires even more commensurability than equation (7.6). Indeed, inequalities of type µTi(t) > µWk(α) play a role in the process. Thus, we should be able to determine whether any temperature t is a better representative of a label Ti than time α is representative of label Wk. This is a very strong assumption, especially if we consider the way these labels are represented in the model. Usually, a label thought of as a fuzzy interval is assessed on the basis of 3 elements:

– the support, i.e. the interval of all numerical values compatible with the label: their membership must be strictly positive;
– the core, i.e. the interval of all numerical values perfectly representative of the label: their membership is equal to 1 (the core is a subset of the support);
– the membership function, making a continuous transition from the border of the support to the border of the core.

For example, the label Long in Figure 7.5 is defined by the support [20, 55], the core [30, 40] and two linear transitions (membership to non-membership) in the ranges [20, 30] and [40, 55]. One could expect that the decision-maker is able to specify the support and the core of each fuzzy label, as well as the trend of the membership function (increasing from the border of the support to the border of the core). However, this information leaves room for an infinity of functions. In practice, the shape of the membership function in the transition area is often chosen as linear or gaussian (for derivability), but it is rarely justified by questioning the decision-maker; in many cases, the choice of a precise membership function remains arbitrary. Without additional assumptions, the only reliable information contained in the membership function is the relative adequation of each temperature, moisture or time to each label, and the definition of the weights ω_{t,m}(α) in equation (7.8) with h = min is difficult to justify. For example, µLong(21) = 0.1 and µLong(25) = 0.5 only means that 25 minutes is a better numerical translation of the qualifier Long than 21 minutes. This does not necessarily mean that 25 minutes is more Long than 30 minutes is Medium, even if µMedium(30) = 0.2, nor that 25 minutes is more Long than 26°C is Hot, even if µHot(26°) = 0.2. Neither does it mean that 25 minutes is a 5 times better translation of the qualifier Long than 21 minutes

because the membership value is 5 times larger. To interpret the weights as cardinal values, we would need to consider that 25 minutes is 5 times better than 21 minutes to represent "Long". This is one more very strong hypothesis.

• Bearing in mind that the weights ω_{t,m} are used as cardinal weights in (7.4), while they are defined from the membership values µTi(t), µMj(m) and µWk(α), the membership values should have a cardinal interpretation. Even when the commensurability assumption on the membership scales is realistic, the weights cannot necessarily be interpreted as cardinal values, and the weighted aggregation proposed in equation (7.8) is questionable. As an illustration of the latter, consider the following example showing the impact of an increasing transformation of the membership values on the output watering time.

Example (2). Consider the two following input vectors: i1 = (29, 29) and i2 = (18, 16). For the sake of simplicity, we use the non-fuzzy labels given in Table 7.4 for the interpretation of the labels Wk and, assuming we use equations (7.2) and (7.4), we obtain the following result: w(i1) = 19 minutes and 33 seconds and w(i2) = 21 minutes and 40 seconds (the corresponding activation weights are given in Tables 7.13 and 7.14). Notice that the times are not so different, despite the important difference between the inputs i1 and i2. This can be easily explained by observing that, in the second case, the temperature is lower, but the soil water content is also lower, and the two aspects compensate each other.

Now, we transform all the membership functions of the labels by the function φ(x) = x^(1/3). This preserves the support and the core of each label; in fact, it represents the same ordinal information about membership degrees. The activation tables are then altered as shown in Tables 7.15 and 7.16. This gives the following watering times: w(i1) = 20 minutes and 34 seconds, w(i2) = 19 minutes and 42 seconds. Note that we now have w(i1) > w(i2), whereas it was just the opposite before the transformation of the membership values. ♦

              Low (0)      Medium (0.1)   High (0.9)
Cold (0)      0 (R10)      0 (R10)        0 (R10)
Cool (0)      0 (R3)       0 (R6)         0 (R9)
Warm (0.2)    0 (R2)       0.1 (R5)       0.2 (R8)
Hot (0.8)     0 (R1)       0.1 (R4)       0.8 (R7)

Table 7.13: The weights of the rules for input i1

This example shows that the comparison of output values is not invariant to monotonic transformations of the membership values, and this explains the "more than ordinal" interpretation of membership values in the computation of w. Although this inversion of durations is not a crucial problem in the case of the watering system, it could be more problematic in other contexts. For instance, if we use a similar system (based on fuzzy rules) to rank candidates in a competition, the choice of a particular shape for the membership functions must be well justified, because it may really change the winner.
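The phenomenon of Example (2) can be reproduced in miniature; the weights and times below are invented for the sketch (they are not those of Tables 7.13-7.16), but they exhibit the same order reversal under the cube-root rescaling φ(x) = x^(1/3):

```python
def output(weights, times):
    # centre of gravity (7.4) restricted to finitely many times
    return sum(w * a for w, a in zip(weights, times)) / sum(weights)

phi = lambda x: x ** (1 / 3)   # strictly increasing: same ordinal content

# two inputs activating different rules, hence different (weight, time)
# profiles; all numbers are hypothetical
w1, t1 = [0.5, 0.5], [40, 10]
w2, t2 = [0.05, 0.12], [60, 0]

assert output(w1, t1) > output(w2, t2)                 # i1 waters longer ...
assert output([phi(x) for x in w1], t1) < \
       output([phi(x) for x in w2], t2)                # ... until the rescaling
```

The rescaling changes nothing ordinally about any single membership degree, yet it flips the comparison of the two defuzzified outputs, exactly the sensitivity criticised in the text.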

              Low (0.4)    Medium (0.6)   High (0)
Cold (0.2)    0.2 (R10)    0.2 (R10)      0 (R10)
Cool (0.8)    0.4 (R3)     0.6 (R6)       0 (R9)
Warm (0)      0 (R2)       0 (R5)         0 (R8)
Hot (0)       0 (R1)       0 (R4)         0 (R7)

Table 7.14: The weights of the rules for input i2

              Low (0)      Medium (0.464)   High (0.965)
Cold (0)      0 (R10)      0 (R10)          0 (R10)
Cool (0)      0 (R3)       0 (R6)           0 (R9)
Warm (0.585)  0 (R2)       0.464 (R5)       0.585 (R8)
Hot (0.928)   0 (R1)       0.464 (R4)       0.928 (R7)

Table 7.15: The modified weights of the rules for input i1

              Low (0.737)   Medium (0.843)   High (0)
Cold (0.585)  0.585 (R10)   0.585 (R10)      0 (R10)
Cool (0.928)  0.737 (R3)    0.843 (R6)       0 (R9)
Warm (0)      0 (R2)        0 (R5)           0 (R8)
Hot (0)       0 (R1)        0 (R4)           0 (R7)

Table 7.16: The modified weights of the rules for input i2

There is no room here to discuss further the use of numerical representations in rule-based automatic decision systems. Several alternatives to the weighted sum are compatible with ordinal weights. For instance, Sugeno integrals (see Sugeno 1977, Dubois and Prade 1987), which have received much attention in the past decades, could be used advantageously to process ordinal weights. However, they are not as discriminating as the weighted sum and they cannot completely avoid commensurability problems (see Dubois, Prade and Sabbadin 1998, Fargier and Perny 1999). Another possibility is resorting to other aggregation methods that do not require the same level of information. To go further with rule-based systems using fuzzy sets, the reader should consult the literature about fuzzy inference and fuzzy control; as a first set of references for theory and applications, one can consult (Mamdani 1981), (Sugeno 1985), (Bouchon 1995), (Gacogne 1997) and, for a recent synthesis on the subject, (Nguyen and Sugeno 1998). These works present formal models but also empirical principles derived from practical applications, and thus provide a variety of techniques that have proved

efficient in practice. Moreover, some theoretical justifications of the choices of representations and operators are now available, bringing justifications to some methods used by engineers in practical applications and also suggesting multiple improvements (see Dubois, Prade and Ughetto 1999).

7.3 A System with Implicit Decision Rules

7.3.1 Controlling the quality of biscuits during baking

The control of food processes is a typical example where humans traditionally play an important role in preserving the standard quality of the product. In the field of biscuit manufacturing, the human operators controlling biscuit baking lines have the possibility of regulating the ovens during the baking process. This implies periodic evaluation, diagnosis and decision tasks that could perhaps be automatised. For instance, when an overcooked biscuit is detected, the operator properly retroacts on the oven settings after checking its current temperature. The overall efficiency of production lines and the quality of the final product highly depend on the ability of the human supervisors to identify a degradation of the quality of the final product and on their aptitude to best fit the control parameters to the current situation. As an example, let us report some elements of an application concerning the control of the quality of biscuits through oven regulation during baking (for more details see Trystram, Perrot and Guely 1995, Perrot 1997, Trystram, Perrot, Le Guennec and Guely 1996, Grabisch et al. 1997).

However, such automatisation is not obvious, because human expertise in oven control during the baking of biscuits mainly relies on a subjective evaluation, e.g. a visual inspection of the general aspect and the colour of the biscuits, and on the operator's skill in reacting to possible perturbations of the baking process. In the case of an automatic system, the only information accessible to the system consists of physical, objective parameters obtained from measures and sensors, which are not easily linked to human perception. In the example of automatic diagnosis during baking, the only available measures are the following:

• a sensor located in the oven measures the air moisture; the evaluation m is given in cg/g (centigrams per one gram of dry matter) in the range [0, 10], the desired values being around 4 cg/g;
• the thickness t of the biscuit is measured every 10 minutes; t is defined as the mean of 6 consecutive measures performed on biscuits and expressed in mm, and the desired values are about 33 or 34 mm;
• concerning the biscuit aspect, a colour sensor is located within the oven, near the biscuit line. It measures colours with 3 parameters, which are the luminance L, a level a on

the red-green axis and a level b on the yellow-blue axis. The desired colour is not easy to specify, especially concerning the aspect of the biscuit, which cannot be easily linked by the expert to the physical parameters (L, a, b) measured by an automatic system.

In performing oven control, like in many other domains, the decision-making process consists of two consecutive stages: a diagnosis stage, which consists in evaluating the state of the last biscuits using sensors, and a decision stage, which must determine a regulation action on the oven. In this context, the diagnosis task performed by the expert controlling baking can be seen as a pattern recognition task. However, the patterns are implicit and subjective, and it is not always possible to obtain sufficiently explicit knowledge from the expert to construct a satisfactory rule database (in section 7.4 we will see an approach integrating expert rules in the control of baking). Hence, following the approach adopted in section 7.2 seems problematic. Sometimes, the only information accessible must be directly inferred from the expert's observation during his control activity; the patterns can then be approximated by observing the action of a human controller on the oven in a variety of cases.

Assuming a representative sample of biscuits is available, we can represent each biscuit i of the sample by a vector xi = (mi, ti, Li, ai, bi) in the multiple attribute space of physical variables used to describe biscuits. Then, each biscuit can be evaluated by the expert and a diagnosis of disfunction d(xi) can be obtained for each description xi, explaining the bad quality of biscuit i (e.g. "oven too hot", "oven not hot enough"). Thus, the subjective evaluation of biscuits can be partially explained by their objective description, and we can construct an explicit representation of the patterns in a more "objective" space formed by the observable variables. In this space, the pattern associated to each disfunction z is defined by the set of points xi such that d(xi) = z. It is not unrealistic to assume that the usual disfunctions have been identified and categorised by the expert and that, for each of them, a standard regulation action is known. Hence, assuming that a finite list of categories is implicitly used by the expert (each of them being associated to a pattern, e.g. a characteristic set of "irregular" biscuits), the diagnosis stage consists in identifying the relevant pattern for any irregular biscuit and the decision stage consists in performing the regulation action appropriate to the pattern. The following subsection presents an alternative way of establishing this link, using similarity from known examples.

7.3.2 Automatising human decisions by learning from examples

Determining the right pattern for any new input vector x is a classification problem where the categories C1, ..., Cq are the q possible disfunctions and the objects to be assigned are the vectors x = (m, t, L, a, b) describing an object (e.g. a biscuit). Let X be the set of all possible vectors x = (m, t, L, a, b). Thus, a classification procedure can be seen as a function assigning to each vector x ∈ X the vector (µC1(x), ..., µCq(x)) giving the membership of x to each category (e.g. a possible disfunction of the oven).

The aim is to obtain, for each vector x ∈ X, the vector (µC1(x), ..., µCq(x)) giving the membership of x to each category (i.e. possible disfunction of the oven). One of the most popular classification methods is the so-called Bayes rule, which is known to minimise the expected error rate. However, the rule requires knowing the prior and conditional probability densities of all categories. When this information is not available (this is the case in our example), the nearest neighbour algorithm is very useful. The basic principle of the k-Nearest Neighbour assignment rule (k-NN), introduced in (Fix and Hodges 1951), is to assign an object to the class to which the majority of its k nearest neighbours belong. More precisely, for any sample S ⊂ X of vectors whose correct assignment is known, if Nk(x) represents the subset of S formed by the k nearest neighbours of x within S, the k-NN rule is defined for any k ∈ {1, ..., n} by:

(7.10)  µCj(x) = 1 if j = Arg max_i Σ_{y ∈ Nk(x)} µCi(y), and µCj(x) = 0 otherwise

where Arg max_i g(i) denotes the value i for which g(i) is maximal. In equation (7.10), g(i) = Σ_{y ∈ Nk(x)} µCi(y) represents the total number of vectors, among the k nearest neighbours of x, that have been assigned to category i. This supposes that the maximum is reached for a unique i. When this is not the case (which is not frequent in practice), one can use a second criterion for discriminating between all g-maximal solutions or, alternatively, choose all of them. It has been proved that the error rate of the k-NN rule tends towards the optimal Bayes error rate when both k and n tend to infinity while k/n tends to 0 (see Cover and Hart 1967).

The main drawback of the k-NN procedure is that all elements of Nk(x) are equally weighted. Indeed, in most cases, the neighbours are not equally distant from x, and one may prefer to give less importance to neighbours very distant from x. For this reason, several weighted extensions of the k-NN algorithm have been proposed (see Keller, Gray and Givens 1985, Bezdek, Chuah and Leep 1986, Béreau and Dubuisson 1991). For example, the fuzzy k-NN rule proposed by Keller et al. (1985) is defined by:

(7.11)  µCj(x) = [ Σ_{y ∈ Nk(x)} µCj(y) · ‖x − y‖^(−2/(m−1)) ] / [ Σ_{y ∈ Nk(x)} ‖x − y‖^(−2/(m−1)) ]

where m ∈ (1, +∞) is a technical parameter. In equation (7.11), the membership value µCj(x) is defined as the weighted average of the quantities µCj(y), y ∈ Nk(x), weighted by coefficients inversely proportional to a power of the Euclidean distance between x and y. Note that the membership induction of a new input x is thus also a matter of aggregation. This formula seems natural, but several points are questionable. Firstly, the choice of the weighted sum as an aggregator of the membership values µCj(y), for all y in the neighbourhood Nk(x), is not straightforward. It includes several implicit assumptions that are not necessarily valid (see chapter 6), and alternative compromise aggregators could possibly be used advantageously. The choice of a compromise operator itself can be criticised, and one can readily imagine cases where a disjunctive or a conjunctive operator should be preferred. Moreover, the use of weights linked to distances of type ‖x − y‖ and to the parameter m is not obvious. Indeed, even
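As a concrete illustration, here is a minimal sketch of the crisp rule (7.10) and of the fuzzy rule (7.11). The sample data, the value m = 2 and the tie-breaking are illustrative assumptions; the sketch also assumes that x does not coincide with a sample point, otherwise the distance weights in (7.11) are undefined.

```python
import math

def knn_memberships(x, sample, k):
    # Crisp k-NN rule (7.10): membership 1 for the category holding the
    # majority among the k nearest neighbours, 0 for the others.
    neighbours = sorted(sample, key=lambda vc: math.dist(x, vc[0]))[:k]
    counts = {}
    for _, c in neighbours:
        counts[c] = counts.get(c, 0) + 1
    best = max(counts, key=counts.get)  # ties broken arbitrarily here
    return {c: 1.0 if c == best else 0.0 for c in counts}

def fuzzy_knn_memberships(x, sample, k, m=2.0):
    # Fuzzy k-NN rule (7.11): weighted average of the neighbours'
    # memberships, with weights inversely proportional to a power of
    # the Euclidean distance between x and y.
    neighbours = sorted(sample, key=lambda vc: math.dist(x, vc[0]))[:k]
    result = {}
    for c in {cat for _, cat in neighbours}:
        num = den = 0.0
        for y, cy in neighbours:
            w = 1.0 / math.dist(x, y) ** (2.0 / (m - 1.0))
            num += w * (1.0 if cy == c else 0.0)  # crisp labels assumed
            den += w
        result[c] = num / den
    return result

# Hypothetical sample of biscuit descriptions already categorised by the expert
sample = [((0.0, 0.0), "C1"), ((1.0, 0.0), "C1"),
          ((0.0, 1.0), "C2"), ((5.0, 5.0), "C2")]
crisp = knn_memberships((0.1, 0.1), sample, k=3)
fuzzy = fuzzy_knn_memberships((0.1, 0.1), sample, k=3)
```

With these data, two of the three nearest neighbours belong to C1, so the crisp rule assigns C1; the fuzzy rule returns graded memberships summing to 1.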

when the weighted arithmetic mean seems convenient, the norm of x − y is not necessarily a good measure of the relative dissimilarity between the two biscuits represented by x and y. This is the case, for instance, when units are different and non-commensurate on the various axes. In order to distinguish between significant and non-significant differences on each dimension, one could define a fuzzy similarity relation ∼(x, y), representing the relative closeness of x and y for the expert, as a function of quantities of type xi − yi for any attribute i. For instance, one may include discrimination thresholds (see chapter 6) in the comparison, allowing to distinguish differences that are significant for the expert from those that are negligible. This is particularly suitable in the field of subjective evaluation, in which the preferences and perceptions of the expert (or decision-maker) are not usually linearly related to the observable parameters. This is the proposition made in (Henriet 1995), (Henriet and Perny 1996) and (Perny and Zucker 1999), where the membership µCj(x) is defined by a general aggregation rule of type:

(7.12)  µCj(x) = ψ(µCj(y1), ∼(x, y1), ..., µCj(yk), ∼(x, yk))

where Nk(x) = {y1, ..., yk} and ψ is an aggregation function, for instance:

(7.13)  µCj(x) = 1 − Π_{i=1..k} (1 − ∼(x, yi) · µCj(yi))

Usually, ∼(x, y) is the weighted average of one-dimensional similarity indices ∼i(x, y) (one per attribute i) defined as follows:

(7.14)  ∼i(x, y) = 1 if |xi − yi| ≤ qi;  (pi − |xi − yi|)/(pi − qi) if qi < |xi − yi| < pi;  0 if |xi − yi| ≥ pi

In the above formula, qi and pi are thresholds (possibly varying with the level xi or yi) used to define a continuous transition from full similarity to dissimilarity, as shown in the example given in Figure 7.7. It should be noted, however, that the definition of the similarity indices ∼i(x, y) is very demanding. It requires assessing two thresholds for each attribute level xi. Moreover, the linear transition from similarity to non-similarity is not easy to justify, and a full justification of the shape of the similarity function ∼i would require a lot of information about differences of type xi − yi. Indeed, in most cases, the construction of such similarity functions is only based on empirical evidence and common sense principles.

Coming back to the example, by analysing the measure x of the last biscuit, the k-NN algorithm can be used for periodically computing two coefficients µtoo hot(x) and µnot hot enough(x). These coefficients evaluate the necessity for a regulation action. For instance, µtoo hot(x) = 1 and µnot hot enough(x) = 0 means that decreasing the oven temperature is necessary.
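The threshold-based indices (7.14) and the similarity-based assignment (7.13) can be sketched as follows; the thresholds, weights and sample values are illustrative assumptions.

```python
def sim_1d(xi, yi, qi, pi):
    # One-dimensional similarity index (7.14): full similarity below qi,
    # linear transition between qi and pi, full dissimilarity beyond pi.
    d = abs(xi - yi)
    if d <= qi:
        return 1.0
    if d >= pi:
        return 0.0
    return (pi - d) / (pi - qi)

def sim(x, y, thresholds, weights):
    # Global similarity: weighted average of the one-dimensional indices.
    indices = [sim_1d(xi, yi, q, p)
               for xi, yi, (q, p) in zip(x, y, thresholds)]
    return sum(w * s for w, s in zip(weights, indices)) / sum(weights)

def membership(x, examples, thresholds, weights):
    # Similarity-based membership (7.13):
    # mu_Cj(x) = 1 - prod_i (1 - sim(x, y_i) * mu_Cj(y_i)),
    # where examples is a list of (y_i, mu_Cj(y_i)) pairs.
    prod = 1.0
    for y, mu in examples:
        prod *= 1.0 - sim(x, y, thresholds, weights) * mu
    return 1.0 - prod
```

A single fully similar example with membership 1 is enough to give membership 1 to x, which is the disjunctive behaviour of the product form in (7.13).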

The decision process is improved if we use the fuzzy version of the k-NN algorithm in the diagnosis stage. Indeed, in this case, the values µtoo hot(x) and µnot hot enough(x) possibly take any value within the unit interval, and these values can be interpreted as indicators of the amplitude of the regulation and help the system in choosing a soft regulation action.

Figure 7.7: One-dimensional similarity indices ∼i(x, y) (equal to 1 on [xi − qi, xi + qi], decreasing linearly to 0 at xi − pi and xi + pi)

The main drawback of this automatic decision process is the absence of explicit decision rules explaining the regulation actions. This is not a real drawback in this context, because the quality of the biscuits is a sufficient argument for validation. However, in many other decision problems involving an automatic system, the need for explanations is more crucial, first to validate the system a priori, and secondly to explain the decisions a posteriori to the clients. This is the case, for instance, of the automatic pre-filtering of loan files in a bank. The use of rules in the context of baking control is discussed in the next section.

7.4 A hybrid approach for automatic decision-making

In the case reported in (Perrot 1997) about the control of biscuits during baking, the quality of the biscuit is evaluated by the expert on the basis of 3 attributes, which are the moisture (m), the thickness (t) and the aspect of the biscuit (colour), subjectively evaluated. The qualifiers used for labelling these attributes are:
• moisture: "dry", "normal", "humid"
• thickness: "too thin", "good", "too thick"
• aspect: "burned", "overdone", "done", "underdone", "not done"

Actually, in this application, the diagnosis stage was not uniquely based on the k-NN algorithm. Indeed, it was possible to elicit decision rules for the diagnosis stage: the human expertise is expressed using these labels by rules of type:

If moisture is normal or dry and colour is overdone then the oven is too hot

If moisture is humid or normal and colour is underdone then the oven is not hot enough

It has therefore been decided to construct membership functions linking the parameters (m, t, L, a, b) to the labels used in the rules, in order to be able to implement a hybrid approach based on the k-NN algorithm, to get a fuzzy symbolic description of the biscuit, and on the fuzzy rule-based approach presented in section 7.2, to infer a regulation action. The numeric-symbolic translation is natural for moisture and thickness; the labels used for these two parameters are represented by fuzzy sets (see Figures 7.8 and 7.9).

Figure 7.8: Fuzzy labels "dry", "normal" and "humid" used to describe biscuit moisture m (breakpoints at 3, 3.8, 4.7 and 5.8 cg/g)

Figure 7.9: Fuzzy labels "too thin", "good" and "too thick" used to describe biscuit thickness t (breakpoints at 28, 32, 35 and 38 mm)

The translation is more difficult for the labels used for the biscuit aspect, because the aspect is represented by a fuzzy subset of the 3-dimensional space characterised by the components (L, a, b). This problem has been solved by the fuzzy k-NN algorithm. It is indeed sufficient to ask an expert in baking control to qualify, with a label yi, each element i of a representative sample of biscuits, using only the 5 labels introduced to describe aspect. At the same time, the sensors assess the vector xi = (Li, ai, bi) describing biscuit i in the physical space. The fuzzy k-NN algorithm is then applied with the reference points (xi, yi) of all biscuits i in the sample: for any input x = (L, a, b), it gives the membership values µyj(x) of the five aspect labels.
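A sketch of this numeric-symbolic translation: the breakpoints below are those read from Figures 7.8 and 7.9, but the exact trapezoidal shapes, the outer bounds and the min/max connectives for the rules are illustrative assumptions.

```python
def trapezoid(x, a, b, c, d):
    # Membership 0 up to a, rising to 1 on [b, c], back to 0 after d.
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

# Hypothetical fuzzy labels for moisture (cg/g) and thickness (mm);
# the outer shoulders are extended with arbitrary bounds.
moisture = {
    "dry":    lambda m: trapezoid(m, -1.0, 0.0, 3.0, 3.8),
    "normal": lambda m: trapezoid(m, 3.0, 3.8, 4.7, 5.8),
    "humid":  lambda m: trapezoid(m, 4.7, 5.8, 99.0, 100.0),
}
thickness = {
    "too thin":  lambda t: trapezoid(t, -1.0, 0.0, 28.0, 32.0),
    "good":      lambda t: trapezoid(t, 28.0, 32.0, 35.0, 38.0),
    "too thick": lambda t: trapezoid(t, 35.0, 38.0, 99.0, 100.0),
}

def rule_too_hot(m, colour_memberships):
    # 'If moisture is normal or dry and colour is overdone then the oven
    # is too hot', with max for 'or' and min for 'and' (a common choice).
    moist = max(moisture["normal"](m), moisture["dry"](m))
    return min(moist, colour_memberships["overdone"])
```

The colour memberships would come from the fuzzy k-NN diagnosis of the aspect; adjacent labels overlap, so a biscuit at m = 3.4 cg/g is half "dry" and half "normal".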

The fuzzy nearest neighbour algorithm thus provides a representation of the labels yj, j ∈ {1, ..., 5}, used to describe the biscuit's aspect, by fuzzy subsets of the (L, a, b) space, enabling to establish a formal correspondence between symbolic and numeric information. This makes it possible to resort to the fuzzy control approach presented in section 7.2. In the biscuit example, the integration of the k-NN algorithm into a fuzzy rule-based system provides a soft automatic decision system whose actions can be explained by the expert's rules. This control system can be integrated within a continuous regulation loop, alternating action and retroaction steps, as illustrated in Figure 7.10: the measures (m, t, L, a, b) performed on the biscuits feed a diagnosis module computing µtoo hot(x) and µnot hot enough(x), from which a decision module derives an adjustment ∆t of the baking oven settings.

Figure 7.10: The action-retroaction loop controlling baking

7.5 Conclusion

We have presented simple examples illustrating some basic techniques used to simulate human diagnosis, reasoning and decision-making. The task is difficult because human diagnosis is mainly based on human perception, whereas sensors naturally give numerical measures, and because human reasoning is mainly based on words and propositions drawn from natural language, whereas computers are basically suited to perform numerical computations. In the context of rule-based control systems, some simple and intuitive formal models have been proposed, convenient for an automatisation. They are based on the definition of fuzzy sets linking labels to observable numerical measures through membership functions. However, a proper use of these fuzzy sets requires a very careful analysis. We have shown the importance of constructing suitable mathematical representations of knowledge and decision rules: many "apparently natural" choices in the modelling process possibly hide strong assumptions that can turn out to be false in practice. In particular, small numerical examples given in the chapter show that, in the context of repeated decision problems, the output of the system highly depends on the choice of the numbers used to represent symbolic knowledge, and one must be aware that multiplying arbitrary choices in the construction of membership functions can make the output of the system completely meaningless.

Moreover, at any level of computation, there is a need for weighting propositions and aggregating numerical information. This shows the great importance of mastering the variety of aggregation operations, their properties and the constraints to be satisfied in order to preserve the meaningfulness of the conclusions. It must be clear that, by not thoroughly respecting these constraints, the outputs of any automatic decision system are more the consequences of arbitrary choices in the modelling process than those of a sound deduction justified by the observations and the decision rules. Designing an automatic decision process in which the arbitrary choice of the numbers used to represent knowledge is more decisive than the knowledge itself is certainly the main pitfall of the modelling exercise. Since one cannot reasonably expect to avoid all arbitrary choices in the modelling process, both theoretical and empirical validations of the decision system are necessary. The theoretical validation consists in investigating the mathematical properties of the transfer function that forms the core of the decision module. This is the opportunity to control the continuity and the derivatives of the function, but also to check whether the computation of the outputs is meaningful with respect to the nature of the information given to the system as input. The empirical or practical validation consists in testing the decisional behaviour of the system in various typical states. It takes the form of trial and error sequences enabling a progressive tuning of the fuzzy rule-based model to better approximate the expected decisional behaviour. This can be used to determine suitable membership functions characterising the rules.

This can even be used to learn the rules themselves. When a sufficiently rich basis of examples is available, the rules and the membership values can be learned automatically (see e.g. Bouchon-Meunier and Marsala 1999, or Nauck and Kruse 1999 for neuro-fuzzy methods in fuzzy rule generation). The neuro-fuzzy approach is very interesting for designing an automatic decision system, because it takes advantage of the efficiency of neural networks while preserving the "easy to interpret" feature of a rule-based system. Notice, however, that the learning-oriented approach is only possible when the decision task is completely understood and mastered by a human, due to the need for learning examples to show the system what the right decisions are in a great number of situations. This is usually the case when the automatisation of a decision task is expected, but one should be aware that this approach is not easily transposable to more complex decision situations where preferences as well as decision rules are still to be constructed.


8 DEALING WITH UNCERTAINTY: AN EXAMPLE IN ELECTRICITY PRODUCTION PLANNING

8.1 Introduction

In this chapter, we describe an application that was the theme of a research collaboration between an academic institution and a large company in charge of the production and distribution of electricity. Our purpose is to point out some characteristics of the problem, especially on the modelling of uncertainties, and to show how difficult it is to build (or to improvise) a pragmatic decision model that is consistent and sound. The presentation illustrates the interest and the importance of having well-studied formal models at our disposal when we are confronted with a decision problem. We do not give an exhaustive description of the work that was done and of the decision-aiding tool that was developed: a detailed presentation of the first discussions, of the progressive formulation of the problem, of the difficulties encountered, of the hesitations and backtrackings, of the assumptions chosen, of the methodology adopted and of the resulting software would require nearly a whole book. The description was thus voluntarily simplified, and some aspects, of minor interest in the framework of this book, were neglected.

Sections 8.2 and 8.3 present the context of the application and the model that was established. Section 8.4 is based on a didactical example: it first illustrates and comments some traditional approaches that could have been used in the application; it then gives a detailed description of the approach that was applied in the concrete case. Section 8.5 provides some general comments on the advantages and drawbacks of this approach.

8.2 The context

The company must periodically make some choices for the construction or closure

of coal, gas and nuclear power stations, in order to ensure the production of electricity and satisfy demand. Due to the diversity of the points of view to be taken into account, the managers of the production department wanted to develop a multiple criteria approach for evaluating and comparing potential actions. They considered that aggregating financial, technical and environmental points of view into a type of generalised cost (see Chapter 5) was neither possible nor very serious. A collaboration was established between the company and an academic department (we will call it "the analyst"), which rapidly discovered that, beside the multiple criteria aspect, an enormous set of potential actions, a significant temporal dimension and a very high level of uncertainty on the data needed to be managed. The next section points out these aspects through the description of the model as it was formulated in collaboration with the company's engineers.

8.3 The model

8.3.1 The set of actions

In this chapter, we call decision a choice made at a specific point in time: it consists in choosing the number of production units of the different types of fuel (Nuclear, Coal, Gas) to be planned, and in specifying whether the downgrade plan (previously defined by another department of the company) has to be followed, or partially anticipated (A) or delayed (D). The choice concerning the downgrade plan (follow, anticipate or delay) is of course exclusive. In terms of electricity production and delay, each unit and each modification of the downgrade plan has different specificities (see Table 8.1).

Type   Power (MW)   Delay (years)
 N        900             9
 C        400             6
 G        350             3
 A       −300             0
 D       +300             0

Table 8.1: Power and construction delay for the different types of production unit

For simplicity, the decisions are only taken at chosen milestones, separated by a time period of about 3 years (this period between two decisions is called a block). At most one unit of each type per year may be ordered. A decision for a block of 3 years could thus be, for example, {1N, 1C, 2G, A}, meaning that one nuclear, one coal and two gas production units are planned and that the downgrade plan has to be anticipated.

Each decision is irrevocable and naturally has consequences for the future, not only on the production of electricity, but also in terms of investment, safety and environmental effects. An action is a succession of decisions over the whole time period concerned by the simulation (the horizon), i.e. a period of about 20-25 years, or 7 blocks: a sequence of seven block decisions such as {1N, 1C, 2G, A}, {1C}, {}, {2G}, {3G}, {1G}, {}. The number of possible actions is of course enormous. Many of these actions are completely unrealistic, as for example no new unit for 20 years, or 3G and 3C in every block. Even after adding some simple rules — only one (or zero) nuclear unit is allowed, exclusively on the first block; anticipation and delay are only allowed on the first and second blocks; an anticipation followed by a delay (or the inverse) is forbidden — the number of actions is still of around 10^8. The unrealistic actions can be eliminated by fixing reasonable limits on the power production of the park: in the application described here, the decision-maker only kept the actions such that, for each block, the surplus is less than 1 000 MW and the deficit is less than 200 MW. These limitations led to a set of approximately 100 000 potential actions. The temporal dimension of the problem naturally leads to a tree structure for these actions, built on decision nodes (represented by squares in Figure 8.1); depending on the block considered, there are typically between 3 and 30 branches leaving each decision node.

8.3.2 The set of criteria

The list of criteria was defined by the industrial partner, in order to avoid unbearable difficulties in data collection and to work on a sufficiently realistic situation. Remember that the purpose of the study was to build a decision-aiding methodology, not to make a decision: it was important to test the methodology with a realistic set of criteria, but it was also clear that the methodology should be independent of the criteria chosen. In this problem, the following eight criteria were taken into account, for the time period of the simulation:
• fuel cost, in Belgian Francs (BEF), to minimise;
• exploitation cost, in BEF, to minimise;
• investment cost, in BEF, to minimise;
• marginal cost, i.e. the variation of the total cost for a variation of 1 GWh, to minimise;
• deficient power, in TWh, to minimise;

• CO2 emissions, in tons, to minimise;
• SO2 and NOx emissions, in tons, to minimise;
• purchase and sales balance, in BEF, to maximise.

The evaluations of the actions on these criteria are of course not known with certainty, because they depend on many factors that are not, or not well, known by the decision-maker. Table 8.2 presents an example of evaluations for two particular actions, A and B, in a scenario where the fuel price is low and the demand for electricity is relatively weak. Other scenarios must be envisaged in order to improve the realism and usefulness of the model.

                                  A          B
Fuel cost            (MBEF)    33 500     31 000
Exploitation cost    (MBEF)    45 000     49 000
Investment cost      (MBEF)   360 000    770 000
Marginal cost    (KBEF/GWh)       730        620
Deficient power       (TWh)      16.7       10.3
CO2 emissions       (Ktons)    22 000     16 000
SO2 + NOx emissions (Ktons)        70         48
Sales balance        (MBEF)    23 000     30 000

Table 8.2: The evaluations of two particular actions

8.3.3 Uncertainties and scenarios

The uncertainties have an impact on the evaluations, which can be direct (the prices of the raw materials influence the total costs) or indirect (if the gas price increases more than the coal price, the coal power stations will be more intensively exploited than the gas ones; this will have an impact on the fuel costs and on the environmental impacts of the production park). Generally speaking, the determination of the value of a parameter at a given moment can lead to the following situations:
• the value is not known: the value is relative to the past and was not measured; the value is relative to the present but is technically impossible or very expensive to obtain; the value is relative to the future for a parameter with a completely erratic evolution;

• the value can be approximated by an interval: the bounds result from the properties of the system considered, or the interval is due to the imprecision of the measure or to the use of a forecasting method; a probability, a possibility or a confidence index can be associated with each value of the interval;
• the value is not unique: several measures did not yield the same value; again, a probability, a possibility, a confidence index or the result of a voting process can be associated with each value;
• the value is unique but not reliable: it comes with a certain information on the degree of reliability.

For the uncertainties, scenarios were defined and subjective probabilities were assigned to them by the company's experts. Indeed, they were used to working with probabilities, and the framework of the study did not allow to suggest anything else. The industrial partner was already using stochastic programming for the management of the production park; he wanted to have another methodology, in order to take better account of the number of potential actions and of the multiple criteria aspects.

In the particular situation described here, two types of uncertainties were distinguished, respectively called "aleas" and "major uncertainties"; the difference between them is based on the more or less strong dependence between the past and the future. The industrial partner considered that nuclear availability in the future was completely independent of the knowledge of the past, and called this type of uncertainty "alea": the level of nuclear availability was completely open for each period of three years (a breakdown at a given time does not imply that there will be no breakdown in the near future). The selling price of electricity was also considered as an "alea", in order to be able to capture the deregulation phenomena due to a forthcoming new legislation. The "major uncertainties" (for which some dependence can exist between the values at different moments) were the fuel price (the market presents global tendencies, and a high price for the first two blocks reinforces the probability of having a high price for the third one), the demand for electricity (same reasoning) and the legislation concerning pollution. In this example, the law may change for the third block, and the legislation after this block is thus strongly related to the past: either the same as for the first blocks, or more severe, but in both cases constant over all blocks after block 2.

The "major uncertainties" allow for a learning process that must be taken into account in the analysis: each decision, at a given time, may use the previous values of the uncertain parameters and deduce information from them about the future. This information may modify the choices of the decision-maker. Suppose for instance that a variable x may be equal to 0 or 1 in the future, and that the decision-maker has to choose between two decisions a and b, preferring a when x = 0 and b when x = 1. A reasonable decision will be to choose a after a past scenario A and b after a past scenario B (where the "past scenario" is known at the time of decision) if the corresponding probabilities are assessed as follows:

P(x = 0) > 0.5 after past scenario A, and P(x = 0) < 0.5 after past scenario B.

Because of the statistical dependence and of the possible learning process in the major uncertainty case, a complete treatment and a tree-structure for these scenarios (a scenario is a succession of observed uncertainties) are necessary. Of course, the complete scenario for a decision node at time t is not known, but a probability is associated to each of them, allowing to compute the conditional probability of each complete scenario knowing the already observed partial scenario at time t. If there are 3 levels for the fuel price, 3 levels for the demand and 2 levels for the legislation, and if the horizon is divided into 7 blocks, there are, a priori, (3 × 3 × 2)^7 ≈ 6 × 10^8 possible scenarios. Fortunately, most of these scenarios are negligible, because the probability of a very fluctuating scenario is very small: the "major uncertainty" scenarios are rather strongly correlated, and a sequence of levels for the fuel price such as HHLMHLH (H for high, M for medium and L for low) is much less probable than a sequence HHHMMMM. In practice, it was imposed that scenarios could only change after two blocks, two sequences were retained for legislation (MMMMMMM and MMHHHHH), and each modification was penalised, so that very fluctuating scenarios were hardly possible. The analyst finally retained around 200 representative scenarios, gathered in a tree-structure of major uncertainty nodes (represented by circles in Figure 8.1).

The previous explanation is not valid for the "aleas", because their independence does not allow for direct inference from the past: the "aleas" are by essence uncorrelated, and there is no reason to neglect any scenario. Fortunately, the aleas act much more simply than the major uncertainties, and their tree structure is obvious: each node gives rise to the same possibilities, with the same probability distribution. If there are 3 levels for the selling price and 2 levels for the availability of nuclear units, then the number of scenarios is (3 × 2)^7 = 279 936, and it is possible to take the whole set of scenarios into account.

8.3.4 The temporal dimension

Independently of the dependence between the past and the future in the modelling of the uncertainties, the temporal dimension plays an important role in this kind of problem, for several reasons. First, the time period between the decision to build a certain type of power station and the beginning of the exploitation of that station is far from being negligible. Second, some consequences of the decisions appear after a very long time (as the environmental consequences, for example). Third, the consequences

Figure 8.1: The decision tree (an alternation of decision nodes — squares — and uncertainty nodes — circles — from the first decisions, through the successive periods, to the last decisions and the final consequences)
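The scenario counts given in the text above can be checked directly; the numbers of levels per block and the 7-block horizon are those stated for the application.

```python
blocks = 7

# Major uncertainties: 3 fuel-price levels x 3 demand levels x 2 legislations
major = (3 * 3 * 2) ** blocks   # 612220032, about 6 x 10^8

# Aleas: 3 selling-price levels x 2 nuclear-availability levels
aleas = (3 * 2) ** blocks       # 279936
```

The contrast between the two orders of magnitude explains why the major-uncertainty scenarios had to be pruned to about 200 representatives, while the alea scenarios could all be kept.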

themselves can be dispersed over rather long periods and vary within these periods. Fourth, the consequences of a decision can be different according to the moment that decision is taken. It is rather usual, in planning models, to introduce a discounting rate that decreases the weight of the evaluations for distant consequences (see Chapter 5), and the industrial partner did this here. However, for a long term decision problem with important consequences for future generations, such an approach may not be the best one, and the decision-maker could be more confident in the flexible approach and in the richness of the scenarios. That is why the analyst kept the possibility to introduce discounting or not.

8.3.5 Summary of the model

The complete model can be described by a tree structure including decision nodes (squares) and uncertainty nodes (circles), as illustrated in Figure 8.1. The decision nodes correspond to active parts of the analysis, where the decision-maker has to establish his strategy, while the uncertainty nodes correspond to passive parts of the analysis, where the decision-maker undergoes the modifications of the parameters. At time t = 0, a first decision is made (a branch is chosen) without any information on the scenario, leading to a circle node. During the first period, one may observe the actual values of the uncertain parameters (nuclear availability, fuel price, electricity selling price, electricity demand and environmental legislation), determining one branch leaving the considered circle node and leading to one of the decision nodes at time t = 1. A new decision is then made, taking the previous information into account, and so on until the last decision (square) node and the last scenario (circle) node, which determine the whole action and the whole observed scenario.

8.4 A didactic example

Consider Figure 8.2, describing two successive time periods. At t = 0 (square node at the beginning of the first period), two decisions A and B are eligible. At the beginning of the second period, two decisions C and D are eligible if the first decision was A, and three decisions E, F and G are eligible if the first decision was B. During the first period, two events S and T are possible, each with probability 1/2. During the second period, two events U and V are possible after S (with respective probabilities 1/4 and 3/4), and two events Y and Z are possible after T (with respective probabilities 3/4 and 1/4). Figure 8.2 presents the resulting tree and the evaluation of each action (set of decisions) for each complete scenario. Remark that this didactic example contains only one

evaluation for each action (problem with one criterion). We do not insist on the multiple criteria aspect of the problem here (this was treated in Chapter 6) and focus on the treatment of uncertainty.

8.4.1 The expected value approach

In the traditional approach, the nodes of the tree are considered from the leaves to the root ("folding back"), and the decisions are taken at each node in order to maximise their expected values, i.e. the mean of the corresponding probability distributions for the evaluations. Of course, this is only possible when the evaluations are elements of a numerical scale. At node N2 (beginning of the second period), the expected value of decision C is 1/4 × 7 + 3/4 × 4.5 = 41/8, while the expected value of decision D is 1/4 × 4.5 + 3/4 × 5.5 = 42/8. So, the best decision at node N2 is D, and the expected value associated to N2 is 42/8. Making similar calculations for N3, N4 and N5, and then applying the expected value approach at node N1, the expected values of decisions A and B are respectively 39/8 and 5, so the best decision is B. In conclusion, the "optimal action" obtained by the traditional approach will consist in applying decision B at the beginning of the first period and decision E or G at the beginning of the second period, depending on whether the event that occurred in the first period was S or T.

8.4.2 Some comments on the previous approach

Just as the weighted sum (already discussed in the other chapters of this book), the expected value presents some characteristics that the user must be aware of. In this approach, probabilities intervene as tradeoffs between the values for different events: the difference of one unit in favour of C over D for event V, whose probability is 3/4, would be completely compensated by a difference of three units in favour of D over C for event U, because its probability is 1/4. A consequence is that a big difference in favour of a specific decision in some scenario could be sufficient to overcome a systematic advantage for another decision in all the other scenarios: in an example where the probabilities of the three scenarios are all equal to 1/3, although B is better than A in two scenarios out of three, the expected value can give preference to A.

Moreover, the expected value approach does not always represent the attitude of the decision-maker towards risk very well. Remember the famous St. Petersburg game (see for example Sinn 1983). The game consists of tossing a coin repeatedly until the first time it lands on "heads"; if this happens on the k-th toss, the player wins 2^k €. The question is to find out how much a player would be ready to bet in order to play such a game. Of course, the answer depends on the player but, in any case, the amount would not be very big. However, computing the expected gain, we see that it is Σ_{k=1}^{∞} (1/2^k) · 2^k = +∞.
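The folding-back computation of the expected value approach can be sketched as follows; the tree encoding is ours, and only the N2 subtree of Figure 8.2, whose values appear in the text, is reproduced.

```python
from fractions import Fraction

def fold_back(node):
    # 'decision': pick the branch with the best expected value;
    # 'chance': take the expectation over the branches;
    # a bare number is a leaf evaluation.
    if isinstance(node, tuple):
        kind, branches = node
        if kind == "decision":
            return max(fold_back(sub) for sub in branches.values())
        return sum(p * fold_back(sub) for p, sub in branches)
    return Fraction(node)

F = Fraction
# Node N2: decisions C and D, then events U (prob. 1/4) and V (prob. 3/4)
n2 = ("decision", {
    "C": ("chance", [(F(1, 4), 7), (F(3, 4), F(9, 2))]),        # 41/8
    "D": ("chance", [(F(1, 4), F(9, 2)), (F(3, 4), F(11, 2))]),  # 42/8
})
```

Exact rationals reproduce the values of the text: C is worth 41/8, D is worth 42/8, and folding back the decision node returns 42/8.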

5 1 5 3.2: A didactic example .188 CHAPTER 8.5 4.5 5. DEALING WITH UNCERTAINTY Value U (1/4) C V (3/4) N2 S (1/2) D U (1/4) N6 N7 N8 N9 N10 N11 N12 N13 N14 N15 N16 N17 N18 N19 N20 N21 N22 N23 N24 N25 7 4.5 3 1 1 1 6 1 2 2 5 5 V (3/4) Y (3/4) C Z (1/4) N3 A Z (1/4) U (1/4) D Y (3/4) T (1/2) N1 N4 E F G B S (1/2) V (3/4) U (1/4) V (3/4) U (1/4) V (3/4) Y (3/4) T (1/2) N5 E F G Z (1/4) Y (3/4) Z (1/4) Y (3/4) Z (1/4) Figure 8.5 4.5 5.5 4.

The expected utility model, which is the subject of the next section, allows this paradox to be resolved and, more generally, different possible attitudes towards risk to be taken into account.

8.4.3 The expected utility approach

As the preferences of the decision-maker are not necessarily linearly linked to the evaluations of the actions, it may be useful to replace these evaluations by the "psychological values" they have for the decision-maker, through so-called utility functions (Fishburn 1970). Denoting by u(xi) the utility of the evaluation xi, the expected utility of a decision leading to the evaluation xi with probability pi (i = 1, ..., n) is given by Σi pi u(xi). This model dates back at least to Bernoulli (1954), but the basic axioms, in terms of preferences, were only studied in the present century (see for instance von Neumann and Morgenstern 1944).

In the case of the St. Petersburg game, if we denote by u(x) the utility of "winning x e", the expected utility of refusing the game is u(0), while the expected utility of betting an amount of s e in the game is Σ_{k=1..∞} (1/2^k) u(2^k − s). As an exercise, the reader can verify that, for a utility function defined by u(x) = x/2^20 if x ≤ 2^20 and u(x) = 1 if x > 2^20, the expected utility of betting s e in the game is positive (hence superior to the expected utility of refusing the game) as long as s is less than 21/(1 − 1/2^20) e, i.e. slightly more than 21 e. The expected utility can also be finite with an unbounded utility function such as, for example, the logarithmic function.

In the example in Figure 8.2, with a utility function defined by u(1) = u(2) = 1, u(3) = u(3.5) = 2, u(4.5) = u(5) = u(5.5) = 3 and u(6) = u(7) = 4, we obtain the tree given in Figure 8.5. The optimal action is then to apply decision A at the beginning of the first period and decision C at the beginning of the second period, contrary to what was obtained with the expected value approach.
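The St. Petersburg computations above can be checked numerically (a sketch under the bounded utility function quoted in the text; the truncation level kmax is our own choice, justified by the geometrically vanishing tail):

```python
# St. Petersburg game: win 2**k if the first "heads" occurs on toss k.
# The truncated expected value grows without bound, while the expected
# utility under the bounded utility u(x) = x / 2**20 (capped at 1) stays
# finite and changes sign near a bet of 21 units.

CAP = 2**20

def u(x):
    """Bounded utility: linear up to 2**20, then constant."""
    return 1.0 if x > CAP else x / CAP

def expected_gain(kmax):
    """Truncated expected value: each term (1/2**k) * 2**k equals 1."""
    return sum(2**-k * 2**k for k in range(1, kmax + 1))  # = kmax

def expected_utility_of_bet(s, kmax=60):
    """Expected utility of paying s to play (truncated at kmax tosses)."""
    return sum(2**-k * u(2**k - s) for k in range(1, kmax + 1))

print(expected_gain(100))              # 100.0: grows linearly, diverges
print(expected_utility_of_bet(20) > 0)  # True: a bet of 20 is worthwhile
print(expected_utility_of_bet(22) > 0)  # False: a bet of 22 is not
```

The contrast between the diverging expected gain and the modest break-even bet is exactly the point of the paradox.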

[Figure 8.3: Application of the expected value approach — best decisions and values after folding back: N2: D (5.25), N3: C (4.5), N4: E (5), N5: G (5); at N1 the best decision is B.]

[Figure 8.4: Illustration of the compensation effect — evaluations under events S, T, U: A: 10, 15, 20; B: 15, 20, 9.]

[Figure 8.5: Application of the expected utility approach — best decisions: C at N2 (value 13/4), C at N3, E at N4 (value 11/4), E at N5.]

8.4.4 Some comments on the expected utility approach

Much literature is devoted to this approach, the probabilities being objective or subjective: see for example Savage (1954), Luce and Raiffa (1957), Ellsberg (1961), Fishburn (1970), Allais and Hagen (1979), Fishburn (1982), McCord and de Neufville (1983), Bell et al. (1988), Loomes (1988), Barbera, Hammond and Seidl (1998). We simply recall one or two characteristics here that every user should be aware of. As in every model, the expected utility approach implicitly assumes that the preferences of the decision-maker satisfy some properties that can be violated in practice. The following example illustrates the well-known Allais paradox (see Allais 1953).

It is not unusual to prefer a guaranteed gain of 500 000 e to an alternative providing 2 500 000 e with probability 0.1, 500 000 e with probability 0.89 and 0 e with probability 0.01. Applying the expected utility model leads to the following inequality:

u(500 000) > 0.1 u(2 500 000) + 0.89 u(500 000) + 0.01 u(0),

hence, grouping terms,

0.11 u(500 000) > 0.1 u(2 500 000) + 0.01 u(0).

At the same time, it is reasonable to prefer an alternative providing 2 500 000 e with probability 0.1 and 0 e with probability 0.9 to an alternative providing 500 000 e with probability 0.11 and 0 e with probability 0.89. In this case, the expected utility model yields

0.1 u(2 500 000) + 0.9 u(0) > 0.11 u(500 000) + 0.89 u(0),

hence, grouping terms,

0.1 u(2 500 000) + 0.01 u(0) > 0.11 u(500 000),

which is in contradiction with the inequality obtained above.

So, the expected utility model cannot explain the two previous preference situations simultaneously. A possible attitude in this case is to consider that the decision-maker should revise his judgment in order to be more "rational", that is, in order to satisfy the axioms of the model. Another interpretation is that the expected utility approach sometimes implies unreasonable constraints on the preferences of the decision-maker (in the previous example, the violated property is the so-called independence axiom of von Neumann and Morgenstern). This last interpretation led scientists to propose many variants of the expected utility model, as in Kahneman and Tversky (1979), Machina (1982, 1987), Bell et al. (1988), Barbera et al. (1998).

Before explaining why the expected utility model (or one of its variants) was not applied by the analyst in the electricity production planning problem, let us mention why using probabilities may cause some trouble in modelling uncertainties or risk. The following example illustrates the so-called Ellsberg paradox and is extracted from Fishburn (1970, p. 172). An urn contains one white ball (W) and two other balls, each of which is red or green; you only know that the two other balls are either both red (R), or both green (G), or one is red and one is green. Consider the two situations in Table 8.3, where W, R and G represent the three states according to whether the ball drawn at random is white, red or green; the figures are what you will be paid (in Euros) after you make your choice and a ball is drawn.

Table 8.3:
        W     R     G               W     R     G
  A    100    0     0         C    100    0    100
  B     0    100    0         D     0    100   100

Intuition leads many people to prefer A to B and D to C, while the expected utility approach leads to indifference between A and B as well as between C and D. This type of situation shows that the use of the probability concept may be debatable for representing attitude towards risk or uncertainty; other tools (possibility theory, belief functions or fuzzy integrals) can also be envisaged.
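The impossibility behind the Allais paradox is easy to verify numerically: normalising u(0) = 0 and u(2 500 000) = 1, the two stated preferences require 0.11 u(500 000) > 0.1 and 0.1 > 0.11 u(500 000) at the same time. The small scan below (our own sketch, not from the book) confirms that no utility value satisfies both:

```python
# Allais paradox: no value a = u(500 000), with u(0) = 0 normalised to 0
# and u(2 500 000) normalised to 1, reproduces both observed preferences.

def prefers_sure_thing(a):
    # 500 000 for sure vs (2.5M, 0.1; 500k, 0.89; 0, 0.01)
    return a > 0.1 * 1 + 0.89 * a + 0.01 * 0

def prefers_big_lottery(a):
    # (2.5M, 0.1; 0, 0.9) vs (500k, 0.11; 0, 0.89)
    return 0.1 * 1 + 0.9 * 0 > 0.11 * a + 0.89 * 0

# Scan a fine grid of candidate utilities strictly between 0 and 1:
both = [a / 10000 for a in range(1, 10000)
        if prefers_sure_thing(a / 10000) and prefers_big_lottery(a / 10000)]
print(both)  # []: the two preferences are incompatible under expected utility
```

Algebraically, the first preference forces a > 10/11 and the second forces a < 10/11, which is the contradiction derived in the text.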

8.4.5 The approach applied in this case: first step

We will now present the approach that was applied in the electricity production planning problem. This approach is certainly not ideal (some drawbacks will be pointed out in the presentation), but it has several attractive features: it does not aggregate the multiple criteria consequences of the decisions into a single dimension, thus avoiding some of the pitfalls mentioned in Chapter 6 on multi-attribute value functions; it does not introduce a discounting rate for the dynamic aspect (see Chapter 5); it allows the particular preferences of the decision-maker along each evaluation scale to be modelled; and it enables the introduction of indifference thresholds.

In the electricity production planning problem described in Section 8.3, the analyst did not know whether the probabilities given by the company were really probabilities (and not mere "plausibility coefficients"), and it was not sure that the consequences of one scenario were really comparable to the consequences of another. Moreover, it was definitely excluded to transform all the consequences into money and to aggregate them with a discounting rate (as in Chapter 5). Finally, the company was not prepared to devote much time to the clarification of the probabilities and to long discussions about the multiple criteria and dynamic aspects of the problem, so that it was impossible to envisage an enriched variant of the expected utility model.

The analyst therefore decided to propose a paired comparison of the actions, scenario by scenario. The comparison between two decisions was made on the basis of the differences in preference between them for each of the considered events, similarly to what is done in the Promethee method (Brans and Vincke 1985). Let us consider a preference function defined by

f(x) = 1 if x > 1, and f(x) = 0 otherwise,

where x is the difference in the evaluations of two decisions; this function expresses the fact that a difference which is smaller than or equal to 1 is considered to be non-significant. Other functions can be defined, similarly to what is done in the Promethee method.

Let us illustrate this on the didactic example presented in Figure 8.2. At node N2, we have to consider Table 8.4:

Table 8.4:
  Events:   U     V
  Probab.:  1/4   3/4
  C:        7     4.5
  D:        4.5   5.5

On the basis of the data contained in Table 8.4, and despite his doubt about the real nature of the "probabilities", the analyst used them to calculate a sort of expected index of preference of each decision over each other decision. For example, the preference index of C over D is 1/4 × f(7 − 4.5) + 3/4 × f(4.5 − 5.5) = 1/4. Remark that, in the multiple criteria case, a (weighted) sum is computed over all the criteria in order to obtain the global score of a decision; this is certainly a weak point of the method and other tools, which will be described in a volume in preparation, could have been used here.

The preference of D over C is 1/4 × f(4.5 − 7) + 3/4 × f(5.5 − 4.5) = 0, since a difference of one unit is non-significant. These preference indices are summarised in Table 8.6. The score of each decision is then the sum of the preference indices of this decision over the others minus the sum of the preference indices of the others over it, and the maximum score determines the chosen decision. In the case of Table 8.6, this trivially gives 1/4 and −1/4 as respective scores for C and D, so that the chosen decision at node N2 is C.

At node N3, we have to consider Table 8.5:

Table 8.5:
  Events:   Y     Z
  Probab.:  3/4   1/4
  C:        4.5   4.5
  D:        1     5

The preference index of C over D is 3/4 × f(4.5 − 1) + 1/4 × f(4.5 − 5) = 3/4, while the preference index of D over C is 3/4 × f(1 − 4.5) + 1/4 × f(5 − 4.5) = 0 (Table 8.7 summarises these indices). The scores of C and D are respectively 3/4 and −3/4, so that the chosen decision at node N3 is also C.

At node N4, decision E dominates F and G and is thus chosen (where "dominates" means "is better in each scenario"). At node N5, we must consider Table 8.8:

Table 8.8:
  Events:   Y     Z
  Probab.:  3/4   1/4
  E:        6     1
  F:        2     2
  G:        5     5

The preference index of G over E (for example) is 3/4 × f(5 − 6) + 1/4 × f(5 − 1) = 1/4.
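The first-step bookkeeping at a node can be sketched as follows (our own illustration of the index described in the text, applied to the data of node N5):

```python
# First step: pairwise preference indices and net scores at one node.

def f(x):
    """Preference function: a difference of at most 1 is non-significant."""
    return 1.0 if x > 1 else 0.0

def pref_index(a, b, probs):
    """Probability-weighted preference index of decision a over decision b."""
    return sum(p * f(x - y) for p, x, y in zip(probs, a, b))

def net_scores(decisions, probs):
    """Score = indices over the others minus indices of the others over it."""
    return {n: sum(pref_index(decisions[n], decisions[m], probs) -
                   pref_index(decisions[m], decisions[n], probs)
                   for m in decisions if m != n)
            for n in decisions}

# Node N5 (Table 8.8): events Y (3/4) and Z (1/4).
probs = [3/4, 1/4]
decisions = {"E": [6, 1], "F": [2, 2], "G": [5, 5]}
print(net_scores(decisions, probs))
# {'E': 0.5, 'F': -1.75, 'G': 1.25}: G has the maximum score and is chosen.
```

The same two functions reproduce the indices and scores quoted for nodes N2 and N3.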

The other preference indices at node N5 are presented in Table 8.9. On the basis of this table, the scores are 1/2, −7/4 and 5/4 respectively for E, F and G, so that G is the chosen decision at node N5.

We can now consider Table 8.10, associated to N1. The values in this table are those that correspond to the decisions chosen at the nodes N2 to N5 (indicated in parentheses):

Table 8.10:
  Scenarios:  S-U     S-V     T-Y     T-Z
  Probab.:    1/8     3/8     3/8     1/8
  A:          7(C)    4.5(C)  4.5(C)  4.5(C)
  B:          3.5(E)  5.5(E)  5(G)    5(G)

On the basis of this table, the preference of A over B is 1/8 f(3.5) + 3/8 f(−1) + 3/8 f(−0.5) + 1/8 f(−0.5) = 1/8, while the preference of B over A is 1/8 f(−3.5) + 3/8 f(1) + 3/8 f(0.5) + 1/8 f(0.5) = 0. In conclusion, the "optimal action" obtained through this first step consists in choosing A at the beginning of the first period and C at the beginning of the second period. This approach allows the comparisons of the decisions to be taken into account separately for each scenario.

8.4.6 Comments on the first step

As this approach is based on successive pairwise comparisons, it also presents some pitfalls which must be mentioned. The example of Figure 8.4 allows a first drawback to be illustrated, once 9 has been replaced by 10 in the evaluation of B for event U (so that A yields 10, 15 and 20, and B yields 15, 20 and 10, for events S, T and U respectively); B is then better than A for events S and T. With probabilities of S, T and U equal to 1/3, the expected utility approach gives the same value 1/3 (u(10) + u(15) + u(20)) to A and B, which are thus considered as indifferent. The approach described in this section gives a preference index of A over B equal to 1/3 × f(10 − 15) + 1/3 × f(15 − 20) + 1/3 × f(20 − 10) and a preference index of B over A equal to 1/3 × f(15 − 10) + 1/3 × f(20 − 15) + 1/3 × f(10 − 20). Making the (natural) assumption that f(x) = 0 when x is negative, we see that this approach leads to indifference between A and B only with a function f such that f(20 − 10) = f(15 − 10) + f(20 − 15).

The example presented in Figure 8.6 illustrates a second drawback. In this example, three periods of time are considered. Two decisions, A and B, are possible at the beginning of the first period, but there are no uncertainties during the first two periods. At the beginning of the second period, two decisions, C and D, are possible after A and only one decision is possible after B. At the beginning of the third period, two decisions, E and F, are possible after C, while only one decision is possible in each of the other cases. During the last period, three events S, T and U can occur, each with a probability of 1/3.

Let us apply the approach described in Section 8.4.5 with the same function f as before. At node N4, the preference index of E over F is 1/3 × f(10 − 15) + 1/3 × f(15 − 20) + 1/3 × f(20 − 0) = 1/3, while the preference index of F over E is 1/3 × f(15 − 10) + 1/3 × f(20 − 15) + 1/3 × f(0 − 20) = 2/3, so that F will be the decision chosen at node N4. At node N2, we must consider Table 8.11, where the values of C are those of F (the decision chosen at node N4). We compute the preference index of C over D as 1/3 × f(15 − 20) + 1/3 × f(20 − 0) + 1/3 × f(0 − 5) = 1/3, and the preference index of D over C as 1/3 × f(20 − 15) + 1/3 × f(0 − 20) + 1/3 × f(5 − 0) = 2/3, so that D will be the decision chosen at node N2.
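This pitfall can be replayed with the same machinery (our own sketch, using the function f and the leaf evaluations of Figure 8.6):

```python
# Local pairwise comparisons in the tree of Figure 8.6 end up selecting B,
# even though B is dominated by the strategy (A, C, E).

def f(x):
    """Same preference function: differences of at most 1 are ignored."""
    return 1.0 if x > 1 else 0.0

def pref_index(a, b, probs):
    return sum(p * f(x - y) for p, x, y in zip(probs, a, b))

probs = [1/3, 1/3, 1/3]            # events S, T, U
E, F = [10, 15, 20], [15, 20, 0]
D, B = [20, 0, 5], [0, 5, 10]

# N4: F beats E (2/3 vs 1/3), so C inherits F's values.
C = F if pref_index(F, E, probs) > pref_index(E, F, probs) else E
# N2: D beats C (2/3 vs 1/3), so A inherits D's values.
A = D if pref_index(D, C, probs) > pref_index(C, D, probs) else C
# N1: B beats A (2/3 vs 1/3), so B is chosen...
chosen = "B" if pref_index(B, A, probs) > pref_index(A, B, probs) else "A"
print(chosen)  # prints: B
# ...although following A, then C, then E dominates B in every scenario:
print(all(e > b for e, b in zip(E, B)))  # prints: True
```

The run makes the "too local" nature of the first step concrete: each comparison is sound in isolation, yet the composed choice is dominated.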

[Figure 8.6: A pitfall of the first step — a three-period tree: decision A leads to node N2 with decisions C and D; C leads to node N4 with decisions E and F; decision B leads to node N3; the events S, T and U then determine the leaf evaluations: E: 10, 15, 20; F: 15, 20, 0; D: 20, 0, 5; B: 0, 5, 10.]

Table 8.11:
  Events:   S     T     U
  Probab.:  1/3   1/3   1/3
  C:        15    20    0
  D:        20    0     5

At node N1, we must consider Table 8.12, where the values of A are those of D (the decision chosen at node N2):

Table 8.12:
  Events:   S     T     U
  Probab.:  1/3   1/3   1/3
  A:        20    0     5
  B:        0     5     10

The preference index of A over B is given by 1/3 × f(20 − 0) + 1/3 × f(0 − 5) + 1/3 × f(5 − 10) = 1/3, while the preference index of B over A is 1/3 × f(0 − 20) + 1/3 × f(5 − 0) + 1/3 × f(10 − 5) = 2/3, so that B will be chosen at node N1. In conclusion, the methodology leads to the choice of action B despite the fact that it is dominated by the action (A,C,E), as is shown in Table 8.13:

Table 8.13:
  Events:   S     T     U
  Probab.:  1/3   1/3   1/3
  B:        0     5     10
  (A,C,E):  10    15    20

This is due to the fact that the comparisons are "too local" in the tree. Moreover, in the concrete application described in this chapter, another drawback was the fact that, for decisions at nodes relative to the last periods, the evaluations were not very different, due to the large common part of the actions and scenarios preceding these decisions. The conclusion was many indifferences between the decisions at each decision node. To improve the methodology, the analyst proposed to introduce a second step, which is the subject of the next section.

8.4.7 The approach applied in this case: second step

In order to introduce more information into the comparisons of local decisions and to take the tree as a whole into account, a second step was added by the analyst.


1/3 1/3 1/3 E 10 15 20 F 15 20 0 D 20 0 5 B 0 5 10 Table 8.19 . 8. This is illustrated by the example in Figure 8. DEALING WITH UNCERTAINTY C 0 0 0 D 3/4 0 3/4 G 0 0 0 C D G Table 8. so that the best decision at N4 is now E. through Table 8. The scores of E and F respectively become 1 and 1/3.19 presents the preference indices. we have to compare C (followed by E) with D and B (best action in the other branch): the scores of C and D are respectively 4/3 and -2/3.18 Table 8. we have to compare A (followed by C and E) with B and we choose A (that dominates B). although this property is not guaranteed in all cases.6 where the second step works as follows. So we see that this second step somehow avoids to choose dominated actions. At N1.18 comparison with the strongest results obtained during the ﬁrst step in the other branches of the tree (always in the same scenarios).17 Prob.200 CHAPTER 8.5 Conclusions This approach (ﬁrst and second steps) was successfully implemented and applied by the company (after many diﬃculties due to the combinatorial aspects of the problem) and some visual tools were developed in order to facilitate the decisionE 0 2/3 1/3 0 F 1/3 0 2/3 1/3 D 2/3 1/3 0 2/3 B 1 2/3 1/3 0 E F D B Table 8. we compare E and F with D and B (the best actions in the other branches as they are unique). At N2. At node N4. so that the best decision in N2 is now C.

Gilboa and Schmeidler (1993). a lot of other approaches were studied by many authors. . Jaﬀray (1989). . mixture separability. Moreover. They pointed out more or less desirable properties: linearity. . Munier (1989). . more generally. It presents the following advantages: • it compares the consequences of a decision in a scenario with the consequences of another decision in the same scenario. • it allows to introduce indiﬀerence thresholds or. this approach also presents some mysterious aspects that should be more thoroughly investigated: • it computes a sort of expected index for preference of each action over each other action. The literature on the management of uncertainty is probably one of the most abundant in decision analysis. . CONCLUSIONS maker’s understanding of the problem. Beside the expected utility model (traditional approach). . replacement separability. stochastic dominance. However. such as Dekel (1986). but it does not guarantee that the chosen action is non-dominated. diﬀerent kinds of independence.5. A dy- . as mentioned by Machina (1989). to model the preferences of the decision-maker for each evaluation scale. although the role of the so-called probabilities is not that clear in the modelling of uncertainty. Quiggin (1993). • it is a rather bizarre mixture of local (ﬁrst step) and global (second step) comparisons of the actions. 201 Let us now summarise the characteristics of this approach.8. it is important to make the distinction between what he calls static and dynamic choice situations.

can present some pitfalls that have to be known by the analyst.2 B 1 50 0.1 (and nothing with probability 0.5). So. It can be shown that any departure from the traditional approach can lead to dynamic inconsistency. Machina (1989) showed that this argument relies on a hidden assumption concerning behaviour in dynamic choice situations (the so-called consequentialism) and argued that this assumption is inappropriate when the decision-maker is a “non-expected utility maximiser”. This example shows that no approach can be considered as ideal in the context of decision under uncertainty. the actual choice at N1 diﬀers from the planned choice for that node. while if he chooses B.5 A N1 0. Now consider the tree of Figure 8. so that the best choice for him (before knowing the ﬁrst choice of nature) is A. At the same time. each procedure. As for the other situations studied in this book. if he has to plan the choice between A and B before knowing the ﬁrst choice of nature.9) to a game where he wins 10 e with probability 0.7.8).8 Figure 8.202 CHAPTER 8.5 0 10 0 0. each model. Note that these preferences violate the independence axiom of Von Neumann and Morgenstern. at node N1. However. will be B. he can easily calculate that if he chooses A. an interesting property is the so-called dynamic consistency: a decision-maker is said to be dynamically inconsistent if his actual choice when arriving at a decision node diﬀers from his previously planned choice for that node. DEALING WITH UNCERTAINTY 0. Assume that a decisionmaker prefers a game where he wins 50 e with probability 0.2 (and nothing with probability 0.5 (and nothing with probability 0. he wins 10 e with probability 0. the actual choice of the decision-maker.7: The dynamic consistency namic choice problem is characterised by the fact that at least one uncertainty node is followed by a decision node (this is typically the case of the application described in this chapter). However. 
he wins 50 e with probability 0.2 (and nothing with probability 0. Knowing the underlying assumptions of the decision-aid model which .8). he prefers to receive 10 e with certainty to a game where he wins 50 e with probability 0. According to the previous information.9). Let us illustrate this concept by a short example. illustrating the so-called dynamic inconsistency.1 (and nothing with probability 0. In such a context.

will be used is probably the only way, for the analyst, to guarantee an approach to the decision problem that is as scientific as possible. It is a fact that, due to lack of time and other priorities, many decision tools are developed in real applications without taking enough precautions (this was also the case in the example presented in this chapter, due to the short delays and to the necessity of overcoming the combinatorial aspects of the problem). This is why we consider it important to provide analysts with some guidelines for modelling a decision problem: this will be the subject of a volume in preparation.


9 SUPPORTING DECISIONS: A REAL-WORLD CASE STUDY

Introduction

In this chapter(1) we report on a real-world decision aiding process which took place in a large Italian firm in late 1996 and early 1997, concerning the evaluation of offers following a call for tenders for a very important software acquisition. We will try to extensively present the decision process for which the decision support was requested, including the problem structuring and formulation, the actors involved and their concerns (stakes), the evaluation model created and the multiple criteria method adopted.

We introduce such a real case description for two reasons. The first reason consists in our will to give an account of what providing decision support in a real context means and to show the importance of elements such as the participating actors, the decision aiding process, the problem formulation, the construction of the criteria etc., often neglected in many conventional decision aiding methodologies and in operational research. From this point of view the reader may find questions already introduced in previous chapters of the book, but here they are discussed from a decision aiding process perspective. The second reason is our will to introduce the reader to some concepts and problems that will be extensively discussed in a forthcoming volume by the authors. Our objective is to stimulate the reader to reflect on how decision support tools and concepts are used in real-life situations and how theoretical research may contribute to aid real decision-makers in real decision situations.

The reader should be aware of the fact that very few real-world cases of decision support are reported in the literature, although many more occur in reality (for noteworthy exceptions see Belton, Ackermann and Shepherd 1997, Bana e Costa, Ensslin, Corrêa and Vansnick 1999, Vincke 1992, Roy and Bouyssou 1993).

More precisely, the chapter is organised as follows. Section 1 introduces and defines some preliminary concepts that will be used in the rest of the chapter, such as decision process, decision aiding process, problem formulation, evaluation model etc. Section 2 presents the decision process for which the decision support was requested, the actors involved and their concerns (stakes), the resources involved and the timing.

(1) A large part of this chapter uses material already published in Paschetta and Tsoukiàs (1999).

Section 3 describes the decision aiding process, mainly through the different "products" of such a process, which are specifically analysed (the problem formulation, the evaluation model and the final recommendation), and discusses the experience conducted. The clients' comments on the experience are also included in this section. Section 4 summarises the lessons learned in such an experience. All technical details are included in Appendix A (an ELECTRE TRI type procedure is used), while the complete list of the evaluation attributes is provided in Appendix B.

9.1 Preliminaries

We will make extensive use of some terms (like actor, decision process etc.) in this chapter that, although present in the literature (see Simon 1957, Mintzberg, Raisinghani and Théorêt 1976, Jacquet-Lagrèze, Moscarola, Roy and Hirsch 1978, Checkland 1981, Heurgon 1982, Masser 1983, Moscarola 1984, Nutt 1984, Rosenhead 1989, Ostanello 1990, Humphreys, Svenson and Vári 1993, Ostanello and Tsoukiàs 1993, Ostanello 1997), can have different interpretations. In order to help the reader understand how such terms are used in this presentation, we introduce some informal definitions.

• Decision Process: a sequence of interactions amongst persons and/or organisations characterising one or more objects or concerns (the "problems").
• Actors: the participants in a decision process.
• Client: an actor in a decision process who asks for support in order to define his behaviour in the process. The term decision-maker is also used in the literature and in other chapters of this book, but in this context we prefer to use the term client.
• Analyst: an actor in a decision process who supports a client in a specific demand.
• Decision Aiding Process: part of the decision process; more precisely, the interactions occurring at least between the client and the analyst.
• Problem Situation: a descriptive model of what happens in the decision process when the decision support is requested and what the client is expecting to obtain from the decision support (this is one of the products of the decision aiding process).
• Problem Formulation: a formal representation of the problem for which the client asked the analyst to support him (this is one of the products of the decision aiding process).
• Evaluation Model: a model creating a specific instance of the problem formulation for which a specific decision support method can be used (this is one of the products of the decision aiding process).

9.2 The Decision Process

In early 1996 a very large Italian company operating a network-based service decided, as part of a strategic development policy, to equip itself with a Geographical Information System (GIS) on which all information concerning the structure of the network and the services provided all over the country was to be transferred. For this purpose, the company's Information Systems Department (ISD) asked the affiliated research and development agency (RDA), and more specifically the department concerned with this type of information technology (GISD), to perform a pilot study of the market in order to orient the company towards an acquisition. The actors involved at this level are the company's IS manager, the acquisition (AQ) manager, some of the company's external consultants concerned with software engineering, the RDA and different suppliers of GIS software.

The GISD of the RDA noticed that:
• the market offered a very large variety of software which could be used as a GIS for the company's purposes;
• the company required a very particular version of GIS that did not exist as a ready-made product on the market, but had to be created by customising and combining different modules of existing software, with the addition of ad-hoc written software for the purpose of the company;
• the question asked by the ISD was very general, but also very committing, because it included an evaluation prior to an acquisition and not just a simple description of the different products, since (at that time) this was quite a new technology;
• the GISD felt able to describe and evaluate different GIS products based on a set of attributes (at the end, several hundred), but was not able to provide a synthetic evaluation, the purpose of which was just as obscure (the use of a weighted sum was immediately set aside because it was perceived as "meaningless").

At this point of the process the GISD found out that a unit concerned with the use of the MCDA (Multiple Criteria Decision Analysis) methodology in software evaluation (MCDA/SE) was operating within the RDA and presented this problem as a case study, opening a specific commitment. The MCDA/SE unit responsible then decided to activate its links with an academic institution in order to get more insight and advice on the problem, which soon appeared to overcome the knowledge level of the unit at that time.

At this point we can make the following remarks:
• The decision process for which the decision aid was provided concerned the "acquisition of a GIS for X (the company)".
• A first decision aiding process was established where the client was the IS manager and the analyst was the GIS department of the RDA.

• A second decision aiding process was established where the client was the GIS department of the RDA and the analyst was the MCDA/SE unit. A third actor involved in this process was the "supervisor" of the analyst, in the sense of someone supporting the analyst in different tasks, providing him with expert methodological knowledge and framing his activity.

We will focus our attention on this second decision aiding process, where four actors are involved: the IS manager, the GISD (or team of analysts) as the client (bear in mind their particular position of clients and analysts at the same time), the MCDA/SE unit as the analyst and the supervisor. The first advice by the analyst to the GISD was to negotiate a more specific commitment with their client, such that their task could be more precise and better defined. After such a negotiation, the GISD's activity was defined as "technical assistance to the IS manager in a bid, concerning the acquisition of a GIS for the company" and its specific task was to provide a "technical evaluation" of the offers that were expected to be submitted. For this purpose the GISD drafted a decision aiding process outline where the principal activities to be performed were specified, as well as the timing, and submitted this draft to its client (see Figure 9.1).

At this point it is important to note the following.
1. The call for tenders concerned the acquisition of hundreds of software licenses, plus the hardware platforms on which such software was expected to run, the whole budget being several million e. From a financial point of view it represented a large stake for the company and a high level of responsibility for the decision-makers.
2. As already noted before, the bid concerned software that was not ready-made, but a collection of existing modules of GIS software which was expected to be used in order to create ad-hoc software for the specific necessities of the company. Two difficulties arose from this:
• the a priori evaluation of the software behaviour and its performance, without being able to test it on specific company-related cases;
• the timing of the evaluation (including testing the offers), which could be extremely long compared with the rapidity of the technological evolution of this type of software.
3. From a procedural point of view, the administration of a bid of this type is delegated to a committee which in this case included the IS manager, the AQ manager, a delegate of the CEO and a lawyer from the legal staff. From such a perspective, the task of the GISD (and of the decision aiding process) was to provide the IS manager with a "global" technical evaluation of the offers that could be used in the negotiations with the AQ manager (inside the committee) and the suppliers (outside the committee).

Figure 9.1: the bid process (flowchart: preparation of the call for tenders, first selection of the suppliers' answers, problem formulation, definition of requirements and points of view, prototype development and evaluation, second selection, sorting and final ranking, final choice).

9.3 Decision Support

We present here the three products of the decision aiding process: the problem formulation, the evaluation model and the final recommendation.

Once the call for tenders had been prepared (including the software requirements sections, the tenderers requirements section, the timing and evaluation procedure), it was presented to the company and the technical evaluation activity was settled. A second step in the decision aiding process was the generation of a problem formulation and of an evaluation model. Although we formally consider the two as two distinct products of the process, in reality, and in this case specifically, they have been generated contemporaneously. We should remember that the problem formulation and a first outline of the evaluation model were established while the call for tenders was under elaboration, for two reasons:
• for legal reasons, an outline of the evaluation model has to be included in the call for tenders;
• the evaluation model implicitly contains the software requirements of the offers, which in turn define the information to be provided by the tenderers. For instance, the call for tenders specified that a prototype was requested in order to test some performances; the choice to introduce some tests was made during the definition of the evaluation model. The tenderers therefore knew that they had to produce a prototype within a certain time frame.

We will discuss the problem formulation and the evaluation model in detail in the next section, but we can anticipate that the final formulation consisted in an absolute evaluation of the offers under a set of points of view that could be divided into two parts: the "quality evaluation" and the "performance evaluation". Although the set of alternatives was relatively small (only six alternatives were considered), the set of attributes was extremely complex (as often happens in software evaluation). Actually there were seven basic evaluation dimensions, expanded in a hierarchy with 134 leaves resulting in 183 evaluation nodes (see Appendix B). It is interesting to notice that the GISD staff charged with this evaluation have been "supported" by external consultants, software engineering experts in the company's sector who practically acted as the IS manager's delegates in the group. It is this extended group that signed the final recommendation presented to the IS manager and that we will hereafter call "team of analysts" (for the IS manager) or client (for the MCDA/SE unit and for us).

A third and final step in the decision aiding process was the elaboration of the final recommendation, after all the necessary information for the evaluation had been obtained and the evaluation performed. We will discuss such constructions in detail in the next sections, but we can anticipate that such an elaboration highlighted some questions (substantial and methodological) that had not been considered before.

Some months after the end of the process and the delivery of the final report we asked our client (the team of analysts) to discuss their experience with us and to answer some questions concerning the methodology used: how they perceived it, what they learned and what their appreciation was. The discussion was conducted in a very informal way, but the client provided us with some written remarks that were also reported during a conference presentation (see Fiammengo, Buosi, Iob, Maffioli, Panarotto and Turino 1997). Such remarks are introduced in the following section.

9.3.1 Problem Formulation

From the presentation of the process we can make the following observations.

1. It was extremely important for the client (the team of analysts) to understand his role in the process, what his client expected and what they were able to provide. In fact, at the beginning of the process, the problem situation was absolutely unclear. We recall the client's remarks: "....MCDA (Multi Criteria Decision Analysis) was very useful in organising the overall process and structure of the bid: what were the important steps to do, how to define the call for tenders....", "....MCDA was used as a background for the whole decision process. With such a perspective it turned out to be very useful because every activity had a justification....", "....A complex process, such as a bid, could be greatly eased by the use of any process centred methodology, which could be able to take what was happening in the decision process into account....". It is this last sentence which clearly highlights the necessity for the client to have a support along the whole process and for all its aspects. We actually agree with their comment that "any process modelled methodology could be useful", and we consider that their positive perception of MCDA is based on the fact that it was the first decision support approach they came to know. Moreover, the client came to understand that the expectations of the other actors involved in the process were extremely relevant, both for strategic reasons (having to do with organisational problems of the company) and operational reasons (recommending something reliable in a clear and sound way for all the actors involved in the bid).

2. Complex decision processes are based on human interactions, and these are based on the intrinsic ambiguity of human communication (thanks to ambiguity, human communication is also very efficient). However, when significant stakes are considered (as in our case), such an ambiguity might result in an impossibility to understand and ultimately to propose viable solutions. For instance, decision-makers may consider it dangerous to make a decision without having a clear idea of the consequences of their acts. Reporting the client's remarks: "....as a formal process MCDA guaranteed greater control and transparency to the process....", "....as a formal approach MCDA generated greater control and transparency....".

The use of a formal approach enables the reduction of ambiguity (without completely eliminating it) and thus appears to be an important support to the decision process.

It is clear that defining a precise problem formulation became a key issue for the client, because it clarified his role in the decision process (the bid management), his relation with the IS manager (his client), and gave him a precise activity to perform. We define (Morisio and Tsoukiàs 1997) a problem formulation as the collection of: a set of actions, a set of points of view and a problem statement.

1. The set of alternatives was considered to be the set of offers submitted after the call for tenders. A first idea to evaluate the tenderers, as well as the offers, was eliminated due to the particular technology, where no consolidated producers exist.

2. The set of points of view was defined using the team of analysts' technical knowledge and can be viewed as two basic sets: one concerning "quality", including specific technical features required for the software plus some ISO/IEC 9126 (1991) based dimensions, and the second concerning the performance of the offered software, to be tested on prototypes. Such points of view formed a huge hierarchy (see further on for details). No cost estimates were required by the client and so they were not considered in this set.

3. The only point that caused a discussion in the analysts team concerning the problem formulation was the problem statement. After some discussion, the problem statement adopted was the one of an "absolute" evaluation of the offers, both on a disaggregated level and on a global one. There were two reasons for this choice.
1. The team of analysts interpreted the client's demand as a question of whether the offers could be considered as intrinsically "good". A simple ranking of the offers could conceal the fact that all of them could be of very poor quality or satisfy the software requirements to a very low level. In other words, it could happen that the best bid could be "bad", and this was incompatible with the importance and cost of the acquisition.
2. The team of analysts felt uncomfortable with the idea of comparing the merits (or de-merits) of an offer with the merits (or de-merits) of another offer. A first informal discussion of the problem of compensation convinced them to overcome this question by comparing the offers to profiles about which they had sufficient knowledge, and not to compare bids amongst themselves.

Using the terminology introduced by Roy (1996), the problem statement appeared to be a hierarchically organised sorting of the offers, the sorting being repeated at all levels of the hierarchy. If we interpret the concept of measurement in a wide sense (comparing the offers to pre-established profiles can be viewed as a measurement procedure), the result that the team of analysts was looking for appeared to be the conclusion of repeated aggregations of measures.

As far as the problem formulation is concerned, an ex-post remark made by the team of analysts concerned the length of the evaluation process. They considered that such a process was so long that the information available at the beginning, and the formulation itself, could no longer be valid at the end of the process.

This was partly due to the very rapid evolution of GIS technology, which could completely innovate the state of the art in six months. Indeed, the length of the evaluation was considered as a negative critical issue in the client's remarks. Information is valid only for a limited period of time, and consequently the same is true for all evaluations based on such information. Moreover, the client himself may revise the problem formulation or update his perception of the information and modify his judgements, due to the knowledge acquired in this period (mainly due to the process itself). Another observation made by part of the team of analysts was that, towards the end of the process, due to the experience acquired, they could revise some of their judgements. The final report did not consider any revision of the formulation and the evaluations, since in the context of a call for tenders it could be considered unfair to modify the evaluations just before the final recommendation. We consider that this is a critical issue for decision support and decision aiding processes. It is rarely considered in decision aiding methodologies. While for relatively short decision aiding processes the problem may be irrelevant, it is certain that in long processes such a problem cannot be neglected and requires specific consideration.

9.3.2 The Evaluation Model

The different components of the evaluation model were specified in an iterative fashion. In the following we present their definition as they occurred in the decision aiding process.

The set of alternatives was identified as the set of offers legally accepted by the company in reply to the call for tenders. No preliminary screening of the offers was expected to be made. Although each offer was composed of different modules and software components, they have been considered as wholes. We may notice that, despite the fact that we had a large amount of information to handle in our model, the case did not present any exogenous uncertainty, since the client considered the basic data and its judgements reliable and felt confident with them.

The set of evaluation dimensions was a complex hierarchy with seven root nodes, 134 leaves and 183 nodes in total (the complete list is available in Appendix B). The key idea was that each node of the hierarchy was an evaluation model itself, for which the evaluation dimensions to aggregate and the aggregation procedure had to be defined. Each node was subject to extensive discussion before arriving at a final version. Basically, two issues have been considered in such discussions:
- the choice of the attributes to use;
- the semantics of each attribute.

Regarding the first issue, a frequent attitude of technical committees charged with evaluating complex objects (as in our case) is to define an "excellence list" where every possible aspect of the object is considered. Such a list is generally provided by the literature, international standards etc., independent from the specific problem at hand, thus containing redundancies and conceptual dependencies which can invalidate the evaluation. This is a typical situation in software evaluation (see Morisio and Tsoukiàs 1997, Blin and Tsoukiàs 1998, Stamelos and Tsoukiàs 1998). Our client was aware of the problem, but had no knowledge and no tools enabling him to simplify and reduce the first version of the list they had defined. The repeated use of a coherence test (in the sense of Roy and Bouyssou 1993) for each intermediate node of the hierarchy made it possible to eliminate a significant number of redundant and dependent attributes (more than 30%) and to better understand the semantics of each attribute used. Verifying the separability of each sub-dimension with respect to the parent node, in the sense that each sub-node should be able to discriminate alone the offers with respect to the evaluation considered at the parent level, was very helpful. Despite this work, the client wrote in his ex-post considerations: "....it was not necessary to be so detailed in the evaluation....", "....the whole process could be faster because we needed the software for a due date....", "....it could be preferable to use a limited number of criteria....". On the other hand, it is also true that it was only after the process that the client was able to determine which were the really significant criteria that discriminated among the alternatives.

Actually, at a certain point in the hierarchy definition process, there was a discussion about some attributes that could also be considered as leaves at the top level of the hierarchy: the so called "process attributes". With this term we want to indicate that they were intended to evaluate special functionality inside different processes (in this context "process" means a chunk of functionality aiming towards supporting a stream of activities of a software). In fact, one can consider a process attribute (at the final level) and then subdivide it in quality aspects, or alternatively consider single independent quality aspects whose evaluation depends on how the process attribute is considered. The final choice was to put the process attributes at the top level, because they directly emanate from the evaluation scope.

With respect to the second issue, we pushed the client to provide us with a short description of each attribute and, when a preference model was associated to it, a short description of the model (why a certain value was considered as better than another). Such an approach helped the client both to eliminate redundancies (before using the coherence test, which is time consuming) and to better understand the contents of the evaluation model. Such an activity also helped the client to realise that they needed an absolute evaluation of the alternatives for almost all the intermediate nodes of the hierarchy, thus implicitly defining the problem statement of the model.

The basic information available was of the "subjective ordinal measurement" type. With this term we want to indicate that each alternative could be described by a vector of the 134 elementary pieces of information, which were in the large majority either subjective evaluations by experts (mostly part of the team of analysts, the client), of the "good", "acceptable" etc. type, or descriptions, of the "operating system X", "compatible with graphic engine Y" etc. type. The former were expressed on ordinal scales, while the latter were expressed on nominal scales. It was almost impossible for the experts to give more information than such an order, and it was exactly this type of information that pushed the client to look for an evaluation model other than the usual weighted sum widely diffused in software evaluation manuals and standards (see ISO/IEC 9126 1991, IEEE 92 1992).

From a certain point of view we can claim that the client needed to aggregate ordinal measures and not preferences, in the sense that they had to aggregate the ordinal measures obtained when comparing the alternatives to the standards, and not to compare the alternatives amongst themselves.

The basic information consisted either in observations concerning the offers (expressed on nominal scales) or in expert judgements (expressed on ordinal scales of value, of the "good", "acceptable" etc. type). All the intermediate nodes were expected to provide information of the second type. Gathering and obtaining the relevant information for an evaluation model is often considered as a second level activity and is therefore neglected from further specific considerations. Actually, in our case, obtaining the information was not a difficult task, but a time consuming process that required the establishment of an ad-hoc procedure during the process (see figure 9.1). We consider that this is also a critical issue in a decision aiding process: the information used in an evaluation model results from the manipulation of the rough information available at the beginning of the process. We can consider that the information is constructed during the decision aiding process and cannot be viewed as a simple input.

An important discussion with the client concerned the distinction between measures and preferences. As already reported, the client did not compare the alternatives amongst themselves, but compared them to a-priori defined (by the client) standards of "good", "acceptable" etc. The preference among the alternatives was expected to be induced once the alternatives could be "measured" by the attributes. When asked to formulate preferences, these concerned the elements of the nominal scales and not the alternatives themselves. Under such a perspective, it was important for the client to understand what they were expressing their preferences on. Such an observation greatly helped the client to understand the nature and scope of the evaluation model, and ultimately to define the problem statement of the model. Moreover, the discussion on the different typologies of measurement scales helped the client to understand the problem of choosing an appropriate aggregation procedure.

Before continuing the definition of the model associated to each node, the problem of the aggregation procedure was faced, since it could influence the construction of such models. The presence of ordinal information for almost all leaves, and the problem statement requiring a "repeated sorting" of the offers, oriented the team of analysts to choose an aggregation procedure based on the ELECTRE-TRI method (see Yu 1992), except for the final aggregation level. See also appendix A for a presentation of the procedure.

At this point the team was ready to define their specific evaluation models for all nodes. For all leaf nodes an ordinal scale was established. Clearly, all nominal scales had to be transformed into ordinal ones, associating a preference model on the elements of the nominal scale of the attribute. The available technical knowledge consisted in the different possible "states" in which an offer could find itself. In particular we had the following cases.

Consider, for instance, the leaf nodes:
1.1.1.1 (type of presentation on the user interface in the land-base management), whose possible states were: standard graphics (SG) and non standard graphics (NSG);
1.1.1.2 (graphic engine of the user interface in the land-base management), whose possible states were: station M (M, a graphic engine already adopted in other software used in the company), other acceptable graphic engine (OA) and other non acceptable graphic engine (ON);
1.1.1.3 (customisation of the user interface in the land-base management), whose possible states were: availability of a graphic tool (T), availability of an advanced graphic language (E), availability of a standard programming language (S) and no customisation available (N); in this case different combinations were possible (for instance a software could provide both an advanced graphic language and a standard programming language: value E,S).

The three ordinal scales associated to the three nodes were (≻ representing the scale order):
1.1.1.1: SG ≻ NSG;
1.1.1.2: M ≻ OA ≻ ON;
1.1.1.3: T ≻ E ≻ S ≻ N.

For all parent nodes, a brief descriptive text of what the node was expected to evaluate was provided, and all parent nodes were equipped with the same number of classes: unacceptable (U), acceptable (A), good (G), very good (VG) and excellent (E). Then, two possibilities for defining the relationship between the values on the sub-nodes and the values on the parent nodes were established.
1. When possible, an exhaustive combination of the values of the sub-nodes was provided.
2. When an exhaustive combination of the values was impossible, an ELECTRE-TRI procedure was used. For this purpose, the following information was requested:
- the relative importance of the different sub-nodes;
- the concordance threshold for the establishment of the outranking relation among the offers and the profiles;
- a veto condition on the sub-nodes, such that the value on the parent node could be limited (possibly to unacceptable).

For instance, consider node 1.1.1 (user interface of the land-base management), which has the three evaluation models introduced above as sub-nodes. In this case we have the following (exhaustive) evaluation model: the combinations assigning an offer to the upper classes were enumerated explicitly, class 1.1.1.E corresponding to the single combination (T, M, SG) and classes 1.1.1.VG and 1.1.1.G to explicit lists of combinations of high values on the three scales; class 1.1.1.U collected all cases where 1.1.1.1 is NSG, or 1.1.1.2 is ON, or 1.1.1.3 is N; and class 1.1.1.A all the remaining cases except the unacceptable ones.
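The exhaustive-combination rule at node 1.1.1 can be sketched as a small lookup function. The sketch below is in Python, with names of our own; the E and U rules follow the case description directly, while the VG and G combinations are illustrative assumptions, since only part of the original listing is reproduced here.

```python
# Sketch of the exhaustive-combination rule at node 1.1.1 (user
# interface of the land-base management). The U and E rules are taken
# from the case description; the VG and G combinations are illustrative.

# Ordinal scales of the three sub-nodes (best value first).
PRESENTATION = ["SG", "NSG"]          # 1.1.1.1: standard / non standard graphics
ENGINE = ["M", "OA", "ON"]            # 1.1.1.2: station M / other acceptable / other non acceptable
CUSTOMISATION = ["T", "E", "S", "N"]  # 1.1.1.3: tool / adv. language / std language / none

def parent_class(pres, eng, cust):
    """Map a combination of sub-node values to the parent scale
    E > VG > G > A > U."""
    # Unacceptable whenever any sub-node takes its worst value.
    if pres == "NSG" or eng == "ON" or cust == "N":
        return "U"
    if (pres, eng, cust) == ("SG", "M", "T"):
        return "E"                     # the single best combination
    if pres == "SG" and (eng == "M" or cust in ("T", "E")):
        return "VG"                    # illustrative list of strong combinations
    if cust in ("T", "E"):
        return "G"                     # illustrative
    return "A"                         # all remaining non-unacceptable cases
```

The design mirrors the text: the worst value on any sub-node acts as a knock-out (U), while the upper classes are enumerated combination by combination.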

As already mentioned, the set of dimensions was built around two basic points of view: the "quality" and the "performances". The first generated six evaluation dimensions, corresponding to six (among seven) of the root nodes of the model, which will be called the "quality attributes" or "quality criteria" or "quality part of the hierarchy" hereafter.

For example, we can take node 1 (land-base management), which has eight sub-nodes: 1.1: User interface; 1.2: Functionality; 1.3: Development environment; 1.4: Administration tools; 1.5: Work flow connection; 1.6: Interoperability; 1.7: Integration between land-base products and the Spatial Data manager; 1.8: Integration among land-base products. The relative importance of the sub-nodes and the concordance threshold have been established using a reasoning on coalitions (for details see Chapter 6). The relative importance parameters were established as follows: w(1.1) = 4, w(1.2) = 8, w(1.3) = 5, w(1.4) = 4, w(1.5) = 1, w(1.6) = 8, w(1.7) = 8, w(1.8) = 2, and the concordance threshold was fixed as 29/36 (around 0.8). Such choices imply that no coalition that excluded nodes 1.2, 1.6 or 1.7 was acceptable, and that the smallest acceptable coalition should necessarily include these three nodes. In other words, the team of analysts established the characteristics of the sub-nodes for which an offer could be considered very good (therefore should outrank the very good profile) and consequently calibrated the values of the parameters of relative importance and of the concordance threshold.

The team of analysts also established very high concordance thresholds (never less than 80%, very often around 90%) that result in very severe evaluations. Such a choice reflected the conviction, of at least a part of the team of analysts, that very strong reasons were required to qualify an offer as very good. Since the whole model was calibrated starting from the very good value, this conviction had wider effects than the team of analysts could imagine. The analyst and the supervisor explained this aspect to the client who, on this basis, revised the importance parameters several times.

The veto condition was established as the presence of the value "unacceptable" at a sub-node. The presence of a veto also produced an "unacceptable" value at the level of the parent node. In other words, the team of analysts considered any "unacceptable" value to be a severe technical limitation of the offer. The reader may notice that this is a very strong interpretation of a veto condition among the ones used in the outranking based sorting procedures, but it was the one with which the team of analysts felt comfortable at the time of construction of the evaluation model.

The seventh root node (node 7, with sub-nodes 7.1 to 7.4) concerned the evaluation of the performances of the prototypes submitted to tests by the team of analysts. Such performances are basically measured in the time necessary to execute a set of specific tasks under certain conditions and with some external fixed parameters.
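The coalition reasoning for node 1 can be made concrete with a small sketch. This is a hypothetical Python rendering, with names of our own; the weights as listed sum to 40 while the threshold is quoted as 29/36 ≈ 0.8, so the sketch simply treats the threshold as a fraction of the total weight.

```python
# Sketch of the concordance/veto test reported for node 1 (land-base
# management). Weights and the 29/36 threshold are the values quoted in
# the text; the threshold is applied as a fraction of the total weight.

WEIGHTS = {"1.1": 4, "1.2": 8, "1.3": 5, "1.4": 4,
           "1.5": 1, "1.6": 8, "1.7": 8, "1.8": 2}
THRESHOLD = 29 / 36                   # "around 0.8"
ORDER = ["U", "A", "G", "VG", "E"]    # parent classes, worst to best

def outranks_profile(values, profile_class):
    """True if the offer outranks the profile of `profile_class`:
    no sub-node may be unacceptable (the strong veto of the case
    study), and the coalition of sub-nodes at least as good as the
    profile must reach the concordance threshold."""
    if "U" in values.values():        # veto: any unacceptable sub-node
        return False
    total = sum(WEIGHTS.values())
    coalition = sum(w for node, w in WEIGHTS.items()
                    if ORDER.index(values[node]) >= ORDER.index(profile_class))
    return coalition / total >= THRESHOLD

# An offer 'very good' on the heavy nodes outranks the VG profile; the
# same offer with node 1.2 dropped from the coalition does not, which
# illustrates why no acceptable coalition can exclude the weight-8 nodes.
offer = {"1.1": "VG", "1.2": "VG", "1.3": "G", "1.4": "VG",
         "1.5": "A", "1.6": "VG", "1.7": "VG", "1.8": "VG"}
```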

For instance, consider node 7.3 (performance under load). The dimension is expected to evaluate the performance of the prototype while the quantity of data to be elaborated increases. The value v(x) (x being an offer) combines an observed measure Wx(t) and an interpolated one Tx(t) (t representing the data load; the interpolation is not necessarily linear). The combination is obtained through the following formula:

v(x) = ∫ Wx(t) Tx(t) dt

In this case there are no external profiles with which to compare the performances, because the prototypes are created ad-hoc, the technology is quite new, and there are no standards of what a "very good" performance could be. An ordinal scale was therefore created, considering the best performance as "first", all performances presenting a difference of more than 5% and less than 20% "second", all performances presenting a difference of more than 20% and less than 25% "third", all performances presenting a difference of more than 25% and less than 50% "fourth", and all performances presenting a difference of more than 50% "fifth". Although this process can often be qualified as "subjective measurement", it was the only way to obtain meaningful values for the offers. The same model was applied to all sub-nodes of node 7.

It took four to five months for all the nodes to be equipped with their evaluation model, and the process generated several discussions inside the team of analysts, mainly of a technical nature (concerning the specific contents of the values for each node). The length of the process is justified not only by the quantity of nodes to define, but also because the team of analysts was obliged to define a new measurement scale and a precise measurement aggregation procedure for each node. The most discussed concepts of the model were the concordance threshold and the veto condition, since part of the team considered that the required levels were extremely severe. However, since such an approach corresponded to a cautious attitude, it prevailed in the team and finally was accepted.

This process was repeated for all the intermediate nodes up to the seven root nodes representing the seven basic evaluation dimensions. The set of criteria to be used, if a preference aggregation comparing the alternatives amongst themselves was requested, was defined as the seven root nodes equipped with a simple preference model: the weak order induced by the ordinal scale associated to each of these nodes. A sorting procedure could then be established to obtain the final evaluation.

No exogenous uncertainty was considered in the evaluation model: the information provided by the tenderers concerning their offers was considered to be reliable, and the use of ordinal scales made it possible to avoid the problems of imprecision or of measurement errors. This reasoning, however, is less true for node 7 and its sub-nodes, but the team of analysts felt sufficiently confident with the tests and did not analyse the problem further. Some endogenous uncertainty appeared as soon as the model was put into practice (the offers being available). We shall
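A minimal numerical sketch of this performance model follows, assuming a trapezoidal approximation of the integral and the percentage bands quoted above; the boundary handling at the band edges and all names are our assumptions.

```python
# Sketch of the node 7.3 evaluation ("performance under load"):
# v(x) = integral of Wx(t) * Tx(t) dt, approximated by the trapezoidal
# rule over the tested data loads, followed by the ordinal banding of
# the resulting values (5/20/25/50% differences from the best one).

def v(loads, W, T):
    """Trapezoidal approximation of the integral of W(t)*T(t) over the
    data loads at which the prototype was tested."""
    f = [w * u for w, u in zip(W, T)]
    return sum((loads[i + 1] - loads[i]) * (f[i] + f[i + 1]) / 2
               for i in range(len(loads) - 1))

def ordinal_band(value, best):
    """Rank a performance value by its relative difference from the
    best one: 'first' (up to 5%), then 'second', 'third', 'fourth',
    'fifth'. Band edges are assigned to the better class here."""
    diff = abs(value - best) / best
    if diff <= 0.05:
        return "first"
    if diff <= 0.20:
        return "second"
    if diff <= 0.25:
        return "third"
    if diff <= 0.50:
        return "fourth"
    return "fifth"
```

The banding step is what turns the continuous v(x) values into the ordinal information that the rest of the hierarchy expects.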


discuss this problem in more detail in the next section (concerning the elaboration of the final recommendation), but we can anticipate that the problem was created by the "double" evaluation provided by the chosen ELECTRE-TRI type aggregation, consisting in an "optimistic" and a "pessimistic" evaluation which may not necessarily coincide. The evaluation model was coded in a formal document that was submitted (and explained) to the final client, who gave his consensus. It is worthwhile to note that the final client was not able to participate in the elaboration of the model (technical details, establishment of the parameters etc.); part of the team of analysts (some of the external consultants) were acting as his delegates. The establishment of the evaluation model and its acceptance by the client opened the way for its application on the set of offers received and for the elaboration of the final recommendation. The client greatly appreciated his involvement in the establishment of the evaluation model, which turned out to be a product they considered their own (from their ex-post remarks: "....this (the involvement) turned out to be important....for the acceptability of the evaluation results"). The fact that each node of the hierarchy was discussed, analysed and finally defined by the team of analysts allowed them to understand the consequences at the global level, to be able to explain the contents of the model to their client, and to justify the final result on the grounds of their own knowledge and experience, not of the procedure adopted. In other words, we can claim that the model was validated during its construction. Such an approach helped both the acceptability of the model and of the final result, eased the discussion when the question of the final aggregation was settled, and definitely legitimated the model in the eyes of the client.
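The "double" evaluation can be illustrated by a minimal sketch of the two ELECTRE-TRI assignment rules. Everything here is a toy illustration rather than the variant actually used in the case study: the outranking test is a simple unanimity rule, and the profiles are arbitrary numeric vectors of our own.

```python
# Minimal sketch of the pessimistic and optimistic ELECTRE-TRI
# assignments. Classes are ordered worst to best; PROFILES[k] is the
# lower bound of class k+1. The toy 'outranks' test stands in for the
# full concordance/veto test; incomparabilities between an offer and a
# profile are what make the two assignments diverge.

CLASSES = ["U", "A", "G", "VG", "E"]          # worst to best
PROFILES = [(1, 1), (2, 2), (3, 3), (4, 4)]   # lower bounds of A, G, VG, E

def outranks(a, b):
    """Toy outranking test: a outranks b if a is at least as good as b
    on every criterion (a unanimity rule, for illustration only)."""
    return all(x >= y for x, y in zip(a, b))

def pessimistic(offer):
    """Scan the profiles from the best down; assign the highest class
    whose lower profile the offer outranks."""
    for k in range(len(PROFILES) - 1, -1, -1):
        if outranks(offer, PROFILES[k]):
            return CLASSES[k + 1]
    return CLASSES[0]

def optimistic(offer):
    """Scan the profiles from the worst up; assign the class just below
    the first profile strictly preferred to the offer."""
    for k in range(len(PROFILES)):
        if outranks(PROFILES[k], offer) and not outranks(offer, PROFILES[k]):
            return CLASSES[k]
    return CLASSES[-1]

# An offer incomparable with the G and VG profiles receives the
# interval [pessimistic, optimistic] = [A, G]:
print(pessimistic((3, 1)), optimistic((3, 1)))  # prints: A G
```

When the offer is comparable with every profile the two rules agree; the interval only opens up under incomparability, which is exactly the situation discussed for the offers whose sub-node profiles differed strongly from the class profiles.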

9.3.3 The final recommendation

The evaluation of the six offers which had effectively been submitted after the elaboration of the call for tenders was carried out in two main steps: the first consisted in evaluating the six "quality attributes", the second in testing the prototypes provided by the tenderers. The method adopted to aggregate the information and construct the final evaluations was a variant of the ELECTRE TRI procedure (see Yu 1992). The reader can also see Appendix A and refer to Chapter 6 for more details. We have the following remarks on the use of such a method.

1. The key parameters used in the method are the profiles (to which the alternatives are compared in order to be classified in a specific class), the importance of each criterion for each parent criterion classification, and the concepts of concordance thresholds and veto conditions. For each intermediate node such parameters were extensively discussed before reaching a precise numerical representation. As already mentioned in section 3.2, the relative importance of each criterion and the concordance threshold were established using a reasoning based on the identification of the "winning coalitions" enabling the outranking relation to hold. The veto


condition was initially perceived as a theoretical possibility of no practical use, then as an eliminatory threshold, but the client soon realised its importance, mainly when it was necessary to obtain an incomparability instead of an indiﬀerence, which would have been counterintuitive when very diﬀerent objects were compared. Moreover, as soon as the veto conditions were understood by the client, they decided to introduce a similar concept each time they wanted to distinguish between positive reasons (for the establishment of the outranking relation) and negative reasons (against the establishment of the outranking relation), since they are not necessarily complementary and must be evaluated in a separate and independent way. The proﬁles were established using the knowledge of the team of analysts (experts in their domain), who were able to identify the minimal requirements to qualify an object in a certain class. It is interesting to notice that for the client, the intuitive idea of a proﬁle was that of a typical object of a class and not of the lower bound of the class. The shift from the intuitive idea to the one used in the case study was immediate and presented no problems. The fact remains that the distinction between the two concepts of proﬁle is crucial, and the lower bound approach appears to be less intuitive than the typical element one.

2. The whole method (and the model) was implemented on a spreadsheet. This was of great importance because spreadsheets are a basic tool for communication and work in all companies and enable an immediate understanding of the results. Moreover, they enabled on-line what-if operations when speciﬁc problems, concerning precise information and/or evaluation, appeared during the discussions inside the team of analysts. The experimental validation of the model was greatly eased by the use of the spreadsheet. Furthermore, it helped the acceptability and legitimation of the model through the idea that “if it can be implemented on a spreadsheet it is suﬃciently simple and easy to be used by our company”. In fact some of the critiques by the client about the approach adopted in this case were that “....MCDA is not yet a universally known method....”, “....it seems less intuitive than other well known techniques such as the weighted sum....”, “....it is time consuming to apply a new methodology....”, all these problems limiting the acceptability of the methodology in the eyes of the client’s client (the IS manager) and the company more generally. Being able to implement the method and the model on a spreadsheet was, for them, a proof that, although new, complex and apparently less intuitive, the method was simple and easy, and therefore legitimately used in the decision process. A speciﬁc problem raised in the ﬁrst step was the generation of uncertainty by the aggregation procedure itself. The ELECTRE-TRI type procedure adopted produces an interval evaluation consisting in a lower value (the pessimistic evaluation) and an upper value (the optimistic evaluation). When an alternative has a proﬁle on the sub-nodes that is very diﬀerent from the proﬁles of the classes on the parent node then, due to the incomparabilities that occur when comparing

        O1     O2     O3     O4     O5     O6
C1      A-A    G-G    A-VG   A-G    G-VG   A-A
C2      A-A    G-VG   A-VG   A-VG   G-G    A-G
C3      A-A    G-G    A-VG   G-G    A-A    A-A
C4      A-G    G-VG   A-VG   G-VG   A-VG   A-G
C5      U-U    G-VG   G-G    A-G    G-VG   U-U
C6      A-A    VG-VG  E-E    VG-VG  G-G    VG-VG

Table 9.1: the values of the alternatives on the six quality criteria (U: unacceptable, A: acceptable, G: good, VG: very good, E: excellent)

the alternative to the proﬁles, it may happen that the two values do not coincide (see more details in Appendix A). When the user of the model is not able to choose one of the two evaluations, this becomes a problem in a hierarchical aggregation, since at the next aggregation step the sub-nodes may have evaluations expressed as an interval. This is a typical case of endogenous uncertainty, created by the method itself and not by the available information. The client was keen to consider the pessimistic and optimistic evaluations as bounds of the “real” value, but there was no uncertainty distribution on the interval. For this purpose, the following procedure was adopted. Two distinct aggregations were made, one using the lower values and the other using the upper values. Each of these, in turn, may produce a lower value and an upper value. At the next aggregation step, the lowest of the two lower values and the highest of the two upper values are used. This is a cautious attitude and has the drawback of widening the intervals as the aggregation goes up the hierarchy. However, this eﬀect did not occur here and the ﬁnal result for the six dimensions is represented in table 9.1 (from here on we will represent the criteria by Ci and the alternatives by Oi).
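The cautious propagation rule just described can be sketched in a few lines. This is a minimal illustration with names of our own choosing; the actual sorting step at each node was the ELECTRE-TRI variant, which we replace here by a simple stand-in:

```python
# Cautious propagation of interval evaluations up a hierarchy:
# aggregate the lower bounds and the upper bounds separately, then
# keep the lowest of the two lower values and the highest of the
# two upper values, exactly as in the procedure described above.

SCALE = ["U", "A", "G", "VG", "E"]  # unacceptable ... excellent

def aggregate(values):
    """Stand-in for one sorting step: returns a (lower, upper) pair.
    (An assumption of ours; the real step was an ELECTRE-TRI variant.)"""
    ranks = sorted(SCALE.index(v) for v in values)
    return SCALE[ranks[0]], SCALE[ranks[-1]]

def propagate(intervals):
    """Combine child intervals (lo, up) into the parent interval."""
    lo_lo, lo_up = aggregate([lo for lo, _ in intervals])  # run on lower values
    up_lo, up_up = aggregate([up for _, up in intervals])  # run on upper values
    # cautious rule: lowest lower value, highest upper value
    lo = min(lo_lo, up_lo, key=SCALE.index)
    up = max(lo_up, up_up, key=SCALE.index)
    return lo, up

print(propagate([("A", "G"), ("G", "VG"), ("A", "A")]))  # ('A', 'VG')
```

The sketch also makes the drawback mentioned above visible: the parent interval can only be as wide as, or wider than, the child intervals.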

We consider that the problem of interval evaluation on ordinal scales is an open theoretical problem that deserves future consideration (very little literature on the subject is available to our knowledge: see Roubens and Vincke 1985, Vincke 1988, Pirlot and Vincke 1997, Tsoukiàs and Vincke 1999). Another modiﬁcation introduced in the aggregation procedure concerned the use of the veto concept. As already mentioned, a strong veto concept was used in the evaluation model such that the presence of an “unacceptable” value on any node (among the ones endowed with such veto power) could result in a global “unacceptable” value. However, during the evaluation of the oﬀers, weaker concepts of veto appeared necessary. The idea was that certain values could have a “limitation” eﬀect of the type: “if an oﬀer has the value x on a sub-node then it cannot be more than y on the parent node”. The results on node 7 concerning the performances of the prototypes are presented in table 9.2.

        O1    O2    O3    O4    O5    O6
C7      A-A   G-G   G-G   A-A   E-E   A-A

Table 9.2: the values of the alternatives on the performance criterion (U: unacceptable, A: acceptable, G: good, VG: very good, E: excellent)

Remember that such a result is an ordinal scale obtained by aggregating the four scales deﬁned as explained in the previous section. Therefore, it could be considered more as a ranking than as an absolute evaluation. For this reason the team of analysts decided to use such an attribute only to rank the diﬀerent oﬀers after their sorting obtained by using the six quality attributes. For this purpose the team of analysts tested three diﬀerent aggregation scenarios corresponding to three diﬀerent hypotheses about the importance of the performance attribute.

1. The performance attribute is considered to have the same importance as the set of six quality attributes. This scenario represents the idea that the tests on the software performances correspond to the only “real” or “objective” measurement of the oﬀers, and it should therefore be viewed as a validation of the result obtained through the subjective measurement carried out on the six quality attributes. The aggregation procedure consisted in using the six quality attributes as criteria, each equipped with a weak order, from which to obtain a ﬁnal ranking. Since the evaluations for some of the six attributes were in the form of an interval, an extended ordinal scale was deﬁned in order to induce the weak order: E ≻ VG ≻ G-VG ≻ G ≻ A-VG ≻ A-G ≻ A ≻ U. The importance parameters are w(C1) = 2, w(C2) = 2, w(C3) = 4, w(C4) = 1, w(C5) = 4, w(C6) = 2 and the concordance threshold is 12/15 (0.8). The six orders are the following (x,y standing for indiﬀerence between x and y):
- O5 ≻ O2 ≻ O3 ≻ O4 ≻ O1,O6;
- O2 ≻ O5 ≻ O3 ≻ O4 ≻ O6 ≻ O1;
- O2 ≻ O4 ≻ O3 ≻ O5,O1,O6;
- O2,O4 ≻ O3,O5 ≻ O1,O6;
- O2,O5 ≻ O3,O4 ≻ O1,O6;
- O3 ≻ O2 ≻ O6,O4 ≻ O5 ≻ O1.
The ﬁnal result is presented in table 9.3. In order to rank the alternatives a “score” is computed for each of them: the diﬀerence between the number of alternatives to which this speciﬁc alternative is preferred and the number of alternatives preferred to it. Then, the alternatives are ranked by decreasing magnitude of this score. The ﬁnal ranking thus obtained is given in ﬁgure 9.2, part 2a (it is worthwhile noting that the indiﬀerences obtained in the ﬁnal ranking correspond to incomparabilities obtained in the aggregation step). An intersection was therefore operated with the

[Figure 9.2. 2a: the ﬁnal ranking using the six quality criteria. 2b: the ﬁnal ranking as intersection of the six quality criteria and the performance criterion.]
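The score-based ranking rule described in the ﬁrst scenario can be replayed mechanically on the outranking relation of table 9.3 (the 0/1 matrix below reproduces that table, rows outranking columns; variable names are ours). Its output coincides with the ranking of ﬁgure 9.2, part 2a:

```python
# Rank alternatives from an outranking relation by a net score:
# score(x) = |{y : x S y}| - |{y : y S x}|, then sort by decreasing score.

names = ["O1", "O2", "O3", "O4", "O5", "O6"]
S = [
    [1, 0, 0, 0, 0, 0],  # O1
    [1, 1, 1, 1, 1, 1],  # O2
    [1, 0, 1, 0, 0, 1],  # O3
    [1, 0, 0, 1, 0, 1],  # O4
    [1, 0, 0, 0, 1, 1],  # O5
    [1, 0, 0, 0, 0, 1],  # O6
]

def net_scores(S, names):
    n = len(names)
    return {
        names[i]: sum(S[i][j] for j in range(n) if j != i)
        - sum(S[j][i] for j in range(n) if j != i)
        for i in range(n)
    }

scores = net_scores(S, names)
ranking = sorted(names, key=lambda x: -scores[x])
print(scores)   # O2: 5, O3/O4/O5: 1, O6: -3, O1: -5
print(ranking)  # ['O2', 'O3', 'O4', 'O5', 'O6', 'O1']
```

Equal scores (here O3, O4 and O5) are read as indiﬀerences in the ﬁnal ranking, which is how the incomparabilities of the aggregation step surface in ﬁgure 9.2.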

ranking obtained on node 7, resulting in a ﬁnal ranking reported in ﬁgure 9.2, part 2b.

2. The performance attribute is considered to be of secondary importance, to be used in order to distinguish among the alternatives assigned to the same class using the six quality attributes. In other words, the principal evaluation was the one using the six quality attributes, and the performance evaluation was only a supplement enabling a possible further distinction. Such an approach resulted from the low conﬁdence awarded to the performance evaluation and the undesirability of assigning it high importance. A lexicographic aggregation was therefore applied, using the six quality criteria as in the previous scenario and applying the performance criterion to the equivalence classes of the global ranking. The ﬁnal ranking is O2 ≻ O5 ≻ O3 ≻ O4 ≻ O6 ≻ O1.

3. A third approach consisted in considering the seven attributes as seven criteria to be aggregated to obtain a ﬁnal ranking, assigning them a reasoned importance parameter. The idea was that while the client could be interested in having the absolute evaluation of the oﬀers (a result obtainable only using the six quality attributes) he could also be interested in a ranking of the alternatives that could help him in the ﬁnal choice. From this point of

view, the absolute evaluations of the six quality attributes were transformed into rankings as in the ﬁrst scenario, adding the seventh attribute as a seventh criterion. The seven weak orders are the following:
- O5 ≻ O2 ≻ O3 ≻ O4 ≻ O1,O6;
- O2 ≻ O5 ≻ O3 ≻ O4 ≻ O6 ≻ O1;
- O2 ≻ O4 ≻ O3 ≻ O5,O1,O6;
- O2,O4 ≻ O3,O5 ≻ O1,O6;
- O2,O5 ≻ O3,O4 ≻ O1,O6;
- O3 ≻ O2 ≻ O6,O4 ≻ O5 ≻ O1;
- O5 ≻ O2,O3 ≻ O1,O4,O6.
The importance parameters are w(C1) = 2, w(C2) = 2, w(C3) = 4, w(C4) = 1, w(C5) = 4, w(C6) = 2, w(C7) = 4 and the concordance threshold is 16/19 (more than 0.8). The ﬁnal result is reported in table 9.4. Using the same ranking procedure the ﬁnal ranking is now: O2 ≻ O5 ≻ O3,O4 ≻ O6 ≻ O1.

        O1   O2   O3   O4   O5   O6
O1      1    0    0    0    0    0
O2      1    1    1    1    1    1
O3      1    0    1    0    0    1
O4      1    0    0    1    0    1
O5      1    0    0    0    1    1
O6      1    0    0    0    0    1

Table 9.3: the outranking relation aggregating the six quality criteria

        O1   O2   O3   O4   O5   O6
O1      1    0    0    0    0    0
O2      1    1    1    1    0    1
O3      1    0    1    0    0    1
O4      1    0    0    1    0    1
O5      1    0    0    0    1    1
O6      1    0    0    0    0    1

Table 9.4: the outranking relation aggregating the seven criteria

Finally, and after some discussions with the client, the third scenario was adopted and used as the ﬁnal result. The two basic reasons were:
- while it was meaningful to interpret the ordinal measures for the six quality attributes as weak orders representing the client’s preferences, it was not meaningful to translate the weak order obtained for the performance attribute as an ordinal measurement of the oﬀers;

- the ﬁrst and second scenarios implicitly adopted two extreme positions concerning the importance of the performance attribute that correspond to two diﬀerent “philosophies” present in the team of analysts. The importance parameters and the concordance threshold adopted in the ﬁnal version made it possible to deﬁne a compromise of these two extreme positions expressed during the decision aiding process. In fact the performance criterion is associated with an importance parameter of 4 which, combined with the concordance threshold of 16/19, implies that it is impossible for an alternative to outrank another if its value on the performance criterion is worse (and this satisﬁed the part of the team of analysts that considered the performance criterion as a critical evaluation of the oﬀers). Giving a regular importance parameter to the performance criterion avoided the extreme situation in which all other evaluations could become irrelevant. The ﬁnal ranking obtained respects this idea and the outranking table could be understood by all the members of the team of analysts.

A major concern for people involved in complex decision processes is to be able to justify their behaviour, recommendations and decisions towards, for instance, a director, an inspector, a superior in the hierarchy of the company, a committee etc. Such a justiﬁcation applies both to how a speciﬁc result was obtained and to how the whole evaluation was conducted. In this case, the choice of the ﬁnal aggregation was justiﬁed by a speciﬁc attitude towards the two basic evaluation “points of view”: the quality information and the performance of the prototypes. It was extremely important for the client to be able to summarise the correspondence between an aggregation procedure and an operational attitude because it enabled them to better argue against the possible objections of their client. As already reported, the client considered the approach to be useful because “every activity was justiﬁed”.

A ﬁnal question that arose while the ﬁnal recommendation was elaborated was whether it would be possible to provide a numerical representation of the values obtained by the oﬀers and of the ﬁnal ranking. It was soon clear that the question originated from the will of the ﬁnal client to be able to negotiate with the AQ manager on a monetary basis, since it was expected that he would introduce the cost dimension into the ﬁnal decision (a dimension relevant to the negotiation, but not to the client’s perception of the problem). For this purpose an appendix was included in the ﬁnal recommendation where the following was emphasised:
- it is possible to give a numerical representation both to the ordinal measurement obtained using the six quality attributes and to the ﬁnal ranking obtained using the seven criteria, but it was meaningless to use such a numerical representation in order to establish implicit or explicit trade-oﬀs with a cost criterion;
- it is possible to compare the result with a cost criterion following two possible approaches: 1. either induce an ordinal scale from the cost criterion and then, using an ordinal aggregation procedure, construct a ﬁnal choice (the negotiation should then concentrate on deﬁning the importance parameters, the thresholds etc.); 2. or establish a value function of the client using one of the usual protocols available in the literature (see also Chapter 6) to obtain the trade-oﬀs between the

quality evaluations, the performance evaluations and the cost criterion (the negotiations should then concentrate on a value function);
- the team of analysts was also available to conduct this part of the decision aiding process if the client desired it.

The ﬁnal client was very satisﬁed with the ﬁnal recommendation and was also able to understand the reply about the numerical representation. He nevertheless decided to conduct the negotiations with the AQ manager personally, and so the team of analysts terminated its task with the delivery of the ﬁnal recommendation. A ﬁnal consideration can be the fact that there was certainly space (but no time) to experiment with more variants and methods for the aggregation procedure and the construction of the ﬁnal recommendation. Valued relations, interval comparisons using extended preference structures, valued similarity relations, dynamic assignment of alternatives to classes and other innovative techniques were considered too “new” by the client, who already considered the use of an approach diﬀerent from the usual grid and weighted sum a revolution (compared with the company’s standards).

9.4 Conclusions

Concluding this chapter we may try to summarise the lessons learned in this real experience of decision support. The most important lesson perhaps concerns the process dimension of decision support. What the client needed was continuous assistance and support during the decision process (the management of the call for tenders) enabling them to understand their role, the expected results and the way to provide a useful contribution, thus implying a process of adaptation guided by reciprocal learning for the client and the analyst. If the support had been limited to answering the client’s demand on how to deﬁne a global evaluation (based on the weighted sum of their notes on the products) we may have provided them with an excellent multi-attribute value model that would have been of no interest for their problem. A careful analysis of the problem situation, a consensual problem formulation, a correct deﬁnition of the evaluation model and an understandable and legitimated ﬁnal recommendation are the products that we have to provide in a decision aiding process. This is not against multi-attribute value based methods, which in other decision aiding processes can be extremely useful, but an emphasis on a process based decision aiding activity. In their view, the fact of being able to aggregate the ordinal information available in a correct and meaningful way was more than satisfactory, as they report in their ex-post remarks: “....pointed out that it was not necessary to always use ratio scales and weighted sums, as we thought before, but that it was possible to use judgements and aggregate them....”.

A second lesson learned concerns the “ownership” of the ﬁnal recommendation. By this we want to indicate the fact that the client will be much more conﬁdent in the result and much more ready to apply it if he feels that he owns the result, in the sense that it is a product of his own convictions, values, experience, computations, simulations and whatever else. Such ownership can be achieved if the client not only participates in elaborating the parameters of the evaluation model, but actually builds the model with the help of the analyst (which has been the case in our experience). Although the speciﬁc case may be considered exceptional (due to the speciﬁc dimension of the evaluation model and the double role of the client, being analyst for another client at the same time) we claim that it is always possible to include the client in the construction of the evaluation model in a way that allows him to feel responsible and to own the ﬁnal recommendation. Such “ownership” greatly eases the legitimisation of the recommendation since it is not just the “advice recommended by the experts who do not understand anything”. It might be interesting to notice that a customised implementation of the model on the tools to which the client is accustomed (in our case the company spreadsheet) greatly improves the acceptance and legitimisation of the evaluation model.

A third lesson concerns the key issue of meaningfulness. The construction of the evaluation model must obey two dimensions of meaningfulness. The ﬁrst is a theoretical and conceptual one and refers to the necessity to manipulate the information in a sound and correct way. The second is a practical one and refers to the necessity to manipulate the information in a way understandable by the client and corresponding to his intuitions and concerns. It is possible that these two dimensions conﬂict. Therefore, the evaluation model has to satisfy both requirements.

A fourth lesson concerns the importance of the distinction between measures and preferences. The ﬁrst refer to observations made on the set of alternatives either through “objective” or through “subjective” measures. The second refer to the client’s values. Knowing that a software has n function points while another has m function points does not imply any particular preference between them. Moving from one to the other might be possible, but it is not obvious and has to be carefully studied.

A ﬁfth lesson concerns the deﬁnition of the aggregation procedure in the evaluation model. The previous chapters of this book provide enough evidence that universal methods for aggregating preferences and/or measures do not exist. Therefore, the choice of the aggregation procedures included in an evaluation model is always subjective and depends on the problem situation; such choices have to be carefully studied and justiﬁed. The existence of clear and sound theoretical results for the use of speciﬁc preference modelling tools, preference and/or measure aggregation procedures and other modelling tools deﬁnitely helps such a process.

A sixth lesson is about uncertainty. Even when the available information is considered reliable, uncertainty may appear (as in our case), created by the method itself. Moreover, uncertainty can appear in a very qualitative way and not necessarily in the form of an uncertainty distribution. It is necessary to have a large variety of uncertainty representation tools in order to include the relevant one in the evaluation model. We hope that the case study oﬀered an introduction to this problem.

Last, but not least, we emphasise the signiﬁcant number of open theoretical problems the case study highlights (interval evaluation, ordinal measurement, hierarchical measurement, hesitation modelling, ordinal value theory etc.).

Appendix A

The basic concepts adopted in the procedure used (based on ELECTRE TRI) are the following.
• A set A of alternatives ai, i = 1···m.
• A set G of criteria gj, j = 1···n.
• A set C of categories cλ, λ = 1···t+1.
• A set P of proﬁles ph, h = 1···t, such that the proﬁle ph is the upper bound of category ch and the lower bound of category ch+1.
• Each criterion gj is equipped with an ordinal scale Ej with degrees e^l_j, l = 1···k.
• ph being a collection of degrees, ph = ⟨e^h_1 ··· e^h_n⟩, such that if e^h_j belongs to proﬁle ph, then e^{h+1}_j cannot belong to proﬁle ph−1.
• A relative importance wj (usually normalised in the interval [0, 1]) is attributed to each criterion gj.
• A set of preference relations Pj, Ij for each criterion gj such that:
– ∀x ∈ A: Pj(x, eh) ⇔ gj(x) ≻ e^h_j;
– ∀x ∈ A: Pj(eh, x) ⇔ gj(x) ≺ e^h_j;
– ∀x ∈ A: Ij(x, eh) ⇔ gj(x) ≈ e^h_j;
≻, ≈ being induced by the ordinal scale associated to criterion gj.
• An outranking relation S ⊂ (A × P) ∪ (P × A), where s(x, y) should be read as “x is at least as good as y”.

The procedure works in two basic steps.

1. Establish the outranking relation on the basis of the following rule:
s(x, y) ⇔ C(x, y) and not D(x, y)
where:
∀x ∈ A, y ∈ P: C(x, y) ⇔ Σj∈G± wj ≥ c and (Σj∈G+ wj ≥ Σj∈G− wj);
∀y ∈ A, x ∈ P: C(x, y) ⇔ (Σj∈G± wj ≥ c and Σj∈G+ wj ≥ Σj∈G− wj) or (Σj∈G+ wj > Σj∈G− wj);
∀(x, y) ∈ (A × P) ∪ (P × A): not D(x, y) ⇔ Σj∈G− wj ≤ d and ∀gj not vj(x, y);
where:
– G+ = {gj ∈ G : Pj(x, y)};
– G− = {gj ∈ G : Pj(y, x)};
– G= = {gj ∈ G : Ij(x, y)};
– G± = G+ ∪ G=;
– c: the concordance threshold, c ∈ [0, 1];
– d: the discordance threshold, d ∈ [0, 1];
– vj(x, y): veto, expressed on criterion gj, of y on x.

2. When the relation S is established, assign any element ai on the basis of the following rules.
2.1 pessimistic assignment:
– ai is iteratively compared with pt ··· p1;
– as soon as s(ai, ph) is established, assign ai to category ch+1.
2.2 optimistic assignment:
– ai is iteratively compared with p1 ··· pt;
– as soon as s(ph, ai) ∧ ¬s(ai, ph) is established, assign ai to category ch.
The pessimistic procedure ﬁnds the proﬁle for which the element is not the worst. The optimistic procedure ﬁnds the proﬁle against which the element is surely the worse. If the optimistic and pessimistic assignments coincide, then no uncertainty exists for the assignment. Otherwise, an uncertainty exists and should be considered by the user.

In order to better understand how the procedure works, consider the following example.
• Four criteria g1···g4, of equal importance (∀j wj = 1/4), each of them equipped with an ordinal scale A ≻ B ≻ C ≻ D.
• Three alternatives: a1 = ⟨D, B, B, C⟩, a2 = ⟨B, B, C, C⟩, a3 = ⟨A, B, B, B⟩.
• Two proﬁles p1 = ⟨C, C, C, C⟩ and p2 = ⟨A, A, B, B⟩, deﬁning three categories: unacceptable (U), acceptable (A) and good (G) (p2 being the minimum proﬁle for category G, p1 being the minimum proﬁle for category A).
• Further on, ﬁx c = 0.75, d = 0.40 and ∀j: vj(x, y) ⇔ gj(x) = D.

With such information it is possible to establish the outranking relation, which is S = {(p2, a1), (p2, a2), (p2, a3), (a2, p1), (a3, p1), (p2, p1)}. The reader can easily check that the pessimistic assignment puts alternative a1 in category U and alternatives a2 and a3 in category A, while the optimistic assignment puts all three alternatives in category A.
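The example can be replayed mechanically. The sketch below is our own simpliﬁed reading of the rules (equal weights folded into the concordance test, and the special profile-side clause omitted, since neither changes anything on this data); it reproduces the assignments stated above:

```python
# ELECTRE-TRI-like assignment on the Appendix A example:
# 4 criteria of weight 1/4, scale A > B > C > D, c = 0.75, d = 0.40,
# veto when the object being tested carries an unacceptable value D.

SCALE = {"A": 3, "B": 2, "C": 1, "D": 0}
W = [0.25, 0.25, 0.25, 0.25]
C_THRESH, D_THRESH = 0.75, 0.40

def outranks(x, y):
    """s(x, y): x is at least as good as y (simplified reading)."""
    better = sum(w for w, a, b in zip(W, x, y) if SCALE[a] > SCALE[b])
    equal = sum(w for w, a, b in zip(W, x, y) if SCALE[a] == SCALE[b])
    worse = sum(w for w, a, b in zip(W, x, y) if SCALE[a] < SCALE[b])
    concordance = (better + equal >= C_THRESH) and (better >= worse)
    veto = any(a == "D" for a in x)          # vj(x, y) <=> gj(x) = D
    no_discordance = (worse <= D_THRESH) and not veto
    return concordance and no_discordance

profiles = {"p1": "CCCC", "p2": "AABB"}      # lower bounds of A and G
cats = ["U", "A", "G"]

def pessimistic(a):
    for h, p in ((2, "p2"), (1, "p1")):      # compare with p_t ... p_1
        if outranks(a, profiles[p]):
            return cats[h]                    # category above profile p_h
    return cats[0]

def optimistic(a):
    for h, p in ((1, "p1"), (2, "p2")):      # compare with p_1 ... p_t
        if outranks(profiles[p], a) and not outranks(a, profiles[p]):
            return cats[h - 1]                # category below profile p_h
    return cats[2]

for name, a in {"a1": "DBBC", "a2": "BBCC", "a3": "ABBB"}.items():
    print(name, pessimistic(a), optimistic(a))  # a1: U/A, a2: A/A, a3: A/A
```

The divergence for a1 (pessimistic U, optimistic A) is exactly the endogenous interval evaluation discussed in section 9.3.3: the veto on the D value blocks a1 from outranking p1, while p1 does not reach the concordance threshold against a1 either.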

Appendix B

The complete list of the attributes used in the evaluation model (sub-attributes in parentheses):

1 LAND-BASE MANAGEMENT
- User interface (Graphics type; Graphics engine adequacy; Interface personalisation)
- Functionality (Planes analysis functions; Topological connectivity functions; Graphical rendering functions)
- Development environment (Software conﬁguration management; Code browsing; Libraries personalisation; Development support tools; Debugging support tools; Code documentation)
- Administration tools (User administration functions; Performance data collection)
- Documentation Quality (Documentation support tools: availability, adequacy; Documentation support type; Information retrieval ease; Contextual help; Completeness)
- Interoperability
- Integration between Land-base products and the Spatial Data Manager (Interfaces integration; Data sharing)
- Integration among Land-base products (Vectorial data products integration; Descriptive data products integration; Raster data products integration; Digital Terrain Model products integration)
- Work ﬂow connection

2 GEOMARKETING
- User interface (Graphics type; Graphics engine adequacy; Interface personalisation)
- Functionality (Planes analysis functions; Graphical rendering functions)
- Development environment (Software conﬁguration management; Code browsing; Libraries personalisation; Development support tools; Debugging support tools; Code documentation)
- Administration tools (Availability; Adequacy)
- Documentation Quality (Documentation support tools; Documentation support type; Information retrieval ease; Contextual help; Completeness)
- Interoperability
- Integration between Geomarketing products and the Spatial Data Manager (Interfaces integration; Data sharing)
- Integration among Geomarketing products (Vectorial data products integration; Descriptive data products integration; Raster data products integration)

3 PLANNING, DESIGN, IMPLEMENTATION AND OPERATING SUPPORT
- User interface (Graphics type; Graphics engine adequacy; Interface personalisation)
- Functionality (Planes analysis functions; Topological connectivity functions; Graphical rendering functions; Network schema creation)
- Development environment (Software conﬁguration management; Code browsing; Libraries personalisation; Development support tools; Debugging support; Code documentation)
- Administration tools (User administration functions; Performance data collection)
- Documentation Quality (Documentation support tools: availability, adequacy; Documentation support type; Information retrieval ease; Contextual help; Completeness)
- Interoperability
- Integration between this process products and the Spatial Data Manager (Interfaces integration; Data sharing)
- Integration among this process products (Vectorial data products integration; Descriptive data products integration; Raster data products integration; Digital Terrain Model products integration)
- Work ﬂow connection

4 DIAGNOSIS SUPPORT AND CUSTOMER CARE
- User interface (Graphics type; Graphics engine adequacy; Interface personalisation)
- Functionality (Planes analysis functions; Topological connectivity functions; Graphical rendering functions; Network schema creation)
- Development environment (Software conﬁguration management; Code browsing; Libraries personalisation; Development support tools; Debugging support; Code documentation)
- Administration tools (Availability; Performance data collection)
- Documentation Quality (Documentation support tools: availability, adequacy; Documentation support type; Information retrieval ease; Contextual help; Completeness)
- Interoperability
- Integration between this process products and the Spatial Data Manager (Interfaces integration; Data sharing)
- Integration among this process products (Vectorial data products integration; Descriptive data products integration; Raster data products integration)

5 SPATIAL DATA MANAGER
- Data base properties (Fundamental properties; Transaction typology support; Data / Function association; Client data access libraries)
- Basic properties of the Spatial Data Manager (Data model; Data management; Data integration; Spatial operators; Coordinate systems; Vectorial data continuous management; Independence from features structure)
- Special properties of the Spatial Data Manager (Public libraries for feature manipulation; Structured Query Language to access descriptive data; Data sharing constraints; Feature versioning; Feature life-cycle management; Data distribution)
- Integration between the Spatial Data Manager and the Data Layer (Server data access libraries; Database distribution; Integration with Oracle; Integration with Unix and MVS relational databases; Integration with Oracle Designer 2000; Logical scheme import capability; Spatial Data Manager platform)
- Data administration tools (Database access control; Backup)

6 SOFTWARE QUALITY
- Robustness
- Maturity
- Easiness of installation and maintenance

7 PERFORMANCES
- Single transaction under diﬀerent data volume
- Data Manager under diﬀerent operation typology
- Data Manager under diﬀerent concurrent transactions
- Graphical interfaces performances


10 CONCLUSION

10.1 Formal methods are all around us

The aim of this book was to provide a critical introduction to a number of "formal decision and evaluation methods". By this, we mean a set of explicit and well-defined rules to collect, assess and process information in order to make recommendations in decision and/or evaluation processes. Such methods emanate from many different disciplines (Political Science, Economics, Operational Research, Decision Theory, Computer Science, Statistics, Education Science, Engineering, etc.) and are used to support numerous kinds of decision or evaluation processes. Although these methods may not be entirely formalised, their underlying logic should be explicit, contrary to, say, astrology or graphology. It is not an overstatement to say that nowadays nearly everyone is, implicitly or explicitly, confronted with such methods. We briefly summarise below the main methods presented in this book and the difficulties that have been encountered.

"Following a democratic election Mr. X has been elected"

As citizens, we hopefully have to cast several kinds of votes. As mentioned in chapter 2, elections are governed by "rules" that are very far from being innocuous. Similar votes may well lead to very different results depending on the rules used to process them. Such "electoral rules" contribute towards shaping the entire political debate in a country and, thus, influence the type of democracy we live in. Therefore, under a slightly different electoral system, Mr. X might not have been elected.

"Your child has a GPA of 9.54. Therefore we cannot allow him to continue with this programme"

Our early life at school was governed to a large extent by the grades we obtained, the exams we passed or not. It is likely that the present professional life of many readers is still governed by some type of formal evaluation method that somehow uses "grades" (this is clearly the case for most academics). In chapter 3 we saw that a "grade", although being a very familiar concept, is in fact a complex evaluation model. Not surprisingly, the aggregation of such evaluations is not an obvious task. Therefore, the decision made concerning your child might well
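The claim that the same ballots can crown different winners under different rules can be made concrete with a small sketch. The eleven-voter electorate below is invented for the illustration; it shows plurality and Borda's rule disagreeing, with plurality rejecting the candidate who wins every pairwise duel.

```python
from collections import Counter

# A hypothetical electorate of 11 voters; each ballot ranks candidates best-first.
ballots = [("A", "B", "C")] * 5 + [("B", "C", "A")] * 4 + [("C", "B", "A")] * 2

# Uninominal (plurality) rule: only first places count.
plurality = Counter(b[0] for b in ballots)

# Borda's rule: each candidate scores the number of candidates ranked below it.
borda = Counter()
for b in ballots:
    for rank, cand in enumerate(b):
        borda[cand] += len(b) - 1 - rank

def beats(x, y):
    """True if a strict majority of ballots rank x before y."""
    return sum(1 for b in ballots if b.index(x) < b.index(y)) > len(ballots) / 2

print(plurality.most_common(1)[0][0])          # A: the plurality winner
print(borda.most_common(1)[0][0])              # B: the Borda winner
print(all(beats("B", z) for z in ("A", "C")))  # True: B also wins every duel
```

Nothing in the ballots changed between the three counts; only the aggregation rule did, which is exactly why "electoral rules" are far from innocuous.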

have been significantly different depending on the grading policy and/or correction habits of some teachers, on the fact that his exams were corrected late at night, or on the way his various grades were aggregated.

"Things are going well since the 'well-being' index in our country raised by more than 10% over the last three years"

Statisticians have elaborated an incredible number of indicators or indices aiming at capturing many aspects of reality (including the quality of the air we breathe, the richness of a country, its state of development, the quality of our social security system, etc.) by using numbers. Not only are our newspapers full of these kinds of figures but they are also routinely used to make important political or economic decisions. In chapter 4, we saw that such "measures" should not be confounded with the familiar "measurement operations" in Physics. The resulting numbers do not appear to be measured on some well-defined type of scale. Their properties are sometimes intriguing and they surely should be manipulated with care. Therefore, claiming that the 'well-being' index has increased by 10% gives, at best, a very crude indication.

"Calculations show that it is not profitable to equip this hospital with a maternity department"

The quality of the roads on which we drive, the way our electricity is produced, the tariffing of public transportation, the safety regulations applied to factories near our homes, depend on particular ways of assessing and summarising the costs and the benefits of alternative projects. Cost-benefit analysis evaluates such projects using money as a yardstick. This raises many difficulties outside simple cases: how to convert the various consequences of a complex project into monetary units (e.g. the pricing of a number of statistical "delivery incidents" due to a longer transportation time for some mothers), how to cope with equity considerations in the distribution of costs and benefits, how to take the distribution in time of these consequences into account? In chapter 5 we saw that cost-benefit analysis can hardly claim to always solve all these difficulties in a satisfactory manner. The apparently objective calculations invoked to refuse the creation of a maternity department in our hospital are highly dependent on numerous debatable hypotheses. It is not unlikely that other reasonable hypotheses may have led to an opposite decision.

"Based on numerous tests it appears that the 'best buy' is car Z"

How to take several, generally conflicting, criteria into account when making a decision? This area, known as Multiple Criteria Decision Making (MCDM), is the subject of chapter 6. We showed that, in most cases, the analyst has the choice between several "aggregation strategies" that could lead to different results. Each of these strategies requires the assessment of more or less rich and precise "inter-criteria" information. Since such assessments shape preference information as much as they collect it, apparently familiar concepts, like the "importance" of criteria, are shown to have little (if any) clear meaning outside a well-defined aggregation strategy. Furthermore, because each potential buyer has his own preferences and interests and there are many different and yet reasonable ways to aggregate them, the very notion of a 'best buy' is highly debatable.
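The sensitivity of aggregated grades and "best buys" to the chosen weights can be shown in a few lines. The two students, their marks and the two grading policies below are invented for the example; both policies look defensible, yet they rank the students in opposite orders.

```python
# Two hypothetical students, three courses, marks on a 0-20 scale.
grades = {"Xavier": [12, 16, 10], "Yvonne": [14, 11, 15]}

def aggregate(marks, weights):
    """Weighted average of the marks, rounded for display."""
    return round(sum(m * w for m, w in zip(marks, weights)), 2)

equal_weights = [1 / 3, 1 / 3, 1 / 3]
second_course_heavy = [0.2, 0.6, 0.2]  # another defensible grading policy

for name, marks in grades.items():
    print(name, aggregate(marks, equal_weights),
          aggregate(marks, second_course_heavy))
# Equal weights put Yvonne first (13.33 vs 12.67); the second policy
# puts Xavier first (14.0 vs 12.4): same grades, opposite decision.
```

Neither policy is "wrong"; the point is that the decision is an artefact of the aggregation strategy as much as of the marks themselves.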

"Relax, our new camera will choose the 'optimal focus' for you"

Our washing machines, our cameras, our TV sets often take decisions on their own, concerning the amount of water or energy to use, the right focus, the clarity of an image, the supposedly "optimal" tuning of channels. The "decision modules" underlying such automatic decisions were studied in chapter 7. We saw that they are based on concepts and techniques that are very similar to the ones examined in chapter 6 and, thus, raise similar problems and questions. Contrary to the situation in chapter 6 however, they are used in real time without human intervention after the implementation stage. This raises new difficulties and issues. Therefore, relying on the automatic decisions taken by the new camera might not always be your best option.

"Given what you told me about your preferences and beliefs, you should not invest in this project in view of its expected utility"

Standard decision analysis techniques (see e.g. Raiffa 1970) are often seen as synonymous with decision support methods in risky and/or uncertain situations. Using a real example in electricity production planning, we showed, in chapter 8, why the implementation of these standard techniques may not be as straightforward as is often believed. Besides possible computational problems, there might be more than one way to assess preferences and beliefs and to combine them in order to make a recommendation, e.g. in the assessment and revision of (subjective) probability distributions in highly ambiguous environments and in situations involving a long period of time. Alternative tools, such as possibilities, belief functions, fuzzy sets and other kinds of non-additive uncertainty measures, may appear as good contenders although their theoretical basis may be seen as less firm than the one underlying standard Bayesian analysis. Furthermore, important considerations, like the dynamic consistency of choices and the aggregation of consequences over time, were shown to be largely open questions.

Whether we like it or not, it seems difficult nowadays to escape from formal decision and evaluation methods; giving an exhaustive list of such methods is an enormous task. We may ignore them, but the authors of this book believe that it may be interesting and profitable to give them a closer look. The real case-study presented in chapter 9 has shown that their proper use can have a significant impact on real complex decision or evaluation processes.

10.2 What have we learned?

Although the methods examined in this book are apparently very different and emanate from various disciplines, they appear to have a lot in common. This should not be much of a surprise since these methods have the common objective of providing recommendations in complex decision and evaluation processes. What might be slightly more surprising is that most of these methods and tools are plagued with many difficulties. Let us try to summarise the main findings and problems encountered in the preceding chapters here.

• Objective and scope of formal decision/evaluation models

– Formal decision and evaluation models are implemented in complex decision/evaluation processes. Using them rarely amounts to solving a well-defined formal problem. Their usefulness not only depends on their intrinsic formal qualities but also on the quality of their implementation (structuration of the problem, communication with actors involved in the process, transparency of the model, etc.). Having a sound theoretical basis is therefore a necessary but insufficient condition to their usefulness (see chapter 9).

– The objective of these models may be different from recommending the choice of a "best" course of action. More complex recommendations, e.g. ranking the possible courses of action or comparing them to standards, are also frequently needed (see chapters 3, 4, 6 and 8). Moreover, the usefulness of such models is not limited to the elaboration of several types of recommendations. When properly used, they may provide support at all steps of a decision process (see chapter 9).

• Collecting data

– All models imply collecting and assessing "data" of various types and qualities and manipulating these data in order to derive conclusions that will hopefully be useful in a decision or evaluation process. This more or less inevitably implies building "evaluation models" trying to capture aspects of "reality" that are difficult to define with great precision (see chapters 3, 4, 6 and 7).

– Implementing a decision/evaluation model only rarely implies capturing aspects of reality that can be considered as independent of the model (see chapters 6 and 9). The use of evaluation models greatly contributes to shaping and transforming the "reality" that we would like to "measure".

– The numbers resulting from such "evaluation models" often appear as constructs that are the result of multiple options. The choice between these various possible options is only partly guided by "scientific considerations" (see chapters 3, 4, 6 and 7).

– The properties of the numbers manipulated in such models should be examined with care. These numbers should not be confounded with numbers resulting from classical measurement operations in Physics. They are measured on scales that are difficult to characterise properly. Furthermore, more often than not, they are plagued with imprecision, ambiguity and/or uncertainty. Therefore, using "numbers" may only be a matter of convenience and does not imply that any operation can be meaningfully performed on them (see chapters 3, 4, 6 and 7). At best, these numbers seem to give an order of magnitude of what is intended to be captured (see chapters 3, 4 and 6).

• Aggregating evaluations

– Aggregating the results of complex "evaluation models" is far from being an easy task. Although many aggregation models amount to summarising these numbers into a single one, this is not the only possible aggregation strategy (see chapters 3, 4, 5 and 6).

– The pervasive use of simple tools such as weighted averages may lead to disappointing and/or unwanted results. The use of weighted averages should in fact be restricted to rather specific situations that are seldom met in practice (see chapters 3, 4 and 6).

– Devising an aggregation technique is not an easy task. Apparently reasonable principles can lead to a model with poor properties. A formal analysis of such models may therefore prove of utmost importance (see chapters 2, 5 and 6).

– Aggregation techniques often call for the introduction of "preference information". The type of aggregation model that is used greatly contributes to shaping this information. Assessment techniques, therefore, not only collect but shape and/or create preference information (see chapter 6).

– Intuitive preference information, e.g. concerning the relative importance of several points of view, may be difficult to interpret within a well-defined aggregation model (see chapter 6).

– Deriving robust conclusions on the basis of such aggregation models requires a lot of work and care. The search for robust conclusions may imply analyses much more complex than simple sensitivity analyses varying one parameter at a time in order to test the stability of a solution (see chapters 6 and 8).

• Dealing with imprecision, ambiguity and uncertainty

– In order to allow the analyst to derive convincing recommendations, the model should explicitly deal with imprecision, uncertainty and inaccurate determination. Many different tools can be envisaged to model the preferences of an actor in a decision/evaluation process (see chapters 2 and 6). Modelling all these elements into the classical framework of Decision Theory using probabilities may not always lead to an adequate model. It is not easy to create an alternative framework in which problems such as dynamic consistency or respect of (first order) stochastic dominance are dealt with in a satisfactory manner (see chapters 6 and 8).

We saw that the methods reviewed in chapters 2 to 8 are far from being without problems. Indeed, these chapters can be seen as a collection of the defects of these methods. Some readers may think that, faced with such evidence, this type of method should be abandoned and that "intuition" or "expertise" are not likely to do much worse, at lower cost and with less effort. In our opinion, this would be a totally unwarranted conclusion. It is the firm belief and conviction of the authors

that the use of formal decision and evaluation tools is both inevitable and useful. Three main arguments can be proposed to support this claim.

First, whenever "intuition" or "expertise" has been subjected to close scrutiny, it has been more or less always shown that such types of judgements are based on heuristics that are likely to neglect important aspects of the situation and/or are affected by many biases (see the syntheses of Kahneman, Slovic and Tversky 1981, Hogarth 1987, Bazerman 1990, Russo and Schoemaker 1989, Poulton 1994, Thaler 1991).

Second, formal methods have a number of advantages that often prove crucial in complex organisational and/or social processes:

• they promote communication between the actors of a decision or evaluation process by offering them a common language;

• they require building models of certain aspects of "reality"; this implies concentrating efforts on crucial matters;

• they lend themselves easily to "what-if" types of questions. These exploration capabilities are crucial in order to devise robust recommendations.

Although these advantages may have little weight compared to the obvious drawbacks of formal methods in terms of effort involved, money and time consumed in some situations (e.g. a very simple decision/evaluation process involving a single actor), they appear to us fundamental in most social or organisational processes (see chapter 9). In such processes, formal methods are often indispensable structuration instruments. Moreover, it should not be forgotten that formal tools lend themselves more easily to criticism and close examination than other kinds of tools.

Third, casual observation suggests that there is an increasing demand for such tools in various domains (going from executive information systems, decision support systems and expert systems to standardised evaluation tests and impact studies). Although many companies use tools such as graphology and/or astrology in order to select between applicants for a given position, we are more than inclined to say that the use of more formal methods could improve such selection processes (let alone on issues such as fairness and equity) in a significant way. Similarly, the introduction of more formal evaluation tools in the evaluation of public policies, laws and regulations (e.g. fiscal policy, policy against crime and drugs, policy towards the carrying of guns, the establishment of environmental standards, etc.), an area in which they are strikingly absent in many countries, would surely contribute to a more transparent and effective government.

It is our belief that the introduction of such tools may have quite a beneficial impact in many areas in which they are not commonly used. We would thus answer a clear and definite yes to the question of whether formal decision and evaluation tools are useful.

10.3 What can be expected?

Our plea for the introduction of more formal decision and evaluation tools may appear paradoxical in view of the content of this book. Have we been overly critical then? Certainly not. Indeed, a thorough critical examination of each of the methods covered in chapters 2 to 8 could be the subject of an entire book; our willingness to keep mathematics and formalism to the lowest possible level has not allowed us to explore many technical details and difficulties. The fact that many decision and evaluation tools are plagued with serious difficulties is troublesome. It should not be unexpected however, unless one believes that there is a single "best way" to provide support in each type of decision or evaluation process. We doubt that this is a reasonable belief. The paradox between our conviction in the usefulness of formal methods and the content of this book is only apparent and results from a misunderstanding: the very way in which a "good" formal decision/evaluation method is defined is nothing but clear. Two main, non-exclusive, paths have often been suggested for this purpose. None of them appear totally convincing to us.

• the engineering route that amounts to saying that a method is good because "it works", i.e. has been applied several times in real-world problems and has been well accepted by the actors in the process. Although we would definitely not favour a method that would be unable to pass such a test, we doubt that the "engineering" argument is sufficient to define what would distinguish "good" formal decision or evaluation methods.

First, it is important to remember that the "quality" of the support provided by a formal tool is very difficult to separate from considerations linked to the implementation of the method. The formal tools used by an analyst are implemented in decision or evaluation processes that may be highly complex (involving many different actors, lasting a long time and being governed by complex rules and/or regulations). Supporting a decision or an evaluation process should not be confounded with solving a "well-defined formal problem". Although it may make sense to associate a "good" method for solving it to such a problem, supporting real decision and evaluation processes should not be confounded with this formal exercise. The resulting decision/evaluation aid process is therefore conditioned by many factors outside the realm of a formal method: the quality of the structuration of the problem, of communication with stakeholders, the timing and costs of the study, the availability of user-friendly softwares, etc. These are elements of utmost importance in the quality of a decision/evaluation aid process.

Second, in practice, it is often difficult to know whether the proposed model "worked" or not. As should be apparent from chapter 9, the very presence of analysts, the questions they raised, the type of reasoning they have promoted, could have had a significant impact on the decision process, even though the final decision is at variance with the recommendations derived from the model. Should we say then that the method has "worked" or not?

A close variant of the engineering route could be called the naive route.

It amounts to saying that a formal tool is adequate if it consistently leads to "good" decisions. The literature on "decision" (see Raiffa 1970, Keeney, Hammond and Raiffa 1999, Russo and Schoemaker 1989) has always insisted on the fact that "good decisions do not necessarily lead to good outcomes". This literature shows that it is very difficult to define what would constitute a "good decision" a priori (good in which state of nature? good for whom? good according to what criteria? at what moment in time? etc.) and that the essential idea is to promote a good "decision process". Analysts implementing formal decision and evaluation tools are in a position similar to that of an engineer. Contrary to most engineers, however, these "decision engineers" often lack clear criteria for appreciating the "success" or "failure" of their models.

• the rational route which amounts to saying that a method is adequate if it is backed by a sound theory of "rational choice". Although we find theories most useful, the criteria for separating sound from unsound theories of "rational choice" do not appear obvious to us. A striking example of this difficulty can be found in the area of decision under risk and uncertainty. While, until the beginning of the eighties, expected utility theory was considered almost unanimously as the "rational theory of choice under risk", the proliferation of alternative theories since then (see e.g. Kahneman and Tversky 1979, Quiggin 1982, Loomes and Sugden 1982, Machina 1982, Yaari 1987, Fishburn 1988, Jaffray 1988, Jaffray 1989, Gilboa and Schmeidler 1989, Schmeidler 1989, Wakker 1989, Nau 1995, Dubois, Fargier and Prade 1997), fostered by the result of numerous empirical experiments (see e.g. Allais 1953, McCrimmon and Larsson 1979, Kahneman and Tversky 1979, Hershey, Kunreuther and Schoemaker 1982, McCord and de Neufville 1982), presently results in a very complex situation in which it is not easy to discriminate between theories both from an empirical (see e.g. Abdellaoui and Munier 1994, Harless and Camerer 1994, Hey and Orme 1994, Carbone and Hey 1995, Sopher and Gigliotti 1993) or a normative point of view (see e.g. Hammond 1988, Machina 1989, McClennen 1990, Nau and McCardle 1991). This is true even though most, if not all, of these theories have been axiomatically characterised (i.e. a set of conditions is known that completely characterises the proposed choice or evaluation models). Having axioms is certainly useful in order to compare theories but the "rational" content of the axioms and their interpretation remain much debated. Furthermore, the relation between the formal axiomatic theory and the assessment technologies derived from it are far from being obvious (see e.g. Johnson and Schkade 1989, Bouyssou 1984).

At this point it should be apparent that research on formal decision and evaluation methods should not be guided by the hope of discovering models that would be ideal under certain types of circumstances. Can something be done then? In view of the many difficulties encountered with the models envisaged in this book and the many fields in which no formal decision and evaluation tools are used, we do think that this area will be rich and fertile for future research.
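The substance of this debate is easy to exhibit numerically. In the sketch below (the project's payoffs, probabilities and the two utility functions are all hypothetical), the beliefs are held fixed; only the assessed utility function changes, and the expected-utility recommendation flips.

```python
import math

# Hypothetical risky project: gain 100 with probability 0.6, lose 50 otherwise.
# The status quo yields 0 for sure, and u(0) = 0 for both utilities below.
lottery = [(0.6, 100.0), (0.4, -50.0)]

def expected_utility(lottery, u):
    return sum(p * u(x) for p, x in lottery)

risk_neutral = lambda x: x                     # linear utility
risk_averse = lambda x: 1 - math.exp(-x / 50)  # a concave (CARA-style) utility

print(expected_utility(lottery, risk_neutral))  # 40.0 > 0: invest
print(expected_utility(lottery, risk_averse))   # about -0.17 < 0: do not invest
```

Both assessments are internally coherent; which one faithfully represents the decision maker is exactly the kind of question the assessment technologies discussed above must answer.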


Freed from the idea that we will discover the method, we can, more modestly and more realistically, expect to move towards:

• structuring tools that will facilitate the implementation of formal decision and evaluation models in complex and conflictual decision processes;

• flexible preference models able to cope with data of poor or unknown quality, conflicting or lacking information;

• assessment protocols and technologies able to cope with complex and unstable preferences, uncertain trade-offs, hesitation and learning;

• tools for comparing aggregation models in order to know what they have in common and whether one is likely to be more appropriate in view of the quality of the data;

• tools for defining and deriving "robust" conclusions.

To summarise, the future as we see it: structuration methodologies allowing for an explicit involvement and participation of all stakeholders, flexible preference models tolerating hesitations and contradictions, flexible tools for modelling imprecision and uncertainty, evaluation models fully taking incommensurable dimensions into account in a meaningful way, assessment technologies incorporating framing effects and learning processes, exploration techniques allowing one to build robust recommendations (see Bouyssou et al. 1993). Thus, "thanks to rigorous concepts, well-formulated models, precise calculations and axiomatic considerations, we should be able to clarify decisions by separating what is objective from what is less objective, by separating strong conclusions from weaker ones, by dissipating certain forms of misunderstanding in communication, by avoiding the trap of illusory reasoning, by bringing out certain counter-intuitive results" (Roy and Bouyssou 1991).
This "utopia" calls for a vast research programme requiring many different types of research (axiomatic analyses of models, experimental studies of models, clinical analyses of decision/evaluation processes, conceptual reflections on the notions of "rationality" and "performance", production of new pieces of software, etc.). The authors are preparing another book that will hopefully contribute to this research programme. It will cover the main topics that we believe to be useful in order to successfully implement formal decision/evaluation models in real-world processes:

• structuration methods and concepts,

• preference modelling tools,

• uncertainty and imprecision modelling tools,

• aggregation models,

• tools for deriving robust recommendations.
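To fix ideas on what deriving a "robust" conclusion could involve, here is a minimal sketch (the scores and the constraint w2 <= 0.5 are invented for the example). Instead of varying one weight at a time around a reference point, it scans the whole set of weight vectors compatible with what the decision maker accepted.

```python
# Scores of two alternatives on three criteria (all numbers invented).
a, b = [8, 5, 9], [6, 6, 7]

def beats(x, y, w):
    """True if x gets a higher weighted sum than y under weights w."""
    return sum(wi * (xi - yi) for wi, xi, yi in zip(w, x, y)) > 0

# All normalised weight vectors on a 0.1 grid over the simplex.
grid = [(i / 10, j / 10, (10 - i - j) / 10)
        for i in range(11) for j in range(11) if i + j <= 10]

# Suppose the decision maker only committed to w2 <= 0.5.
admissible = [w for w in grid if w[1] <= 0.5]

print(all(beats(a, b, w) for w in admissible))  # True: "a over b" is robust here
print(all(beats(a, b, w) for w in grid))        # False: it fails once w2 > 0.5
```

The conclusion "a is at least as good as b" thus holds for every admissible weighting, not just for one calibrated weight vector, which is a much stronger statement than a one-parameter sensitivity check.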


If we managed to convince you that formal decision and evaluation models are an important topic and that the hope of discovering “ideal” methods is somewhat chimerical, it is not unlikely that you will ﬁnd the next book valuable.

Bibliography

[1] Abbas, M., Pirlot, M. and Vincke, Ph. (1996). Preference structures and cocomparability graphs, Journal of Multicriteria Decision Analysis 5: 81–98.
[2] Abdellaoui, M. and Munier, B. (1994). The 'closing in' method: An experimental tool to investigate individual choice patterns under risk, in B. Munier and M.J. Machina (eds), Models and experiments in risk and rationality, Kluwer, Dordrecht, pp. 141–155.
[3] Adler, H.A. (1987). Economic appraisal of transport projects: A manual with case studies, Johns Hopkins University Press for the World Bank, Baltimore.
[4] Airaisian, P.W. (1991). Classroom assessment, McGraw-Hill, New York.
[5] Allais, M. and Hagen, O. (eds) (1979). Expected utility hypotheses and the Allais paradox, D. Reidel, Dordrecht.
[6] Allais, M. (1953). Le comportement de l'homme rationnel devant le risque : Critique des postulats et axiomes de l'école américaine, Econometrica 21: 503–46.
[7] Armstrong, W.E. (1939). The determinateness of the utility function, The Economic Journal 49: 453–467.
[8] Arrow, K.J. and Raynaud, H. (1986). Social choice and multicriterion decision-making, MIT Press, Cambridge.
[9] Arrow, K.J. (1963). Social choice and individual values, 2nd edn, Wiley, New York.
[10] Atkinson, A.B. (1970). On the measurement of inequality, Journal of Economic Theory 2: 244–263.
[11] Baldwin, J.F. (1979). A new approach to approximate reasoning using a fuzzy logic, Fuzzy Sets and Systems 2: 309–325.
[12] Balinski, M.L. and Young, H.P. (1982). Fair representation, Yale University Press, New Haven.
[13] Bana e Costa, C.A., Ensslin, L., Corrêa, E.C. and Vansnick, J.-C. (1999). Decision support systems in action: Integrated application in a multicriteria decision aid process, European Journal of Operational Research 113: 315–335.
[14] Barbera, S., Hammond, P. and Seidl, C. (eds) (1998). Handbook of utility theory, Vol. 1: Principles, Kluwer, Dordrecht.


[15] Bartels, R.H., Beatty, J.C. and Barsky, B.A. (1987). An introduction to splines for use in computer graphics and geometric modeling, Morgan Kaufmann, Los Altos.
[16] Barzilai, J., Cook, W.D. and Golany, B. (1987). Consistent weights for judgments matrices of the relative importance of alternatives, Operations Research Letters 6: 131–134.
[17] Bazerman, M.H. (1990). Judgment in managerial decision making, Wiley, New York.
[18] Bell, D., Raiffa, H. and Tversky, A. (eds) (1988). Decision making: Descriptive, normative and prescriptive interactions, Cambridge University Press, Cambridge.
[19] Belton, V., Ackermann, F. and Shepherd, I. (1997). Integrated support from problem structuring through alternative evaluation using COPE and V•I•S•A, Journal of Multi-Criteria Decision Analysis 6: 115–130.
[20] Belton, V. and Gear, A.E. (1983). On a shortcoming of Saaty's analytic hierarchies, Omega 11: 228–230.
[21] Belton, V. (1986). A comparison of the analytic hierarchy process and a simple multi-attribute value function, European Journal of Operational Research 26: 7–21.
[22] Béreau, M. and Dubuisson, B. (1991). A fuzzy extended k-nearest neighbor rule, Fuzzy Sets and Systems 44: 17–32.
[23] Bernoulli, D. (1954). Specimen theoriæ novæ de mensura sortis, Commentarii Academiæ Scientiarum Imperialis Petropolitanæ (5, 175–192, 1738), Econometrica 22: 23–36. Translated by L. Sommer.
[24] Bezdek, J., Chuah, S.K. and Leep, D. (1986). Generalised k-nearest neighbor rules, Fuzzy Sets and Systems 18: 237–256.
[25] Blin, M.-J. and Tsoukiàs, A. (1998). Multicriteria methodology contribution to the software quality evaluation, Technical report, Cahier du LAMSADE No 155, Université Paris-Dauphine, Paris.
[26] Boardman, A. (1996). Cost benefit analysis: Concepts and practices, Prentice-Hall, New York.
[27] Boiteux, M. (1994). Transports : Pour un meilleur choix des investissements, La Documentation Française, Paris.
[28] Bonboir, A. (1972). La docimologie, PUF, Paris.
[29] Borda, J.-Ch. (1781). Mémoire sur les élections au scrutin, Comptes Rendus de l'Académie des Sciences. Translated by Alfred de Grazia as "Mathematical derivation of an election system", Isis, Vol. 44, pp. 42–51.
[30] Bouchon, B. (1995). La logique floue et ses applications, Addison Wesley, New York.


[31] Bouchon-Meunier, B. and Marsala, C. (1999). Learning fuzzy decision rules, in J. Bezdek, D. Dubois and H. Prade (eds), Fuzzy sets in approximate reasoning and information systems, Vol. 3 of Handbook of Fuzzy Sets, Kluwer, Dordrecht, chapter 4, pp. 279–304.
[32] Bouyssou, D., Perny, P., Pirlot, M., Tsoukiàs, A. and Vincke, Ph. (1993). A manifesto for the new MCDM era, Journal of Multi-Criteria Decision Analysis 2: 125–127.
[33] Bouyssou, D. and Perny, P. (1992). Ranking methods for valued preference relations: A characterization of a method based on entering and leaving flows, European Journal of Operational Research 61: 186–194.
[34] Bouyssou, D. and Pirlot, M. (1997). Choosing and ranking on the basis of fuzzy preference relations with the 'Min in Favor', in G. Fandel and T. Gal (eds), Multiple criteria decision making – Proceedings of the twelfth international conference, Hagen, Germany, Springer Verlag, Berlin, pp. 115–127.
[35] Bouyssou, D. and Vansnick, J.-C. (1986). Noncompensatory and generalized noncompensatory preference structures, Theory and Decision 21: 251–266.
[36] Bouyssou, D. (1984). Decision-aid and expected utility theory: A critical survey, in O. Hagen and F. Wenstøp (eds), Progress in utility and risk theory, Kluwer, Dordrecht, pp. 181–216.
[37] Bouyssou, D. (1986). Some remarks on the notion of compensation in MCDM, European Journal of Operational Research 26: 150–160.
[38] Bouyssou, D. (1990). Building criteria: A prerequisite for MCDA, in C.A. Bana e Costa (ed.), Readings in multiple criteria decision aid, Springer Verlag, Berlin, pp. 58–80.
[39] Bouyssou, D. (1992). On some properties of outranking relations based on a concordance-discordance principle, in A. Goicoechea, L. Duckstein and S. Zionts (eds), Multiple criteria decision making, Springer-Verlag, Berlin, pp. 93–106.
[40] Bouyssou, D. (1996). Outranking relations: Do they have special properties?, Journal of Multi-Criteria Decision Analysis 5: 99–111.
[41] Brams, S.J. and Fishburn, P.C. (1982). Approval voting, Birkhäuser, Basel.
[42] Brans, J.-P. and Vincke, Ph. (1985). A preference ranking organization method, Management Science 31: 647–656.
[43] Brekke, K.A. (1997). The numéraire matters in cost-benefit analysis, Journal of Public Economics 64: 117–123.
[44] Brent, R.J. (1984). Use of distributional weights in cost-benefit analysis: A survey of schools, Public Finance Quarterly 12: 213–230.
[45] Brent, R.J. (1996). Applied cost-benefit analysis, Elgar, Aldershot, Hants.
[46] Broome, J. (1985). The economic value of life, Economica 52: 281–294.

[47] Carbone, E. and Hey, J.D. (1995). A comparison of the estimates of expected utility and non-expected utility preference functionals, Geneva Papers on Risk and Insurance Theory 20: 111–133.
[48] Cardinet, J. (1986). Evaluation scolaire et mesure, De Boeck, Brussels.
[49] Chatel, E. (1995). Qu'est-ce qu'une note : recherche sur la pluralité des modes d'éducation et d'évaluation, Les Dossiers d'Education et Formations 47: 183–203.
[50] Checkland, P. (1981). Systems thinking, systems practice, Wiley, New York.
[51] Condorcet, marquis de (1785). Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix, Imprimerie Royale, Paris.
[52] Cover, T. and Hart, P. (1967). Nearest neighbor pattern classification, IEEE Transactions on Information Theory IT-13 1: 21–27.
[53] Cross, L.H. (1995). Grading students, Technical Report Series EDO-TM-95-5, ERIC/AE Digest.
[54] Daellenbach, H.G. (1994). Systems and decision making. A management science approach, Wiley, New York.
[55] Dasgupta, P., Marglin, S. and Sen, A. (1972). Guidelines for project evaluation, UNIDO, New York.
[56] Dasgupta, A.K. and Pearce, D.W. (1972). Cost-benefit analysis: Theory and practice, Macmillan, Basingstoke.
[57] Davis, B.G. (1993). Tools for teaching, Jossey-Bass, San Francisco.
[58] de Jongh, A. (1992). Théorie du mesurage, agrégation des critères et application au décathlon, Master's thesis, SMG, Université Libre de Bruxelles, Brussels.
[59] Dekel, E. (1986). An axiomatic characterization of preference under uncertainty: Weakening the independence axiom, Journal of Economic Theory 40: 304–318.
[60] Desrosières, A. (1995). Refléter ou instituer : l'invention des indicateurs statistiques, Technical Report 129/J310, INSEE, Paris.
[61] de Ketele, J.-M. (1982). La docimologie, Cabay, Louvain-La-Neuve.
[62] de Landsheere, G. (1980). Evaluation continue et examens. Précis de docimologie, Labor-Nathan, Brussels.
[63] Dinwiddy, C. and Teal, F. (1996). Principles of cost-benefit analysis for developing countries, Cambridge University Press, Cambridge.
[64] Dorfman, R. (1996). Why benefit-cost analysis is widely disregarded and what to do about it?, Interfaces 26: 1–6.
[65] Drèze, J. and Stern, N. (1987). The theory of cost-benefit analysis, in A. Auerbach and M. Feldstein (eds), Handbook of public economics, Elsevier, Amsterdam, pp. 909–989.

P. New-York. [79] Fishburn. (1990). P. . G. Quarterly Journal of Economics 75: 643–669. [76] Farrell. Essentials of educational measurement. Annales des e Ponts et Chauss´es (8). and Perny. Management Science 40: 1174–1188.D. control engineering and artiﬁcial intelligence.P. Equity considerations in public risks evaluation. 121–128. Wiley. Iob. [77] Fiammengo. I. Maﬃoli. (1970). Possibility theory. H. Prade (eds).BIBLIOGRAPHY 251 [66] Dubois. Risk. Fuzzy algorithms for control. Condorcet social choice functions. [70] Dubois. Synthese 33: 393–403.C. R. Qualitative decision theory with Sugeno integrals. Laskey and H. H. pp. (1961). Prade.C. [83] Fishburn. (1976). Remarks on the analytic hierarchy process. P. ambiguity and the Savage axioms. (1991). Panarotto. Proceedings of the 13th conference on uncertainty in artiﬁcial intelligence. L. 157–164. D. and Straﬃn. Operations Research 37: 229–239. D. Dordrecht. J. in H. Los Altos. R.. and Sarin. Contemporary Political Studies.J. R. R. D.L. pp. Verbruggen. Buosi. H. H. (1989).C. P. [80] Fishburn.C. (1844). (1997). Aosta. New York. Morgan Kaufmann. pp. Kluwer. Noncompensatory preferences.B. New-York.K. Morgan Kaufmann. (1997). Proceedings of the 15t h conference on uncertainty in artiﬁcial intelligence. Dispersive equity and social risk... D. Morgan Kaufmann. New-York.. Utility theory for decision-making.. [68] Dubois. Presented at AIRO ’97 Conference. and Frisbie. [78] Fishburn.. D. P. [81] Fishburn. [67] Dubois. Prentice-Hall. Shenoy (eds). Management Science 36: 249–258..B. Comparing electoral systems.C. (1977).C. SIAM Journal on Applied Mathematics 33: 469–489. H. and Sarin. (1997). Geiger and P. Decision-making under ordinal preferences and uncertainty. H. and Prade. 17–58. D. Fairness and social risk I: Unaggregated analyses.. (1994). Fuzzy logic.A. P. D. D. Babuska (eds). e [72] Dyer. H. (1991). Prade. and Ughetto. and Prade. and Prade. Proceedings of the 14t h conference on uncertainty in artiﬁcial intelligence. 
Management Science 37: 751–769. (1988). and Sabbadin. in D. [75] Fargier. (1998). Fuzzy Sets and Systems 24: 279–300. (1987). Los Altos. Plenum Press. P. J. P. Los Altos. P.M. (1999). [71] Dupuit. Fargier. Prentice-Hall.K. A. [82] Fishburn. D. Qualitative decision models under uncertainty without the commensurability hypothesis. [74] Ellsberg. M. [73] Ebel. The mean value of a fuzzy number. De la mesure de l’utilit´ des travaux publics.S. Zimmermann and R. pp. (1999). and Turino. in K. [69] Dubois. 188–195. Bid management of software acquisition for cartography applications. H.

[90] Fix. Technical report. (1993). S. Equity considerations in utility-based measures of health outcomes in economic appraisals: An adjustment algorithm. Herm`s. S. D. 197.. and Schmeidler. (1988b). (1973). P. Springer Verlag. P.C. and Birch. A. Operations Research 32: 901–908. E.C..C. [87] Fishburn. [91] Fodor.C. ´ [101] Grabisch. Manipulation of voting schemes: A general result. and Hodges. I. pp. 469–489. (1997). [93] French. [86] Fishburn. Paris. F. (1981). (1989). (1991). Reidel. in S. British Journal of Mathematical and Statistical Psychology 34: 38–49. Normative theories of decision making under risk and under uncertainty. A. . 4. Dordrecht. Nonlinear preference and utility theory. Baltimore. (1984). Equity axioms for public risks. (1997). Journal of Economic Theory 59: 33–49. Multicriteria problem solving. P. London. [88] Fishburn.C. The economics of health and health care. P. [89] Fishburn. [100] Gilboa. M. Nonconventional preference relations in decision making. Condorcet’s paradox. Randolph Field. [95] Gacogne. (1982). Kacprzyk and M.C. I. P.252 BIBLIOGRAPHY [84] Fishburn. Johns Hopkins University Press. [85] Fishburn. [92] Folland. Maxmin expected utility with a nonunique prior. [94] French. in M. M. Theory and Decision 15: 161– [98] Gibbard. J. Paris. Berlin. Discriminatory analysis. Les cahiers du Club CRIN . Journal of Mathematical Economics 18: 141–153. S.C.V. and Stano. P. (1988a). Springer Verlag. J. (1983). Measurement theory and examinations. D. (1997). El´ments de logique ﬂoue.Association ECRIN. [99] Gilboa. Guely. e e [96] Gafni. non-parametric discrimination: consistency properties. Updating ambigous beliefs. and Perny. (1978). Journal of Risk and Uncertainty 4: 113–134. P. Nontransitive preferences in decision theory. Ellis Horwood. D. Fuzzy preference modelling and multicriteria decision support. Kluwer. USAF Scholl of aviation and medicine. Journal of Health Economics 10: 329–342. W. Dordrecht.C. New-York. L. Econometrica 41: 587–601. 
Zionts (ed. A. [97] Gehrlein. (1997). and Schmeidler. A survey of multiattribute/multicriteria evaluation theories. Decision theory – An introduction to the mathematics of rationality. M. (1994). (1951). Goodman. 181–224.). Prentice-Hall. Evaluation subjective. and Roubens. (1993). Roubens (eds). pp.L. Berlin. The foundations of expected utility. S.

Cost-beneﬁt analysis and the environment.C. Corkindale (eds). The theory of ratio scale estimation: Saaty’s analytic hierarchy process. pp. A slow-discounting model for energy conservation.C. Analysis and aiding a decision processes. (1992). E. Management Science 28: 936–953. Kunreuther. (1995). (1995). C.M. D.T. The Institute of Electrical and Electronics Engineers. N. The assumptions of cost-beneﬁt analysis: A philosopher’s view. (1994). Econometrica 62: 1251–1289. L. and Orme. Cost-beneﬁt aspects of food irradiation processing.M. P.G. [120] International Atomic Energy Agency (1993). . Elgar. (1995). (1982).V. (1993). J. Proportional discounting of future costs and beneﬁts. The reasonableness of non-constant discounting. R. (1994). Mathematics of Operations Research 20: 381–399. [106] Harless. [114] Hey.J. e M´moire du dea 103. Interfaces 22: 47–60. Statistical indicators. Investigating generalizations of expected utility theory using experimental data. pp. P. C. The utility of generalized expected utility theories. [105] Harker. R. and V´ri.G. Probl mes d’aﬀectation et m´thodes de classiﬁcation. CAB International. M. (1996). Amsterdam. 21–38. [111] Henriet. O. H. P. Proceedings of LFA’96. [118] Humphreys. [119] IEEE 92 (1992). Universit´ Paris Dauphine. European Journal of Operational Research 89: 445–456. Sources of bias in assessment procedures for utility functions.L. C. [109] Harvey. [115] Hogarth. (1996). and Vargas. M´thodes multicrit res non-compensatoires e pour la classiﬁcation ﬂoue d’objets. Relationships between decision making process and study process in OR interventions. and Perny. in K. Technical report. and Spash. [104] Hanley.C. C. (1982). (1994). The application of fuzzy integrals to multicriteria decision making. Management Science 33: 1383–1403. A. Consequentialist foundations for expected utility. (1987). 9–15. Standard for a software quality metrics methodology. Cambridge University Press. J.H. (1987). Cambridge. C. [108] Harvey. 
Judgement and choice: The psychology of decision. [116] Holland. (1993). (1993). Theory and Decision 25: 25–78.F. L. [117] Horn.D. Oxford. Adelshot Hants. Washington D. European Journal of Operational Research 10: 230–236. [103] Hammond. (1988). Econometrica 62: 1251–1289. A. [110] Henriet. New York. C. Bernan Associates. L. North-Holland.BIBLIOGRAPHY 253 [102] Grabisch. e e [112] Hershey. P.. [107] Harvey. P.J. Journal of Public Economics 53: 31–51.M.C.. [113] Heurgon.T. Svenson. Willis and J. and Schoemaker. Environmental valuation: New perspectives. Wiley. and Camerer.

and Cretin. G.. D. in C. J. Wiley. R. Paris. (1981). and Tversky. The relationship between cost-eﬀectiveness analysis and cost-beneﬁt analysis. The PREFCALC system.L. e [122] Jacquet-Lagr`ze. Springer Verlag. Readings in multiple criteria decision aid. E. A. Discounting of life-saving and other nonmonetary eﬀects. J. Cambridge. Interactive assessment of preferences using holise tic judgments. Operations Research Letters 8: 107–112. S. Social Science and Medicine 41: 483–489. [137] Keeney. M. New York. Technical report.A. H. [129] Johannesson. (1988). [132] Johnson. P. [134] Kahneman. [126] Jaﬀray. . (1989). Universit´ Paris-Dauphine. (1989a). Assessing a set of additive utility e functions for multicriteria decision making: The UTA method. and Hirsch. (1976). quality characteristics and guidelines for their use. Judgement under uncertainty – Heuristics and biases.J. Econometrica 47: 263–291. ISO.). (1996). 335–350.254 BIBLIOGRAPHY [121] ISO/IEC 9126 (1991). European Journal of Operational Research 10: 151–164. H. and Tversky. E. Bana e Costa (ed. and Raiﬀa.O. J. Cost-beneﬁt analysis of environmental change. A note on the depreciation of the societal perspective in economic evaluation in health care. [133] Kahneman. Hammond. (1999). (1989b). Gen`ve. (1995b). (1978). Slovic. (1983). [135] Keeler. [128] Johannesson. Health Policy 33: 59–66. (1979). Theory and methods of economic evaluation of health care. Management Science 29: 300–306. D. Cambridge University Press. D.S. Decisions with multiple objectives: Preferences and value tradeoﬀs. Roy. Theory and Decision 24: 169–200. and Raiﬀa.-Y. Technical report. Choice under risk and the security factor: An axiomatic model. R. Dordrecht. Harvard University Press. E. Cambridge University Press. (1990). J. Prospect theory: An analysis of decision under risk. and Siskos. Some experimental ﬁndings on decision making under risk and their implications. [136] Keeney. B. E. and Schkade. Cambridge.. [124] Jacquet-Lagr`ze. J. 
Utility theory for belief functions. [127] Jaﬀray. [131] Johansson..L. Moscarola. [130] Johannesson. (1995a). [125] Jaﬀray. Berlin. Descripe tion d’un processus de d´cision. A. Bias in utility assesments: Further evidence and explanations. Cahier du LAMSADE e No 13. Smart choices: A guide to making better decisions. E. e [123] Jacquet-Lagr`ze. pp.-Y. (1993). (1982). J. M.A. Kluwer. M.B. P. Management Science 35: 406–424. Boston.-Y.. Information technology – Software product evaluation. European Journal of Operational Research 38: 301–306.

(1998).L. Foundations of measurement.D. Paris. Foundations of behavioral research.G. NorthHolland. pp. O. in K.BIBLIOGRAPHY 255 [138] Keller. Luce. Economic analysis of investment projects: A practical approach. 15: 580– 585. Environmental valuation: New perspectives. Foundations of measurement. Academic Press. and Weiss. and Fitz-Gibbon. L. [148] Lindheim. Krantz. J. Basic books. I. J. [147] Lesourne. A. 1: Additive and polynomial representations. Charles C. [152] Loomes.. J. Social choice bibliography. Adelshot Hants. K. and Juarez. (1988). and Tversky. (1991). Project appraisal and planning for developing countries. Cost-beneﬁt analysis and project appraisal in developing countries. [143] Krantz. (1974). Willis and J. New York. D. . New York. Brown. Baltimore. [149] Little. R.M.N.D. IEEE Transactions on Systems Man and Cybernetics. Suppes. Johns Hopkins University Press. J.E. O. R. [150] Little. Elgar. Vol. J.N. Paired comparisons estimates of willingness to accept and contingent valuation estimates of willingness to pay. Vol. I. (1958). A fuzzy k−nearest neighbor algorithm. (1975). (1992). [154] Luce. axiomatisation and invariance.. Rinehart and Winston. (1987).D. B. P. F.. J.D. (1996). M.... P. CAB International. Thomas. [142] Kohli. and Givens.V. 3rd edn. J. T.H. Oxford. 3: Representation. [144] Krutilla. G. Suppes.T.C. E. T.H. [141] Kirkpatrick. Springﬁeld.M. [145] Laska. and Mirlees. How to measure performance and use tests. C. (1993). (1995). (1985). G.. [146] Laslett.T. Theory and Decision 25: 1–23. J. Cost-beneﬁt analysis and economic theory. Economic Journal 92: 805–824. Academic Press. The assumptions of cost-beneﬁt analysis. C. J. Peterson. (1986). Multiple purpose river development. (1968). Champ.. Oxford. New York. R. Diﬀerent experimental procedures for obtaining valuations of risky actions: Implications for utility theory. [139] Kelly. D. Amsterdam.A. Corkindale (eds). and Lucero. Holt. and Sugden. 
Regret theory: An alternative theory of rational choice under uncertainty. Social Choice and Welfare 8: 97–169. A. and Eckstein. Journal of Economic Behavior and Organisation 35: 501–515. (1982). and Mirlees. Thousand Oaks. (1990).S. New York. and Tversky. 5–20. [151] Loomes. G. [153] Loomis. Grading and marking in American schools: Two centuries of debate. [140] Kerlinger. P. R.A. Morris. J..A. Sage Publications.D. Oxford University Press for the Asian Development Bank. Manual of industrial project analysis in developing countries. (1971). Gray.

in S. Thousand Oaks.. [171] Mishan. Cost-beneﬁt analysis. Environment and Planning B 10: 47–62. Journal of Multi-Criteria Decision Analysis 5: 127–132. Enquˆte sur le jugement professoe ee e ral. [162] Masser. Games and Decisions. L.E. Wiley. Hagen (eds). (1983). Cambridge University Press. (1996). R. A. [160] Mamdani. Administrative Science Quarterly 21: 246– 272. in B. Rationality and dynamic choice: Foundational explorations. Reidel. Expected utility hypotheses and the Allais paradox. Dordrecht. Valued relations aggregation with the Borda method. Academic Press. Wenstøp (eds). D.J. Expected utility without the independence axiom.O. Allen and Unwin. M. Econometrica 50: 277–323. K. New York. R. M. [164] McClennen. pp. Sage Publications.J. and Th´oret.256 BIBLIOGRAPHY [155] Luce. . (1982). Gaines fuzzy reasonning and its applications. A. (1982). Scandinavian Journal of Educational Research 28: 149–165. (1990). Dynamic consistency and non-expected utility models of choice under uncertainty. Academic Press. H. D. (1982). Grading of student’s attainement: Purposes and functions. S.C. Allais and O. Econometrica 20: 680–684.. Empirical demonstration that expected utility decision analysis is not operational. New York. [163] May. Fundamental deﬁciency of expected utility analysis. [156] Luce. (1984). and de Neufville. and Raiﬀa. E. Paris.D. [168] McLean. London.R. (1996). and Lockwood. L’´valuation des ´l`ves. [161] Marchant. [159] Machina. [158] Machina. Cambridge. Hartley. D. R. [165] McCord. Raisinghani. [157] Lysne. Econometrica 24: 178–191. in M. [170] Mintzberg. (1981). (1956). Why and how should we assess students? The competing measures of student performance. R. (1976). H. PUF. A set of independent necessary and suﬃcient conditions for simple majority decisions. E. (1989). and Larsson. [166] McCord. Journal of Economic Literature 27: 1622– 1688. (1952). (1979). White (eds). Stigum and F. Reidel.D. Thomas and D. K. (1983). [169] Merle. (1996). Th. I. pp. French. 
27–145. P. J. The representation of urban planning-processes: An exploratory review. London. H. M. Utility theory: Axioms versus paradoxes. E.F. The structure of une structured decision processes. R. and de Neufville. M. R. (1957). Multiobjective decision making. Semiorders and a theory of utility discrimination. [167] McCrimmon. 181–199.J.E. Foundations of utility and risk theory. pp. 279– 305.

T. 3 of Handbook of Fuzzy Sets. Dordrecht. H. (1998). Some Norwegian politician’s use of cost-beneﬁt analysis. in C. [180] Nau. (1990). Action evaluation and action structuring – Diﬀerent decision aid situations reviewed through two actual cases. Probl`mes li´s ` l’´valuation de l’importance en aide e e a e multicrit`re ` la d´cision : R´ﬂexions th´oriques et exp´rimentations. (1998). [184] Nurmi. [188] Ostanello. [173] Morisio. 169– 186. [187] Ostanello. e a e e e e PhD thesis.F. A. e [176] Munier. [181] Nguyen. B. (1989). Administrative Science Quarterly 19: 414–450. Arbitrage. Organizational decision processes and ORASA intervention. (1978). J. (1984). Types of organizational decision processes. An explicative model of ‘public’ ina terorganizational interactions.F. (1997). and Sugeno. (1987). (1990). (1993). Vol. Arkansas. Comparing voting systems. Oxford. and Caverini.A. New models of decisions under uncertainty. (1996). Kiss (eds). T. A.F. Journal of Risk and Uncertainty 10: 71–91. A. (1984). Kluwer. Bana . G.-P. IEE Proceedings on Software Engineering 144: 162–174. Grove Publishing. Westminster. D. Paris. Dordrecht. LAMSADE. Rethinking the process of operational research and systems analysis. K. R. Neuro-fuzzy methods in fuzzy rule generation. in R. [174] Moscarola.BIBLIOGRAPHY 257 [172] Moom.T. How do you know they know what they know? A handbook of helps for grading and evaluating student progress.C. Reidel. K. Bezdek and H. 305–333. [177] Nas. chapter 5. (1999).F. M. (1995). Kluwer. rationality and equilibrium. Prade (eds). (1991). [186] Nyborg. European Journal of Operational Research 70: 67–82. Poems in translation: Sappho to Val´ry. [179] Nau. R. Modelling and control. and Tsouki`s. Pergamon Press. The University e of Arkansas Press. and Kruse. [175] Mousseau. pp. Cost-beneﬁt analysis: Theory and application. and McCardle. M. [183] Noizet. D. IUSWARE: A formal methodology for a software evaluation and selection. A.M. 
Coherent decision analysis with inseparable probabilities and utilities. Theory and Decision 31: 199–240. [185] Nutt. (1997). H. [182] Nims. J. P. pp.F. La psychologie de l’´valuation scolaire. Universit´ Paris-Dauphine. (1993). and Tsouki`s. J. Thousand Oaks. European Journal of Operational Research 38: 307–317. D. Fuzzy sets in approximate reasoning and information systems. Tomlinson and I. Sage Publications. R. V. e PUF. [178] Nauck. Paris. Public Choice 95: 381–401. J. Dordrecht. in D.

applications. (1992). A common framework for describing some outranking procedures. Slowi´ski (ed. (1997).C. G. Journal of Multi-Criteria Decision Analysis 6: 86–93. [200] Pirlot. Ann Arbor Science. Berlin. (1981).J.D. 61–74. e [196] Perrot. [201] Popham. W. Ph. J. Th. Use of artiﬁcial intelligence multicriteria decision making. Berlin. in J. P. and Guely. (1999).J. N. and Roubens.43. (1999). Readings in multiple criteria decision aid. [190] Ott. representations. Ann Arbor. Journal of Economic Behaviour and Organization 3: 323–343. Sensor fusion for real time quality evaluation of biscuit during baking. Springer Verlag. Proceedings of EUROFUSE-SIC’99. P. Properties. . Dordrecht. pp. and Pomerol. (1994).-Ch. Kluwer. [198] Pi´ron. comparison between bayesian and fuzzy approaches. Cl´ ımaco (ed. Document du LAMSADE No 113. in R. Paris. Le Guennec. (1999). A. Kluwer. Paris. Stewart and Th. Advances in MCDM models. M. e [192] Perny.R. 279– 285. Universit´ Paris-Dauphine. [189] Ostanello. P. PhD thesis. pp. Sur le non-respect de l’axiome d’ind´pendance dans les e m´thodes de type ELECTRE. and Vincke.). New-York. (1997).. 36–57. (1997). J. Ecole Nationale Sup´rieure des Industries Agricoles e Alimentaires. algorithms. [203] Quiggin. A real world MCDA application: a Evaluating software. E.258 BIBLIOGRAPHY e Costa (ed. M. Cahiers du CERO 34: 211–232. Fuzzy preference modelling.). [195] Perny. (1996). E. PUF. Springer Verlag. A theory of anticipated utility. Trystram. P. (1978). Technical report. (1982). N. Prentice-Hall. Fuzzy sets in decision analysis. Cambridge University Press. H. F. [191] Paschetta. [197] Perrot. [194] Perny. [193] Perny.1–15. Examens et docimologie. D. Multi-criteria analysis. e [199] Pirlot. Gal. Kluwer. Behavioral decision theory: A new approach.). pp.. M. 15. and Zucker. Validation aspects of a prototype solution implementation to solve a complex MC problem. (1963). (1998). in T. A. and applications. Cambridge. 3–30. pp. Dordrecht. 
Modern educational measurement. Journal of Food Engineering 29: 301–315. Dordrecht. Hanne (eds). operations research n and statistics. Collaborative ﬁltering methods based on fuzzy preference relations.. Environmental indices: Theory and practice. J. W. theory. and Tsouki`s. Maˆ ıtrise des proc´d´s alimentaires et th´orie des enseme e e bles ﬂous. Semiorders. (1997). pp. [202] Poulton.

Original version in French “M´thodologie multicrit`re d’aide ` la e e a d´cision”. Paris. Paris.-M. [208] Roubens. Grade inﬂation and course choice. P. (1980). ELECTRE IS : Aspects m´thodologiques e et guide d’utilisation. [219] Sager. B. B. and Bouyssou. Milwaukee. [215] Roy. (1984). Economica.C. [214] Roy.J. Wiley. (1994). Checca. (1989). (1993). R. Revue d’Economie Politique 1: 1–44. P.BIBLIOGRAPHY 259 [204] Quiggin. e [216] Russo. C.E. Rational analysis of a problematic world. B. American Association of Collegiate Registrars and Admissions Oﬃcers. (1989). B. [205] Raiﬀa. [211] Roy.J. Universit´ Paris-Dauphine. M. e a e Technical report. H. H.S. [220] Salles. A S Q Quality Press. and Schoemaker. London. European Journal of Operational Research 66: 184–204. Dordrecht. Cahier du LAMSADE No 97. and Skalka.C. Science de la d´cision ou science de l’aide ` la d´cision ?. (1992). C. Fuzzy Sets and Systems 49: 9–13. J. Document du LAMSADE No 30. Decision science or decision-aid science?. (1993). Journal of Economic Perspectives 5: 159–170. and Vincke. o [210] Roy. The analytic hierarchy process. Crit`res multiples et mod´lisation des pr´f´rences : l’apport e e ee ´ des relations de surclassement. (1970). Eliminating grades in schools: An allegory for change.H. Aide multicrit`re ` la d´cision : M´thodes e a e e et cas. Singer. Piatkus. e Paris. [207] Rosenhead. Berlin. [209] Roy. Rationality and aggregation of preferences in an ordinally fuzzy framework. Generalized expected utility theory – The rank-dependent model. (1991). (1994).. Economica.. McGraw-Hill. Kluwer.R. New York. Dordrecht. J. e [212] Roy. B. Conﬁdent decision making.F.. New York. [218] Sabot. and Worthington. . D. Ph. Addison-Wesley. (1990). Springer Verlag. Investigaci´n Operativa 2: 95–110.L. and Pattanaik. [213] Roy. Preference modelling. D. and Bouyssou. (1993). Technical report. (1996). and Wakeman. (1985).J.K. Grades and grading practices: The results of the 1992 AACRAO survey. (1974). New York. T. 
(1991). Multicriteria methodology for decision aiding. B.. J. M. Universit´ Paris-Dauphine. R. J. Paris. [217] Saaty. D. Decision-aid: an elementary introduction with emphasis on multiple criteria. Washington D. B. 1985. Kluwer. Barrett. L. Decision analysis – Introductory lectures on choices under uncertainty. [206] Riley. T.

Eeckoudt.E. Handbook of mathematical economics. North Holland. Economic decisions under uncertainty. Economics of radiation protection: Equity considerations. Wiley. Software evaluation problem situaa tions. grades and student evaluations. (1989). C. [235] Steuer. Hedonic prices and cost-beneﬁt analysis. (1997). B. (1998). S. Saridis and B. (1975).W. Oxford University Press. R. [222] Savage. [234] Stamelos. Th. (1997). Paris.H. Strategy proofness and Arrow’s conditions: Existence and correspondence theorems for voting procedures and social welfare functions. North-Holland. Theory and Decision 43: 241–51. [238] Sugeno. Journal of Economic Theory 10: 187–217.W.A. (1998). [225] Schoﬁeld. (1957). Econometrica 57: 571–587.C. North-Holland. Gains (eds).M. [232] Sopher. Fuzzy sets in decision analysis. The foundations of statistics. Amsterdam.. 241–260. New York. R. Cahier du LAMSADE No 156. Econometrica 65: 745– 779. Arrow and M. L. Dordrecht. H. (1983). Grading student writing: An annotated bibliography. and Gigliotti.D. Oxford. and Tsouki`s. [236] Stratton. Theory and Decision 35: 75–106. [231] Slowi´ski. [223] Schmeidler. 1073–1181. Social choice theory. [237] Sugden. Amsterdam. D. (1985). A. Amsterdam. 2nd revised edn. [224] Schneider. [227] Sen.. (1994). (1983). A. Greenwood Publishing Group. (1986).R. . Faculty behavior. B. A. G. Gupta.. (1989). New York. Fuzzy automata and decision processes. J. A behavioural model of rational choice in Models of man. Westport. Kluwer. (1986). A. 1972.) (1998). Fuzzy measures and fuzzy integrals: a survey. [229] Simon.W. Journal of Economic Education 25: 5–15. and Wiliams. Intriligator (eds). New York. Wiley. and application. 3.260 BIBLIOGRAPHY [221] Satterthwaite.A. [228] Sen. Vol. L. Unwin and Hyman. M. (1977). Technical report. S. The principles of practical cost-beneﬁt analysis. (1954).J. M. Myers. G. and Gollier. pp. H.N. C.K. London. Wiley. pp. Universit´ Parise Dauphine. [230] Sinn. (ed. 
Multiple criteria optimisation: Theory. R. in M. [233] Speck. Journal of Economic Theory 37: 55–75. A test of generalized expected utility theory. 89–102. computation. operations research n and statistics. pp. Subjective probability and expected utility without additivity. Cost-beneﬁt analysis in urban and regional planning. [226] Scotchmer. R. I. Schieber. Maximization and the act of choice.K. (1993). R. in K. and King.

Birkh¨user. M. Consequences. Q. . and Guely. [249] Tversky. Intransitivity of preferences. (1982). I preference structures. to appear also in Discrete Applied Mathematics. and Vincke.L. A. Syndicat des Transports Parisiens. (1997). Non conventional preference relations in decision making. Application of fuzzy logic for the control of food processes. M. (1988). Alternatives to grading student writing. [250] United Nations Development Programme (1997). Paris.. Cost-beneﬁt analysis of climate change: The broader perspectives. Springer Verlag. Some multi-attribute models in examination assessment. Ph. Perrot. Urbana. [242] Tchudi. (http://www. Kacprzyk and M. F. Technical e report. [251] van Doren. A new axiomatic foundation of partial a comparability. (1986). An introductory survey on fuzzy control. M. New York. Ph. ´ [243] Teghem. A. S. (1996). F. opportunities and procedures. Electronic Notes on Discrete Mathematics. [255] Vincke. [244] Thaler. pp. Arrow’s theorem and examination assessment. Social Choice and Welfare 16: 17–40. New York. S. and French. [252] Vansnick. J. British Journal of Mathematical and Statistical Psychology 35: 183–192. 72–81. Proceedings OSDA ’98. A characterization of PQI interval a orders. De Borda et Condorcet ` l’agr´gation multicrit`re. Theory and Decision 39: 79–114. a e e Ricerca Operativa (40): 7–44. Basel. G. M´thodes d’´valuation des projets e e d’infrastructures de transports collectifs en r´gion Ile-de-France. Quasi rational economics. J. (1999).BIBLIOGRAPHY 261 [239] Sugeno. [247] Tsouki`s. [254] Vassiloglou. Oxford University Press.nl/locate/endm). A. P. (1985). Oxford. Russell Sage Foundation. (1984). and Vincke. [240] Suzumura. [253] Vassiloglou. M. (1995). (1928). (1999). (1997). Roubens (eds). Berlin. pp. in J. [248] Tsouki`s. (1991). (1995). Information Sciences 36: 59–83. a [246] Trystram. Human Development Report 1997. An anthology of world poetry. K. Psychological Review 76: 31–48. 
[241] Syndicat des Transports Parisiens (1998).-C. N. Programmation lin´aire. National Council of Teachers of English. Ph. Processing Automation 4: 504–512. Albert and Charles Boni. Editions de l’Universit´ de e e ´ Bruxelles-Editions Ellipses. R.elsevier.H. Brussels. [245] Toth. British Journal of Mathematical and Statistical Psychology 37: 216– 233. (1969).

E. W. Ph. [268] Yaari. PhD thesis. Amsterdam. Princeton.B. and Morgenstern. [257] Vincke. Decision analysis and behavioral research. P. (1999). (1987). Cambridge University Press.R. [262] Warusfel.A. Econometrica 55: 95–115. On the “environmental” discount rate. From computing with numbers to computing with words. pp. (1981). Mathematical Social Sciences 1: 409–430. Elsevier. Editions de e a e ´ l’Universit´ de Bruxelles-Editions Ellipses.262 BIBLIOGRAPHY [256] Vincke. Theory of games and economic behavior. and Stason. A review of cost-beneﬁt analysis as applied to the evaluation of new road proposals in the U. W. D. (1977). Dordrecht. Paris. [272] Zarnowsky. Decision analysis as a replacement for cost-beneﬁt analysis. L. Leisure Press. from manipulation of measurement to manipulation of perceptions.P.A.K. Cambridge.. [263] Watson.K. 1989. W. L. Aide multicrit`re ` la d´cision dans le cadre de la e a e probl´matique du tri : M´thodes et applications. Les nombres et leurs myst`res. G. A theory of approximate reasoning. (1992a). The decathlon – A colorful history of track and ﬁeld’s most challenging event. Exploitation of a crisp binary relation in a ranking problem.C. W. (1961). LAMSADE.R. 149–194. [265] Weitzman. e Paris. Oxford. Origi´ nal version in French “L’Aide Multicrit`re ` la D´cision”. e [270] Zadeh. [266] Weymark. A. J. K. S. M. and Edwards. (1992). (1979). O. Transportation Research – D 3: 141–156. (1986). (1992b). pp. (1989). (1981). Machine intelligence. Brussels. (1989).I.. (1944). Foundations of cost-effectiveness analysis for health and medical practices. J. [264] Weinstein.D. Oxford University Press. Hayes. [267] Willis.E. Princeton University Press. Kluwer. Points Sciences. in J. Wiley. Additive representations of preferences – A new foundation of decision analysis. New England Journal of Medicine 296: 716–721. [261] Wakker. and Harvey. Proceedings of EUROFUSE-SIC’99. Multi-criteria decision aid. [260] von Winterfeldt. 
European Journal of Operational Research 7: 242–248. Champaign.G.L.A. Fatal tradeoﬀs: Public and private responsibilities for risk. Ph. e e Universit´ Paris-Dauphine. Mikulich (eds). . [259] von Neumann. The dual theory of choice under risk. (1992). Seuil. e [258] Viscusi. 1–2. [271] Zadeh. [269] Yu. (1998). Theory and Decision 32: 221–241. (1994). M. D. F. Michie and L. Generalized Gini inequality indices. M. New York. Garrod. D. Journal of Environmental Economics and Management 26: 200–209.

Beneﬁt-cost analysis in theory and practice. R. New York.D. D.BIBLIOGRAPHY 263 [273] Zerbe. Harper Collins. and Dively. (1994).O. .

