
**EVALUATION AND DECISION MODELS: a critical perspective**


Denis Bouyssou ESSEC

Thierry Marchant Ghent University

Marc Pirlot SMRO, Faculté Polytechnique de Mons

Patrice Perny LIP6, Université Paris VI

Alexis Tsoukiàs LAMSADE - CNRS, Université Paris Dauphine

Philippe Vincke SMG - ISRO, Université Libre de Bruxelles

KLUWER ACADEMIC PUBLISHERS Boston/London/Dordrecht

Contents

1 Introduction
   1.1 Motivations
   1.2 Audience
   1.3 Structure
   1.4 Outline
   1.5 Who are the authors?
   1.6 Conventions
   1.7 Acknowledgements

2 Choosing on the basis of several opinions
   2.1 Analysis of some voting systems
      2.1.1 Uninominal election
      2.1.2 Election by rankings
      2.1.3 Some theoretical results
   2.2 Modelling the preferences of a voter
      2.2.1 Rankings
      2.2.2 Fuzzy relations
      2.2.3 Other models
   2.3 The voting process
      2.3.1 Definition of the set of candidates
      2.3.2 Definition of the set of the voters
      2.3.3 Choice of the aggregation method
   2.4 Social choice and multiple criteria decision support
   2.5 Conclusions

3 Building and aggregating evaluations
   3.1 Introduction
      3.1.1 Motivation
      3.1.2 Evaluating students in Universities
   3.2 Grading students in a given course
      3.2.1 What is a grade?
      3.2.2 The grading process
      3.2.3 Interpreting grades
      3.2.4 Why use grades?
   3.3 Aggregating grades
      3.3.1 Rules for aggregating grades
      3.3.2 Aggregating grades using a weighted average
   3.4 Conclusions

4 Constructing measures
   4.1 Introduction
   4.2 The human development index
      4.2.1 Scale normalisation
      4.2.2 Compensation
      4.2.3 Dimension independence
      4.2.4 Scale construction
      4.2.5 Statistical aspects
   4.3 Air quality index
      4.3.1 Monotonicity
      4.3.2 Non compensation
      4.3.3 Meaningfulness
   4.4 The decathlon score
      4.4.1 Role of the decathlon score
      4.4.2 Is the resulting ranking reliable?
      4.4.3 Other effects and remarks
   4.5 Indicators and multiple criteria decision support
   4.6 Conclusions

5 Assessing competing projects
   5.1 Introduction
      5.1.1 Choosing between investment projects in private firms
      5.1.2 From Corporate Finance to CBA
   5.2 The principles of CBA
      5.2.1 Theoretical foundations
   5.3 Some examples in transportation studies
      5.3.1 Prevision of traffic
      5.3.2 Time gains
      5.3.3 Security gains
   5.4 Conclusions

6 Comparing on several attributes
   6.1 Thierry's choice
      6.1.1 Description of the case
      6.1.2 Reasoning with preferences
   6.2 The weighted sum
      6.2.1 Transforming the evaluations
      6.2.2 Using the weighted sum on the case
      6.2.3 Is the resulting ranking reliable?
      6.2.4 The difficulties of a proper usage of the weighted sum
      6.2.5 Conclusions
   6.3 The additive value model
      6.3.1 Direct methods for determining single-attribute value functions
      6.3.2 AHP and Saaty's eigenvalue method
      6.3.3 An indirect method for assessing single-attribute value functions and trade-offs
      6.3.4 Conclusion
   6.4 Outranking methods
      6.4.1 Condorcet-like procedures in decision analysis
      6.4.2 A simple outranking method
      6.4.3 Using ELECTRE I on the case
      6.4.4 Main features and problems of elementary outranking approaches
      6.4.5 Advanced outranking methods: from thresholding towards valued relations
   6.5 General conclusion

7 Deciding automatically
   7.1 Introduction
   7.2 A system with explicit decision rules
      7.2.1 Designing a decision system for automatic watering
      7.2.2 Linking symbolic and numerical representations
      7.2.3 Interpreting input labels as scalars
      7.2.4 Interpreting input labels as intervals
      7.2.5 Interpreting input labels as fuzzy intervals
      7.2.6 Interpreting output labels as (fuzzy) intervals
   7.3 A system with implicit decision rules
      7.3.1 Controlling the quality of biscuits during baking
      7.3.2 Automatising human decisions by learning from examples
      7.3.3 The approach applied in this case: first step
      7.3.4 Comment on the first step
      7.3.5 The approach applied in this case: second step
   7.4 An hybrid approach for automatic decision-making
   7.5 Conclusion

8 Dealing with uncertainty
   8.1 Introduction
      8.1.1 The expected value approach
      8.1.2 Some comments on the previous approach
      8.1.3 The expected utility approach
      8.1.4 Some comments on the expected utility approach
   8.2 The context
   8.3 The model
      8.3.1 The set of actions
      8.3.2 The set of criteria
      8.3.3 Uncertainties and scenarios
      8.3.4 The temporal dimension
      8.3.5 Summary of the model
   8.4 A didactic example
   8.5 Conclusion

9 Supporting decisions
   9.1 Preliminaries
   9.2 The Decision Process
   9.3 Decision Support
      9.3.1 Problem Formulation
      9.3.2 The Evaluation Model
      9.3.3 The final recommendation
   9.4 Conclusions

10 Conclusion
   10.1 Formal methods are all around us
   10.2 What have we learned?
   10.3 What can be expected?

Appendix A
Appendix B
Bibliography
Index

1 INTRODUCTION

1.1 Motivations

Deciding is a very complex and difficult task. Some people even argue that our ability to make decisions in complex situations is the main feature that distinguishes us from animals (it is also common to say that laughing is the main difference). Nevertheless, it quite often happens that we do not know or we are not sure what to decide and, in many instances, when the task is too complex or the interests at stake are too important, we resort to a decision support technique: an informal one–we toss a coin, we ask an oracle, we visit an astrologer, we consult an expert, we think–or a formal one. Although informal decision support techniques can be of interest, in this book we will focus on formal ones. Among the latter, we find some well-known decision support techniques: cost-benefit analysis, multiple criteria decision analysis, decision trees, . . . But there are many other ones, sometimes not presented as decision support techniques, that help making decisions. Let us cite but a few examples.

• When the director of a school must decide whether a given student will pass or fail, he usually asks each teacher to assess the merits of the student by means of a grade. The director then sums the grades and compares the result to a threshold.

• When a bank must decide whether a given client will obtain a credit or not, a technique, called credit scoring, is often used.

• When the mayor of a city decides to temporarily forbid car traffic in a city because of air pollution, he probably takes the value of some indicators, e.g. the air quality index, into account.

• Groups or committees must also make decisions. In order to do so, they often use voting procedures.

All these formal techniques are what we call (formal) decision and evaluation models, i.e. sets of explicit and well-defined rules to collect, assess and process information in order to be able to make recommendations in decision and/or evaluation processes. They are so widespread that almost no one can pretend he is

not using or suffering the consequences of one of them. These models–probably because of their formal character–inspire respect and trust: they look scientific. But are they really well founded? Do they perform as well as we want? Can we safely rely on them when we have to make important decisions? That is why we try to look at formal decision and evaluation models with a critical eye in this book. You guessed it: this book is more than 200 pages long. So, there is probably a lot of criticism. You are right. This is not really new: most decision models have had contenders for a long time. Do we want to contend all models at the same time? Definitely not! Our conviction is that there cannot be a best decision or evaluation model–this has been proved in some contexts (e.g. in voting) and seems empirically correct in other contexts–but we are convinced as well that formal evaluation and decision models are useful in many circumstances and here is why:

• Formal models provide explicit and, to a large extent, unambiguous representations of a given problem; they offer a common language for communicating about the problem. They are therefore particularly well suited for facilitating communication among the actors of a decision or evaluation process.

• Once a formal model has been established, a battery of formal techniques (often implemented on a computer) become available for drawing any kind of conclusion that can be drawn from the model. For example, hundreds of what-if questions can be answered in a flash. This can be of great help if we want to devise robust recommendations.

• Formal models require that the decision maker makes a substantial effort to structure his perception or representation of the problem. This effort can only be beneficial as it forces the decision maker to think harder and deeper about his problem, to have a deeper understanding of what he does.

None of the evaluation and decision models that we examined are perfect or the best. They all suffer limitations. For each one, we can find situations in which it will perform very poorly. For all these reasons (complexity, importance of the interests at stake, usefulness, popularity) plus the fact that formal models lend themselves easily to criticism, we think that it is important to deepen our understanding of evaluation and decision models and encourage their users to think more thoroughly about them. Our aim with this book is to foster reflection and critical thinking among all individuals utilising decision and evaluation models.

1.2 Audience

Most of us are confronted with formal evaluation and decision models. Very often, we use them without even thinking about it. This book is intended for the aware or enlightened practitioner, for anyone who uses decision or evaluation models–whether it be for research or applications–and is willing to question his practice, to have a deeper understanding of what he does. We have tried to keep mathematics and formalism

at a very low level so that, hopefully, most of the material will be accessible to the not mathematically-inclined readers. A rich bibliography will allow the interested reader to locate the more technical literature easily.

1.3 Structure

There are so many decision and evaluation models that it would be impossible to deal with all of them within a single book. We decided to present seven examples of such models. We chose some models because they are not often perceived as decision or evaluation models (student grades, indicators and rule based control). Some examples have been chosen because they correspond to decision models that everyone has experienced and can understand easily (student grades and voting). The other examples (cost-benefit analysis, multiple criteria decision support and choice under uncertainty) correspond to well identified and popular evaluation and decision models. Although the goal of this book is not to overwhelm the reader with theory, as will become apparent later, most of them rely on similar kinds of principles. These examples, chosen in a wide variety of domains, will hopefully allow the reader to grasp these principles.

Each example is presented in a chapter (Chapters 2 to 8), almost independent of the other chapters. Each of these seven chapters ends with a conclusion, placing what has been discussed in a broader context and indicating links with other chapters. Chapter 9 is somewhat different from the seven previous ones: it does not focus on a decision model but presents a real world application. The aim of this chapter is to emphasise the importance of the decision aiding process (the context of the problem, the position of the actors and their interactions, the role of the analyst, . . . ), to show that many difficulties arise there as well and that a coherence between the decision aiding process and the formal model is necessary.

1.4 Outline

Chapter 2 is devoted to the problem of voting. We present a sequence of twelve short examples, each one illustrating a problem that arises with a particular voting method. We begin with simple methods based on pairwise comparisons and we end up with the Borda method, each one trying to outdo the previous one but suffering its own weaknesses. Then, we informally present two theorems (Arrow and Gibbard-Satterthwaite) that in one way or another explain why we encountered so many difficulties in our twelve examples. Then we turn to the way voters' preferences are modelled. After showing the analogy between voting and multiple criteria decision support, we explore some issues that are often neglected: who is going to vote? Who are the candidates? These questions are difficult and we show that they are important. The construction of the set of voters and the set of candidates, as well as the choice of a voting method, must be considered as part of the voting process.

After examining voting, we turn in Chapter 3 to another very familiar topic for the reader: students' marks or grades. We use this familiar topic to discuss operations such as evaluating a performance and aggregating evaluations. Students are assessed in a huge variety of ways in different countries and schools. This seems to indicate that assessing students might not be trivial. Marks are used for different purposes (e.g. ranking the students, deciding whether a student is allowed to begin the next level of study, deciding whether a student gets a degree, . . . ).

In Chapter 4, three particular indicators are considered: the Human Development Index (used by the United Nations), the ATMO index (an air pollution indicator used by the French government) and the decathlon score. An indicator is a measure but, often, it is also a tool for controlling or managing (in a broad sense). We present a few examples illustrating some problems occurring with indicators. We assert that some difficulties are the consequences of the fact that the role of an indicator is often manifold and not well defined.

Cost-benefit analysis (CBA), the subject of Chapter 5, is a decision aiding method that is extremely popular among economists. Following the CBA approach, a project should only be undertaken when its benefits outweigh its costs. First we present the principles of CBA and its theoretical foundations. Then, using an example in transportation studies, we illustrate some difficulties encountered with CBA. Finally, we clarify some of the hypotheses at the heart of CBA and criticise the relevance of these hypotheses in some decision aiding processes.

In Chapter 6, using a well documented example, we present some difficulties that arise when one wants to choose from or rank a set of alternatives considered from different viewpoints. We examine several aggregation methods that lead to a value function on the set of alternatives, namely the weighted sum, the sum of utilities (direct and indirect assessment) and AHP (the Analytic Hierarchy Process). Then we turn to the so called outranking methods. Some of these methods can be used even when the data are not very rich or precise. The price we pay for this is that results provided by these methods are not rich either, in the sense that conclusions that can be drawn regarding a decision are not clear-cut.

Chapter 7 is dedicated to the study of automatic decision systems. These systems concern the execution of repetitive decision tasks and the great majority of them are based on more or less explicit decision rules aimed towards reflecting the usual decision policy of humans. Three examples are presented: the first one concerns the control of an automatic watering system while the others are about the control of a food process. The first two examples describe decision systems based on explicit decision rules; the third one addresses the case of implicit decision rules. The goal of this chapter is to show the interest of some formal tools (e.g. fuzzy sets) to model decision rules but also to clarify some problems arising when simulating the rules.

The goal of Chapter 8 is to raise some questions about the modelling of uncertainty. We present a real-life problem concerning the planning of electricity production. This problem is characterised by many different uncertainties: for example, the price of oil or the electricity demand in 20 years time. This problem is classically described by using a decision tree and solved with an expected utility approach. After recalling some well known criticisms directed against this approach, we present the approach that has been used by the team that “solved” this problem. Some of the drawbacks of this approach are discussed as well. The relevance of probabilities is criticised and other modelling tools, such as belief functions, are briefly mentioned.

Finally, we devote the last chapter to the description of a real world decision aiding process that took place in a large Italian company a few years ago. It concerns the evaluation of offers following a call for tenders for a GIS (Geographical Information System) acquisition. Some important elements, such as the participating actors, the problem formulation and the construction of the criteria, deserve greater consideration. One should ideally never consider these elements separately from the aggregation process because they can impact the whole decision process and even the way the aggregation procedure behaves.

1.5 Who are the authors?

The authors of this book are European academics working in six different universities, in France and in Belgium. They teach in engineering, business, mathematics, computer science and psychology schools. Their background is quite varied as well: mathematics, economics, engineering, operations research, law and geology, but they are all active in decision support and more particularly in multiple criteria decision support. The authors are very active in theoretical research on the foundations of decision aiding, mainly from an axiomatic point of view. Among their special interests are preference modelling, problem structuring, measurement theory, social choice theory, fuzzy logic, fuzzy set theory and possibility theory, aggregation techniques and artificial intelligence. Besides their interest in multiple criteria decision support, they have been involved in a variety of applications ranging from software evaluation to the location of a nuclear repository, through the rehabilitation of a sewer network or the location of high-voltage lines. Convinced that there is more to decision aiding than just number crunching, five of the six authors of the present volume presented their thoughts on the past and the objectives of future research in multiple criteria decision support in the Manifesto of the new MCDA era (Bouyssou, Perny, Pirlot, Tsoukiàs and Vincke 1993). In spite of the large number of co-authors, this book is not a collection of papers. It is a joint work: the authors share a common view on this field.

1.6 Conventions

To refer to a decision maker, a voter or an individual whose sex is not determined, we decided not to use the politically correct "he/she" but just "he" in order to make the text easy to read. The same applies for "his/her". The fact that all of the authors are male has nothing to do with this choice.

None of the authors is a native English speaker. Therefore, even if we did our best to write in correct English, the reader should not be surprised to find some mistakes or inelegant expressions. We beg the reader's leniency for any incorrectness that might remain. The adopted spelling is the British and not the American one.

1.7 Acknowledgements

We are ggreatly indebted to our collEague and friend Philippe Fortemps \cite{Fortemps99} /////////, who contributed to Chapter 8.%\newline The authors also wish to thank J.-L. Ottinger, who laid out the complex diagrams of that chapter, as well as H. Mélot and Stefano Abruzzini, who gave us a number of references concerning indicators. Chapter 6 is based on a report by Sébastien Clément written to fulfil the requirements of a course on multiple criteria decision support. A large part of Chapter 9 uses material already published in (Paschetta and Tsoukiàs 1999). A special thank goes to Marjorie and Diane Gassner, who had the patience to read and correct our continental approximation of the English language, and to François Glineur, who helped in solving a great number of LaTeX problems. Without him and his knowledge of LaTeX, this book would look like this paragraph.

We thank Gary Folven from Kluwer Academic Publishers for his constant support during the preparation of this manuscript.

2 CHOOSING ON THE BASIS OF SEVERAL OPINIONS: THE EXAMPLE OF VOTING

Voting is easy! You've voted hundreds of times in committees. Is there much to say about voting? Well, just think about the way heads of state or members of parliament are elected, in presidential elections or for the senate, in countries such as Australia, the UK or France.

United Kingdom's members of parliament
The territory of the UK is divided into about 650 constituencies. One representative is elected in each constituency. Each voter chooses one of the candidates in his constituency. The winner is the candidate that is chosen by more voters than any other one. Note that the winner does not have to win an overall majority of votes.

France's president
Each voter chooses one of the candidates. If one candidate receives more than 50 % of the votes, he is elected. Otherwise a second stage is organised. During the second stage, only two candidates remain: those with the highest scores. Once more, each voter chooses one of the candidates. The winner is the candidate that has been chosen by more voters than the other one.

France's members of parliament
As in the UK, the French territory is divided into single-seat constituencies. In a constituency, each voter chooses one of the candidates. If one candidate has been chosen by more than 50 % of the voters, he is elected. Otherwise a second stage is organised. During the second stage, all candidates that were chosen by more than 12.5 % of the registered voters may compete. Once again, each voter chooses one of the candidates. The winner is the candidate that received the most votes.

Australia's members of parliament
The territory is divided into single-seat constituencies called divisions. In a division, each voter is asked to rank all candidates: he puts a 1 next to his preferred candidate, a 2 next to his second preferred candidate, then a 3, and so on until his least preferred candidate. Then the ballot papers are sorted according to the first preference votes. If a candidate has more than 50 % of the ballot papers, he is elected. Otherwise, the candidate that received fewer papers than any other is eliminated and the corresponding ballot papers are transferred to the candidates that got

a 2 on these papers. Once more, if a candidate has more than 50 % of the ballot papers, he is elected. Otherwise, the candidate that received fewer papers than any other is eliminated and the corresponding ballot papers are transferred to the candidates that got a 3 on these papers, and so on. In the worst case, this process ends when all but two candidates are eliminated: one of the candidates then necessarily has more than 50 % of the papers, unless they are tied. Note that, as far as we know, the case of a tie is seldom considered in electoral laws.

Canada's members of parliament and prime minister
Every five years, the Canadian parliament is elected as follows. The territory is divided into about 270 constituencies called counties. In each county, each party can present one candidate. Each voter chooses one of the candidates. The winner in a county is the candidate that is chosen by more voters than any other one. He is thus the county's representative in the parliament. The leader of the party that has the most representatives becomes prime minister.

In spite of its apparent simplicity, the aggregation remains a difficult task: if you take a closer look at voting, you will be amazed by the incredible complexity of the subject. The diversity of the methods applied in practice probably reflects some underlying complexity and, in fact, thousands of papers have been devoted to the problem of voting (Kelly 1991) and our guess is that many more are to come. Our aim in this chapter is, on the one hand, to show that many difficult and interesting problems arise in voting and, on the other hand, to convince the reader that a formal study of voting might be enlightening. We do this through the use of small and classical examples. Those interested in voting methods and the way they are applied in various countries will find valuable information in Farrell (1997) and Nurmi (1987).

Note also that voting is not instantaneous. It is a process that begins when somebody decides that a vote should occur (or even earlier) and ends when the winner begins his mandate (or even later). It is not just counting the votes and performing some mathematical operation to find the winner.

This chapter is organised as follows. In Section 1, we analyse some classical voting systems, such as those applied in France or the United Kingdom, and we show some problems occurring when aggregating the rankings. In Section 2, we consider other preference models than the linear ranking of Section 1: some models are poorer in information but more realistic; some are richer and less realistic. In Section 3, we change the focus and try to examine voting in a much broader context. In Section 4, we discuss the analogy with multiple criteria decision support. The chapter ends with a conclusion.

2.1 Analysis of some voting systems

Throughout this section, we make the following basic assumption: each voter's preferences can accurately be represented by a ranking of all candidates from best to worse, without ties.
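As a concrete warm-up, the Australian elimination-and-transfer count described above can be simulated mechanically. The following is a minimal sketch (in Python; the nine-voter division and the candidate names are invented for illustration, they are not taken from an actual election):

```python
from collections import Counter

def alternative_vote(ballots):
    """Australian-style count: each ballot ranks the candidates from most
    to least preferred; the candidate holding the fewest papers is
    repeatedly eliminated and his papers transferred to the next remaining
    preference, until someone holds more than 50 % of the papers."""
    remaining = {c for ballot in ballots for c in ballot}
    while True:
        # Each ballot counts for its best-placed remaining candidate.
        tally = Counter(next(c for c in ballot if c in remaining)
                        for ballot in ballots)
        leader, papers = tally.most_common(1)[0]
        if 2 * papers > len(ballots):
            return leader
        # Eliminate the candidate with the fewest papers. (As noted above,
        # electoral laws rarely say how to break ties; neither do we.)
        remaining.remove(min(remaining, key=lambda c: tally[c]))

# A made-up division with nine voters and three candidates:
ballots = ([["a", "b", "c"]] * 4 +
           [["b", "c", "a"]] * 3 +
           [["c", "b", "a"]] * 2)
print(alternative_vote(ballots))  # c is eliminated, b then wins 5 to 4
```

Note that in this small profile the plurality winner would have been a (4 papers out of 9); the transfers are what give b the seat.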

2.1.1 Uninominal election

Let us recall the assumptions that we mentioned earlier and that will hold throughout this section: each voter, consciously or not, ranks all candidates from best to worst, without ties, and, in a uninominal election, each voter votes for one candidate only. For example, suppose that a voter prefers candidate a to b and b to c (in short aPbPc). He votes for a, i.e. we shall assume that each voter sincerely (or naively) votes for the candidate that he ranks in first position. We are now ready to present a first example that illustrates a difficulty in voting.

Example 1. Dictatorship of majority
Let {a, b, c, ..., z} be a set of 26 candidates for a 100 voters election. The election is uninominal and the aggregation method (the process used to extract the best candidate or a ranking of the candidates from the result of the election) is simple majority. Suppose that 51 voters have preferences aPbPcP...PyPz and 49 voters have preferences zPbPcP...PyPa. It is clear that 51 voters will vote for a while 49 vote for z. Thus a has an absolute majority and a wins. But is a really a good candidate? Almost half of the voters perceive a as the worst one, while candidate b could be a good compromise. As shown by this example, a uninominal election combined with the majority rule allows a dictatorship of majority and doesn't favour a compromise. A possible way to avoid this problem might be to ask the voters to provide their whole ranking instead of only their preferred candidate. This will be discussed later. Let us continue with some strange problems arising when using a uninominal election.

Example 2. Respect of majority in the British system
The voting system in the United Kingdom is plurality voting: the election is uninominal and the candidate with the most votes is chosen. Let {a, b, c} be the set of candidates for a 21 voters election. Suppose that 10 voters have preferences aPbPc, 6 voters have preferences bPcPa and 5 voters have preferences cPbPa. Then a (resp. b and c) obtains 10 votes (resp. 6 and 5). Thus a is chosen. Nevertheless, an absolute majority of voters prefers any other candidate to a (11 out of 21 voters prefer b and c to a), and candidate b seems to be a good candidate for everyone. Thus, in a uninominal election, the outcome might be different from what a majority of voters wanted.
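The arithmetic of Example 2 is easy to check mechanically. The sketch below is illustrative code of ours (not from the book): it counts sincere uninominal votes and then the head-to-head contest between a and b.

```python
# Illustrative sketch of Example 2 (code and names are ours, not the
# book's): sincere uninominal voting counts only each ballot's top choice.
from collections import Counter

ballots = ([("a", "b", "c")] * 10 + [("b", "c", "a")] * 6
           + [("c", "b", "a")] * 5)

first_places = Counter(b[0] for b in ballots)
winner = max(first_places, key=first_places.get)
print(winner, dict(first_places))   # a {'a': 10, 'b': 6, 'c': 5}

# Yet a majority prefers b to a in a head-to-head contest:
b_over_a = sum(1 for b in ballots if b.index("b") < b.index("a"))
print(b_over_a, "of", len(ballots))  # 11 of 21
```

The plurality winner a is thus beaten by b under majority comparison, which is exactly the tension the example describes.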

Let us see, using the same example, if such a problem would be avoided by the two-stage French system. With the French system, after the first stage, as no candidate has an absolute majority, a second stage is run between candidates a and b. Thus a obtains 10 votes and b, 11 votes, so that candidate b is elected. This time, none of the beaten candidates (a and c) is preferred to b by a majority of voters. Nonetheless, we cannot conclude that the two-stage French system is superior to the British system from this point of view, as shown by the following example.

Example 3. Respect of majority in the two-stage French system
Let {a, b, c, d} be the set of candidates for a 21 voters election. Suppose that 10 voters have preferences bPaPcPd, 6 voters have preferences cPaPdPb and 5 voters have preferences aPdPbPc. After the first stage, as no candidate has an absolute majority, a second stage is run between candidates b and c. Candidate b easily wins with 15 out of 21 votes, though an absolute majority (11/21) of voters prefer a and d to b. This is not the only weakness of the French system, as attested by the three following examples.

Example 4. Manipulation in the two-stage French system
Let us continue with the example used above. Suppose that the six voters having preferences cPaPdPb decide not to be sincere and vote for a instead of c. Then candidate a wins after the first stage because there is an absolute majority for him (11/21). If they had been sincere (as in the previous example), b would have been elected. Thus, casting a non sincere vote is useful for those 6 voters as they prefer a to b. Such a system, that may encourage voters to falsely report their preferences, is called manipulable. Because it is not necessary to be a mathematician to figure out such problems, some voters might be tempted not to sincerely report their preferences. You will note in the next example that some manipulations can be very simple.
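The two-stage mechanism of Examples 3 and 4 can be simulated directly. The sketch below is an illustrative reading of the rule (the code and function names are ours, and we assume the second stage opposes the two candidates with most first places, as in the examples).

```python
# Illustrative simulation of Examples 3 and 4 (code and names are ours).
# Second stage: the two candidates with most first places face each
# other, unless one already has an absolute majority at the first stage.
from collections import Counter

def runoff_winner(ballots):
    counts = Counter(b[0] for b in ballots)
    (top, _), (second, _) = counts.most_common(2)
    if counts[top] * 2 > len(ballots):       # absolute majority: elected
        return top
    top_votes = sum(1 for b in ballots if b.index(top) < b.index(second))
    return top if top_votes * 2 > len(ballots) else second

sincere = ([("b", "a", "c", "d")] * 10 + [("c", "a", "d", "b")] * 6
           + [("a", "d", "b", "c")] * 5)
print(runoff_winner(sincere))       # b  (Example 3)

# Example 4: the six c-first voters insincerely put a on top.
insincere = ([("b", "a", "c", "d")] * 10 + [("a", "c", "d", "b")] * 6
             + [("a", "d", "b", "c")] * 5)
print(runoff_winner(insincere))     # a
```

Running the rule on the sincere and on the manipulated profile reproduces the switch from b to a described in the text.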

Example 5. Monotonicity in the two-stage French system
Let {a, b, c} be the set of candidates for a 17 voters election. A few days before the election, the results of a survey are as follows:

6 voters have preferences aPbPc,
5 voters have preferences cPaPb,
4 voters have preferences bPcPa and
2 voters have preferences bPaPc.

Using the French system, a second stage would be run between a and b, and a would be chosen, obtaining 11 out of 17 votes. Suppose that candidate a, in order to increase his lead over b and to lessen the likelihood of a defeat, decides to strengthen his electoral campaign against b. Suppose that the survey did exactly reveal the preferences of the voters and that the campaign has the right effect on the last two voters. Hence we observe the following preferences:

8 voters have preferences aPbPc,
5 voters have preferences cPaPb and
4 voters have preferences bPcPa.

After the first stage, b is eliminated. The second stage opposes a to c and c wins, obtaining 9 votes. Candidate a thought that his campaign would be beneficial. He was wrong. Such a method is called non monotonic because an improvement of a candidate's position in some of the voters' preferences can lead to a deterioration of his position after the aggregation. It is clear that, with such a system, it is not always interesting or efficient to sincerely report one's preferences.

Example 6. Participation in the two-stage French system
Let {a, b, c} be the set of candidates for an 11 voters election. Suppose that 4 voters have preferences aPbPc, 4 voters have preferences cPbPa and 3 voters have preferences bPcPa. Using the French system, a second stage should oppose a to c and c should win the election, obtaining 7 out of 11 votes. Suppose that 2 of the 4 first voters (with preferences aPbPc) decide not to vote because c, the worst candidate according to them, is going to win anyway. What will happen? There will be only 9 voters; we suppose that the voters keep the same preferences on {a, b, c}. After the first stage, a is eliminated and, contrary to all expectations, candidate c will lose while b will win, obtaining 5 out of 9 votes. Our two lazy voters can be proud of their abstention since they prefer b to c. Clearly, such a method does not encourage participation.

Example 7. Separability in the two-stage French system
Let {a, b, c} be the set of candidates for a 26 voters election. The voters are located in two different areas: countryside and town. Suppose that the 13 voters located in the town have the following preferences:

4 voters have preferences aPbPc,
3 voters have preferences bPaPc,
3 voters have preferences cPaPb and
3 voters have preferences cPbPa.

Suppose that the 13 voters located in the countryside have the following preferences:

4 voters have preferences aPbPc,
3 voters have preferences bPaPc,
3 voters have preferences bPcPa and
3 voters have preferences cPaPb.

If an election is organised in the countryside, with 13 voters, it is easy to see that, using the French system, a is the winner, obtaining 7 votes. Suppose now that an election is organised in the town: a is the winner as well, obtaining 7 votes. Thus a is the winner in both areas, and naturally we expect a to be the winner in a global election. But it is easy to observe that, in the global election (26 voters), a is defeated during the first stage. Such a method is called non separable.

The previous examples showed that, when there are more than 2 candidates, it is not an easy task to imagine a system that would behave as expected. Note that, in the presence of 2 candidates, the British system (uninominal and one-stage) is equivalent to all other systems and it suffers none of the above mentioned problems (May 1952). Thus we might be tempted by a generalisation of the British system (restricted to 2 candidates): if there are two candidates, we use the British system; if there are more than two candidates, we arbitrarily choose two of them and we use the British system to select one. The winner is opposed (using the British system) to a new arbitrarily chosen candidate, and so on until no more candidates remain. This would require n − 1 votes between 2 candidates. Unfortunately, this method suffers severe drawbacks. It doesn't treat all candidates in a symmetric way: it rests on an arbitrary decision, i.e. such a method lacks neutrality. Candidates (or amendments) appearing at the end of the agenda are more likely to be elected than those at the beginning.

Example 8. Influence of the agenda in sequential voting
Let {a, b, c} be the set of candidates for a 3 voters election. Suppose that
1 voter has preferences aPbPc,
1 voter has preferences bPcPa
and 1 voter has preferences cPaPb.
The 3 candidates will be considered two by two in the following order or agenda: a and b first, then c. During the first vote, a is opposed to b and a wins with absolute majority (2 votes against 1). Then a is opposed to c and c defeats a with absolute majority. Thus c is elected. If the agenda is b and c first, b wins against c and is then opposed to a; a will defeat b in the second stage, hence a is elected. If the agenda is a and c first, it is easy to see that c defeats a and is then opposed to b; finally, b wins against c and is elected. Consequently, in this example, any candidate can be elected and the outcome depends completely on the agenda.

Let us note that sequential voting is very common in different parliaments. The different amendments to a bill are considered one by one in a predefined sequence: the first one is opposed to the status quo, the second one is opposed to the winner, and so on.
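Example 8's agenda effect is easy to reproduce with a few lines of code. The sketch below is ours (illustrative, not from the book): it runs the sequential pairwise procedure on the cyclic profile for three different agendas.

```python
# Sketch of sequential pairwise voting on an agenda (Example 8); the
# code and names are ours, for illustration only.
def pairwise_winner(ballots, x, y):
    """Majority contest between x and y (earlier in a ballot = preferred)."""
    x_votes = sum(1 for b in ballots if b.index(x) < b.index(y))
    return x if 2 * x_votes > len(ballots) else y

def sequential_winner(ballots, agenda):
    winner = agenda[0]
    for challenger in agenda[1:]:
        winner = pairwise_winner(ballots, winner, challenger)
    return winner

cycle = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]
for agenda in (("a", "b", "c"), ("b", "c", "a"), ("a", "c", "b")):
    print(agenda, "->", sequential_winner(cycle, agenda))
# each agenda elects a different candidate: c, a and b respectively
```

Because the majority relation on this profile is cyclic, whoever enters the contest last is advantaged, which is the neutrality failure discussed above.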

Example 9. Violation of unanimity in sequential voting
Let {a, b, c, d} be the set of candidates for a 3 voters election. Suppose that
1 voter has preferences bPaPdPc,
1 voter has preferences cPbPaPd
and 1 voter has preferences aPdPcPb.
Consider the following agenda: a and b first, then c and finally d. Candidate a is defeated by b during the first vote. Candidate c wins the second vote and d is finally elected, though all voters unanimously prefer a to d. Let us remark that this cannot happen with the French and British systems.

2.1.2 Election by rankings

Up to now, we have assumed that the voters are able to rank all candidates from best to worst without ties, but the only information that we collected was the best candidate. Why not try to palliate the many encountered problems by asking voters to explicitly rank the candidates? In this kind of election, each voter provides a ranking without ties of the candidates. Hence, the task of the aggregation method is to extract from all these rankings the best candidate or a ranking of the candidates reflecting the preferences of the voters as much as possible. This idea, though interesting, will lead us to many other pitfalls that we discuss just below.

At the end of the 18th century, two aggregation methods for election by rankings appeared in France. One was proposed by Borda, the other by Condorcet. Although other methods have been proposed since, their methods are still at the heart of many scientists' concerns; in fact, many methods are variants of the Borda and Condorcet methods.

The Condorcet method
Condorcet (1785) suggests to compare all candidates pairwise in the following way. A candidate a is preferred to b if and only if the number of voters ranking a before b is larger than the number of voters ranking b before a. In case of tie, candidates a and b are indifferent. A candidate that is preferred to all other candidates is called a (Condorcet) winner. In other words, a winner is a candidate that, opposed to each of the n − 1 other candidates, wins by a majority. It can be shown that there is never more than one Condorcet winner. The principle underlying the Condorcet method (the candidate that beats all other candidates in a pairwise contest is the winner) seems very natural, close to the concept of democracy and hence very appealing.

Note that both the British as well as the two-stage French methods are different from the Condorcet method. In example 2, candidate a is elected by the British method but b is the Condorcet winner. In example 3, a is the Condorcet winner although b is chosen by the French method.
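The Condorcet rule just described can be sketched in a few lines. The code below is ours (an illustrative reading, not the book's notation); applied to Example 2's profile, it finds the Condorcet winner b that the British method misses.

```python
# Illustrative sketch of the Condorcet rule (code and names are ours):
# a candidate wins when it beats every other candidate pairwise.
def condorcet_winner(ballots):
    candidates = ballots[0]
    for a in candidates:
        beats_all = all(
            2 * sum(1 for b in ballots if b.index(a) < b.index(c))
            > len(ballots)
            for c in candidates if c != a)
        if beats_all:
            return a
    return None   # no Condorcet winner: a "Condorcet paradox"

# Example 2: the British method elects a, yet b beats both a (11-10)
# and c (16-5) pairwise, so b is the Condorcet winner.
ballots = ([("a", "b", "c")] * 10 + [("b", "c", "a")] * 6
           + [("c", "b", "a")] * 5)
print(condorcet_winner(ballots))   # b
```

Returning `None` when no candidate beats all others corresponds to the paradoxical profiles discussed below.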

Nevertheless, although this principle seems close to the concept of democracy, it might be questioned in some instances: in example 1, a is the Condorcet winner although almost half of the voters consider him to be the worst candidate. Consider also example 10, taken from Fishburn (1977).

Example 10. Critique of the majority principle
Let {a, b, c, d, e, f, g, x, y} be a set of 9 candidates for a 101 voters election. Suppose that

19 voters have preferences yPaPbPcPdPePfPgPx,
21 voters have preferences ePfPgPxPyPaPbPcPd,
10 voters have preferences ePxPyPaPbPcPdPfPg,
10 voters have preferences fPxPyPaPbPcPdPePg,
10 voters have preferences gPxPyPaPbPcPdPePf and
31 voters have preferences yPaPbPcPdPxPePfPg.

Candidate x wins against every other candidate with a majority of 51 votes. Thus x is the Condorcet winner. But let us focus on the candidates x and y and summarise their results in Table 2.1.

k    1    2    3    4    5    6    7    8    9
x    0   30    0   21    0   31    0    0   19
y   50    0   30    0   21    0    0    0    0

Table 2.1: Number of voters who rank the candidate in k-th place in their preferences

In view of Table 2.1, it seems that y should be elected.

Furthermore, there are cases (called Condorcet paradoxes) where there is no Condorcet winner. Consider example 8: a is preferred to b, b is preferred to c and c is preferred to a. No candidate is preferred to all others. In such a case, the Condorcet method fails to elect a candidate. One might think that example 8 is very bizarre and very unlikely to happen. Unfortunately, it isn't. If you consider an election with 25 voters and 11 candidates, the probability of such a paradox is significantly high as it is approximately 1/2 (Gehrlein 1983), and the more candidates or voters, the higher the probability of such a paradox. Note that, in order to obtain this result, all rankings are supposed to have the same probability. Such an hypothesis is clearly questionable (Gehrlein 1983). Many methods have been designed that elect the Condorcet winner, if he exists, and choose a candidate in any case (Fishburn 1977, Nurmi 1987).

The Borda method
Borda (1781) proposed to use the following aggregation method. In each voter's preference, each candidate has a rank: 1 for the first candidate in the ranking, 2 for the second, ..., and n for the last. Compute the Borda score of each candidate, i.e. the sum for all voters of that candidate's rank. Then choose the candidate with the lowest Borda score.
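A one-line implementation of the Borda score makes the contrast with plurality concrete. The snippet below is illustrative code of ours; on Example 2's profile, the Borda winner is b rather than the plurality winner a.

```python
# Illustrative sketch of the Borda rule (code and names are ours):
# a candidate's score is the sum over voters of his rank, 1 = first;
# the lowest score wins.
def borda_scores(ballots):
    return {c: sum(b.index(c) + 1 for b in ballots) for c in ballots[0]}

ballots = ([("a", "b", "c")] * 10 + [("b", "c", "a")] * 6
           + [("c", "b", "a")] * 5)
scores = borda_scores(ballots)
print(scores)                        # {'a': 43, 'b': 36, 'c': 47}
print(min(scores, key=scores.get))   # b
```

Note that the scores also induce a full ranking of the candidates (b, a, c here), a point made again just below.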

Note that the Borda method not only allows to choose one candidate but also to rank them (by increasing Borda scores). If two candidates have the same Borda score, then they are indifferent, i.e. they are considered as equivalent. In that case, the Borda method does not tell us which one to choose. But the likelihood of indifference is rather small and decreases as the number of candidates or voters increases: for 3 candidates and 2 voters, the probability of all candidates being tied is 1/3; for 3 candidates and 50 voters, it is less than 1 %. Note that, once again, we supposed that all rankings have the same probability.

Example 11. Comparison of the Borda and Condorcet methods
Let {a, b, c, d} be the set of candidates for a 3 voters election. Suppose that 2 voters have preferences bPaPcPd and 1 voter has preferences aPcPdPb. The Borda score of a is 5 = 2 × 2 + 1 × 1. For b, it is 6 = 2 × 1 + 1 × 4. Candidates c and d receive 8 and 11. The alternative with the lowest Borda score is a; thus a is the winner. With the Condorcet method, the conclusion is different: b is the Condorcet winner. Thus, when a Condorcet winner exists, it is not always chosen by the Borda method. Nevertheless, it can be shown that the Borda method never chooses a Condorcet loser, i.e. a candidate that is beaten by all other candidates by an absolute majority (contrary to the British system, see example 2).

Suppose now that candidates c and d decide not to compete because they are almost sure to lose. Using the Condorcet method, b remains the winner, and it can be shown that this is always the case: if a candidate is a Condorcet winner, then he is still a Condorcet winner after the elimination of some candidates. With the Borda method, the new winner is b. Thus b now defeats a just because c and d dropped out: the fact that a defeats or is defeated by b depends upon the presence of other candidates. This can be a problem as the set of the candidates is not always fixed. It can vary because candidates withdraw, because new solutions emerge during discussions, or because feasible solutions become infeasible or the converse.
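Example 11's withdrawal effect can be checked mechanically. The sketch below is illustrative code of ours: it computes Borda scores on the full candidate set, then on the ballots restricted to {a, b}.

```python
# Illustrative sketch of Example 11's withdrawal effect under the Borda
# rule (code and names are ours).
def borda_scores(ballots):
    return {c: sum(b.index(c) + 1 for b in ballots) for c in ballots[0]}

def restrict(ballots, kept):
    """Remove withdrawn candidates while preserving each voter's order."""
    return [tuple(c for c in b if c in kept) for b in ballots]

ballots = [("b", "a", "c", "d")] * 2 + [("a", "c", "d", "b")]
full = borda_scores(ballots)
print(full, min(full, key=full.get))           # a wins on {a, b, c, d}

reduced = borda_scores(restrict(ballots, {"a", "b"}))
print(reduced, min(reduced, key=reduced.get))  # b wins once c, d withdraw
```

No voter changes his mind between the two computations; only the candidate set changes, which is the dependence on "irrelevant" candidates discussed next.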

Example 12. Borda and the independence of irrelevant alternatives
Let {a, b, c} be the set of candidates for a 2 voters election. Suppose that 1 voter has preferences aPcPb and 1 voter has preferences bPaPc. Using the Borda method, a is the winner. Now consider a new election where the alternatives and the voters are identical, but the voters changed their preferences about c: 1 voter has preferences aPbPc and 1 voter has preferences bPcPa. It turns out that b now has the lowest Borda score: the new winner is b. Note that none of the two voters changed his opinion about the pair {a, b}: the first (resp. second) voter prefers a (resp. b) in both cases. Only the relative position of c changed, and this was enough to turn b into a winner and a into a loser. One says that the Borda method does not satisfy the independence of irrelevant alternatives. This can be seen as a shortcoming of the Borda method.

2.1.3 Some theoretical results

We could go on and on with examples showing, in an informal way, that any method you can think of suffers severe problems. But we think it is time to stop, for at least two reasons. First, it is not very constructive and, second, each example is related to a particular method, hence this approach lacks generality. A more general (and thus theoretic) approach is needed. We should find a way to answer questions like:

• Do non manipulable methods exist?
• Is it possible for a non separable method to satisfy unanimity?
• ...

In another book, in preparation, we will follow such a general approach; in the present volume, we try to present various problems arising in evaluation and decision models in an informal way and to show the need for formal methods. Nevertheless, we cannot resist the desire to present now, in an informal way, some of the most famous results of social choice theory.

Arrow's theorem
Arrow (1963) was interested in the aggregation of rankings with ties into a ranking, possibly with ties. We will call this ranking the overall ranking. He examined the methods verifying the following properties.

Universal domain. Whatever the rankings provided by the voters, the method must yield an overall ranking of the candidates. This property implies that the aggregation method must be applicable to all cases: it rules out methods that would impose some restrictions on the preferences of the voters.

Transitivity. The result of the aggregation must always be a ranking, possibly with ties. This implies that, if aPb and bPc in the overall ranking, then aPc in the overall ranking. Example 8 showed that the Condorcet method doesn't verify transitivity: a is preferred to b, b is preferred to c and c is preferred to a.

Unanimity. If all voters are unanimous about a pair of candidates, i.e. if all voters rank a before b, then a must be ranked before b in the overall ranking. It can be shown that the Condorcet method satisfies this property. This seems quite reasonable, but example 9 showed that some commonly used

aggregation methods fail to respect unanimity. This property is often called the Pareto condition.

Independence. The relative position of two candidates in the overall ranking depends only on their relative positions in the individual preferences. Therefore, other alternatives are considered as irrelevant with respect to that pair. This property is often called independence of irrelevant alternatives. Note that we observed in example 12 that the Borda method violates the independence property.

Non-dictatorship. None of the voters can systematically impose his preferences on the other ones. This rules out aggregation methods such that the overall ranking is always identical to the preference ranking of a given voter. This may be seen as a minimal requirement for a democratic method.

These five conditions allow to state Arrow's celebrated theorem.

Theorem 2.1 (Arrow) When the number of candidates is at least 3, there exists no aggregation method satisfying simultaneously the properties of universal domain, transitivity, unanimity, independence and non-dictatorship.

Note that Arrow's theorem uses only five conditions that, at least at first glance, are quite weak. Yet the result is powerful. To a large extent, this theorem explains why we encountered so many difficulties when trying to find a satisfying aggregation method. If, in addition to these five conditions, we wish to find a method satisfying neutrality, monotonicity, separability, non-manipulability, ..., we face an even more puzzling problem.

For example, let us observe that the Borda method satisfies the universal domain, transitivity, unanimity and non-dictatorship properties. Therefore, as a consequence of theorem 2.1, we can deduce that it cannot satisfy the independence condition. What about the Condorcet method? It satisfies the universal domain, unanimity, independence and non-dictatorship properties. Hence it cannot verify transitivity (see example 8).

Gibbard-Satterthwaite's theorem
Gibbard (1973) and Satterthwaite (1975) were interested in the (non-)manipulability of aggregation methods, especially those leading to the election of a unique candidate. Informally, a method is non-manipulable if, in no case, a voter can improve the result of the election by not reporting his true preferences. They proved the following result.

Theorem 2.2 (Gibbard-Satterthwaite) When the number of candidates is larger than two, there exists no aggregation method satisfying simultaneously the properties of universal domain, non-manipulability and non-dictatorship.

Example 4 concerning the two-stage French system can be revisited bearing in mind theorem 2.2. The French system satisfies universal domain and non-dictatorship. Therefore, it is not surprising that it is manipulable.

Many other impossibility results can be found in the literature, but this is not the place to review them. Besides impossibility results, many characterisations are available. A characterisation of a given aggregation method is a set of properties simultaneously satisfied by only that method. These results help to understand the fundamental principles of a method and to compare different methods.

At the beginning of this chapter, we decided to focus on elections of a unique candidate. Some voting systems lead to the election of several candidates and aim towards achieving a kind of proportional representation. One might think that those systems are the solution to our problems. In fact, they are not. For example, suppose that a parliament has been elected using proportional representation. This parliament will have to vote on many different issues and, very often, only one candidate or law or project will have to be chosen. Furthermore, those systems raise as many questions (perhaps more) as the ones we considered (Balinski and Young 1982).

2.2 Modelling the preferences of a voter

Let us consider the assumption that we made in Section 2.1: the preferences of each voter can accurately be represented by a ranking of all candidates from best to worst. We all know that this is not always realistic. In some instances, a voter is not able to rank the candidates; in other cases, he is able to rank them but another kind of modelling of his preferences would be more accurate. In this section, we list different cases in which our initial assumption is not valid; there are many other reasons to question our assumption.

2.2.1 Rankings

To model the preferences of a voter, we can use a ranking of the candidates, without ties. This model corresponds to the assumption of Section 2.1. It implies that, when you present a pair of candidates (a, b) to a voter, he is always able to tell if he prefers a to b or the converse. Furthermore, if he prefers a to b and b to c, he necessarily prefers a to c (transitivity of preference).

Indifference: rankings with ties
In some cases, a voter is unable to state if he prefers a to b or the converse, just because he considers them as equivalent: he thinks that both candidates are of equal value. Those candidates are tied. When there are several candidates that a voter cannot rank, we can model his preferences by a ranking with ties. For each pair of candidates (a, b), we have "a is preferred to b", the converse, or "a is indifferent to b" (which is equivalent to "b is indifferent to a"). Preference still is transitive. For example, suppose that a voter prefers a to b, c and d, is indifferent between b and c and, finally, prefers b and c to d.
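A ranking with ties can be represented very compactly. The sketch below is illustrative code of ours: each candidate gets a "level" (smaller is better), which makes both preference and indifference transitive by construction; the levels encode the example just given.

```python
# Illustrative sketch (code and names are ours): a ranking with ties
# encoded by one integer level per candidate, smaller = better.
levels = {"a": 1, "b": 2, "c": 2, "d": 3}   # a first; b, c tied; d last

def prefers(x, y):
    return levels[x] < levels[y]

def indifferent(x, y):
    return levels[x] == levels[y]

print(prefers("a", "b"), indifferent("b", "c"), prefers("b", "d"))
# True True True
```

Because comparisons reduce to comparing integers, transitivity of preference and of indifference both hold automatically in this model.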

A graphic representation of this model is given in Figure 2.1, where an arrow between two candidates (e.g. a and b) means that a is preferred to b and a line between them means that a is indifferent to b. Arrows implied by transitivity are not represented.

[Figure 2.1: A complete pre-order on the candidates a, b, c, d]

Note that, in a ranking with ties, indifference also is transitive: if a voter is indifferent between a and b and between b and c, he is also indifferent between a and c.

Incomparability: partial rankings
It can also occur that a voter is unable to rank the candidates, not because he thinks that some of them are equivalent, but because he cannot compare some of them. One might confuse this situation with indifference but, in fact, they are different. There can be several reasons for incomparability; of course, this list is not exhaustive.

Poor information. Suppose that a voter must compare two candidates a and b about which he knows almost nothing, except that their names are a and b and that they are candidates. Such a voter cannot declare that he prefers a to b nor the converse. If he is forced to express his preferences by means of a ranking with ties, he will probably rank a and b tied rather than ranking one above the other. But this would not really reflect his preferences, because he has no reasons to consider that they are equivalent. Hence, he is better off not stating any preferences about them.

Conflicting information. Suppose that a voter has to compare two candidates a and b about which he knows a lot. In some respects, a is far better than b but, in other respects, b is far better than a. And he does not know how to balance the pros and cons, or he does not want to do so for the moment. It is very likely that one is better than the other but, as he doesn't know which one, it is difficult to say.

Confidential information. Suppose that your mother invited you and your wife for dinner. At the end of the meal, your mother says "I have never eaten such a good pie! Does NameOfYourWife prepare it as well as I do?" No matter what your preference is, you would probably be very embarrassed to answer. And your answer is very likely to be "Well, I like both but I cannot compare them." Such situations are very common in real life, where people do not tell the truth, all the truth and nothing but the truth about their preferences.

We therefore need to introduce a new model in which voters are allowed to express incomparabilities. Hence, when comparing two candidates a and b, four situations can arise:

1. a is preferred to b,

2. b is preferred to a,
3. a is indifferent to b, or
4. a and b are incomparable.

If we keep the transitivity of preference (and indifference), the structure we obtain is called a partial ranking.

Example 13. Transitivity and coffee: semiorders
Consider a voter who is indifferent between a and b as well as between b and c. If we use a ranking with ties to model his preferences, he is necessarily indifferent between a and c. Is this what we want? We are going to borrow a small example from Luce (1956) to show that transitivity of indifference should be dropped, at least in some cases. Let us suppose that I present two cups of coffee to a voter: one cup without sugar, the other one with one grain of sugar. Let us also suppose that he likes his coffee with sugar. If I ask him which cup he prefers, he will tell me that he is indifferent (because he is not able to detect one grain of sugar). I will then present him a cup with one grain and another with two; he will still be indifferent. And so on with two grains and three grains, until nine hundred ninety nine and one thousand grains. The voter will always be indifferent between the two cups that I present to him, because they differ by just one grain of sugar. Because of the transitivity of indifference, he must also be indifferent between a cup without sugar and a cup with one thousand grains (2 full spoons). However, if I ask him which one he prefers, he will choose the cup with one thousand grains. Thus transitivity of indifference is violated. A possible objection to this is that the voter will be tired before he reaches the cup with one thousand grains. Furthermore (this is more serious), the coffee will be cold and he hates that.

There is a structure that keeps transitivity of preference and drops it for indifference. It is called a semiorder and, consequently, it can model the preferences of our coffee drinker. For details about semiorders, see Pirlot and Vincke (1997).

Example 14. Transitivity and poneys: more semiorders
Do we need semiorders only when a voter cannot distinguish between two very similar objects? The following example, adapted from Armstrong (1939), will give the answer. Suppose that you ask your child to choose between two presents for his birthday: a poney and a blue bicycle. As he likes both of them equally, if you ask him which one he prefers, he will tell you that he is indifferent. Suppose now that you present him a third candidate: a red bicycle with a small bell. He will probably tell you that he prefers the red one to the blue one. "So, you prefer the red bicycle to the poney, is that right?" you would say if you consider a transitive indifference. But of course, it is obvious that the child can still be indifferent between the poney and the red bicycle.
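The coffee story can be captured by a discrimination threshold: a semiorder arises when "preferred" means "better by more than a fixed margin". The sketch below is ours (the one-grain threshold is only illustrative).

```python
# Illustrative threshold model of Example 13 (code and names are ours):
# a cup with u grains is preferred to one with v grains only when the
# difference exceeds one grain, the smallest detectable amount.
THRESHOLD = 1

def prefers(u, v, q=THRESHOLD):
    return u > v + q

# Each pair of consecutive cups (g grains vs g+1 grains) is tied ...
adjacent_indifferent = all(
    not prefers(g + 1, g) and not prefers(g, g + 1) for g in range(1000))
print(adjacent_indifferent)   # True

# ... and yet 1000 grains beats no sugar: indifference is not transitive.
print(prefers(1000, 0))       # True
```

Preference stays transitive in this model (if u > v + q and v > w + q then u > w + q), while indifference does not, which is exactly the semiorder structure introduced above.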

[Figure 2.2: The poney vs bicycles semiorder (poney, red bike, blue bike)]

Other binary relations
Rankings with or without ties, partial rankings and semiorders are all binary relations. Many other families of binary relations have been considered in the literature in order to formally model the preferences of individuals as faithfully as possible (e.g. Roubens and Vincke 1985, Fishburn 1988, Fishburn 1991, Abbas, Pirlot and Vincke 1996). Note that even the transitivity of strict preference can be questioned, due to empirical observations (e.g. Tversky 1969, Sen 1997). Let us now focus on another kind of mathematical structure used to model the preferences of a voter.

2.2.2 Fuzzy relations

Fuzzy relations can be used to model preferences in at least two very different situations.

Fuzzy relations and uncertainty
When a voter is asked to express his preferences by means of a binary relation, he has to examine each pair and choose "a is preferred to b", "b is preferred to a", "a is indifferent to b" or "a and b are incomparable" (if indifference and incomparability are allowed). In fact, reality is more subtle: when facing a question like "do you prefer a to b", a voter might hesitate. It is easy to imagine situations where a voter would like to say "perhaps". And it is just a step further to imagine different situations where a voter would hesitate but with various degrees of confidence: almost yes but not completely sure, perhaps but more on the side of yes, perhaps, perhaps but more on the side of no, ... There can be many reasons for his hesitations.

• He does not have full knowledge about the candidates. For example, in a legislative election, a voter does not necessarily know what the position of all candidates is regarding a particular issue.

• He does have full knowledge about the candidates, but not about some events that might occur in the future and affect the way he compares the candidates. For example, again in a legislative election, a voter might ideally know everything about all candidates. But he does not know if, during the forthcoming mandate, the representatives will have to vote on a particular issue. If such a vote is to occur, he might prefer candidate a to candidate b.

Fuzzy relations can be used to model such preferences. A value of 0 would correspond to no preference. . when a voter is asked to tell if he prefers a to b. If he feels that “a is preferred to b” is deﬁnitely false. a probability distribution on the possible consequences is assumed to exist. . a and b) is the answer of the voter to the question “is a preferred to b”. perhaps could be 0. • He does not fully know his preferences. not because he is uncertain about his judgement. he will tend to express faint diﬀerences in his judgement. For intermediate situations. Fuzzy relations and preference intensity In some cases. The voter is uncertain about the ﬁnal consequences of his choice. CHOOSING ON THE BASIS OF SEVERAL OPINIONS In the other case. You like tennis and your children would love that playground. You will have access to both facilities under the same conditions. probabilities of preference might be assigned to each pair. A typical fuzzy relation on three candidates is illustrated by Fig. plan. We might then model his preferences by a fuzzy relation and choose 0. in some cases. 0. time to completion.5 and almost yes. You perfectly know the two options (budget. There are two options: a tennis court or a playground.g. The voter must still answer the above mentioned question (do you prefer a to b ?). For example. b) and 0. no longer by yes or no.22 CHAPTER 2. but because the concept of preference is vague and not well deﬁned. You have to vote. 0.0 0.3 where a number on the arrow between two candidates (e.0 c Figure 2.8 for (c. d). If he feels that “a is preferred to b” is deﬁnitely true. In these cases. . he might prefer b to a because there is just one thing that he disapproves of the policy of b: his position about that particular issue. ). he chooses intermediate numbers. 2. For example.3: A fuzzy relation Note that.4 b 0.9.6 0.5 for (a.3 a 1. he answers 0. This is due to the fact that preference is not a clear-cut concept. but by numbers. 
Can you tell which one you will choose ? What will you enjoy more ? To play tennis or to let your children play in the playground ? These three cases can be seen as three facets of a single problem. he answers 1. a voter might say “I deﬁnitely prefer a to b but not as much as I prefer c to d”.8 0. In such cases. Suppose that the community in which you live has decided to build a new recreational facility. the problem faced by the voter is no longer uncertainty but risk. .
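Such numerical answers are easy to represent and manipulate. The following sketch (not from the book; candidate names and degrees are illustrative) stores a voter's fuzzy preference degrees and recovers a crisp relation by cutting them at a threshold:

```python
# Illustrative fuzzy preference degrees for some pairs of candidates.
fuzzy_pref = {
    ("a", "b"): 0.5,   # "perhaps"
    ("c", "d"): 0.8,   # "almost yes"
    ("b", "a"): 0.0,   # "definitely not"
}

def crisp_cut(relation, threshold):
    """Keep only the pairs whose preference degree reaches the threshold."""
    return {pair for pair, degree in relation.items() if degree >= threshold}

print(crisp_cut(fuzzy_pref, 0.5))
```

Cutting at 0.5 keeps every pair the voter leans towards at least as much as "perhaps"; raising the threshold yields smaller, more demanding crisp relations.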

Until now, we were using complete orders to model the voters' preferences; we then examined alternative models such as semiorders and fuzzy relations.

2.2.3 Other models

Many other models can be conceived or have been described in the literature. An important one is the utilitarian model: a voter assigns to each candidate a number (the utility of the candidate). The position of a candidate with respect to any other candidate is a function only of the utilities of the two candidates. If the utilities of a and b are respectively 50 and 40, the implication is that a is preferred to b. Furthermore, if the utilities of c and d are respectively 30 and 10, it implies that the preference between c and d is twice as large as the preference between a and b.

Another important model is used in approval voting (Brams and Fishburn 1982). In this voting system, every voter votes for as many candidates as he wants or approves. Consequently, the preferences of a voter are modelled by a partition of the set of candidates into two subsets: a subset of approved candidates and a subset of disapproved candidates. Approval voting received a lot of attention during the last twenty years and has been adopted by a number of committees.

We will not continue our list of preference models any further. Our aim was just to give a small overview of the many problems that can arise when trying to model the preferences of a voter. Is it easier to aggregate individual preferences modelled by means of complete pre-orders, semiorders, fuzzy relations, ...? Unfortunately, the answer is no. We encountered many problems in Section 2.1, and many similar examples can be built to demonstrate this (Sen 1986, Salles, Barrett and Pattanaik 1992). For a thorough review of fuzzy preference modelling, see (Perny and Roubens 1998).
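The utilitarian model lends itself to a direct sketch (hypothetical code, with the utilities taken from the example above):

```python
# Utilities as in the text: a voter assigns a number to each candidate.
utilities = {"a": 50, "b": 40, "c": 30, "d": 10}

def prefers(x, y):
    """x is preferred to y whenever its utility is larger."""
    return utilities[x] > utilities[y]

def intensity(x, y):
    """Preference 'intensity' read as a utility difference."""
    return utilities[x] - utilities[y]

print(prefers("a", "b"))
print(intensity("c", "d") / intensity("a", "b"))  # c over d is twice a over b
```

Note how both the direction and the intensity of every pairwise preference are derived from the utilities alone, independently of the other candidates.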

2.3 The voting process

Until now, we considered only the modelling of the preferences of a voter and the aggregation of the preferences of several voters. But voting is much more than that; there is an important issue that we still must address. In this section, we present a few points that are part of the voting process, even if they are often left aside in the literature.

2.3.1 Definition of the set of candidates

Who is going to define the candidates or alternatives that will be submitted to a vote? All the voters, some of them or one of them? In some cases, e.g. presidential elections, the candidates are voters that become candidates on a voluntary basis. In addition, there are often some rules: not everyone can be a candidate. Who should fix these rules and how? There is an even more fundamental question: who should decide that voting should occur, on what issue, according to which rules? All these questions received different answers in different countries and committees, past or present. This may indicate that they are far from trivial.

Let us now be more pragmatic. The board of directors of a company asks the executive committee to prepare a report on the future investment strategies; a vote on the proposed strategies will be held during the next board of directors meeting. How should the executive committee prepare its report? Should they include all strategies, even infeasible ones? If infeasible ones are to be avoided, who should decide that they are infeasible? To find all feasible strategies might be prohibitively resource and time consuming, and one can never be sure that all feasible strategies have been explored. A more or less arbitrary selection needs to be made: there is no systematic way, no formal method to do that. Creativity and imagination are needed during this process. Hence, suppose that the executive committee decides to explore only some strategies. Even if they make this selection in a perfectly honest way, it can have far reaching consequences on the outcome of the process. Remember example 11, in which we showed that, for some aggregation methods, the relative ranking of two candidates depends on the presence (or absence) of some other candidates. Furthermore, some studies show that an individual can prefer a to b or b to a depending on the presence or absence of some other candidate (Sen 1997).

2.3.2 Definition of the set of the voters

Who is going to vote? As in the previous subsection, there is no universal answer: everyone, citizens, men, men and women, white men, rich people, noble people, experts who have some knowledge about the discussed problem, ...? To see that this question is far from trivial, let us look at different democracies, past or present. And when the voters are representatives of factions, should there be one representative for each faction, or a number of representatives proportional to the size of that faction? Some references or more details on this topic can be found in (Sen 1997, Suzumura 1999).

2.3.3 Choice of the aggregation method

Even the choice of the aggregation method can be considered as part of the voting process for, in some cases, the aggregation method is at least as important as the result of the vote. Consider two countries, A and B: A is ruled by a dictator; B is a democracy. Suppose that each time a policy is chosen by voting in B, the dictator of A applies the same policy in his country. Consequently, all governmental decisions are the same in A and B. The only difference is that the people in A do not vote: their benevolent dictator decides alone, without voting. In what country would you prefer to live? I guess you would choose B, unless you are the dictator. And you would probably choose B even if the decisions taken in B were a little bit worse than the decisions taken in A. What we value in B is freedom of choice.
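The dependence of the outcome on the set of candidates can be made concrete with a small plurality election (a hypothetical example with invented rankings, in the spirit of example 11): removing a losing candidate changes the winner.

```python
# Each voter votes for his highest-ranked candidate among those running.
from collections import Counter

def plurality_winner(rankings, candidates):
    """Plurality winner when the election is restricted to `candidates`."""
    votes = Counter(
        next(c for c in ranking if c in candidates) for ranking in rankings
    )
    return votes.most_common(1)[0][0]

# 4 voters rank a > c > b, 3 voters b > c > a, 2 voters c > b > a.
rankings = [["a", "c", "b"]] * 4 + [["b", "c", "a"]] * 3 + [["c", "b", "a"]] * 2

print(plurality_winner(rankings, {"a", "b", "c"}))  # a leads with 4 votes
print(plurality_winner(rankings, {"a", "b"}))       # dropping c makes b win
```

Candidate c never wins, yet whether c runs decides the contest between a and b: exactly the kind of sensitivity to the candidate set discussed above.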

2.4 Social choice and multiple criteria decision support

2.4.1 Analogies

There is an interesting analogy between voting and multiple criteria decision support. In a large part of the literature on voting, most papers consider an entity, called group or society, that has to choose a candidate from a set of candidates. This entity consists of individuals and its composition can vary largely in different groups; the individuals often have conflicting views about the candidates, and the choice made by this entity must reflect in some way the opinion of the individuals. In multiple criteria decision support, most papers consider an entity, called decision-maker, that wants to choose an alternative from a set of available alternatives. The decision-maker is often assumed to be an individual, a person. To make his choice, the decision-maker takes several viewpoints, called criteria, into account. These criteria are often conflicting: according to a criterion, a given alternative is the best one while, according to another criterion, other alternatives are better.

Let us be more explicit. The preferences of an individual, in social choice, play the same role as the preferences along a single viewpoint or criterion in multiple criteria decision support. The collective or social preferences, in social choice theory, and the global or multiple criteria preferences, in multiple criteria decision support, can be compared in the same way. Replace criteria by voters, alternatives by candidates and you get it. In fact, this similarity has widely been used (see e.g. Arrow and Raynaud 1986, Vansnick 1986).

The main interest of this analogy lies in the fact that voting has been studied for a long time. The seminal works by Borda (1781), Condorcet (1785) and Arrow (1963) have led to an important stream of research in this domain in the 20th century. Hence we have a huge amount of results on voting at our disposal for use in multiple criteria decision support.

In this chapter, we only discussed elections in which one candidate must be chosen (single-seat constituencies, prime ministers or presidents). However, it is often the case that several candidates must be chosen. For example, in Belgium and Germany, in each constituency, several representatives are elected so as to achieve a proportional representation. In multiple criteria decision support, such cases are common: a human resources manager chooses amongst the candidates those that will form an efficient team; an investor usually invests in a portfolio of stocks; a committee that must select projects from a list often selects several ones, according to the available resources.

Besides, the comparison can be extended to the processes of voting and decision-making. The very beginning of the process, the problem definition, is a crucial step. In fact, the decision process is much broader than just the extraction, by some aggregation method, of the best alternative from a performance tableau. When a decision maker enters a decision process, he has, for some reasons, no clearly defined problem. He just feels unsatisfied with his current situation. He then tries to structure his view of the situation.
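As a minimal illustration of the analogy (not taken from the book), the same Borda-style scoring rule can aggregate either voters' rankings of candidates or criterion-wise rankings of alternatives:

```python
# Borda-style aggregation: each ranking lists items from best to worst,
# and the best item in a ranking of n items receives n-1 points.
def borda(rankings):
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - 1 - position)
    return scores

# Read these as three voters ranking candidates -- or, equally well,
# as three criteria ranking alternatives.
print(borda([["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"]]))
```

Replacing "voter" by "criterion" changes nothing in the computation, which is precisely why results on voting rules transfer to multiple criteria aggregation.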

He tries to put labels on different entities, to look for relationships between entities, etc. Finally he obtains a "problem", as one can find in books: a description, in formal language or not, of the current situation. It usually contains a description of the reasons for which that situation is not satisfying and an implicit description of the potential solutions to the problem. That is, the problem statement contains information that allows to recognise if a given action or course of actions is a potential solution or not. The problem statement must not be too broad, otherwise anything can be a solution and the decision-maker is not helped. On the contrary, if the statement is too narrow, some actions are not recognised as potential solutions even if they would be good ones. Some authors, mainly in the United Kingdom, have developed methods to help decision-makers to better structure their problem (Rosenhead 1989, Daellenbach 1994).

When the problem has been stated, the decision-maker has a problem but no solution. He must construct the set of alternatives. Brainstorming and other techniques promoting and stimulating creativity have been developed to support this step. The alternatives, like the candidates set in social choice, are not given in a decision process.

The decision-maker then needs to identify all the viewpoints that are relevant with respect to his problem. He must define a set of criteria that reflect all relevant viewpoints and that fulfil some conditions: there must not be several criteria reflecting the same viewpoint; all criteria should be independent, except if the aggregation method to be used thereafter allows dependence between criteria; and, depending on the aggregation method, the scales corresponding to the criteria must have some properties. The criteria, like the voters, are not given either. See e.g. Roy (1996) and Keeney and Raiffa (1976).

Last but not least, the aggregation method itself must be chosen by the analyst and/or the decision-maker. It is hard to imagine how an aggregation procedure could be scientifically proven to be the best one. The decision-maker must thus make a choice: he should choose the one that satisfies some properties he judges important, the one he trusts, the one he can understand.

2.5 Conclusions

In this chapter, we have shown that the operation of voting is far from simple. In the first section, using small examples describing very simple situations, we found that intuition and common sense are not sufficient to avoid the many traps that await us when using aggregation procedures; in this domain, common sense is of very little help. We also presented two theoretical results indicating that there is no hope of finding a perfect voting procedure. So, if we still want to use a voting procedure (this seems hardly avoidable), we must accept to use an imperfect one. But this does not mean that we can use any procedure in any circumstance and any way. Some features of a voting procedure may be highly desirable in a given context while not so important in another one; the flaws of a particular procedure are probably less damageable in some instances than in others. Therefore, for each voting context, we have to choose the procedure that best matches our

needs. And, when we have made this choice, we must be aware that the match is not perfect and that we must use the procedure in such a way that the risk of facing a problematic situation is kept as low as possible.

In Section 2, we found that even the input of voting procedures (the preferences of the voters) are not simple things. Many different models for preferences exist and can be used in aggregation procedures; the choice of a particular model (ranking with ties, fuzzy relations, ...) is itself arbitrary. Finally, in Section 3, we showed that the voting process itself is highly complex.

Voting procedures are decision models. They are decision models devoted to the special case where a decision must be taken by a group of voters and are mainly concerned with the case of a finite and small set of alternatives. This peculiarity doesn't make voting procedures very different from other decision and evaluation models: multiple criteria decision support (this has already been discussed in Section 4), cost-benefit analysis, student grades, indicators, etc. As you will see in the following chapters, most decision models suffer the same kind of problems that we have met in this chapter. There is no perfect aggregation procedure; they are imperfect and arbitrary models. The data are not data: when we feed our aggregation procedures with preferences, these are not given; they are constructed in some more or less arbitrary way, just like student grades or indicators. This shows that what is usually considered as data is not really data. Nothing in the "problem" tells us what model to use. And the decision models are too narrow: they do not take into account the fact that decision support occurs in a human process (the decision making process) and in a complex environment.


3 BUILDING AND AGGREGATING EVALUATIONS: THE EXAMPLE OF GRADING STUDENTS

3.1 Introduction

3.1.1 Motivation

In chapter 2, we tried to show that "voting", although being a familiar activity to almost everyone, raises many important and difficult questions that are closely connected to the subject of this book. Our main objective in this chapter is similar. We all share the (more or less pleasant) experience of having received "grades" in order to evaluate our academic performances, and the authors of this book spend part of their time evaluating the performance of students through grading several kinds of work, an activity that you may also be familiar with. The purpose of this chapter is to build upon this shared experience. This will allow us to discuss, based on simple and familiar situations, what is meant by "evaluating a performance" and "aggregating evaluations", both activities being central to most evaluation and decision models.

Although the entire chapter is based on the example of grading students, it should be stressed that "grades" are often used in contexts unrelated to the evaluation of the performance of students: employees are often graded by their employers, products are routinely tested and graded by consumer organisations, experts are used to rate the feasibility or the riskiness of projects. The findings of this chapter are therefore not limited to the realm of a classroom.

As with voting systems, there is much variance across countries in the way "education" is organised. Curricula, grading scales, rules for aggregating grades and granting degrees are seldom similar from place to place (for information on the systems used in the European Union, see www.eurydice.org). This diversity is even increased by the fact that each "instructor" (a word that we shall use to mean the person in charge of evaluating students) has generally developed his own policy and habits.

The authors of this book have studied in four different European countries (Belgium, France, Greece and Italy) and obtained degrees in different disciplines (Maths, Computer Science, Geology, Operational Research, Management, Physics) and in different Universities. We were not overly astonished to discover that the rules that governed the way our performances were assessed were quite different. We were perhaps more surprised to realise that

our "grading policies" were quite different, even after having accounted for the fact that these policies are partly contingent upon the rules governing our respective institutions. Such diversity might indicate that evaluating students is an activity that is perhaps more complex than it appears at first sight.

3.1.2 Evaluating students in Universities

We shall restrict our attention in this chapter to education programmes with which we are familiar. In what follows, we shall implicitly have in mind the type of programmes in which we teach (Mathematics, Computer Science, Operational Research, Engineering) that are centred around disciplines which, at least at first sight, seem to raise less "evaluation problems" than if we were concerned with, say, Philosophy, Music or Sports. Dealing only with "technically-oriented" programmes at University level will clearly not allow us to cover the immense literature that has been developed in Education Science on the evaluation of the performance of students; this will however allow us to raise several important issues concerning the evaluation and the aggregation of performances. For good accounts in English, we refer to Airaisian (1991), Davis (1993), Lindheim, Morris and Fitz-Gibbon (1987), McLean and Lockwood (1996), Moom (1997) and Speck (1998). Note that in Continental Europe, the Piagetian influence, different institutional constraints and the popularity of the classic book by Piéron (1963) have led to a somewhat different school of thought; see Bonboir (1972), Cardinet (1986), de Ketele (1982), de Landsheere (1980), Merle (1996) and Noizet and Caverini (1978).

Our general framework will be that of a programme at University level in which students have to take a number of "courses" or "credits". In each course the performance of students is graded. These grades are then collected and form the basis of a decision to be taken about each student. Quite often the various grades are "summarised", "amalgamated" (we shall say "aggregated") in some way before a decision is taken. Depending on the programme, this decision may take various forms: success or failure; success or failure with possible additional information such as distinctions, ranks or average grades; success or failure with the possibility of a deferred decision (e.g. the degree is not granted immediately but there is still a possibility of obtaining it).

Two types of questions prove to be central for our purposes:
• how to evaluate the performance of students in a given "course", what is the meaning of the resulting "grades" and how to interpret them?
• how to combine the various grades obtained by a student in order to arrive at an overall evaluation of his academic performance?

These two sets of questions structure this chapter into sections.

3.2 Grading students in a given course

Most of you have probably been in the situation of an "instructor" having to attribute grades to students. Although this is clearly a very important task, many instructors share the view that this is far from being the easiest and most pleasant part of their jobs. We shall try here to give some hints on the process that leads to the attribution of a grade as well as on some of its pitfalls and difficulties.

3.2.1 What is a grade?

We shall understand a grade as an evaluation of the performance of a student in a given course, i.e. an indication of the level to which a student has fulfilled the objectives of the course. This very general definition calls for some remarks.

1. A grade should always be interpreted in connection with the objectives of a course. Although it may appear obvious, this implies a precise statement of the objectives of the course in the syllabus, a condition that is unfortunately not always perfectly met.

2. Although this is less obvious in Universities than in elementary schools, it should be noticed that grades are not only a signal sent by the instructor to each of his students. They have many other potential important "users": other students using them to evaluate their position in the class, other instructors judging your severity and/or performance, administrations evaluating the performance of programmes, parents watching over their child, employers looking for all possible information on an applicant for a job. Thus, it appears that a grade is a complex "object" with multiple functions (see Chatel 1994, Laska and Juarez 1992, Lysne 1984, McLean and Lockwood 1996). Interpreting it necessarily calls for a study of the process that leads to its attribution.

3. All grades do not have a similar function. Whereas usually the final grade of a course in Universities mainly has a "certification" role, intermediate grades, on which the final grade may be partly based, have a more complex role that is often both "certificative" and "formative". For example, the result of a mid-term exam is included in the final grade but is also meant to be a signal to a student indicating his strengths and weaknesses.

3.2.2 The grading process

What is graded and how? The types of work that are graded, the scale used for grading and the way of amalgamating these grades may vary in significant ways for similar types of courses.

1. In most courses the final grade is based on grades attributed to multiple tests: final exam, mid-term exam, exercises, case-studies or even "class participation". The number and type of work may vary a lot; some courses are evaluated on the basis of a single exam. But there are many possible types of exams. They may be written or oral. Their duration may vary (45 minute exams are not uncommon in some countries whereas they may last up to 8 hours in some French programmes). They may be open-book or closed-book. Their content for similar courses may vary from multiple choice questions to exercises, case-studies or essays.

2. The scale that is used for grading students is usually imposed by the programme. Numerical scales are often used in Continental Europe with varying bounds and orientations: 0-20 (in France or Belgium), 0-30 (in Italy), 0-100 (in some Universities), 6-1 (in Germany and parts of Switzerland). American and Asian institutions often use a letter scale, e.g. E to A or F to A. Most of us would agree that the choice of a particular scale is mainly conventional. Obviously we would not want to conclude from this that Italian instructors have come to develop much more sensitive instruments for evaluating performance than German ones, or that the evaluation process is in general more "precise" in Europe than it is in the USA. It should however be noted that, since grades are often aggregated at some point, such choices might not be totally without consequences. We shall come back to that point in section 3.3.

3. Furthermore, the way these various grades are aggregated is diverse: simple weighted average, imposition of a minimal grade at the final exam, grade only based on exams with group work (e.g. case-studies or exercises) counting as a bonus, etc.

4. Some instructors use "raw" grades; others modify the "raw" grades in some way, e.g. standardising them, before aggregating and/or releasing them. (An overview of grading policies and practices in the USA can be found in Riley, Checca, Singer and Worthington 1994.)
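As a hypothetical sketch of such aggregation rules (the weights, scale and thresholds are invented, not taken from any particular programme), a weighted average combined with a minimal grade at the final exam could look like this:

```python
# Weighted average on a 0-20 scale, with a minimal grade required at the
# final exam: a high average cannot compensate a very weak exam.
def course_grade(exam, midterm, project, min_exam=8, scale_max=20):
    """Return the aggregated grade and whether the student passes."""
    average = 0.5 * exam + 0.3 * midterm + 0.2 * project
    passed = exam >= min_exam and average >= scale_max / 2
    return round(average, 1), passed

print(course_grade(exam=12, midterm=10, project=14))
print(course_grade(exam=6, midterm=18, project=18))  # fails despite the average
```

Even this tiny rule illustrates the arbitrariness discussed in the text: the weights, the threshold and the non-compensatory exam requirement are all choices, not facts.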

Preparing and grading a written exam

Within a given institution, suppose that you have to prepare and grade a written, closed-book exam. For reasons to be explained later, we shall take the example of an exam for an "Introduction to Operational Research (OR)" course, including Linear Programming (LP), Integer Programming and Network models, with the aim of giving students a basic understanding of the modelling process in OR and an elementary mastering of some basic techniques (Simplex Algorithm, Branch and Bound, elementary Network Algorithms).

1. Preparing a subject. All instructors know that preparing the subject of an exam is a difficult and time consuming task. Many different choices interfere with such a task. Is the subject of adequate difficulty? Does it contain enough questions to cover all parts of the programme? Do all the questions clearly relate to one or several of the announced objectives of the course? Will it allow to discriminate between students? Is there a good balance between modelling and computational skills? What should the respective parts of closed vs. open questions be?

2. Preparing a marking scale. The preparation of the marking scale for a given subject is also of utmost importance. Will the marking scale include a bonus for work showing good communication skills and/or will misspellings be penalised? How to deal with computational errors? How to deal with computational errors that lead to inconsistent results? How to deal with computational errors influencing the answers to several questions? How to judge an LP model in which the decision variables are incompletely defined? How to judge a model that is only partially correct? How to judge a model which is inconsistent from the point of view of units?

Although much expertise and/or "rules of thumb" are involved in the preparation of a good subject and its associated marking scale, we are aware of no instructor not having had to revise his judgement after correcting some work and realising his severity, and/or to correct work again after discovering some frequently given half-correct answers that were unanticipated in the marking scale. A "nice-looking" subject might be impractical in view of the associated marking scale.

3. Grading. A grade evaluates the performance of a student in completing the tasks implied by the subject of the exam and, hopefully, will give an indication of the extent to which a student has met the various objectives of the course (in general an exam is far from dealing with all the aspects that have been dealt with during the course). Although this is debatable, such an evaluation is often thought of as a "measure" of performance. For this kind of "measure" the psychometric literature (see Ebel and Frisbie 1991, Kerlinger 1986, Popham 1981) has traditionally developed at least two desirable criteria. A measure should be:
• reliable, i.e. give similar results when applied several times in similar conditions;
• valid, i.e. measure what was intended to be measured and only that.

Extensive research in Education Science has found that the process of giving grades to students is seldom perfect in these respects (a basic reference remains the classic book of Piéron (1963); Airaisian (1991) and Merle (1996) are good surveys of recent findings). We briefly recall here some of the difficulties that were uncovered.

The crudest reliability test that can be envisaged is to give similar works to correct to several instructors and to record whether or not these works are graded similarly. Such experiments were conducted extensively in various disciplines and at various levels. Not overly surprisingly, most experiments have shown that even in the more "technical" disciplines (Maths, Physics, Grammar) in which it is possible to devise rather detailed marking scales

there is much difference between correctors. On average, the difference between the more generous and the more severe correctors on Maths work can be as high as 2 points on a 0-20 scale; even more strikingly, on some work in Maths the difference can be as high as 9 points on a 0-20 scale (see Piéron 1963). The distribution of grades for similar papers will tend to be highly different according to the corrector.

In other experiments the same correctors are asked to correct a work that they have already corrected earlier. These auto-reliability tests give similar results, since in more than 50% of the cases the second grade is "significantly" different from the first one. Although few experiments have been conducted with oral exams, it seems fair to suppose that they are no more reliable than written ones.

Other experiments have shown that many extraneous factors may interfere in the process of grading a paper and therefore question the validity of grades:
• misspellings and poor hand-writing prove to have a non negligible influence on the grades, even when the instructor declares not to take these effects into account or is instructed not to;
• "anchoring effects" are pervasive: it is always better to be corrected after a remarkably poor work than after a perfect one.

4. The influence of correction habits. Experience shows that "correction habits" tend to vary from one instructor to another. Some instructors will tend to give an equal percentage of all grades and will tend to use the whole range of the scale. Some will systematically avoid the extremes of the range and the distribution of their marks will have little variability. Others will tend to give only extreme marks, arguing that either the basic concepts are understood or they are not. Some are used to giving the lowest possible grade after having spotted a mistake which, in their minds, implies that "nothing has been understood" (e.g. proposing a "non linear LP model").

Instructors accustomed to grading papers will not be surprised to note that:
• grades usually show much auto correlation: similar papers handed in by a usually "good" student and by a usually "uninterested" student are likely not to receive similar grades;
• the order in which papers are corrected greatly influences the grades: near the end of a correction task, most correctors are less generous and tend to give grades with a higher variance.

In order to cope with such effects, some instructors will tend to standardise the grades before releasing them (the so-called "z-scores"); others will tend to equalise average grades from term to term and/or use a more or less ad hoc procedure.

dissertation.3. because they are sick)? • the policy towards cheating and other dishonest behaviour (exclusion from the programme. minus x points per hour or day). attribution of the lowest possible grade for the exam).g.2. a “dissertation” may have to be completed. an “average grade” is computed and this average grade must be over a given limit to obtain the degree. Besides useful considerations on “ethics”. The freedom of an instructor in arranging his own grading policy is highly conditioned by this environment. attribution of the lowest possible grade for the course. • the policy towards late assignments (no late assignment will be graded. GRADING STUDENTS IN A GIVEN COURSE 35 are aware that it is probably the part that is read ﬁrst and most attentively by all students. the nature of exams and the way the various grades will contribute to the determination of the ﬁnal grade. etc. The usual way to proceed is to give a (numerical) . Some programmes attribute diﬀerent kinds of degrees through the use of “distinctions”. ﬁnal exam. On top of describing the type of work that will be graded. In some programmes students are only required to obtain a “satisfactory grade” (it may or not correspond to the “middle” of the grading scale that is used) for all courses. A grade can hardly be interpreted without a clear knowledge of these rules (note that this sometimes creates serious problems in institutions allowing students pertaining to diﬀerent programmes with diﬀerent sets of rules to attend the same courses). We examine some of them below. let us mention: • the type of preparation and correction of the exams: who will prepare the subject of the exam (the instructor or an outside evaluator)? Will the work be corrected once or more than once (in some Universities all exams are corrected twice)? Will the names of the students be kept secret? • the possibility of revising a grade: are there formal procedures allowing the students to have their grades reconsidered? 
Do the students have the possibility of asking for an additional correction? Do the students have the possibility of taking the same course at several moments in the academic year? What are the rules for students who cannot take the exam (e. many degrees of freedom remain. Within a well deﬁned set of rules. In others. however. case-studies. it usually also contains many “details” that may prove important in order to understand and interpret grades. this section usually describes the process that will lead to the attribution of the grades for the course in detail. Weights We mentioned that the ﬁnal grade for a course was often the combination of several grades obtained throughout the course: mid-term exam. Among these “details”. “core courses”) are sometimes treated apart.g. Determining ﬁnal grades The process of the determination of the ﬁnal grades for a given course can hardly be understood without a clear knowledge of the requirements of the programme in order to obtain the degree. Some courses (e.
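The "z-score" standardisation mentioned above is easy to make precise. The following sketch (in Python, with invented grades; not part of the original text) simply rescales a distribution of grades to mean 0 and standard deviation 1:

```python
from statistics import mean, pstdev

def z_scores(grades):
    """Standardise a list of grades to mean 0 and standard deviation 1."""
    m, s = mean(grades), pstdev(grades)
    return [(g - m) / s for g in grades]

raw = [8, 10, 12, 14, 16]
print(z_scores(raw))  # symmetric around 0, in standard-deviation units
```

Averaging such standardised grades amounts to comparing students only through their relative positions within each class, a point taken up again below.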

Although this process is simple and almost universally used, it raises some difficulties that we shall examine in section 3.3. Let us simply mention here that the interpretation of "weights" in such a formula is not obvious. Most instructors would tend to compensate for a very difficult mid-term exam (weight 30%) by preparing a comparatively easier final exam (weight 70%). However, if the final exam is so easy that most students obtain very good grades, the differences in the final grades will be attributable almost exclusively to the mid-term exam, although it has a much lower weight than the final exam. The same is true if the final grade combines an exam with a dissertation. Since the variance of the grades is likely to be much lower for the dissertation than for the exam, the former may only marginally contribute towards explaining differences in final grades, independently of the weighting scheme.

In order to avoid such difficulties, some instructors standardise grades before averaging them. However, it is clear that the more or less arbitrary choice of a particular measure of dispersion (why use the standard deviation and not the inter-quartile range? should we exclude outliers?) may have a crucial influence on the final grades. Furthermore, the manipulation of such "distorted grades" seriously complicates the positioning of students with respect to a "minimal passing grade", since their use amounts to abandoning any idea of "absolute" evaluation in the grades. Similarly, weighted averages do not take the progression of the student during the course into account. The use of weighted averages may also give undesirable results since, for example, an excellent group case-study may compensate for a very poor exam; although this might be desirable in some situations, such compensation effects are not always wanted.

Passing a course
In some institutions, you may either "pass" or "fail" a course and the grades obtained in several courses are not averaged. An essential problem for the instructor is then to determine which students are above the "minimal passing grade". When the final grade is based on a single exam, we have seen that it is not easy to build a marking scale. It is even more difficult to conceive a marking scale in connection to what is usually, according to the culture of the institution, the minimal passing grade. The question boils down to deciding what amount of the programme a student should master in order to obtain a passing grade, given that an exam only gives partial information about the amount of knowledge of the student. The problem is clearly even more difficult when the final grade results from the aggregation of several grades.

It should be noted that the problem of positioning students with respect to a minimal passing grade is more or less identical to positioning them with respect to any other "special grades", e.g. the minimal grade for being able to obtain a "distinction", to be cited on the "Dean's honour list" or the "Academic Honour Roll".
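The variance effect discussed above can be illustrated numerically (a toy computation with invented grades, not taken from the book): a spread-out mid-term with weight 30% ends up driving the final ranking when the 70%-weighted final exam is so easy that its grades bunch together.

```python
# Hypothetical grades: mid-term (weight 0.3) varies a lot,
# final exam (weight 0.7) is easy and varies very little.
students = {"a": (6, 18), "b": (12, 19), "c": (18, 17)}  # (mid-term, final)

totals = {s: 0.3 * mid + 0.7 * fin for s, (mid, fin) in students.items()}
ranking = sorted(totals, key=totals.get, reverse=True)
print(ranking)  # the overall ranking reproduces the mid-term order c > b > a
```

Despite its lower weight, the mid-term decides the ranking because it is the only component on which the students really differ.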

3.2.3 Interpreting grades

Grades from other institutions
In view of the complexity of the process that leads to the attribution of a grade, it should not be a surprise that most instructors find it very difficult to interpret grades obtained in another institution. Consider a student joining your programme after having obtained a first degree at another University. Arguing that he has already passed a course in OR with 14 on a 0-20 scale, he wants to have the opportunity to be dispensed from your class. Not aware of the grading policy of the instructor and of the culture and rules of the previous University this student attended, knowing that he obtained 14 offers you little information. The knowledge of his rank in the class may be more useful: if he obtained one of the highest grades, this may be a good indication that he has mastered the contents of the course sufficiently. However, if you were to know that the lowest grade was 13 and that 14 is the highest, you would perhaps be tempted to conclude that the difference between 13 and 14 may not be very significant and/or that you should not trust grades that are so generous and exhibit so little variability.

Grades from colleagues
Being able to interpret the grade that a student obtained in your own institution is quite important, at least as soon as some averaging of the grades is performed in order to decide on the attribution of a degree. This task is clearly easier than the preceding one: the grades that are to be interpreted here have been obtained in a similar environment. However, we would like to argue that this task is not an easy one either. In section 3.2.2 we mentioned that, even within fixed institutional constraints, each instructor still had many degrees of freedom to choose his grading policy. Unless there is a lot of co-ordination between colleagues, they may apply quite different rules, e.g. in dealing with late assignments or in the nature and number of exams. This seriously complicates the interpretation of the profile of grades obtained by a student.

Interpreting your own grades
The numerical scales used for grades throughout Europe tend to give the impression that grades are "real measures" and that, consequently, these numbers may be manipulated as any other numbers. In fact, there are many possible kinds of "measure", and having a numerical scale is no guarantee that the numbers on that scale may be manipulated in all possible ways. First, it should be observed that there is no clear implication in having obtained a similar grade in two different courses. Is it possible or meaningful to assert that a student is "equally good" in Maths and in Literature? Is it possible to assert that he has satisfied to a greater extent the objectives of the Maths course than the objectives of the Literature course? Our experience as instructors would lead us to answer negatively to such questions, even when talking of programmes in which all objectives are very clearly stated. Secondly, before manipulating numbers supposedly resulting from "measurements", it is always important to try to figure out on which type of scales they have been "measured".

The highest point on the scale
An important feature of all grading scales is that they are bounded above. It should be clear that the numerical value attributed to the highest point on the scale is somewhat arbitrary and conventional. No loss of information would be incurred using a 0-100 or a 0-10 scale instead of a 0-20 one. Let us notice that this is true even in Physics: saying that Mr. X weighs twice as much as Mr. Y "makes sense" because this assertion is true whether mass is measured in pounds or in kilograms. At best it seems that grades should be considered as expressed on a ratio scale, i.e. a scale in which the unit of measurement is arbitrary (such scales are frequent in Physics, e.g. length can be measured in meters or inches without loss of information).

If grades can be considered as measured on a ratio scale, it should be recognised that this ratio scale is somewhat awkward because it is bounded above: problems might appear at the upper bound of the scale. Consider two excellent, but not necessarily "equally excellent", students. They cannot obtain more than the perfect grade 20/20. Equality of grades at the top of the scale (or near the top, depending on grading habits) does not necessarily imply equality in performance (after a marking scale is devised, it is not exceptional that we would like to give some students more than the maximal grade, e.g. because some bonus is added for particularly clever answers, whereas the computer system of most Universities would definitely reject such grades!). Unless you admit that knowledge is bounded or, more realistically, that "perfectly fulfilling the objectives of a course" makes clear sense, some ambiguity remains.

The lowest point on the scale
It should be clear that the numerical value that is attributed to the lowest point of the scale is no less arbitrary and conventional than was the case for the highest point. Hence it would seem that a 0-20 scale might be better viewed as an interval scale, i.e. a scale in which both the origin and the unit of measurement are arbitrary (think of temperature scales in Celsius or Fahrenheit). There is nothing easier than to transform grades expressed on a 0-20 scale to grades expressed on a 100-120 scale, and this involves no loss of information. An interval scale allows comparisons of "differences in performance": it makes sense to assert that the difference between 0 and 10 is similar to the difference between 10 and 20, or that the difference between 8 and 10 is twice as large as the difference between 10 and 11, since changing the unit and origin of measurement clearly preserves such comparisons. On the other hand, saying that the average temperature in city A is twice as high as the average temperature in city B may be true but makes little sense, since the truth value of this assertion clearly depends on whether temperature is measured using the Celsius or the Fahrenheit scale.

Let us notice that using a scale that is bounded below is also problematic. In some institutions the lowest grade is reserved for students who did not take the exam; clearly this does not imply that these students are "equally ignorant". Even when the lowest grade can be obtained by students having taken the exam, "knowing nothing", i.e. having completely failed to meet any of the objectives of the course, is difficult to define and is certainly contingent upon the level of the course.
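The distinction between the two scale types can be checked on toy numbers (an illustration, not part of the original text): an affine rescaling such as the 0-20 to 100-120 transformation mentioned above preserves comparisons of differences, but not comparisons of ratios.

```python
def to_100_120(g):
    """Affine rescaling of a 0-20 grade onto a 100-120 scale (g -> g + 100)."""
    return g + 100

g = [8, 10, 11]
t = [to_100_120(x) for x in g]
# comparisons of differences survive the affine map:
assert (g[1] - g[0]) / (g[2] - g[1]) == (t[1] - t[0]) / (t[2] - t[1]) == 2.0
# comparisons of ratios do not: "10 is twice 5" fails after rescaling
print(10 / 5, to_100_120(10) / to_100_120(5))
```

Only a change of unit alone (a ratio scale transformation, such as 0-20 to 0-100) would preserve statements like "twice as high".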

(This is all the more true since in many institutions the lowest grade is also granted to students having cheated during the exam, with obviously no guarantee that they are "equally ignorant".) To a large extent "knowing nothing", in the context of a course, is somewhat as arbitrary as "knowing everything". Therefore, care should be taken when manipulating grades close to the bounds of the scale.

In between
We already mentioned that on an interval scale it makes sense to compare differences in grades. The authors of this book (even if their students should know that they spend a lot of time and energy in grading them!) do not consider that their own grades always allow for such comparisons. A difference of 1 point on a 0-20 scale may well be due only to chance via the position of the work, the quality of the preceding papers, the time of correction, etc. Some authors have been quite radical in emphasising this point, e.g. Cross (1995) stating that: "[...] we contend that the difficulty of nearly all academic tests is arbitrary and regardless of the scoring method, they provide nothing more than ranking information" (but see French 1993, Vassiloglou and French 1982).

However, in some programmes, some grades are very particular in the sense that they play a particular role in the attribution of the degree. Let us consider a programme in which all grades must be above a minimal passing grade, say 10 on a 0-20 scale, in order to obtain the degree. If it is clear that an exam is well below the passing grade, few instructors will claim that there is a highly significant difference between 4/20 and 5/20. Although the latter exam seems slightly better than the former, the essential idea is that they are both well below the minimal passing grade. On the contrary, the gap between 9/20 and 10/20 may be much more important since, before putting a grade just below the passing grade, most instructors usually make sure that they will have good arguments in case of a dispute (some systematically avoid using grades just below the minimal passing grade). Finally, it should be observed that not only the minimal passing grade has a special role: some grades may correspond to different possible levels of distinction, others may correspond to a minimal acceptable level below which there is no possibility of compensation with grades obtained in other courses. Therefore, even if grades are expressed on interval scales, it might well be possible to assert that small differences in grades that do not cross any special grades may not be significant at all. In view of the lack of reliability and validity of some aspects of the grading process, it seems that in between these "special grades" the reliable information conveyed by grades is mainly ordinal.

At first sight this would seem to be a strong argument in favour of the letter system in use in most American Universities, which only distinguishes between a limited number of classes of grades (usually from F or E to A with, in some institutions, the possibility of adding "+" or "–" to the letters). However, the distinction between letter grades and numerical grades is not as deep as it appears at first sight. First, we already mentioned that a lot of care should be taken in manipulating grades that are "close" to the bounds. Second, these letter grades are usually obtained via the manipulation of a distribution of numerical grades of some sort. Furthermore, the aggregation of letter grades is often done via a numerical transformation, as we shall see in section 3.3.

Once more, grades appear as complex objects. While they seem to mainly convey ordinal information (with the possibility of the existence of non significant small differences) that is typical of a relative evaluation model, the existence of special grades complicates the situation in introducing some "absolute" elements of evaluation in the model (on the measurement-theoretic interpretation of grades, see French 1981, Vassiloglou 1984). The resulting "scale of measurement" is unsurprisingly awkward. Furthermore, as with most evaluation models of this type, their use greatly contributes to transforming the "reality" that we would like to "measure". Students cannot be expected to react passively to a grading policy: they will undoubtedly adapt their work and learning practice to what they perceive to be its severity and consequences. Instructors are likely to use a grading policy that will depend on their perception of the policy of the Faculty (on these points, see Sabot and Wakeman 1991, Stratton, Myers and King 1994).

3.2.4 Why use grades?
Some readers, and most notably instructors, may have the impression that we have been overly pessimistic on the quality of the grading process. We suggest to sceptical instructors the following simple experiment. Having prepared an exam, ask some of your colleagues to take it with the following instructions: prepare what you would think to be an exam that would just be acceptable for passing; prepare an exam that would clearly deserve distinction; prepare an exam that is well below the passing grade. Then apply your marking scale to these papers prepared by your colleagues. It would be extremely likely that the resulting grades show some surprises! We would like to mention that the literature in Education Science is even more pessimistic, leading some authors to question the very necessity of using grades (see Sager 1994, Tchudi 1997).

This is not to say that grades cannot be a useful evaluation model. The difficulties that we mentioned would be quite problematic if grades were considered as "measures" of performance that we would tend to make more and more "precise" and "objective". We tend to consider grades as an "evaluation model" trying to capture aspects of something that is subject to considerable indetermination, the "performance of students". As is the case with most evaluation models, aggregating these evaluations will raise even more problems. However, none of us would be prepared to abandon grades, at least for the type of programmes in which we teach. If these lines have led some students to consider that grades are useless, we suggest they try to build up an evaluation model that would not use grades without, of course, relying too much on arbitrary judgements. This might not be an impossible task; we, however, do not find it very easy.

3.3 Aggregating grades

3.3.1 Rules for aggregating grades

In the previous section, we hope to have convinced the reader that grading a student in a given course is a difficult task and that the result of this process is a complex object. However, this is only part of the evaluation process of students enrolled in a given programme. Once they have received a grade in each course, a decision still has to be made about each student. Depending on the programme, we already mentioned that this decision may take different forms: success or failure; success or failure with possible additional information, e.g. distinctions, ranks or average grades; success or failure with the additional possibility of partial success (the degree is not granted immediately but there remains a possibility of obtaining it). Such decisions are usually based on the final grades that have been obtained in each course, but may well use some other information, e.g. verbal comments from instructors or extra-academic information linked to the situation of each student.

What is required from the students to obtain a degree is generally described in a lengthy and generally opaque set of rules that few instructors (but generally all students) know perfectly (as an interesting exercise we might suggest that you investigate whether you are perfectly aware of the rules that are used in the programmes in which you teach or, if you do not teach, whether you are aware of such rules for the programmes in which your children are enrolled). These rules exhibit such variety that it is obviously impossible to exhaustively examine them here; it appears, however, that they are often based on three kinds of principles (see French 1981).

Conjunctive rules
In programmes of this type, students must pass all courses, i.e. obtain a grade above a "minimal passing grade" in all courses, in order to obtain the degree. If they fail to do so after a given period of time, they do not obtain the degree. This very simple rule has the immense advantage of avoiding any amalgamation of grades. It is however seldom used as such because:
• it is likely to generate high failure rates;
• it offers no incentive to obtain grades well above the minimal passing grade;
• it does not allow to discriminate (e.g. using several kinds of distinctions) between students obtaining the degree;
• it does not allow to discriminate between grades just below the passing grade and grades well below it.
Most instructors and students generally violently oppose such simple systems, since they generate high failure rates and do not promote "academic excellence".
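The conjunctive rule just described is easy to state as code (a minimal sketch, assuming an invented pass mark of 10 on a 0-20 scale):

```python
def passes_conjunctive(grades, passing=10):
    """Degree obtained only if every course grade reaches the passing grade."""
    return all(g >= passing for g in grades.values())

student = {"Maths": 14, "Physics": 9, "Economics": 16}
print(passes_conjunctive(student))  # False: a single grade below 10 fails the student
```

Note how the rule is insensitive to everything except the worst grade's position relative to the threshold, which is exactly the source of the drawbacks listed above.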

Weighted averages
In many programmes, the grades of students are aggregated using a simple weighted average. This average grade (the so-called "GPA" in American Universities) is then compared to some standards, e.g. the minimal average grade for obtaining the degree, the minimal average grade for obtaining the degree with a distinction, the minimal average grade for being allowed to stay in the programme, etc. Whereas conjunctive rules do not allow for any kind of compensation between the grades obtained for several courses, all sorts of compensation effects are at work with a weighted average.

Minimal acceptable grades
In order to limit the scope of compensation effects allowed by the use of weighted averages, some programmes include rules involving "minimal acceptable grades" in each course. In such programmes, the final decision is taken on the basis of an average grade, provided that all grades entering this average are above some minimal level. In others, an average grade is computed for each "category" of courses, provided that the grade of each course is above a minimal level, and such average grades per category of courses are then used in a conjunctive fashion.

The rules that are used in the programmes we are aware of often involve a mixture of these three principles. All these rules are based on "grades", and we saw in section 3.2 that the very nature of the grades was highly influenced by these rules. This amounts to aggregating evaluations that are highly influenced by the aggregation rule, which makes aggregation an uneasy task. Furthermore, it should be noticed that the final decision concerning a student is very often taken by a committee that has some degree of freedom with respect to the rules and may, for instance, grant the degree to someone who does not meet all the requirements of the programme, e.g. because of serious personal problems.

3.3.2 Aggregating grades using a weighted average

The purpose of rules for aggregating grades is to know whether the overall performance of a student is satisfactory, taking his various final grades into account. Using a weighted average system amounts to assessing the performance of a student by combining his grades using a simple weighting scheme. We study below some aspects of this most common aggregation rule for grades (more examples and comments will be found in chapters 4 and 6). We shall suppose that all final grades are expressed on similar scales and note gi(a) the final grade for course i obtained by student a. The average grade obtained by student a is then computed as g(a) = w1 g1(a) + w2 g2(a) + ... + wn gn(a), the (positive) weights wi reflecting the "importance" (in "academic" terms and/or as a function of the length of the course) of the course for the degree. The weights wi may, without loss of generality, be normalised in such a way that w1 + w2 + ... + wn = 1. Using such a convention, the average grade g(a) will be expressed on a scale having the same bounds as the scales used for the gi(a).
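The rule can be stated compactly in code (a sketch; grades and weights are invented, and the weights are normalised inside the function so that the result stays on the 0-20 scale):

```python
def weighted_average(grades, weights):
    """g(a) = sum of w_i * g_i(a), with the weights normalised to sum to 1."""
    total = sum(weights)
    return sum(w * g for w, g in zip(weights, grades)) / total

# a student's three final grades, with weights proportional to course length
print(weighted_average([12, 15, 9], [2, 1, 1]))  # 12.0
```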

The simplest decision rule consists in comparing g(a) with some standards in order to decide on the attribution of the degree and on possible distinctions. A number of examples will allow us to understand the meaning of this rule better and to emphasise its strengths and weaknesses (we shall suppose throughout this section that students have all been evaluated on the same courses; for the problems that arise when this is not so, see Vassiloglou (1984)).

Example 1
Consider four students enrolled in a degree consisting of two courses. For each course, a final grade between 0 and 20 is allocated. The results are as follows:

   g1  g2
a   5  19
b  20   4
c  11  11
d   4   6

Student c has performed reasonably well in all courses, whereas d has a consistently very poor performance. Both a and b are excellent in one course while having a serious problem in the other; their very low performance in 50% of the courses does not make them good candidates for the degree. Casual introspection suggests that, if the students were to be ranked, c should certainly be ranked first and d should be ranked last. Students a and b should be ranked in between, their relative position depending on the relative importance of the two courses.

The use of a simple weighted average of grades leads to very different results. Considering that both courses are of equal importance gives the following average grades:

   average grade
a  12
b  12
c  11
d   5

which leads to having both a and b ranked before c. As shown in figure 3.1, we can say even more: there is no vector of weights (w, 1−w) that would rank c before both a and b. Ranking c before a implies that 11w + 11(1−w) > 5w + 19(1−w), which leads to w > 8/14. Ranking c before b implies 11w + 11(1−w) > 20w + 4(1−w), i.e. w < 7/16. These two conditions are incompatible (figure 3.1 should make clear that there is no loss of generality in supposing that weights sum to 1). The use of a simple weighted sum is therefore not in line with the idea of promoting students performing reasonably well in all courses. The exclusive reliance on a weighted average might therefore be an incentive for students to concentrate their efforts on a limited number of courses and benefit from the compensation effects at work with such a rule.
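The geometric argument above can also be verified by brute force (an illustration, not part of the original text): scanning w over [0, 1] confirms that no weight vector (w, 1 − w) makes c best.

```python
grades = {"a": (5, 19), "b": (20, 4), "c": (11, 11)}

def avg(student, w):
    g1, g2 = grades[student]
    return w * g1 + (1 - w) * g2

# c would need w > 8/14 (to beat a) and w < 7/16 (to beat b): impossible
feasible = [w / 1000 for w in range(1001)
            if avg("c", w / 1000) > avg("a", w / 1000)
            and avg("c", w / 1000) > avg("b", w / 1000)]
print(feasible)  # prints [] since no weight makes c best
```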

[Figure 3.1: Use of a weighted sum for aggregating grades]

It should finally be noticed that the addition of a "minimal acceptable grade" for all courses can decrease but not suppress (unless the minimal acceptable grade is so high that it turns the system into a nearly conjunctive one) the occurrence of such effects. This is a consequence of the additivity hypothesis embodied in the use of weighted averages. A related consequence of the additivity hypothesis is that it forbids to account for "interaction" between grades, as shown in the following example.

Example 2
Consider four students enrolled in an undergraduate programme consisting in three courses: Physics, Maths and Economics. For each course, a final grade between 0 and 20 is allocated. The results are as follows:

   Physics  Maths  Economics
a       18     12          6
b       18      7         11
c        5     17          8
d        5     12         13

On the basis of these evaluations, it is felt that a should be ranked before b. Although a has a low grade in Economics, he has reasonably good grades in both Maths and Physics, which makes him a good candidate for an Engineering programme.

Using a similar type of reasoning, d appears to be a fair candidate for a programme in Economics; therefore d is ranked before c. Student c has two low grades and it seems difficult to recommend him for a programme in Engineering or in Economics, while b is weak in Maths and it seems difficult to recommend him for any programme with a strong formal component (Engineering or Economics). In this example it seems that "criteria interact": whereas Maths does not overweigh any other course (see the ranking of d vis-à-vis c), having good grades in both Maths and Physics or in both Maths and Economics is better than having good grades in both Physics and Economics. Although these preferences appear reasonable, they are not compatible with the use of a weighted average in order to aggregate the three grades. It is easy to observe that:
• ranking a before b implies putting more weight on Maths than on Economics (18w1 + 12w2 + 6w3 > 18w1 + 7w2 + 11w3 ⇒ w2 > w3);
• ranking d before c implies putting more weight on Economics than on Maths (5w1 + 12w2 + 13w3 > 5w1 + 17w2 + 8w3 ⇒ w3 > w2),
which is contradictory. Such interactions, although not unfrequent, cannot be dealt with using weighted averages; this is another consequence of the additivity hypothesis. Taking such interactions into account calls for the use of more complex aggregation models (see Grabisch 1996).

Example 3
Consider two students enrolled in a degree consisting of two courses. For each course a final grade between 0 and 20 is allocated, both courses have the same weight and the required minimal average grade for the degree is 10. The results are as follows:

   g1  g2
a  11  10
b  12   9

It is clear that both students will receive an identical average grade of 10.5: the difference between 11 and 12 on the first course exactly compensates for the opposite difference on the second course. Both students will obtain the degree, having performed equally well. It is not unreasonable to suppose that, since the minimal required average for the degree is 10, this grade will play the role of a "special grade" for the instructors, a grade above 10 indicating that a student has satisfactorily met the objectives of the course. If 10 is a "special grade", it might be reasonable to consider that the difference between 10 and 9, which crosses a special grade, is much more significant than the difference between 12 and 11 (it might even be argued that the small difference between 12 and 11 is not significant at all). If this is the case, we would have good grounds to question the fact that a and b are "equally good".
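The contradiction in Example 2 can likewise be checked exhaustively (an illustration, not part of the original text): a grid scan over normalised weight vectors finds none that ranks a before b and d before c at the same time.

```python
grades = {"a": (18, 12, 6), "b": (18, 7, 11),
          "c": (5, 17, 8), "d": (5, 12, 13)}  # (Physics, Maths, Economics)

def score(s, w):
    return sum(wi * gi for wi, gi in zip(w, grades[s]))

found = False
steps = 50
for i in range(steps + 1):              # grid over the weight simplex
    for j in range(steps + 1 - i):
        w = (i / steps, j / steps, (steps - i - j) / steps)
        if score("a", w) > score("b", w) and score("d", w) > score("c", w):
            found = True
print(found)  # False: the two rankings need w2 > w3 and w3 > w2 at once
```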

Example 4
Consider a programme similar to the one envisaged in the previous example. We have the following results for three students:

   g1  g2
a  14  16
b  15  15
c  16  14

All students have an average grade of 15 and they will all receive the degree. Furthermore, if the degree comes with the indication of a rank or of an average grade, these three students will not be distinguished: their equal average grade makes them indifferent. This appears desirable, since these three students have very similar profiles of grades. The use of linearity and additivity implies, however, that if a difference of one point on the first grade compensates for an opposite difference on the other grade, then a difference of x points on the first grade will compensate for an opposite difference of −x points on the other grade, whatever the value of x. If x is chosen to be large enough, this may appear dubious since it could lead, for instance, to view the following three students as perfectly equivalent with an average grade of 15:

   g1  g2
a  10  20
b  15  15
c  20  10

whereas we already argued that, in such a case, b could well be judged preferable to both a and c, even though b is indifferent to a and c in the previous table. The linearity hypothesis embodied in the use of weighted averages has the inevitable consequence that a difference of one point has a similar meaning wherever on the scale and therefore does not allow for such considerations.

Example 5
Consider three students enrolled in a degree consisting of three courses. For each course a final grade between 0 and 20 is allocated. All courses have identical importance and the minimal passing grade is 10 on average. The results are as follows:

   g1  g2  g3
a  12   5  13
b  13  12   5
c   5  13  12

It is clear that all students have an average equal to the minimal passing grade 10. They all end up tied and should all be awarded the degree. As argued in section 3.2, it might not be unreasonable to consider that final grades are only recorded on an ordinal scale, i.e. only reflect the relative rank of the students in the class, with the possible exception of a few “special grades” such as the minimal passing grade. This means that the following table might as well reflect the results of these three students:

   g1  g2  g3
a  11   4  12
b  13  13   6
c   4  14  11

since the ranking of students within each course has remained unchanged, as well as the position of grades vis-à-vis the minimal passing grade. In this case, only b (say the Dean’s nephew) gets an average above 10 and both a and c fail (with respective averages of 9 and 9.66). Note that, using different transformations, we could have favoured any of the three students. Not surprisingly, this example shows that a weighted average makes use of the “cardinal properties” of the grades. This is hardly compatible with grades that would only be indicators of “ranks”, even with some added information (a view that is very compatible with the discussion in section 3.2). As shown by the following example, it does not seem that the use of “letter grades”, instead of numerical ones, helps much in this respect.

Example 6 In many American Universities the Grade Point Average (GPA), which is nothing more than a weighted average of grades, is crucial for the attribution of degrees and the selection of students. Since courses are evaluated on letter scales, the GPA is usually computed by associating a number to each letter grade. A common “conversion scheme” is the following:

A  4  (outstanding or excellent)
B  3  (very good)
C  2  (good)
D  1  (satisfactory)
E  0  (failure)

in which the difference between two consecutive letters is assumed to be equal. Allowing for the possibility of adding “+” or “–” to the letter grades generally results in conversion schemes maintaining an equal difference between two consecutive letter grades.

Such a practice raises several difficulties. First, letter grades for a given course are generally obtained on the basis of numerical grades of some sort. This implies using a first “conversion scheme” of numbers into letters. The choice of such a scheme is not obvious. Note that, when there are no “holes” in the distribution of numerical grades, it is possible that a very small (and possibly non significant) difference in numerical grades results in a significant difference in letter grades. Secondly, the conversion scheme of letters into numbers used to compute the GPA is somewhat arbitrary. This can have a significant impact on the ranking of students on the basis of the GPA.

To show how this might happen, suppose that all courses are first evaluated on a 0–100 scale (e.g. indicating the percentage of correct answers to a multiple choice questionnaire). These numerical grades are then converted into letter grades using a first conversion scheme. These letter grades are further transformed, using a second conversion scheme, into a numerical scale and the GPA is computed. Now consider three students evaluated on three courses on a 0–100 scale in the following way:

   g1   g2  g3
a   90  69  70
b   79  79  89
c  100  70  69

Using an E to A letter scale, a common conversion scheme (that is used in many Universities) is

A  90–100%
B  80–89%
C  70–79%
D  60–69%
E   0–59%

This results in the following letter grades:

   g1  g2  g3
a  A   D   C
b  C   C   B
c  A   C   D

Supposing the three courses of equal importance and using the conversion scheme of letter grades into numbers given above, the calculation of the GPA is as follows:

   g1  g2  g3  GPA
a  4   1   2   2.33
b  2   2   3   2.33
c  4   2   1   2.33

making the three students equivalent. Now another common (and actually used) scale for converting percentages into letter grades is as follows:

A+  98–100%
A    94–97%
A–   90–93%
B+   87–89%
B    83–86%
B–   80–82%
C+   77–79%
C    73–76%
C–   70–72%
D    60–69%
F     0–59%

This scheme would result in the following letter grades:

   g1  g2  g3
a  A–  D   C–
b  C+  C+  B+
c  A+  C–  D

Maintaining the usual hypothesis of a constant “difference” between two consecutive letter grades, we obtain the following conversion scheme:

A+  10
A    9
A–   8
B+   7
B    6
B–   5
C+   4
C    3
C–   2
D    1
F    0

which leads to the following GPA:

   g1  g2  g3  GPA
a   8   1   2  3.66
b   4   4   7  5.00
c  10   2   1  4.33

In this case, b (again the Dean’s nephew) gets a clear advantage over a and c. It should be clear that standardisation of the original numerical grades before conversion offers no clear solution to the problem uncovered.

Example 7 We argued in section 3.2 that small differences in grades might not be significant at all, provided they do not involve crossing any “special grade”. The explicit treatment of such imprecision is problematic using a weighted average; most often, it is simply ignored. Consider the following example in which three students are enrolled in a degree consisting of three courses. For each course a final grade between 0 and 20 is allocated. All courses have the same weight and the minimal passing grade is 10 on average. The results are as follows:

   g1  g2  g3
a  13  12  11
b  11  13  12
c  14  10  12

All students will receive an average grade of 12 and will all be judged indifferent. If all instructors agree that a difference of one point in their grades (away from 10) should not be considered as significant, student a has good grounds to complain. He can argue that he should be ranked before b: he has a significantly higher grade than b on g1 while there is no significant difference between the other two grades. The situation is the same vis-à-vis c: a has a significantly higher grade on g2 and this is the only significant difference.

In a similar vein, using the same hypotheses, the following table appears even more problematic:

   g1  g2  g3
a  13  12  11
b  11  13  12
c  12  11  13

since, while all students clearly obtain a similar average grade, a is significantly better than b (he has a significantly higher grade on g1 while there are no significant differences on the other two grades), b is significantly better than c and c is

significantly better than a (the reader will have noticed that this is a variant of the Condorcet paradox mentioned in chapter 2). In view of these few examples, we hope to have convinced the reader that although the weighted sum is a very simple and almost universally accepted rule, its use may be problematic for aggregating grades. In particular, the necessity and feasibility of using rules that completely rank order all students might well be questioned. If it is admitted that there is no easy way to evaluate the performance of a student in a given course, there is no reason why there should be an obvious one for an entire programme. Aggregation rules using weighted sums will be dealt with again in chapters 4 and 6.

3.4 Conclusions

We all have been accustomed to seeing our academic performances in courses evaluated through grades and to seeing these grades amalgamated in one way or another in order to judge our “overall performance”. Most of us routinely grade various kinds of work, prepare exams, write syllabi specifying a grading policy, etc. In this chapter, we have tried to show that these activities may not be as simple and as unproblematic as they appear to be. In particular, we discussed the many elements that may obscure the interpretation of grades and argued that the common weighted sum rule to amalgamate them may not be without difficulties. Since grades are a complex evaluation model, this is not overly surprising. We expect such difficulties to be present in the other types of evaluation models that will be studied in this book.

We would like to emphasise a few simple ideas to be drawn from this example that we should keep in mind when working on different evaluation models:

• building an evaluation model is a complex task even in simple situations. Actors are most likely to modify their behaviour in response to the implementation of the model;
• “evaluation operations” are complex and should not be confused with “measurement operations” in Physics. When they result in numbers, using “numbers” may be only a matter of convenience and does not imply that any operation can be meaningfully performed on these numbers;
• the aggregation of the result of several evaluation models should take the nature of these models into account. The information to be aggregated may itself be the result of more or less complex aggregation operations (e.g. aggregating the grades obtained at the mid-term and the final exams) and may be affected by imprecision, uncertainty and/or inaccurate determination. The properties of these numbers should therefore be examined with care;
• aggregation models should be analysed with care. Even the simplest and most familiar ones may in some cases lead to surprising and undesirable conclusions.
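As an illustration of that last point, the Condorcet-like cycle of Example 7 can be checked mechanically. The following sketch is ours, for illustration only; the grades and the one-point significance threshold are taken from the example:

```python
# Students and grades from Example 7; a difference of at most one point
# is treated as non significant.
grades = {"a": [13, 12, 11], "b": [11, 13, 12], "c": [12, 11, 13]}

def significantly_better(s, t, threshold=1):
    """s beats t if s is significantly higher on some course
    (difference > threshold) and significantly lower on none."""
    diffs = [gs - gt for gs, gt in zip(grades[s], grades[t])]
    return any(d > threshold for d in diffs) and all(d >= -threshold for d in diffs)

wins = [(s, t) for s in grades for t in grades
        if s != t and significantly_better(s, t)]
print(wins)  # [('a', 'b'), ('b', 'c'), ('c', 'a')]: a cycle
```

All three students have the same average (12), yet the pairwise comparisons cycle, so no complete ranking is compatible with them.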

Finally, we hope that this brief study of the evaluation procedures of students will also be the occasion for instructors to reflect on their current grading practices. This has surely been the case for the authors.
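The ranking reversal produced by the two conversion schemes of Example 6 is equally easy to reproduce. The sketch below is ours, for illustration; it encodes the percentage grades and the two letter scales used in that example:

```python
# Percentage grades of Example 6 and the two letter-to-points schemes.
scores = {"a": [90, 69, 70], "b": [79, 79, 89], "c": [100, 70, 69]}

def to_points(pct, cutoffs):
    """cutoffs: (lower bound, points) pairs, highest bound first."""
    return next(pts for low, pts in cutoffs if pct >= low)

scheme1 = [(90, 4), (80, 3), (70, 2), (60, 1), (0, 0)]            # A..E -> 4..0
scheme2 = [(98, 10), (94, 9), (90, 8), (87, 7), (83, 6), (80, 5),
           (77, 4), (73, 3), (70, 2), (60, 1), (0, 0)]            # A+..F -> 10..0

gpa1 = {s: sum(to_points(p, scheme1) for p in ps) / 3 for s, ps in scores.items()}
gpa2 = {s: sum(to_points(p, scheme2) for p in ps) / 3 for s, ps in scores.items()}
# gpa1 ties the three students; gpa2 ranks b strictly first.
```

Both schemes are defensible, yet the first declares the three students equivalent while the second produces a strict ranking.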

4 CONSTRUCTING MEASURES: THE EXAMPLE OF INDICATORS

Our daily life is filled with indicators: I.Q., consumer price index, Dow Jones, social position index, poverty index, rate of return, GNP, air quality, physicians per capita, . . . If you read a newspaper, you could feel that these magic numbers rule the world. The EU countries with a deficit/GNP ratio lower than 3% will be allowed to enter the EURO. The World Bank threatens country x to suspend its help if it doesn’t succeed in bringing indicator y to level z. Today’s air quality is 7: older persons, pregnant women and young children should stay indoors. Note that, in many cases, the decisions of the World Bank to withdraw help are not motivated by economic or financial reasons. Violations of human rights are often presented as the main factor. But it is worth noting that indicators of human rights also exist (see e.g. Horn (1993)).

Why are these indicators (often called indices) so powerful? Probably because it is commonly accepted that they faithfully reflect reality. This forces us to raise several questions.

1. Is there one reality, several realities or no reality? Many philosophers nowadays consider that reality is not unique. Each person has a particular perception of the world and, hence, a particular reality. One could argue that these particular realities are just particular views of the same reality but, as it is impossible to consider reality independently of our perception of it, it might be meaningless to consider that reality exists per se (Roy 1990).

2. Whatever the answer to the previous question, can we hope that an indicator faithfully reflects reality (the reality or a reality)? Reality is so complex that this is doubtful. Therefore, we must accept that an indicator accounts only for some aspects of reality. As a consequence, an indicator might only be relevant for the person who constructed it. Hence, an indicator must be designed so as

to reflect those aspects that are relevant with respect to our concerns.

3. An indicator is generally used by many different people. Can we assume that their concerns are similar? As an illustration, the Human Development Index (HDI), defined by the United Nations Development Programme (UNDP) to measure development (United Nations Development Programme 1997), is used by many different people in different continents and in different areas of activity (politicians, businessmen, economists, . . . ). In the Human development report 1997, page 14, UNDP proudly reports that

    The HDI has been used in many countries to rank districts or counties as a guide to identifying those most severely disadvantaged in terms of human development. Several countries, such as the Philippines, have used such analysis as a planning tool. [. . . ] The HDI has been used especially when a researcher wants a composite measure of development. For such uses, other indicators have sometimes been added to the HDI.

This clearly shows that many people used the HDI in completely different ways. Furthermore, are the concerns of UNDP itself with respect to the HDI clearly defined? Why do they need the human development index? To cut subsidies to nations evolving in the wrong direction? To share subsidies among the poorest countries (according to what key)? To put some pressure on the governments performing the worst? To prove that Western democracies have the best political systems?

4. Suppose that the purpose of an indicator is clearly defined. Are we sure that this indicator indicates what we want it to? Do the arithmetic operations performed during the computation of the indicator lead to something that makes sense?

Let us now discuss in detail three well known indicators arising in completely different areas of our lives: the human development index, the air quality index and the decathlon score.
4.1 The human development index

As stated by the United Nations Development Programme (1997), the human development index measures the average achievements in a country in three basic dimensions of human development: longevity, knowledge and a decent standard of living. A composite index, the HDI thus contains three variables: life expectancy, educational attainment (adult literacy and combined primary, secondary and tertiary enrollment) and real GDP (Gross Domestic Product) per capita expressed in PPP$ (Purchasing Power Parity $).

HDI’s precise definition is presented on page 122 of the 1997 Human Development Report. The HDI is a simple average of the life expectancy index, the educational attainment index and the adjusted real GDP per capita (PPP$) index. Here is how each index is computed.

Life Expectancy Index (LEI) This index measures life expectancy at birth. In order to normalise the scale of this index, a minimum value (25 years) and a maximum one (85 years) have been defined. The index is defined as

    LEI = (life expectancy at birth − 25) / (85 − 25).

Hence, it is a value between 0 and 1.

Educational Attainment Index (EAI) It is a combination of two other indicators: the Adult Literacy Index (ALI) and the combined primary, secondary and tertiary Enrollment Ratio Index (ERI). The first one is the proportion of literate adults while the second one is the proportion of children in age of primary, secondary or tertiary school that really go to school. The EAI is a weighted average of ALI and ERI, i.e. it is equal to

    EAI = (2 ALI + ERI) / 3.

Hence, it is a value between 0 and 1.

Adjusted real GDP per capita (PPP$) Index (GDPI) This index aims at measuring the income per capita. As the value of one dollar for someone earning $100 is much larger than the value of one dollar for someone earning $100 000, the income is first transformed using Atkinson’s formula (Atkinson 1970). The transformed value of y, W(y), is given by one of the following:

    W(y) = y                                            if 0 < y < y*,
    W(y) = y* + 2[(y − y*)^(1/2)]                       if y* ≤ y < 2y*,
    W(y) = y* + 2(y*)^(1/2) + 3[(y − 2y*)^(1/3)]        if 2y* ≤ y < 3y*,
    . . .
    W(y) = y* + 2(y*)^(1/2) + 3(y*)^(1/3) + . . .
           + n[(y − (n − 1)y*)^(1/n)]                   if (n − 1)y* ≤ y < ny*.

In this formula, y represents the income, W(y) the transformed income and y* is set at $5 835 (PPP$), which was the World average annual income per capita in 1994. Thereafter, the income scale is normalised, using the maximum value of $40 000, the minimum value of $100 and the formula

    GDPI = (transformed income − W(100)) / (W(40 000) − W(100)).

Hence, it is a value between 0 and 1.
Note that W (40 000) = 6 154 and W (100) = 100. W (y). ∗ y + 2(y ∗ )1/2 + 3(y ∗ )1/3 + .
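The piecewise formula can be written down directly; the following sketch is ours, for illustration, and reproduces the two values just quoted:

```python
# Atkinson's transformation with y* = 5 835 PPP$, as defined in the text.
Y_STAR = 5_835

def W(y):
    """Transformed income: linear below y*, then progressively flattened."""
    if y < Y_STAR:
        return y
    total, n = Y_STAR, 2
    while y >= n * Y_STAR:              # accumulate the full y*-wide slices
        total += n * Y_STAR ** (1 / n)
        n += 1
    return total + n * (y - (n - 1) * Y_STAR) ** (1 / n)

def gdpi(y):
    """Normalised adjusted-income index, as defined in the text."""
    return (W(y) - W(100)) / (W(40_000) - W(100))

print(round(W(100)), round(W(40_000)))   # 100 6154
```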

Some words about the data and their collection time: the Human Development Report is a yearly publication (since 1990). Obviously, the 1997 report does not contain the 1997 data. Indeed, the HDI computed in the 97 report is considered by the UNDP as the HDI of 1994. To make things more complicated, the 199i HDI (in the 199j report) is an aggregate of data from 199i (for some dimensions) and from earlier years (for other dimensions). In this volume, we use only data from the 1997 Human Development Report. We refer to them as HDR97.

To illustrate how the HDI works, let’s compute the HDI for Greece (HDR97). Life expectancy in Greece is 77.8 years. Hence, LEI = (77.8 − 25)/(85 − 25) = 0.880. The ALI is 0.967 and the ERI is 0.820. Hence, EAI = (2 × 0.967 + 0.820)/3 = 0.918. Greece’s real GDP per capita, at $11 265, is above y* by less than twice y*. Thus the adjusted real GDP per capita for Greece is $5 982 (PPP$) because 5 982 = 5 835 + 2(11 265 − 5 835)^(1/2). Hence GDPI = (5 982 − W(100))/(W(40 000) − W(100)) = (5 982 − 100)/(6 154 − 100) = 0.972. Finally, Greece’s HDI is (0.880 + 0.918 + 0.972)/3 = 0.923.

4.1.1 Scale Normalisation

To obtain the LEI and the GDPI, maximum and minimum values have been defined so that, after normalisation, the range of the index is [0, 1]. The choice of these bounds is quite arbitrary. Why 25 and 85 years? Is 25 years the smallest observed value? No. In HDR97, the lowest observed value is 22.6 (Rwanda, HDR97). Therefore the LEI is negative for Rwanda. The value of 25 was chosen for the first report (1990), at a time when the lowest observed value was above 35. At that time, no one would ever have thought that life expectancy could be lower than 25. To avoid this problem, they could have chosen a much lower value: 20 or 10. The likelihood of observing a value smaller than the minimum would have been much smaller.

But the choice of the bounds is not without consequences. Consider the following example. Suppose that the EAI and GDPI have been computed for South Korea and Costa Rica (HDR97). We also know the life expectancy at birth for both countries (see Table 4.1).

                life expectancy   EAI   GDPI
South Korea          71.5         .93    .97
Costa Rica           76.6         .86    .95

Table 4.1: Bounds: life expectancy, EAI and GDPI for South Korea and Costa Rica (HDR97)

If the maximum and minimum for life expectancy are set to 85 and 25, then the HDI is 0.892 for South Korea and 0.890 for Costa Rica. But if they are set to 80 and 25, then the HDI is 0.915 for South Korea and 0.916 for Costa Rica. In the first case, Costa Rica is less developed than South Korea while, in the second one, we obtain the converse: Costa Rica is more developed than South Korea. Hence, the choice of the bounds matters.
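This sensitivity is easy to reproduce from the definitions; a minimal sketch (ours) using the values of Table 4.1:

```python
# Life expectancy, EAI and GDPI from Table 4.1 (HDR97).
countries = {"South Korea": (71.5, 0.93, 0.97), "Costa Rica": (76.6, 0.86, 0.95)}

def hdi(life_exp, eai, gdpi, lo=25, hi=85):
    lei = (life_exp - lo) / (hi - lo)
    return (lei + eai + gdpi) / 3

wide = {c: hdi(*v) for c, v in countries.items()}           # bounds [25, 85]
narrow = {c: hdi(*v, hi=80) for c, v in countries.items()}  # bounds [25, 80]
# wide ranks South Korea first; narrow ranks Costa Rica first.
```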

In fact, narrowing the range of life expectancy from [25, 85] to [25, 80] increases the difference between any two values of LEI by a factor (85 − 25)/(80 − 25). Hence it amounts to increasing the weight of LEI by the same factor. In our example, Costa Rica performed better than South Korea on life expectancy. Hence, it is not surprising that its position is improved when life expectancy is given more weight (by narrowing its range).

Note that, apparently, no bounds were fixed for the ALI and the ERI. In fact, this is equivalent to choosing 1 for maximum and 0 for minimum. It is obvious that values 0 and 1 have not been observed and are not likely to be observed in a foreseeable future. Hence the range of these scales is narrower than [0, 1] and the scale could be normalised, using other values than 0 and 1. This is also an arbitrary choice.

4.1.2 Compensation

Consider Table 4.2 where the data for two countries (Gabon and the Solomon Islands, HDR97) are presented.

                  life expectancy  ALI  ERI  real GDP
Gabon                  54.1        .63  .60    3 641
Solomon Islands        70.8        .62  .47    2 118

Table 4.2: Compensation: performances of Gabon and Solomon Islands (HDR97)

Gabon is slightly better than the Solomon Islands on all dimensions except life expectancy, where it is very bad. The Solomon Islands perform quite well on all dimensions. For us, this very short life expectancy is clearly a sign of severe underdevelopment: in reality, extreme weaknesses should not be compensated, even by very good performances on other dimensions. Yet, in spite of the informal analysis we performed on the table, the HDI is equal to 0.56 for both Gabon and Solomon Islands. Hence, we should conclude that Gabon and Solomon Islands are at the same development level.

This problem is due to the fact that we used the usual average to aggregate our data into one number. Weaknesses on some dimensions are compensated by strengths on other dimensions. This is probably desirable, to some extent. Nevertheless, as any weakness can be compensated by a strength, even extreme weaknesses end up compensated.

In the HDI, a decrease in life expectancy by one year can be compensated by some increase in adjusted real GDP (income transformed by Atkinson’s formula). Let us compute this increase. A decrease by one year yields a decrease of LEI by 1/(85 − 25) = 0.016667. To compensate this, the GDPI must increase by the same amount. Accordingly, the adjusted real GDP must be increased by 0.016667 × (6 154 − 100) = 100.9$ (recall that W(40 000) = 6 154). The value of one year of life is thus 100.9$. Let us go further with compensation. A decrease in life expectancy by 2 years can be compensated by an increase in adjusted real GDP by 2 times 100.9$. Hence, a decrease in life expectancy by n years can be compensated by an increase in adjusted real GDP by n times 100.9$. The value 100.9 is called the substitution rate between life expectancy and adjusted real GDP.
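The substitution rate can be recovered in one line from the two normalisation constants (a sketch, using W(40 000) = 6 154 and W(100) = 100 as above):

```python
# One year of life expectancy expressed in adjusted real GDP (PPP$).
lei_per_year = 1 / (85 - 25)            # LEI lost per year of life
gdp_per_gdpi = 6_154 - 100              # adjusted GDP per unit of GDPI
rate = lei_per_year * gdp_per_gdpi
print(round(rate, 1))                   # 100.9
```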

Other substitution rates are easy to compute: e.g. the substitution rate between life expectancy and adult literacy is 0.016667(1 − 0)(3/2) = 0.025. To compensate a decrease of n years of life expectancy, you need an increase of the adult literacy index of n times 0.025.

Let us now think in terms of real GDP (not adjusted). In a country where real GDP is 700$ (Chad, HDR97), a decrease in life expectancy by one year can be compensated by an increase in real GDP of 100.9$. In a country where real GDP is 13 071$ (Cyprus, HDR97), a decrease in life expectancy of one year can be compensated by an increase in real GDP of 21 084$. Hence, poor people’s life expectancy has much less value than that of rich ones.

4.1.3 Dimension independence

Consider the example of Table 4.3. Countries x and y perform equally badly on life expectancy. y is much lower than x on adult literacy but much higher than x on income.

   life expectancy  ALI  ERI  real GDP
x        30         .80  .65     500
y        30         .35  .40   3 500

Table 4.3: Independence: performances of x and y

As life expectancy is very short, one might consider that adult literacy is not very important (because there are almost no adults) but that income is more important because it improves quality of life in other respects. Furthermore, health conditions and life expectancy can be expected to improve rapidly due to a higher income. In such conditions, one could conclude that y is more developed than x. Our conclusion is confirmed by the HDI: 0.30 for x and 0.34 for y.

Let us now compare two countries, w and z, similar to x and y except that life expectancy is equal to 70 for both w and z (see Table 4.4).

   life expectancy  ALI  ERI  real GDP
w        70         .80  .65     500
z        70         .35  .40   3 500

Table 4.4: Independence: performances of w and z

As life expectancy is high, the adult population is very important and its illiteracy is a severe problem. The performance of z on adult literacy is really bad compared to that of w. Even if the high income of z is used to foster education, it will take decades before a significant part of the population is literate. On the contrary, w’s low income doesn’t seem to be a problem for the quality of life, as life expectancy is high as well as education. In such conditions, it might not be unreasonable to conclude that w is more developed than z. But if we compute the HDI, we obtain 0.52 for w and 0.56 for z! This should not be a surprise.
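The four values can be recomputed from the tables; a sketch (ours; the incomes are below y*, so Atkinson's transformation leaves them unchanged):

```python
# Profiles of Tables 4.3 and 4.4: (life expectancy, ALI, ERI, real GDP).
profiles = {"x": (30, .80, .65, 500), "y": (30, .35, .40, 3_500),
            "w": (70, .80, .65, 500), "z": (70, .35, .40, 3_500)}

def hdi(life_exp, ali, eri, gdp):
    lei = (life_exp - 25) / (85 - 25)
    eai = (2 * ali + eri) / 3
    gdpi = (gdp - 100) / (6_154 - 100)   # W(gdp) = gdp below y*
    return (lei + eai + gdpi) / 3

h = {c: round(hdi(*v), 2) for c, v in profiles.items()}
print(h)  # {'x': 0.3, 'y': 0.34, 'w': 0.52, 'z': 0.56}
```

Raising both life expectancies from 30 to 70 adds exactly the same amount to both scores, so the ranking within each pair cannot change.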

The differences in life expectancy between x and w and between y and z are equal. Hence, this results in the same increase of the HDI (compared to x and y) for both w and z. When a sum (or an average) is used to aggregate different dimensions, identical performances by two items (countries or whatever) on one or more dimensions are not relevant for the comparison of these items. The identical performances can be changed in any direction: as long as they remain identical, they do not affect the way both items compare to each other. This is called dimension independence. It is inherent to sums and averages. But, as our example shows, dimension independence might not always be desirable.

4.1.4 Scale construction

In a way, we already have discussed this topic in Section 4.1.1 (Scale Normalisation). But there is more to scale construction than scale normalisation. For example, concerning real GDP, before normalising this scale, the real GDP is adjusted using Atkinson’s formula. The goal of this adjustment is obvious: if you earn 40 000 dollars, one more dollar is negligible; if you earn 100 dollars, one more dollar is considerable. Atkinson’s formula reflects this. But why choose y* = $5 835? Why choose Atkinson’s formula? Other formulas and other values for y* would work just as well. Once more, an arbitrary choice has been made and we could easily build a small example showing that another arbitrary (but defendable) choice would yield a different ranking of the countries.

Note that the fact that life expectancy, adult literacy and enrollment have not been adjusted is also an arbitrary choice. One could argue that improving life expectancy by one year in a country where life expectancy is 30 is a huge achievement while it is a moderate one in a country where life expectancy is 70. Some could even argue that increasing life expectancy above a certain threshold is no longer an improvement: it increases the health budget in such proportions that no more resources are available for other important areas: education, employment policy, . . .

4.1.5 Statistical aspects

Let us consider the four indices of the HDI from a statistical point of view. The life expectancy index is the average, over the population and for a determined time period, of the length of the lives of the individuals in the population. It is well known that averages, even if they are useful, cannot reflect the variety present in the population. A country where approximately everyone lives until 50 has a life expectancy of 50 years. A country where a part of the population (rural or poor or of some race) dies early and where another part of the population lives until 80 might also have a life expectancy of 50 years. This should be kept in mind when we compare countries on the basis of life expectancy, education and income.

Note that this kind of average is quite particular. It is very different from the average that we perform when, for example, we have several measures of the weight of an object and we consider the average as a good estimate of its actual weight. The weight of an object really exists (as far as reality exists).
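The point about hidden variety can be made concrete with two hypothetical cohorts (numbers ours, for illustration):

```python
# Two very different populations with the same "life expectancy".
uniform = [50] * 100                    # everyone dies around 50
split = [20] * 50 + [80] * 50           # half die early, half live to 80

def mean(xs):
    return sum(xs) / len(xs)

print(mean(uniform), mean(split))       # 50.0 50.0
```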

On the contrary, even if reality exists, the average of the length of life doesn’t correspond to something real. It is the length of life of a kind of average or ideal human, as if we (the real humans) were imperfect, irregular or noisy copies of that average human. Until the 19th century, both kinds of averages were called by different names (moyenne proportionnelle, for different measures of one object, and valeur commune, for different objects, each measured once) and considered as completely different. During the 19th century, the Belgian astronomer and statistician Quetelet (1796–1874) invented the concept of the average human and unified both averages (Desrosières 1995).

To convince you that the concept of the average human is quite strange (though possibly useful), consider a country where all inhabitants are right triangles of different sizes and shapes (example borrowed from Warusfel (1961)). To make it easy, let us suppose that there are just two kinds of right triangles (see Fig. 4.1). A statistician wants to measure the average right triangle. In order to do so, he computes the average length of each edge. What he gets is a triangle with edges of length 4, 8 and 9. The average right triangle is no longer a right triangle! What looks like a right angle is in fact approximately a 91 degrees angle, i.e. the average triangle is not right-angled, since 4² + 8² ≠ 9².

[Figure 4.1: Two right triangles (with edges 3, 4, 5 and 5, 12, 13) and their average (with edges 4, 8, 9)]

In the same spirit, Quetelet measured the average size of humans, in all dimensions, including the liver, heart, spleen and other organs. What he got was an average human in which it was impossible to fit all its average organs, in the same proportion. They were too large!

The adult literacy index is quite different: it is just the number of literate adults, divided by the total adult population to allow comparisons between countries. Hence one could think it is not an average. In fact it depends on how we interpret it. If we consider that an ALI of 0.60 means that 60% of the population is literate, then it is not an average. If we consider that an ALI of 0.60 means that the average literacy level is 60%, then it is an average. Consider a variable whose value is 0 for an illiterate adult and 1 for a literate one. Compute the average of this variable over the population and over some time period. What do you get? The adult literacy index! And this last interpretation is not more silly than computing a life expectancy index.

We can analyse the enrolment ratio index and the adjusted real GDP index in the same way as the ALI, the first one being a proportion and the second one being normalised. They are quantities that are measured at country level but, in many cases, they can also be interpreted at individual level. What about the HDI itself? According to the United Nations Development Programme (1997), it is designed to

[. . . ] measure the average achievements in a country [. . . ]

The HDI somehow describes how developed the average human in a country is. Furthermore, the HDI contains an index (LEI) which can only be interpreted bearing in mind Quetelet’s average human. Therefore the ALI, GDPI and HDI should be interpreted in this way as well.

4.2 Air quality index

Due to the alarming increase in air pollution, mainly in urban areas, several governments and international organisations have, during the last decades, edited some norms concerning pollutants’ concentration in the air (e.g. the Clean Air Act in the US). Usually these norms specify, for each pollutant, a concentration that should not be exceeded. Naturally, these norms are just norms and they are often exceeded. Therefore, as a good quality air is not guaranteed by norms, different monitoring systems have been developed in order to provide governments as well as citizens with some information about air pollution. Two examples of such systems are the Pollutant Standards Index (PSI), developed by the US Environmental Protection Agency ((Ott 1978) or http://www.epa.gov/oar/oaqps/psi.html), and the ATMO index, developed by the French Environment Ministry (http://www-sante.ujf-grenoble.fr/SANTE/paracelse/envirtox/Pollatmo/Surveill/atmo.html). These two indicators are very similar and we will discuss the French ATMO. In the following paragraphs, we discuss some problems arising with the ATMO index.

The ATMO index is based on the concentration of 4 major pollutants: sulfur dioxide (SO2), nitrogen dioxide (NO2), ozone (O3) and particulate matter (soot, dust, particles). For each pollutant, a sub-index is computed and the final ATMO index is defined as being equal to the largest sub-index. Here is how each sub-index is defined. For each pollutant, the concentration is converted into a number on a scale from 1 to 10. Level 1 corresponds to an air of excellent quality, levels 5 and 6 are just around the EU long term norms, level 8 corresponds to the EU short term norms and 10 indicates hazardous conditions. To illustrate, suppose that the sub-indices are as in Table 4.5.

NO2  SO2  O3  dust
 3    3   3    8

Table 4.5: Sub-indices of the ATMO index

The resulting ATMO index is the largest value, that is 8. Hence the air quality is very bad.

4.2.1 Monotonicity

Suppose that, due to heavy traffic, the absence of wind and a very sunny day, the ozone sub-index increases from 3 to 8 for the air described in Table 4.5. Clearly, this corresponds to a worse air: no pollutant decreased and one of them increased. Therefore, we expect the ATMO index to worsen as well. In fact the ATMO index does not change.

But in such a case, the ATMO index does not change: the maximum is still 8. Note that if the ozone sub-index decreases from 8 to 3, the ATMO index does not change either, though the air quality improves. This shows that the ATMO index is not monotonic. Thus some changes, even significant ones, in both directions, are not reflected by the index. In our example, the change is very significant, as the ozone sub-index was almost perfect and became very bad.

4.2.2 Non compensation

Let us consider the ATMO index for two different airs (x and y), as described by Table 4.6.

    pollutant    NO2   SO2   O3   dust
    air x         1     1     6     1
    air y         5     4     5     5

    Table 4.6: Sub-indices for x and y

Air x is perfect on all measurements but one: it scores just above the EU long term norm for ozone. Air y is good on no dimension: it is of average quality on all dimensions and close to the EU long term norms for three of them. The ATMO index is 6 for air x and 5 for air y; hence, the quality of air x is considered to be lower than that of air y. The small weakness of x (6 compared to 5, for ozone) is not compensated by its large strengths (1 compared to 4 or 5, for nitrogen dioxide, sulfur dioxide and dust). Here, no compensation at all occurs between the different dimensions. Contrary to what we observed with the HDI, where the compensation between dimensions was too strong, we face another extreme: no compensation at all, which is probably not better.

4.2.3 Meaningfulness

Let us forget our criticism of the ATMO index and suppose that it works well. Consider the statement "Today's ATMO index (6) is twice as high as yesterday's index (3)". What does it mean? We are going to show that it is meaningless. Let us come back to the definition of the sub-indices. For a given pollutant, the concentration is measured in µg/m3. The concentration figures are then transformed into numbers between 1 and 10. This is done in an arbitrary way: the relevant information provided by the index is not the figure itself, it is some information about the fact that we are above or below some norms that are related to the effects of the pollutants on health (a somewhat similar situation has been encountered in Chapter 3). The index would work as well if, instead of choosing 5-6 for the EU long term norms and 8 for the short term ones, 6-7 and 9 had been chosen. But in such a case, the values of today's and yesterday's index would be different, say 7 and 4, and 7 is not twice as large as 4. Hence, the statement "Today's ATMO index (6) is twice as high as
yesterday's index (3)" would be valid, or meaningful, only in a particular context, depending upon arbitrary choices. Such a statement is said to be meaningless. On the contrary, the statement "Today's ATMO sub-index for ozone (6) is higher than yesterday's sub-index for ozone (3)" is meaningful: any reasonable transformation of the concentration figures into numbers between 1 and 10 would lead to the same conclusion, namely that today's sub-index is higher than yesterday's. By "reasonable transformation" we mean a transformation that preserves the order: a concentration cannot be transformed into an index value lower than the index value corresponding to a lower concentration. Concentrations of 110 and 180 µg/m3 can be transformed into 3 and 6, or 4 and 6, or 2 and 4, but not into 4 and 2. More subtle: is the sentence "Today's ATMO index (6) is larger than yesterday's ATMO index (3)" meaningful? In the previous paragraph, we saw that the arbitrariness involved in the construction of the 1 to 10 scale of a sub-index is not a problem when we want to compare two values of the same sub-index. But if we want to compare two values of two different sub-indices, this is no longer true: a value of 3 on one sub-index could be more dangerous for health than a 6 on another sub-index. Of course, the scales have been constructed with care: 5 corresponds to the EU long term norms on all sub-indices and 8 to the short term norms. This is intended to make all sub-indices commensurable, so that comparisons should be meaningful. But can we really assume that a 5 (or the corresponding concentration in µg/m3) is equivalent on two different sub-indices? Equivalent in what terms? Some pollutants might have short term effects and other pollutants long term effects; they can affect different parts of the organism. Should we compare the effects in terms of discomfort, mortality after n years, health care costs, . . . ?
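The aggregation rule and the problems discussed above can be replayed in a few lines of code. This is a sketch for illustration only: the function and variable names are ours, and the sub-index values follow Tables 4.5 and 4.6.

```python
# Sketch of the ATMO aggregation rule: the index is the largest sub-index.
# Sub-index values follow Tables 4.5 and 4.6; all names are illustrative.

def atmo(sub_indices):
    """Return the ATMO index: the maximum of the pollutant sub-indices."""
    return max(sub_indices.values())

# The air of Table 4.5: a high dust sub-index dominates.
air = {"NO2": 3, "SO2": 3, "O3": 3, "dust": 8}
assert atmo(air) == 8

# Non-monotonicity: raising the ozone sub-index from 3 to 8 clearly
# worsens the air, but the index does not move (the maximum is unchanged).
worse_air = dict(air, O3=8)
assert atmo(worse_air) == atmo(air)

# Non-compensation (Table 4.6): x is excellent on every dimension but
# ozone, y is mediocre everywhere, yet x receives the worse (higher) index.
x = {"NO2": 1, "SO2": 1, "O3": 6, "dust": 1}
y = {"NO2": 5, "SO2": 4, "O3": 5, "dust": 5}
assert atmo(x) == 6 and atmo(y) == 5
```

The assertions all pass: the max operator ignores every dimension except the worst one, which is exactly the non-monotonicity and non-compensation discussed above.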

4.3 The decathlon score

The decathlon is a 10-event athletic contest. It consists of 100-meter, 400-meter and 1 500-meter runs, a 110-meter high hurdles race, the javelin and discus throws, the shot put, the pole vault, the high jump and the long jump. It is usually held over two or three days; it was introduced as a three-day event at the Olympic Games of Stockholm in 1912. To determine the winner of the competition, a score is computed for each athlete and the athlete with the best score is the winner. This score is the sum of the single-event scores.

The single-event scores are not just times and distances: it makes no sense to add the time of a 100-meter run to the time of a 1 500-meter run, and it is even worse to add the time of a run to the length of a jump. This should be obvious for everyone. Until 1908, the single-event scores were just the rank of an athlete in that event. For example, if an athlete performed the third best high jump, his single-event score for the high jump was 3. The winner was thus the athlete with the lowest overall score. Note that this amounts to using the Borda method (see p. 14) to elect the best athlete, when there are ten voters and the preferences of each voter are the rankings defined by each event.

The main problem with these single-event scores is that they very poorly reflect the performances of the athletes. Suppose that an athlete finished 0.1 second before the next athlete in the 100-meter run. They have ranks i and i + 1, so the difference in the scores that they receive is 1. Suppose now that the gap between these two athletes is 1 second. Their ranks are unchanged, and thus the difference in the scores that they receive is still 1, though a larger difference would be more appropriate. That is why other tables of single-event scores have been used since 1908 (de Jongh 1992, Zarnowsky 1989). In the tables used after 1908, high scores are associated with good performances (contrary to the scores before 1908); hence, the winner is the athlete with the highest overall score.

Some of these tables (different versions, in use between 1934 and 1962) are based on the idea that improving a performance by some amount (e.g. 5 centimetres in a long jump) is more difficult when the performance is close to the world record; hence, it deserves more points. The general shape of these tables, for distances, is given in Figure 4.2 (convex table). For times (in runs), the shape is different, as an improvement is a decrease in time.

[Figure 4.2: Decathlon tables for distances: general shape of a convex (left) and a concave (right) table (points versus distance).]

A problem raised by convex tables is the following: if an athlete decides to focus on some events (for example the four kinds of runs) and to train much harder for them than for the other ones, he will have an advantage. He will come closer to the world record for runs and earn many points. At the same time, he will be further away from the world record in the other disciplines, but that will make him lose fewer points, as the slope of the curve is gentler in that direction. The balance will be positive. Thus these tables encourage athletes to focus on some disciplines, which is contrary to the spirit of the decathlon. That is why, since 1962, different concave tables (see Figure 4.2) have been used.
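The strategic contrast between convex and concave tables can be mimicked with a toy power-law scoring function. This is a sketch: the constants and the functional form are invented for illustration and are not the official tables.

```python
# Toy single-event scoring function for a distance event, mapping a
# performance p (in metres) to points. The exponent c controls the shape:
# c > 1 gives a convex table, c < 1 a concave one. All constants are
# invented so that a hypothetical reference performance is worth 1000 points.

def points(p, p_min=4.0, p_ref=9.0, c=2.0):
    """Score a distance p between a floor p_min and a reference p_ref."""
    p = max(p, p_min)  # performances below the floor earn 0 points
    return round(1000 * ((p - p_min) / (p_ref - p_min)) ** c)

# Convex table (c = 2): a 0.5 m improvement near the record is worth far
# more than the same improvement far from it.
convex_near = points(9.0, c=2.0) - points(8.5, c=2.0)
convex_far = points(5.0, c=2.0) - points(4.5, c=2.0)
assert convex_near > convex_far  # convex tables reward specialists

# Concave table (c = 0.5): the opposite holds, so neglecting an event is
# costly and all-round excellence is encouraged.
concave_near = points(9.0, c=0.5) - points(8.5, c=0.5)
concave_far = points(5.0, c=0.5) - points(4.5, c=0.5)
assert concave_near < concave_far
```

With these invented constants, the convex gain near the record is 190 points against 30 far from it, while the concave table reverses the inequality, which is precisely the strategic effect described above.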
These tables strongly encourage the athletes to be excellent in all disciplines. An example of a real table, in use in 1998, is presented in Figure 4.3. Note that a new change occurred: this table is no longer concave; it is almost linear but slightly convex. There are many interesting points to discuss about the decathlon score.

• How are the minimum and maximum values set? They can highly influence the score, as was shown with the HDI (in Section 4.1.1). Obviously, the maximum value must somehow be related to the world record. But, as everyone knows, world records are objects that athletes like to break.

• Why add single-event scores? Other operations might work as well. For example, multiplication may favour the athletes that perform equally well in all disciplines. To illustrate this point very simply, consider a 3-event contest where single-event scores are between 0 and 10. An athlete, say x, obtains 8 in all three events. Another one, y, obtains 9, 8 and 7. If we add the scores, x and y obtain the same score: 24. If we multiply the scores, x gets 512 while y loses with 504.

• ...

[Figure 4.3: A plot of the score table for the 100-meter run in 1998 (scores from 400 to 1200 points against times from 9.5 to 13 seconds).]

The point on which we will focus, in this decathlon example, is the role of the indicator.
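The additive-versus-multiplicative point above can be checked directly; a minimal sketch of the 3-event example:

```python
from math import prod

# The 3-event example from the text: x is perfectly balanced, y is not.
x = [8, 8, 8]
y = [9, 8, 7]

# The additive score cannot tell them apart ...
assert sum(x) == sum(y) == 24

# ... while the product rewards the balanced athlete.
assert prod(x) == 512 and prod(y) == 504
```

This is an instance of the arithmetic-geometric mean inequality: for a fixed sum, the product is maximised by equal scores, so a multiplicative rule structurally favours all-round athletes.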

4.3.1 Role of the decathlon score

Although one might think that the role of the overall score is clearly to designate the winner, we are going to show that it plays many roles (like student grades, see Chapter 3) and that this is one of the reasons why it changes so often. Of course, one of the roles is to designate the winner, and it was probably the only purpose that the first designers of the score had in mind. But we can be quite sure that immediately after the first contest, another role arose: many people probably used the scores to assess the performance of the athletes. One athlete has a score very
close to that of the winner and is thus a good athlete; another one is far from the winner and is consequently not a good athlete. Not much later (after the second competition), a third role appeared: how did the athletes evolve? This athlete has improved his score, or x has a better score in this contest than the score of y in the previous contest. This kind of comparison is not meaningful. Suppose that an athlete wins a contest with a certain score. In the next contest, he performs very poorly: short jumps, slow runs, short throws. But his main opponents are absent or perform equally poorly. He might still win the contest, and even with a higher score, although his performance is worse than the previous time. After some time, the organisers of decathlons became aware of the second and third roles. It was probably part of the motivation to abandon the sum of ranks and to use convex tables. These tables, to some extent, made the comparisons of scores across athletes and/or competitions meaningful. At the same time, the score found a new role as a monitoring tool during training (before 1908, the scores could be computed only during competitions, as they were sums of ranks). And it was not long before a wise coach used it as a strategic tool, advising his athlete to focus on some events. For this reason, since 1962, the organisers conferred a new role on the score: to foster excellence in all disciplines. This was achieved by the introduction of concave tables. But it is most likely that the score is still used as a strategic tool, hopefully in a less perverse way. It is worth noting that this new role does not replace any of the previous ones. The score aims at rewarding equal performances in all disciplines, but it is also used to assess the performance of an athlete. Even if we consider only these two roles (the other ones could be seen as side effects), it is amazing to see how incompatible they are.

4.4 Indicators and multiple criteria decision support

Classically, in a decision aiding process, a decision-maker wants to rank the elements of a set of alternatives (or to choose the best element). In order to rank, he selects several dimensions (criteria) that seem relevant with respect to his problem. Each alternative is characterised by a performance on each criterion (this is the evaluation matrix or performance tableau). An MCDA method is then used to rank the alternatives with respect to the preferences of the decision-maker. When an indicator is built, several dimensions are also selected; each item is characterised by a performance on each dimension, and an index that can be used to rank the items is computed. The analogy between a decision support method and an index is obvious: both aim at aggregating multi-dimensional information about a set of objects. But there is a tremendous difference as well: when an indicator is built, it is often the case that there is no clearly defined decision problem, no decision-maker and, a fortiori, no preferences. To circumvent the absence of preferences, one could consider that the preferences are those of the potential users of the indicator. To some extent, this is possible, because very often the preferences of the users
go in the same direction for each dimension taken separately. For example, for each dimension of the ATMO index, everyone prefers a lower concentration. But it is definitely not reasonable to assume that the global preferences are similar. Furthermore, even if single-dimensional preferences go in the same direction, it does not mean that they are identical. Those who are not very sensitive to a pollutant will value a decrease in concentration much more if it occurs at a high concentration than at a low one; on the contrary, sensitive people might value concentration decreases at low and high levels equally.

The relevance of measurement theory

The absence of preferences is crucial. In decision support, many studies and concepts relate to measurement theory, the theory that studies how we can measure objects (assign a number to each object) so as to reflect a relation on these objects. For example, how can we assign numbers to physical objects so as to reflect the relation "heavier than"? That is, how do we assign a number (called weight) to each object so that "x's weight > y's weight" implies "x is heavier than y"? Additional properties may be required: in the case of weight measurement, one wishes that the number assigned to x and y taken together be the sum of their individual weights. Another example is that of distance. How do we assign numbers to points in space so as to reflect the relation "more distant than" with respect to some reference point? Contrary to the previous example, this one has several dimensions (usually two or three: x, y or x, y, z or longitude, latitude, altitude, etc.). Each object (point) is characterised by a performance (co-ordinate) on each dimension and one tries to aggregate these performances into one indicator: the distance to the reference point. This problem is at the core of geometry. Note that the answer is not unique.
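The non-uniqueness can be made concrete by computing two classical distances between the same pair of points (a sketch; the coordinates are invented):

```python
from math import hypot

# Two points in the plane; coordinates are invented for illustration.
p, q = (0.0, 0.0), (3.0, 4.0)

# Euclidean distance: length of the straight line between p and q.
euclidean = hypot(q[0] - p[0], q[1] - p[1])

# Manhattan distance: length of a path made of perpendicular segments,
# as when walking along a rectangular street grid.
manhattan = abs(q[0] - p[0]) + abs(q[1] - p[1])

print(euclidean, manhattan)  # 5.0 7.0
```

Both functions reflect the relation "more distant than" from the origin, yet they assign different numbers to the same pair of points: the choice between them is a modelling convention, not a fact.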
Very often the Euclidean distance is chosen (assuming that the shortest path between two points is the straight line). Sometimes, a geodesic distance is more relevant (when you consider points on the earth's surface, unless you are a mole, the shortest path is no longer a straight line but a curve). In other circumstances, the Manhattan distance is more appropriate (between two points in Manhattan, if you are not flying, the shortest path is neither a straight line nor a curve; it is a succession of perpendicular straight lines). And there are many other distances. As far as physical properties are concerned (larger than, warmer than, faster than, . . . ), the problem is easy: good measurements were carried out in Antiquity without any theory of measurement. But when we consider other kinds of relations, things are more complex. How do we assign numbers to people or alternatives so as to reflect the relations "more loveable than", "preferable to" or "more risky than"? In such cases, measurement theory can be of great assistance, but it is insufficient to solve all problems. In decision support, measuring objects with respect to the relation "is preferred to" can be of some help because, once the objects have been measured, it is rather easy to handle numbers. It is often assumed that a preference relation over the alternatives exists but is not well known, and one tries to measure the alternatives

so as to discover the preference relation. Sometimes, the preference relation is not assumed to completely exist a priori, but some characteristics of the preference relation still exist a priori. Preferences can emerge and evolve during the decision aid process. Measurement theory can therefore be used to build or to analyse a decision support method.

Measurement theory loses some of its power when there is no a priori relation, that is a pre-existent relation, to be reflected. Many indices are built without the assumption that a relation over the items exists a priori, or without trying to reflect a pre-existent relation. On the contrary, in many cases, it seems that the aim of an index is precisely to build or create, in some arbitrary way, a relation over the items. In such a case, measurement theory cannot tell us much about the index.

Indicators and reality

The index does not help to uncover reality. It institutes or settles reality (Desrosières 1995). This is very obvious with the decathlon score. Between 1908 and 1962, the scores were designed to assess the performances and to compare them. Therefore, the score was considered as the true measure of performance. As one of the most important things for a professional athlete is to win (contrary to the opinion of de Coubertin), any athlete that was not convinced of this had to change his mind and to behave accordingly if he wanted to compete. One might be tempted to reject any indicator that, in this sense, does not reflect reality but institutes it. Nevertheless, in many cases, such indicators are not useless. If the people that created the decathlon had decided to wait until a sound theory showed them how to designate the winner, it is very likely that no decathlon contest would ever have taken place.

This is not particular to the decathlon score. Many governments probably try to exhibit a good HDI for their country in order to keep international subsidies or to legitimise their authority to the population of the country or to foreign governments, even if other policies might be more beneficial to the country. Some city councils, willing to attract high-salaried residents, claim, among others, to have high air quality. The most efficient way for them to make their claim credible is to exhibit a good ATMO index (or any other index in countries other than France).

An indicator can be considered as a kind of language. It is based on some (more or less arbitrary) conventions and helps us to efficiently communicate about different topics or perform different tasks. By "efficiently", we mean "more efficiently than without any language", not necessarily in the most efficient way. As any language, it is not always precise and leaves room for ambiguities and contradictions. Ambiguities and contradictions are certainly adequate for poetry, otherwise we could never enjoy things like this:

Mis pasos en esta calle / Resuenan / en otra calle / donde / oigo mis pasos / pasar en esta calle / donde / Sólo es real la niebla (1)

or

Wenn ich mich lehn' an deine Brust, / kommt's über mich wie Himmelslust; / doch wenn du sprichst: ich liebe dich! / so muss ich weinen bitterlich. (2)

(1) Octavio Paz, translated by Nims (1990): My footsteps in this street / Re-echo / in another street / where / I hear my footsteps / passing in this street / where / Nothing is real but the fog.
(2) Heinrich Heine, translated by Louis Untermeyer (van Doren 1928): And when I lean upon your breast / My soul is soothed with godlike rest; / But when you swear: I love but thee! / Then I must weep, and bitterly.

But, when it comes to decision-making, ambiguities and contradictions should generally be kept at a minimum. When possible, they should be avoided.

Back to multiple criteria decision support

In a decision aiding process, preferences are not perfectly known a priori. Therefore, if some measurement (associating numbers to alternatives) is performed during the aiding process, relying solely on measurement theory is not possible. Most decision aiding processes, like most indicators, probably cannot avoid some arbitrary elements. These can occur at different steps of the process: the choice of an analyst, of the criteria, of the aggregation scheme, to mention a few. But unlike cases where indicators are built without any decision problem in mind, most decision aiding processes relate to a more or less precisely defined decision problem. Here, at least some elements of preferences are present; otherwise, it would be very unlikely that any aid would be required. Consequently, when certain elements of preferences are known for sure, measurement theory can be used to ensure that the model built during the aiding process does not contradict these elements of preferences, that it reflects them, and that all sound conclusions that can be drawn from the conjunction of these elements are actually drawn.

4.5 Conclusions

Among evaluation and decision models, indicators are probably more widespread than any other model (this is definitely true if you think of cost-benefit analysis or multiple criteria decision support). Student grades are very popular as well (almost everyone has faced them at some point of his life), but, besides the fact that most people use and/or encounter them, indicators are pervasive in many domains of human activity, contrary to student grades, which are confined to education (note that student grades could be considered as special cases of indicators).

In this chapter, we analyzed three different indicators: the human development index, the ATMO (an air quality index) and the decathlon score. On the one hand, all three indicators have been shown to present flaws: they do not always reflect reality, or what we consider as reality. This is due to an excess or a lack of compensation, to non-monotonicity, to an incapability of dealing with dimension dependence, . . . These problems are not specific to indicators: some of them have already been discussed in Chapter 3 and/or will be met in Chapter 6. On the other hand, we saw that an indicator does not necessarily need to reflect reality or, at least, it does not need to reflect only reality.

Indicators are usually presented as an efficient way to synthesise information. But what do we need information for? For making decisions! Indicators are not often thought of as decision support models but, in many circumstances, they actually are.

5 ASSESSING COMPETING PROJECTS: THE EXAMPLE OF COST-BENEFIT ANALYSIS

5.1 Introduction

Decision-making inevitably implies, at some stage, the allocation of rare resources to some alternatives rather than to others (e.g. deciding how to use one's income). Decisions made by governments, public agencies and firms or international organisations are complex and have a huge variety of consequences. Some examples of areas in which Cost-Benefit Analysis (CBA) has been applied will give a hint of the type of projects that are evaluated:

• Economics: determining investment strategies for developing countries, developing an energy policy for a nation, allocating budgets among agencies (Dinwiddy and Teal 1996, Kirkpatrick and Weiss 1996, Little and Mirlees 1968, Little and Mirlees 1974).

• Health: building new hospitals, buying new diagnosis tools, setting up prevention policies, choosing standard treatments for certain types of illnesses (Folland, Goodman and Stano 1997, Johannesson 1996).

• Environment: establishing pollution standards, creating national parks, approving the human consumption of genetically-modified organisms or irradiated food (Hanley and Spash 1993, International Atomic Energy Agency 1993, Johansson 1993, Toth 1997).

• Transportation: building new roads or motorways, building a high-speed train, reorganising the bus lines in a city (Adler 1987, Schofield 1989, Willis, Garrod and Harvey 1998).

These types of decision are immensely complex. They affect our everyday life and are likely to affect that of our children. It is therefore not at all surprising that the question of helping a decision-maker to choose between competing alternatives, projects, courses of action and/or to evaluate them has attracted the attention of economists. Cost-Benefit Analysis (CBA) is a set of techniques that economists have developed for this purpose. It is based on the following simple and apparently inescapable idea: a project should only be undertaken when its "benefits" outweigh its "costs". CBA is particularly oriented towards the evaluation of public sector projects.

Although it has distant origins (see Dupuit 1844), the development of CBA has unsurprisingly coincided with the more active involvement of governments in economic affairs that started after the great depression and climaxed after World War II, in the 50's and 60's. After having started in the USA in the field of Water Resource Management (see Krutilla and Eckstein (1958) for an overview of these pioneering developments), the principles of CBA were soon adopted in other areas and countries, the UK being the first and most active one. While research on (and applications of) CBA grew at a very fast rate during the 50's and 60's, the principles of CBA were entrenched in a series of very influential "manuals for project evaluation" produced by several international organisations (OECD: Little and Mirlees (1968), ONUDI: Dasgupta, Marglin and Sen (1972) and, more recently, World Bank: Adler (1987), Asian Development Bank: Kohli (1993)). A good overview of the early history of CBA can be found in Dasgupta and Pearce (1972).

Research on CBA is still active, and economists have spent considerable time and energy in investigating its foundations and refining the various tools that it requires in practical applications (recent references include Boardman 1996, Brent 1996, Nas 1996). Most economists view CBA as the standard way of evaluating such projects and of supporting public decision-making (numerous examples of practical studies using CBA can easily be found in applied economics journals, e.g. American Journal of Agricultural Economics, Energy Economics, Environment and Planning, Journal of Environmental Economics and Management, Journal of Health Economics, Journal of Policy Analysis and Management, Journal of Public Finance and Public Choice, Journal of Transport Economics and Policy, Land Economics, Pharmaco-Economics, Public Budgeting and Finance, Regional Science and Urban Economics, Water Resources Research). In many countries nowadays, the Law makes it an obligation to evaluate projects using the principles of CBA.

It would be impossible to give a fair account of the immense literature on CBA in a few pages (although somewhat old, two excellent introductory references are Dasgupta and Pearce (1972) and Lesourne (1975)). Less ambitiously, we shall try here to:

• give a brief and informal account of the principles underlying CBA,

• give an idea of how these principles are applied in practice,

• give a few hints on the scope and limitations of CBA.

These three objectives structure the rest of this chapter into sections. Since fairly different approaches to these problems have been advocated, it is important to have a clear idea of what CBA is. Our aim, while clearly not being to promote the use of CBA, is not to support the nowadays-fashionable claim (especially among environmentalists) that CBA is an outdated, useless technique either. In pointing out what we believe to be some

u. this amount being the diﬀerence between the “beneﬁts” and the “expenses” generated by the project (including the residual value of the project in the last period). Suppose now that a project is to be evaluated on T time periods of equal length. some of the components of this vector (most notably a(0)) will be negative (if not.2. . 5. Although all components of the evaluation vector are expressed in identical monetary units (m. The next step is to try to evaluate the consequences of the project in each of these time periods. Note that these evaluations are relative: they aim at capturing the inﬂuence of the project on the ﬁrm and not its overall situation.5. Some discussion will therefore prove useful. A simple starting point is to be found in the literature on Corporate Finance on the choice between “investment projects” in private ﬁrms. it is the only “consistent” way to support decision/evaluation processes (Boiteux 1994). Let us denote b(i) (resp. the evaluation model of the project has the form of an evaluation vector with T +1 components (a(0).1 The principles of CBA Choosing between investment projects in private ﬁrms The idea that a project should only be undertaken if its “beneﬁts” outweigh its “costs” is at the heart of CBA. . Such a task may be more or less easy depending on the nature of the project. If the very nature of the project may command this choice (e. equipment will have to be replaced) the general case is that the duration of the project is more or less conventionally chosen as the period of time for which it seems reasonable and useful to perform the evaluation.2 5. In general.g. . It is of little practical content however unless we deﬁne more precisely what “costs” and “beneﬁts” are and how to evaluate and compare them. we only want to give arguments refuting the claim of some economists that. THE PRINCIPLES OF CBA 73 limitations of CBA. This involves some arbitrariness (should we choose years or semesters?) 
as well as trade-oﬀs between the depth and the complexity of the evaluation model. At this stage. We seek to obtain an evaluation of the amount of cash that is generated by the project during each time period. under all circumstances. . you should enjoy the free lunch and there is hardly any evaluation problem). real-world applications imply dividing the duration of the project into time periods of equal length. the expenses) generated by the project during the ith period of time. A useful way to evaluate such an investment project is the following. a(T )) where 0 conventionally denotes the starting time of the project. because after a certain date the Law will change. First a time horizon for its evaluation must be chosen. the environment of the ﬁrm and the duration of the project. a(1). An investment project may usefully be seen as an operation in which money is spent today (the “costs”).2. with the hope that this money will produce even more money (the “beneﬁts”) tomorrow. This claim may seem so obvious that it need not be discussed any further.). the (algebraic) sum a(0) . c(i)) the beneﬁts (resp. The net eﬀect of the project in period i is therefore a(i) = b(i) − c(i). Although a “continuous” evaluation is theoretically possible.

Although all components of the evaluation vector are expressed in identical monetary units (m.u.), they are not directly comparable: a(0) is to be received today while a(1) will only be received one time period ahead. There is a simple way however to summarise the components of the evaluation vector using a single number.

Suppose that there is a capital market on which the firm is able to lend or borrow money at a fixed interest rate of r per time period (this market is assumed to be perfect: borrowing and lending will not affect r and are not restricted). If you borrow 1 m.u. for one time period on this market today, you will have to spend (1 + r) m.u. in period 1 in order to respect your contract. Similarly, if you know that you will receive 1 m.u. in period 1, you can borrow an amount of 1/(1 + r) m.u. now: your revenue of 1 m.u. in period 1 will allow you to reimburse exactly what you have to, i.e. [1/(1 + r)](1 + r) = 1 m.u. Thus, being sure of receiving 1 m.u. in period 1 corresponds to receiving, here and now, an amount of 1/(1 + r) m.u. Using a similar reasoning and taking into account compound interest, receiving 1 m.u. in period i corresponds to an amount of 1/(1 + r)^i m.u. now. This is what is called discounting and r is called the discounting rate.

This suggests a simple way of summarising the components of the vector (a(0), a(1), ..., a(T)) as the sum to be received now that is equivalent to this cash stream via borrowing and lending operations on the capital market. This sum, called the Net Present Value (NPV) of the project, is given by:

(5.1)    NPV = Σ_{i=0}^{T} a(i)/(1 + r)^i = Σ_{i=0}^{T} [b(i) − c(i)]/(1 + r)^i

If NPV > 0, it appears that the project makes the firm richer and, thus, should be undertaken. The reverse conclusion obviously holds if NPV < 0. When NPV = 0, the firm is indifferent between undertaking the project or not. This simple reasoning underlies the following well-known rule for choosing between investment projects in Finance: "when projects are independent, choose all projects that have a strictly positive NPV".

In deriving this simple rule, we have made various hypotheses. Most notably:
• a duration for the project was chosen,
• the duration was divided into conveniently chosen time periods of equal length,
• all consequences of the projects were supposed to be adequately modelled as benefits b(i) and costs c(i) expressed in m.u.,
• the effect of uncertainty and/or imprecision was neglected,
• a perfect capital market was assumed to exist,
• other possible constraints were ignored (e.g. projects may be exclusive or synergetic).
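As a concrete illustration, equation (5.1) can be computed in a few lines. This is only a sketch: the cash flows and the 10% rate below are invented for illustration, not taken from the text.

```python
def npv(cash_flows, r):
    """Net Present Value of an evaluation vector (a(0), a(1), ..., a(T)).

    Each a(i) = b(i) - c(i) is divided by (1 + r)**i, the discounting
    factor that makes an amount received in period i comparable to an
    amount received now (equation 5.1).
    """
    return sum(a / (1 + r) ** i for i, a in enumerate(cash_flows))

# Hypothetical project: spend 100 m.u. now, then receive 60 m.u. in each
# of the next two periods, with a discounting rate of 10% per period.
project = [-100.0, 60.0, 60.0]
print(round(npv(project, 0.10), 2))  # 4.13 > 0: undertake the project
```

Following the rule quoted above, this hypothetical project would be undertaken, since its NPV is strictly positive.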

The literature in Finance is replete with extensions of this simple model that allow to cope with less simplistic hypotheses.

5.2.2 From Corporate Finance to CBA

Although the projects that are usually evaluated using CBA are considerably more complex than the ones we implicitly envisaged in the previous paragraph, CBA may usefully be seen as using a direct extension of the rule used in Finance. The main extensions are the following:
• in CBA "costs" and "benefits" are evaluated from the point of view of "society",
• in CBA "costs" and "benefits" are not necessarily directly expressed in m.u.; when this happens, conveniently chosen "prices" are used to convert them into m.u.,
• in CBA the discounting rate has to be chosen from the point of view of "society".

Retaining the spirit of the notations used above, the benefits b(i) and costs c(i) of a project in period i are seen in CBA as vectors:

b(i) = (b(1, i), b(2, i), ..., b(ℓ, i))   and   c(i) = (c(1, i), c(2, i), ..., c(ℓ′, i))

where b(j, i) (resp. c(k, i)) denotes the "social benefits" (resp. the "social costs") on the jth dimension (resp. on the kth dimension) generated by the project in period i, evaluated in units that are specific to that dimension, and where ℓ (resp. ℓ′) denotes the number of benefit (resp. cost) dimensions. In each period, conveniently chosen "prices" are used to convert these evaluations into m.u. We denote by p(j) (resp. p′(k)) the price of one unit of social benefit on the jth dimension (resp. of one unit of social cost on the kth dimension) expressed in m.u. (for simplicity, prices are assumed to be independent from the time period). These prices are used to summarise the vectors b(i) and c(i) into single numbers expressed in m.u., letting:

b(i) = Σ_{j=1}^{ℓ} p(j) b(j, i)   and   c(i) = Σ_{k=1}^{ℓ′} p′(k) c(k, i).
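The pricing-out step just described is a simple weighted sum per period. A minimal sketch (the dimensions, quantities and unit prices below are invented for illustration):

```python
def price_out(quantities, prices):
    """Convert a per-dimension evaluation (in physical units) into m.u.
    using the chosen unit prices: sum of p(j) * q(j) over dimensions."""
    return sum(p * q for p, q in zip(prices, quantities))

# Hypothetical period i: benefits on two dimensions (1000 hours of time
# gains, 2 accidents avoided) priced at 1 and 50 m.u. per unit; costs on
# one dimension (900 units of construction work) priced at 1 m.u. per unit.
b_i = price_out([1000.0, 2.0], [1.0, 50.0])  # benefits of the period, in m.u.
c_i = price_out([900.0], [1.0])              # costs of the period, in m.u.
print(b_i - c_i)  # net social effect of the period, in m.u.: 200.0
```

The resulting per-period numbers b(i) and c(i) can then be discounted exactly as in equation (5.1).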

After this conversion, and having suitably chosen a social discounting rate r, it is possible to apply the standard discounting formula for computing the Net Present Social Value (NPSV) of a project. We have:

(5.2)    NPSV = Σ_{i=0}^{T} [b(i) − c(i)]/(1 + r)^i = Σ_{i=0}^{T} [Σ_{j=1}^{ℓ} p(j) b(j, i) − Σ_{k=1}^{ℓ′} p′(k) c(k, i)]/(1 + r)^i

where b(i) (resp. c(i)) denotes the social benefits (resp. costs) generated by the project in period i converted into m.u. A project for which NPSV > 0 will be interpreted as improving the welfare of society and, thus, should be implemented (in the absence of other constraints).

It should be observed that the difficulties that we mentioned concerning the computation of the NPV are still present here. Extra difficulties are easily seen to emerge:
• how can one evaluate "benefits" and "costs" from a "social point of view"?
• is it always possible to measure the value of "benefits" and "costs" in monetary units and how should the prices be chosen?
• how is the social discount rate chosen?

It is apparent that CBA is a "mono-criterion" approach that uses "money" as a yardstick. Clearly the foundations of such a method and the way of using it in practice deserve to be clarified.

5.2.3 Theoretical foundations

It is obviously impossible to give here a complete account of the vast literature on the foundations of CBA, which has deep roots in Welfare Economics. We would however like to give a hint of why CBA consistently insists on trying to "price out" every effect of a project. The important point here is that CBA conducts project evaluation within an "environment" in which markets are especially important instruments of social co-ordination. The following paragraphs present an elementary theoretical model that helps in understanding the foundations of CBA. It may be skipped without loss of continuity.

An elementary theoretical model

Consider a one-period economy in which m individuals consume n goods that are exchanged on markets. Each individual j is supposed to have completely ordered preferences for consumption bundles (q_{j1}, ..., q_{jn}), where q_{ji} denotes the quantity of good i consumed by individual j. These preferences can be conveniently represented using a utility function U_j(q_{j1}, ..., q_{jn}).

Social preferences are supposed to be well-defined in terms of the preferences of the individuals through a "social utility function" (or "social welfare function") W(U_1, U_2, ..., U_m). It is useful to interpret W as representing the preferences of a "planner" regarding the various "social states".

Starting from an initial situation in the economy, consider a "project", interpreted as an external shock to the economy, consisting in a modification of the quantities of goods consumed by each individual. These modifications are supposed to be marginal, i.e. they will not affect the prices of the various goods. The impact of such a shock on social welfare is given by (assuming differentiability):

(5.3)    dW = Σ_{j=1}^{m} Σ_{i=1}^{n} W_j U_{ji} dq_{ji},   where W_j = ∂W/∂U_j and U_{ji} = ∂U_j/∂q_{ji}

Social welfare will increase following the shock if dW > 0.

The existence of markets for the various goods and the hypothesis that individuals operate on these markets so as to maximise utility ensure that, before the shock, we have, for all individuals j and for all goods i and k:

(5.4)    U_{ji}/p_i = U_{jk}/p_k

where p_i denotes the price of the ith good. Having chosen a particular good for numéraire (we shall call that good "money"), this implies that:

(5.5)    U_{ji} = λ_j p_i

where λ_j can be interpreted as the marginal effect on the utility of individual j of a marginal variation of the consumption of the numéraire good, i.e. as the marginal utility of "income" for individual j. Using 5.5, equation 5.3 can be rewritten as:

(5.6)    dW = Σ_{j=1}^{m} λ_j W_j Σ_{i=1}^{n} p_i dq_{ji}

In equation 5.6, the coefficient λ_j W_j has a useful interpretation: it represents the increase in social welfare following a marginal increase of the income of individual j. Under the hypothesis that, before the shock, the distribution of income is "optimal" in the society, the conclusion is that the coefficients λ_j W_j are constant over individuals (otherwise income would have been reallocated in favour of the individuals for which λ_j W_j is the larger). Under this hypothesis, we may always normalise W in such a way that λ_j W_j = 1, for all j.
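The reduction permitted by this normalisation can be checked numerically. In the sketch below (all numbers are invented), the first-order conditions U_{ji} = λ_j p_i are imposed directly and the W_j are chosen so that λ_j W_j = 1; equation 5.3 then yields exactly the valuation of consumption changes at market prices.

```python
# Numeric check that, with U_ji = lambda_j * p_i and lambda_j * W_j = 1,
# equation (5.3) collapses to dW = sum over j and i of p_i * dq_ji.
p = [2.0, 5.0]                    # market prices of the n = 2 goods
lam = [0.5, 4.0]                  # marginal utility of income per individual
W_grad = [1.0 / l for l in lam]   # W_j chosen so that lambda_j * W_j = 1
dq = [[0.3, -0.1], [-0.2, 0.4]]   # dq[j][i]: marginal consumption changes

dW_eq53 = sum(W_grad[j] * (lam[j] * p[i]) * dq[j][i]
              for j in range(2) for i in range(2))
dW_eq57 = sum(p[i] * dq[j][i] for j in range(2) for i in range(2))
print(dW_eq53, dW_eq57)  # the two evaluations coincide
```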

We therefore rewrite equation 5.6 as:

(5.7)    dW = Σ_{j=1}^{m} Σ_{i=1}^{n} p_i dq_{ji}

which amounts to saying that the social effects of the shock are measured as the sum over individuals of the variation of their consumption evaluated at market prices (i.e. the so-called consumer surplus). In this simple model, variations of social welfare are therefore conveniently measured in money terms using market prices. Equation 5.7 coincides with the computation of the NPSV when time is not an issue and the effects (costs or benefits) of a project can be expressed in terms of consumption of goods exchanged on markets. In spite of all its limitations, our model allows us to understand, through the simple derivation of equation 5.7, the rationale for trying to price out all effects of a project in order to assess its contribution to social welfare. The general formula for computing the NPSV may be seen as an extension of 5.7 without these restrictions.

Extensions and remarks

The limitations of the elementary model presented above are obvious. The most important ones seem to be the following:
• the model only deals with marginal changes in the economy,
• the model considers a single-period economy without production,
• the economy is closed (no imports or exports) and there is no government (and in particular no taxes),
• the distribution of income was assumed to be optimal.

A detailed treatment of the foundations of CBA without our simplifying hypotheses can be found in Drèze and Stern (1987). Although we shall not enter into details, it should be emphasised that the theoretical foundations of CBA are controversial on some important points.

Returning to CBA, the situations in which it is currently used as an evaluation tool are fairly different from the ones for which the appropriateness of equation 5.7 and of related formulas is particularly clear. These situations are often characterised by:
• non-marginal changes (think of the construction of a new underground line in a city),
• the presence of numerous externalities (think of the pollution generated by a new motorway),
• the presence of numerous public goods for which no market price is available (think of health services or education),

• markets in which competition is altered in many ways (monopolies, taxes, regulations),
• the overwhelming presence of uncertainty (technological changes, future prices, long term effects of air pollution on health),
• the difficulty of evaluating some effects in well-defined units (think of the aesthetic value of the countryside) and, thus, of pricing them out,
• effects that are very unevenly distributed among individuals and raise important equity concerns (think of your reaction if a new airport were to be built close to your second residence in the middle of the countryside),
• effects that are highly complex and may concern a very long period of time (think of a policy for storing used nuclear fuel).

In spite of these difficulties, CBA still mainly rests on the use of the NPSV (or some of its extensions) to evaluate projects. Economists have indeed developed an incredible variety of tools in order to use the NPSV even in situations in which it would a priori seem difficult to do so. This includes: the determination of prices for "goods" without markets, e.g. through contingent valuation techniques or hedonic prices (see Scotchmer 1985, Loomis, Peterson, Champ, Brown and Lucero 1998); the treatment of uncertainty and the consideration of irreversible effects (e.g. through the use of option values); the determination of an appropriate social discounting rate (useful references on this controversial topic include Harvey 1992, Harvey 1994, Harvey 1995, Keeler and Cretin 1983, Weitzman 1994); the inclusion of equity considerations in the calculation of the NPSV (Brent 1984). It is impossible to review here the immense literature that these efforts have generated. An overview of this literature may be found in Sugden and Williams (1983) and in Zerbe and Dively (1994). We will simply illustrate some of these points in section 5.3.

5.3 Some examples in transportation studies

Public investment in transportation facilities amounts to over 80 × 10⁹ FRF annually in France (around 14 × 10⁹ USD or 14 × 10⁹ €). CBA is presently the standard evaluation technique for such projects. It is impossible to give a detailed account of how CBA is currently applied in France for the evaluation of transportation investment projects: this would take an entire book even for a project of moderate importance. In order to illustrate the type of work involved in such studies, we shall only take a few examples (for more details, see Boiteux (1994) and Syndicat des Transports Parisiens (1998); a useful reference in English, based on a number of real-world applications, is Adler (1987)).

For concreteness, we shall envisage a project consisting in the extension of an underground line in the suburbs of Paris. The effects of such a project are clearly very diverse. We will concentrate on some of them here, leaving direct financial effects aside (construction costs, maintenance costs, exploitation costs), although their evaluation may raise problems.

5.3.1 Prevision of traffic

An inevitable step in all studies of this type is to forecast the modification of the volume and the structure of the traffic that would follow the implementation of the project. Implementing such forecasting models is obviously an enormous task: local modifications in the offer of public transportation may have consequences on the traffic in the whole region. Traffic forecast models usually involve highly complex modal choice modules coupled with forecasting and/or simulation techniques. Nearly all public transportation firms and governmental agencies in France have developed their own tools for generating traffic forecasts. These models differ on many points, e.g. the statistical tools used for modal choice or the segmentation of the population that is used (Boiteux 1994). Unsurprisingly, they lead to very different results.

As far as we know, all these models forecast the traffic for a period of time that is not too distant from the installation of the new infrastructure. These forecasts are then more or less mechanically updated (e.g. increased following the observed rate of growth of the traffic in the past few years) in order to obtain figures for all the periods of study. Furthermore, such forecasts are usually made at an early stage of development of the project, at which all details (concerning e.g. the tariffing of the new infrastructure or the frequency of the trains) may not be completely decided yet. None of these models seems to integrate the potential modifications of behaviour of a significant proportion of the population in reaction to the new infrastructure (e.g. moving away from the centre of the city), whereas such effects are well-known and have proved to be overwhelming in the past.

These models are not part of CBA, and indicating their limitations should not be seen as a criticism of CBA. Their outputs are nevertheless crucial for the rest of the study: their results form the basis of the evaluation model.

5.3.2 Time gains

Traffic forecasts are used to evaluate the time that the inhabitants of the Paris region would gain with the extension of the metro line. In most models, time gains are evaluated on the basis of what is called "generalised time", i.e. a measure of time that accounts for elements of (dis)comfort of the journey (e.g. stairs to be climbed, temperature, a more or less crowded environment). Although this seems reasonable, much less effort has been devoted to the study of the models allowing to convert time into generalised time than to the "price of time" that is used afterwards.
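The conversion into generalised time amounts to weighting each component of a journey by a (dis)comfort coefficient. In the sketch below, only the 74 FRF/hour price quoted in this section comes from the text; the comfort weights and the journey are invented for illustration.

```python
# Generalised time: weight each journey component by a discomfort
# coefficient, then price the total at the average hourly net salary.
COMFORT_WEIGHTS = {"in_vehicle": 1.0, "waiting": 2.0, "walking": 2.0}  # invented
PRICE_PER_HOUR = 74.0  # FRF/hour: Paris region practice, 1994 figure

def generalised_time_cost(minutes_by_component):
    """Value of a journey in FRF, computed from generalised minutes."""
    g_minutes = sum(COMFORT_WEIGHTS[c] * m
                    for c, m in minutes_by_component.items())
    return g_minutes / 60.0 * PRICE_PER_HOUR

# A journey of 20 minutes in the train, 5 minutes waiting, 5 minutes walking:
cost = generalised_time_cost({"in_vehicle": 20, "waiting": 5, "walking": 5})
print(round(cost, 2))  # 49.33 FRF: 40 generalised minutes at 74 FRF/hour
```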

Such evaluations, which are obviously directly related to traffic forecasts (time gains converted into m.u. frequently account for more than 50% of the benefits of these types of projects), on top of being technically rather involved, raise some basic difficulties:

• is one minute equal to one minute? Such a question may not be as silly as it seems.

• is one hour worth 60 times one minute? Most models evaluating and pricing out time gains are strictly linear. This is dubious, since some gains (e.g. 10 seconds per user-day) might well be considered insignificant. Furthermore, the loss of one hour daily for some users may have a much greater impact than 60 losses of 1 minute.

• what is the value of time and how should time gains be converted into monetary units? Should we take the fact that people have different salaries into account? Should we rather use prices based on "stated preferences"? Should we take into account the fact that most surveys using stated preferences have shown that the value of time highly depends on the motive of the journey (being much lower for journeys not connected to work)? The present practice in the Paris region is to linearly evaluate all (generalised) time gains using the average hourly net salary in the Region (74 FRF/hour in 1994, or approximately 13 USD/hour or 13 €/hour).

In view of the major uncertainties surrounding the traffic forecasts that are used to compute the time gains and the arbitrariness of the "price of time" that is used, it does not seem unfair to consider that such evaluations give, at best, interesting indications.

5.3.3 Security gains

Important benefits of projects in public transportation are "security gains" (hopefully, using the metro is far less risky than driving a car). A first step consists in evaluating, based on traffic forecasts, the gain of security in terms of the number of ("statistical") deaths and serious injuries that would be avoided annually by the project. The following one consists in converting these figures into monetary units through the use of a "price for human life". Although this might not appear as a very pleasant subject of study, economists have developed many different methods for evaluating the value of human life, including methods based on "human capital", the value of life insurance contracts, sums granted by courts following accidents, revealed preference approaches (including smoking and driving behaviour), stated preference approaches, and wages for activities involving risk (Viscusi 1992). The following figures are presently used in France (in 1993 FRF; they should be divided by a little less than 6 in order to obtain 1993 USD):

Death            3 600 000 FRF
Serious injury     370 000 FRF
Other injury        79 000 FRF

These figures are based on several stated preference studies (it is not without interest to note that they were quite different before 1993, human life being, at that time, valued at 1 866 000 FRF). Using these figures and combining them with statistical information concerning the occurrence of car accidents and their severity leads to benefits in terms of security which amount to 0.08 FRF per vehicle-km avoided in the Paris region.
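The conversion of statistical casualties into monetary units is again a simple weighted sum. A sketch using the 1993 FRF prices quoted above (the avoided-casualty counts are invented for illustration):

```python
# Monetising annual security gains with the 1993 French prices given in
# the text; the avoided-casualty figures below are invented.
PRICE_FRF = {
    "death": 3_600_000,
    "serious_injury": 370_000,
    "other_injury": 79_000,
}

def security_benefit(avoided_per_year):
    """Annual security benefit in FRF from avoided casualties."""
    return sum(PRICE_FRF[kind] * n for kind, n in avoided_per_year.items())

# Suppose the traffic forecasts imply 2 statistical deaths, 15 serious
# injuries and 120 other injuries avoided per year:
benefit = security_benefit({"death": 2, "serious_injury": 15, "other_injury": 120})
print(f"{benefit:,} FRF per year")  # 22,230,000 FRF per year
```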

Besides raising serious ethical difficulties (Broome 1985), these studies exhibit incredible variations across techniques and across seemingly similar countries. This explains why, in many medical studies in which "benefits" mainly include lives saved, "cost-effectiveness" analysis is often preferred to CBA, since it does not require to price out human life (see Johannesson 1995, Weinstein and Stason 1977). We reproduce below some significant figures for the value of life used in several European countries (this table is adapted from Syndicat des Transports Parisiens 1998; all figures are in 1993 European Currency Units (ECU), one 1993 ECU being approximately one 1993 USD):

Country     Price of human life
Denmark         628 147 ECU
Finland       1 414 200 ECU
France          600 000 ECU
Germany         406 672 ECU
Portugal         78 230 ECU
Spain           100 529 ECU
Sweden          984 940 ECU
UK              935 149 ECU

5.3.4 Other effects and remarks

The inclusion of other effects in the computation of the NPSV of a project in such studies raises difficulties similar to the ones mentioned for time gains and security gains. As is apparent in Syndicat des Transports Parisiens (1998), the prices used to "monetarise" effects like:
• noise,
• local air pollution,
• contribution to the greenhouse effect,
are mainly conventional. Their evaluation is subject to much uncertainty and inaccurate determination. Moreover, the "prices" that are used to convert them into monetary units can be obtained using many different methods leading to significantly different results.

The social discounting rate used for such projects is determined by the government (the "Commissariat Général du Plan"). Presently a rate of 8% is used (note that this rate is about twice as high as the rate commonly used in Germany). A period of evaluation of 30 years is recommended for this type of project.

The conclusions and recommendations of a recent official report (Boiteux 1994) on the evaluation of public transportation projects stated that:
• although CBA has limitations, it remains the best way to evaluate such projects,

• all effects that can reasonably be monetarised should be included in the computation of the NPSV,
• all other effects should be described verbally,
• monetarised effects and non-monetarised ones should not be included in a common table that would give the same status and, implicitly, the same importance to all; a multiple criteria presentation would furthermore attribute an unwarranted scientific value to such tables,
• extensive sensitivity analyses should be conducted,
• CBA studies should remain as transparent as possible,
• all public firms and administrations should use a similar methodology in order to allow meaningful comparisons,
• an independent group of CBA experts should evaluate all important projects.

In view of:
• the immense complexity of such evaluation studies,
• the unavoidable elements of uncertainty and inaccurate determination entering the evaluation model,
• the rather unconvincing foundations of CBA for this type of project,
the conclusion that CBA remains the "best" method seems unwarranted.

5.4 Conclusions

CBA is an important decision/evaluation method. It has often been criticised on purely ideological grounds, which seems ridiculous. However, the insistence on seeing CBA as a "scientific", "rational" and "objective" evaluation model, all words that are frequently spotted in texts on CBA (Boiteux 1994), seems no more convincing. Contrary to many other decision/evaluation methods that are more or less ad hoc, the users of CBA can rely on more than 50 years of theoretical and practical investigations. We would like to note in particular that:
• it has a sound theoretical basis, although limited and controversial on some points,
• CBA emphasises the fact that decision and/or evaluation methods are not context-free. Having emerged from economics, it is not surprising that markets and prices are viewed as essential parts of the environment in CBA. More generally, any decision/evaluation method that would claim to be context-free would seem of limited interest to us,

• CBA emphasises the need for consistency in decision-making. It aims at providing simple tools allowing, in a decentralised way, to ensure a minimal consistency between the decisions taken by various public bodies. In view of the popularity of purely financial analyses for public sector projects, this is worth recalling (Johannesson 1995),
• CBA explicitly acknowledges that the effects of a project may be diverse and that all effects should be taken into account in the model,
• CBA is a formal method of decision/evaluation. As we shall see in chapter 9, formal methods based on an explicit logic can provide invaluable contributions, allowing sensitivity analyses, promoting constructive dialogue and pointing out crucial issues. It is the belief and experience of the authors of this book that such methods may have a highly beneficial impact on the treatment of highly complex questions,
• although the implementation of CBA may involve highly complex models (e.g. traffic forecasts), the underlying logic of the method is simple and easily understandable.

We already mentioned that we disagree with the view held by some economists that CBA is the only "rational", "scientific" and "objective" method for helping decision-makers (such views are explicitly or implicitly present in Boiteux (1994) or Mishan (1982)). We strongly recommend Dorfman (1996) as an antidote to this radical position. We shall stress here why we think that decision/evaluation models should not be confused with CBA:

• supporting decision/evaluation processes involves many more activities than just "evaluation". The determination of the "frontiers" of the study and of the various stakeholders, the modelling of their objectives and the invention of alternatives form an important, we would tend to say a crucial, part of any decision/evaluation support study. "Formulation" is a basic activity of any analyst; CBA offers little help at this stage. Even worse, too radical an interpretation of CBA might lead (Dorfman 1996) to an excessive attention given to monetarisation, which may be detrimental to an adequate formulation. Creativity, flexibility and reactivity are essential ingredients of the process. A recurrent theme in OR is that a successful implementation of a model is contingent on many other factors than just the quality of the underlying method. Having sound theoretical foundations is probably a necessary but insufficient condition to build useful decision/evaluation tools (let alone the "best" ones). Although other means of evaluation and of social co-ordination (e.g. negotiation, elections, exercise of power) clearly exist, they do not seem always to be compatible with too rigid a view of what a "good decision/evaluation model" should be.

• the foundations of CBA are especially strong in situations that are at variance with the usual context of public sector projects:

non-marginal changes, public goods and externalities are indeed pervasive in such projects (see Brekke 1997, Holland 1995, Laslett 1995), which might be an incentive for some stakeholders to simply discard CBA.

• CBA is a mono-criterion approach: it tries to summarise the effects of complex projects into a single number. Although this allows to produce outputs in simple terms (the NPSV), it might be argued that the efforts that have to be made in order to monetarise all effects may not always be needed. On the basis of less ambitious methods, it is not unlikely that some projects may be easily discarded and/or that some clearly superior project will emerge. Even when monetarisation is reasonably possible, it may not always be necessary. Furthermore, merging rather uncontroversial information (e.g. the number of deaths per vehicle-km in a given area) with much more sensible and debatable information (e.g. the price of human life) from the start might not give many opportunities to stakeholders for reaching partial agreements and/or for starting negotiations. This leaves little room for political debate and might also result in a model that might not appear transparent enough to be really convincing (Nyborg 1998). Decision processes involving public sector projects are usually extremely complex: they last for years and involve many stakeholders generally having conflicting objectives. A decision/evaluation tool will be all the more useful that it lends itself easily to an insertion into such a decision process.

• in CBA the use of "prices" supposedly revealed by markets (most often in "market-like" mechanisms) tends to obscure the, implicit, weighting of the various effects of a project. The complex calculations leading to the NPSV use a huge amount of "data" with varying levels of credibility.

• the additive linear structure of the, implicit, aggregation rule used in CBA can be subjected to the familiar criticisms already mentioned in chapters 3 and 4. Probably all users of CBA would agree that an accident killing 10 000 people might result in a dramatic situation in which the "costs" incurred have little relation with the "costs" of 10 000 accidents each resulting in one loss of life (think of a serious nuclear accident compared to "ordinary" car accidents). Similarly, they might be prepared to accept that there may exist air pollution levels above which all mammal life on earth could be endangered and that, although these levels are multiples of those currently manipulated in the evaluation of transportation projects, they may have to be priced out quite differently. It would seem to be a heroic hypothesis to suppose that such limits are simply never reached in practice. If there are limits to linearity, CBA offers almost no clue as to where to place them.

• the implicit position of CBA vis-à-vis distributional considerations is puzzling. Although the possibility of including in the computation of the NPSV individual "weights" (capturing a different impact on social welfare of individual variations of income) exists (Brent 1984), it is hardly ever used in practice. Furthermore, this possibility is at much variance with more subtle views

on equity and distributional considerations (see Fishburn 1984, Fishburn and Straffin 1989, Fishburn and Sarin 1991, Fishburn and Sarin 1994, Gafni and Birch 1997, Weymark 1981). This is especially true in the context of the evaluation of public sector projects.

• the use of a simple "social discounting rate" as a surrogate for taking a clear position on inter-generational equity issues is open to discussion. Even accepting the rather optimistic view of a continuous increase of welfare and of technical innovation, taking decisions today that will have important consequences in 1000 years (think of the storage of used nuclear fuel) while using a method that gives almost no weight to what will happen 60 years from now (1/1.08⁶⁰ ≈ 1%) seems debatable (see Harvey 1992, Harvey 1994, Schneider, Schieber, Eeckoudt and Gollier 1997, Weitzman 1994).

• the very idea that "social preferences" exist is open to question. We showed in chapter 2 that "elections" were not likely to give rise to such a concept. We doubt that markets are such particular institutions that they always allow to solve or bypass the problem in an undebatable way, even though it seems hard to think of other forms of social co-ordination that could do much better. But if "social preferences" are ill-defined, the meaning of the NPSV of a project is far from being obvious. We would argue that it gives, at best, a partial and highly conventional view of the desirability of the project.

• decision/evaluation models can hardly lead to convincing conclusions if the elements of uncertainty and inaccurate determination entering the model are not explicitly dealt with. This is especially true here, due to the amount of data of varying quality included in the computation of the NPSV. Practical texts on CBA always insist on the need for sensitivity analysis before coming to conclusions and recommendations. On the other hand, sensitivity analysis is often restricted to studying the impact of the variation of a few parameters on the NPSV, one parameter varying at a time, whereas a true "robustness analysis" should combine simultaneous variations of all parameters in a given domain. This is rather far from what we could expect in such situations.

These limitations should not be interpreted as implying a condemnation of CBA. We consider them as arguments showing that, in spite of its many qualities, CBA is far from exhausting the activity of supporting decision/evaluation processes (Watson 1981). We are afraid to say that if you disagree on this point, you might find the rest of this book of extremely limited interest. Furthermore, if you expect to discover in the next chapters formal decision/evaluation tools and methodologies that would "solve all problems and avoid all difficulties", you should also realise that your chances of being disappointed are very high.

6 COMPARING ON THE BASIS OF SEVERAL ATTRIBUTES: THE EXAMPLE OF MULTIPLE CRITERIA DECISION ANALYSIS

6.1 Thierry's choice

How to choose a car is probably the multiple criteria problem example that has been most frequently used to illustrate the virtues and possible pitfalls of multiple criteria decision aiding methods. However, one can object that in many illustrations the problem is too roughly stated to be meaningful. One point should be made very clear: even in a properly defined context, and even if one restricts oneself to a segment of the market, it is unlikely that a car could be universally recognised as the best; this is a consequence of the existence of decision-makers with many different "value systems". The motivations, needs, desires and/or phantasms of the potential buyer of a new or second-hand car can be so diversified that it will be very difficult to establish a list of relevant points of view and build criteria on which everybody would agree. The price, for instance, is a very delicate criterion, since the amount of money the buyer is ready to spend clearly depends on his social condition. The relative importance of the criteria also very much depends on the personal characteristics of the buyer: there are various ideal types of car buyers, for instance people who like sportive car driving, or who want large comfortable cars, or reliable cars, or cars that are cheap to run.

Despite these facts, we have chosen to use the "Choosing a car" example for illustrating the hypotheses underlying various elementary methods for modelling and aggregating evaluations in a decision aiding process. The main advantage of this example is that the problem is familiar to most of us (except for one of the authors of this book who is definitely opposed to owning a car); it is especially appealing for male decision-makers and analysts for some psychological reason. The case is simple enough to allow for a short but complete description, and it also offers sufficient potential for reasoning on quite general problems raised by the treatment of multi-dimensional data in view of decision and evaluation. We describe the context of the case below and will invoke it throughout this chapter for illustrating a sample of decision aiding methods.

6.1.1 Description of the case

Our example is adapted from an unpublished report by a Belgian engineering student who describes how he decided which car he would buy. The story dates back to 1993. Our student, call him Thierry, aged 21, is passionate about sportive cars and driving (he has taken lessons in sports car driving and participates in car races). Thierry intends to use the car in everyday life and occasionally in competitions. Being a student, he cannot afford to buy either a new car or a luxury second-hand sports car, so he decides to explore the middle range segment: 4 year old cars with powerful engines. His strategy is first to select the make and type of the car on the basis of its characteristics, estimated costs and performances, then to look for such a car in second hand car sale advertisements. This is what he actually did, finding "the rare pearl" about twelve months after he made up his mind as to which car he wanted.

Selecting the alternatives

The initial list of alternatives was selected taking an additional feature into account. Thierry lives in town and does not have a garage to park the car in at night, so he does not want a car that would be too attractive to thieves; this explains why he discards cars like the VW Golf GTI or the Honda CRX. He thus limits his selection of alternatives to the 14 cars listed in Table 6.1.

    Trademark and type
 1  Fiat Tipo 20 ie 16V
 2  Alfa 33 17 16V
 3  Nissan Sunny 20 GTI 16
 4  Mazda 323 GRSI
 5  Mitsubishi Colt GTI
 6  Toyota Corolla GTI 16
 7  Honda Civic VTI 16
 8  Opel Astra GSI 16
 9  Ford Escort RS 2000
10  Renault 19 16S
11  Peugeot 309 GTI 16V
12  Peugeot 309 GTI
13  Mitsubishi Galant GTI 16
14  Renault 21 20 turbo

Table 6.1: List of the cars selected as alternatives

Selecting the relevant points of view and looking for or constructing indices that reflect the performances of the alternatives for each of the viewpoints often constitutes a long and delicate task; it is moreover a crucial one, since the quality of the modelling will determine the relevance of the model as a decision aiding tool. Many authors have advocated a hierarchical approach to criteria building, each viewpoint being decomposed into sub-points that can be further decomposed, and so on.

We shall not emphasise the process of selecting viewpoints in this chapter, although it is a matter of importance (see e.g. Keeney and Raiffa (1976), Saaty (1980) for a survey). A thorough analysis of the properties required of the family of criteria selected in any particular context (consistent family, i.e. exhaustive, non-redundant and monotonic) can be found in Roy and Bouyssou (1993) (see also Bouyssou (1990)). It is sufficient to say that Thierry's concerns are very particular and that he accordingly selected five viewpoints related to cost (criterion 1), performance of the engine (criteria 2 and 3) and safety (criteria 4 and 5). Evaluations of the cars on these viewpoints have been obtained from monthly journals specialised in the benchmarking of cars; the official quotation of second hand vehicles of various ages is also published in such journals.

Evaluating the alternatives

Evaluating the expenses incurred by buying and using a specific car is not as straightforward as it may seem. Thierry evaluates the expenses as the sum of an initial fixed cost and of the expenses resulting from using the car. The fixed costs are the amount paid for buying the car, i.e. the actual selling price (in contrast to the official quotation), which he estimates from the official quotation of the 4-year old vehicle, plus various taxes. The yearly costs involve another tax, insurance and petrol consumption. Maintenance costs are considered roughly independent of the car and hence neglected. The resale value of the car after 8 years is not taken into account, due to the high risk of accidents resulting from Thierry's offensive driving style.

Petrol consumption is estimated on the basis of three figures that are highly conventional: the number of litres of petrol burned in 100 km, which is taken from the magazine benchmarks; the mileage, which Thierry somehow estimates at 12 000 km per year; and the price of petrol, set to .9 € per litre (1 €, the European currency unit, is approximately equivalent to 1 USD). Finally, he expects (hopes) to use the car for 4 years. On the basis of these hypotheses, he gets the estimations of his expenses for using the car during 4 years that are reported in Table 6.2 (criterion 1, "Cost"). Note that the petrol consumption cost, which is estimated with a rather high degree of imprecision, counts for about one third of the total cost; the purchase cost is also highly uncertain. Large variations from the estimation may occur due to several uncertainty and risk factors, such as the actual life-length of the car, the actual mileage per year, etc.

Thierry's particular interest in sporty cars is reflected in his definition of the other criteria. Car performances are evaluated by their acceleration: criterion 2 ("Accel" in Table 6.2) encodes the time (in seconds) needed to cover a distance of one kilometre starting from rest. For building this and the other criteria, Thierry has at his disposal a large number of performance indices whose values are to be found in the magazine benchmarks. One could alternatively have taken other indicators, such as the power of the engine, or the time needed to reach a speed of 100 km/h or to cover 400 metres, that are also widely available. Some of these values may be imprecisely determined: they may be biased when provided by the car manufacturer (the procedures for evaluating petrol consumption are standardised but usually underestimate the actual consumption for everyday use); when provided by specialised journalists in magazines, the procedures for measuring are generally unspecified and might vary, since the cars are not all evaluated by the same person.

The third criterion that Thierry took into consideration is linked with the pick up or suppleness of the engine in urban traffic; this dimension is considered important since Thierry also intends to use his car in normal traffic, and cars that are specially prepared for competition may lack suppleness in low operation conditions, which is quite unpleasant in urban traffic. The indicator selected to measure this dimension ("Pick up" in Table 6.2) is the time (in seconds) needed for covering one kilometre when starting in fifth gear at 40 km/h. Again, other indicators could have been chosen (e.g. the torque). This dimension is not independent of the second criterion, since they are generally positively correlated (powerful engines generally lead to quick response times on both criteria); in terms of preferences, however, criteria 2 and 3 reflect different requirements and are thus both necessary. For a short discussion about the notions of independence and interaction, the reader is referred to Section 6.4.

In view of Thierry's particular motivations, only the qualities of braking and of road-holding are of concern to him from the safety viewpoint; they lead to the building of criteria 4 and 5 (resp. "Brakes" and "Road-h" in Table 6.2). In the magazine's evaluation report, several dimensions of interest from the point of view of the user are investigated, such as comfort, maintenance, equipment, finish, boot, body, braking and road-holding behaviour. For each of these, a number of aspects are considered: 10 for comfort, 3 for brakes, 4 for road-holding, and so on. The 3 or 4 partial aspects of each viewpoint are evaluated on an ordinal scale, the levels of which are labelled "serious deficiency", "below average", "average", "above average", "exceptional". To get an overall indicator of braking quality (and also for road-holding), Thierry re-codes the ordinal levels with integers from 0 to 4 and takes the arithmetic mean of the 3 or 4 numbers.

    Name of cars          Crit1   Crit2  Crit3    Crit4   Crit5
                          Cost    Accel  Pick up  Brakes  Road-h
 1  Fiat Tipo             18 342  30.7   37.2     2.33    3
 2  Alfa 33               15 335  30.2   41.6     2       2.5
 3  Nissan Sunny          16 973  29     34.9     2.66    2.5
 4  Mazda 323             15 460  30.4   35.8     1.66    1.5
 5  Mitsubishi Colt       15 131  29.7   35.6     1.66    1.75
 6  Toyota Corolla        13 841  30.8   36.5     1.33    2
 7  Honda Civic           18 971  28     35.6     2.33    2
 8  Opel Astra            18 319  28.9   35.3     1.66    2
 9  Ford Escort           19 800  29.4   34.7     2       1.75
10  Renault 19            16 966  30     37.7     2.33    3.25
11  Peugeot 309 16V       17 537  28.3   34.8     2.33    2.75
12  Peugeot 309           15 980  29.6   35.3     2.33    2.75
13  Mitsubishi Galant     17 219  30.2   36.9     1.66    2
14  Renault 21            21 334  28.9   36.7     2       2.25

Table 6.2: Data of the "choosing a car" problem
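Thierry's cost construction can be restated as a small computation. In the sketch below, only the petrol formula (litres per 100 km, 12 000 km per year, .9 € per litre, over 4 years) and the neglected items are taken from the text; the purchase price, tax and insurance figures passed in the example are invented placeholders.

```python
def four_year_cost(purchase_price, yearly_tax, yearly_insurance,
                   litres_per_100km, km_per_year=12_000,
                   price_per_litre=0.9, years=4):
    """Estimate total expenses: fixed cost plus 4 years of usage costs.

    Follows Thierry's scheme: maintenance is neglected and the resale
    value after 8 years is ignored. Only the petrol formula comes from
    the text; the monetary inputs are caller-supplied estimates.
    """
    petrol = litres_per_100km / 100 * km_per_year * price_per_litre * years
    usage = (yearly_tax + yearly_insurance) * years
    return purchase_price + usage + petrol

# Hypothetical illustration: a car burning 8 litres per 100 km.
# Petrol alone: 8/100 * 12000 * 0.9 * 4 = 3456 euros over 4 years.
cost = four_year_cost(10_000, 250, 600, 8.0)
print(cost)  # 10000 + 3400 + 3456 = 16856.0
```

This makes explicit why the petrol component, driven by three conventional figures, can easily amount to a third of the total.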

This results in the figures with 2 decimals provided in the last two columns of Table 6.2. Obviously, these numbers are also imprecise, not necessarily because of imprecision in the evaluations, but because of the arbitrary character of the cardinal re-coding of the ordinal information and of its aggregation via an arithmetic mean (postulating implicitly that the 3 components of each viewpoint are equally important and that the levels of each of the three scales are equally spaced). It is clear that not too much confidence should be awarded to the precision of these "evaluations". We shall however consider that these figures reflect, in some sense, the behaviour of each car from the corresponding viewpoint.

This completes the description of the "data" which, in some way, are not given but selected and elaborated on the basis of the available information. Intrinsically part of this data is an appreciation (more or less explicit) of their degree of precision and of their reliability.

6.1.2 Reasoning with preferences

In the second part of the presentation of this case, Thierry will provide information about his preferences. Let us follow his reasoning. In fact, in the relatively simple decision situation he was facing ("no wife, no boss": Thierry decides for himself and the consequences of his decision should not affect him crucially), he was able to make up his mind without using any formal aggregation method.

In view of reaching a decision, he first built a graphic representation of the data. Many types of representations can be thought of; popular spreadsheet software offer a large number of graphical options for representing multi-dimensional data. Figure 6.1 shows such a representation. Note that the evaluations for the various criteria have been re-scaled in view of a better readability of the figure: the values for all criteria have been mapped (linearly) onto intervals of length 2, the first criterion being represented in the [0, 2] interval, the second in the [2, 4] interval, and so on. For each criterion, the lowest evaluation observed for the sample of cars is mapped onto the lower bound of the interval, while the highest value is represented on the upper bound. Such a transformation of the data is not always innocent; we briefly discuss this point below. In interpreting the diagram, note that the first 3 criteria have to be minimised while the last 2 must be maximised.

Thierry first discards the cars whose braking efficiency and road-holding behaviour is definitely unsatisfactory, i.e. car numbers 4, 5, 6, 8, 9 and 13. The reason for such an elimination is that a powerful engine is needless in competition if the chassis is not good enough and does not guarantee good road-holding; efficient brakes are also needed to keep the risk inherent in competition at a reasonable level. The rules for discarding the above mentioned cars have not been made explicit by Thierry in terms of unattained levels on the corresponding scales. Rules that would restate the set of remaining cars are for instance: criterion 4 ≥ 2 and criterion 5 ≥ 2, with at least one strict inequality.
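The re-coding and averaging that produce the 2-decimal figures of criteria 4 and 5 can be sketched in a few lines; the three labels in the example are invented for illustration.

```python
LEVELS = {"serious deficiency": 0, "below average": 1, "average": 2,
          "above average": 3, "exceptional": 4}

def criterion_value(labels):
    """Equally spaced re-coding of ordinal labels, then arithmetic mean.

    As stressed in the text, this postulates that the partial aspects
    are equally important and that the scale levels are equally spaced.
    """
    codes = [LEVELS[label] for label in labels]
    return round(sum(codes) / len(codes), 2)

# e.g. three brake aspects rated by the magazine (invented ratings):
print(criterion_value(["above average", "average", "exceptional"]))  # 3.0
```

Replacing the codes 0, ..., 4 by any other increasing sequence (say, with unequal gaps) would yield different means, which is exactly the arbitrariness the text warns about.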

Figure 6.1: Performance diagram of all cars along the first three criteria (above, to be minimised) and the last two (below, to be maximised)

Looking at the performances of the remaining cars, Thierry further discards cars 1, 2 and 10. The set of remaining cars is restated for instance by the rule: criterion 2 < 30. Finally, the car labelled 14 is eliminated since it is dominated by car number 11: "dominated by car 11" means that car 11 is at least as good as car 14 on all criteria and better on at least one criterion (here, all of them!). Notice that car number 14 would not have been dominated if other criteria had been taken into consideration, such as comfort or size: this car is indeed bigger and more classy than the other cars in the sample.

The cars left after the above elimination process are those labelled 3, 7, 11 and 12; their performances are shown on Figure 6.2 and in Table 6.3. In these star-diagrams, each car is represented by a pentagon. On each axis, the value 1 corresponds to the lowest value observed for one of the cars in the initial set of 14 alternatives and the value 3 to the highest such value, the evaluations being linearly mapped onto the [1, 3] interval. The choice of the interval [1, 3] instead of the interval [0, 2] is dictated by the mode of representation: the value "0" plays a special role since it is common to all axes, and if an alternative were to receive a 0 value on several criteria, those evaluations would all be represented by the origin, which makes the graph less readable. Although the 10 eliminated cars are no longer displayed, they still constitute reference points in relation to which the selected cars are evaluated; this suggests that the evaluations of the selected cars should not be transformed independently of the values of the cars in the initial set. In this way, Thierry was still able to compare the difference in the performances of two candidate cars on a criterion to typical differences for that criterion in the initial sample. In interpreting the diagrams, remember that criteria 1, 2 and 3 are to be minimised while the others have to be maximised.

Thierry did not use the star-diagrams (Figure 6.2). He drew the same diagram as in Figure 6.1 instead, after reordering the cars: for the reader's convenience, the 4 candidate cars were all put on the right of the diagram, as shown in Figure 6.3. On Figure 6.4, we show a close-up of Figure 6.3 that is focused on the 4 selected cars only.

Among the 4 selected cars, the one Thierry finally chooses is number 11. Here are the reasons for this decision. He first eliminates car number 12 on the basis of its relative weakness on the second criterion (acceleration). Comparing cars 3 and 11 among the 3 remaining ones, he considers that the price difference (about 500 €) is worth the gain (.7 second) on the acceleration criterion. Comparing cars 7 and 11, he considers that the cost difference (car 7 is about 1 500 € more expensive) is not balanced by the small advantage on acceleration (.3 second), coupled with a definite disadvantage (.8 second) on suppleness.
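Thierry's whole elimination process can be replayed mechanically on the data of Table 6.2: the sketch below applies the two screening rules and then removes dominated alternatives.

```python
# (id, cost, accel, pickup, brakes, road) -- data of Table 6.2;
# criteria 1-3 are to be minimised, criteria 4-5 maximised.
CARS = [
    (1, 18342, 30.7, 37.2, 2.33, 3.00), (2, 15335, 30.2, 41.6, 2.00, 2.50),
    (3, 16973, 29.0, 34.9, 2.66, 2.50), (4, 15460, 30.4, 35.8, 1.66, 1.50),
    (5, 15131, 29.7, 35.6, 1.66, 1.75), (6, 13841, 30.8, 36.5, 1.33, 2.00),
    (7, 18971, 28.0, 35.6, 2.33, 2.00), (8, 18319, 28.9, 35.3, 1.66, 2.00),
    (9, 19800, 29.4, 34.7, 2.00, 1.75), (10, 16966, 30.0, 37.7, 2.33, 3.25),
    (11, 17537, 28.3, 34.8, 2.33, 2.75), (12, 15980, 29.6, 35.3, 2.33, 2.75),
    (13, 17219, 30.2, 36.9, 1.66, 2.00), (14, 21334, 28.9, 36.7, 2.00, 2.25),
]

def dominates(a, b):
    """a dominates b: at least as good on all criteria, better on one."""
    at_least = (a[1] <= b[1] and a[2] <= b[2] and a[3] <= b[3]
                and a[4] >= b[4] and a[5] >= b[5])
    return at_least and a[1:] != b[1:]

# Conjunctive screening: brakes >= 2 and road-holding >= 2, one strict.
kept = [c for c in CARS
        if c[4] >= 2 and c[5] >= 2 and (c[4] > 2 or c[5] > 2)]
# Second rule: acceleration under 30 seconds.
kept = [c for c in kept if c[2] < 30]
# Elimination of dominated alternatives (car 11 eliminates car 14).
kept = [c for c in kept if not any(dominates(d, c) for d in kept)]
print(sorted(c[0] for c in kept))  # [3, 7, 11, 12]
```

The conjunctive character of the screening is visible in the comprehensions: an alternative survives only if it satisfies every rule.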

Figure 6.2: Star graph of the performances of the 4 cars left after the elimination process

    Name of car     Crit1   Crit2  Crit3  Crit4   Crit5
                    Cost    Acc    Pick   Brakes  Road
 3  Nissan Sunny    16 973  29     34.9   2.66    2.5
 7  Honda Civic     18 971  28     35.6   2.33    2
11  Peugeot 16V     17 537  28.3   34.8   2.33    2.75
12  Peugeot         15 980  29.6   35.3   2.33    2.75

Table 6.3: Performances of the 4 candidate cars

6.1. the 4 candidate cars stand on the right . THIERRY’S CHOICE 95 10 9 8 7 6 5 4 3 2 1 0 Fiat (1) Alfa (2) Mazda (4) Mitsu Colt (5) Toyota (6) Opel (8) Ford (9) R19 (10) Mitsu Gal (13) R21 (14) Nissan (3) Honda (7) Peu16 (11) Peu (12) Road−h (Max) Brakes (Max) Pick up (min) Accel (min) Cost (min) Figure 6.3: Performance diagram of all cars.

Figure 6.4: Detail of Figure 6.3: the 4 cars remaining after initial screening

Comments

Thierry's reasoning process can be analysed as being composed of two steps.

1. The first one is a screening process in which a number of alternatives are discarded on the basis of the fact that they do not reach aspiration levels on some criteria. Notice that these levels have not been set a priori as minimal levels of satisfaction: they have been set after having examined the whole set of alternatives, to a value that could be described as both desirable and accessible. The rules that have been used for eliminating certain alternatives have exclusively been combined in conjunctive mode, since an alternative is discarded as soon as it fails to fulfil one of the rules. More sophisticated modes of combination may be envisaged, for instance mixing up conjunctive and disjunctive modes with aspiration levels defined for subsets of criteria (see Fishburn (1978) and Roy and Bouyssou (1993), pp. 264-266). Another elementary method that has been used is the elimination of dominated alternatives (car 11 dominates car 14).

2. In the second step of Thierry's reasoning, more intuition is needed: it rests on subtle considerations on whether the balance of differences in performance between pairs of cars on 2 or 3 criteria results in an advantage to one of the cars in the pair. Criteria 4 and 5 were not invoked, and there are several possible reasons for this. Criteria 4 and 5 might be of minor importance, or considered satisfactory once a certain level is reached; they could also be insufficiently discriminating for the considered subset of cars (this is certainly the case for criterion 4): the values of the differences for the set of candidate cars could be such that they are not large enough to balance the differences on other criteria.

3. Since criteria 4 and 5 are aggregates and are thus not expressed in directly interpretable units, this might also have been a reason for not exploiting them in the final selection. Note that the reasoning is not made on the basis of re-coded values like those used in the graphics: it involves comparisons of differences in evaluations, which is better supported by the original scales. This kind of reasoning is at the heart of the activity of modelling preferences and aggregating them in order to have an informed decision process.

Note also that if Thierry's goal had been to rank order the cars in decreasing order of preference, it is not sure that the kind of reasoning he used for just choosing the best alternative would have fit the bill. In his ex post justification study, Thierry has in addition tried to derive a ranking of the alternatives that would reflect his preferences. This can be viewed as an ex post analysis of the problem, since the decision was actually made well before Thierry became aware of multiple criteria methods. In the rest of this chapter, we discuss a few formal methods commonly used for aggregating preferences; we report on how Thierry applied some of them to his case and extrapolate on how he could have used the others.

6.2 The weighted sum

When dealing with multi-dimensional evaluations of alternatives, the basic and almost natural (or perhaps cultural?) attitude consists in trying to build a one-dimensional synthesis, which would reflect the value of the alternatives on a synthetic "super scale of evaluation". This attitude is perhaps inherited from school practice, where all the performance evaluations of a pupil have long been (and often still are) summarised in a single figure, a weighted average of their grades in the various subjects. The problems raised by such a practice have been discussed in depth in Chapter 3.

We have seen that, after the first step consisting in the elimination of unsatisfactory alternatives, the small number of alternatives and criteria allowed Thierry to make up his mind without having to build a formal model of his preferences, even though the analysis of the remaining four cars has been much more delicate. In more complex situations (when more alternatives remain after an initial elimination, when more criteria have to be considered, or when a ranking of the alternatives is wanted), it may appear necessary to use tools for modelling preferences. There is another rather frequent circumstance in which more formal methods are mandatory: if the decision-maker is bound to justify his decision to other persons (shareholders, colleagues, ...), the evaluation system should be more systematic, for instance being able to cope with new alternatives that could be suggested by the other people.

We discuss the application of the weighted sum to the car example below, emphasising the very strong hypotheses underlying the use of this type of approach. Starting from the standard situation of a set of alternatives a ∈ A evaluated on n points of view by a vector g(a) = (g1(a), g2(a), ..., gn(a)), we consider the

value f(a) obtained by linearly combining the components of g:

f(a) = k1 g1(a) + k2 g2(a) + ... + kn gn(a)    (6.1)

Suppose, without loss of generality, that all criteria are to be maximised, i.e. that the larger the value gi(a), the better the alternative a on criterion i (if, on the contrary, gi were to be minimised, substitute gi by −gi or use a negative weight ki). Once the weights ki have been determined, choosing an alternative becomes straightforward: the best alternative is the one associated with the largest value of f. Similarly, a ranking of the alternatives is obtained by ordering them in decreasing order of the values of f. This simple and most commonly used procedure relies, however, on very strong hypotheses that can seldom be considered plausibly satisfied.

6.2.1 Transforming the evaluations

A look at the evaluations of the cars (see Table 6.2) prompts a remark that was already made when we considered representing the "data" graphically. The ranges of variation on the scales are very heterogeneous: from 13 841 to 21 334 on the cost criterion, but from 1.33 to 2.66 on criterion 4. Clearly, asking for values of the weights ki in terms of the relative importance of the criteria, without referring to the scales, would yield absurd results. These problems appear very clearly when trying to use the weighted sum approach on the car example. The usual way out consists in normalising the values on the scales, but there are several manners of doing this. One consists in dividing gi by the largest value on the ith scale, gi,max; alternatively, one might subtract the minimal value gi,min and divide by the range gi,max − gi,min. These normalisations of the original gi functions are respectively denoted g′i and g″i in the following formulae (for simplicity, we suppose here that the gi are positive):

g′i(a) = gi(a) / gi,max    (6.2)

g″i(a) = (gi(a) − gi,min) / (gi,max − gi,min)    (6.3)

In the former case, the maximal value of g′i will be 1 while the value 0 is kept fixed, which means that the ratio of the evaluations of any pair a, b of alternatives remains unaltered:

g′i(a) / g′i(b) = gi(a) / gi(b)    (6.4)

Statements such as "alternative a is twice as good as b on criterion i" remain valid after transformation. This transformation can be advocated when using ratio scales, in which the value 0 plays a special role. In the case of g″i, the top evaluation will be mapped onto 1 while the bottom one goes onto 0; ratios are not preserved, but ratios of differences of evaluations are.
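The two normalisations and the properties they preserve can be checked numerically; a minimal sketch, using three of the cost evaluations of Table 6.2:

```python
def norm_max(values):
    """g'_i of formula (6.2): divide by the largest value on the scale."""
    top = max(values)
    return [v / top for v in values]

def norm_range(values):
    """g''_i of formula (6.3): map [min, max] linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

g = [13841.0, 16973.0, 21334.0]          # three cost evaluations
gp, gpp = norm_max(g), norm_range(g)

# (6.4): ratios of evaluations are preserved by g' ...
assert abs(gp[1] / gp[0] - g[1] / g[0]) < 1e-12
# ... while g'' preserves ratios of differences of evaluations instead:
assert abs((gpp[1] - gpp[0]) / (gpp[2] - gpp[0])
           - (g[1] - g[0]) / (g[2] - g[0])) < 1e-12
```

The assertions make the contrast concrete: dividing by the maximum fixes the zero of the scale, while the range normalisation fixes the two endpoints and therefore only respects differences.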

For all alternatives a, b, c and d:

(g″i(a) − g″i(b)) / (g″i(c) − g″i(d)) = (gi(a) − gi(b)) / (gi(c) − gi(d))    (6.5)

Such a transformation is appropriate for interval scales: it does not alter the validity of statements like "the difference between a and b on criterion i is twice the difference between c and d". Note that the above are not the only possible options for transforming the data. Note also that these transformations depend on the set of alternatives: considering the 14 cars of the initial sample or the 4 cars retained after the first elimination yields substantially different results, since the values gi,min and gi,max depend on the set of alternatives.

6.2.2 Using the weighted sum on the case

Suppose we consider that 0 plays a special role in all scales and we choose the first transformation option. The values of the g′i that are obtained are shown in Table 6.4. A set of weights has been chosen which is, to some extent, arbitrary, but seems compatible with what is known about Thierry's preferences and priorities: the first three criteria receive negative weights, namely and respectively −1, −2 and −1 (since they have to be minimised), while the last two are given the weight .5. The alternatives are listed in Table 6.4 in decreasing order of the values of f. As can be seen in the last column of Table 6.4, this rough assignment of weights yields car number 3 as first choice, followed immediately by car number 11, which was actually Thierry's choice. Note, however, that the difference in the values of f for those two cars is tiny (less than .01) and that we have no idea as to whether such a difference is meaningful.

6.2.3 Is the resulting ranking reliable?

Weights depend on scaling

To illustrate the lack of stability of the ranking obtained, let us consider Table 6.5, where the set of alternatives is reduced to the 4 cars remaining after the elimination procedure. The re-scaling of the criteria yields values of the g′i that are not the same as in Table 6.4, since gi,max depends on the set of alternatives. This perturbation, without any change in the values of the weights, is sufficient to cause a rank reversal between the leading two alternatives. Moreover, the ranking is not very stable: it is likely that by varying the weights slightly from their present values, one would readily get further rank reversals, i.e. permutations of alternatives in the order of preference. Varying the values that are considered imprecisely determined is what is called sensitivity analysis; this is certainly a crucial activity in a decision aiding process, as it helps to detect which conclusions in the output of a model are stable. All we can do here is be very prudent in using such a ranking, since the weights were chosen in a rather arbitrary manner. Of course, one could prevent such a drawback, to some extent, by using a normalising constant that would not depend on the set of alternatives.
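The rank reversal discussed above can be reproduced directly from the data of Table 6.2: normalising by the maximum (formula 6.2) over the 14 cars and then over the 4 remaining cars, with the same weights, changes the winner.

```python
CARS = {  # id: (cost, accel, pickup, brakes, road) -- Table 6.2
    1: (18342, 30.7, 37.2, 2.33, 3.00), 2: (15335, 30.2, 41.6, 2.00, 2.50),
    3: (16973, 29.0, 34.9, 2.66, 2.50), 4: (15460, 30.4, 35.8, 1.66, 1.50),
    5: (15131, 29.7, 35.6, 1.66, 1.75), 6: (13841, 30.8, 36.5, 1.33, 2.00),
    7: (18971, 28.0, 35.6, 2.33, 2.00), 8: (18319, 28.9, 35.3, 1.66, 2.00),
    9: (19800, 29.4, 34.7, 2.00, 1.75), 10: (16966, 30.0, 37.7, 2.33, 3.25),
    11: (17537, 28.3, 34.8, 2.33, 2.75), 12: (15980, 29.6, 35.3, 2.33, 2.75),
    13: (17219, 30.2, 36.9, 1.66, 2.00), 14: (21334, 28.9, 36.7, 2.00, 2.25),
}
WEIGHTS = (-1, -2, -1, 0.5, 0.5)

def best(ids):
    """Weighted-sum winner after dividing each criterion by its maximum
    over the considered subset of alternatives (first normalisation)."""
    maxima = [max(CARS[i][j] for i in ids) for j in range(5)]
    def f(i):
        return sum(w * v / m for w, v, m in zip(WEIGHTS, CARS[i], maxima))
    return max(ids, key=f)

print(best(CARS))            # 3  (over the full set of 14 cars)
print(best([3, 7, 11, 12]))  # 11 (over the reduced set of 4 cars)
```

Nothing but the normalising constants changes between the two calls, which is precisely the instability the text points out.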

Table 6.4: Normalising then ranking through a weighted sum. With the weights ki = (−1, −2, −1, .5, .5), the 14 cars are ranked in decreasing order of f as: Nissan Sunny (3), Peugeot 16V (11), Peugeot (12), Renault 19 (10), Honda Civic (7), Fiat Tipo (1), Mitsubishi Colt (5), Alfa 33 (2), Opel Astra (8), Toyota (6), Mazda 323 (4), Ford Escort (9), Renault 21 (14), Mitsubishi Galant (13).

Table 6.5: Normalising then ranking a reduced set of alternatives. With the same weights, but the normalisation computed on the 4 remaining cars only, the ranking becomes: Peugeot 16V (11), Nissan Sunny (3), Peugeot (12), Honda Civic (7).

Notice that the above problem has already been discussed in Chapter 4. It was implicitly a reason for normalising the evaluations as was done through formulae 6.2 and 6.3. After a transformation of the type

(6.6)  g′i(a) = (gi(a) − gi,min) / (gi,max − gi,min) = κi × gi(a) + λi,

g′i is essentially related to gi by the multiplicative factor κi = 1/(gi,max − gi,min) and, unless gi,min = 0, by an additive constant λi. Additive constants do not matter, since they do not alter the ranking; moreover, since g′i,min = 0 and g′i,max = 1, the normalised evaluations g′i are independent of the choice of a unit. In order to model the same preferences through a weighted sum of the g′i instead of a weighted sum of the gi, the weight k′i of g′i should be obtained by dividing the weight ki by κi. With such an option, the source of the lack of stability would rather be the imprecision in the determination of the reference levels used in the normalisation, for instance the worst acceptable value (minimal requirement for a performance to be maximised, maximal level of a variable to be minimised, a cost for instance) on each criterion. Note also that formulae 6.2 and 6.3 look similar, yet they are not identical and, in a consistent model, their weights should be different.

2. Conventional codings. Another comment concerns the figures used for evaluating the performances of the cars on criteria 4 and 5. Recall that those were obtained by averaging equally spaced numerical codings of an ordinal scale of evaluation. The obtained figures presumably convey a less quantitative and more conventional meaning than, for instance, acceleration performances measured in seconds in standardisable (if not standardised) trials. These figures are however treated in the weighted sum just like the "more quantitative" ones associated with the first three criteria. Obviously, other codings of the ordinal scale might have been envisaged, for instance codings with unequal intervals separating the levels on the ordinal scale, and some of these codings could obviously have changed the ranking.

6.2.4 The difficulties of a proper usage of the weighted sum

The meaning of the weights

What is the exact significance of the weights in the weighted sum model? The weights have a very precise and quantitative meaning: they are trade-offs. To compensate for a disadvantage of ki units for criterion j, you need an advantage of kj units for criterion i. An important consequence is that the weights depend on the determination of the unit on each scale. For instance, in a weighted sum model that would directly use the evaluations of the alternatives given in Table 6.2, it is clear that the weight of criterion 2 (acceleration time) has to be multiplied by 60 if times are expressed in minutes instead of seconds. In any case, the weights have to be assessed in relation to a particular determination of the evaluations on each scale, and eliciting them in practice is a complex task; they certainly cannot be evaluated in a meaningful manner through naive questions about the relative importance of the criteria.
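The normalisation remark above can be sketched in a few lines: rescaling an evaluation to g′i = (gi − gi,min)/(gi,max − gi,min) multiplies it by κi = 1/(gi,max − gi,min), so dividing the weight by κi (i.e. multiplying it by the range) leaves the ranking unchanged. The data below are invented toy numbers, not Thierry's table.

```python
# Hypothetical illustration (toy data, not the book's Table 6.4): normalising
# the evaluations and dividing each weight by kappa_i = 1/(max - min)
# preserves the ranking produced by the weighted sum.

def normalise(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def weighted_sum(rows, weights):
    return [sum(w * v for w, v in zip(weights, row)) for row in rows]

def ranking(scores):
    # indices of the alternatives sorted by decreasing overall value
    return sorted(range(len(scores)), key=lambda i: -scores[i])

# 4 alternatives on 2 criteria to be minimised (hence negative weights).
raw = [[12.0, 29.5], [15.0, 28.3], [13.5, 29.0], [14.0, 28.7]]
weights = [-2.0, -1.0]

cols = list(zip(*raw))
norm_rows = [list(r) for r in zip(*[normalise(c) for c in cols])]
# dividing by kappa_i is the same as multiplying by the range of criterion i
adj_weights = [w * (max(c) - min(c)) for w, c in zip(weights, cols)]

assert ranking(weighted_sum(raw, weights)) == ranking(weighted_sum(norm_rows, adj_weights))
```

The adjusted weighted sum differs from the original one only by an additive constant, which is why the ranking is unaffected.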

In order to use a weighted sum, reference to the underlying scale is essential. Up to this point we have considered the influence on the weights of multiplying the evaluations by a positive constant. Note that translating the origin of a scale has no influence on the ranking of the alternatives provided by the weighted sum, since it results in adding a (positive or negative) constant to f.

There is still a very important observation that has to be made: all scales used in the model are implicitly considered linear, in the sense that equal differences in values on a criterion result in equal differences in the overall evaluation function f, and this does not depend on the position, on the scale, of the interval of values corresponding to that difference. For instance, the difference between car 12 and car 3 with respect to acceleration is 0.6, between 29 seconds and 29.6 seconds. Does Thierry perceive this difference as almost equally important as a difference of 0.7 between cars 11 and 3, the latter difference being positioned between 28.3 seconds and 29 seconds on the acceleration scale? It seems rather clear from Thierry's motivations that coming close to a performance of 28 seconds is what matters to him, while cars above 29 seconds are unworthy. This means that the gain for passing from 29.6 seconds to 29 seconds has definitely less value than a gain of similar amplitude, say from 29 to 28.3 seconds. As will be confirmed in the sequel (see Section 6.3 below), car number 12 is finally eliminated because it accelerates too slowly. In other words, it is very unlikely that Thierry's preferences are correctly modelled by a linear function of the current scales of performance.

A second point is about independence. In order to use a weighted sum, the viewpoints should be independent, but not in the statistical sense implying that the evaluations of the alternatives should be uncorrelated! They should be independent with respect to preferences. Evaluations of the alternatives for the various points of view taken into consideration by the decision-maker often show correlations. For instance, in the car example, indicators of cost, comfort and equipment, which may be used as attributes for assessing the alternatives for those viewpoints, are likely to be positively correlated; this is because the attributes that are used to reflect these viewpoints are often linked by logical or factual interdependencies. This does not mean that the corresponding points of view are redundant and that one should eliminate some of them. On the contrary, one is perfectly entitled to work with attributes that are (even strongly) correlated, provided they satisfy the independence property: if two alternatives that share the same profile on a subset of criteria compare in a certain way in terms of overall preferences, their relative position should not be altered when the profile they share on that subset of criteria is substituted by any other common profile. A famous example of dependence in the sense of preferences, in a gastronomic context, is the following: the preference for white wine or red wine usually depends on whether you are eating fish or meat. There are relatively simple tests for independence in the sense of preferences, which consist in asking the decision-maker about his preferences on pairs of alternatives that share the same profile for a subset of attributes.
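The independence test just described can be mimicked on a small grid: for two alternatives sharing a common level on criterion 1, vary that common level and check that the comparison never reverses. The sketch below is a hypothetical illustration (the value functions and levels are invented); a weighted sum passes the test, while a "worst criterion" rule does not.

```python
# Hedged sketch of a preference-independence check on criterion 1: the two
# alternatives share a common level c on criterion 1 and differ on criterion 2;
# substituting any other common level c must not alter how they compare.
import itertools

def independent_on_first(f, levels1, levels2):
    for x2, y2 in itertools.combinations(levels2, 2):
        signs = set()
        for c in levels1:                       # common profile on criterion 1
            diff = f((c, x2)) - f((c, y2))
            signs.add(0 if diff == 0 else (1 if diff > 0 else -1))
        if len(signs) > 1:                      # comparison reversed or broken
            return False
    return True

additive = lambda g: 2 * g[0] + 3 * g[1]   # weighted sum: independent
minimum  = lambda g: min(g)                # "worst criterion" rule: not independent

assert independent_on_first(additive, [0, 5, 10], [1, 2, 3])
assert not independent_on_first(minimum, [0, 5, 10], [1, 2, 3])
```

With the `minimum` rule, two alternatives tie when the common level is low (the shared criterion is the worst one) but separate when it is high, which is exactly the kind of reversal the test detects.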

Varying the common profile should not reverse the preferences when the points of view are independent. Independence is a necessary condition for the representation of preferences by a weighted sum; it is not a sufficient one, of course.

There is a different concept that has been recently implemented for modelling preferences: the concept of interacting criteria, which was already discussed in example 2 of Chapter 3. Suppose that, in the process of modelling the preferences of the decision-maker, he declares that the influence of positively correlated aspects should be dimmed and that conjoint good performances for negatively correlated aspects should be emphasised. In our case, for instance, criteria 2 and 3, respectively acceleration and suppleness, may be thought of as being positively correlated. It may then prove impossible to model some preferences by means of a weighted sum of the evaluations such as those in Table 6.2 (and even of transformations thereof, such as obtained through formulae like 6.3). This does not mean that no additive model would be suitable, and it does not imply that the preferences are not independent (in the above-defined sense). In the next section we shall study an additive model, more general than the weighted average, in which the evaluations gi may be "re-coded" through "value functions" ui. With appropriate choices of u2 and u3, it may be possible to take the decision-maker's preferences about positively and negatively correlated aspects into account. If no re-coding is allowed (like in the assessment of students, see Chapter 3), there is a non-additive variant of the weighted average that could help modelling interactions among the criteria: in such a model, the weight of a coalition of criteria may be larger or smaller than the sum of the weights of its components (see Grabisch (1996) for more detail on non-additive averages).

Arbitrariness, imprecision and uncertainty

In the above discussion, as well as in the presentation of our example, we have emphasised the many sources of uncertainty (lack of knowledge) and of imprecision that bear on the figures used as input in the weighted sum. Let us summarise some of them:

1. Imprecision in the measurement of some quantities: for instance, how precise is the measurement of the acceleration? Such an imprecision can be reduced by making the conditions of the measurement as standard as possible and can then be estimated on the basis of the precision of the measurement apparatus.

2. Uncertainty in the evaluation of the cost: the buying price as well as the life-length of a second-hand car are not known. This uncertainty can be considered of stochastic nature and, in principle, statistical data could help to master, to some extent, such a source of uncertainty; in practice, however, it will generally be very difficult to get sufficient relevant and reliable statistical information for this kind of problems.
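The non-additive variant mentioned above can be sketched for two criteria with a Choquet-style average, where the capacity of the coalition {1, 2} may be smaller than the sum of the individual weights, thereby dimming the effect of two conjointly good, positively correlated evaluations. The capacity values below are invented for illustration only.

```python
# Hedged sketch of a two-criterion non-additive (Choquet-like) average.
# mu1, mu2 are the capacities of the single criteria; mu12 is the capacity of
# the coalition {1, 2} (here normalised to 1). With mu1 + mu2 > mu12, two good
# correlated performances count less than an additive model would suggest.

def choquet_2(g1, g2, mu1, mu2, mu12):
    lo, hi = sorted((g1, g2))
    mu_top = mu1 if g1 >= g2 else mu2   # capacity of the best criterion alone
    return lo * mu12 + (hi - lo) * mu_top

# Sub-additive case: mu({1}) + mu({2}) = 1.4 > mu({1,2}) = 1.0
assert choquet_2(1.0, 1.0, 0.7, 0.7, 1.0) == 1.0   # two good scores: capped
assert abs(choquet_2(1.0, 0.0, 0.7, 0.7, 1.0) - 0.7) < 1e-9
```

A single good score still earns 0.7, but two good scores earn only 1.0 rather than 1.4: the coalition's weight differs from the sum of its components' weights, which is precisely what a weighted sum cannot express.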

3. Arbitrary coding of non-quantitative data: the re-coding of the ordinal scales of appreciation of braking and road-holding behaviour. Any re-coding that respects the order of the categories would in principle be acceptable. To master such an imprecision, one could try to build quantitative indicators for the criteria, or try to get additional information on the comparison between differences of levels on the ordinal scale: for instance, is the difference between "below average" and "average" larger than the difference between "above average" and "exceptional"?

4. Imprecision in the determination of the trade-offs (weights ki): the ratios of weights kj/ki must be elicited as conversion rates: a unit for criterion j is worth kj/ki units for criterion i. In view of the previous discussion, the scales must first be re-coded in order that one unit difference on a criterion has the same "value" everywhere on the scale (linearisation). These operations are far from obvious and, as a consequence, the imprecision of the linearisation process combines with the inaccuracy in the determination of the weights.

Making a decision

All these sources of imprecision have an effect on the precision of the determination of the value of f that is almost impossible to quantify. Contrary to what can (often) be done in physics, there is generally little information on the size of the imprecisions; quite often, there is not even probabilistic information on the accuracy of the evaluations. As a consequence, the apparently straightforward decision (choosing the alternative with the highest value of f, or ranking the alternatives in decreasing order of the values of f) might be unconsidered: the simple remarks made above strongly suggest that it will be very difficult to discriminate between cars 3 and 11.

The usual way out is extensive sensitivity analysis, which could be described as part of the validation of the model. This part of the job is seldom carried out with the required exhaustivity because it is a delicate task in at least two respects. On the one hand, there are many possible strategies for varying the values of the imprecisely determined parameters; usually, parameters are varied one at a time, which is not sufficient but is possibly tractable. On the other hand, the range in which the parameters must be varied is not even clear, as suggested above. Moreover, once the sensitivity analysis has been performed, one is likely to be faced with several almost equally valuable alternatives.

In view of the previous discussion, there are two main approaches to solve the difficulties raised by the weighted sum:

1. Either one tries to prepare the inputs of the model (linearised evaluations and trade-offs) as carefully as possible, paying permanent attention to reducing imprecision and finishing with extensive sensitivity analysis;

2. Or one takes imprecision into account from the start, by avoiding to exploit precise values when knowing that they are not reliable, but rather working with classes of values and ordered categories.

Note that imprecision may well lie in the link between evaluations and preferences rather than in the evaluations themselves.
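The one-at-a-time sensitivity analysis mentioned above can be sketched as follows: each weight is varied in turn over a (here arbitrary) relative range, and the set of winning alternatives is recorded. All numbers are invented; as the text stresses, choosing realistic variation ranges is the hard part in practice.

```python
# One-at-a-time sensitivity analysis on the weights of a weighted sum
# (toy data). More than one recorded winner means the decision is unstable
# with respect to the imprecision in the weights.

def top_alternative(rows, weights):
    scores = [sum(w * v for w, v in zip(weights, row)) for row in rows]
    return max(range(len(rows)), key=lambda i: scores[i])

def one_at_a_time(rows, weights, rel_step=0.2):
    winners = set()
    for i, w in enumerate(weights):
        for factor in (1 - rel_step, 1, 1 + rel_step):
            perturbed = list(weights)
            perturbed[i] = w * factor
            winners.add(top_alternative(rows, perturbed))
    return winners

# Two closely matched alternatives: the winner flips under small perturbations.
rows = [[0.92, 0.60], [0.70, 0.85]]
assert one_at_a_time(rows, [1.0, 1.0]) == {0, 1}
```

Even this crude procedure already shows that two almost equally valuable alternatives cannot be reliably discriminated, which echoes the remark about cars 3 and 11.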

The former option will lead us to the construction of multi-attribute value or utility functions, while the latter leads to the outranking approach; these two approaches will be developed in the sequel. We have settled on problems with a (small) finite number of alternatives, and we concentrate on obtaining explicit representations of the decision-maker's preferences, although such detailed preferential information, even extracted from perfectly precise evaluations, may prove rather difficult to elicit. There is however a whole family of methods that we shall not consider here: the so-called interactive methods (Steuer (1986), Vincke (1992), Teghem (1996)). Such methods are mainly designed for dealing with infinite and even continuous sets of alternatives. They implement various strategies for exploring the efficient boundary, i.e. the set of non-dominated solutions: the exploration jumps from one solution to another, guided by the decision-maker, who is asked to tell, for instance, which characteristics of the current solution he would like to see improved. Moreover, they do not lead to an explicit model of the decision-maker's preferences.

6.2.5 Conclusion

The weighted sum is useful for obtaining a quick and rough draft of an overall evaluation of the alternatives. One should however keep in mind that there are rather restrictive assumptions underlying a proper use of the weighted sum. As a conclusion to this section, we summarise these conditions.

1. Cardinal character of the evaluations on all scales. The evaluations of the alternatives for all criteria are numbers, and these values are used as such even if they result from the re-coding of ordinal data.

2. Linearity of each scale. Equal differences between values on scale i, whatever the location of the corresponding intervals on the scale (at the bottom, in the middle or at the top of the scale), produce the same effect on the overall evaluation f: if alternatives a, b, c, d are such that gi(a) − gi(b) = gi(c) − gi(d) for all i, then f(a) − f(b) = f(c) − f(d).

3. The weights are trade-offs. Weights tell how many units on the scale of criterion i are needed to compensate one unit of criterion j. Weights depend on the scaling of the criteria; transforming the (linearised) scales results in a related transformation of the weights.

4. Criteria do not interact. This property, called preference independence, can be formulated as follows: consider two alternatives that share the same evaluation on at least one criterion, say criterion i; varying the level of that common value on criterion i does not alter the way the two alternatives compare in the overall ranking.
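The linearity condition above can be checked mechanically for a given overall evaluation f: whenever two pairs of alternatives exhibit the same componentwise differences, f must assign them the same difference of overall value. A minimal sketch, with an invented linear f and invented profiles (not from the case study):

```python
# Checking the "equal differences" condition for a linear overall evaluation f
# on two invented pairs of profiles with identical componentwise differences.

f = lambda g: 2 * g[0] + 0.5 * g[1]     # hypothetical weighted sum

a, b = (10.0, 29.0), (9.0, 28.5)
c, d = (13.0, 30.0), (12.0, 29.5)       # same componentwise differences as (a, b)

assert all(x - y == v - w for x, y, v, w in zip(a, b, c, d))
assert abs((f(a) - f(b)) - (f(c) - f(d))) < 1e-12
```

For a non-linear f (for instance one that flattens above 29 seconds, as Thierry's preferences suggest), the second assertion would fail for suitably chosen pairs, which is exactly why re-coding the scales is needed before a weighted sum is legitimate.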

6.3 The additive multi-attribute value model

Our analysis of the weighted sum brought us very close to the requirements for additive multi-attribute value functions. The most common model in multiple criteria decision analysis is a formalisation of the idea that the decision-maker, when making a decision, behaves as if he was trying to maximise a quantity called utility or value (the term "utility" tends nowadays to be used preferably in the context of decision under risk, but we shall use it sometimes for "value"). This postulates that all alternatives may be evaluated on a single "super-scale" reflecting the value system of the decision-maker and his preferences; the alternatives can be "measured", at least partially, in terms of "worth" on a synthetic dimension of value or utility. In other words, if we denote by ≿ the overall preference relation of the decision-maker on the set of alternatives, this relation relates to the values u(a), u(b) of the alternatives in the following way:

(6.7)  a ≿ b iff u(a) ≥ u(b).

As a consequence, the preference relation on the set of alternatives is a complete preorder, i.e. a complete ranking, possibly with ties. Of course, the value u(a) usually is a function of the evaluations {gi(a), i = 1, ..., n}. If this function is a linear combination of the gi(a), i = 1, ..., n, we get back to the weighted sum. A slightly more general case is the following additive model:

(6.8)  u(a) = Σ_{i=1..n} ui(gi(a))

where the function ui (single-attribute value function) is used to re-code the original evaluation gi in order to linearise it in the sense described in the previous section. Accordingly, the weights ki are incorporated in the ui functions. The additive value function model can thus be viewed as a clever version of the weighted sum, since it allows us to take into account some of the objections against a naive use of it (mainly the second hypothesis in Section 6.2.5). Note however that the imprecision issue is not dealt with inside the model (sensitivity analysis has to be performed in the validation phase, but it is neither part of the model nor straightforward in practice).

Much effort has been devoted to characterising various systems of conditions under which the preferences of a decision-maker can be described by means of an additive value function model (see Krantz, Luce, Suppes and Tversky (1971), Chapter 7; Luce, Krantz, Suppes and Tversky (1990), Vol. 3, Chapter 19). Depending on the context, some systems of conditions may be interpretable and tested, at least partially: it may be possible to ask the decision-maker questions that will determine whether an additive value model is compatible with what can be perceived of his system of preferences. If the preferences of the decision-maker are compatible with an additive value model, a method of elicitation of the ui's may then be used; if not, another model should be looked for: a multiplicative model or, more generally, a non-additive one, a non-independent one, or a model that takes imprecision more intrinsically into account.
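The additive model (6.8) can be sketched directly. The piecewise-linear single-attribute value functions below are invented re-codings (the knot values are illustrative, not Thierry's assessed functions); only the structure of the model matters here.

```python
# Sketch of the additive value model u(a) = sum_i u_i(g_i(a)) with two
# hypothetical piecewise-linear single-attribute value functions.
import numpy as np

def make_u(xs, vs):
    # piecewise-linear single-attribute value function through (xs, vs);
    # np.interp requires xs to be increasing
    return lambda x: float(np.interp(x, xs, vs))

u_cost  = make_u([13500, 17500, 21500], [100, 60, 0])   # cheaper is better
u_accel = make_u([28.0, 29.0, 31.0],   [100, 40, 0])    # faster is better

def u(alternative):
    cost, accel = alternative
    return u_cost(cost) + u_accel(accel)

car_a, car_b = (15500, 29.0), (17500, 28.2)
preferred = car_a if u(car_a) >= u(car_b) else car_b
assert preferred == car_b    # the dearer but much faster car wins here
```

Note how the steep segment of u_accel below 29 seconds encodes a non-linear preference that no choice of weights on the raw scales could reproduce.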

6.3.1 Direct methods for determining single-attribute value functions

A large number of methods have been proposed to determine the ui's in an additive value function model. There are essentially two families of methods: one based on direct numerical estimations, the other on indifference judgements. For an accessible account of such methods, the reader is referred to von Winterfeldt and Edwards (1986). We briefly describe the application of a technique of the latter category, relying on what is called dual standard sequences (Krantz et al. (1971), von Winterfeldt and Edwards (1986), Wakker (1989)), that builds a series of equally spaced intervals on the scale of values.

An assessment method based on indifference judgments

Suppose we want to assess the ui's in an additive model for the Cars case; it is assumed that the suitability of such a model for representing the decision-maker's preferences has been established. Consider a pair of criteria, say Cost and Acceleration, and the corresponding single-attribute value functions. We are going to outline a simulated dialogue between an analyst and a decision-maker that could yield an assessment of u1 and u2.

First ask the decision-maker to select a "central point" corresponding to medium range evaluations on both criteria. In view of the set of alternatives selected by Thierry, for ranges of evaluations corresponding to acceptable cars, the range for the cost will be the interval between 21 500 € and 13 500 €, and from 28 to 31 seconds for acceleration; let us start with (17 500, 29.5) as "average" values for cost and acceleration. Note that we start the construction of the sequence from a "central point" instead of taking a "worst point" (see for instance von Winterfeldt and Edwards (1986), pp. 267 sq, for an example starting from a worst point). Also ask the decision-maker to define a unit step on the cost criterion, for instance passing from a cost of 17 500 € to 16 500 €.

Then the standard sequence is constructed by asking which value x1 for the acceleration would make a car costing 16 500 € and accelerating in 29.5 seconds indifferent to a car costing 17 500 € and accelerating in x1 seconds. Suppose the answer is 29.2, meaning that, from the chosen starting point, a gain of 0.3 second on the acceleration time is worth an increase of 1 000 € in cost. The answer could be explained by the fact that, at the starting level of performance for the acceleration criterion, the decision-maker is quite interested by a gain in acceleration time. Relativising the gains as percentages of the half range from the central to the best values on each scale, this means that the decision-maker is ready to lose 1000/4000 = 25% of the potential reduction in cost for gaining 0.3/1.5 = 20% of acceleration time. We will say in the sequel that the parity is equal when the decision-maker agrees to exchange a percentage of the half range on a criterion against an equal percentage on another criterion.

The second step in the construction of the standard sequence is asking the decision-maker which value to assign to x2 so as to have (16 500, 29.2) ∼ (17 500, x2), where ∼ denotes "indifferent to". Continuing along the same line would for instance yield the following sequence of indifferences:

(16 500, 29.5) ∼ (17 500, 29.2)
(16 500, 29.2) ∼ (17 500, 28.9)
(16 500, 28.9) ∼ (17 500, 28.7)
(16 500, 28.7) ∼ (17 500, 28.5)
(16 500, 28.5) ∼ (17 500, 28.3)
(16 500, 28.3) ∼ (17 500, 28.1)

Such a sequence gives the analyst an approximation of the single-attribute value function u2. Figure 6.5 shows the re-coding u2 of the evaluations g2 on the interval [28, 29.5]: there are two linear parts in the graph, one ranging from 28 to 28.9, where the slope is proportional to 1/0.2, and the other, valid between 28.9 and 29.5, with a slope proportional to 1/0.3.

[Figure 6.5: Single-attribute value function for the acceleration criterion (half range); value plotted against acceleration, from 28 to 29.5 seconds.]

The trade-off between u1 and u2 is easily determined through solving the following equation, which just expresses the initial indifference in the standard sequence:

k1 u1(16 500) + k2 u2(29.5) = k1 u1(17 500) + k2 u2(29.2)

from which we get

k2 / k1 = (u1(16 500) − u1(17 500)) / (u2(29.2) − u2(29.5)).

From there, one is able to re-code the scale of the cost criterion into the single-attribute value function u1. Then, considering (for instance) the cost criterion with criteria 3, 4 and 5 in turn, and using the same idea, one obtains a re-coding of each gi into a single-attribute value function ui. The above construction only covers the half range from 28 to 29.5 seconds, but it is easy to devise a similar procedure for the other half range.
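The bookkeeping behind the standard sequence can be sketched in a few lines: each recorded indifference (16 500, x_k) ∼ (17 500, x_{k+1}) is worth one step of value on the acceleration scale, so the answers sit at equally spaced values of u2, and the initial indifference yields the trade-off k2 (setting k1 = 1). The answers reproduce the dialogue above; u1 is an invented linear re-coding of cost, used only to show the structure of the computation.

```python
# Standard-sequence bookkeeping (values from the simulated dialogue; u1 is a
# hypothetical linear re-coding of cost, one unit of value per 1 000 euros).

answers = [29.5, 29.2, 28.9, 28.7, 28.5, 28.3, 28.1]   # acceleration times
u2_points = {x: k for k, x in enumerate(answers)}      # equally spaced values

u1 = lambda cost: (21500 - cost) / 1000.0              # hypothetical, linear
u2 = lambda t: u2_points[t]

# k1*u1(16500) + k2*u2(29.5) = k1*u1(17500) + k2*u2(29.2), with k1 = 1:
k2 = (u1(16500) - u1(17500)) / (u2(29.2) - u2(29.5))

assert u2_points[28.1] == 6   # six unit steps below the central point
assert k2 == 1.0              # one step of u2 is traded against 1 000 euros
```

Interpolating linearly between the points of `u2_points` (steps of 0.3 second above 28.9, steps of 0.2 below) reproduces the two-slope shape of Figure 6.5.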

If we set k1 to 1, this formula yields k2; the trade-offs k3, k4 and k5 are obtained similarly. Notice that the re-coding process of the original evaluations into value functions results in a formulation in which all criteria have to be maximised (in value).

The above procedure, although rather intuitive and systematic, is also quite complex: the questions are far from easy to answer and, moreover, starting from one reference point or another (worst point instead of central point) may result in variations in the assessments. There are however many possibilities for checking for inconsistencies. Assume for instance that a single-attribute value function has been assessed by means of a standard sequence that links its scale to the cost criterion; one may validate this assessment by building a standard sequence that links its scale to another criterion and compare the two assessments of the same value function obtained in this way. Hopefully they will be consistent; otherwise some sort of retroaction is required.

Methods relying on numerical judgements

In another line of methods, simplicity and direct intuition are more praised than scrupulous satisfaction of theoretical requirements, although the theory is not ignored. An example is SMART ("Simple Multi-Attribute Rating Technique"), developed by W. Edwards, which is more a collection of methods than a single one. We just outline here a variant, referring to von Winterfeldt and Edwards (1986), pp. 278 sq., for more details.

In order to re-code, say, the evaluations for the acceleration criterion, one initially fixes two "anchor" points, which may be the extreme values of the evaluations on the set of acceptable cars, here 28 and 31 seconds. On the value scale, the anchor points are associated to the endpoints of a conventional interval of values, for instance 31 to 0 and 28 to 100. Thierry could for instance assign 29 seconds to 50 on the value scale: since 29 seconds seems to be the value under which Thierry considers that a car becomes definitely attractive from the acceleration viewpoint, the interval [28, 29] should be assigned a range of values larger than 1/3, its size (in relative terms) in the original scale. Then 28.5 and 30 could be located respectively at 70 and 10, yielding the initial sketch of a value function shown on Figure 6.6(a) (with linear interpolation between the specified values). This picture can be further improved by asking Thierry to see whether the relative spacings of the locations correctly reflect the strength of his preferences. Thierry might say that obtaining almost the same gain in value from 30 seconds to 29 (a gain of 40) as from 29 to 28 (a gain of 50) is unfair, and he could consequently propose to lower to 40 the value associated with 29 seconds; accordingly, he also lowers to 65 the value of 28.5 seconds. Suppose he is then satisfied with all other differences of values; the final version is drawn in Figure 6.6(b). A similar work has to be carried over for all criteria, and the weights must be assessed. Note finally that such methods may not be used when the scale on which the assessments are made only has a finite number of degrees instead of being the set of real numbers; at least numerous and densely spaced degrees are needed.

The weights are usually derived through direct numerical judgements of relative attribute importance. Thierry would be asked to rank-order the attributes.
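The SMART-style direct rating just described can be sketched with a simple interpolation table: anchor 31 s at 0 and 28 s at 100, place the intermediate levels, then adjust after reflecting on value differences (29 s lowered from 50 to 40, 28.5 s from 70 to 65, as in the text).

```python
# Direct-rating sketch of the acceleration value function, before and after
# Thierry's adjustment; linear interpolation between the rated points.
import numpy as np

initial = {28.0: 100, 28.5: 70, 29.0: 50, 30.0: 10, 31.0: 0}
final   = {28.0: 100, 28.5: 65, 29.0: 40, 30.0: 10, 31.0: 0}

def rate(t, table):
    xs = sorted(table)
    return float(np.interp(t, xs, [table[x] for x in xs]))

assert rate(29.0, final) == 40.0
assert abs(rate(28.25, final) - 82.5) < 1e-9   # halfway between 100 and 65
```

After the adjustment, the gain from 30 to 29 seconds (30 points) is clearly smaller than the gain from 29 to 28 seconds (60 points), which is what Thierry asked for.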

[Figure 6.6: Value function for the acceleration criterion: (a) initial sketch; (b) final version, with the initial sketch in dotted line. Value (0 to 100) plotted against acceleration (28 to 31 seconds).]

An "importance" of 10 could be arbitrarily assigned to the least important criterion, and the importance of each other criterion should be assessed in relation to the least important one. This approach in terms of "importance" can be and has been criticised. In assessing the relative weights, no reference is made to the underlying scales: a formulation that seems independent of the scalings of the criteria. This is not appropriate, since weights are trade-offs between units on the various value scales and must vary with the scaling. If we had considered that the acceleration evaluations of admissible cars range from 27 to 32 seconds, instead of from 28 to 31, we would have constructed a value function u′2 with u′2(32) = 0 and u′2(27) = 100. Both scales are normalised in the 0-100 range, yet the meaning of one unit varies depending on the range of original evaluations (acceleration measured in seconds) that are represented between value 0 and value 100 of the value scale: a difference of one unit of value on the scale u2 illustrated in Figure 6.6 corresponds to a (less-than-unit) difference of (u′2(28) − u′2(31))/100 on the scale u′2. The weight attached to that criterion must vary in inverse proportion to the previous factor when passing from u2 to u′2. It is unlikely that a decision-maker would take the range of evaluations into account when asked to assess weights in terms of relative "importance" of criteria.

A way of avoiding these difficulties is to give up the notion of importance, which seems misleading in this context, and to use a technique called swing-weighting: the decision-maker is asked to compare alternatives that "swing" between the worst and the best level for each attribute, in terms of their contribution to the overall value. The argument of simplicity in favour of SMART is then lost, since the questions to be answered are similar, both in difficulty and in spirit, to those raised in the approach based on indifference judgements.
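The range-dependence of the weights described above can be sketched numerically: on a wider range, one point of the 0-100 value scale corresponds to more seconds of acceleration, so the weight per value point must be rescaled in proportion to the ratio of the unit sizes. The numbers below are illustrative and assume, for simplicity, linear value functions on each range.

```python
# Hypothetical illustration: rescaling a weight when the acceleration value
# scale is rebuilt on the range 27-32 s instead of 28-31 s.

def unit_in_seconds(worst, best):
    # seconds corresponding to 1 point of a 0-100 value scale, assuming
    # (for illustration only) a linear value function on [best, worst]
    return (worst - best) / 100.0

k_narrow = 2.0    # invented weight for the value scale built on 28-31 s
k_wide = k_narrow * unit_in_seconds(32, 27) / unit_in_seconds(31, 28)

assert abs(k_wide - 2.0 * 5 / 3) < 1e-9   # rescaled by the ratio of ranges
```

A decision-maker asked for the "importance" of acceleration would most likely give the same answer for both ranges, which is precisely the inconsistency that swing-weighting is meant to avoid.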

6.3.2 AHP and Saaty's eigenvalue method

The eigenvalue method for assessing attribute weights and single-attribute value functions is part of a general methodology called the "Analytic Hierarchy Process" (AHP), which is formally a weighted sum of single-attribute value functions (see Saaty (1980), Harker and Vargas (1987)). It consists in structuring the decision problem in a hierarchical manner (as it is also advocated for building value functions, for instance in Keeney and Raiffa (1976)), constructing numerical evaluations associated with all levels of the hierarchy and aggregating them in a specific fashion.

In our case, the top level of the hierarchy is Thierry's goal of finding the best car according to his particular views. The second level consists in the 5 criteria into which his global goal can be decomposed. The last level can be described as the list of potential cars. Thus the hierarchical tree is composed of 1 first level node, 5 second level nodes and 5 times 14 third level nodes, also called leaves.

What we have to determine is the "strength" or priority of each element of a level in relation to its importance for an element of the next level. At each level, all nodes linked to the same parent node are compared pairwise. The assessment of the nodes may start (as is usually done) from the bottom nodes; in our case this amounts to comparing all cars from the point of view of a criterion and repeating this for all criteria. The same is then done for all criteria in relation to the top node: the influences of all criteria on the global goal are also compared pairwise.

The pairwise comparison of the nodes in relation to the parent node is done by means of a particular method that allows, to some extent, to detect and correct inconsistencies. It is asked, for instance, how much alternative a is preferred to alternative b from a certain point of view; in general, the decision-maker is asked to assess the "priority" of a as compared to the "priority" of b. The questions are expressed in terms of "importance", "preference" or "likelihood" according to the context. The answers may be formulated either on a verbal or a numerical scale. There are five main levels on the verbal scale, but 4 intermediary levels, corresponding to the numerical codings 2, 4, 6 and 8, can also be used. The conversion of verbal levels into numerical levels is described in Table 6.6: the levels of the verbal scale correspond to numbers and are dealt with as such in the computations. For instance, the level "Moderate" corresponds to an alternative that is preferred 3 times more than another, or to a criterion that is 3 times more important than another. Such an interpretation of the verbal levels has very strong implications: it means that preference, importance and likelihood are considered as perceived on a ratio scale (much like sound intensity). This is indeed Saaty's basic assumption: a number f(a) is assumed to be attached to each a and, when comparing a to b, the decision-maker is assumed to give an approximation of the ratio f(a)/f(b). Since verbal levels are automatically translated into numbers in Saaty's method, we shall concentrate on assessing directly on the numerical scale. Let α(a, b) denote the level of preference (or of relative importance) of a over b expressed by the decision-maker; the results of the pairwise comparisons may thus be encoded in a square matrix α. If Saaty's hypotheses are correct, there should be some sort of consistency between the elements of α.
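Saaty's basic assumption can be sketched directly: assume a latent priority f exists and that the decision-maker reports the ratios f(a)/f(b). A matrix built this way is perfectly consistent, reciprocal, and has columns proportional to f. The priorities below are invented for illustration.

```python
# Hypothetical latent priorities; a perfectly consistent comparison matrix
# encodes alpha(a, b) = f(a)/f(b).
f = {'a': 6.0, 'b': 3.0, 'c': 1.0}
items = list(f)
alpha = [[f[x] / f[y] for y in items] for x in items]

# reciprocity: alpha(a, b) = 1 / alpha(b, a)
assert alpha[0][1] == 2.0 and alpha[1][0] == 0.5

# every column is proportional to f = (6, 3, 1)
col0 = [row[0] for row in alpha]
assert all(abs(v * 6.0 - w) < 1e-9 for v, w in zip(col0, [6.0, 3.0, 1.0]))
```

Real assessments only approximate these ratios, which is why the method includes consistency checks and an averaging step to recover an estimate of f.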

more weight will be put on a particular cost diﬀerence. If the test f (b) conclusion is negative. a) α(a. namely. Each alternative a is then assigned an overall value v(a) computed as n (6. If one wants to apply AHP in a multiple criteria decision problem. 2 Relation (6. A test based on statistical considerations allows the user to determine whether the assessments in the pairwise comparison matrix show suﬃcient agreement with the hypothesis that they are approximations of f (a) . b. it is recommended either to revise the assessments or to choose another approach more suitable for the type of data.11) v(a) = i=1 ki ui (a) and the alternatives can be ranked according to the values of v. for an unknown f .112 Verbal Numeric CHAPTER 6. For instance.9) implies that all columns of matrix α should be approximately proportional to f . c) In view of the latter relation. for all a. say 1 000 e. c. criteria must also be compared in a pairwise manner to model their importance.g.6: Conversion of verbal levels into numbers in Saaty’s pairwise comparison method. correct errors made in the estimation of the ratios. (6. pairwise comparisons of the alternatives must be performed for each criterion. detect departure from the basic hypothesis in case the columns of α are too far from proportional. The pairwise comparisons enable to 1. This process results in functions ui that evaluate the alternatives on each criterion i and in coeﬃcients of importance ki . only one half (roughly) of the matrix has to be elicited. 2. COMPARING ON SEVERAL ATTRIBUTES Equal 1 Moderate 3 Strong 5 Very strong 7 Extreme 9 Table 6. e. some sort of averaging of the columns is performed yielding an estimation of f . when located in the range . c) ≈ α(a. b) ≈ 1 α(b. (6.9) and in particular. “Moderate” means “3 times more preferred” be some sort of consistency between elements of α. Applying AHP to the case Since Thierry did not apply AHP to his analysis of the case. b) × α(b.10) α(a. 
which amounts to answering n(n−1) questions. we have answered the questions on pairwise comparisons on the basis of the information contained in his report. when comparing cars on the cost criterion.
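The mechanics of relations (6.9) and (6.10) can be sketched in a few lines of code. This is a minimal illustration of one simple averaging variant, not Saaty's software; the function names and the 3-alternative data are ours.

```python
def reciprocal_matrix(upper):
    """Build a full pairwise comparison matrix alpha from the elicited
    upper-triangular assessments, using alpha(b, a) = 1/alpha(a, b)."""
    n = len(upper) + 1
    alpha = [[1.0] * n for _ in range(n)]
    for i, row in enumerate(upper):
        for k, value in enumerate(row):
            j = i + 1 + k
            alpha[i][j] = value
            alpha[j][i] = 1.0 / value
    return alpha

def column_average_priorities(alpha):
    """Estimate f by normalising each column to sum 1 and averaging the
    columns (a simple alternative to the eigenvector method)."""
    n = len(alpha)
    cols = []
    for j in range(n):
        total = sum(alpha[i][j] for i in range(n))
        cols.append([alpha[i][j] / total for i in range(n)])
    return [sum(cols[j][i] for j in range(n)) / n for i in range(n)]

# Hypothetical 3-alternative example: a is judged 3 times better than b
# and 6 times better than c; b is judged 2 times better than c.  The
# matrix is fully consistent, so the columns are exactly proportional
# and the estimate reproduces the underlying ratios 6 : 2 : 1.
alpha = reciprocal_matrix([[3.0, 6.0], [2.0]])
f = column_average_priorities(alpha)
```

When the matrix departs from consistency, the normalised columns disagree and the averaging is precisely what smooths the disagreement out.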

6.3. THE ADDITIVE VALUE MODEL

This corresponds to the fact that Thierry said he is rather insensitive to cost differences up to about 17 500 €, which is the amount of money he had budgeted for his car. The transformation of cost into preference is thus not linear, since equal ratios corresponding to costs located either below or above 17 500 € do not correspond to equal ratios of preference.

This points to a major issue in the assessment of pairwise comparisons, for instance of alternatives in relation to a criterion: determining how many times a is preferred to b on criterion i from looking at the evaluations gi(a) and gi(b). The (ratio) scale of preference on i is not in general the scale of the evaluations gi; a transformation (re-scaling) is usually needed to go from evaluations to preferences. For the cost, this is because the cost evaluation does not measure the preferences directly. Car 11 costs approximately 17 500 € and Car 12 costs about 16 000 €. The ratio of these costs, 17 500/16 000, is equal to 1.09375, but this does not necessarily mean that Car 12 is preferred 1.09375 times more than Car 11 on the cost criterion. A decision-maker might very well say that Car 12 is 1.5 times more preferred to Car 11, or he could say 2 times or 4 times; all depends on what the decision-maker would consider as the minimum possible cost. For instance (supposing that the transformation of cost into preference is linear), if Car 12 is declared to be 2 times more preferred than Car 11 for the cost criterion, the zero x of the cost scale would be such that 17 500 − x = 2 × (16 000 − x), i.e. x = 14 500 €. But even in the linear parts, the question is not easily answered. The problem is even more crucial for transforming scales such as those on which braking or road-holding are evaluated: how many times is Car 3 preferred to Car 10 with respect to the braking criterion? In other words, how many times is 2.66 better than (preferred to) 2.33?

For the pairwise comparison of the cars, we have, for the sake of concision, restricted our comparisons to a subset of cars, namely the top four cars plus the Renault 19, Mazda 323 and Toyota Corolla.

Similar questions arise for the comparison of the importance of the criteria: the relative importance of each criterion with respect to all others must also be assessed. We discuss the determination of the "weights" ki of the criteria appearing in formula 6.11 below. Our assessments are shown in Table 6.7. We made them directly in numerical terms, taking into account a set of weights that Thierry considered as reflecting his preferences; those weights had been obtained using the Prefcalc software and a method that is discussed in the next section. In the tables below, the blanks on the diagonal should be interpreted as 1's, and the blanks below the diagonal are supposed to be 1 over the corresponding value above the diagonal.

Once the matrix in Table 6.7 has been filled, several algorithms can be proposed to compute the "priority" of each criterion with respect to the goal symbolised by the top node of the hierarchy (under the hypothesis that the elements of the assessment matrix are approximations of the ratios of those priorities). The most famous algorithm, which was initially proposed by Saaty, consists in computing the eigenvector of the matrix corresponding to the largest eigenvalue (see Harker and Vargas (1987)).
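The implied zero of the cost scale discussed above can be recovered by solving a one-line linear equation; a small sketch (the helper name is ours, not part of any software mentioned in the text):

```python
def implied_zero(cost_a, cost_b, ratio):
    """Zero x of the cost scale such that (cost_a - x) = ratio * (cost_b - x),
    where cost_b < cost_a and the cheaper car b is declared `ratio`
    times more preferred than a (linear cost-to-preference assumed)."""
    # (cost_a - x) = ratio * (cost_b - x)
    # => x = (ratio * cost_b - cost_a) / (ratio - 1)
    return (ratio * cost_b - cost_a) / (ratio - 1.0)

# Car 11 costs 17 500, Car 12 costs 16 000; declaring Car 12 twice as
# preferred locates the zero of the cost scale at 14 500.
x = implied_zero(17_500, 16_000, 2.0)   # -> 14500.0
```

Different declared ratios move the implied zero, which is exactly why the evaluations gi cannot serve directly as a ratio scale of preference.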

Harker and Vargas (1987) also give an interpretation of the "eigenvector method" as a way of "averaging ratios along paths". Since eigenvectors are determined up to a multiplicative factor, the vector of priorities is taken to be the normalised eigenvector whose components sum up to unity; the special structure of the matrix (a reciprocal matrix) guarantees that all priorities will be positive. Alternative methods for correcting inconsistencies have been elaborated; most of them are based on some sort of least squares criterion or on computing averages (see e.g. Barzilai, Cook and Golany (1987), who argue in favour of a geometric mean).

Rel. importance   Cost   Accel   Pick-up   Brakes   Road-h
Cost                      1.5      2         3        3
Acceleration                       1.5       2        2
Pick-up                                      1.5      1.5
Brakes                                                1
Road-holding

Table 6.7: Assessment of the comparison of importance for all pairs of criteria.

In Table 6.7, the number 2 at the intersection of the 1st row and 3rd column means that "Cost" is considered twice as important as "Pick-up". Note that only the lowest degrees of the 1 to 9 scale have been used in Table 6.7: the weights are not perceived as very contrasted. Some comparisons have even been assessed by non-integer degrees such as 1.5, which are normally not available on the verbal counterpart of the 1 to 9 scale described in Table 6.6; this was needed in order to get the sort of gradation of the weights reported below (the ratio of the highest to the lowest value is about 3). When the assessments are made through the verbal scale, approximations must be made, for instance by saying that cost and acceleration are equally important, substituting 1 for 1.5. Note also that the labelling of the degrees on the verbal scale may be misleading: one would quite naturally qualify the degree to which "Cost" is more important than "Acceleration" as "Moderate", until it is fully realised that "Moderate" means "three times as important"; even the intermediary level between "Equal" and "Moderate" would still mean "twice as important".

Applying the eigenvector method to the matrix in Table 6.7, one obtains the following values that reflect the importance of the criteria:

(.352, .241, .172, .117, .117).

It should be emphasised that the "eigenvalue method" is not linear. What would have changed if we had scaled the importance differently, for instance assessing the comparisons of importance by degrees twice as large as those in Table 6.7 (except for the 1's, that remain constant)? Would the coefficients of importance have been twice as large? Not at all! The resulting weights would have been much more contrasted, namely:

(.489, .254, .137, .060, .060).
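The eigenvector computation itself can be reproduced with a simple power iteration; the sketch below is our code, not Saaty's implementation, applied to the assessments of Table 6.7.

```python
def power_iteration_priorities(matrix, steps=200):
    """Approximate the principal eigenvector of a positive reciprocal
    matrix by repeated multiplication, normalising to sum 1."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(steps):
        w = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        w = [x / total for x in w]
    return w

# Upper half of Table 6.7 (criteria: cost, acceleration, pick-up,
# brakes, road-holding); the lower half holds the reciprocals.
upper = [[1.5, 2.0, 3.0, 3.0],
         [1.5, 2.0, 2.0],
         [1.5, 1.5],
         [1.0]]
n = 5
A = [[1.0] * n for _ in range(n)]
for i, row in enumerate(upper):
    for k, v in enumerate(row):
        A[i][i + 1 + k] = v
        A[i + 1 + k][i] = 1.0 / v

weights = power_iteration_priorities(A)
# Close to the priorities quoted in the text:
# (.352, .241, .172, .117, .117)
```

Doubling the off-diagonal assessments and re-running the same code illustrates the non-linearity discussed above: the resulting priorities are far more contrasted than merely "twice as large".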

Using the latter set of weights instead of the former would substantially change the values attached to the alternatives through formula 6.11, and might even alter their ordering. The reason is that, in AHP, the assessments of all nodes are made independently and are treated as if they were made on an absolute scale: no transformation is allowed, and there is no degree of freedom in the assessment of the ratios, contrary to the determination of the trade-offs in an additive value model (which may be re-scaled through multiplying them by a positive number, without altering the way in which the alternatives are ordered by the multi-attribute value function, provided the transformation of the single-attribute value functions is compensated for by transforming the trade-offs). In the multi-attribute value model, the scaling of the single-attribute value function is indeed related to the value of the trade-off: changing the unit on the vertical axis amounts to multiplying ui by a positive number, and the corresponding trade-off must then be divided by the same number. Notice also that the origin is arbitrary in the single-attribute value model: one may add any constant number to the values without changing the ranking of the alternatives (a term equal to the constant number times the trade-off associated with the attribute would just be added to the multi-attribute value function).

As a further example, we now apply the method to determine the evaluation of the alternatives in terms of preference on the "Acceleration" criterion. Suppose the pairwise comparison matrix has been filled as shown in Table 6.8, in a way that seems consistent with what we know of Thierry's preferences.

Table 6.8: Pairwise comparisons of the preferences of 7 cars on the acceleration criterion. The cars compared are the Honda Civic (Nr 7), Peugeot 309/16V (11), Nissan Sunny (3), Peugeot 309 (12), Renault 19 (10), Mazda 323 (4) and Toyota Corolla (6).

Applying the eigenvalue method yields the following "priorities" attached to the cars in relation to acceleration: (.2694, .2987, .1507, .0745, .0584, .0934, .0548). A re-scaling of the same criterion had previously been obtained through the construction of a standard sequence (see Figure 6.5). Comparing these two scales is not straightforward. In order to compare the two figures, one may transform the value function of Figure 6.5 so that it coincides with the AHP priority on the extreme values of the acceleration half range, i.e. 28 and 29.5 seconds. Figure 6.7 shows the transformed single-attribute value function superimposed on the graph of the priorities; the solid line is a linear interpolation of the priorities in the eigenvector.

Figure 6.7: Priorities relative to acceleration obtained through the eigenvector method are represented by the solid line; the linearly transformed single-attribute values of Figure 6.5 are represented by the dotted line, on the range from 28 to 29.5 seconds. (Horizontal axis: acceleration (sec), from 28 to 31; vertical axis: priorities (solid), value (dotted), from 0 to 0.3.)

There seems to be a good fit of the two curves, but this is only an example from which no general conclusion can be drawn.

Comments on AHP

Although the models for describing the overall preferences of the decision-maker are identical in multi-attribute value theory and in AHP, this does not mean that applying the respective methodologies of these theories normally yields the same overall evaluation of the alternatives: there are striking differences between the two approaches from the methodological point of view. The ambition of AHP is to help construct evaluations of the alternatives for each viewpoint (in terms of preferences) and of the viewpoints with regard to the overall goal (in terms of importance). These evaluations are claimed to belong to a ratio scale, i.e. to be determined up to a positive multiplicative constant; in practice, however, the evaluations in terms of preference must be considered as if they were made on an absolute scale. Since the eigenvalue method yields a particular determination of this constant, and this determination is not taken into account when assessing the relative importance of the various criteria, the method has been repeatedly criticised in the literature (see for instance Belton (1986) and Dyer (1990)). This weakness (which can also be blamed on direct rating techniques, as mentioned above) could be corrected by asking the decision-maker about the relative importance of the viewpoints in terms of passing from the least preferred value to the most preferred value on criterion i compared to a similar change on criterion j (Dyer 1990).

Taking this suggestion into account would however go against one of the basic principles of Saaty's methodology, namely the assumption that the assessments at all levels of the hierarchy can be made along the same procedure and independently of the other levels.

AHP has been criticised in the literature in several other respects. Besides the fact, already mentioned, that it may be difficult to reliably assess comparisons of preferences or of importance on the standard scale described in Table 6.6, there is an issue about AHP that has been discussed quite a lot, namely the possibility of rank reversal. Suppose alternative x is removed from the current set and nothing is changed in the pairwise assessments of the remaining alternatives; it may happen that an alternative a among the remaining ones is now ranked below an alternative b, whilst it was ahead of b in the initial situation. This phenomenon was discussed in Belton and Gear (1983) and Dyer (1990) (see also Harker and Vargas (1987) for a defense of AHP). That is probably why the original method, although seriously attacked, has remained unchanged.

6.3.3 An indirect method for assessing single-attribute value functions and trade-offs

Various methods have been conceived in order to avoid the direct elicitation of a multi-attribute value function. A class of such methods consists in postulating an additive value model (as described in formulae 6.7 and 6.8) and inferring, all together, the shapes of all single-attribute value functions and the values of all the trade-offs from declared global preferences on a subset of well-known alternatives. The idea is thus to infer a general preference model from partial holistic information about the decision-maker's preferences. Thierry used a method of disaggregation of preferences described in Jacquet-Lagrèze and Siskos (1982); it is implemented in a software called Prefcalc, which computes piece-wise linear single-attribute value functions and is based on linear programming (see also Jacquet-Lagrèze (1990), Vincke (1992)). The software helps to build a function

u(a) = Σ_{i=1}^{n} ui(gi(a))

such that

a ≿ b ⇐⇒ u(a) ≥ u(b).

Without loss of generality, the lowest (resp. highest) value of u is conventionally set to 0 (resp. 1); 0 (resp. 1) is the value of a (fictitious) alternative whose assessment on each criterion would be the worst (resp. best) evaluation attained for that criterion on the current set of alternatives. This fictitious alternative is sometimes called the anti-ideal (resp. ideal) point. In our example, the "anti-ideal" car costs 21 334 €, needs 30.8 seconds to cover 1 km starting from rest and 41.7 seconds starting in fifth gear at 40 km/h; its performances regarding brakes and road-holding are respectively 1.33 and 1.25. The "ideal" car, on the opposite side of the range, costs 13 841 €, needs 28 seconds to cover 1 km starting from rest and 34.6 seconds starting in fifth gear at 40 km/h; its performances regarding brakes and road-holding are respectively 2.66 and 3.25.
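A two-piece piecewise-linear single-attribute value function is fully specified by a small number of breakpoints. The sketch below shows how such functions can be evaluated and summed into u(a); only the cost range (13 841 to 21 334 €) and the .43 trade-off come from the text, the mid-range utility is a hypothetical stand-in for the level the software would infer.

```python
from bisect import bisect_right

def piecewise_linear(points):
    """Return a function interpolating linearly between (x, value)
    points sorted by x; outside the range, the end values are used."""
    xs = [x for x, _ in points]
    vs = [v for _, v in points]
    def f(x):
        if x <= xs[0]:
            return vs[0]
        if x >= xs[-1]:
            return vs[-1]
        j = bisect_right(xs, x)
        t = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
        return vs[j - 1] + t * (vs[j] - vs[j - 1])
    return f

# Cost value function: worth .43 (the trade-off) at the ideal cost of
# 13 841 and 0 at the anti-ideal 21 334; the mid-range level (.40) is
# a made-up value for illustration.
u_cost = piecewise_linear([(13_841, 0.43), (17_588, 0.40), (21_334, 0.0)])

def overall_value(evaluations, value_functions):
    """u(a): sum of the single-attribute values of a's evaluations."""
    return sum(f(g) for f, g in zip(value_functions, evaluations))
```

With one such function per criterion, `overall_value` implements the additive form u(a) = Σ ui(gi(a)), and the ideal alternative scores the sum of the trade-offs, i.e. (approximately) 1.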

Figure 6.8: Single-attribute value functions computed by means of Prefcalc in the "Choosing a car" problem; the value of the trade-off is written in the right upper corner of each box (e.g. .43 for cost, .23 for acceleration, .13 for pick-up).

The pieces of information on which the formulation of the linear program relies are obtained from the user, who is asked to select a few alternatives that he is familiar with and feels able to rank-order according to his overall preferences. The ordering of these alternatives, which include the fictitious ideal and anti-ideal ones, induces the corresponding order on their overall values and hence generates the constraints of the linear program.

The shape of the single-attribute value function for the cost criterion, for instance, is modelled as follows. The user fixes the number of linear pieces; suppose that you decide to set it to 2 (which is a parsimonious option and the default value proposed in Prefcalc), one for each half of the cost range. With two linear pieces, the single-attribute value function is completely determined by two numbers: the utility value at mid-range, say u1.1, and the maximal utility, say u1.2. Note that the maximal value of the utility (reached for a cost of 13 841 €) is scaled in such a way that it corresponds to the value of the trade-off associated with the cost criterion, i.e. .43 in the example shown in Figure 6.8. Prefcalc then tries to find levels ui.1 and ui.2 for each criterion i that make the additive value function compatible with the declared information; those values are the variables of the linear program that Prefcalc writes and solves. If the program is not contradictory, i.e. if an additive value function (with 2-piece piece-wise linear single-attribute value functions) proves compatible with the preferences, the system tries to find, among all feasible solutions, one that maximises the discrimination between the selected alternatives; the single-attribute value function of the cost could then, for instance, be represented as in Figure 6.8.
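The ordering constraints fed to the linear program can be sketched as plain data: each consecutive pair in the declared ranking yields one inequality on the (unknown) breakpoint variables. This is an illustrative skeleton only; the variable handling and the actual solving are left to an LP solver.

```python
# Declared ranking, best first: the fictitious ideal and anti-ideal
# alternatives bracket the learning set.
ranking = ["ideal", "Car 11", "Car 3", "Car 13", "Car 9", "Car 14",
           "anti-ideal"]

def ordering_constraints(ranking, epsilon=1e-3):
    """One constraint u(better) >= u(worse) + epsilon per consecutive
    pair of the declared ranking; epsilon enforces discrimination
    between the selected alternatives."""
    return [(better, worse, epsilon)
            for better, worse in zip(ranking, ranking[1:])]

constraints = ordering_constraints(ranking)
# 7 ranked alternatives -> 6 ordering constraints for the linear program.
```

Infeasibility of this system is exactly the situation in which the software proposes to add variables, i.e. more linear pieces per criterion.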

This method could be described as a learning process: the system fits the parameters of the model on the basis of partial information about the user's preferences, and the set of alternatives on which the user declares his global preferences may be viewed as a learning set. If no feasible solution can be found, the system proposes to increase the number of variables of the model, for instance by using a higher number of linear pieces in the description of the single-attribute value functions. For more details on the method, the reader is referred to Jacquet-Lagrèze and Siskos (1982), Vincke (1992).

In his ex post study, Thierry selects five cars, besides the ideal and anti-ideal ones, and ranks them in the following order:

1. Peugeot 309 GTI 16 (Car 11)
2. Nissan Sunny (Car 3)
3. Mitsubishi Galant (Car 13)
4. Ford Escort (Car 9)
5. Renault 21 (Car 14)

This ranking is compatible with an additive value function; such a compatible value function is described in Figure 6.8. The ranking, as well as the multi-attribute value assigned to each car, is given in Table 6.9.

Thierry examines this result and makes the following comments. He agrees with many features of the fitted single-attribute value functions, in particular with:

1. the lack of sensitivity of the price in the range from 13 841 € to 17 576 € (he was a priori estimating his budget at about 17 500 €);
2. the high importance (weight = .23) given to approaching 28 seconds on the "acceleration" criterion; Thierry declares this criterion to be the second most important after cost (weight = .43); above 29 seconds, the car is useless, since a difference of 1 second in acceleration results in the faster car being two car lengths ahead of the slower one at the end of the test;
3. the importance (weight = .13) of getting as close as possible to 34 seconds in the acceleration test starting from 40 km/h (above 38 seconds he agrees that the car loses all attractiveness: the car is not only used in competition, it must be pleasant in everyday use and hence have good pick-up); the third criterion thus has a certain importance, although less than the second one;
4. the modelling of the road-holding criterion.

However, Thierry disagrees with the modelling of the braking criterion; he believes that the relative importance of the fourth and fifth criteria should be revised, braking being in his view equally important as road-holding.

Rank   Cars                          Value
 1     Peugeot 309/16 (Car 11) *     0.84
 2     Nissan Sunny (Car 3) *        0.68
 3     Renault 19 (Car 10)           0.66
 4     Peugeot 309 (Car 12)          0.65
 5     Honda Civic (Car 7)           0.61
 6     Fiat Tipo (Car 1)             0.54
 7     Opel Astra (Car 8)            0.54
 8     Mitsubishi Colt (Car 5)       0.53
 9     Mazda 323 (Car 4)             0.52
10     Toyota Corolla (Car 6)        0.50
11     Alfa 33 (Car 2)               0.49
12     Mitsubishi Galant (Car 13) *  0.48
13     Ford Escort (Car 9) *         0.32
14     R 21 (Car 14) *               0.16

Table 6.9: Ranking obtained using Prefcalc. The cars ranked by Thierry are those marked with a *.

Note that Prefcalc normalises the value function so that the ideal alternative is always assigned the value 1; the sum of the maximal values of the single-attribute value functions may be only approximately equal to 1, of course, due to the display format of the numbers with two decimal positions.

Thierry feels that Car 10 (Renault 19) is ranked too high, while Car 7 (Honda Civic) should be in a better position. In view of these observations, Thierry modifies the single-attribute value functions for criteria 4 and 5. For the braking criterion, the value (0.01) associated with the level 2 remains unchanged, while the utility of the level 2.66 is raised to 0.1; the road-holding criterion is also modified, the utility (0.2) associated with the level 3.2 being lowered to 0.1 (see Figure 6.9). Running Prefcalc with the altered value functions returns the ranking in Table 6.10, with the revised multi-attribute value after each car name. After he sees the modified ranking yielded by Prefcalc, Thierry feels that the new ranking is fully satisfactory. He finally makes the following comments: "Using Prefcalc has enhanced my understanding of both the data and my own preferences; in particular, I am more conscious of the relative importance I give to the various criteria." He also observes that if he had used Prefcalc a few years earlier, he would have made the same choice as he actually did; he considers this a good point as far as Prefcalc is concerned.

Comments on the method

First, let us emphasise an important psychological aspect of the empirical validation of a method or a tool, which is common in human practice: the fact that previous intuition or previous more informal analyses are confirmed by using a tool, here Prefcalc, contributes to raising the level of confidence the user puts in the tool.

Figure 6.9: Modified single-attribute value functions for the braking and road-holding criteria.

Rank   Cars                          Value
 1     Peugeot 309/16 (Car 11) *     0.85
 2     Nissan Sunny (Car 3) *        0.75
 3     Honda Civic (Car 7)           0.66
 4     Peugeot 309 (Car 12)          0.65
 5     Renault 19 (Car 10)           0.61
 6     Opel Astra (Car 8)            0.55
 7     Mitsubishi Colt (Car 5)       0.54
 8     Mazda 323 (Car 4)             0.53
 9     Fiat Tipo (Car 1)             0.51
10     Toyota Corolla (Car 6)        0.50
11     Mitsubishi Galant (Car 13) *  0.48
12     Alfa 33 (Car 2)               0.47
13     Ford Escort (Car 9) *         0.32
14     R 21 (Car 14) *               0.16

Table 6.10: Modified ranking using Prefcalc. The cars ranked by Thierry are those marked with a *.

Observe that the user may well have a very vague understanding of the method itself: he simply validates the method by using it to reproduce results that he has confidence in. After such a successful empirical validation step, he will be more prone to use the method in new situations that he does not master as well.

What are the drawbacks and traps of Prefcalc? Obviously, Prefcalc can only be used in cases where the overall preference of the decision-maker can be represented by an additive multi-attribute value function (as described by Equation 6.8); this is not the case when preferences are not transitive or not complete (for arguments supporting the possible observation of non-transitive preferences, see the survey by Fishburn (1991)). There are some additional restrictions due to the fact that the shapes of the single-attribute value functions that can be modelled by Prefcalc are limited to piece-wise linear functions. This is hardly a restriction when dealing with a finite set of alternatives: by adapting the number of linear pieces, one can obtain approximations of any continuous curve that are as accurate as desired. When bounded to a small number of pieces, this may however be a more serious restriction.

Stability of the ranking

The main problem raised by the use of such a tool is the indetermination of the estimated single-attribute value functions (including the estimation of the trade-offs). Usually, if the preferences declared on the set of well-known alternatives are compatible with an additive value model, there will be several value functions that can represent these preferences; Prefcalc chooses one such representation, the most discriminating one (in a sense), according to the principles outlined above. Slight variations in the trade-off values can yield rank reversals. For instance, passing from the set of trade-offs (.43, .23, .13, .10, .10) to (.45, .21, .12, .11, .11), i.e. with all trade-offs within ±.02 of their value, results in exchanging the positions of the Honda Civic and the Peugeot 309, which are ranked 3rd and 4th respectively after the change. This rank reversal is obtained by putting slightly more emphasis on cost and slightly less on performance. Note that such a slight change in the trade-offs has an effect on the ranking of the top 4 cars, those on which Thierry focused after his preliminary analysis (see Table 6.3). It is therefore of prime importance to carry out a lot of sensitivity analyses in order to identify which parts of the result remain reasonably stable.

Dependence on the learning set

In view of the fact that small variations of the trade-offs may even result in changes in the ranking of the top alternatives, one may question the influence of the selection of the learning set. Other choices of a model, albeit compatible with the declared preferences on the learning set, may lead to variations in the rankings of the remaining alternatives. In the case under examination, the top two alternatives were chosen to be in the learning set and hence are constrained to appear in the correct order in the output of Prefcalc.
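Such sensitivity analyses are easy to automate: re-rank the alternatives under perturbed trade-offs and flag reversals. In the sketch below only the two trade-off vectors come from the text; the single-attribute values of the two fictitious cars are made up so as to exhibit a reversal.

```python
def rank(alternatives, weights):
    """Order alternative names by decreasing weighted additive value."""
    def score(name):
        return sum(w * v for w, v in zip(weights, alternatives[name]))
    return sorted(alternatives, key=score, reverse=True)

# Hypothetical normalised single-attribute values on the 5 criteria
# (cost, acceleration, pick-up, brakes, road-holding).
cars = {
    "A": (1.0, 0.0, 0.0, 0.0, 0.0),   # strong on cost only
    "B": (0.0, 1.0, 1.0, 0.5, 0.5),   # strong on performance
}

before = rank(cars, (0.43, 0.23, 0.13, 0.10, 0.10))
after = rank(cars, (0.45, 0.21, 0.12, 0.11, 0.11))
# Under the first vector B comes first (.46 vs .43); nudging every
# trade-off by at most .02 puts A first (.45 vs .44): a rank reversal.
```

Running such a loop over a grid of perturbations around the fitted trade-offs gives a direct picture of which parts of the ranking are stable.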

What would have happened if the learning set had been different? Let us take another subset of 5 cars and declare preferences that agree with the ranking validated by Thierry (Table 6.10), substituting the top 2 cars (Peugeot 309/16V, Nissan Sunny) by the Renault 19 and two cars in the middle segment of the ranking, such as the Mitsubishi Colt. The vector of trade-offs then becomes (.53, .08, .08, .06, .25), and the top four in the new ranking are Renault 19 (1), Peugeot 309 (2), Peugeot 309/16V (3) and Nissan Sunny (4). Clearly, a stronger emphasis has been put on cost and safety (brakes and road-holding) and much less on performance (acceleration and pick-up). The Renault 19 appears as an outsider and is heading the race mainly due to its excellent road-holding; the Honda Civic is relegated to the 12th position, receding due to its higher cost and its weakness on road-holding.

Further experiments have been performed, reintroducing in turn one of the 4 top cars and removing the Renault 19. One can be relatively satisfied with the results, since the top 3 cars are usually well-ranked and three of the former top cars remain in the top four. The ranking of the Honda Civic is much more unstable, and it is not difficult to understand why (weakness on road-holding and relatively high cost). Of course, for the rest of the cars, huge variations may appear in their ranking, but one is usually more interested in the top ranked alternatives.

From a general point of view, the value of the trade-offs may depend drastically on the learning set. Some sort of preliminary analysis of the user's preferences can help to choose the learning set, or to understand the variations in the ranking and in the trade-offs a posteriori. One may expect that the decision-maker will naturally choose alternatives that he considers as clearly distinct from one another as members of the learning set; the analyst might alternatively instruct the decision-maker to do so. In the choice of the present learning set, the analyst's instruction of selecting alternatives that are as contrasted as possible is in good agreement with the implementation options. Clearly, however, the option implemented in the mathematical programming model to reduce the indeterminacies (essentially, choosing to maximise the contrast between the evaluations of the alternatives in the learning set) is not aimed at being as insensitive as possible with regard to the selection of the learning set. It should be noted that stability, which may be a desirable property in the perspective of uncovering an objective model of preference measurement, is not necessarily a relevant requirement when the goal is to exploit partial available information. In a learning process, typically, information is incomplete; it must be decided how to complement the available facts by some arbitrary default assumptions, and the information should then be collected while taking the assumptions made into account.

Other options could be experimentally investigated in order to see whether some could consistently yield more stable evaluations.

6.3.4 Conclusion

This section has been devoted to the construction of a formal model that represents preferences on a numerical scale. Such a model can only be expected to exist when preferences satisfy rather demanding hypotheses. This means that the model may be inadequate not only because the hypotheses might not be fulfilled, but also because the respondents might feel unable to answer the questions, or because their answers might not be reliable. Direct assessment of multi-attribute value functions is thus a narrow road between the practical problem of obtaining reliable answers to difficult questions and the risks involved in building a model on answers to simpler but ambiguous questions. Indirect methods based on exploiting partial information and extrapolating it (in a recursive validation process) may help when the information is not available in explicit form.

The additive multi-attribute value model is rewarding, since it is directly interpretable in terms of decision. Once the hypotheses of the model have been accepted or proved valid in a decision context, and provided the process of elicitation of the various parameters of the model has been conducted correctly, the decision becomes transparent: the best decision is the one the model values most (provided the imprecisions in the establishment of the model and the uncertainties in the evaluation information allow one to discriminate at least between the top alternatives). The counterpart of the clear-cut character of the conclusions that can be drawn from the model is that establishing the model requires a lot of information, and of a very precise and particular type. There is at least one additional advantage to theoretically well-founded decision models: when established and accepted by the stake-holders, such models can be used to legitimate a decision to persons that have not been involved in the decision making process, since they rely on firm theoretical bases, which is undoubtedly part of the intellectual appeal of the method. In conclusion, it remains that the quality of the information is crucial and that a lot of it is needed. In the next section, we shall explore a very different formal approach that may be less demanding with regard to the precision of the information, but that also provides less conclusive outputs.

6.4 Outranking methods

6.4.1 Condorcet-like procedures in decision analysis

Is there any alternative way of dealing with multiple criteria evaluation in view of a decision, other than the one described above, which builds a one-dimensional synthetic evaluation on some sort of super-scale? To answer this question (positively), inspiration can be gained from the voting procedures discussed in Chapter 2 (see also Vansnick (1986)). Suppose that each voter expresses his preferences through a complete ranking of the candidates. With Borda's method, each candidate is assigned a rank for each of the voters (rank 1 if the candidate is ranked first by a voter, rank 2 if he is ranked second, and so on); the Borda score of a candidate is the sum of the ranks assigned to him by the voters, and the winner is the candidate with the smallest Borda score.

The candidate with the smallest Borda score (the sum of the ranks assigned by the voters) is then elected. In this method, all criteria-voters have equal weight, and coding the position of the candidate in a voter's preference by its rank number looks like a form of evaluation; Borda's method can thus be seen as a method of construction of a synthetic evaluation of the alternatives in multiple criteria decision analysis. Condorcet's method, in contrast, consists of a kind of tournament where all candidates compare in pairwise "contests": a candidate is declared to be preferred to another according to a majority rule, i.e. if more voters rank him before the latter than the converse. The result of such a procedure is a preference relation on the set of candidates that in general is neither transitive nor acyclic; a further step is thus needed in order to exploit this relation, in view of the selection of one or several candidates or in view of ranking all the candidates.

This idea can of course be transposed in the multiple criteria decision context, the points of view corresponding to the voters and the alternatives to the candidates. Below, we show how the problems raised by a direct transposition rather naturally lead to elementary "outranking methods", using Thierry's case again for illustrative purpose. For each pair of cars a and b, we count the number of criteria according to which a is at least as good as b. Note that we might have alternatively decided to count the criteria for which a is better than b, not taking into account criteria for which a and b are tied. This yields the matrix given in Table 6.11; since there are 5 criteria, the elements of the matrix are integers ranging from 0 to 5.

Cars  1  2  3  4  5  6  7  8  9 10 11 12 13 14
 1    5  3  1  2  2  3  3  2  3  2  2  2  2  3
 2    2  5  2  4  2  3  2  3  3  1  1  1  4  3
 3    4  4  5  4  4  4  4  4  4  3  2  3  5  4
 4    3  1  1  5  1  3  1  2  1  2  1  1  4  2
 5    3  3  1  5  5  3  2  2  2  3  1  1  5  2
 6    2  2  1  2  2  5  2  2  2  2  1  1  3  2
 7    3  3  1  4  4  4  5  3  4  2  2  2  4  4
 8    3  2  1  4  4  4  3  5  3  2  0  2  4  3
 9    2  3  1  4  4  3  1  2  5  2  1  2  4  3
10    4  4  2  3  2  3  2  3  3  5  3  2  4  3
11    4  4  3  4  4  4  4  5  4  3  5  4  4  5
12    4  4  2  4  4  4  4  4  3  4  3  5  5  4
13    3  2  0  2  1  2  1  2  1  1  1  0  5  1
14    2  3  1  3  3  3  1  3  3  2  0  1  4  5

Table 6.11: Number of criteria in favour of a when compared to b, for all pairs of cars a, b in the "Choosing a car" problem.

What we could call the "Condorcet preference relation" is obtained by determining, for each pair of alternatives a, b, whether or not there is a (simple) majority of criteria for which a is at least as good as b. Since there are 5 criteria, the majority is reached as soon as at least 3 criteria favour alternative a when compared to b. The preference matrix is thus obtained by substituting 1 for any number larger than or equal to 3 in Table 6.11 and 0 for any number smaller than 3, yielding the relation described by the 0-1 matrix in Table 6.12.
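The counting step just described can be sketched in a few lines of Python. This is not code from the book: the evaluations below are invented for illustration (three alternatives, five criteria to be maximised); only the counting rule itself comes from the text.

```python
# Counting, for each ordered pair (a, b), the number of criteria on
# which a is at least as good as b, as in Table 6.11.
# Hypothetical evaluations: rows are alternatives, columns criteria.
evaluations = {
    "a": [3, 2, 5, 4, 1],
    "b": [2, 2, 4, 5, 3],
    "c": [4, 1, 5, 2, 2],
}

def criteria_in_favour(g_a, g_b):
    """Number of criteria i such that g_i(a) >= g_i(b)."""
    return sum(1 for x, y in zip(g_a, g_b) if x >= y)

counts = {
    (a, b): criteria_in_favour(evaluations[a], evaluations[b])
    for a in evaluations
    for b in evaluations
}
```

Note that, since ties count in favour of both alternatives, counts[(a, b)] + counts[(b, a)] always equals 5 plus the number of ties, which is why the diagonal of Table 6.11 is filled with 5s.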

Cars  1  2  3  4  5  6  7  8  9 10 11 12 13 14
 1    1  1  0  0  0  1  1  0  1  0  0  0  0  1
 2    0  1  0  1  0  1  0  1  1  0  0  0  1  1
 3    1  1  1  1  1  1  1  1  1  1  0  1  1  1
 4    1  0  0  1  0  1  0  0  0  0  0  0  1  0
 5    1  1  0  1  1  1  0  0  0  1  0  0  1  0
 6    0  0  0  0  0  1  0  0  0  0  0  0  1  0
 7    1  1  0  1  1  1  1  1  1  1  0  0  1  1
 8    1  0  0  1  1  1  1  1  1  0  0  0  1  1
 9    0  1  0  1  1  1  0  0  1  0  0  0  1  1
10    1  1  0  1  0  1  1  1  1  1  1  0  1  1
11    1  1  1  1  1  1  1  1  1  1  1  1  1  1
12    1  1  0  1  1  1  1  1  1  1  0  1  1  1
13    1  0  0  0  0  0  0  0  0  0  0  0  1  0
14    0  1  0  1  1  1  0  1  1  0  0  0  1  1

Table 6.12: Condorcet preference relation for the "Choosing a car" problem. A "1" at the intersection of the a row and the b column means that a is rated not lower than b on at least 3 criteria.

Note that a criterion counts both in favour of a and in favour of b only if a and b are tied on that criterion; the relation is reflexive, since any alternative is at least as good as itself along all criteria.

Majority rule and cycles

How can we possibly obtain something from this matrix in view of our goal of selecting the best car? Obviously it is not straightforward to suggest a good choice on the basis of such a relation, since one can find 3 criteria (out of 5) saying that 1 is at least as good as 7, 3 (possibly different) criteria saying that 7 is at least as good as 10, and so on, and finally 3 criteria saying that 13 is at least as good as 1. Indeed, although it is not immediately apparent, this relation has cycles, and even cycles that go through all the alternatives; the chain just alluded to, starting and ending at car 1, is an instance of such a cycle.

A closer look at the preference relation nevertheless reveals that some alternatives are preferred to most others, while some are preferred to only a few. Among the former are alternatives 11 (preferred to all), 3 (preferred to all but one), 12 (preferred to all but 2), and 7 and 10 (preferred to all but 3). The same alternatives appear as seldom beaten: 3 and 11 (only once, excluding themselves), 12 (twice); then come 10 (5 times) and 7 (6 times).

To make things appear more clearly, by avoiding cycles as much as possible, one might decide to impose more demanding levels of majority in the definition of a preference relation. We might require, for instance, that an alternative be rated at least as good as another on at least 4 criteria. The new preference relation is shown in Table 6.13; all cycles in the previous relation disappeared.
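Cutting the count matrix at a majority level, and then counting how often each alternative beats or is beaten, can be sketched as follows (a minimal sketch on a hypothetical 3-alternative count matrix, not the book's data):

```python
# Cutting the "number of criteria in favour" matrix at a majority
# level yields a 0-1 Condorcet-like preference relation; we then
# count beaten and beating alternatives, as done for Table 6.12.
counts = {  # hypothetical counts out of 5 criteria
    ("a", "b"): 3, ("b", "a"): 3,
    ("a", "c"): 4, ("c", "a"): 2,
    ("b", "c"): 3, ("c", "b"): 2,
}
items = ["a", "b", "c"]
majority = 3  # simple majority out of 5 criteria

prefers = {(x, y) for (x, y), c in counts.items() if c >= majority}
beats = {x: sum((x, y) in prefers for y in items if y != x) for x in items}
beaten = {x: sum((y, x) in prefers for y in items if y != x) for x in items}
```

Raising `majority` from 3 to 4 reproduces the move from Table 6.12 to Table 6.13: arcs supported by exactly 3 criteria disappear, and with them, hopefully, the cycles.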

Cars  1  2  3  4  5  6  7  8  9 10 11 12 13 14
 1    1  0  0  0  0  0  0  0  0  0  0  0  0  0
 2    0  1  0  1  0  0  0  0  0  0  0  0  1  0
 3    1  1  1  1  1  1  1  1  1  0  0  0  1  1
 4    0  0  0  1  0  0  0  0  0  0  0  0  1  0
 5    0  0  0  1  1  0  0  0  0  0  0  0  1  0
 6    0  0  0  0  0  1  0  0  0  0  0  0  0  0
 7    0  0  0  0  0  0  1  0  1  0  0  0  1  1
 8    0  0  0  1  1  0  0  1  0  0  0  0  1  0
 9    0  0  0  0  0  0  0  0  1  0  0  0  1  0
10    1  1  0  0  0  0  0  0  0  1  0  0  1  0
11    1  1  0  1  1  0  1  1  1  0  1  1  1  1
12    1  1  0  1  1  1  1  1  0  1  0  1  1  1
13    0  0  0  0  0  0  0  0  0  0  0  0  1  0
14    0  0  0  0  0  0  0  0  0  0  0  0  0  1

Table 6.13: Condorcet preference relation for the "Choosing a car" problem. A "1" at the intersection of the a row and the b column means that a is rated not lower than b on at least 4 criteria.

When ranking the alternatives by the number of those they beat (i.e. the number of cars they are rated at least as good as on 4 criteria or more), one sees that 3, 11 and 12 come in the first position (they are preferred to 10 other cars); then there is a big gap, after which come 7, 8 and 10, that beat only 3 other cars. Conversely, when ranking by the number of those that beat them, there are in the present case two non-beaten cars, 3 and 11; then come 10 and 12 (beaten by one car), while 7 is beaten by 3 cars. So we see that the simple approach that was used essentially makes the same cars emerge as the methods used so far.

There are at least two radical differences, however, between approaches based on the weighted sum (or some more sophisticated way of assessing each alternative by a single number that synthesises all the criteria values) and the present one. One is that all criteria have been considered equally important; it is possible, however, to take information on the relative importance of the criteria into account, as will be seen in Section 6.4.3. The second difference is more in the nature of the type of approach: the most striking point is that the size of the differences in the evaluations of a and b on the various criteria does not matter; only the signs of those differences do. In other words, had the available information been rankings of the cars with respect to each criterion (instead of numeric evaluations), the result of the "Condorcet" procedure would have been exactly the same. More precisely, suppose that all that we know (or that Thierry considers relevant in terms of preferences) about the cost criterion is the ordering of the cars according to the estimated cost.

Suppose that similar hypotheses are made for the other 4 criteria: if this were the case, we would have obtained the same matrices as in Tables 6.12 and 6.13.

Of course, neglecting the size of the differences for a criterion such as cost may appear as misusing the available information. There are, however, at least two considerations that could mitigate this commonsense reaction:

- the assessments for the cars on the cost criterion are rather rough estimations of an expected cost (see Section 6.1.1); in particular, it is presumed that on average the lifetimes of all alternatives are equal. Is it reasonable in those circumstances to rely on precise values of differences of these estimations to select the "best" alternative?
- estimations of cost, even reliable ones, are not necessarily related with preferences on the cost criterion in a simple way.

Such issues were discussed extensively in Section 6.1; the whole analysis carried out there was aimed towards the construction of a multiple criteria value function. The many methods that can be used to build a value function by questioning a decision-maker about his preferences may well fail, however. Let us list a few reasons for the possible failure of these methods:

- time pressure may be so intense that there is not enough time available to engage in the lengthy elicitation process of a multiple criteria value function;
- it may be that the importance of the decision to be made does not justify such an effort;
- the decision-maker might not know how to answer the questions, or might try to answer but prove inconsistent, or might feel discomfort in being forced to give precise answers where things are vague to him;
- in case of group decision, the analyst may be unable to make the various decision-makers agree on the answers to be given to some of the questions raised in the elicitation process.

In such cases it may be inappropriate or inefficient to try building a value function, and other approaches may be preferred.

The point about differences in evaluations appears perhaps better if we consider the more artificial scales associated with criteria 4 and 5 (see Section 6.1.1 concerning the construction of these scales). Take, for instance, criterion 4 (Brakes). Does the difference between the levels 2.33 and 2.66 have a quantitative meaning? If it does, is this difference, in terms of preferences, more than, less than or equal to the difference between the levels 1.66 and 2? How much would you accept to pay (in terms of criterion 1) to raise the value for criterion 4 from 2.33 to 2.66, or from 1.33 to 2.33? Of course, the questions raised for eliciting value functions are more indirect, but they still require a precise perception of the meaning of the levels on the scale of criterion 4 by the decision-maker. Such a perception can only be obtained

by having experienced the braking behaviour of specific cars rated at the various levels of the scale; but such knowledge cannot be expected from a decision-maker (otherwise there would be no room on the marketplace for all the magazines that evaluate goods in order to help consumers spend their money while making the best choice). Also remember that braking performance has been described by the average of 3 indices evaluating aspects of the cars' braking behaviour; this does not favour a deep intuitive perception of what the levels on that scale may really mean. So, one has to admit that in many cases the definition of the levels on scales is quite far from precise in quantitative terms, and it may be "hygienic" not to use the fallacious power of numbers, in order not to take into account differences that are only due to the irrelevant precision of numbers.

6.4.2 A simple outranking method

The Condorcet idea for a voting procedure has been transposed in decision analysis under the name of outranking methods. Such a transposition takes the peculiarities of the decision analysis context into account, in particular the fact that criteria may be perceived as unequally important. Not that these methods are purely ordinal: differences between levels on a scale are carefully categorised, yet usually in a coarse-grained fashion. Additional elements, such as the notion of discordance, have also been added.

The principle of these methods is as follows. Each pair of alternatives is considered in turn, independently of the other alternatives. When looking at alternatives a and b, it is claimed that a "outranks" b if there are enough arguments to decide that a is at least as good as b, while there is no essential reason to refute that statement (Roy (1974), cited by Vincke (1992), p. 58). Note that taking strong arguments against declaring a preference into account is typically what is called "discordance" and is original with respect to the simple Condorcet rule.

Such an approach has been operationalised through various procedures, particularly the family of ELECTRE methods associated with the name of B. Roy (for an overview of outranking methods, the reader is referred to the books by Vincke (1992) and Roy and Bouyssou (1993)). Our goal is not to make a survey of all outranking methods; we just want to present the basic ideas of such methods and illustrate some problems they may raise. Below, we discuss an application of the simplest of these methods, ELECTRE I, to Thierry's case; we shall then show how the fundamental ideas of ELECTRE I can be sophisticated. ELECTRE I is a tool designed to be used in the context of a choice decision problem: it builds up a set of which the best alternative, according to the decision-maker's preferences, should be a member. Let us emphasise that this set cannot be described as the set of best alternatives, not even a set of good alternatives, but just a set that contains the "best" alternatives.

Acyclicity and completeness issues

As a preamble, it may be useful to emphasise the fact that outranking methods (and more generally methods based on pairwise comparisons) do not generally yield preferences that are transitive (not even acyclic). Since the hypotheses of Arrow's theorem can be re-formulated to be relevant in the framework of multiple criteria decision analysis (through the correspondence candidate-alternative, voter-criterion; see also Bouyssou (1992) and Perny (1992)), it is no wonder that methods based on comparisons of alternatives by pairs, independently of the other alternatives, will seldom directly yield a ranking of the alternatives. This point was already made in Chapter 2 about Condorcet's method. Contradictions are reflected either in cycles (a outranks b that outranks c that ... outranks a) or in incomparabilities (neither a outranks b nor the opposite). Once the outranking relation has been constructed, the job of suggesting a decision is thus not straightforward.

Let us emphasise that the lack of transitivity or of completeness, although raising operational problems, may be viewed not as a weakness but rather as faithfully reflecting preferences as they can be perceived at the end of the study. Defenders of the approach support the idea that forcing preferences to be expressed in the format of a complete ranking is in general too restrictive, and there is experimental evidence that backs their viewpoint (Tversky (1969), Fishburn (1991)). Explicit recognition that some alternatives are incomparable may be an important piece of information for the decision-maker. In this conception, the outranking relation should be interpreted as what is clear-cut in the preferences of the decision-maker, something like the surest and most stable expression of a complex, vague and evolving object that is named, for simplicity, "the preferences of the decision-maker". The pairs of alternatives that belong to the outranking relation are normally those between which the preference is established with a high degree of confidence.

The approach could be called constructive. In this approach very few hypotheses (like rationality hypotheses) are made on preferences; one may even doubt that preferences pre-exist the process from which they emerge. The analysis of a decision problem is conceived as an informational process, and it has many features in common with a learning process. In contrast with most artificial intelligence practice, however, preferences are not simply described through rules extracted from partial information obtained on a learning set: the model of preferences is built explicitly and formally, carefully, prudently and interactively, as repeatedly stressed in the writings of B. Roy. Models are built that reflect, to some extent, the way of thinking, the feelings and the values of a decision-maker; the concern is not making a decision but helping a decision-maker to make up his mind, helping him to understand a decision problem while taking his own values into account in the modelling of the decision situation. For more about the constructive approach, including comparisons with the classical normative and descriptive approaches (Bell, Raiffa and Tversky (1988)), the reader is referred to Roy (1993).

A phase of exploitation of the outranking relation is needed in order to provide the decision-maker with information more

directly interpretable in terms of a decision. Such a two-stage process offers the advantage of good control on the transformation of the multi-dimensional information into a model of the decision-maker's preferences, including a certain degree of inconsistency and incompleteness.

6.4.3 Using ELECTRE I on the case

We briefly review the principles of the ELECTRE I method. For each pair of alternatives a and b, the so-called concordance index is computed; it measures the strength of the coalition of criteria that support the idea that a is at least as good as b. The strength of a coalition is just the sum of the weights associated to the criteria that constitute the coalition (the notion of weights will be discussed below). If all criteria are equally important, the concordance index is proportional to the number of criteria in favour of a as compared to b, as in the Condorcet-like method discussed above. The level from which a coalition is judged strong enough is determined by the so-called concordance threshold. In the Condorcet voting method with the simple majority rule, this threshold is just half the number of criteria; in general, one will choose a number above half the sum of the weights of all criteria.

Another feature that contrasts ELECTRE with pure Condorcet, but also with purely ordinal methods, is that some large differences in evaluation, when in disfavour of a, might be pinpointed as preventing a from outranking b. One therefore checks whether there is any criterion for which b is so much better than a that it would make it meaningless for a to be declared preferred overall to b; if this happens for at least one criterion, one says that there is a veto to the preference of a over b. If the concordance index passes some threshold (the "concordance threshold") and there is no veto of b against a, then a outranks b.

This process yields a binary relation on the set of alternatives, which may have cycles and be incomplete (neither a outranks b nor the opposite). Note that the outranking relation is not asymmetric in general: it may happen that a outranks b and that b outranks a. In order to propose a set of alternatives of particular interest to the decision-maker, from which the best compromise alternative should emerge, one extracts the kernel of the graph of the outranking relation, after having reduced the cycles: all alternatives in a cycle are considered to be equivalent and are substituted by a unique representative node. In a graph without cycles, a unique kernel always exists. The kernel is defined as a subset of alternatives that do not outrank one another and such that each alternative not in the kernel is outranked by at least one alternative in the kernel. In particular, all non-outranked alternatives belong to the kernel, and an alternative incomparable to all others is always in the kernel. It should be emphasised that all alternatives in the kernel are not necessarily good candidates for selection: alternatives in the kernel may be beaten by alternatives not in the kernel. Still, the kernel may be viewed as a set of alternatives on which the decision-maker's attention should be focused.

In order to apply the method to Thierry's case, we successively have to determine
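The kernel extraction just defined can be sketched in Python for an acyclic relation. This is only an illustrative sketch on a hypothetical five-node relation (the uniqueness of the kernel in an acyclic graph is what makes the simple pass below sufficient):

```python
# Kernel of an acyclic outranking relation: a set K with no arcs
# inside K, such that every node outside K is outranked by a node
# of K. Nodes are visited so that all outrankers come first.
from graphlib import TopologicalSorter

# hypothetical arcs (x, y), meaning "x outranks y"
arcs = {("c", "a"), ("c", "d"), ("e", "d"), ("a", "b"), ("d", "b")}
nodes = {"a", "b", "c", "d", "e"}

outrankers = {n: {x for (x, y) in arcs if y == n} for n in nodes}

kernel = set()
for n in TopologicalSorter(outrankers).static_order():
    if not outrankers[n] & kernel:  # no kernel member outranks n
        kernel.add(n)
```

Non-outranked nodes (here "c" and "e") enter the kernel first, as the text requires; note that "b" also enters the kernel even though it is beaten, because none of its outrankers made it into the kernel.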

e. i.5 .e. 0. . the evaluation of alternative a for criterion i (which is assumed to be maximised.5). the weight pi would be added when the converse inequality holds.5.1. b) = i:gi (a)≥gi (b) pi where the pi ’s are normalised weights that reﬂect the relative importance of the criteria. gi (a) denotes. So. there is no need for looking at the other criteria. Using these weights in outranking methods would lead to an overwhelming predominance of criteria 2 (Acceleration) and 3 (Pick-up).1). Dividing the weights by their sum (= 5). they are completely independent of the scales for the criteria. Such a feature of the preference structure could indeed be reﬂected . . 2. A practical consequence is that one may question the decision-maker in terms of relative importance of the criteria without reference to the scales on which the evaluations for the various viewpoints are expressed. its weight now enters into the weight of the coalition (additively) in favour of a.4.2. It was never Thierry’s intention that once a car is better on criteria 2 and 3. as often as the evaluation of a passes or equals that of b on a criterion. 0. i. if it is bad on the braking or road-holding criterion. . that a fast and powerful car is useless. b). that measures the coalition of criteria along which a is at least as good as b may be computed by the formula (6.2.132 CHAPTER 6. which are also linked since they are facets of the cars performance. Note that these were not obtained through questioning on the relative importance of criteria but in the context of the weighted sum with Thierry bearing re-scaled evaluations in mind: the evaluations on each criterion had been divided by the maximal value gi. This does not mean however that they are independent of the method and that one could use values given spontaneously by the decision-maker or through questioning in terms of “importance” without care. 
it is impossible for a car to be outranked when it is better on criteria 2 and 3 even if all other criteria are in favour of an opponent. It is important to bear in mind how the weights will be used. 1.12) c(a. let us ﬁrst consider those suggested by Thierry in section 6. without reference to the evaluations as is done in Saaty’s procedure. the whole initial analysis shows on the contrary. In the context of outranking. . A criterion can count both for a against b and the opposite if and only if gi (a) = gi (b). the weights are not trade-oﬀs.2. yields the normalised weights (.2. if it were to be minimised.max attained for that criterion. in this case to measure the strength of coalitions in pairwise comparisons and decide on the preference only on the basis of the coalitions. COMPARING ON SEVERAL ATTRIBUTES • weights for the criteria • a concordance threshold • ordered pairs of evaluations that lead to a veto (and this for every criterion) Evaluating coalitions of criteria The concordance index c(a. as usual. (1. gi (a) ≤ gi (b)). With such weights and a concordance threshold of at least . for instance. To be more speciﬁc and contrast the meaning of the weights from those used in weighted sums.
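Formula (6.12) can be sketched directly in Python. The weights below are the ones chosen later in this section, proportional to (10, 8, 6, 6, 6); the two evaluation vectors are hypothetical:

```python
# Weighted concordance index, formula (6.12): total weight of the
# criteria on which a is at least as good as b.
weights = [10 / 36, 8 / 36, 6 / 36, 6 / 36, 6 / 36]

def concordance(g_a, g_b, p):
    """c(a, b) = sum of p_i over criteria i with g_i(a) >= g_i(b)."""
    return sum(p_i for p_i, x, y in zip(p, g_a, g_b) if x >= y)

g_a = [3, 2, 5, 4, 1]  # hypothetical evaluations, all maximised
g_b = [2, 2, 4, 5, 3]

c_ab = concordance(g_a, g_b, weights)  # criteria 1, 2, 3 favour a
c_ba = concordance(g_b, g_a, weights)  # criteria 2, 4, 5 favour b
```

As in the equal-weight case, c(a, b) + c(b, a) equals 1 plus the weight of the tied criteria (here criterion 2, counted on both sides).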

through the use of vetoes, but only in a negative manner: not by allowing a safe car to outrank a powerful one, but by removing the outranking of a safe car by a powerful one.

As an additional conclusion, it makes no sense in principle to use the weights initially provided by Thierry as coefficients measuring the importance of the criteria in an outranking method. To make this clearer, consider the following reformulation of the condition under which a is preferred to b in the weighted sum model (a similar formulation is straightforward in the additive value model):

(6.13)    a ≻ b  iff  Σ_{i=1}^{n} k_i × (g_i(a) − g_i(b)) ≥ 0.

If a is slightly better than b on a point of view i, the influence of this fact in the comparison between a and b is reflected by the term k_i × (g_i(a) − g_i(b)), which is presumably small: in a weighted sum, even important criteria count for little in pairwise comparisons when the differences between the evaluations of the alternatives are small enough. On the contrary, in outranking methods, when a is better than b on some criterion, the full weight of the criterion counts in favour of a, whether a is slightly or by far better than b. Hence, weights that are appropriate in a weighted sum, where they multiply the evaluations (or re-coded evaluations), need not be appropriate for measuring the strength of coalitions.

If we nevertheless try to use such weights, we might consider the weights used with the normalised criteria. Now look at the weights obtained through Saaty's questioning procedure in terms of "importance". Using these weights for measuring the strength of coalitions does not seem appropriate either, since criteria 1 and 2's predominance is too strong (joint weight = .35 + .24 = .59), which Thierry may consider unfair. There is another reasonable normalisation of the criteria that does not fix the zero of the scale but rather maps the smallest attained value g_{i,min} onto 0 and the largest, g_{i,max}, onto 1; transforming the weights accordingly (i.e. multiplying them by the inverse of the range of the values for the corresponding criterion prior to the transformation), one would obtain yet another weight vector. With these values as coefficients of importance, the "safety coalition" (criteria 4 and 5, weight = .45) becomes more important than the "performance coalition" (criteria 2 and 3). One may note that the values of the weights vary tremendously depending on the type of normalisation applied.

Since the weights in a weighted sum depend on the scaling of each criterion, and since there is no acknowledged standard scaling, one is inclined, due to the all-or-nothing character of the weights in ELECTRE I, to choose less contrasted weights than those examined above. Although procedures have been proposed to elicit such weights (see Mousseau (1993), Roy and Bouyssou (1993)), we will just choose a set of weights in an intuitive manner: let us take weights proportional to (10, 8, 6, 6, 6) as reflecting the relative importance of the criteria. At least the ordering of the values seems to be
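The contrast drawn by formula (6.13) can be made concrete with a small numerical sketch (hypothetical weights and evaluations, not Thierry's): in a weighted sum a tiny disadvantage contributes almost nothing, while in a concordance computation the full weight of the criterion switches sides.

```python
# Contrasting formula (6.13) with the concordance rule (6.12).
k = [0.4, 0.6]  # hypothetical weights for two criteria

def weighted_sum_margin(g_a, g_b):
    """Left-hand side of (6.13): sum of k_i * (g_i(a) - g_i(b))."""
    return sum(k_i * (x - y) for k_i, x, y in zip(k, g_a, g_b))

def concordance_weight(g_a, g_b):
    """Weight of the coalition of criteria where a is at least as good."""
    return sum(k_i for k_i, x, y in zip(k, g_a, g_b) if x >= y)

g_a, g_b = [10.0, 5.1], [0.0, 5.2]
margin = weighted_sum_margin(g_a, g_b)  # large: 0.4*10 + 0.6*(-0.1)
c_ab = concordance_weight(g_a, g_b)     # only criterion 1 counts: 0.4
c_ba = concordance_weight(g_b, g_a)     # only criterion 2 counts: 0.6
```

Here a is overwhelmingly better in the weighted sum, yet the concordance in favour of b exceeds that in favour of a, because b's negligible advantage of 0.1 on the heavier criterion counts with its full weight.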

in agreement with what is known about Thierry's perceptions. Normalising the weight vector yields (.27, .22, .17, .17, .17), after rounding in such a way that the normalised weights sum up to 1. The weights of the three groups of criteria are then rather balanced: .27 for cost, .39 for performance and .34 for safety. The concordance matrix c(a, b) computed with these weights is shown in Table 6.14.

[Table 6.14: Concordance index (rounded to two decimals) for the "Choosing a car" problem.]

This tells us something about coalitions that we did not know. In increasing order, the "lightest" coalition of three criteria involves criteria 3, 4 and 5; then we have three different coalitions weighing .56 (two of the criteria 3, 4, 5 with criterion 2), three coalitions weighing .61 (two of the criteria 3, 4, 5 with criterion 1), and finally three coalitions weighing .66 (one of the three criteria 3, 4, 5 together with criteria 1 and 2). The "lightest" 4-coalition weighs .73, and there is only one value of the concordance index between .66 and .73, namely .72.

Determining which coalitions are "strong enough"

At this stage we have to build the concordance relation, i.e. a binary relation obtained through deciding which coalitions in Table 6.14 are strong enough; this is done by selecting a concordance threshold above which we consider that they are. Previous analysis with equal weights (see Section 6.4.1) showed that the relation in Table 6.12, obtained through looking at concordant coalitions involving at least three criteria, had a cycle passing through all alternatives. So cutting between .61 and .72 will yield the relation in Table 6.13, which we have already looked at, while cutting above .60 only keeps the 3-coalitions that contain criterion 1, together with the coalitions involving at least 4 criteria. The new thing that we can learn is the following: the relation obtained by looking at coalitions of at least 4 criteria, plus coalitions of three that involve criterion 1, has a big cycle; with the weights we have now chosen, we obtain a concordance relation with a cycle passing through all alternatives but one, which is Car 3. When cutting the concordance index above that level (at .62, say), there is no longer a cycle; obviously, a poorer relation (i.e. with fewer arcs) is obtained when cutting above a higher threshold. In the sequel we will
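Choosing a concordance threshold and checking whether the cut relation is acyclic can be sketched as follows (a hypothetical 3-alternative concordance matrix; the threshold values merely echo the kind of cuts discussed above):

```python
# Cutting a concordance matrix at a threshold and testing the
# resulting relation for cycles.
from graphlib import TopologicalSorter, CycleError

items = ["a", "b", "c"]
c = {("a", "b"): 0.73, ("b", "a"): 0.44,
     ("b", "c"): 0.61, ("c", "b"): 0.56,
     ("c", "a"): 0.66, ("a", "c"): 0.51}

def cut(matrix, threshold):
    """Keep the pairs whose concordance reaches the threshold."""
    return {pair for pair, v in matrix.items() if v >= threshold}

def is_acyclic(relation):
    preds = {n: {x for (x, y) in relation if y == n} for n in items}
    try:
        list(TopologicalSorter(preds).static_order())
        return True
    except CycleError:
        return False

low = cut(c, 0.60)   # keeps a->b, b->c, c->a: a cycle
high = cut(c, 0.65)  # keeps a->b and c->a only: acyclic
```

Raising the threshold removes arcs, so beyond some borderline value the relation becomes acyclic but increasingly poor, exactly the trade-off discussed in the text.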

concentrate on two values of the concordance threshold, .60 and .65, that are on both sides of the borderline separating concordance relations with and without cycles (above these values, concordance relations tend to become increasingly poor, i.e. less and less discriminating). In the above presentation the weights sum up to 1; note that multiplying all the weights by a positive number would yield the same concordance relations, provided the concordance threshold is multiplied by the same factor. In other words, the weights in ELECTRE I may be considered as being assessed on a ratio scale, i.e. up to a positive scaling factor.

Supporting choice or ranking

Before studying discordance and veto, we show how a concordance relation, which is just an outranking relation without veto, can be used for supporting a choice or a ranking in a decision process. Introducing vetoes will just remove arcs from the concordance relation; the operations performed on the outranking relation during the exploitation phase are exactly those that are applied below to the concordance relation.

In view of supporting a choice process, the exploitation procedure of ELECTRE I firstly consists in reducing the cycles, which amounts to considering all alternatives in a cycle as equivalent. The kernel of the resulting acyclic relation is then searched for, and it is suggested that the kernel contains all alternatives on which the attention of the decision-maker should be focused. For instance, cutting the concordance relation of Table 6.14 at .60 yields a concordance relation with cycles involving all alternatives but Car 3; examples of (non-simple) cycles passing through all these alternatives can be exhibited. Reducing the cycles of this concordance relation results in considering two classes of equivalent alternatives: one class is composed of the single Car 3, while the other class comprises all the other alternatives. Obviously, reducing the cycles involves some drawbacks. Beside the fact that this partition is not very discriminating, it also considers as equivalent alternatives that are not in the same simple cycle: for example, Car 12, which beats almost all other alternatives in the cut at .60 of the concordance relation, would be considered as equivalent to Car 6, which beats almost no other car. The information on how the alternatives compare with respect to all others is completely lost.

For illustrative purposes, we consider the cut at level .65 of the concordance index, which is the largest acyclic concordance relation that can be obtained; this relation is shown in Table 6.15. Cars 3 and 11 are not outranked, and Car 10 is the only alternative that is not outranked either by Car 3 or by Car 11; the kernel is thus composed of cars 3, 10 and 11. This seems to be an interesting set in a choice process, in view of the analysis of the problem carried out so far.

Rankings of the alternatives may also be obtained from Table 6.15 in a rather simple manner: consider the alternatives either in decreasing order of the number of alternatives they beat in the concordance relation, or in increasing order of the number of alternatives by which they are beaten in the concordance relation.
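The cycle-reduction step (merging mutually reachable alternatives into one class) can be sketched as follows; the relation is hypothetical, and mutual reachability is tested naively, which is fine for the small instances considered here:

```python
# "Reducing the cycles": merge each set of mutually reachable
# alternatives (a strongly connected component) into a single class.
items = ["a", "b", "c", "d"]
relation = {("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")}  # a,b,c cycle

def reachable(rel, start):
    """All nodes reachable from start by following arcs."""
    seen, stack = set(), [start]
    while stack:
        x = stack.pop()
        for (u, v) in rel:
            if u == x and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

classes = []
for x in items:
    for cls in classes:
        y = next(iter(cls))
        if x in reachable(relation, y) and y in reachable(relation, x):
            cls.add(x)  # x and y lie on a common cycle
            break
    else:
        classes.append({x})
```

As the text warns, the partition can be coarse: one class absorbs the whole cycle {a, b, c}, losing the information on how its members compare individually with the rest.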

65). We observe that the usual group of “good” alternatives form the top two classes of these rankings. since the ranking is based on two cuts. the other. the corresponding rankings are respectively labelled “A” and “B” in Table 6. . it makes better use of the information contained in the concordance index.17 and concordance threshold . In the .16: Rankings obtained from counting how many alternatives are beaten (ranking “A”) or beat (ranking “B”) each alternative in the concordance relation (threshold . ELECTRE II.15 and ranking the alternatives accordingly (we do not count the 1’s on the diagonal since the coalition of criteria saying that an alternative is at least as good as itself always encompasses all criteria).136 Cars 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1 1 0 1 1 1 0 0 1 0 1 1 1 1 0 CHAPTER 6.60 cut corresponds to weak preference (or weak outranking) while the .16. one could consider that the . To some extent. 4 (8) 9 1. 6. the numbers between parentheses in the second row of ranking A (resp. with a strong preference threshold.65 Class A B 1 11 (11) 3.15: Concordance relation for the “Choosing a car” problem with weights .17. 10 (3) 2. ranking B) are the numbers of beaten (resp.28. 12 (10) 12 (1) 3 8 (7) 10 (2) 4 7 (6) 7. for instance in our case. . one linked with a weak preference threshold.65 cut corresponds to strong preference. 4 (2) 5 (6) 8 13. was designed for fulﬁlling this goal.22. 11 (0) 2 3. COMPARING ON SEVERAL ATTRIBUTES 2 0 1 1 0 1 0 0 0 0 1 1 1 0 0 3 0 0 1 0 0 0 0 0 0 0 0 0 0 0 4 0 1 1 1 1 0 1 1 1 0 1 1 0 0 5 0 0 1 0 1 0 1 1 1 0 1 1 0 0 6 0 0 1 0 0 1 1 1 0 0 1 1 0 0 7 0 0 1 0 0 0 1 0 0 0 1 1 0 0 8 0 0 1 0 0 0 0 1 0 0 1 1 0 0 9 0 0 1 0 0 0 1 1 1 0 1 0 0 0 10 0 0 0 0 1 0 0 0 0 1 0 1 0 0 11 0 0 0 0 0 0 0 0 0 0 1 0 0 0 12 0 0 0 0 0 0 0 0 0 0 1 1 0 0 13 0 1 1 1 1 0 1 1 1 1 1 1 1 1 14 0 0 1 0 0 0 1 1 0 0 1 1 0 1 Table 6. 6 (0) 13 11 Table 6. 14 (1) 1. beating) alternatives for each alternative of the same column in the ﬁrst row relation. 
There are more sophisticated ways of obtaining rankings from outranking relations, which we do not describe here.
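The simple counting procedure can be sketched as follows. This is our own illustration (not the book's code), assuming `matrix[i][j] == 1` means that alternative i is at least as good as alternative j at the chosen cut level:

```python
# Hypothetical sketch of rankings "A" and "B": count, for each alternative,
# how many others it beats (row sums) and how many beat it (column sums),
# ignoring the 1's on the diagonal.

def count_rankings(matrix):
    n = len(matrix)
    beats = [sum(matrix[i]) - matrix[i][i] for i in range(n)]
    beaten = [sum(row[j] for row in matrix) - matrix[j][j] for j in range(n)]
    ranking_a = sorted(range(n), key=lambda i: -beats[i])   # decreasing
    ranking_b = sorted(range(n), key=lambda i: beaten[i])   # increasing
    return beats, beaten, ranking_a, ranking_b
```

Applied to Table 6.15, the two sorted index lists reproduce the class orders shown in Table 6.16 (ties sharing a class).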

Thresholding

To this point, we treated the assessments of the alternatives as if they were ordinal data: both in the Condorcet-like method and in the basic ELECTRE I method (without veto), we could have obtained exactly the same results (kernel or ranking) by working with the orders induced on the set of alternatives by their evaluations on the various criteria. Moreover, in the above method, the information contained in the other cutting levels has been totally ignored, although the rankings obtained from them may not be identical; they may even differ significantly, as can be seen when deriving a ranking from the .60 cut by using the method we applied to the .65 cut. Does this mean that outranking methods are purely ordinal? Not exactly! More sophisticated outranking methods exploit information that is richer than purely ordinal, but not as demanding as cardinal. This is done through what we shall call "thresholding", which amounts to identifying intervals on the criteria scales that represent the minimal difference in evaluation above which a particular property holds.

Implicitly, we have considered previously that b was preferred to a on criterion i as soon as gi(b) ≥ gi(a): the size of the interval between the evaluations is not taken into account when deciding that a is overall preferred to b. For instance, it is not likely that Thierry would really mark a preference between cars 3 and 10 on the Cost criterion, since their estimated costs are within 10 € (see Table 6.2). Hence one should be prudent when deciding that a criterion is or is not an argument for saying that a is at least as good as b. In view of imprecision in the assessments, and since it is not clear for all criteria that there is a marked preference when the difference |gi(a) − gi(b)| is small, one may be led to consider a non-null threshold to model preference.

Suppose the assessment gi(b) of alternative b on criterion i is given and criterion i is to be maximised: from which value onwards will an alternative a be said to be at least as good as b? It is reasonable to determine a threshold function ti and to say that criterion i is such an argument as soon as gi(a) ≥ gi(b) + ti(gi(b)); implicitly, we had considered so far that ti(gi(b)) = 0. Note that, since we examine reasons for saying that a is at least as good as b, not for saying that a is (strictly) better than b, the function ti should be negatively valued. Determining such a threshold function is not necessarily an easy task: one could ask the decision-maker to tell, ideally for each evaluation gi(a) of each alternative on each criterion, from which value onwards an evaluation should be considered at least as good as gi(a). Things may become simpler if the threshold may be considered constant, or proportional to gi(a) (e.g. 0.05 × gi(a)). Note that constant thresholds could be used when a scale is "linear", in the sense that equal differences throughout the scale have the same meaning and consequences (see the end of Section 6.2); this is however not a necessary condition, since only some differences, but not all, need to be equivalent throughout the scale. In any case, Definition 6.12 of the concordance index is adapted in a straightforward manner.
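As a small illustration (ours, with made-up evaluations and a criterion to be maximised), a proportional threshold function can be encoded directly:

```python
# Sketch: criterion i supports "a at least as good as b" when
# g_i(a) >= g_i(b) + t_i(g_i(b)).  A negative threshold (here -5% of the
# evaluation, an assumed value) tolerates small deficits, as discussed above.

def at_least_as_good(g_a, g_b, t=lambda g: -0.05 * g):
    return g_a >= g_b + t(g_b)

# evaluations within the tolerance are not discriminated:
assert at_least_as_good(103, 105)      # 103 >= 105 - 5.25
assert not at_least_as_good(90, 105)   # deficit far beyond the threshold
```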

The method for building an outranking relation remains unchanged; the concordance index simply becomes:

    (6.14)    c(a, b) = Σ_{i : gi(a) ≥ gi(b) + ti(gi(b))} pi.

Thresholding is a key tool in the original outranking methods since, as mentioned at the end of Section 6.2, it allows one to bypass the necessity of transforming the original evaluations to obtain linear scales.

Discordance and vetoes

Remember that the principle of the outranking methods consists in examining the validity of the proposition "a outranks b". The concordance index "measures" the arguments in favour of saying so, but there may be arguments strongly against that assertion (discordant criteria). These discordant voices can be viewed as vetoes, just like in the voting context: there is a veto against declaring that a outranks b if b is so much better than a on some criterion that it becomes disputable, or even meaningless, to pretend that a might be better overall than b.

This is another occasion for invoking thresholds. To be more precise, a veto threshold on criterion i is in general a function vi encoding a difference in evaluations so big that it would be out of the question to say that a outranks b if

    (6.15)    gi(a) > gi(b) + vi(gi(b))    when criterion i is to be minimised, or
    (6.16)    gi(a) < gi(b) − vi(gi(b))    when criterion i is to be maximised.

Of course, it may be the case that the function vi is a constant. Note that preference thresholds, which lead to indifference zones, and veto thresholds are used together in a variant of the ELECTRE I method called ELECTRE IS (see Roy and Skalka (1984) or Roy and Bouyssou (1993)).

Let us emphasise that the effect of a veto is quite radical: if a veto threshold is passed on a criterion when comparing two alternatives, the alternative against which there is a veto, say a, may not outrank the other one, say b. This may result in incomparabilities in the outranking relation if, in addition, b does not outrank a, either because the coalition of criteria stating that b is at least as good as a is not strong enough, or because there is also a veto of a against b on another criterion.

In our case, in view of Thierry's particular interest in sporty cars, the criterion most likely to yield a veto is acceleration (which is to be minimised). Although there was no precise indication on setting vetoes in Thierry's preliminary analysis (Section 6.1.2), one might speculate that, on the acceleration criterion, pairs of evaluations about a second and a half apart (all evaluations expressed in seconds), and all intervals wider than those, lead to a veto against claiming that the alternative with the higher evaluation could be preferred to the other one. If this would seem reasonable, then we would not be far from accepting a constant veto threshold of about 1.5 seconds.
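A minimal sketch combining the concordance index of formula (6.14) with a veto test is shown below. It is our own illustration under simplifying assumptions: all criteria are to be maximised, the thresholds t and v are constants, and the weights sum to 1.

```python
# Sketch: a outranks b when the concordance (total weight of the criteria on
# which a is at least as good as b, up to the threshold t) reaches the cutting
# level, and no criterion vetoes, i.e. b is nowhere better than a by more
# than v.

def outranks(a, b, weights, t, v, level):
    c = sum(w for ga, gb, w in zip(a, b, weights) if ga >= gb + t)
    veto = any(ga < gb - v for ga, gb in zip(a, b))
    return c >= level and not veto

a, b = [10.0, 8.0], [9.0, 10.0]
w = [0.6, 0.4]
assert outranks(a, b, w, t=0.0, v=3.0, level=0.6)      # no veto: 8 >= 10 - 3
assert not outranks(a, b, w, t=0.0, v=1.5, level=0.6)  # veto: 8 < 10 - 1.5
```

Note how tightening the veto threshold from 3 to 1.5 flips the conclusion while leaving the concordance untouched: this is exactly the kind of discontinuity that makes sensitivity analysis crucial for these methods.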

If we decide that there is a veto, with a constant threshold, on the acceleration criterion for differences of at least 1.5 seconds, it means that a car that accelerates from 0 to 100 km/h in 30.6 seconds (as is the case of Peugeot 309 GTI) could not conceivably outrank a car which does it in 28 (as Honda Civic does), whatever the evaluations on the other criteria might be. Using 1.5 as a veto threshold also implies that all differences of at least 1.5 (from 28 to 29.5 just as from 28.9 to 30.4) have the same consequences in terms of preference. And it means that a car that accelerates from 0 to 100 km/h in 30.4 seconds (like Mazda 323) may not outrank a car that accelerates in 28.9 (like Opel Astra or Renault 21), but might very well outrank a car that accelerates in 29 (like Nissan Sunny) if the performances on the other criteria are superior.

Setting the value of the veto threshold obviously involves some degree of arbitrariness: why not set the threshold at 1.4 seconds, which would imply that Mazda 323 may not outrank Nissan Sunny? In such cases, detailed investigation is needed in order to decide which setting of the parameter's value is most appropriate. Note, however, that here as well the underlying logic is quite similar to that on which statistical tests are based: conventional levels of significance (like the famous 5% rejection intervals) are widely used to decide whether a hypothesis must be rejected or not.

A related facet of using thresholds is that growing differences that are initially not significant brutally crystallise into significant ones as soon as a crisp threshold is passed. Methods using thresholds may obviously show discontinuities in their consequences, and that is why sensitivity analysis is even more crucial here than with more classical methods: it must be verified that small variations around the chosen value of a parameter (such as a veto threshold) do not influence the conclusions in a dramatic manner; if small variations do have a strong influence, a detailed investigation is needed. In order not to be too long, we do not develop the consequences of introducing veto thresholds in our example; it suffices to say that the outranking relation, its kernel and the derived rankings are not dramatically modified in the present case. We will allude in the next section to more "gradual" methods that can be designed on the basis of concordance-discordance principles similar to those outlined above.

6.4.4 Main features and problems of elementary outranking approaches

The ideas behind the methods analysed above may be summarised as follows. For each pair of alternatives (a, b), it is determined whether a outranks b by comparing their evaluations gi(a) and gi(b) on each point of view i. The pairs of evaluations are compared to intervals that can be viewed as typical of classes of ordered pairs of evaluations on each criterion (for instance the classes "indifference", "preference" and "veto"). On the basis of the list of classes to which it belongs for each criterion (its "profile"), the pair (a, b) is declared to be or not to be in the outranking relation. Note that:

• a credibility index of outranking (for instance "weak" and "strong" outranking) may be defined: to each value of the index corresponds a set of profiles;

if the profile of the pair (a, b) is one of those associated with a particular value of the credibility of outranking, then the outranking of b by a is assigned this value of the credibility index; there are of course rationality requirements on the sets of profiles associated with the various values of the credibility index;

• this credibility index is to be interpreted in logical terms: it models the degree to which it is true that there are enough arguments in favour of saying that a is better than b, while there is no strong reason of refuting this statement (see the definition of outranking in Section 6.2);

• thresholds may be used to determine the classes of differences of preference on each criterion, provided differences gi(a) − gi(b) equal to such thresholds have the same meaning independently of their location on the scale of criterion i (linearity property);

• the rules for determining whether a outranks b (eventually to some degree of a credibility index) generally involve weights that describe the relative importance of the criteria; these weights are typically used additively, to measure the importance of coalitions of criteria independently of the evaluations of the alternatives;

• specialised procedures are needed to derive a ranking or a choice set from the outranking relation.

The result of the construction, i.e. the outranking relation (possibly qualified with a degree of a credibility index), is then exploited in view of a specific type of decision problem (choice, ranking, ...). It is supposed to include all the relevant and sure information about preference that could be extracted from the data and the questions answered by the decision-maker. Due to the lack of transitivity and acyclicity of outranking relations, exploitation procedures cannot be dispensed with, and the various procedures that have been proposed (for instance transforming the outranking relation into a complete ranking) are not above criticism. It is especially difficult to justify them rigorously, since they operate on an object that has been constructed: the decision-maker has no direct intuition of this object, so one can hardly expect to get reliable answers when questioning him about the properties of this relation. As a consequence, a direct characterisation of the ranking produced by the exploitation of an outranking relation seems out of reach.

In the process of deriving a complete ranking from the outranking relation, the property of independence of irrelevant alternatives (see Chapter 2, where this property is evoked) is lost. On the other hand, this property was satisfied in the construction of the outranking relation, since outranking is decided by looking in turn at the profiles of each pair of alternatives, independently of the rest. Since independence of irrelevant alternatives is an hypothesis of Arrow's theorem and it is violated, the conclusion of the theorem is not necessarily valid, and one may hope that there is no criterion playing the role of dictator.

Non-compensation

The weights count entirely or not at all in the comparison of two alternatives: the smaller or larger difference in evaluations between alternatives does not matter once a certain threshold is passed. This fact, which was discussed in the second paragraph of this section, is sometimes called the non-compensation property of outranking methods.

A large difference in favour of, say, a over b on some criterion is of no use to compensate for small differences in favour of b on many criteria, since all that counts for deciding that a outranks b is the list of criteria in favour of a. Vetoes only have a "negative" action, impeding that outranking be declared. The reader interested in the non-compensation property is referred to Fishburn (1976), Bouyssou (1986), Bouyssou and Vansnick (1986).

Incomparability and indifference

For some pairs (a, b), it may be the case that neither a outranks b nor the opposite; in such a case, a and b are said to be incomparable. This can occur not only because of the activation of a veto, but alternatively because the credibilities of both the outranking of a by b and of b by a are not sufficiently high. Incomparability may be interpreted in two different ways. One may advance that some alternatives are too contrasted to be compared: it has been argued, for instance, that comparing a Rolls Royce with a small and cheap car proves impossible because the Rolls Royce is incomparably better on many criteria but is also incomparably more expensive. Another example concerns the comparison of projects that involve the risk of loss of human life: should one prefer a more expensive project with a lower risk, or a less expensive one with a higher risk (see Chapter 5 for evaluations of the cost of human losses in various countries)? Other people support the idea that incomparability results from insufficient information.

In any case, incomparability should not be assimilated to indifference. Indifference occurs when alternatives are considered as almost equivalent, while incomparability is more concerned with very contrasted alternatives. The treatment of the two categories is quite different in the exploitation phase: indifferent alternatives should appear in the same class of a ranking, or in neighbouring ones, while incomparable alternatives may be ranked in classes quite far apart.

6.4.5 Advanced outranking methods: from thresholding towards valued relations

Looking at the variants of the ELECTRE method suggests that there is a general pattern on which they are all built:

• alternatives are considered in pairs and, eventually, outranking is determined on the basis of the profiles of performance of the pair only;

• the differences between the evaluations of a pair of alternatives for each criterion are categorised in discrete classes delimited by thresholds (preference, veto, ...);

• rules are invoked to decide which combinations of these classes lead to outranking; there may be several grades of outranking (weak and strong in ELECTRE II, ...), in which case rules associate specific combinations of classes to each grade;

• specialised procedures are used to exploit the various grades of outranking in view of supporting the decision process.

In the elementary outranking methods (ELECTRE I and II), much care was taken, in particular in the manner in which the indices are elaborated, to avoid performing arithmetical operations on the evaluations gi(a): only cuts of the concordance index were considered (which is typically an operation valid for ordinal data), and vetoes were used in a very radical fashion. Our analysis of the weighted sum in Section 6.2 has taught us that operations that may appear natural rely on strong assumptions which suppose very detailed information on the preferences.

Defining the classes of differences through thresholding raises the problem of discontinuity alluded to in the previous section. It is thus appealing to work with continuous classes of differences of preference for each criterion, i.e. directly with valued relations. A value cj(a, b) then models the degree to which alternative a is preferred to alternative b on criterion j; such degrees are often interpreted in logical fashion, as degrees of credibility of the preference. Each combination of values of the credibility index on the various criteria may then be assigned an overall value of the credibility index for outranking; the outranking relation is also valued in such a context.

Consider the following formula, which is used in ELECTRE III, a method leading to a valued outranking relation (see Roy and Bouyssou (1993) or Vincke (1992)), to compute the overall degree of credibility S(a, b) of the outranking of b by a:

    S(a, b) = c(a, b)    if Dj(a, b) ≤ c(a, b) for all j;
    S(a, b) = c(a, b) × Π_{j : Dj(a, b) > c(a, b)} [1 − Dj(a, b)] / [1 − c(a, b)]    otherwise.

In the above formula, Dj(a, b) is a degree of credibility of discordance. We do not enter into the detail of how c(a, b) and the Dj(a, b) can be computed; just remember that they are valued between 0 and 1. The justification of such a formula is mainly heuristic, in the sense that the response of the formula to the variation of some inputs is not counter-intuitive: when discordance rises, outranking decreases, and the converse with concordance; when discordance is maximal, there is no degree of outranking at all. This does not mean that the formula is fully justified. The weighted sum also has good heuristic properties at first glance, but deeper investigation showed that the values it yields cannot be trusted as a valid representation of the preferences unless additional information is requested from the decision-maker and used to re-code the original evaluations gj. Other formulae might have been chosen with similarly good heuristic behaviour.

Dealing with valued relations, and especially combining "values", raises a question: which operations may be meaningfully (or just reasonably) performed on them? The above formula involves operations such as multiplication and division, which suppose that the concordance and discordance indices are plainly cardinal numbers and not simply labels of ordered categories. This is indeed a strong assumption that does not seem to us to be supported by the rest of the approach.
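In code, the piecewise formula reads as follows; this is a sketch of ours, with c and the Dj assumed to be already-computed indices in [0, 1]:

```python
# Sketch of the ELECTRE III credibility degree quoted above.  Division by
# 1 - c is safe inside the loop: D_j > c together with D_j <= 1 implies
# c < 1 whenever the branch is taken.

def credibility(c, discordances):
    s = c
    for d in discordances:
        if d > c:                    # only discordances stronger than the
            s *= (1 - d) / (1 - c)   # concordance weaken the outranking
    return s

assert credibility(0.40, [0.0, 0.0]) == 0.40          # no discordance
assert abs(credibility(0.80, [0.90]) - 0.40) < 1e-9   # strong discordance
```

For instance, a concordance of .80 with one discordance of .90 yields .80 × (.10/.20) = .40, the same value as a concordance of .40 with no discordance at all.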

No special attention, comparable to what was needed to build value functions from the evaluations, was paid to building the concordance and discordance indices; hence, nothing guarantees that these indices can be combined by means of arithmetic operations to produce an overall index S representative of a degree of credibility of an outranking.

The fact that the value obtained for the outranking degree may involve some degree of arbitrariness did not escape Roy and Bouyssou (1993), who explain (p. 417) that the value of the degree of outranking obtained by a formula like the above should be handled with care. In particular, they advocate that thresholds be used when comparing two such values: the outranking of b by a can be considered more credible than the outranking of d by c only if S(a, b) is significantly larger than S(c, d). We agree with this statement, but unfortunately it seems quite difficult to assign a value to a threshold above which the difference S(a, b) − S(c, d) could be claimed "significant".

To see that another formula with similar heuristic behaviour might have resulted in quite different outputs, consider the following two cases, which both lead to an outranking degree of .40 with the formula of ELECTRE III:

• the concordance index c(a, b) is equal to .40 and there is no discordance (i.e. Dj(a, b) = 0 for all j);
• the concordant coalition weighs .80 but there is a strong discordance on criterion 1 (D1(a, b) = .90, while Dj(a, b) = 0 for all j = 2, ..., n).

Consider now, for instance, the alternative formula

    S(a, b) = min{c(a, b), min{1 − Dj(a, b), j = 1, ..., n}}.

In the first case, it yields an outranking degree of .40 as well, but in the second case the degree falls to .10. It is likely that in some circumstances a decision-maker might find the latter model more appropriate. Note also that the latter formula does not involve arithmetic operations on c(a, b) and the 1 − Dj(a, b)'s, but only ordinal operations, namely taking the minimum.

There are thus two directions that can be followed for taking the objections to the formula of ELECTRE III into account. In the first option, one considers that the meaning of the concordance and discordance degrees is ordinal, i.e. that the information content of the c(a, b)'s and the 1 − Dj(a, b)'s just consists in the ordering of their values in the [0, 1] interval, and one tries to determine a family of aggregation formulae that fulfil basic requirements, including compatibility with this ordinal character. Compatibility means that transforming the c(a, b)'s and the 1 − Dj(a, b)'s by an increasing transformation of the [0, 1] interval should just amount to transforming the original value of S(a, b) by the same transformation; this is not the case with the former formula, so that, if the information content of the indices is only ordinal, the former formula is not suitable. The other option consists in revising the way the concordance and discordance indices are constructed, in order to give them a quantitative meaning that allows arithmetic operations to be used for aggregating them. For a survey of possible ways of aggregating preferences into a valued relation, the reader is referred to chapters 2 and 3 of the book edited by Słowiński (1998).
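The ordinal alternative can be sketched alongside for comparison (a sketch of ours, using the same two assumed cases as above):

```python
# Sketch: the min-based formula S(a,b) = min{c, min_j (1 - D_j)}, which
# involves only ordinal operations on the indices.

def credibility_min(c, discordances):
    return min([c] + [1 - d for d in discordances])

assert credibility_min(0.40, [0.0]) == 0.40               # first case: .40 too
assert abs(credibility_min(0.80, [0.90]) - 0.10) < 1e-9   # second: falls to .10
```

Because it only takes minima, applying the same increasing transformation of [0, 1] to c and to the 1 − Dj simply applies that transformation to S, which is the compatibility property the product formula lacks.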

The option followed in the PROMETHEE methods (see Brans and Vincke (1985) or Vincke (1992)) may, for instance, be interpreted as aiming towards building a value function on the pairs of alternatives; this function would represent the overall difference in preference between any two alternatives. The way in which this function is constructed in practice, however, leaves the door open to remarks analogous to those addressed to the weighted sum in Section 6.2.

6.5 General conclusion

This long chapter has enabled us to travel through the continent of formal methods of decision analysis; by "formal" we mean those methods relying on an explicit mathematical model of the decision-maker's preferences. We neither looked into all methods, nor did we explore those we looked into completely. There are other continents that have been almost completely ignored, in particular all the methods that do not rely on a formal modelling of the preferences (see for instance the book edited by Rosenhead (1989), in which various approaches are presented for structuring problems in view of facilitating decision making).

On the particular topic of multi-attribute decision analysis, we may summarise our main conclusions as follows:

• Numbers do not always mean what they seem to. Numbers may have an ordinal meaning, in which case it cannot be recommended to perform arithmetic operations on them; they may be evaluations on an interval scale or a ratio scale, and there are appropriate transformations that are allowed for each type of scale. Even if numeric evaluations actually mean what they seem to, their significance is not immediately in terms of preferences: the interval separating two evaluations must be reinterpreted in terms of difference in preferences. We have also suggested that the significance of a number may be intermediate between ordinal and cardinal; in that case, the interval separating two evaluations might still be given an interpretation, for instance by taking into consideration the fact that intervals are large, medium or small. Evaluations may also be imprecise, and knowing that should influence the way they will be handled. It makes no sense to manipulate raw evaluations without taking the context, such as the type of scale, the degree of precision or the degree of certainty, into account. Preference modelling is specifically the activity that deals with the meaning of the data in a decision context.

• Preference modelling does not only take into account objective information linked with the evaluations or with the data. It also incorporates subjective information in relation to the preferences of the decision-maker.

• The (vague) notion of importance of the criteria and its implementation are strongly model-dependent. Weights and trade-offs should not be elicited in the same manner for all types of model since, for example, they may or may not depend on the scaling of the criteria.

• There are various types of models that can be used in a decision process. There is no best model; all have their strong points and their weak points.

• A direct consequence of the possibility of using different models is that the output may be discordant or even contradictory: cars may be ranked in different positions according to the method that is used. This does not puzzle us too much, because the observed differences appear more as variants than as contradictions. First of all, the approaches use different concepts, and the questions the decision-maker has to answer are accordingly expressed in different languages; this of course induces variability. This is no wonder, since the information that decision analysis aims at capturing cannot usually be precisely measured; it is sufficient to recall that experiments have shown that there is much variability in the answers of subjects submitted to the same questions at time intervals. Second, each approach accepts a particular form of input, and this in turn induces an output which enjoys particular properties. In our study, the various outputs are remarkably consistent, and the variants can be explained to some extent.

Does this mean that all methods are acceptable? Not at all. There are several criteria of validity. One is that the method has to be accepted in a particular decision situation. There are also internal and external consistency criteria that a method should fulfil. Internal consistency implies making explicit the hypotheses under which the data form an acceptable input for a method; the method should then perform operations on the input that are compatible with the supposed properties of the input. External consistency consists in checking whether the available information matches the requirements of acceptable inputs, and whether the output may help in the decision process; this means that the questions asked to the decision-maker must make sense to him, and that he should not be asked for information he is unable to provide in a reliable manner. We have encountered such a situation several times in the above study.

The main goal of the above study was to illustrate the issue of internal and external validity on a few methods in a specific simple problem. The choice of a particular approach (including a type of model) should be the result of an evaluation, in a given decision situation, of the chances of being able to elicit the parameters of the corresponding model in a reliable manner; these "chances" obviously depend on several factors, including the type and precision of the available data, the way of thinking of the decision-maker and his knowledge of the problem. Another factor that should be considered for choosing a model is the type of information that is wanted as output: the decision-maker needs different information when he has to rank alternatives, when he has to choose among alternatives, or when he has to assign them to predefined (ordered) categories (we put the latter problem aside in our discussion of the car choosing case). So, in our view, the ideal decision analyst should master several methodologies for building a model.

Notice that additional dimensions make the choice and the construction of a model in group decision making even more difficult: the dynamics of such decision processes is by far more complex, involving conflicts and negotiation aspects. Constructing complete formal models in such contexts is not always possible, but it remains that using problem structuring tools (such as cognitive maps) may prove profitable.


Besides the above points that are specific to multiple criteria preference models, more general lessons can also be drawn. • If we consider our trip from the weighted sum to the additive multi-attribute value model in retrospect, we see that much self-confidence, and therefrom much convincing power, can be gained by eliciting conditions under which an approach such as the weighted sum would be legitimate. The analysis is worth the effort, because precise concepts (like trade-offs and values) are sculptured through an analysis that also results in methods for eliciting the parameters of the model. Another advantage of theory is to provide us with limits, i.e. conditions under which a model is valid and a method is applicable. From this viewpoint, and although the outranking methods have not been fully characterised, it is worth noticing that their study has recently made theoretical progress (see e.g. Arrow and Raynaud (1986), Bouyssou and Perny (1992), Vincke (1992), Fodor and Roubens (1994), Tsoukiàs and Vincke (1995), Bouyssou (1996), Marchant (1996), Bouyssou and Pirlot (1997), Pirlot (1997)). • An advantage of formal models that cannot be overemphasised is that they favour communication. In the course of the decision process, the construction of the model requires that pieces of information, knowledge and priorities that are usually implicit or hidden be brought to light and taken into account; also, the choice of the model reflects the type of available information (more or less certain, precise, quantitative). The result is often a synthesis of what is known and what has been learnt about the decision problem in the process of elaborating the model. The fact that a model is formal also allows for some sort of calculation; in particular, testing to what extent the conclusions are stable when the evaluations of imprecise data are varied is possible within formal models. Once a decision has been made, the model does not lose its utility.
It can provide grounds for arguing in favour of or against a decision, and it can be adapted to make ulterior decisions in similar contexts. • The "decisiveness" of the output depends on the "richness" of the information available. If the knowledge is uncertain, imprecise or simply non-quantitative in nature, it may be difficult to build a very strong model; by "strong", we mean a model that clearly suggests a decision as, for instance, those that produce a ranking of the alternatives. Other models (and especially those based on pairwise comparisons of alternatives and verifying the independence of irrelevant alternatives property) are not able, structurally, to produce a ranking; they may nevertheless be the best possible synthesis of the relevant information in particular decision situations. In any case, even if the model leads to a ranking, the decision is to be taken by the decision-maker and is not in general an automatic consequence of the model (due, for instance, to imprecisions in the data that call for a relativisation of the model's prescription). As will be illustrated in greater detail in Chapter 9, the construction of a model is not all of the decision process.

7 DECIDING AUTOMATICALLY: THE EXAMPLE OF RULE BASED CONTROL

7.1 Introduction

The increasing development of automatic systems in most sectors of human activity (e.g. manufacturing, management, medicine) has progressively led to involving computers in many tasks traditionally reserved for humans, even the more "strategic" ones such as control, evaluation and decision-making. The main function of automatic decision systems is to act as a substitute for humans (decision-makers, experts) in the execution of repetitive decision tasks. Such systems can be in charge of all or part of the decision process. The main tasks to be performed by automatic decision systems are collecting information (e.g. by sensors), making a diagnosis of the current situation, selecting relevant actions, and executing and controlling these actions. Automatisation of these tasks requires the elaboration of computational models able to simulate human reasoning. Such models are, in many respects, comparable to those involved in the scientific preparation of human decisions. Indeed, deciding automatically is also a matter of representation, evaluation and comparison. For this reason, we introduce and discuss some very simple techniques used to design rule-based decision/control systems. This is one more opportunity for us to address some important issues linked to the descriptive, normative and constructive aspects of mathematical modelling for decision support:

• descriptive aspects: the function of automatic decision systems is, to some extent, to be able to predict, simulate and extrapolate human reasoning and decision-making in an autonomous way. This requires different tasks such as the collection of human expertise, the representation of knowledge, the extraction of rules and the modelling of preferences. For all these activities, the choice of appropriate formal models, symbolic as well as numerical, is crucial in order to describe situations and process information.
• constructive aspects: in most fields of application, there is no completely fixed and well formalised body of knowledge that could be exploited by the analyst responsible for the implementation of a decision system. Valuable information can be obtained from human experts, but this expertise is often very complex and ill-structured, with a lot of "exceptions". Hence, the formal model handling the core of human skill in decision-making must be constructed by the analyst, in close cooperation with the experts. They must decide together what type of input should be used, what type of output is needed, and what type of consideration should play a role in linking output to input. One must also decide how to link subjective symbolic information (close to the language of the expert) and objective numeric data that can be accessible to the system.

• normative aspects: it is generally not possible to ask the expert to produce an exhaustive list of situations with their adequate solutions. Usually, this type of information is given only for a sample of typical situations, which implies that only a partial model can be constructed. To be fully efficient, this model must be completed with some general principles and rules used by the expert. In order to extrapolate examples as well as expert decision rules in a reasonable way, there is a need for normative principles putting constraints on inference, so as to decide what can seriously be inferred by the system from any new input. Hence, the analysis of the formal properties of our model is crucial for the validation of the system.

These three points show how the use of formal models and the analysis of their mathematical properties are crucial in automatic decision-making. In this respect, the modelling exercise discussed here is comparable to those treated in the previous chapters, concerning human decision-making, but includes special features due to the automatisation (stable pre-existing knowledge and preferences, real-time decision-making, a completely autonomous closed system, etc.). We present a critical introduction to the use of simple formal tools such as fuzzy sets and rule-based systems to model human knowledge and decision rules. We also make explicit the multiple criteria aggregation problems arising in the implementation of these rules and discuss some important issues linked to rule aggregation. For the sake of illustration, we consider two types of automatic decision systems in this chapter:

• decision systems based on explicit decision rules: such systems are used in practical situations where the decision-maker or the expert is able to make explicit the principles and rules he uses to make a decision. It is also assumed that these rules constitute a consistent body of knowledge, sufficiently exhaustive to reproduce, predict and explain human decisions. Such systems are illustrated in Section 7.2, where the control of an automatic watering system is discussed, and in Section 7.4, where a decision problem in the context of the automatic control of a food process is briefly presented. In the first case, the decision problem concerns the choice of an appropriate duration for watering, whereas in the second case, it concerns the determination of oven settings aimed at preserving the quality of biscuits.

• decision systems based on implicit decision rules: such systems are used in practical applications for which it is not possible to obtain explicit decision


rules. This is very frequent in practice. The main possible reasons for it are the following:

– the decision-maker or the expert is unable to provide sufficiently clear information to construct decision rules, or his expertise is too complex to be simply representable by a consistent set of decision rules;

– the decision-maker or the expert is able to provide a set of decision rules, but these decision rules are not easily expressible using variables that can be observed by the system. A typical example of such a situation occurs in the domain of subjective evaluation (see Grabisch, Guely and Perny 1997), where the quality of a product is defined on the basis of human perception;

– the decision-maker or the expert does not want to reveal his own strategy for making decisions. This can be due to the existence of strategic or confidential information that cannot be revealed, or alternatively because this expertise represents his only competence, making him indispensable to his organisation.

Such systems are illustrated in Section 7.3, also in the context of the automatic control of food processes. We will use the problem of controlling biscuit quality during baking as an illustrative case where numerical decision models based on pattern matching procedures can be used to perform a diagnosis of dysfunction and a regulation of the oven, without any explicit rule.

7.2 A System with Explicit Decision Rules

Automatising human decision-making is often a difficult task because of the complexity of the information involved in human reasoning. In some cases, however, the decision-making process is repetitive and well-known, so that automatisation becomes feasible. In this section, we would like to consider an interesting subclass of "easy" problems where human decisions can be explained by a small set of decision rules of the type:

if X is A and Y is B then Z is C

where the X and Y variables are used to describe the current decision context (input variables) and Z is a variable representing the decision (output variable). Whenever X and Y can be automatically observed by the decision system (e.g. using sensors), human skill and experience in problem solving can be approximated and simulated using the fuzzy control approach (see e.g. Nguyen and Sugeno 1998). Such an approach is based on the use of fuzzy sets and multiple criteria aggregation functions. Our purpose is to emphasise the interest, as well as the difficulty, of resorting to such formal notions on real practical examples.


7.2.1 Designing a decision system for automatic watering

Let us consider the following case: the owner of a nice estate has the responsibility of watering the family garden, a task that must be performed several times per week. Every evening, the man estimates the air temperature and the ground moisture so as to decide the appropriate time required for watering his garden. This amount of time is determined so as to satisfy a twofold objective: on the one hand, he wants to preserve the nice aspect of his garden (especially the dahlias put in by his wife at the beginning of the summer), but on the other hand, he does not want to use too much water for this, preferring to allocate his financial resources to more essential activities. Because this small decision problem is very repetitive, and also because the occasional gardener does not want to delegate the responsibility of the garden to somebody else, he decided to purchase an automatic watering system. The function of this system is first to check every evening whether watering is necessary or not, and second to determine automatically the watering time required. The implicit aim of the occasional gardener is to obtain a system that implements the same rules as he does; in his mind, this is the best way to really preserve the current beautiful aspect of the garden. In this case, we need a system able to periodically measure the air temperature and the soil moisture, and a decision module able to determine the appropriate duration of watering, as shown in Figure 7.1.

Figure 7.1: The Decision Module of the Watering System

7.2.2 Linking symbolic and numerical representations

Let t denote the current temperature of the air (in degrees Celsius), and m the moisture of the ground, defined as the water content of the soil. This second quantity, expressed in centigrams per gram (cg/g), corresponds to the ratio:

m = 100 × (x1 − x2) / x2

where x1 is the weight of a soil sample and x2 the weight of the same sample after drying in a low-temperature oven (75–105°C). Assuming the quantities t and m can be observed automatically, they will constitute the input data of the decision module in charge of the computation of the watering time w (expressed in minutes), which is the sole output of the module. Clearly, w must be defined as a function of the input parameters. Thus, we are looking for a function f such that w = f(t, m) that can simulate the usual decisions of the gardener. Function f must be defined so as to include the subjectivity of

the gardener both in diagnosis steps (evaluation of the current situation) and in decision-making steps (choice of an appropriate action). A common way to achieve this task is to elicit decision rules from the gardener using a very simple language, as close as possible to the natural language used by the gardener to explain his decisions. For example, we can use propositional logic and define rules of the following form:

If T is A and M is B then W is C

where T and M are descriptive variables used for temperature and soil moisture, W is an output variable used to represent the decision, and A, B, C are linguistic values (labels) used to describe temperature, moisture and watering time respectively. For example, suppose the gardener is able to formulate the following empirical decision rules.

Decision rules provided by the gardener:

R1: if air temperature is Hot and soil moisture is Low then watering time is VeryLong
R2: if air temperature is Warm and soil moisture is Low then watering time is Long
R3: if air temperature is Cool and soil moisture is Low then watering time is Long
R4: if air temperature is Hot and soil moisture is Medium then watering time is Long
R5: if air temperature is Warm and soil moisture is Medium then watering time is Medium
R6: if air temperature is Cool and soil moisture is Medium then watering time is Medium
R7: if air temperature is Hot and soil moisture is High then watering time is Medium
R8: if air temperature is Warm and soil moisture is High then watering time is Short
R9: if air temperature is Cool and soil moisture is High then watering time is VeryShort
R10: if air temperature is Cold then watering time is Zero

Notice that the elicitation of such rules is usually not straightforward, even if it is the result of a close collaboration with experts in that domain. Indeed, general rules used by experts may appear to be partially inconsistent and must often include explicit exceptions to be fully operational. Even in the case of control rules, where there is no need for chaining inferences (we assume here that the rules directly link inputs (observations) to outputs (decisions)), the individual acceptance of each rule is not sufficient to validate the whole set of rules. In some situations, unsuitable conclusions may appear, resulting from several inferences due to the coexistence of apparently "reasonable" rules. This makes the validation of a set of rules particularly difficult.
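For concreteness, the ten rules above can be transcribed as a simple lookup table. The following Python sketch is our own illustration (the names and representation are ours, not the book's):

```python
# Gardener's rule base as a lookup table:
# (temperature label, moisture label) -> watering-time label.
# Rule R10 makes the output Zero whenever the temperature is Cold,
# whatever the moisture label.
F = {
    ("Cold", "Low"): "Zero",     ("Cold", "Medium"): "Zero",   ("Cold", "High"): "Zero",       # R10
    ("Cool", "Low"): "Long",     ("Cool", "Medium"): "Medium", ("Cool", "High"): "VeryShort",  # R3, R6, R9
    ("Warm", "Low"): "Long",     ("Warm", "Medium"): "Medium", ("Warm", "High"): "Short",      # R2, R5, R8
    ("Hot", "Low"): "VeryLong",  ("Hot", "Medium"): "Long",    ("Hot", "High"): "Medium",      # R1, R4, R7
}

print(F[("Hot", "Medium")])  # -> Long (rule R4)
```

Writing the rules down in this form already performs the synthesis discussed next: it is a decision table linking outputs to inputs.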

Even without any inconsistency, structuring the expert knowledge so as to obtain a synthesis of the expert rules in the form of a decision table (a table linking outputs to inputs) requires a significant effort. We can observe that the decision rules are expressed using only three variables: the air temperature T, the soil moisture M, and the watering time W. The possible labels Ti, Mj and Wk for temperature, moisture and watering time are given by the sets Tlabels, Mlabels and Wlabels respectively:

• Tlabels = {Cold, Cool, Warm, Hot}. These labels can be seen as different words used to specify different areas on the temperature scale.

• Mlabels = {Low, Medium, High}. These labels can be seen as words used to specify different areas on the moisture scale.

• Wlabels = {Zero, VeryShort, Short, Medium, Long, VeryLong}. These labels can be seen as different words used to specify different areas on the time scale.

Using these labels, the rules all take one of the following forms:

either   if T is Ti then W is Wk
or       if T is Ti and M is Mj then W is Wk

Moreover, the rules can be synthesised by the following decision table (see Table 7.1):

Mj \ Ti   Cold         Cool            Warm         Hot
Low       Zero (R10)   Long (R3)       Long (R2)    VeryLong (R1)
Medium    Zero (R10)   Medium (R6)     Medium (R5)  Long (R4)
High      Zero (R10)   VeryShort (R9)  Short (R8)   Medium (R7)

Table 7.1: The decision table of the gardener

This decision table represents a symbolic function F linking Tlabels and Mlabels to Wlabels (Wk = F(Ti, Mj)). Now, assuming that the above set of decision rules has been obtained, the problem is the following: suppose the current air temperature and soil moisture are known; how can a watering time be computed from these sentences? We need to produce a numerical translation of function F, i.e. to construct a numerical function f, called the "transfer function", whose role is to compute a watering time w from any input (t, m); in other words, how can f be defined so as to properly reflect the strategy underlying these rules? Some partial answers could be obtained if we could define a formal relation linking the various labels occurring in the decision rules to the physical quantities observable by the system. (We will show alternative approaches that do not require the explicit formulation of decision rules in Section 7.3.) To build such a function, the standard process consists in the following stages:

1. identify the current state (diagnosis) and provide a symbolic description of this state,

2. activate the relevant decision rules for the current state (inference),
3. synthesise the recommendations induced from the rules and derive a numerical output (decision).

The diagnosis stage consists in identifying the current state of the system using numerical measures and describing this state in the language used by the expert to express his decision rules. The inference stage consists of an activation of the rules whose premises match the description of the current state. The decision stage consists of a synthesis of the various conclusions derived from the rules and the selection of the most appropriate action (at this stage, the selected action is precisely defined by numerical output values). Thus, the definition of the decision function f relies on a symbolic translation of the initial numerical information in the diagnosis stage, a purely symbolic inference implementing the usual decision-making reasoning, and then a numerical translation of the conclusions derived from the rules. There are several ways of establishing the symbolic/numerical translation, first in the diagnosis stage and then in the decision stage. In both stages, symbols can be linked to scalars, intervals or fuzzy sets, depending on the level of sophistication of the model. The symbolic/numerical translation possibly includes the subjectivity of the decision-maker (perceptions, beliefs, etc.), both in the diagnosis and decision stages. In the diagnosis step, the subjectivity of the decision-maker is not only expressed in choosing particular decision rules, but also in linking input labels (Tlabels and Mlabels) to observable values chosen on the basis of the temperature and moisture scales. In the decision step, the expert or decision-maker's subjectivity can also be expressed by linking output labels (Wlabels) to elements of the time scale. In the following subsections, we present the main basic possibilities and discuss the associated representation and aggregation problems.

7.2.3 Interpreting input labels as scalars

A first and simple way of building the symbolic/numerical correspondence is to ask the decision-maker to associate a typical scalar value to each input label used in the rules. Note that the simplicity of the task is only apparent. An individual, expert or not, may feel uncomfortable in specifying the scalar translation precisely. This is particularly true concerning parameters like "soil moisture", which are not easily perceived by humans and whose qualification requires an important cognitive effort. Even for apparently simpler notions such as temperature and duration, the expert may be reluctant to make a categorical symbolic/scalar translation. If he is nevertheless constrained to produce scalars, he will have to sacrifice a large part of his expertise and the resulting model may lose much of its relevance to the real situation. We will see later how the difficulty can partly be overcome by the use of non-scalar translations of labels. Let us assume now, for the sake of illustration, that the following numerical information has been provided by the expert (see Tables 7.2, 7.3 and 7.4).

This implies averaging the outputs associated to the reference points located in the neighbourhood of the observed parameters (t. w) where w = f (t. m. He must keep it in mind during the whole construction of the system and also later in interpreting the outputs of the system.08. 16) the neighbourhood is given by 4 reference points obtained from rules R1 . 20). 10). 20) with the respective weights 0. 0. The simplest method is to perform a linear interpolation from the reference points given in Table 7. 0.3: Typical moisture levels associated to labels Mj Wlabels Times (mn) VeryShort 5 Short 10 Medium 20 Long 35 VeryLong 60 Table 7. yj ) being .P4 = (30. The analyst must be aware of the share of arbitrariness attached to such a symbolic/numerical translation. Of course. 10). and R5 .154 CHAPTER 7. weight ωij of point (xi . m) = (29. P2 = (25. the rules allow the following reference points to be constructed: t m w 30 10 60 25 10 35 20 10 35 30 20 35 25 20 20 20 20 20 30 30 20 25 30 10 20 30 5 10 10 0 10 20 0 10 30 0 Table 7. This leads to a well-known mathematical problem since function f must be deﬁned so as to interpolate points of type (t. This yields points P1 = (30.12. R4 . if the observation is (t. DECIDING AUTOMATICALLY Tlabels Temperatures (o C) Cold 10 Cool 20 Warm 25 Hot 30 Table 7.48. There is no space in this chapter to discuss the relative interest of the various possible interpolation methods that could be used to obtain f . the reliability of the information elicited with such a process is questionable. R2 . the solution is not unique and some additional assumptions are necessary to deﬁne precisely the surface we are looking for.5: Typical reference points Hence. and P5 = (25.4: Typical times associated to Wk to measure the observable parameters with gauges so as to make the correspondence.32. the “transfer function” f linking watering time w to the pair (t. For instance. 
m) is known for a ﬁnite list of cases and must be extrapolated to the entire range of possible inputs (t. 0. Of course. m). m). From the above tables of scalars.5. m).2: Typical temperatures associated to labels Ti Mlabels Soil water content (cg/g) Low 10 Medium 20 High 30 Table 7.

defined by:

(7.1)   ωij = (1 − |29 − xi| / (30 − 25)) × (1 − |16 − yj| / (20 − 10))

The watering times associated to the points P1, P2, P4 and P5 are 60, 35, 35 and 20 minutes respectively; therefore, the final time obtained by a weighted linear aggregation is 41 minutes and 12 seconds. Performing the same approach for any possible input (t, m) leads to a piecewise linear approximation of function f, see Figure 7.2.

Figure 7.2: Approximation of f by linear interpolation

This piecewise linear interpolation method is however not completely satisfactory. First of all, no information justifies that function f is linear between the points to be interpolated. Many other interpolation methods could be used as well. For example, one can use more sophisticated interpolating methods based on B-spline functions that produce very smooth surfaces with good continuity and locality properties (see e.g. Bartels, Beatty and Barsky 1987), making a non-linear f possible. Moreover, as mentioned above, the definition of the reference points from the gardener's rules is far from easy, and other relevant sets of scalar values could be considered as well. As a consequence, the need to interpolate the reference points given in Table 7.5 is itself questionable. Instead of performing an exact interpolation of these points, one may prefer to modify the link between symbols and numerical scales in order to allow symbols to be represented by subsets of plausible numerical values. In that case, reference points are replaced by reference areas in the parameter space (t, m, w), and the interpolation problem must be reformulated. This point is discussed below.
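As a check on the figures above, the computation of equation (7.1) for the observation (29, 16) can be sketched as follows (an illustrative Python fragment of ours, not code from the book):

```python
# Reference points (t, m, w) from Table 7.5 surrounding the observation
# (29, 16): they come from rules R1, R2, R4 and R5.
points = [(30, 10, 60), (25, 10, 35), (30, 20, 35), (25, 20, 20)]
t_obs, m_obs = 29, 16

def weight(x, y):
    # Equation (7.1): bilinear weight over the cell [25, 30] x [10, 20].
    return (1 - abs(t_obs - x) / (30 - 25)) * (1 - abs(m_obs - y) / (20 - 10))

# The four weights (0.32, 0.08, 0.48, 0.12) sum to 1, so this is a
# weighted average of the four reference watering times.
w = sum(weight(x, y) * time for x, y, time in points)
print(round(w, 2))  # -> 41.2, i.e. 41 minutes and 12 seconds
```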

7.2.4 Interpreting input labels as intervals

In the gardener's example, substituting the labels Ti and Mj by scalar values on the temperature and moisture scales has the advantage of simplicity. However, it does not provide a complete solution, since function f is only known for a finite sample of inputs and requires interpolation to be extended to the entire set of possible inputs. Moreover, in many cases, each label represents a range of values rather than a single value on a numerical scale. Hence, representing the different labels used in the rules by intervals seems preferable. Basically, we can distinguish two cases, depending on whether the intervals associated to the labels partially overlap or not.

Labels represented by disjoint intervals

Suppose that the gardener is able to divide the temperature scale into consecutive intervals, each corresponding to the most plausible values attached to a label Ti. Assuming this is also possible for the moisture scale, let us consider, for example, the following intervals:

Tlabels            Cold          Cool          Warm          Hot
Temperatures (°C)  (−∞, 17.5)   [17.5, 22.5)  [22.5, 27.5)  [27.5, +∞)

Table 7.6: Intervals associated to labels Ti

Mlabels                     Low      Medium    High
Soil water content (cg/g)   [0, 15)  [15, 25)  [25, 100]

Table 7.7: Intervals associated to labels Mj

These intervals form a partition of the temperature and moisture scales respectively. If the intervals are defined so as to cover all plausible values, any possible input belongs to at least one interval and can therefore be translated into at least one label. Thus, each input (t, m) corresponds to a pair {Ti, Mj}, where Ti (resp. Mj) is the label associated to the interval containing t (resp. m). For example, if (t, m) = (29, 16), then the associated labels are {Hot, Medium}; there is a unique active rule in Table 7.1 and the conclusion is easy to reach: the only active rule is R4, whose conclusion is "watering time is long". Hence, if we keep the interpretation of "long" given in Table 7.4, the numerical output is 35.

This process is simple but has serious drawbacks. The granularity of the language used to describe the current state of the system is poor and many significantly different states are seen as equivalent. This is the case, for example, of (17.5, 15) and (22.4, 24.9), which both translate as (Cool, Medium). On the contrary, for some other pairs of inputs that are very similar, the translation diverges. This is the case of the two inputs (17.4, 14.9) and (17.5, 15).
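The sharp-boundary translation just described can be sketched in a few lines (our illustrative Python rendering of the disjoint intervals of Tables 7.6 and 7.7 as reconstructed above):

```python
# Disjoint intervals: every input value receives exactly one label.
def t_label(t):
    if t < 17.5: return "Cold"
    if t < 22.5: return "Cool"
    if t < 27.5: return "Warm"
    return "Hot"

def m_label(m):
    if m < 15: return "Low"
    if m < 25: return "Medium"
    return "High"

print(t_label(29), m_label(16))      # -> Hot Medium (unique active rule: R4)
print(t_label(17.4), m_label(14.9))  # -> Cold Low
print(t_label(17.5), m_label(15.0))  # -> Cool Medium
```

The last two calls illustrate the discontinuity: two almost identical inputs fall on opposite sides of the interval boundaries and receive entirely different labels.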

These two inputs respectively give (Cold, Low) and (Cool, Medium). In the first case, rule R10 is activated and a zero watering time is decided, whereas in the second case, rule R6 is activated and a medium watering time is recommended, i.e. 20 minutes according to Table 7.4. Such discontinuities cannot really be justified and make the output f(t, m) arbitrarily sensitive to the inputs (t, m). This is not suitable, because such decision systems are often included in a permanent observation/reaction loop. Suppose, for example, that several consecutive observations of temperature and moisture in a stable situation yield different values for the parameters t and m due to the imperfection of the gauges, and that these variations occur around a point of discontinuity of the system. This can produce alternated sequences of outputs such as Zero, Medium, Zero, Medium, leading to alternate starts and stops of the system, and possibly leading to dysfunctions. It is true that narrowing the intervals and multiplying the labels would reduce these drawbacks and refine the granularity of the description. This would be more realistic, but the number of rules necessary to characterise f would grow significantly with the number of labels. Expressing so many labels and rules requires a very important cognitive effort that cannot reasonably be expected from the expert.

Labels represented by overlapping intervals

In order to improve on the previous solution, it is possible to reduce the discontinuity induced by interval boundaries without multiplying the labels. A first option for this is to allow for overlap between consecutive intervals. Since it is difficult to separate such intervals with precise boundaries, especially because there is no reasonable way of separating "warm" and "hot" with a precise boundary, one can make consecutive intervals partially overlap, reflecting the possible hesitation of the gardener in the choice of a unique label. For example, if Warm and Hot are represented by the intervals [20, 30] and [25, +∞) respectively, then, in some intermediary areas of the temperature scale, two consecutive labels are associated to a given temperature: from 20°C to 25°C, Warm is a valid label (a possible source of rule activation) but not Hot; from 25°C to 30°C, both labels are valid; and above 30°C, Hot is valid but not Warm. Hence, 29°C becomes a temperature compatible with the two labels. This progressive transition between the two states warm and hot refines the initial sharp transition by introducing an intermediary state corresponding to a hesitation between the two labels. Note, however, that measuring a temperature of 29°C possibly allows several rules to be active at the same time. This raises a new problem, since these rules may conclude to diverging recommendations from which a synthesis must be derived. More precisely, any output label (the labels Wk in the example) must be translated by numbers, and these numbers must be aggregated to obtain the numerical output of the system (the value of w in the example). Thus, the definition of a numerical output can be seen as an aggregation problem, where aggregation is used to interpolate between conflicting rules, as shown below. As an illustration, we

assume now that the labels are represented by the intervals given in Tables 7.8 and 7.9:

Tlabels            Cold       Cool      Warm      Hot
Temperatures (°C)  (−∞, 20]  [15, 25]  [20, 30]  [25, +∞)

Table 7.8: Intervals associated to labels Ti

Mlabels                     Low      Medium    High
Soil water content (cg/g)   [0, 20]  [10, 30]  [20, 100]

Table 7.9: Intervals associated to labels Mj

If the observation of the current situation is t = 29°C and m = 16 cg/g, the relevant labels are {Warm, Hot} for temperature and {Low, Medium} for moisture. These qualitative labels allow some of the gardener's rules to be activated, namely R1, R2, R4 and R5. This gives several symbolic values for the watering duration, namely Medium (by R5), Long (by R2 and R4) and VeryLong (by R1). Hence, we can observe 3 conflicting recommendations, and the final decision must be derived from a synthesis of these results. Of course, defining what could be a fair synthesis of conflicting qualitative outputs is not an easy task; deriving a numerical duration from this synthesis is not any easier. A simple idea is to process symbols as numbers. For example, one can link symbolic and numerical information using Table 7.4: in this case, we obtain three different durations, i.e. 20, 35 and 60 minutes, that must be aggregated. For this, we can define, for any state (t, m), a weight ω(R) for each decision rule R in the gardener database B. This weight represents the activity of the rule and, by convention, we set ω(R) = 1 when the decision rule R is activated and ω(R) = 0 otherwise. In the example, with the observation (t, m) = (29, 16), we have seen that the active rules are R1, R2, R4 and R5, and therefore ω(R1) = ω(R2) = ω(R4) = ω(R5) = 1, whereas ω(R) = 0 for any other rule R. For any possible value α of w, let B(α) denote the subset of rules concluding to a watering time α. More generally, a weight ω(α) measuring the activity or importance of the set B(α) can be defined as a continuous and increasing function of the quantities ω(R), R ∈ B(α). For example, we can choose:

(7.2)   ω(α) = sup {ω(R) : R ∈ B(α)}

Hence, each watering time activated by at least one rule receives the weight 1 and any other time receives the weight 0. Let us now present in detail the calculation of ω(35). Since 35 (minutes) is the scalar translation of Long, we obtain from the gardener's rules B(35) = {R2, R3, R4}. Hence ω(35) = sup{ω(R2), ω(R3), ω(R4)} = 1. Similarly

translate symbols into quantitative numerical outputs 6. ω(α) = 0 for all other α. The more a given time is supported by the set of active rules. In this approach.α α ω(α) From the observation (t. we now obtain: ω(60) = ω(R2 )+ ω(R3 )+ ω(R4 ) = 2 whereas the others ω(α) remain unchanged.4) is questionable and one could formulate criticisms similar to those addressed to the weighted average in the previous chapters (especially in chapter 6). Note that the choice of a weighted sum as ﬁnal aggregator in equation (7.3) yields: w = 0. one could prefer resorting to (7. On the contrary. read the current values of input parameters t and m 2. Formally the ﬁnal output is deﬁned by: (7.2.4) yield a watering time of (60 + 35 + 20)/3 yielding 38 minutes and 20 seconds. Everything works as if each active rule was voting for a time.3) ω(α) = ω(R) R∈B(α) Coming back to the example. aggregate these numerical outputs . one can easily imagine that the choice of one of these options is not easy to justify. The option (7. the most popular approach is the “centre of gravity” method which amounts to performing a weighted sum (see also chapter 6) of all possible times α. A SYSTEM WITH EXPLICIT DECISION RULES 159 we get ω(20) = 1 thanks to R5 and ω(60) = 1 thanks to R1 . In a practical situation.7. detect the decision rules activated by these observations 4.25 × (60 + 35 + 35 + 20) that amounts to 37 minutes and 30 seconds. there is only a ﬁnite number of times activated by the rules in a given state.2) and (7. whereas equation (7. as in the linear interpolation approach used in the previous subsection. the ﬁnal result has been obtained as a result of the following sequence: 1. when the activation of a subset of rules necessarily implies that another subset of rules is also active. 16). Because there are no active rules left.3) could be preferred when the activation of the various rules are independent. 
Another option taking account of the number of rules supporting each time α could be: (7.4) w= α ω(α). m) = (29. ﬁnd the symbolic qualiﬁers that best ﬁt these values 3. Since there is a ﬁnite number of rules. collect the symbolic outputs resulting from the inferences 5. the more it becomes important in the calculation of the ﬁnal watering time. In order to synthesise these diﬀerent times. This second option gives more importance to a time α supported by several rules than to a time α supported by a single rule.2) so as to avoid possible overweighing due to redundancy in the set of rules. equations (7.
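The two activation schemes (7.2)-(7.3) and the centre-of-gravity aggregation (7.4) can be sketched as follows. The rule set and the numerical translations of the output labels (Medium = 20, Long = 35, VeryLong = 60 minutes) are those of the worked example; the function names are ours.

```python
# Sketch of equations (7.2)-(7.4) for the gardener example.

# Rules activated by the observation (t, m) = (29, 16), each mapped
# to the numerical translation of its symbolic conclusion (minutes).
active_rules = {"R1": 60, "R2": 35, "R4": 35, "R5": 20}

def omega_sup(rules):
    # Eq. (7.2): omega(alpha) = sup of omega(R) over rules concluding alpha
    return {alpha: 1 for alpha in set(rules.values())}

def omega_sum(rules):
    # Eq. (7.3): omega(alpha) = number of active rules concluding alpha
    out = {}
    for alpha in rules.values():
        out[alpha] = out.get(alpha, 0) + 1
    return out

def centre_of_gravity(omega):
    # Eq. (7.4): weighted average of the candidate times
    return sum(a * w for a, w in omega.items()) / sum(omega.values())

w_sup = centre_of_gravity(omega_sup(active_rules))  # (60 + 35 + 20) / 3
w_sum = centre_of_gravity(omega_sum(active_rules))  # (60 + 2*35 + 20) / 4
print(w_sup, w_sum)  # 38.33... (38 min 20 s) and 37.5 (37 min 30 s)
```

The two options differ exactly as in the text: the sup-based weights ignore the fact that two rules support 35 minutes, while the counting weights pull the result towards that doubly supported time.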

This process is perhaps the most elementary way of using a set of symbolic decision rules to build a numerical decision function. It shows a simple illustration of the so-called "computing with words" paradigm advocated by Zadeh (see Zadeh 1999). The main advantages of such a process are the following:

• it relies on simple decision rules expressed in a language close to the natural language used by the expert;
• if necessary, any decision can be explained very simply. The outputs can always be presented as a compromise between recommendations derived from several of the expert's decision rules;
• it allows one to define a reasonable decision function allowing numerical outputs to be computed from any possible numerical input.

Nevertheless, interpreting labels as intervals does not really prevent discontinuous transfers from inputs to outputs, as shown by the following example.

Example (1). Consider two very similar states s1 and s2, characterised by the observations (t, m) = (25.01, 19.99) and (t, m) = (24.99, 20.01). Despite the close similarity between states s1 and s2, the numerical outputs resulting from the decision rules may vary significantly. In fact, according to Tables 7.8 and 7.9, state s1 makes valid the labels {Warm, Hot} for temperature and {Low, Medium} for soil moisture. This activates rules R1, R2, R4 and R5, whose recommendations are VeryLong, Long, Long and Medium respectively. The resulting watering time obtained by equation (7.4) is therefore 38 minutes and 45 seconds. Things are really different for s2, however. The valid labels are {Cool, Warm} for temperature and {Medium, High} for soil moisture: rules R6, R8 and R9, whose recommendations are Medium, Short and VeryShort respectively, now come into play, thus leading to a much shorter time. The resulting watering time obtained by equation (7.4) is therefore 13 minutes and 45 seconds.

It is worth noting that, despite the similarity of the states, there is a significant difference in the watering times computed from the two input vectors. This is due to the discontinuity of the transfer function that defines the watering time from the input (t, m). In fact, it is not easy to describe a continuum of states (characterised by all pairs (t, m) in the gardener example) with a finite number of labels of type (Ti, Mj). This induces arbitrary choices in the description of the current state, which could disrupt the diagnosis stage and make the automatic decision process discontinuous. In the right neighbourhood of the entry (t, m) = (25, 20) (t > 25 and m < 20), the decision rules R1, R2 and R4 are fully active, but this is no longer the case in the left neighbourhood of the point (t < 25 and m > 20), where they are replaced by rules R6, R8 and R9. The activations and computations performed for s1 and s2 thus differ significantly and, depending on the choice of the numerical encoding of the labels, they lead to very different outputs. ♦

This criticism is serious, but the difficulty can partly be overcome. Since the numerical/symbolic and then symbolic/numerical translations are both sources of arbitrariness, the following question can be raised: why not use numbers directly?

There are two partial answers. First, in many decision contexts, the possibility of justifying decisions is a great advantage, even if each decision considered separately is of marginal importance. The ability of automatic decision systems to simulate human reasoning and explain decisions by rules is generally seen as an important advantage; this argument often justifies the use of rule-based systems to automatise decision-making. Second, there are several ways of improving the process proposed above and of refining the formal relationship between qualitative labels and numerical values. It is not our purpose to cover all possibilities in detail. We only present and discuss some very simple and intuitive ideas used to construct more sophisticated models and tools in this context.

7.2.5 Interpreting input labels as fuzzy intervals

One step back in the modelling process, we can redefine the relationship between a given label and the numerical scale associated to the label more precisely. As an expert, the gardener can easily specify the typical temperatures associated with each label. For example, he could explain that Warm means between 20 and 30 degrees, with 25 as the most plausible value. He can also define areas that are definitely not concerned with each label. Thus, one can define the relative likelihood of each temperature when the temperature has been qualified as Hot, Warm, Cool or Cold. More precisely, each label Ti is defined with fuzzy boundaries and characterised by a [0, 1]-valued function µTi defined on the temperature scale, in such a way that µTi(t) represents the compatibility degree between temperature t and label Ti. As a convention, we set µTi(t) = 0 when temperature t is not connected to the label Ti, and µTi(t) = 1 when t is perfectly representative of the label. These fuzzy labels can partially overlap, but they must be defined in such a way that any part of the temperature scale is covered by at least one label.

A simple example of such fuzzy labels is represented in Figures 7.3 and 7.4.

Figure 7.3: Fuzzy labels for the air temperature

Note that sometimes, the fuzzy labels are defined in such a way that membership adds up to 1 for any possible value of the numerical parameter. Although this is not crucial in our illustrative example, this is the case of the labels defined in figure 7.4, for which we have:

(7.5)  ∀m ≥ 0, µLow(m) + µMedium(m) + µHigh(m) = 1

Figure 7.4: Fuzzy labels for the soil moisture

This property (7.5) is the numerical translation of a natural condition requiring that the fuzzy labels Low, Medium and High form a partition of the set of possible moistures. Note however that this property makes sense only when membership values have a cardinal meaning.

With such fuzzy labels, each decision rule can be activated to a certain degree. More precisely, the weight (or activation degree) ωij of the rule Rij reflects the importance (or relevance) of the rule in the current situation. This importance depends on the matching of the input (t, m) and the premise (Ti, Mj). It is therefore natural to state, for any rule Rij of type:

if T is Ti and M is Mj then W is Wk

where Wk = F(Ti, Mj), and for any numerical observation (t, m):

(7.6)  ωij = h(µTi(t), µMj(m))

where h is an aggregation function representing the logical "and" used in the rule, e.g. h(x, y) = min(x, y). This is the degree to which the numerical inputs match the premises of the rule.

As a numerical example, consider the gardener's rule R1. The observation (t, m) = (29, 16) leads to µHot(t) = 0.8 and µLow(m) = 0.4: the temperature is Hot to the degree 0.8 and the moisture is Low to the degree 0.4. Therefore, the weight of the rule R1 is min(0.8, 0.4) = 0.4. Using this approach for each rule with h = min yields the following activation weights (see Table 7.10):

µMj \ µTi       Cold (0)    Cool (0)    Warm (0.2)   Hot (0.8)
Low (0.4)       0 (R10)     0 (R3)      0.2 (R2)     0.4 (R1)
Medium (0.6)    0 (R10)     0 (R6)      0.2 (R5)     0.6 (R4)
High (0)        0 (R10)     0 (R9)      0 (R8)       0 (R7)

Table 7.10: The weights of the rules when (t, m) = (29, 16)
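One plausible reading of Figures 7.3 and 7.4 can be sketched as follows. The piecewise-linear breakpoints below are assumptions of ours, chosen only so that the membership degrees quoted in the text (µHot(29) = 0.8, µWarm(29) = 0.2, µLow(16) = 0.4, µMedium(16) = 0.6) are reproduced; the actual figures may use other shapes.

```python
# Assumed piecewise-linear membership functions; the transition ranges
# (Warm/Hot on [25, 30], Low/Medium on [10, 20]) are illustrative guesses.

def ramp(x, a, b):
    """Linear transition from 0 at a to 1 at b (clipped to [0, 1])."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def mu_hot(t):       # rises from 0 at 25 C to 1 at 30 C
    return ramp(t, 25, 30)

def mu_warm(t):      # rises on [15, 20] (assumed), falls on [25, 30]
    return min(ramp(t, 15, 20), 1.0 - mu_hot(t))

def mu_low(m):       # decreases from 1 at 10 cg/g to 0 at 20 cg/g
    return 1.0 - ramp(m, 10, 20)

def mu_medium(m):    # complementary transition on [10, 20] (partition)
    return ramp(m, 10, 20)

# Activation weights of the four active rules, eq. (7.6) with h = min:
w_R1 = min(mu_hot(29), mu_low(16))      # 0.4
w_R2 = min(mu_warm(29), mu_low(16))     # 0.2
w_R4 = min(mu_hot(29), mu_medium(16))   # 0.6
w_R5 = min(mu_warm(29), mu_medium(16))  # 0.2
print(w_R1, w_R2, w_R4, w_R5)
```

These four values are exactly the non-zero cells of Table 7.10.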

Hence, using equation (7.4) and Table 7.10, we get:

w = (0.4 × 60 + 0.6 × 35 + 0.2 × 35 + 0.2 × 20) / (0.4 + 0.6 + 0.2 + 0.2)

and therefore the watering time is 40 minutes. In the additive formulation characterised by equation (7.4), everything works as if each active rule was voting for one candidate chosen in the set of Wlabels. Here, the activation level of each rule is graduated on the [0, 1] scale and the weights directly reflect the adequacy of the rule in the current situation: the more the premise of the rule matches the current situation, the more important the rule is in the voting process. Note that the definition of an aggregation function yields a compromise solution between the various active decision rules, whose outputs are partially conflicting.

If we consider the two neighbour states s1 and s2 introduced in Example (1), and if we choose h = min in equation (7.6), the resulting activation weights are those given in Tables 7.11 and 7.12.

µMj \ µTi        Cold (0)    Cool (0)    Warm (0.998)   Hot (0.002)
Low (0.001)      0 (R10)     0 (R3)      0.001 (R2)     0.001 (R1)
Medium (0.999)   0 (R10)     0 (R6)      0.998 (R5)     0.002 (R4)
High (0)         0 (R10)     0 (R9)      0 (R8)         0 (R7)

Table 7.11: The weights of the rules when (t, m) = (25.01, 19.99)

µMj \ µTi        Cold (0)    Cool (0.002)   Warm (0.998)   Hot (0)
Low (0)          0 (R10)     0 (R3)         0 (R2)         0 (R1)
Medium (0.999)   0 (R10)     0.002 (R6)     0.998 (R5)     0 (R4)
High (0.001)     0 (R10)     0.001 (R9)     0.001 (R8)     0 (R7)

Table 7.12: The weights of the rules when (t, m) = (24.99, 20.01)

Hence, from equation (7.4) and Table 7.11, we get w(s1) = 20 minutes and 5 seconds as the final output. Similarly, for state s2, the activations of the rules obtained from equation (7.6) are only slightly different from those for s1, and the final output derived from Table 7.12 using equation (7.4) gives w(s2) = 19 minutes and 58 seconds.

This is due to the way activation weights are defined and used in the process. These weights depend continuously on input parameters t and m, and the membership functions defining the labels have soft variations. As a consequence, we notice that the activity of each rule does not vary significantly when passing from state s1 to state s2. This enables a soft control of the output, which can be perfectly illustrated by the example discussed at the end of subsection 7.2.4. Here, since the aggregation function
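The defuzzification step above can be sketched directly from the weights of Table 7.10 (the label times Medium = 20, Long = 35, VeryLong = 60 are those of the running example):

```python
# Eq. (7.4) applied to the fuzzy activation weights of Table 7.10
# for the observation (t, m) = (29, 16).

weights = {"R1": 0.4, "R2": 0.2, "R4": 0.6, "R5": 0.2}  # from Table 7.10
times   = {"R1": 60,  "R2": 35,  "R4": 35,  "R5": 20}   # label times (min)

w = sum(weights[r] * times[r] for r in weights) / sum(weights.values())
print(w)  # ~40: the watering time is 40 minutes
```

Because the weights vary continuously with (t, m), so does w, which is the improvement over the all-or-nothing activation scheme discussed in the text.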

used to derive the final watering time w is also a continuous function of the quantities ω(R) (see equation (7.4)), the quantity w depends continuously on the input parameters t and m. This explains the observed improvement with respect to the previous model, based on the use of all-or-nothing activation rules. Thus, the use of fuzzy labels to interpret input labels has a significant advantage: it makes it possible to define a continuous transformation of numerical input data (temperature, moisture) into the symbolic variables used in decision rules. The resulting decision system is more realistic and robust to slight variations of inputs. This advantage is due to the use of fuzzy sets and has greatly contributed to the practical success of the fuzzy approach in automatic control (fuzzy control, see e.g. Mamdani 1981, Sugeno 1985, Bouchon 1995, Gacogne 1997, Nguyen and Sugeno 1998).

However, several criticisms can be addressed to the small fuzzy decision module presented above. Among them, let us mention the following:

• the choice h = min in equation (7.6) is not straightforward. The choice of min is often justified by the fact that h is used to evaluate a conjunction between the premises of a given rule (a conjunction of type "temperature is Ti and moisture is Mj"). Note however that the idea of the conjunction is captured by any other t-norm (see for instance Fodor and Roubens (1994)); for example, the product could perhaps replace the min. This is problematic because this choice is not without consequence on the definition of the watering time. Moreover, equation (7.6) requires that quantities of type µTi(t) and µMj(m) are commensurate: it requires comparing the fit of any temperature to any label Ti with the fit of any moisture to any label Mj. This assumption, which is rarely explicit, is very strong, because it requires much more than comparing the relative fit of two temperatures (resp. two moistures) to a label Ti (resp. Mj). A perfectly sound definition of such membership values would require more information than can easily be obtained in practice;

• the interpretation of the symbolic labels used to describe the outputs of the rules as scalar values is not easy to justify. Why not use a description of these labels as intervals, in the same way as for input labels?

The last criticism suggests an improvement of the current system: we have to sophisticate the previous construction so as to improve the output processing. This point is discussed in the next subsection.

7.2.6 Interpreting output labels as (fuzzy) intervals

Suppose for example that the Wlabels are no longer described by scalar values but by subsets of the time scale. Paralleling the treatment of symbolic inputs, the labels Wk could be represented by a set of intervals (overlapping or not), with advantages similar to those mentioned for the input labels Ti and Mj. More generally, we assume here that the Wlabels are

represented by fuzzy intervals of the time scale, each label Wk being characterised by a membership function µWk, where µWk(α) represents the compatibility between the watering time α and the label Wk. For the sake of illustration, let us consider the labels represented in Figure 7.5.

Figure 7.5: Fuzzy labels for the watering time

For any state (t, m) of the system, by analogy with Mamdani's approach to fuzzy control (Mamdani 1981), the weight of any watering time α can be defined by:

(7.7)  ω_{t,m}(α) = sup_{Rij ∈ B} h(µTi(t), µMj(m), µWk(α))

where B represents the set of rules (here the gardener's rules), Rij represents the rule "If T = Ti and M = Mj then W = Wk", and h is a non-decreasing function of its arguments (in Mamdani's approach, h = min). The idea in equation (7.7) is that a watering time α must receive an important weight when there is at least one rule Rij whose premises (Ti, Mj) are valid for the observation (t, m) and whose conclusion Wk is compatible with α. To be fully considered, a time must be perfectly representative of a label Wk that has been obtained by a fully active rule. This explains why ω_{t,m}(α) is defined as an increasing function of the quantities µTi(t), µMj(m) and µWk(α). Notice that equation (7.7) is a natural extension of equation (7.2): in more nuanced situations, the weight attached to a possible time is a function of the fitness of the times activated to a certain degree by the rules.

In our example, with the observation (t, m) = (29, 16), the active rules are R1, R2, R4 and R5, and therefore the Wlabels concerned are "Medium", "Long" and "VeryLong". The range of relevant watering times is the union of all values compatible with the labels Wk derived from the active rules; hence the set of relevant watering times is [10, 70]. Each of these values represents a possible numerical translation of a label Wk obtained by the activation of one or several rules. However, all times are not equivalent inside this set: the observation (t, m) = (29, 16) leads to a function ω29,16(α) represented in Figure 7.6.

In order to obtain a precise watering time, the weighted average must be generalised, because there may be an infinity of times activated by the rules (e.g. a whole interval). The usual extension of the weighted average to an infinite set of values is given by the following integral:

(7.8)  w = ∫ α ω_{t,m}(α) dα / ∫ ω_{t,m}(α) dα
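A minimal sketch of this Mamdani-style scheme, with h = min and the integral (7.8) approximated by a fine discretisation: the triangular output labels below are illustrative stand-ins for Figure 7.5 (only Long's support [20, 55] is taken from the text), and the rule activations are those of Table 7.10.

```python
# Sketch of eqs. (7.7)-(7.8) with h = min and assumed output labels.

def tri(x, a, b, c):
    """Triangular membership with support (a, c) and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# (activation weight, output membership function) for each active rule
rules = [
    (0.4, lambda a: tri(a, 40, 60, 70)),  # R1 -> VeryLong (assumed shape)
    (0.2, lambda a: tri(a, 20, 35, 55)),  # R2 -> Long
    (0.6, lambda a: tri(a, 20, 35, 55)),  # R4 -> Long
    (0.2, lambda a: tri(a, 10, 20, 30)),  # R5 -> Medium (assumed shape)
]

def omega(alpha):
    # Eq. (7.7): sup over the rules of min(activation, mu_Wk(alpha))
    return max(min(w, mu(alpha)) for w, mu in rules)

# Eq. (7.8) approximated on a grid with step 0.1, as in eq. (7.9)
alphas = [i / 10 for i in range(0, 1001)]
num = sum(a * omega(a) for a in alphas)
den = sum(omega(a) for a in alphas)
w = num / den
print(round(w, 1))  # a compromise time inside the relevant range
```

With the real membership functions of Figure 7.5 the same computation yields the 37-minutes-32-seconds value quoted below; here the exact output depends on the assumed label shapes.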

Figure 7.6: Weighted times induced by the rules

The integral (7.8) can be approximated by the following quantity:

(7.9)  w = Σ_i αi ω_{t,m}(αi) / Σ_i ω_{t,m}(αi)

where (αi) is a strictly increasing sequence of times resulting from a fine discretisation of the time scale. In our case, a discretisation with step 0.1 gives a final time of 37 minutes and 32 seconds. This last sophistication meets our objective because it provides a transfer function with good continuity properties. However, the use of equations (7.7)-(7.9) can be seriously criticised:

• the definition of ω_{t,m}(α) proposed in equation (7.7) from an increasing aggregation function h is not very natural. Indeed, bearing in mind the form of rule Rij, the quantity h(µTi(t), µMj(m), µWk(α)) stands for the numerical translation of the proposition:

(Ti = t and Mj = m) implies Wk = α

In the fields of multi-valued logic and fuzzy sets theory, admissible functions used to translate implications are required to be non-increasing with respect to the value of the left-hand side of the implication and non-decreasing with respect to the value of the right-hand side (Fodor and Roubens 1994, Bouchon 1995, Perny and Pomerol 1999). Resorting to implication operators instead of conjunctions in order to implement an inference via rule Rij therefore also seems legitimate. As an example, the value attached to the sentence "A implies B" can be defined by the Lukasiewicz implication min(1 − v(A) + v(B), 1), where v(A) and v(B) are the values of A and B respectively. In our case, the conjoint use of the min operator to interpret the conjunction on the left-hand side and of the Lukasiewicz implication would lead to the following h function:

h(x, y, z) = min(1 − min(x, y) + z, 1)

Note that this function is not increasing in its arguments, as required above in the text.
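The implication-based h just described behaves in the opposite way to the conjunctive h of equation (7.7), as a quick computation shows (the function name is ours):

```python
# Lukasiewicz implication with a min conjunction on the premise side:
# h(x, y, z) = value of "min(x, y) implies z".

def h_lukasiewicz(x, y, z):
    return min(1 - min(x, y) + z, 1)

# Fully matched premise and fully compatible conclusion -> 1
print(h_lukasiewicz(1.0, 1.0, 1.0))            # 1.0
# For a fixed conclusion degree, the value DECREASES as the premise
# degree increases -- the opposite of eq. (7.7)'s increasing h:
print(round(h_lukasiewicz(0.2, 1.0, 0.3), 2))  # 1.0 (weak premise)
print(round(h_lukasiewicz(0.9, 1.0, 0.3), 2))  # 0.4 (strong premise)
```

This non-monotonicity is exactly why such an h cannot simply be plugged into equation (7.7), which requires an increasing aggregation function.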

moisture. For example. 30] ∪ [40. The above information leaves room for an inﬁnity of functions. even if µM edium (30) = 0. the deﬁnition of h is not straightforward and must be justiﬁed in the context of the application. For example. their membership is equal to 1. in many cases.1 and µLong (25) = 0.e. Thus. – the membership function making a continuous transition from the border of the support to the border of the kernel. In practice. y) could be the Lukasiewicz t-norm: max(x + y − 1. i. A reasonable alternative to min(x. µLong (21) = 0. To go further in this direction. Usually. – the core. One could expect that the decision-maker is able to specify the support and core of each fuzzy label. Now.4.m (α) in equation (7. 0). Thus. This does not necessarily mean that 25 minutes is more Long than 30 minutes is Medium. nor that 25 minutes is more Long than 26o C is Hot. core [30.6). time. However. 40] and two linear transitions (membership to non-membership) in the range [20. we should be able to determine whether any temperature t is a better representative of a label Ti than time α is representative of label Wk . This is a very strong assumption. one could also discuss the use of min to interpret a conjunction whereas the Lukasiewicz implication is used to interpret implications.5 only means that 25 minutes is a better numerical translation of the qualiﬁer Long than 21 minutes. the interval of all numerical values perfectly representative of the label (the core is a subset of the support).2.8) with h = min is diﬃcult to justify.7) is used to generalise the so-called “modus ponens” inference rule (Zadeh 1979). the choice of a precise membership function often remains arbitrary. (Dubois and Prade 1988). i. (Bouchon 1995). a label thought as a fuzzy interval is assessed on the basis of 3 elements: – the support. Even with this information. the only reliable information contained in the membership function is the relative adequation of each temperature.2. 
Some general guidelines for choosing a suitable h are given in (Bouchon 1995).7. without such assumptions. 55]. especially if we consider the way these labels are represented in the model. . even if µHot (26o ) = 0. their membership must be strictly positive. to each label. the label Long in Figure 7. however. • Equation (7. the interval of all numerical values compatible with the label. the shape of the membership function in the transition area is often chosen as linear or gaussian (for derivability) but rarely justiﬁed by questioning the decision-maker. (Baldwin 1979).7) requires even more commensurability than equation (7.e. As a conclusion. A SYSTEM WITH EXPLICIT DECISION RULES 167 is usual in the ﬁeld of fuzzy inference and approximate reasoning where a formula like (7. the deﬁnition of weights ωt. inequalities of type µTi (t) > µWk (α) play a role in the process.5 is deﬁned by support [20. as well as the trend of the membership function (increasing from the border of the support to the border of the core). 55].
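The label Long described above (support [20, 55], core [30, 40], linear transitions) can be written down directly, which makes the quoted degrees easy to check:

```python
# Trapezoidal membership function of the label Long from Figure 7.5:
# support [20, 55], core [30, 40], linear transitions on [20, 30] and [40, 55].

def mu_long(alpha):
    if alpha <= 20 or alpha >= 55:
        return 0.0
    if 30 <= alpha <= 40:
        return 1.0
    if alpha < 30:
        return (alpha - 20) / 10   # rising transition on [20, 30]
    return (55 - alpha) / 15       # falling transition on [40, 55]

print(mu_long(21), mu_long(25), mu_long(35))  # 0.1 0.5 1.0
```

Note that specifying support, core and "linear transition" pins the function down completely here, but, as the text stresses, the linear shape itself is an arbitrary modelling choice.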

• Bearing in mind that the weights ω_{t,m} are used as cardinal weights in (7.4), while they are defined from the membership values µTi(t), µMj(m) and µWk(α), the membership values should have a cardinal interpretation. For example, because the membership value is 5 times larger, we need to consider that 25 minutes is 5 times better than 21 minutes to represent "Long". This is one more very strong hypothesis. Even when the commensurability assumption of membership scales is realistic, the weights cannot necessarily be interpreted as cardinal values, and the weighted aggregation proposed in equation (7.8) is questionable. As an illustration of the latter, consider the following example showing the impact of an increasing transformation of membership values on the output watering time.

Example (2). Consider the two following input vectors i1 = (29, 29) and i2 = (18, 16). For the sake of simplicity, we use the non-fuzzy labels given in Table 7.4 for the interpretation of the labels Wk, and equations (7.2) and (7.4) to define the watering time w. These two inputs lead to the activation weights given in Tables 7.13 and 7.14, and to the following watering times: w(i1) = 20 minutes and 34 seconds, w(i2) = 19 minutes and 42 seconds. Notice that the times are not so different, despite the important difference between inputs i1 and i2. This can easily be explained by observing that, in the second case, the temperature is lower, but the soil water content is also lower, and the two aspects compensate each other.

µMj \ µTi       Cold (0)    Cool (0)    Warm (0.2)   Hot (0.8)
Low (0)         0 (R10)     0 (R3)      0 (R2)       0 (R1)
Medium (0.1)    0 (R10)     0 (R6)      0.1 (R5)     0.1 (R4)
High (0.9)      0 (R10)     0 (R9)      0.2 (R8)     0.8 (R7)

Table 7.13: The weights of the rules for input i1

Now, we transform all membership functions of the labels by the function φ(x) = ∛x. This preserves the support and the core of each label, as well as the slope (increasing or decreasing) of the membership functions; it represents the same ordinal information about membership degrees. The activation tables are then altered as shown in Tables 7.15 and 7.16, and we obtain the following result: w(i1) = 19 minutes and 33 seconds and w(i2) = 21 minutes and 40 seconds. Note that we now have w(i1) < w(i2), whereas it was just the opposite before the transformation of the membership values. ♦

This example shows that the comparison of output values is not invariant to monotonic transformations of membership values, and this explains the "more than ordinal" interpretation of membership values in the computation of w. Although this inversion of durations is not a crucial problem in the case of the watering system, it could be more problematic in other contexts. For instance, if we use a similar system (based on fuzzy rules) to rank candidates in a competition, the choice of
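The order reversal at the heart of Example (2) is easy to reproduce on synthetic data. The two activation profiles below are illustrative inventions, not the actual Tables 7.13-7.16:

```python
# Two fictitious activation profiles {time (min): weight}. Profile A mixes
# a strongly supported long time with a weakly supported short one;
# profile B always yields 48 minutes whatever its (single) weight.
profile_a = {60: 0.4, 20: 0.1}
profile_b = {48: 0.7}

def cog(profile):
    # Eq. (7.4): weighted average of the candidate times
    return sum(t * w for t, w in profile.items()) / sum(profile.values())

def transform(profile):
    # phi(x) = x**(1/3): an increasing bijection of [0, 1] that keeps
    # supports, cores and the ordinal ranking of all membership degrees.
    return {t: w ** (1 / 3) for t, w in profile.items()}

before = (cog(profile_a), cog(profile_b))                        # A > B
after = (cog(transform(profile_a)), cog(transform(profile_b)))   # A < B
print(before, after)
```

The cube root compresses the gap between 0.4 and 0.1, so the short time of profile A gains relative influence and the ranking of the two outputs flips, exactly the phenomenon exhibited by Example (2).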

a particular shape for the membership functions must be well justified, because it may really change the winner.

µMj \ µTi       Cold (0.2)   Cool (0.8)   Warm (0)   Hot (0)
Low (0.4)       0.2 (R10)    0.4 (R3)     0 (R2)     0 (R1)
Medium (0.6)    0.2 (R10)    0.6 (R6)     0 (R5)     0 (R4)
High (0)        0 (R10)      0 (R9)       0 (R8)     0 (R7)

Table 7.14: The weights of the rules for input i2

µMj \ µTi        Cold (0)    Cool (0)    Warm (0.585)   Hot (0.928)
Low (0)          0 (R10)     0 (R3)      0 (R2)         0 (R1)
Medium (0.464)   0 (R10)     0 (R6)      0.464 (R5)     0.464 (R4)
High (0.965)     0 (R10)     0 (R9)      0.585 (R8)     0.928 (R7)

Table 7.15: The modified weights of the rules for input i1

µMj \ µTi        Cold (0.585)   Cool (0.928)   Warm (0)   Hot (0)
Low (0.737)      0.585 (R10)    0.737 (R3)     0 (R2)     0 (R1)
Medium (0.843)   0.585 (R10)    0.843 (R6)     0 (R5)     0 (R4)
High (0)         0 (R10)        0 (R9)         0 (R8)     0 (R7)

Table 7.16: The modified weights of the rules for input i2

Several alternatives to the weighted sum are compatible with ordinal weights and could be used advantageously to process them, e.g. Sugeno integrals (see Sugeno 1977, Dubois and Prade 1987, Dubois, Prade and Sabbadin 1998). However, they also have some limitations: they are not as discriminating as the weighted sum, and they cannot completely avoid commensurability problems (see Dubois, Fargier and Perny 1999).

There is no room here to discuss the use of numerical representations in rule-based automatic decision systems further. To go further with rule-based systems using fuzzy sets, the reader should consult the literature about fuzzy inference and fuzzy control, which has received much attention in the past decades. As a first set of references for theory and applications, one can consult (Mamdani 1981), (Sugeno 1985), (Bouchon 1995), (Gacogne 1997) and, for a recent synthesis on the subject, (Nguyen and Sugeno 1998). These works present formal models but also empirical principles derived from practical applications, and thus provide a variety of techniques that have proved

efficient in practice. Moreover, some theoretical justifications of the choices of representations and operators are now available, bringing justifications to some methods used by engineers in practical applications and also suggesting multiple improvements (see Dubois, Prade and Ughetto 1999).

7.3 A System with Implicit Decision Rules

7.3.1 Controlling the quality of biscuits during baking

The control of food processes is a typical example where humans traditionally play an important role to preserve the standard quality of the product. As an example, let us report some elements of an application concerning the control of the quality of biscuits through oven regulation during baking (for more details, see Perrot 1997, Perrot, Trystram, Le Guennec and Guely 1996, Trystram, Perrot and Guely 1995, Grabisch et al. 1997).

In the field of biscuit manufacturing, human operators controlling biscuit baking lines have the possibility of regulating the ovens during the baking process. For instance, when an overcooked biscuit is detected, the operator properly retroacts on the oven settings after checking its current temperature. The overall efficiency of the production lines and the quality of the final product highly depend on the ability of human supervisors to identify a degradation of the quality of the final product, as judged from the colour of the biscuits, and on the operator's skill in reacting to possible perturbations of the baking process by best fitting the control parameters to the current situation. This implies periodic evaluation, diagnosis and decision tasks that could perhaps be automatised. However, such automatisation is not obvious, because human expertise in oven control during the baking of biscuits mainly relies on a subjective evaluation, e.g. a visual inspection of the general aspect of the biscuits. In the case of an automatic system, the only information accessible to the system consists of physical objective parameters obtained from measures and sensors, which are not easily linked to human perception.

In the example of automatic diagnosis during baking, the only available measures are the following:

• a sensor located in the oven measures the air moisture. The evaluation m is given in cg/g (centigrams per one gram of dry matter) in the range [0, 10], with the desired values being around 4 cg/g;
• the thickness t of the biscuit is measured every 10 minutes; t is defined as the mean of 6 consecutive measures performed on biscuits and expressed in mm, and the desired values are about 33 or 34 mm;
• concerning the biscuit aspect, a colour sensor is located in the oven, near the biscuit line. It measures colours with 3 parameters, which are the luminance L, a level a on

the red-green axis and a level b on the yellow-blue axis. The desired colour is not easy to specify, especially concerning the aspect of the biscuit, which cannot be easily linked by the expert to the physical parameters (L, a, b) measured by an automatic system.

Like in many other domains, it is not always possible to obtain sufficiently explicit knowledge from the expert to construct a satisfactory rule database (in section 7.4 we will see an approach integrating expert rules in the control of baking). Thus, following the approach adopted in section 7.2 seems problematic. The following subsection presents an alternative way of establishing the link between subjective evaluations and objective descriptions, using similarity from known examples.

7.3.2 Automatising human decisions by learning from examples

In performing oven control, the decision-making process consists of two consecutive stages: a diagnosis stage, which consists in evaluating the state of the last biscuits, and a decision stage, which must determine a regulation action on the oven, if necessary. It is not unrealistic to assume that the usual disfunctions have been identified and categorised by the expert and that, for each of them, a standard regulation action is known. The diagnosis stage then consists in identifying the relevant pattern for any irregular biscuit (a pattern being a characteristic set of "irregular" biscuits), and the decision stage consists in performing the regulation action appropriate to the pattern.

The subjective evaluation of biscuits can be partially explained by their objective description. Let X be the set of all possible vectors x = (m, t, L, a, b) describing an object (e.g. a biscuit). Assuming a representative sample of biscuits is available, we can represent each biscuit i of the sample by a vector xi = (mi, ti, Li, ai, bi) in the multiple attribute space of the physical variables used to describe biscuits. Then, each biscuit can be evaluated by the expert, and a diagnosis of disfunction d(xi) can be obtained for each description xi, explaining the bad quality of biscuit i (e.g. "oven too hot", "oven not hot enough"). In this space, a pattern associated to each disfunction z is defined by the set of points xi such that d(xi) = z. Hence, assuming that a finite list of categories is implicitly used by the expert (each of them being associated to a pattern), the diagnosis task performed by the expert controlling baking can be seen as a pattern recognition task. In this context, the patterns are implicit and subjective. They can be approximated by observing the action of a human controller on the oven in a variety of cases; in this way, we can construct an explicit representation of the patterns in a more "objective" space formed by the observable variables.

Determining the right pattern for any new input vector x is a classification problem where the categories C1, ..., Cq are the q possible disfunctions and the objects to be assigned are vectors x = (m, t, L, a, b). Thus, a classification procedure can be seen as a function assigning to

each vector x ∈ X a vector (µC1(x), ..., µCq(x)) giving the membership of x to each category (e.g. each possible disfunction of the oven). One of the most popular classification methods is the so-called Bayes rule, which is known to minimise the expected error rate. However, the rule requires knowing the prior and conditional probability densities of all categories. When this information is not available (this is the case in our example), the nearest neighbour algorithm is very useful.

The basic principle of the k-Nearest Neighbour assignment rule (k-NN) introduced in (Fix and Hodges 1951) is to assign an object to the class to which the majority of its k nearest neighbours belong. More precisely, for any sample S ⊂ X of vectors whose correct assignment is known, if Nk(x) represents the subset of S formed by the k nearest neighbours of x within S, the k-NN rule is defined for any k ∈ {1, ..., n} by:

(7.10)  µCj(x) = 1 if j = Arg max_i { Σ_{y ∈ Nk(x)} µCi(y) }, and 0 otherwise

where Arg max_i g(i) represents the value i for which g(i) is maximal. Here, the function g(i) equals Σ_{y ∈ Nk(x)} µCi(y) and represents the total number of vectors, among the k nearest neighbours of x, that have been assigned to category i. This supposes that the maximum is reached for a unique i. When this is not the case, one can use a second criterion for discriminating between all g-maximal solutions or, alternatively, choose all of them. It has been proved that the error rate of the k-NN rule tends towards the optimal Bayes error rate when both k and n tend to infinity while k/n tends to 0 (see Cover and Hart 1967).

The main drawback of the k-NN procedure is that all elements of Nk(x) are equally weighted. Indeed, in most cases, the neighbours are not equally distant from x, and one may prefer to give less importance to neighbours very distant from x. For this reason, several weighted extensions of the k-NN algorithm have been proposed (see Keller, Gray and Givens 1985, Bezdek, Chuah and Leep 1986, Béreau and Dubuisson 1991). For example, the fuzzy k-NN rule proposed by Keller et al. (1985) is defined by:

(7.11)  µCj(x) = Σ_{y ∈ Nk(x)} µCj(y) ‖x − y‖^{−2/(m−1)} / Σ_{y ∈ Nk(x)} ‖x − y‖^{−2/(m−1)}

where m ∈ (1, +∞) is a technical parameter. In equation (7.11), the membership value µCj(x) is defined as the weighted average of the quantities µCj(y), y ∈ Nk(x), weighted by coefficients inversely proportional to a power of the Euclidean distance between x and y. Note that the membership induction of a new input x is also a matter of aggregation. This formula seems natural, but several points are questionable. Firstly, the choice of the weighted sum as an aggregator of the membership values µCj(y) for all y in the neighbourhood Nk(x) is not straightforward: it includes several implicit assumptions that are not necessarily valid (see chapter 6), and alternative compromise aggregators could possibly be used advantageously. The choice of a compromise operator itself can be criticised, and one can readily imagine cases where a disjunctive or a conjunctive operator should be preferred. Moreover, the use of weights linked to distances of type ‖x − y‖ and to the parameter m is not obvious.

Indeed, if Nk(x) = {y1, …, yk} and ψ is an aggregation function, we can use a general aggregation rule of the type:

(7.12)  µCj(x) = ψ(µCj(y1), …, µCj(yk)).

This is the proposition made in (Henriet 1995), (Henriet and Perny 1996) and (Perny and Zucker 1999), where the membership µCj(x) is defined by:

(7.13)  µCj(x) = 1 − Π_{i=1}^{k} (1 − ∼(x, yi) · µCj(yi)),

where ∼(x, y) is a fuzzy similarity relation representing the relative closeness of x and y for the expert. Moreover, even when the weighted arithmetic mean seems convenient, the norm of x − y is not necessarily a good measure of the relative dissimilarity between the two biscuits represented by x and y. This is the case, for instance, when the units are different and non-commensurate on the various axes. In order to distinguish between significant and non-significant differences on each dimension, one may include discrimination thresholds (see chapter 6) in the comparison, allowing differences that are significant for the expert to be distinguished from those that are negligible. For instance, one could define ∼(x, y) as a function of quantities of type xi − yi for any attribute i, taking ∼(x, y) as the weighted average of one-dimensional similarity indices (∼i(x, y), one per attribute i) defined as follows:

(7.14)  ∼i(x, y) = 1 if |xi − yi| ≤ qi;  1 − (|xi − yi| − qi)/(pi − qi) if qi < |xi − yi| < pi;  0 if |xi − yi| ≥ pi.

In the above formula, qi and pi are thresholds (possibly varying with the level xi or yi) used to define a continuous transition from full similarity to dissimilarity, as shown in the example given in Figure 7.7. This is particularly suitable in the field of subjective evaluation, in which the preferences and perceptions of the expert (or decision-maker) are not usually linearly related to the observable parameters. It should be noted, however, that the definition of the similarity indices ∼i(x, y) is very demanding: it requires assessing two thresholds for each attribute level xi. Usually, the construction of such similarity functions is only based on empirical evidence and common-sense principles. Moreover, the linear transition from similarity to non-similarity is not easy to justify, and a full justification of the shape of the similarity function ∼i would require a lot of information about differences of type xi − yi.

Coming back to the example, by analysing the measure x of the last biscuit, the k-NN algorithm can be used to periodically compute two coefficients µtoo hot(x) and µnot hot enough(x). These coefficients evaluate the necessity for a regulation action. For instance, µtoo hot(x) = 1 and µnot hot enough(x) = 0 means that decreasing the oven temperature is necessary.
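To make the assignment rules (7.10) and (7.11) concrete, here is a minimal sketch in Python. It is an illustration of the technique only, not code from the application; the function names and the sample data are invented for the example.

```python
import numpy as np

def knn_membership(x, samples, labels, k, num_classes):
    """Crisp k-NN rule (7.10): full membership in the class holding the
    majority among the k nearest neighbours (ties broken arbitrarily here)."""
    dist = np.linalg.norm(samples - x, axis=1)
    nearest = np.argsort(dist)[:k]
    votes = np.bincount(labels[nearest], minlength=num_classes)
    mu = np.zeros(num_classes)
    mu[np.argmax(votes)] = 1.0
    return mu

def fuzzy_knn_membership(x, samples, memberships, k, m=2.0, eps=1e-12):
    """Fuzzy k-NN rule (7.11): average of the neighbours' membership vectors,
    weighted by 1 / ||x - y||^(2/(m-1)); m > 1 is the technical parameter."""
    dist = np.linalg.norm(samples - x, axis=1)
    nearest = np.argsort(dist)[:k]
    w = 1.0 / (dist[nearest] ** (2.0 / (m - 1.0)) + eps)  # inverse-distance weights
    return (w @ memberships[nearest]) / w.sum()
```

When the neighbours carry crisp labels, the fuzzy rule returns graded memberships that sum to one, which is exactly the kind of graded output exploited below for µtoo hot(x) and µnot hot enough(x).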

The decision process is improved if we use the fuzzy version of the k-NN algorithm in the diagnosis stage: in this application, the values µtoo hot(x) and µnot hot enough(x) may actually take any value within the unit interval, and these values can be interpreted as indicators of the amplitude of the regulation, helping the system to choose a soft regulation action.

Figure 7.7: One-dimensional similarity indices ∼i(x, y) (the index equals 1 between xi − qi and xi + qi and decreases linearly to 0 at xi − pi and xi + pi)

The main drawback of this automatic decision process is the absence of explicit decision rules explaining the regulation actions. This is not a real drawback in this context, because the quality of the biscuits is a sufficient argument for validation. However, in many other decision problems involving an automatic system, the need for explanations is more crucial. This is the case, e.g., of the automatic pre-filtering of loan files in a bank, where explicit rules are needed, first to validate the system a priori, and secondly to explain the decisions a posteriori to the clients. In the baking application, it was possible to elicit decision rules for the diagnosis stage. The use of rules in the context of baking control is discussed in the next section.

7.4 A hybrid approach for automatic decision-making

In the case reported in (Perrot 1997) about the control of biscuits during baking, the diagnosis stage was not uniquely based on the k-NN algorithm. Indeed, the quality of the biscuit is evaluated by the expert on the basis of 3 subjectively evaluated attributes, which are the moisture (m), the thickness (t) and the aspect of the biscuit (colour). The qualifiers used for labelling these attributes are:

• moisture: "dry", "normal", "humid"
• thickness: "too thin", "good", "too thick"
• aspect: "burned", "overdone", "done", "not done", "underdone"

Then, the human expertise in the diagnosis stage is expressed using these labels by rules of the type:

If moisture is normal or dry and colour is overdone then the oven is too hot

If moisture is humid or normal and colour is underdone then the oven is not hot enough

It has therefore been decided to construct membership functions linking the parameters (m, t, L, a, b) to the labels used in the rules, in order to implement a hybrid approach: the k-NN algorithm yields a fuzzy symbolic description of the biscuit, and the fuzzy rule-based approach presented in section 7.2 then infers a regulation action. The numeric-symbolic translation is natural for moisture and thickness; the labels used for these two parameters are represented by the following fuzzy sets (see Figures 7.8 and 7.9).

Figure 7.8: Fuzzy labels used to describe biscuit moisture ("dry", "normal", "humid"; m axis breakpoints 3, 3.7, 4.8 and 5.8 cg/g)

Figure 7.9: Fuzzy labels used to describe biscuit thickness ("too thin", "good", "too thick"; t axis breakpoints 28, 32, 35 and 38 mm)

The translation is more difficult for the labels describing the aspect of the biscuit, because the aspect is represented by a fuzzy subset of the 3-dimensional space characterised by the components (L, a, b). This problem has been solved by the fuzzy k-NN algorithm. It is indeed sufficient to ask an expert in baking control to qualify, with a label yi, each element i of a representative sample of biscuits, using only the 5 labels introduced to describe aspect. At the same time, the sensors assess the vector xi = (Li, ai, bi) describing biscuit i in the physical space. Then the fuzzy k-NN algorithm is applied with the reference points (xi, yi) for all biscuits i in the sample. For any input x = (L, a, b), it gives the membership values µyj(x) for each of the labels yj, j ∈ {1, …, 5}, used to describe the biscuit's aspect.
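The numeric-symbolic translation and the firing of one diagnosis rule can be sketched as follows. The breakpoints are those read off the moisture figure (approximate), and the connectives are the usual fuzzy ones, max for "or" and min for "and"; the encoding is an illustrative assumption, not the implementation of (Perrot 1997).

```python
def trapezoid(x, a, b, c, d):
    """Piecewise-linear membership: 0 outside (a, d), 1 on [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def shoulder_left(x, c, d):
    """Full membership up to c, then linear decrease to 0 at d."""
    return 1.0 if x <= c else max(0.0, (d - x) / (d - c))

def shoulder_right(x, a, b):
    """0 up to a, then linear increase to full membership at b."""
    return 1.0 if x >= b else max(0.0, (x - a) / (b - a))

# Fuzzy labels for moisture (cg/g); breakpoints approximated from Figure 7.8
def moisture_labels(m):
    return {"dry": shoulder_left(m, 3.0, 3.7),
            "normal": trapezoid(m, 3.0, 3.7, 4.8, 5.8),
            "humid": shoulder_right(m, 4.8, 5.8)}

# "If moisture is normal or dry and colour is overdone then the oven is too
# hot", with max for 'or' and min for 'and'
def rule_too_hot(mu_moisture, mu_colour):
    return min(max(mu_moisture["normal"], mu_moisture["dry"]),
               mu_colour["overdone"])
```

For example, a biscuit with m = 3.5 cg/g is partly "dry" (about 0.29) and partly "normal" (about 0.71); if its colour is "overdone" to degree 0.8, the rule fires to degree min(max(0.71, 0.29), 0.8), about 0.71.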

The fuzzy nearest neighbour algorithm thus provides a representation of the labels yj, j = 1, …, 5, by fuzzy subsets of the (L, a, b) space. This makes it possible to resort to the fuzzy control approach presented in section 7.2. In the biscuit example, the integration of the k-NN algorithm into a fuzzy rule-based system provides a soft automatic decision system whose actions can be explained by the expert's rules. This control system can be integrated within a continuous regulation loop, alternating action and retroaction steps, as illustrated in Figure 7.10.

Figure 7.10: The action-retroaction loop controlling baking (the measures m, t, L, a, b of the biscuits feed a diagnosis module computing µtoo hot(x) and µnot hot enough(x); a decision module then derives a setting adjustment ∆t for the baking oven)

7.5 Conclusion

We have presented simple examples illustrating some basic techniques used to simulate human diagnosis, reasoning and decision-making. The task is difficult because human diagnosis is mainly based on human perception, whereas sensors naturally give numerical measures, and because human reasoning is mainly based on words and propositions drawn from the natural language, whereas computers are basically suited to perform numerical computations. We have shown the importance of constructing suitable mathematical representations of knowledge and decision rules. In the context of rule-based control systems, some simple and intuitive formal models have been proposed. They are based on the definition of fuzzy sets linking labels to observable numerical measures through membership functions, enabling a formal correspondence between symbolic and numeric information to be established, which is convenient for automatisation. However, as shown in this chapter, a proper use of these fuzzy sets requires a very careful analysis. Indeed, many "apparently natural" choices in the modelling process possibly hide strong assumptions that can turn out to be false in practice. In particular, small numerical examples given in the chapter show that, especially in the context of repeated decision problems, the output of the system highly depends on the choice of the numbers used to represent symbolic knowledge. One must be aware that multiplying arbitrary choices in the construction of membership functions can make the output of the system completely meaningless.

Moreover, we have shown that, at any level of computation, there is a need for weighting propositions and aggregating numerical information. This shows the great importance of mastering the variety of aggregation operations, their properties and the constraints to be satisfied in order to preserve the meaningfulness of the conclusions. It must be clear that, by not thoroughly respecting these constraints, the outputs of any automatic decision system are more the consequences of arbitrary choices in the modelling process than of a sound deduction justified by the observations and the decision rules. Designing an automatic decision process in which the arbitrary choice of the numbers used to represent knowledge is more decisive than the knowledge itself is certainly the main pitfall of the modelling exercise. Since one cannot reasonably expect to avoid all arbitrary choices in the modelling process, both theoretical and empirical validations of the decision system are necessary.

The theoretical validation consists in investigating the mathematical properties of the transfer function that forms the core of the decision module. This is the opportunity to control the continuity and the derivatives of the function, but also to check whether the computation of the outputs is meaningful with respect to the nature of the information given to the system as input. The empirical or practical validation consists in testing the decisional behaviour of the system in various typical states of the system. It takes the form of trial-and-error sequences enabling a progressive tuning of the fuzzy rule-based model to better approximate the expected decisional behaviour. This can be used to determine suitable membership functions characterising the rules; it can even be used to learn the rules themselves. Indeed, when a sufficiently rich basis of examples is available, the rules and the membership values can be learned automatically (see e.g. Bouchon-Meunier and Marsala 1999, or Nauck and Kruse 1999 for neuro-fuzzy methods in fuzzy rule generation). The neuro-fuzzy approach is very interesting for designing an automatic decision system, because it takes advantage of the efficiency of neural networks while preserving the "easy to interpret" feature of a rule-based system. Notice, however, that the learning-oriented approach is only possible when the decision task is completely understood and mastered by a human, due to the need for learning examples showing the system what the right decisions are in a great number of situations. This is usually the case when the automatisation of a decision task is expected, but one should be aware that this approach is not easily transposable to more complex decision situations where the preferences as well as the decision rules are still to be constructed.


8 DEALING WITH UNCERTAINTY: AN EXAMPLE IN ELECTRICITY PRODUCTION PLANNING

8.1 Introduction

In this chapter, we describe an application that was the theme of a research collaboration between an academic institution and a large company in charge of the production and distribution of electricity. The main purpose of this presentation is to show how difficult it is to build (or to improvise) a pragmatic decision model that is consistent and sound. It illustrates the interest and the importance of having well-studied formal models at our disposal when we are confronted with a decision problem. We do not give an exhaustive description of the work that was done and of the decision-aiding tool that was developed: a detailed presentation of the first discussions, of the difficulties encountered, of the hesitations and backtrackings, of the assumptions chosen, of the progressive formulation of the problem, of the methodology adopted and of the resulting software would require nearly a whole book. The description was thus voluntarily simplified, and some aspects, of minor interest in the framework of this book, were neglected. Our purpose is to point out some characteristics of the problem, especially the modelling of uncertainties. Sections 8.2 and 8.3 present the context of the application and the model that was established. Section 8.4 is based on a didactic example: it first illustrates and comments on some traditional approaches that could have been used in the application, then it gives a detailed description of the approach that was applied in the concrete case. Section 8.5 provides some general comments on the advantages and drawbacks of this approach.

8.2 The context

The company must periodically make some choices for the construction or closure

of coal, gas and nuclear power stations, in order to ensure the production of electricity and satisfy the demand. Due to the diversity of the points of view to be taken into account, the managers of the production department wanted to develop a multiple criteria approach for evaluating and comparing the potential actions. They considered that aggregating the financial, technical and environmental points of view into a type of generalised cost (see Chapter 5) was neither possible nor very serious. A collaboration was established between the company and an academic department (we will call it "the analyst"), which rapidly discovered that, beside the multiple criteria aspect, an enormous set of potential actions, a significant temporal dimension and a very high level of uncertainty on the data needed to be managed. The next section points out these aspects through the description of the model as it was formulated in collaboration with the company's engineers.

8.3 The model

8.3.1 The set of actions

In this chapter, we call decision a choice made at a specific point in time: it consists in choosing the number of production units of the different types of fuel (Nuclear, Coal, Gas) to be planned and in specifying whether the downgrade plan (previously defined by another department of the company) has to be followed, anticipated or delayed. For simplicity, the decisions are only taken at chosen milestones, separated by a time period of about 3 years (this period between two decisions is called a block). At most one unit of each type per year may be ordered. In terms of electricity production and delay, each unit and each modification of the downgrade plan has different specificities (see Table 8.1).

Type   Power (MW)   Delay (years)
N      900          9
C      400          6
G      350          3
A      −300         0
D      +300         0

Table 8.1: Power and construction delay for the different types of production unit

A decision for a block of 3 years could thus be, for example, {1N, 1C, 2G, A}, meaning that one nuclear, one coal and two gas production units are planned and that the downgrade plan has to be anticipated. The choice concerning the downgrade plan (follow, anticipate or delay) is of course exclusive.

Each decision is irrevocable and naturally has consequences for the future, not only in terms of the production of electricity, but also in terms of investment, exploitation cost, safety, environmental effects, etc. An action is a succession of decisions over the whole time period concerned by the simulation (the horizon), i.e. a period of about 20-25 years, or 7 blocks. An action is thus, for example, the list of 7 decisions {1N, 1C}, {2G}, {1C}, {}, {3G}, {1G, 2G}, {2G}. The number of possible actions is of course enormous, and many of them are completely unrealistic, as for example no new unit for 20 years, or 3G and 3C in every block. Even after adding some simple rules (only one, or zero, nuclear units are allowed, exclusively on the first and last blocks; anticipation and delay are only allowed on the first and second blocks; an anticipation followed by a delay, or the inverse, is forbidden), the number of actions is still around 10^8. Unrealistic actions can be eliminated by fixing reasonable limits on the power production of the park: in the application described here, the decision-maker only kept the actions such that, for each block, the surplus is less than 1 000 MW and the deficit is less than 200 MW. These limitations led to a set of approximately 100 000 potential actions. The temporal dimension of the problem naturally leads to a tree structure for these actions, built on decision nodes (represented by squares in Figure 8.1). Depending on the block considered, there are typically between 3 and 30 branches leaving each decision node. Remember that the purpose of the study was to build a decision-aiding methodology, not to make a decision.

8.3.2 The set of criteria

The list of criteria was defined by the industrial partner in order to avoid unbearable difficulties in data collection and to work on a sufficiently realistic situation. It was important to test the methodology with a realistic set of criteria, but it was also clear that the methodology should be independent of the criteria chosen. In this problem, the following eight criteria were taken into account, for the time period of the simulation:

• fuel cost, in Belgian Francs (BEF), to minimise;
• investment cost, in BEF, to minimise;
• exploitation cost, in BEF, to minimise;
• marginal cost, i.e. the variation of the total cost for a variation of 1 GWh, in BEF, to minimise;
• deficient power, in TWh, to minimise;

• CO2 emissions, in tons, to minimise;
• SO2 and NOx emissions, in tons, to minimise;
• purchase and sales balance, in BEF, to maximise.

The evaluations of the actions on these criteria are of course not known with certainty, because they depend on many factors that are not, or not well, known by the decision-maker. The uncertainties have an impact on the evaluations, which can be direct (the prices of the raw materials influence their total costs) or indirect (if the gas price increases more than the coal price, the coal power stations will be more intensively exploited than the gas ones; this will have an impact on the fuel costs and the environmental impacts of the production park). Table 8.2 presents an example of the evaluations of two particular actions,

A : {}, {2G}, {2G}, {2C}, {2C}, {3C}, {}
B : {1N, 2G, 1G}, {}, {}, {3G}, {3G}, {1N}, {2C}

in a scenario where the fuel price is low and the demand for electricity is relatively weak. Other scenarios must be envisaged in order to improve the realism and usefulness of the model.

                              A         B
Fuel cost (MBEF)              33 500    31 000
Exploitation cost (MBEF)      45 000    49 000
Investment cost (MBEF)        360 000   770 000
Marginal cost (KBEF/GWh)      730       620
Deficient power (TWh)         16.7      10.3
CO2 emissions (Ktons)         22 000    16 000
SO2 + NOx emissions (Ktons)   70        48
Sales balance (MBEF)          23 000    30 000

Table 8.2: The evaluations of two particular actions

8.3.3 Uncertainties and scenarios

Generally speaking, the determination of the value of a parameter at a given moment can lead to the following situations:

• the value is not known: the value is relative to the past and was not measured; the value is relative to the present but is technically impossible or very expensive to obtain; or the value is relative to the future for a parameter with a completely erratic evolution;
This information may modify the choices of the decision-maker. • the value is not unique: several measures did not yield the same value. with a certain information on the degree of reliability. the demand for electricity (same reasoning) and the legislation concerning pollution (in this example. may use the previous values of the uncertain parameters and deduce information from them about the future. but in both cases. Suppose for instance that a variable x may be equal to 0 or 1 in the future. or more severe. at a given time. a possibility or a conﬁdence index can be associated with each value of the interval. a conﬁdence index or the result of a voting process can be associated with each value. the law may change for the third block. several scenarios are possible. The industrial partner considered that nuclear availability in the future was completely independent of the knowledge of the past and called this type of uncertainty “alea”: this means that the level of nuclear availability was completely open for each period of three years (a breakdown at a given time does not imply that there will be no breakdown in the near future). In the particular situation described here. the interval is due to the imprecision of the measure or to the use of a forecasting method. THE MODEL 183 • the value can be approximated by an interval: the bounds result from the properties of the system considered.3. they were used to working with probabilities and the framework of the study did not allow to suggest anything else. the industrial partner was already using stochastic programming for the management of the production park. The selling price of electricity was also considered as an “alea” in order to be able to capture the deregulation phenomena due to a forthcoming new legislation. More precisely.8. again a probability. • the value is unique but not reliable. 
The “major uncertainties” (for which some dependence can exist between the values at diﬀerent moments) were the fuel price (the market presents global tendencies and a high price for the ﬁrst two blocks reinforces the probability of having a high price for the third one). two types of uncertainties were distinguished and respectively called “aleas” and “major uncertainties”: the diﬀerence between them is based on the more or less strong dependence between the past and the future. So. and the uncertain parameters after this block are thus strongly related: either the same as for the ﬁrst blocks. a possibility. sometimes. The corresponding probabilities are assessed as follows: . scenarios were deﬁned and subjective probabilities were assigned to them by the company’s experts. a probability. The “major uncertainties” allow for a learning process that must be taken into account in the analysis: each decision. For the uncertainties. constant over all blocks after block 2). however. He wanted to have another methodology in order to take better account of the number of potential actions and the multiple criteria aspects.

Because of the statistical dependence and of the possible learning process in the major uncertainty case, a complete treatment and a tree structure for these scenarios (a scenario is a succession of observed uncertainties) are necessary, allowing the conditional probability of each complete scenario to be computed, knowing the partial scenario already observed at time t: the "past scenario" is known at the time of the decision, while the complete scenario for a decision node at time t is not known, but a probability is associated with each of them. If there are 3 levels for the fuel price, 3 levels for the demand and 2 levels for the legislation, and if the horizon is divided into 7 blocks, there are, a priori, (3 × 3 × 2)^7, about 6 × 10^8, possible scenarios. Fortunately, most of these scenarios are negligible, because the probability of a very fluctuating scenario is very small: the "major uncertainty" scenarios are rather strongly correlated. A sequence of levels for the fuel price such as HHLMHLH (H for high, M for medium and L for low) is much less probable than a sequence HHHMMMM. For these reasons, it was imposed that scenarios could only change after two blocks, and each modification was penalised, so that very fluctuating scenarios were hardly possible; moreover, only two sequences were retained for legislation (MMMMMMM and MMHHHHH). The analyst finally retained around 200 representative scenarios, gathered in a tree structure of major uncertainty nodes (represented by circles in Figure 8.1).

The previous explanation is not valid for the "aleas", because their independence does not allow for direct inference from the past: the "aleas" are by essence uncorrelated, and there is no reason to neglect any scenario. Fortunately, the aleas act much more simply than the major uncertainties: the tree structure of the "aleas" is obvious, as each node gives rise to the same possibilities, with the same probability distribution, and it is possible to take the whole set of scenarios into account. If there are 3 levels for the selling price and 2 levels for the availability of nuclear units, then the number of scenarios is (3 × 2)^7 = 279 936.

8.3.4 The temporal dimension

Independently of the dependence between the past and the future in the modelling of the uncertainties, the temporal dimension plays an important role in this kind of problem. First, some consequences of the decisions appear after a very long time (as the environmental consequences, for example). Second, the time period between the decision to build a certain type of power station and the beginning of the exploitation of that station is far from being negligible.
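The scenario counts quoted above are easy to check:

```python
# 3 fuel-price levels x 3 demand levels x 2 legislation levels per block for
# the "major uncertainties"; 3 selling-price levels x 2 nuclear-availability
# levels per block for the "aleas"; 7 blocks in the horizon.
blocks = 7
major_uncertainty_scenarios = (3 * 3 * 2) ** blocks
alea_scenarios = (3 * 2) ** blocks

print(major_uncertainty_scenarios)  # 612220032, i.e. about 6 x 10^8
print(alea_scenarios)               # 279936
```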

Figure 8.1: The decision tree (first decisions, first period, second decisions, …, last decisions, last period, consequences)

Third, the consequences themselves can be dispersed over rather long periods and vary within these periods. Fourth, the consequences of a decision can be different according to the moment that decision is taken. It is rather usual, in planning models, to introduce a discounting rate that decreases the weight of the evaluations of distant consequences (see Chapter 5), and the industrial partner did this here. However, for a long-term decision problem with important consequences for future generations, such an approach may not be the best one, and the decision-maker could be more confident in the flexible approach and the richness of the scenarios. That is why the analyst kept the possibility to introduce discounting or not.

8.3.5 Summary of the model

The complete model can be described by a tree structure including decision nodes (squares) and uncertainty nodes (circles), as illustrated in Figure 8.1. At t = 0 (square node at the beginning of block 1), a first decision is made (a branch is chosen) without any information on the scenario. During block 1, one may observe the actual values of the uncertain parameters (nuclear availability, fuel price, electricity selling price, electricity demand and environmental legislation), determining one branch leaving the considered circle node and leading to one of the decision nodes at time t = 1. A new decision is then made, taking the previous information into account, leading to a circle node, and so on until the last decision (square) node and the last scenario (circle) node, which determine the whole action and the whole observed scenario. In the resulting tree (Figure 8.1), the decision nodes (squares) correspond to active parts of the analysis, where the decision-maker has to establish his strategy, while the uncertainty nodes (circles) correspond to passive parts, where the decision-maker undergoes the modifications of the parameters.

8.4 A didactic example

Consider Figure 8.2, describing two successive time periods. At time t = 0, two decisions A and B are eligible. During the first period, two events S and T are possible, each with probability 1/2. At the beginning of the second period, two decisions C and D are eligible if the first decision was A, and three decisions E, F and G are eligible if the first decision was B. During the second period, two events U and V are possible after S (with respective probabilities 1/4 and 3/4) and two events Y and Z are possible after T (with respective probabilities 3/4 and 1/4). Figure 8.2 presents the tree and the evaluation of each action (set of decisions) for each complete scenario. Remark that this didactic example contains only one

8. The game consists of tossing a coin repeatedly until the ﬁrst time it lands on “heads”. Of course. Remember the famous St. although B is better than A in two scenarios out of three. in any case.5) = 42/8.5) = 41/8 while the expected value of decision D is (1/4 × 4.4. applying the expected value approach. Of course. At node N2 (beginning of the second period). if the probabilities of S. this is only possible when the evaluations are elements of a numerical scale.4.e. At node N1. the amount would not be very big. the player wins 2k e. as illustrated in the example presented in Figure 8. the expected values of decisions A and B are respectively 39/8 and 5. 2k . one obtains the tree represented in Figure 8. the mean of the corresponding probability distributions for the evaluations. Making similar calculations for N3. the expected value presents some characteristics that the user must be aware of. So.2 Some comments on the previous approach Just as the weighted sum (already discussed in the other chapters of this book). so the best decision is B. the expected value of decision C is (1/4 × 7 + 3/4 × 4. A DIDACTIC EXAMPLE 187 evaluation for each action (problem with one criterion).8. would be completely compensated by a diﬀerence of three units in favour of D over C for event U because its probability is 1/4. i. We do not insist on the multiple criteria aspect of the problem here (this was treated in Chapter 6) and focus on the treatment of uncertainty. the nodes of the tree are considered from the leaves to the root (“folding back”) and the decisions are taken at each node in order to maximise their expected values. T and U are all equal to 1/3.5 + 3/4 × 5. whose probability is 3/4.2 = +∞. we see that the expected gain is ∞ k=1 1 k . Petersburg game (see for example Sinn 1983) showing that the expected value approach does not always represent the attitude of the decision-maker towards risk very well.4. 
evaluation for each action (problem with one criterion). We do not insist here on the multiple criteria aspect of the problem (this was treated in Chapter 6) and focus on the treatment of uncertainty.

8.4.1 The expected value approach

In the traditional approach, the nodes of the tree are considered from the leaves to the root ("folding back") and the decisions are taken at each node in order to maximise their expected values, i.e. the mean of the corresponding probability distributions for the evaluations. At node N2 (beginning of the second period), the expected value of decision C is (1/4 × 7 + 3/4 × 4.5) = 41/8, while the expected value of decision D is (1/4 × 4.5 + 3/4 × 5.5) = 42/8. So, the best decision at node N2 is D, and the expected value associated to N2 is 42/8. Making similar calculations for N3, N4 and N5, one obtains the tree represented in Figure 8.3. At node N1, the expected values of decisions A and B are respectively 39/8 and 5, so the best decision is B. In conclusion, the "optimal action" obtained by the traditional approach will consist in applying decision B at the beginning of the first period and decision E or G at the beginning of the second period, depending on whether the event that occurred in the first period was S or T. Of course, applying the expected value approach is only possible when the evaluations are elements of a numerical scale.

8.4.2 Some comments on the previous approach

Just as the weighted sum (already discussed in the other chapters of this book), the expected value presents some characteristics that the user must be aware of. In this approach, probabilities intervene as tradeoffs between the values for different events: the difference of one unit in favour of C over D for event V, whose probability is 3/4, would be completely compensated by a difference of three units in favour of D over C for event U, because its probability is 1/4. A consequence is that a big difference in favour of a specific decision in some scenario could be sufficient to overcome a systematic advantage for another decision in all the other scenarios. For example, if the probabilities of S, T and U are all equal to 1/3, the expected value will give preference to A, although B is better than A in two scenarios out of three, as illustrated in the example presented in Figure 8.4.

Remember the famous St. Petersburg game (see for example Sinn 1983), showing that the expected value approach does not always represent the attitude of the decision-maker towards risk very well. The game consists of tossing a coin repeatedly until the first time it lands on "heads": if this happens on the k-th toss, the player wins 2^k e. The question is to find out how much a player would be ready to bet in such a game. Applying the expected value approach, we see that the expected gain is Σ_{k=1}^{∞} (1/2^k) · 2^k = +∞. Of course, the answer depends on the player but, in any case, the amount would not be very big.
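The folding-back computation at node N2 can be sketched as follows; the tree encoding is an illustrative choice, and only the N2 values documented above are used.

```python
from fractions import Fraction

def fold_back(node):
    """Expected-value folding back: leaves return their value, decision
    nodes keep the best child, chance nodes take the expectation."""
    if "value" in node:
        return Fraction(node["value"]).limit_denominator()
    if node["kind"] == "decision":
        return max(fold_back(child) for child in node["children"].values())
    return sum(p * fold_back(child) for p, child in node["children"])

# Node N2: decisions C and D, then events U (probability 1/4) and V (3/4)
N2 = {"kind": "decision", "children": {
    "C": {"kind": "chance", "children": [(Fraction(1, 4), {"value": 7}),
                                         (Fraction(3, 4), {"value": 4.5})]},
    "D": {"kind": "chance", "children": [(Fraction(1, 4), {"value": 4.5}),
                                         (Fraction(3, 4), {"value": 5.5})]},
}}

print(fold_back(N2))  # 21/4, i.e. 42/8: decision D is kept at N2
```

The same recursion applied to the whole tree of Figure 8.2 reproduces the folded-back tree of Figure 8.3.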

CHAPTER 8. DEALING WITH UNCERTAINTY

Figure 8.2: A didactic example (two-period decision tree: decisions A and B at node N1; first-period events S (1/2) and T (1/2) leading to nodes N2–N5; decisions C and D at N2 and N3, decisions E, F and G at N4 and N5; second-period events U (1/4), V (3/4), Y (3/4) and Z (1/4); the terminal evaluations at leaves N6 to N25 are those reported in the tables of the text).

The expected utility model, which is the subject of the next section, allows to resolve this paradox and, more generally, to take different possible attitudes towards risk into account.

8.4.3 The expected utility approach

As the preferences of the decision-maker are not necessarily linearly linked to the evaluations of the actions, it may be useful to replace these evaluations by the "psychological values" they have for the decision-maker through so-called utility functions (Fishburn 1970). Denoting by u(x_i) the utility of the evaluation x_i, the expected utility of a decision leading to the evaluation x_i with probability p_i (i = 1, ..., n) is given by ∑_i p_i u(x_i). This model dates back at least to Bernoulli (1954) but the basic axioms, in terms of preferences, were only studied in the present century (see for instance von Neumann and Morgenstern 1944).

In the case of the St. Petersburg game, if we denote by u(x) the utility of "winning x €", the expected utility of refusing the game is u(0), while the expected utility of betting an amount of s € in the game is ∑_{k=1}^{∞} (1/2^k) u(2^k − s). As an exercise, the reader can verify that for a utility function defined by

u(x) = x/2^20 if x ≤ 2^20,
u(x) = 1 if x > 2^20,

the expected utility of betting s € in the game is positive (hence superior to the expected utility of refusing the game) as long as s is less than 21(1 − 1/2^20) €, and is negative for larger values. The expected utility can also be finite with an unbounded utility function such as, for example, the logarithmic function.

In the example in Figure 8.2 and with a utility function defined by u(1) = u(2) = 1, u(3) = u(3.5) = 2, u(4.5) = u(5) = u(5.5) = 3, u(6) = u(7) = 4, we obtain the tree given in Figure 8.5. The optimal action is then to apply decision A at the beginning of the first period and decision C at the beginning of the second period, contrary to what was obtained with the expected value approach.
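The St. Petersburg computations above can be checked numerically. The sketch below truncates the infinite series at k = 200 (an assumption of ours; the neglected tail is smaller than 2^−200) and, rather than asserting the exact break-even stake, only checks the sign of the expected utility of betting on either side of 21 €.

```python
def u(x):
    """Bounded utility from the text: linear up to 2**20, constant beyond."""
    return x / 2**20 if x <= 2**20 else 1.0

def eu_bet(s, kmax=200):
    """Expected utility of paying s to play: win 2**k with probability 2**-k.
    The series is truncated at kmax; the tail is below 2**-kmax."""
    return sum(u(2**k - s) / 2**k for k in range(1, kmax + 1))

def expected_value(kmax=60):
    """Truncated expected gain: every term (1/2**k) * 2**k equals 1,
    so the sum grows without bound as kmax increases."""
    return sum((1 / 2**k) * 2**k for k in range(1, kmax + 1))
```

With this bounded utility, betting a stake of 20 € has positive expected utility while a stake of 22 € has negative expected utility, in line with the break-even stake of about 21 € stated in the text; the raw expected value, by contrast, just keeps adding 1 per term.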

Figure 8.3: Application of the expected value approach (folded-back tree: the best decisions and the associated expected values are D with 5.25 at N2, C with 4.5 at N3, E with 5 at N4 and G with 5 at N5; S and T each have probability 1/2).

Figure 8.4: Illustration of the compensation effect (under events S, T and U, A yields 10, 15 and 20, while B yields 15, 20 and 9).

Figure 8.5: Application of the expected utility approach (folded-back tree: the best decisions are C at N2, with expected utility 13/4, C at N3, E at N4, with expected utility 11/4, and E at N5).

8.4.4 Some comments on the expected utility approach

Much literature is devoted to this approach, the probabilities being objective or subjective: see for example Savage (1954), Luce and Raiffa (1957), Ellsberg (1961), Fishburn (1970), Allais and Hagen (1979), Fishburn (1982), McCord and de Neufville (1983), Bell et al. (1988), Loomes (1988), Barbera, Hammond and Seidl (1998). We simply recall one or two characteristics here that every user should be aware of. As in every model, the expected utility approach implicitly assumes that the preferences of the decision-maker satisfy some properties that can be violated in practice. The following example illustrates the well-known Allais paradox (see Allais 1953).

It is not unusual to prefer a guaranteed gain of 500 000 € to an alternative providing 2 500 000 € with probability 0.1, 500 000 € with probability 0.89 and 0 € with probability 0.01. Applying the expected utility model leads to the following inequality:

u(500 000) > 0.1u(2 500 000) + 0.89u(500 000) + 0.01u(0),

hence, grouping the terms,

0.11u(500 000) > 0.1u(2 500 000) + 0.01u(0).

At the same time, it is reasonable to prefer an alternative providing 2 500 000 € with probability 0.1 and 0 € with probability 0.9 to an alternative providing 500 000 € with probability 0.11 and 0 € with probability 0.89. In this case, the expected utility model yields

0.1u(2 500 000) + 0.9u(0) > 0.11u(500 000) + 0.89u(0),

hence, grouping the terms,

0.1u(2 500 000) + 0.01u(0) > 0.11u(500 000),

which is in contradiction with the inequality obtained above.
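The contradiction behind the Allais paradox can be exhibited by a small search: no assignment of utilities satisfies both stated preferences at once, since the first requires 0.11·u(500 000) > 0.1·u(2 500 000) (with u(0) = 0) and the second requires the reverse strict inequality. The sketch below is ours; it uses exact rational arithmetic to avoid floating-point edge cases.

```python
import random
from fractions import Fraction as F

random.seed(0)
both_satisfied = 0
for _ in range(10000):
    u0 = F(0)                                    # u(0), normalised to 0
    u5 = F(random.randint(0, 1000), 1000)        # u(500 000)
    u25 = u5 + F(random.randint(0, 1000), 1000)  # u(2 500 000) >= u(500 000)
    # Preference 1: the sure 500 000 over the (0.1, 0.89, 0.01) lottery.
    pref1 = u5 > F(1, 10) * u25 + F(89, 100) * u5 + F(1, 100) * u0
    # Preference 2: the (0.1, 0.9) lottery over the (0.11, 0.89) lottery.
    pref2 = F(1, 10) * u25 + F(9, 10) * u0 > F(11, 100) * u5 + F(89, 100) * u0
    if pref1 and pref2:
        both_satisfied += 1
```

Whatever utilities are drawn, `both_satisfied` stays at zero: the two modal preferences are mutually exclusive under expected utility.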

So, the expected utility model cannot explain the two previous preference situations simultaneously. A possible attitude in this case is to consider that the decision-maker should revise his judgment in order to be more "rational", that is, in order to satisfy the axioms of the model. Another interpretation is that the expected utility approach sometimes implies unreasonable constraints on the preferences of the decision-maker (in the previous example, the violated property is the so-called independence axiom of von Neumann and Morgenstern). This last interpretation led scientists to propose many variants of the expected utility model, as in Kahneman and Tversky (1979), Machina (1982, 1987), Bell et al. (1988), Barbera et al. (1998).

Before explaining why the expected utility model (or one of its variants) was not applied by the analyst in the electricity production planning problem, let us mention why using probabilities may cause some trouble in modelling uncertainties or risk. The following example illustrates the so-called Ellsberg paradox and is extracted from Fishburn (1970, p. 172). An urn contains one white ball (W) and two other balls. You only know that the two other balls are either both red (R), or both green (G), or one is red and one is green. Consider the two situations in Table 8.3, where W, R and G represent the three states according to whether one ball drawn at random is white, red or green. The figures are what you will be paid (in Euros) after you make your choice and a ball is drawn.

Table 8.3
      W     R     G
A    100    0     0
B     0    100    0
C    100    0    100
D     0    100   100

Intuition leads many people to prefer A to B and D to C, while the expected utility approach leads to indifference between A and B as well as between C and D. This type of situation shows that the use of the probability concept may be debatable for representing attitude towards risk or uncertainty. So, other tools (possibility theory, belief functions or fuzzy integrals) can also be envisaged.
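The Ellsberg situation can be checked in the same spirit. Whatever subjective probability is assigned to drawing a red ball (p(W) = 1/3 is known, and p(R) + p(G) = 2/3), expected utility cannot produce both modal preferences A ≻ B and D ≻ C at once. The sketch below is ours and assumes, for simplicity, utilities linear in the payoffs; any increasing utility with u(0) = 0 gives the same conclusion for these bets.

```python
def eu(payoffs, p_red):
    """Subjective expected utility with p(W) = 1/3 and p(R) + p(G) = 2/3."""
    p = {"W": 1 / 3, "R": p_red, "G": 2 / 3 - p_red}
    return sum(p[s] * payoffs[s] for s in p)

# The four acts of Table 8.3 (payoffs in Euros).
A = {"W": 100, "R": 0,   "G": 0}
B = {"W": 0,   "R": 100, "G": 0}
C = {"W": 100, "R": 0,   "G": 100}
D = {"W": 0,   "R": 100, "G": 100}

# Scan p(R) over a grid of admissible values and look for a probability
# under which both intuitive preferences hold simultaneously.
conflicts = [p / 100 for p in range(0, 67)
             if eu(A, p / 100) > eu(B, p / 100)
             and eu(D, p / 100) > eu(C, p / 100)]
```

The list stays empty: A ≻ B forces p(R) < 1/3 while D ≻ C forces p(R) > 1/3, whatever the grid point.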

8.4.5 The approach applied in this case: first step

We will now present the approach that was applied in the electricity production planning problem. This approach is certainly not ideal (some drawbacks will be pointed out in the presentation). In the electricity production planning problem described in Section 8.2, the analyst did not know whether the probabilities given by the company were really probabilities (and not "plausibility coefficients"), and it was not sure that the consequences of one scenario were really comparable to the consequences of another. Moreover, it was definitely excluded to transform all the consequences into money and to aggregate them with a discounting rate (as in Chapter 5). On the other hand, the company was not prepared to devote much time to the clarification of the probabilities and to long discussions about the multiple criteria and dynamic aspects of the problem, so that it was impossible to envisage an enriched variant of the expected utility model.

The analyst therefore decided to propose a paired comparison of the actions, scenario by scenario. The comparison between two decisions was made on the basis of the differences in preference between them for each of the considered events, similarly to what is done in the Promethee method (Brans and Vincke 1985). Let us consider a preference function defined by

f(x) = 1 ∀x > 1,
f(x) = 0 elsewhere,

where x is the difference in the evaluations of two decisions. This function expresses the fact that a difference which is smaller than or equal to 1 is considered to be non significant. Other functions can be defined similarly to what is done in the Promethee method. As we see, an advantage of this approach is to enable the introduction of indifference thresholds, thus avoiding some of the pitfalls mentioned in Chapter 6 on the multi-attribute value functions. Moreover, it does not aggregate the multiple criteria consequences of the decisions into a single dimension, it does not introduce a discounting rate for the dynamic aspect (see Chapter 5) and it allows to model the particular preferences of the decision-maker along each evaluation scale.

Let us illustrate this approach on the didactical example presented in Figure 8.2. At node N2, we have to consider Table 8.4.

Table 8.4
Events    U     V
Probab.  1/4   3/4
C         7    4.5
D        4.5   5.5

On the basis of the data contained in Table 8.4, the analyst proposed the following index to measure the preference of C over D:

1/4 × f(7 − 4.5) + 3/4 × f(4.5 − 5.5) = 1/4.
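The analyst's preference function and index can be written in a few lines. The sketch below (our own helper names, exact rational arithmetic) reproduces the computation for Table 8.4 at node N2.

```python
from fractions import Fraction as F

def f(x):
    """Preference function of the text: differences of at most 1 are
    non-significant, larger differences count fully."""
    return F(1) if x > 1 else F(0)

def pref_index(a, b, probs):
    """Probability-weighted preference of a over b, event by event."""
    return sum(p * f(xa - xb) for p, xa, xb in zip(probs, a, b))

# Table 8.4 (node N2): events U (1/4) and V (3/4).
probs = [F(1, 4), F(3, 4)]
C = [F(7), F(9, 2)]
D = [F(9, 2), F(11, 2)]

pi_CD = pref_index(C, D, probs)  # 1/4 * f(2.5) + 3/4 * f(-1)
pi_DC = pref_index(D, C, probs)  # 1/4 * f(-2.5) + 3/4 * f(1)
```

As in the text, the preference of C over D is 1/4 and the preference of D over C is 0 (the difference 5.5 − 4.5 = 1 falls within the indifference threshold).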

The preference of D over C is given by 1/4 × f(4.5 − 7) + 3/4 × f(5.5 − 4.5) = 0. These preference indices are summarised in Table 8.5.

Table 8.5
      C     D
C     0    1/4
D     0     0

The score of each decision is then the sum of the preferences of this decision over the others minus the sum of the preferences of the others over it. The maximum score determines the chosen decision. In the case of Table 8.5, this trivially gives 1/4 and −1/4 as respective scores for C and D, so that the chosen decision at node N2 is C. Remark that, despite the analyst's doubts about the real nature of the "probabilities", he used them to calculate a sort of expected index of preference for each decision over each other decision. This is certainly a weak point of the method, and other tools, which will be described in a volume in preparation, could have been used here. Note also that, in the multiple criteria case, a (probability weighted) sum is computed for all the criteria in order to obtain the global score of a decision.

At node N3, we have to consider Table 8.6.

Table 8.6
Events    Y     Z
Probab.  3/4   1/4
C        4.5   4.5
D         1     5

The preference index of C over D is 3/4 × f(4.5 − 1) + 1/4 × f(4.5 − 5) = 3/4, while the preference index of D over C is 3/4 × f(1 − 4.5) + 1/4 × f(5 − 4.5) = 0, leading to the preference indices presented in Table 8.7.

Table 8.7
      C     D
C     0    3/4
D     0     0

The scores of C and D are respectively 3/4 and −3/4, so that the chosen decision at node N3 is also C.

At node N4, decision E dominates F and G and is thus chosen (where "dominates" means "is better in each scenario"). At node N5, we must consider Table 8.8, leading to the preference indices presented in Table 8.9. The preference index of G over E (for example) is 3/4 × f(5 − 6) + 1/4 × f(5 − 1) = 1/4.

Table 8.8
Events    Y     Z
Probab.  3/4   1/4
E         6     1
F         2     2
G         5     5

Table 8.9
      E     F     G
E     0    3/4    0
F     0     0     0
G    1/4    1     0

The scores yield 1/2, −7/4 and 5/4 for E, F and G respectively, so that G is the chosen decision at node N5.

We can now consider Table 8.10, associated to N1. The values in this table are those that correspond to the decisions chosen at the nodes N2 to N5 (they are indicated in parentheses).

Table 8.10
Scenarios   S-U     S-V     T-Y     T-Z
Probab.     1/8     3/8     3/8     1/8
A          7(C)   4.5(C)  4.5(C)  4.5(C)
B        3.5(E)  5.5(E)    5(G)    5(G)

On basis of this table, the preference of A over B is 1/8 f(3.5) + 3/8 f(−1) + 3/8 f(−0.5) + 1/8 f(−0.5) = 1/8, while the preference of B over A is 1/8 f(−3.5) + 3/8 f(1) + 3/8 f(0.5) + 1/8 f(0.5) = 0, giving A as the best first decision. In conclusion, the "optimal action" obtained through this first step consists in choosing A at the beginning of the first period and C at the beginning of the second period.
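The net-flow scores used to select a decision at each node can be computed mechanically. The following sketch (our helper names) reproduces the figures for node N5 (Table 8.8) and node N1 (Table 8.10).

```python
from fractions import Fraction as F

def f(x):
    return F(1) if x > 1 else F(0)

def net_flow_scores(table, probs):
    """Score of each decision: the sum of its preference indices over the
    others minus the sum of the others' indices over it."""
    pi = {(a, b): sum(p * f(xa - xb)
                      for p, xa, xb in zip(probs, table[a], table[b]))
          for a in table for b in table if a != b}
    return {a: sum(pi[a, b] - pi[b, a] for b in table if b != a) for a in table}

# Node N5 (Table 8.8): events Y (3/4) and Z (1/4).
scores_N5 = net_flow_scores({"E": [6, 1], "F": [2, 2], "G": [5, 5]},
                            [F(3, 4), F(1, 4)])

# Node N1 (Table 8.10): scenarios S-U, S-V, T-Y, T-Z.
scores_N1 = net_flow_scores(
    {"A": [F(7), F(9, 2), F(9, 2), F(9, 2)],
     "B": [F(7, 2), F(11, 2), F(5), F(5)]},
    [F(1, 8), F(3, 8), F(3, 8), F(1, 8)])
```

The scores come out as 1/2, −7/4 and 5/4 for E, F and G (so G is kept at N5), and 1/8 and −1/8 for A and B (so A is the best first decision), as in the text.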

8.4.6 Comments on the first step

As this approach is based on successive pairwise comparisons, it also presents some pitfalls which must be mentioned. The approach allows to take the comparisons of the decisions separately for each scenario into account. Let us illustrate this point for the example of Figure 8.4, where 9 has been replaced by 10 in the evaluation of B for event U. If the probabilities of S, T and U are equal to 1/3, the expected utility approach gives the same value 1/3 (u(10) + u(15) + u(20)) to A and B, that are thus considered as indifferent. However, if we compare A and B separately for each event, we see that B is better than A for events S and T, that is, with a probability equal to 2/3. The approach described in this section gives a preference index of A over B equal to 1/3 × f(10 − 15) + 1/3 × f(15 − 20) + 1/3 × f(20 − 10) and a preference index of B over A equal to 1/3 × f(15 − 10) + 1/3 × f(20 − 15) + 1/3 × f(10 − 20). Making the (natural) assumption that f(x) = 0 when x is negative, we see that this approach will lead to indifference between A and B only with a function f such that f(20 − 10) = f(15 − 10) + f(20 − 15); with the function f used above, it leads to the choice of B.

The example presented in Figure 8.6 will allow to illustrate a first drawback. In this example, three periods of time are considered, but there are no uncertainties during the first two periods. Two decisions A and B are possible at the beginning of the first period. At the beginning of the second period, two decisions C and D are possible after A and only one decision is possible after B. At the beginning of the third period, two decisions E and F are possible after C, while only one decision is possible in each of the other cases. During the last period, three events S, T and U can occur, each with a probability of 1/3.

Let us apply the approach described in Section 8.4.5 with the same function f as before. At node N4, the preference index of E over F will be 1/3 × f(10 − 15) + 1/3 × f(15 − 20) + 1/3 × f(20 − 0) = 1/3, while the preference index of F over E will be 1/3 × f(15 − 10) + 1/3 × f(20 − 15) + 1/3 × f(0 − 20) = 2/3, so that F will be the decision chosen at node N4. At node N2, we must consider Table 8.11, where the values of C are those of F (the decision chosen at node N4). With the same function f as before, we compute the preference index of C over D by 1/3 × f(15 − 20) + 1/3 × f(20 − 0) + 1/3 × f(0 − 5) = 1/3, and the preference of D over C by 1/3 × f(20 − 15) + 1/3 × f(0 − 20) + 1/3 × f(5 − 0) = 2/3.
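The whole pitfall can be replayed end to end. The sketch below (our helper names) folds the tree of Figure 8.6 back with the first-step rule and shows that the dominated action B is nevertheless selected at N1.

```python
from fractions import Fraction as F

def f(x):
    return F(1) if x > 1 else F(0)

def best_by_net_flow(table, probs):
    """Pick the decision with the maximal net-flow score."""
    pi = {(a, b): sum(p * f(xa - xb)
                      for p, xa, xb in zip(probs, table[a], table[b]))
          for a in table for b in table if a != b}
    scores = {a: sum(pi[a, b] - pi[b, a] for b in table if b != a) for a in table}
    return max(scores, key=scores.get)

p = [F(1, 3)] * 3  # events S, T, U

# Node N4: E versus F.
choice_N4 = best_by_net_flow({"E": [10, 15, 20], "F": [15, 20, 0]}, p)
# Node N2: C inherits the values of the decision chosen at N4.
c_values = {"E": [10, 15, 20], "F": [15, 20, 0]}[choice_N4]
choice_N2 = best_by_net_flow({"C": c_values, "D": [20, 0, 5]}, p)
# Node N1: A inherits the values of the decision chosen at N2.
a_values = {"C": c_values, "D": [20, 0, 5]}[choice_N2]
choice_N1 = best_by_net_flow({"A": a_values, "B": [0, 5, 10]}, p)

# Yet B is dominated by the action (A, C, E): (10, 15, 20) beats (0, 5, 10)
# in every scenario.
dominated = all(x > y for x, y in zip([10, 15, 20], [0, 5, 10]))
```

The local comparisons pick F, then D, then B, even though (A, C, E) dominates B: the comparisons are "too local" in the tree.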

Figure 8.6: A pitfall of the first step (three-period tree: decisions A and B at N1; C and D after A at node N2; E and F after C at node N4; events S, T and U, each with probability 1/3, in the last period; the terminal values at leaves N6 to N18 are those reported in Tables 8.11 to 8.13).

Table 8.11
Events    S     T     U
Probab.  1/3   1/3   1/3
C        15    20     0
D        20     0     5

So D will be the decision chosen at node N2. At node N1, we must consider Table 8.12, where the values of A are those of D (the decision chosen at node N2).

Table 8.12
Events    S     T     U
Probab.  1/3   1/3   1/3
A        20     0     5
B         0     5    10

On basis of Table 8.12, the preference index of A over B is given by 1/3 × f(20 − 0) + 1/3 × f(0 − 5) + 1/3 × f(5 − 10) = 1/3, while the preference index of B over A is 1/3 × f(0 − 20) + 1/3 × f(5 − 0) + 1/3 × f(10 − 5) = 2/3, so that B will be chosen at node N1. In conclusion, the methodology leads to the choice of the action B despite the fact that it is dominated by the action (A,C,E), as is shown in Table 8.13. This is due to the fact that the comparisons are "too local" in the tree.

Table 8.13
Events     S     T     U
Probab.   1/3   1/3   1/3
B           0     5    10
(A,C,E)    10    15    20

Moreover, in the concrete application described in this chapter, another drawback was the fact that, for decisions at nodes relative to the last periods, the evaluations were not very different, due to the large common part of the actions and scenarios preceding these decisions. The conclusion was many indifferences between the decisions at each decision node. To improve the methodology, the analyst proposed to introduce a second step, which is the subject of the next section.

8.4.7 The approach applied in this case: second step

In order to introduce more information into the comparisons of local decisions and to take the tree as a whole into account, a second step was added by the analyst.

At each decision node, the local decisions are also compared to the best actions in the same scenarios in each of the branches of the tree. In Figure 8.2, at node N2, C and D are also compared to the best decision in N4, i.e. to E (after event S). This leads to the consideration of Table 8.14.

Table 8.14
Events    U     V
Probab.  1/4   3/4
C         7    4.5
D        4.5   5.5
E(N4)    3.5   5.5

Using the same preference function as before, the preference of C over D is still 1/4 (see Section 8.4.5) and the preference of D over C is still 0. The preference of C over E is [1/4 × f(3.5) + 3/4 × f(−1)] = 1/4, the preference of E over C is [1/4 × f(−3.5) + 3/4 × f(1)] = 0, the preference of D over E is [1/4 × f(1) + 3/4 × f(0)] = 0 and the preference of E over D is [1/4 × f(−1) + 3/4 × f(0)] = 0. Table 8.15 summarises these values.

Table 8.15
      C     D     E
C     0    1/4   1/4
D     0     0     0
E     0     0     0

The scores for C and D are respectively 1/2 and −1/4; C is therefore chosen at node N2. At node N3, C and D are also compared to the best decision in N5, i.e. with G (after event T), on basis of Table 8.16.

Table 8.16
Events    Y     Z
Probab.  3/4   1/4
C        4.5   4.5
D         1     5
G         5     5

The scores of C and D are respectively 3/4 and −3/2, so that C is also chosen in N3. The analysis of N4 (comparison of E, F, G and C (N2)) and of N5 (comparison of E, F, G and C (N3)) leads to the same conclusions as in the first step, so that, in this example, the second step does not change anything. However, the interest of this second step is to choose, at each decision node, a decision leading to a final result that is strong not only locally, but also in comparison with the strongest results obtained during the first step in the other branches of the tree (always in the same scenarios).
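The second-step computation at node N2 can be reproduced as follows (values from Table 8.14; the helper names are ours).

```python
from fractions import Fraction as F

def f(x):
    return F(1) if x > 1 else F(0)

def net_flow_scores(table, probs):
    """Net-flow score of each decision within the comparison set."""
    pi = {(a, b): sum(p * f(xa - xb)
                      for p, xa, xb in zip(probs, table[a], table[b]))
          for a in table for b in table if a != b}
    return {a: sum(pi[a, b] - pi[b, a] for b in table if b != a) for a in table}

# Second step at node N2: the local decisions C and D are also compared with
# E, the best decision of node N4 in the same scenarios (Table 8.14).
table = {"C": [F(7), F(9, 2)],
         "D": [F(9, 2), F(11, 2)],
         "E": [F(7, 2), F(11, 2)]}
scores = net_flow_scores(table, [F(1, 4), F(3, 4)])
```

The scores of C and D come out as 1/2 and −1/4, as in the text, so C remains the local choice at N2.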

Table 8.17 summarises the preference indices obtained at node N3 in this second step.

Table 8.17
      C     D     G
C     0    3/4    0
D     0     0     0
G     0    3/4    0

This second step is illustrated further by the example in Figure 8.6, where it works as follows. At node N4, we compare E and F with D and B (the best actions in the other branches, as they are unique), through Table 8.18.

Table 8.18
Prob.   1/3   1/3   1/3
E       10    15    20
F       15    20     0
D       20     0     5
B        0     5    10

Table 8.19 presents the preference indices.

Table 8.19
      E     F     D     B
E     0    1/3   2/3    1
F    2/3    0    1/3   2/3
D    1/3   2/3    0    1/3
B     0    1/3   2/3    0

The scores of E and F respectively become 1 and 1/3, so that the best decision at N4 is now E. At N2, we have to compare C (followed by E) with D and B (the best action in the other branch): the scores of C and D are respectively 4/3 and −2/3, so that the best decision in N2 is now C. At N1, we have to compare A (followed by C and E) with B, and we choose A (which dominates B). So we see that this second step somehow avoids choosing dominated actions, although this property is not guaranteed in all cases.

8.5 Conclusions

This approach (first and second steps) was successfully implemented and applied by the company (after many difficulties due to the combinatorial aspects of the problem) and some visual tools were developed in order to facilitate the decision-maker's understanding of the problem.

Let us now summarise the characteristics of this approach. It presents the following advantages:
• it compares the consequences of a decision in a scenario with the consequences of another decision in the same scenario;
• it allows to introduce indifference thresholds or, more generally, to model the preferences of the decision-maker for each evaluation scale.
However, this approach also presents some mysterious aspects that should be more thoroughly investigated:
• it computes a sort of expected index of preference for each action over each other action, although the role of the so-called probabilities is not that clear in the modelling of uncertainty;
• it is a rather bizarre mixture of local (first step) and global (second step) comparisons of the actions;
• it somehow avoids dominated actions, but it does not guarantee that the chosen action is non-dominated.

The literature on the management of uncertainty is probably one of the most abundant in decision analysis. Beside the expected utility model (the traditional approach), a lot of other approaches were studied by many authors, such as Dekel (1986), Jaffray (1989), Munier (1989), Gilboa and Schmeidler (1993), Quiggin (1993), ... They pointed out more or less desirable properties: linearity, different kinds of independence, mixture separability, replacement separability, stochastic dominance, ... Moreover, as mentioned by Machina (1989), it is important to make the distinction between what he calls static and dynamic choice situations.

A dynamic choice problem is characterised by the fact that at least one uncertainty node is followed by a decision node (this is typically the case of the application described in this chapter). In such a context, an interesting property is the so-called dynamic consistency: a decision-maker is said to be dynamically inconsistent if his actual choice when arriving at a decision node differs from his previously planned choice for that node. It can be shown that any departure from the traditional approach can lead to dynamic inconsistency. Machina (1989) showed that this argument relies on a hidden assumption concerning behaviour in dynamic choice situations (the so-called consequentialism) and argued that this assumption is inappropriate when the decision-maker is a "non-expected utility maximiser".

Let us illustrate this concept by a short example. Assume that a decision-maker prefers a game where he wins 50 € with probability 0.1 (and nothing with probability 0.9) to a game where he wins 10 € with probability 0.2 (and nothing with probability 0.8). At the same time, he prefers to receive 10 € with certainty to a game where he wins 50 € with probability 0.5 (and nothing with probability 0.5). Note that these preferences violate the independence axiom of von Neumann and Morgenstern. Now consider the tree of Figure 8.7. If he has to plan the choice between A and B before knowing the first choice of nature, he can easily calculate that if he chooses A, he wins 50 € with probability 0.1 (and nothing with probability 0.9), while if he chooses B, he wins 10 € with probability 0.2 (and nothing with probability 0.8), so that the best choice for him (before knowing the first choice of nature) is A. However, according to the previous information, when arriving at node N1 he prefers to receive 10 € with certainty to a game where he wins 50 € with probability 0.5, so that the actual choice of the decision-maker, at node N1, will be B. So, the actual choice at N1 differs from the planned choice for that node, illustrating the so-called dynamic inconsistency.

Figure 8.7: The dynamic consistency (chance node first: with probability 0.2 the decision-maker reaches decision node N1, where A yields 50 with probability 0.5 and 0 with probability 0.5, and B yields 10 for sure; with probability 0.8 he receives 0).

This example shows that no approach can be considered as ideal in the context of decision under uncertainty. As for the other situations studied in this book, each model, each procedure, can present some pitfalls that have to be known by the analyst.
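Dynamic inconsistency can be made concrete with a toy valuation. The weighting function w and the utilities below are hypothetical, chosen by us only so that the model reproduces the two stated preferences; they are not taken from the text.

```python
# Hypothetical non-expected-utility valuation: V(win x with prob. p) = w(p)*u(x),
# with a (non-linear) probability weighting w and illustrative utilities.
w = {0.1: 0.15, 0.2: 0.25, 0.5: 0.30, 1.0: 1.0}
u = {0: 0.0, 10: 0.5, 50: 1.0}

def V(p, x):
    return w[p] * u[x]

# Planned choice, before nature moves: A yields 50 with overall probability
# 0.2 * 0.5 = 0.1, while B yields 10 with probability 0.2.
planned = "A" if V(0.1, 50) > V(0.2, 10) else "B"

# Actual choice at node N1, after nature has moved: A now yields 50 with
# probability 0.5, while B yields 10 for sure.
actual = "A" if V(0.5, 50) > V(1.0, 10) else "B"

dynamically_inconsistent = planned != actual
```

With these (assumed) numbers the decision-maker plans A but chooses B once at N1: a non-expected-utility maximiser of this kind is dynamically inconsistent, exactly as described in the text.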

Knowing the underlying assumptions of the decision-aid model which will be used is probably the only way, for the analyst, to guarantee an as scientific as possible approach of the decision problem. It is a fact that, due to lack of time and other priorities, many decision tools are developed in real applications without taking enough precautions (this is also the case in the example presented in this chapter, due to the short delays and to the necessity of overcoming the combinatorial aspects of the problem). This is why we consider it important to provide analysts with some guidelines for modelling a decision problem: this will be the subject of a volume in preparation.


9 SUPPORTING DECISIONS: A REAL-WORLD CASE STUDY

Introduction

In this chapter¹ we report on a real world decision aiding process which took place in a large Italian firm, in late 1996 and early 1997, concerning the evaluation of offers following a call for tenders for a very important software acquisition. We will try to extensively present the decision process for which the decision support was requested, the actors involved and their concerns (stakes), the decision aiding process, including the problem structuring and formulation, the evaluation model created and the multiple criteria method adopted.

We introduce such a real case description for two reasons. The first reason consists in our will to give an account of what providing decision support in a real context means and to show the importance of elements such as the participating actors, the problem formulation, the construction of the criteria etc., often neglected in many conventional decision aiding methodologies and in operational research. From this point of view the reader may find questions already introduced in previous chapters of the book, but here they are discussed from a decision aiding process perspective. The reader should be aware of the fact that very few real world cases of decision support are reported in the literature, although many more occur in reality (for noteworthy exceptions see Belton, Ackermann and Shepherd 1997, Bana e Costa, Ensslin, Corrêa and Vansnick 1999, Vincke 1992, Roy and Bouyssou 1993). The second reason is our will to introduce the reader to some concepts and problems that will be extensively discussed in a forthcoming volume by the authors. Our objective is to stimulate the reader to reflect on how decision support tools and concepts are used in real life situations and how theoretical research may contribute to aide real decision-makers in real decision situations.

More precisely, the chapter is organised as follows. Section 1 introduces and defines some preliminary concepts that will be used in the rest of the chapter, such as decision process, decision aiding process, problem formulation, evaluation model etc. Section 2 presents the decision process for which the decision support was requested: the actors and the resources involved and the timing.

¹ A large part of this chapter uses material already published in Paschetta and Tsoukiàs (1999).
Section 3 describes the decision aiding process, mainly through the different "products" of such a process that are specifically analysed (the problem formulation, the evaluation model and the final recommendation), and discusses the experience conducted. Section 4 summarises the lessons learned in such an experience. The clients' comments on the experience are also included in this section. All technical details are included in Appendix A (an ELECTRE TRI type procedure is used), while the complete list of the evaluation attributes is provided in Appendix B.

9.1 Preliminaries

We will make extensive use of some terms (like actor, decision process etc.) in this chapter that, although present in the literature (see Simon 1957, Mintzberg, Raisinghani and Théorêt 1976, Jacquet-Lagrèze, Moscarola, Roy and Hirsch 1978, Checkland 1981, Heurgon 1982, Masser 1983, Moscarola 1984, Nutt 1984, Rosenhead 1989, Ostanello 1990, Humphreys, Svenson and Vári 1993, Ostanello and Tsoukiàs 1993, Ostanello 1997), can have different interpretations. In order to help the reader understand how such terms are used in this presentation, we introduce some informal definitions.

• Actors: the participants in a decision process.
• Decision Process: a sequence of interactions amongst persons and/or organisations characterising one or more objects or concerns (the "problems").
• Client: an actor in a decision process who asks for support in order to define his behaviour in the process. The term decision-maker is also used in the literature and in other chapters of this book, but in this context we prefer the term client.
• Analyst: an actor in a decision process who supports a client in a specific demand.
• Decision Aiding Process: part of the decision process, and more precisely the interactions occurring at least between the client and the analyst.
• Problem Situation: a descriptive model of what happens in the decision process when the decision support is requested and what the client is expecting to obtain from the decision support (this is one of the products of the decision aiding process).
• Problem Formulation: a formal representation of the problem for which the client asked the analyst to support him (this is one of the products of the decision aiding process).
• Evaluation Model: a model creating a specific instance of the problem formulation for which a specific decision support method can be used (this is one of the products of the decision aiding process).

some of the company’s external consultants concerned with software engineering. • the GISD felt able to describe and evaluate diﬀerent GIS products based on a set of attributes (at the end several hundreds). diﬀerent suppliers of GIS software. the RDA. but also very committing. the company’s Information Systems Department (ISD) asked the aﬃliated research and development agency (RDA) and more speciﬁcally the department concerned with this type of information technology (GISD) to perform a pilot study of the market in order to orient the company towards an acquisition. • the company required a very particular version of GIS that did not exist as a ready made product on the market. but was not able to provide a synthetic evaluation. . with the addition of ad-hoc written software for the purpose of the company. The actors involved at this level are the company’s IS manager.2. acquisition (AQ) manager. the purpose of which was just as obscure (the use of a weighted sum was immediately set aside because it was perceived as “meaningless”). as part of a strategic development policy. The GISD of the RDA noticed that: • the market oﬀered a very large variety of software which could be used as a GIS for the company’s purposes. The MCDA/SE unit responsible then decided to activate its links with an academic institution in order to get more insight and advice on the problem that soon appeared to overcome the knowledge level of the unit at that time. • The decision process for which the decision aid was provided concerned the “acquisition of a GIS for X (the company)”. • A ﬁrst decision aiding process was established where the client was the IS manager and the analyst was the GIS department of the RDA. At this point we can make the following remarks. However.2 The Decision Process In early 1996 a very large Italian company operating a network based service decided. • the question asked by the ISD was very general. since (at that time) this was quite a new technology.9. 
THE DECISION PROCESS 207 9. but had to be created by customising and combining diﬀerent modules of existing software. because it included an evaluation prior to an acquisition and not just a simple description of the diﬀerent products. At this point of the process the GISD found out that a unit concerned with the use of the MCDA (Multiple Criteria Decision Analysis) methodology in software evaluation (MCDA/SE) was operating within the RDA and presented this problem as a case study opening a speciﬁc commitment. to equip itself with a Geographical Information System (GIS) on which all information concerning the structure of the network and the services provided all over the country was to be transferred.

• A second decision aiding process was established where the client was the GIS department of the RDA and the analyst was the MCDA/SE unit. A third actor involved in this process was the "supervisor" of the analyst, in the sense of someone supporting the analyst in different tasks, providing him with expert methodological knowledge and framing his activity.

The first advice by the analyst to the GISD was to negotiate a more specific commitment, such that their task could be more precise and better defined with their client. For this purpose the GISD drafted a decision aiding process outline where the principal activities to be performed were specified, as well as the timing, and submitted this draft to its client (see Figure 9.1). After such a negotiation the GISD's activity has been defined as "technical assistance to the IS manager in a bid, concerning the acquisition of a GIS for the company", and its specific task was to provide a "technical evaluation" of the offers that were expected to be submitted.

At this point it is important to note the following.
1. The call for tenders concerned the acquisition of hundreds of software licenses, plus the hardware platforms on which such software was expected to run, the whole budget being several million €. From a financial point of view it represented a large stake for the company and a high level of responsibility for the decision-makers.
2. As already noted before, the bid concerned software that was not ready made, but a collection of existing modules of GIS software which was expected to be used in order to create ad-hoc software for the specific necessities of the company. Two difficulties arose from this:
• the a priori evaluation of the software behaviour and its performance, without being able to test it on specific company-related cases;
• the timing of the evaluation (including testing the offers), which could be extremely long compared with the rapidity of the technological evolution of this type of software.
3. From a procedural point of view the administration of a bid of this type is delegated to a committee, which in this case included the IS manager, the AQ manager, a delegate of the CEO and a lawyer from the legal staff. From such a perspective the task of the GISD (and of the decision aiding process) was to provide the IS manager with a "global" technical evaluation of the offers that could be used in the negotiations with the AQ manager (inside the committee) and the suppliers (outside the committee).

We will focus our attention on this second decision aiding process, where four actors are involved: the IS manager, the GISD (or team of analysts) as the client (bear in mind their particular position of clients and analysts at the same time), the MCDA/SE unit as the analyst and the supervisor.

[Figure 9.1: the bid process — flowchart from the start of the bid (preparation of the call for tenders, first set of answers from the suppliers, first selection) through the problem formulation (definition of requirements, points of view and decision problem, invitation letter, completion of the decision model for the second selection) to the prototype phase (prototype requirements, development and analysis) and the final ranking and choice.]

Once the call for tenders had been prepared (including the software requirements sections, the tenderers requirements section, the timing and evaluation procedure), it was presented to the company and the technical evaluation activity was settled. A second step in the decision aiding process was the generation of a problem formulation and of an evaluation model. Although we formally consider the two as two distinct products of the process, in reality, and in this case specifically, they have been generated contemporaneously. We will discuss the problem formulation and the evaluation model in detail in the next section, but we can anticipate that the final formulation consisted in an absolute evaluation of the offers under a set of points of view that could be divided into two parts: the "quality evaluation" and the "performance evaluation". Although the set of alternatives was relatively small (only six alternatives were considered), the set of attributes was extremely complex (as often happens in software evaluation). Actually there were seven basic evaluation dimensions, expanded in a hierarchy with 134 leaves resulting in 183 evaluation nodes (see Appendix B).

It is interesting to notice that the GISD staff charged with this evaluation has been "supported" by external consultants, software engineering experts in the company's sector who practically acted as the IS manager's delegates in the group. It is this extended group that signed the final recommendation presented to the IS manager and that we will hereafter call "team of analysts" (for the IS manager) or client (for the MCDA/SE unit and for us).

A third and final step in the decision aiding process was the elaboration of the final recommendation, after all the necessary information for the evaluation had been obtained and the evaluation performed. We will discuss such constructions in detail in the next sections, but we can anticipate that such an elaboration highlighted some questions (substantial and methodological) that had not been considered before.

Some months after the end of the process and the delivery of the final report, we asked our client (the team of analysts) to discuss their experience with us and to answer some questions concerning the methodology used: what they learned, what their appreciation was and how they perceived it. The discussion was conducted in a very informal way, but the client provided us with some written remarks that were also reported during a conference presentation (see Fiammengo, Buosi, Iob, Maffioli, Panarotto and Turino 1997). Such remarks are introduced in the following section.

9.3 Decision Support

We present here the three products of the decision aiding process: the problem formulation, the evaluation model and the final recommendation. We should remember that the problem formulation and a first outline of the evaluation model were established while the call for tenders was under elaboration, for two reasons:

• for legal reasons, an outline of the evaluation model has to be included in

the call for tenders;

• the evaluation model implicitly contains the software requirements of the offers, which in turn defines the information to be provided by the tenderers. For instance, the call for tenders specified that a prototype was requested in order to test some performances; the choice to introduce these tests was made during the definition of the evaluation model. The tenderers therefore knew that they had to produce a prototype within a certain time frame.

We recall the client's remarks: "...MCDA (Multi Criteria Decision Analysis) was very useful in organising the overall process and structure of the bid: what were the important steps to do, how to define the call for tenders...", "...as a formal approach MCDA generated greater control and transparency...", "...as a formal process MCDA guaranteed greater control and transparency to the process...", "...MCDA was used as a background for the whole decision process...". With such a perspective it turned out to be very useful because every activity had a justification. And further: "A complex process, such as a bid, could be greatly eased by the use of any process centred methodology, which could be able to take what was happening in the decision process into account." It is this last sentence which clearly highlights the necessity for the client to have a support along the whole process and for all its aspects. We actually agree with their comment that "any process modelled methodology could be useful" and we consider that their positive perception of MCDA is based on the fact that it was the first decision support approach they came to know.

9.3.1 Problem Formulation

From the presentation of the process we can make the following observations.

1. At the beginning of the process, the problem situation was absolutely unclear. It was extremely important for the client (the team of analysts) to understand his role in the process, what his client expected and what they were able to provide.

2. Moreover, the client came to understand that the expectations of the other actors involved in the process were extremely relevant, both for strategic reasons (having to do with organisational problems of the company) and for operational reasons (recommend something reliable in a clear and sound way for all the actors involved in the bid).

Complex decision processes are based on human interactions and these are based on the intrinsic ambiguity of human communication (thanks to ambiguity, human communication is also very efficient). However, when significant stakes are considered (as in our case), such an ambiguity might result in an impossibility to understand and, ultimately, to propose viable solutions. Moreover, decision-makers may consider it dangerous to make a decision without having a clear idea of the consequences of their acts.

The use of a formal approach enables the reduction of ambiguity (without completely eliminating it) and thus appears to be an important support to the decision process.

We define (Morisio and Tsoukiàs 1997) a problem formulation as the collection of: a set of actions, a set of points of view and a problem statement.

The set of alternatives was considered to be the set of offers submitted after the call for tenders. A first idea, to evaluate the tenderers as well as the offers, was eliminated due to the particular technology, where no consolidated producers exist. The set of points of view was defined using the team of analysts' technical knowledge and can be viewed in two basic sets: one concerning "quality", including specific technical features required for the software plus some ISO/IEC 9126 (1991) based dimensions, and the second concerning the performance of the offered software, to be tested on prototypes. Such points of view formed a huge hierarchy (see further on for details). No cost estimates were required by the client and so they were not considered in this set.

The only point that caused a discussion in the analysts team concerning the problem formulation was the problem statement. After some discussion the problem statement adopted was the one of an "absolute" evaluation of the offers, both on a disaggregated level and on a global one. There were two reasons for this choice.

1. Actually, the team of analysts interpreted the client's demand as a question of whether the offers could be considered as intrinsically "good", "bad" etc., and not to compare bids amongst themselves. A simple ranking of the offers could conceal the fact that all of them could be of very poor quality or satisfy the software requirements to a very low level. In other words it could happen that the best bid could be "bad" and this was incompatible with the importance and cost of the acquisition.

2. The team of analysts felt uncomfortable with the idea of comparing the merits (or de-merits) of an offer with the merits (or de-merits) of another offer. A first informal discussion of the problem of compensation convinced them to overcome this question by comparing the offers to profiles about which they had sufficient knowledge.

It is clear that defining a precise problem formulation became a key issue for the client, because it clarified his role in the decision process (the bid management), his relation with the IS manager (his client) and gave him a precise activity to perform. Using the terminology introduced by Roy (1996), the problem statement appeared to be a hierarchically organised sorting of the offers, the sorting being repeated at all levels of the hierarchy. If we interpret the concept of measurement in a wide sense (comparing the offers to pre-established profiles can be viewed as a measurement procedure), the result that the team of analysts was looking for appeared to be the conclusion of repeated aggregations of measures.

As far as the problem formulation is concerned, an ex-post remark made by the team of analysts concerned the length of the evaluation process. They considered that such a process was so long that the information available at the beginning and

the formulation itself could no longer be valid at the end of the process. This was partly due to the very rapid evolution of GIS technology, which could completely innovate the state of the art in six months. Actually, the length of the evaluation was considered as a negative critical issue in the client's remarks.

Another observation made by part of the team of analysts was that towards the end of the process, due to the knowledge and experience acquired in this period (mainly due to the process itself), they could revise some of their judgements. The final report did not consider any revision of the formulation and the evaluations since, in the context of a call for tenders, it could be considered unfair to modify the evaluations just before the final recommendation.

We consider that this is a critical issue for decision support and decision aiding processes. Information is valid only for a limited period of time and consequently the same is true for all evaluations based on such information. Moreover the client himself may revise the problem formulation or update his perception of the information and modify his judgements. This is rarely considered in decision aiding methodologies. While for relatively short decision aiding processes the problem may be irrelevant, it is certain that in long processes such a problem cannot be neglected and requires specific consideration.

9.3.2 The Evaluation Model

The different components of the evaluation model were specified in an iterative fashion. The key idea was that each node of the hierarchy was an evaluation model itself, for which the evaluation dimensions to aggregate and the aggregation procedure had to be defined. In the following we present their definition as they occurred in the decision aiding process.

The set of alternatives was identified as the set of offers legally accepted by the company in reply to the call for tenders. No preliminary screening of the offers was expected to be made. Although each offer was composed of different modules and software components, they have been considered as wholes. We may notice that, despite the fact that we had a large amount of information to handle in our model, the case did not present any exogenous uncertainty, since the client considered the basic data and its judgements reliable and felt confident with them.

The set of evaluation dimensions was a complex hierarchy with seven root nodes, 134 leaves and 183 nodes in total (the complete list is available in Appendix B). Each node was subject to extensive discussion before arriving at a final version. Basically two issues have been considered in such discussions:
• the choice of the attributes to use;
• the semantics of each attribute.

Regarding the first issue, a frequent attitude of technical committees charged with evaluating complex objects (as in our case) is to define an "excellence list" where every possible aspect of the object is considered. This is a typical situation in software evaluation (see Morisio and Tsoukiàs 1997, Blin and Tsoukiàs 1998, Stamelos and Tsoukiàs 1998). Such a list is generally provided by the literature, international standards etc. The result is that such a list is an abstract collection of attributes, independent from the specific problem at hand, thus containing redundancies and conceptual dependencies which can invalidate the evaluation. Our client was aware of the problem, but had no knowledge and no tools to enable him to simplify and reduce the first version of the list they had defined. The repeated use of a coherence test (in the sense of Roy and Bouyssou 1993) for each intermediate node of the hierarchy made it possible to eliminate a significant number of redundant and dependent attributes (more than 30%) and to better understand the semantics of each attribute used. Verifying the separability of each sub-dimension with respect to the parent node was very helpful, in the sense that each sub-node should be able to discriminate alone the offers with respect to the evaluation considered at the parent level.

For instance, at a certain point in the hierarchy definition process, there was a discussion about some attributes that could also be considered as leaves at the top level of the hierarchy. These were the so called "process attributes": they were intended to evaluate special functionality inside different processes (in this context "process" means a chunk of functionality aiming towards supporting a stream of activities of a software). In fact, one can consider a process attribute (at the final level) and then subdivide it in quality aspects, or alternatively consider single independent quality aspects whose evaluation depends on how the process attribute is considered. The final choice was to put process attributes at the top level, because directly emanating from the evaluation scope.

Despite this work, the client wrote in his ex-post considerations: "...it was not necessary to be so detailed in the evaluation...it could be preferable to use a limited number of criteria...the whole process could be faster because we needed the software for a due date...". On the other hand it is also true that it is only after the process that the client was able to determine which were the really significant criteria that discriminated among the alternatives.

With respect to the second issue, we pushed the client to provide us with a short description of each attribute and of any preference model associated to it. Such an approach helped the client both to eliminate redundancies (before using the coherence test, which is time consuming) and in better understanding the contents of the evaluation model. Such an activity also helped the client to realise that they needed an absolute evaluation of the alternatives for almost all the intermediate nodes of the hierarchy, thus implicitly defining the problem statement of the model.

The basic information available was of the "subjective ordinal measurement" type. With this term we want to indicate that each alternative could be described by a vector of the 134 elementary pieces of information, which were in the large majority either subjective evaluations by experts (mostly part of the team of analysts, i.e. the client) of the "good", "acceptable" etc. type, or descriptions of the "operating system X", "compatible with graphic engine Y" etc. type. The latter were expressed on nominal scales, while the former were expressed on ordinal scales. It was almost impossible that the experts could give more information than such an order, and it was exactly this type of information that pushed the client to look for another evaluation model than the usual weighted sum widely diffused in software evaluation manuals and standards (see ISO/IEC 9126 1991, IEEE 92 1992).
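The distinction between the two scale types can be illustrated with a small sketch: a nominal description becomes usable ordinal information only once a preference model orders its labels. The labels and the order below are hypothetical illustrations, not the team's actual judgements:

```python
# Sketch: a nominal scale becomes ordinal once a preference model orders
# its labels. Labels and the chosen order are hypothetical examples.
preference_model = {
    "compatible with graphic engine Y": 0,  # assumed best rank
    "other acceptable graphic engine": 1,
    "no compatible graphic engine": 2,      # assumed worst rank
}

def rank(label):
    """Ordinal value of a nominal label under the preference model."""
    return preference_model[label]

def at_least_as_good(a, b):
    """Induced ordinal comparison between two nominal observations."""
    return rank(a) <= rank(b)
```

Nothing in the nominal labels themselves supports the comparison; it is entirely carried by the preference model that the client attaches to them.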

An important discussion with the client concerned the distinction between measures and preferences. As already reported, the basic information consisted either in observations concerning the offers (expressed on nominal scales) or in expert judgements (expressed on ordinal scales of value, of the "good", "acceptable" etc. type). The preference among the alternatives was expected to be induced once the alternatives could be "measured" by the attributes. Under such a perspective it was important for the client to understand what they were expressing their preferences on. In fact, the client did not compare the alternatives amongst themselves, but to a-priori defined (by the client) standards of "good", "acceptable" etc. From a certain point of view we can claim that, except for the final aggregation level, the client needed to aggregate ordinal measures and not preferences (in the sense that they had to aggregate the ordinal measures obtained when comparing the alternatives to the standards, and not to compare the alternatives amongst themselves). Such an observation greatly helped the client to understand the nature and scope of the evaluation model and, ultimately, to define the problem statement of the model. Moreover, the discussion on the different typologies of measurement scales helped the client to understand the problem of choosing an appropriate aggregation procedure. The presence of ordinal information for almost all leaves and the problem statement that required a "repeated sorting" of the offers oriented the team of analysts to choose an aggregation procedure based on the ELECTRE-TRI method (see Yu 1992). See also Appendix A for a presentation of the procedure.

Gathering and obtaining the relevant information for an evaluation model is often considered as a second level activity and therefore neglected from further specific considerations. We consider that this is also a critical issue in a decision aiding process. Actually, the information used in an evaluation model results from the manipulation of the rough information available at the beginning of the process; we can consider that the information is constructed during the decision aiding process and cannot be viewed as a simple input. In our case, obtaining the information was not a difficult task, but a time consuming process that required the establishment of an ad-hoc procedure during the process (see figure 9.1). Neglecting such a problem can invalidate the problem formulation adopted.

At this point the team was ready to define their specific evaluation models for all nodes. Before continuing the definition of the model associated to each node, the problem of the aggregation procedure was faced, since it could influence the construction of such models. All the intermediate nodes were expected to provide information of the second (ordinal) type. For all leaf nodes an ordinal scale was established; clearly, all nominal scales had to be transformed into ordinal ones, associating a preference model on the elements of the nominal scale of the attribute. The available technical knowledge consisted in different possible "states" in which an offer could find itself. In particular we had the following cases.

Two possibilities for defining the relationship between the values on the sub-nodes and the values on the parent nodes were established.

1. When possible, an exhaustive combination of the values of the sub-nodes was provided. For instance, consider the leaf nodes 1.1.1 (type of presentation on the user interface in the land-base management), 1.1.2 (graphic engine of the user interface in the land-base management) and 1.1.3 (customisation of the user interface in the land-base management). The possible states on these characteristics were:
• 1.1.1: standard graphics (SG), non standard graphics (NSG);
• 1.1.2: station M (M, graphic engine already adopted in other software used in the company), other acceptable graphic engine (OA), other non acceptable graphic engine (ON);
• 1.1.3: availability of a graphic tool (T), availability of an advanced graphic language (E), availability of a standard programming language (S), no customisation available (N).
In the last case different combinations were possible (for instance a software could provide both an advanced graphic language and a standard programming language: value E.S). The three ordinal scales associated to the three nodes were (≻ representing the scale order):
• 1.1.1: SG ≻ NSG;
• 1.1.2: M ≻ OA ≻ ON;
• 1.1.3: T.E.S ≻ T.E ≻ T.S ≻ T ≻ E.S ≻ E ≻ S ≻ N.
Consider now node 1.1 (user interface of the land-base management), which has the three evaluation models introduced in the previous example as sub-nodes. All parent nodes were equipped with the same number of classes: unacceptable (U), acceptable (A), good (G), very good (VG), excellent (E). In this case we have the following evaluation model:
• E: the combination T.E.S.M.SG;
• VG and G: explicit lists of combinations of the sub-node values (e.g. T.E.M.SG or T.S.M.SG for VG; T.M.SG or T.OA.SG for G);
• U: all cases where 1.1.1 is NSG or 1.1.2 is ON or 1.1.3 is N;
• A: all remaining cases except the unacceptable ones.

2. When an exhaustive combination of the values was impossible, an ELECTRE-TRI procedure was used. For all such parent nodes, a brief descriptive text of what the node was expected to evaluate was provided and the following information was requested:
• the relative importance of the different sub-nodes;
• the concordance threshold for the establishment of the outranking relation among the offers and the profiles;
• a veto condition on the sub-nodes, such that the value on the parent node could be limited (possibly unacceptable).
The relative importance of the sub-nodes and the concordance threshold have been established using a reasoning on coalitions (for details see Chapter 6).
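The first possibility, the exhaustive combination, amounts to a lookup table over the sub-node values plus the rule that any disqualifying state forces the parent node to "unacceptable". A minimal sketch, using the unacceptable rule quoted above; the single explicit table entry is illustrative, not the team's complete mapping:

```python
# Sketch of the "exhaustive combination" model for a parent node such as
# 1.1. The unacceptable rule follows the text (NSG, ON or N force class U);
# the explicit table entry shown is an illustrative excerpt.
CLASSES = ["U", "A", "G", "VG", "E"]  # unacceptable ... excellent

def parent_class(p111, p112, p113, table):
    # Any disqualifying state makes the parent node unacceptable.
    if p111 == "NSG" or p112 == "ON" or p113 == "N":
        return "U"
    # Explicitly listed combinations receive their assigned class.
    if (p111, p112, p113) in table:
        return table[(p111, p112, p113)]
    # All remaining (non-unacceptable) cases default to "acceptable".
    return "A"

# Hypothetical excerpt of the combination table: only the best case shown.
table = {("SG", "M", "T.E.S"): "E"}
```

The defaulting to "acceptable" mirrors the source's rule that class A collects all remaining cases except the unacceptable ones.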

The team of analysts also established very high concordance thresholds (never less than 80%, very often around 90%) that result in very severe evaluations. Such a choice reflected the conviction, of at least a part of the team of analysts, that very strong reasons were required to qualify an offer as very good. In other words, the team of analysts established the characteristics of the sub-nodes for which an offer could be considered very good (therefore should outrank the very good profile) and consequently compared the values of the parameters of relative importance and of the concordance threshold. Since the whole model was calibrated starting from the very good value, this conviction had wider effects than the team of analysts could imagine. The analyst and the supervisor explained this aspect to the client who, on this basis, revised the importance parameters several times.

For example we can take node 1 (land-base management), which has eight sub-nodes: 1.1: User interface; 1.2: Functionality; 1.3: Development environment; 1.4: Administration tools; 1.5: Work flow connection; 1.6: Interoperability; 1.7: Integration between land-base products and the Spatial Data manager; 1.8: Integration among land-base products. The relative importance parameters were established as follows: w(1.1) = 4, w(1.2) = 8, w(1.3) = 5, w(1.4) = 4, w(1.5) = 1, w(1.6) = 8, w(1.7) = 8, w(1.8) = 2, and the concordance threshold was fixed as 29/36 (around 0.8). Such choices imply that no coalition that excluded nodes 1.2 or 1.7 was acceptable and that the smallest acceptable coalition should necessarily include the nodes 1.2, 1.3 and 1.7 and any two of the nodes 1.1, 1.4 and 1.6.

The veto condition was established as the presence of the value "unacceptable" at a sub-node. The presence of a veto also produced an "unacceptable" value at the level of the parent node. In other words, the team of analysts considered any "unacceptable" value to be a severe technical limitation of the offer. The reader may notice that this is a very strong interpretation of a veto condition among the ones used in the outranking based sorting procedures, but it was the one with which the team of analysts felt comfortable at the time of construction of the evaluation model.

As already mentioned, the set of dimensions was built around two basic points of view: the "quality" and the "performances". The first generated six evaluation dimensions, corresponding to six (among seven) of the root nodes of the model, which will be called the "quality attributes" or "quality criteria" or "quality part of the hierarchy" hereafter. The seventh root node (node 7) concerned the evaluation of the performances of the prototypes submitted to tests by the team of analysts. Such performances are basically measured in the time necessary to execute a set of specific tasks under certain conditions and with some external fixed parameters.
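The reasoning on coalitions can be made concrete with a sketch of the concordance/veto test against the "very good" profile, using the weights quoted above; the offer's per-sub-node comparison results are invented for illustration:

```python
# Sketch of the concordance/veto test at node 1, with the weights quoted
# in the text and a 0.8 threshold. The offer's comparison data are invented.
WEIGHTS = {"1.1": 4, "1.2": 8, "1.3": 5, "1.4": 4,
           "1.5": 1, "1.6": 8, "1.7": 8, "1.8": 2}

def outranks_profile(meets_profile, unacceptable, threshold=0.8):
    """True if the concordant coalition is heavy enough and no veto fires.

    meets_profile: dict sub-node -> bool (offer reaches the profile there)
    unacceptable:  set of sub-nodes valued "unacceptable" (veto condition)
    """
    if unacceptable:          # veto: any "unacceptable" sub-node blocks
        return False
    total = sum(WEIGHTS.values())
    concordant = sum(w for node, w in WEIGHTS.items() if meets_profile[node])
    return concordant / total >= threshold

# Invented offer: reaches the "very good" profile everywhere but 1.5, 1.8.
meets = {node: node not in ("1.5", "1.8") for node in WEIGHTS}
```

Dropping only the light sub-nodes 1.5 and 1.8 keeps the concordant coalition above the threshold, while losing the heavy sub-nodes 1.2 and 1.7 sinks it — the coalition logic described in the text.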

The set of criteria to be used was defined as the seven root nodes, equipped with a simple preference model: the weak order induced by the ordinal scale associated to each of these nodes.

It took four to five months for all the nodes to be equipped with their evaluation model, and the process generated several discussions inside the team of analysts, mainly of a technical nature (concerning the specific contents of the values for each node). The length of the process is justified not only by the quantity of nodes to define, but also because the team of analysts was obliged to define a new measurement scale and a precise measurement aggregation procedure for each node. The most discussed concepts of the model were the concordance threshold and the veto condition, since part of the team considered that the required levels were extremely severe. However, since such an approach corresponded to a cautious attitude, it prevailed in the team and finally was accepted.

This reasoning, however, is less true for node 7 and its sub-nodes. In this case the technology is quite new and there are no standards of what a "very good" performance could be. For instance, consider node 7.3 (performance under load). The dimension is expected to evaluate the performance of the prototype while the quantity of data that have to be elaborated increases. The value v(x) (x being an offer) combines an observed measure W_x(t) and an interpolated one T_x(t) (t representing the data load; the interpolation is not necessarily linear). The combination is obtained, in this case, through the following formula:

v(x) = ∫ W_x(t) T_x(t) dt

In this case there are no external profiles with which to compare the performances, because the prototypes are created ad-hoc. An ordinal scale was created considering the best performances as "first", all performances presenting a difference of more than 5% and less than 20% "second", all performances presenting a difference of more than 20% and less than 25% "third", all performances presenting a difference of more than 25% and less than 50% "fourth" and all performances presenting a difference of more than 50% "fifth". A sorting procedure could then be established to obtain the final evaluation. The same model was applied to all sub-nodes of node 7. Although this process can often be qualified as "subjective measurement", it was the only way to obtain meaningful values for the offers; such a construction would not have been necessary if a preference aggregation comparing the alternatives amongst themselves had been requested.

No exogenous uncertainty was considered in the evaluation model. The information provided by the tenderers concerning their offers was considered to be reliable, and the use of ordinal scales made it possible to avoid the problems of imprecision or of measurement errors; the team of analysts felt sufficiently confident with the tests and did not analyse the problem further. Some endogenous uncertainty appeared as soon as the model was put into practice (the offers being available). We shall
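As a hedged numerical sketch of this performance model: the integral can be approximated with a simple Riemann sum, and a performance's percentage gap from the best one can then be binned into the five ordinal classes. The sample functions are invented; only the formula and the percentage bands come from the text:

```python
# Sketch: v(x) approximated by a Riemann sum over the data load t, then
# conversion of a percentage gap from the best performance into the five
# ordinal classes described in the text. Sample functions are invented.
def performance_value(w, t_interp, t_max=1.0, steps=1000):
    """Approximate the integral of w(t) * t_interp(t) on [0, t_max]."""
    dt = t_max / steps
    return sum(w(i * dt) * t_interp(i * dt) * dt for i in range(steps))

def ordinal_class(diff_pct):
    """Class from the percentage difference with the best performance."""
    if diff_pct <= 5:       # band boundaries follow the text's 5/20/25/50%
        return "first"
    if diff_pct <= 20:
        return "second"
    if diff_pct <= 25:
        return "third"
    if diff_pct <= 50:
        return "fourth"
    return "fifth"

# Invented example: constant observed measure, constant interpolated load.
v = performance_value(lambda t: 2.0, lambda t: 1.0)
```

Note that the handling of exact band boundaries (a gap of exactly 5%, 20%, etc.) is an assumption here; the source only gives the open intervals.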


discuss this problem in more detail in the next section (concerning the elaboration of the ﬁnal recommendation), but we can anticipate that the problem was created by the “double” evaluation provided by the chosen ELECTRE-TRI type aggregation consisting in an “optimistic” and a “pessimistic” evaluation which may not necessarily coincide. The evaluation model was coded in a formal document that was submitted (and explained) to the ﬁnal client receiving his consensus. It is worthwhile to note that the ﬁnal client was not able to participate in the elaboration of the model (technical details, establishment of the parameters etc.). Part of the team of analysts (some of the external consultants) were acting as his delegates. The establishment of the evaluation model and its acceptance by the client opened the way for its application on the set of oﬀers received and for the elaboration of the ﬁnal recommendation. The client greatly appreciated his involvement in the establishment of the evaluation model that turned out to be a product considered to be their own (from their ex-post remarks: “....this (the involvement) turned out to be important....for the acceptability of the evaluation results”). The fact that each node of the hierarchy was discussed, analysed and ﬁnally deﬁned by the team of analysts allowed them to understand the consequences for the global level, to be able to explain the contents of the model to their client and justify the ﬁnal result on the grounds of their own knowledge and experience, not of the procedure adopted. In other words we can claim that the model was validated during its construction. Such an approach helped both the acceptability of the model and the ﬁnal result, eased the discussion when the question of the ﬁnal aggregation was settled and deﬁnitely legitimated the model in the eyes of the client.
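The "double" evaluation produced by an ELECTRE-TRI type procedure can be sketched as follows: the pessimistic rule scans the class profiles from the top downwards, the optimistic rule from the bottom upwards, and an incomparability between the alternative and a profile makes the two assignments differ. The outranking relation below is a toy stand-in, not the case study's model:

```python
# Sketch of pessimistic vs optimistic assignment in an ELECTRE-TRI-like
# sorting. Profile k is the lower bound of class k+1; classes run from
# 0 (worst) to n_profiles (best). The outranking relation is a toy example.
def pessimistic(outranks, n_profiles):
    """Highest class whose lower profile the alternative outranks."""
    for k in range(n_profiles - 1, -1, -1):   # from the top profile down
        if outranks("a", k):
            return k + 1
    return 0

def optimistic(outranks, n_profiles):
    """Class below the first profile strictly preferred to the alternative."""
    for k in range(n_profiles):               # from the bottom profile up
        if outranks(k, "a") and not outranks("a", k):
            return k
    return n_profiles

# Toy relation: the alternative "a" is incomparable to profile 1 (neither
# outranks the other), so the two rules give different classes.
REL = {("a", 0): True, ("a", 1): False, ("a", 2): False,
       (0, "a"): False, (1, "a"): False, (2, "a"): True}

def outranks(x, y):
    return REL[(x, y)]

pess = pessimistic(outranks, 3)   # class 1
opt = optimistic(outranks, 3)     # class 2
```

The interval [pessimistic, optimistic] is exactly the "double" evaluation discussed above: the two values coincide only when no incomparability occurs between the alternative and the profiles.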

9.3.3 The final recommendation

The evaluation of the six offers which had effectively been submitted after the call for tenders was carried out in two main steps: the first consisted in evaluating the six "quality attributes", the second in testing the prototypes provided by the tenderers. The method adopted to aggregate the information and construct the final evaluations was a variant of the ELECTRE TRI procedure (see Yu 1992). The reader can also see Appendix A and refer to Chapter 6 for more details. We have the following remarks on the use of such a method. 1. The key parameters used in the method are the profiles (to which the alternatives are compared in order to be classified in a specific class), the importance of each criterion for each parent criterion classification, and the concepts of concordance thresholds and veto conditions. For each intermediate node such parameters were extensively discussed before reaching a precise numerical representation. As already mentioned in section 9.3.2, the relative importance of each criterion and the concordance threshold were established using a reasoning based on the identification of the "winning coalitions" enabling the outranking relation to hold. The veto


condition was initially perceived as a theoretical possibility of no practical use, then as an eliminatory threshold, but the client soon realised its importance, mainly when it was necessary to obtain an incomparability instead of an indifference, which was a counterintuitive situation when very different objects were compared. Further on, as soon as the veto conditions were understood by the client, they decided to introduce a similar concept each time they wanted to distinguish between positive reasons (for the establishment of the outranking relation) and negative reasons (against the establishment of the outranking relation), since these are not necessarily complementary and must be evaluated in a separate and independent way. The profiles were established using the knowledge of the team of analysts (experts in their domain), who were able to identify the minimal requirements qualifying an object for a certain class. It is interesting to notice that for the client, the intuitive idea of a profile was that of a typical object of a class and not of the lower bound of the class. The shift from the intuitive idea to the one used in the case study was immediate and presented no problems. The fact remains that the distinction between the two concepts of profile is crucial, and that the lower bound approach appears to be less intuitive than the typical element one.

2. The whole method (and the model) was implemented on a spreadsheet. This was of great importance because spreadsheets are a basic tool for communication and work in all companies and enable an immediate understanding of the results. Moreover, they enable on-line what-if operations when speciﬁc problems, concerning precise information and/or evaluation, appeared during the discussions inside the team of analysts. The experimental validation of the model was greatly eased by the use of the spreadsheet. Further on it helped the acceptability and legitimation of the model through the idea that “if it can be implemented on a spreadsheet it is suﬃciently simple and easy to be used by our company”. In fact some of the critiques by the client about the approach adopted in this case were that “....MCDA is not yet a universally known method....”, “....seems less intuitive than other well known techniques such as the weighted sum...”, “....it is time consuming to apply a new methodology....”, all these problems limiting the acceptability of the methodology towards the client’s client (the IS manager) and the company more generally. Being able to implement the method and the model on a spreadsheet was, for them, a proof that, although new, complex and apparently less intuitive, the method was simple and easy and therefore legitimately used in the decision process. A speciﬁc problem which was raised in the ﬁrst step was the generation of uncertainty due to the aggregation procedure. The ELECTRE-TRI type procedure adopted produces an interval evaluation consisting in a lower value (the pessimistic evaluation) and an upper value (the optimistic evaluation). When an alternative has a proﬁle on the sub–nodes that is very diﬀerent from the proﬁles of the classes on the parent node then, due to the incomparabilities that occur when comparing

9.3. DECISION SUPPORT

      C1     C2     C3     C4     C5     C6
O1    A-A    A-A    A-A    A-G    U-U    A-A
O2    G-G    G-VG   G-G    G-VG   G-VG   VG-VG
O3    A-VG   A-VG   A-VG   A-VG   G-G    E-E
O4    A-G    A-VG   G-G    G-VG   A-G    VG-VG
O5    G-VG   G-G    A-A    A-VG   G-VG   G-G
O6    A-A    A-G    A-A    A-G    U-U    VG-VG

Table 9.1: the values of the alternatives on the six quality criteria (U: unacceptable, A: acceptable, G: good, VG: very good, E: excellent)

the alternative to the profiles, it may happen that the two values do not coincide (see more details in Appendix A). When the user of the model is not able to choose one of the two evaluations, a hierarchical aggregation can be a problem, since at the next aggregation the sub-nodes may have evaluations expressed as an interval. This is a typical case of endogenous uncertainty, created by the method itself and not by the available information. The client was keen to consider the pessimistic and optimistic evaluations as bounds of the "real" value, but there was no uncertainty distribution on the interval. For this purpose, the following procedure was adopted. Two distinct aggregations were made, one where the lower values were used and the other where the upper values were used. Each of these, in turn, may produce a lower value and an upper value. At the next aggregation step, the lowest of the two lower values and the highest of the two upper values is used. This is a cautious attitude and has the drawback of widening the intervals as the aggregation goes up the hierarchy. However, this effect did not occur here and the final result for the six dimensions is represented in table 9.1 (from here on we will represent the criteria by Ci and the alternatives by Oi).
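The cautious bookkeeping just described can be sketched in a few lines. This is an illustrative sketch only, not the model actually used: the integer grade encoding and the toy `span` aggregation are our own stand-ins for an ELECTRE-TRI-type sorting that returns an interval.

```python
# Cautious propagation of interval evaluations up a hierarchy.
# Grades are encoded as integers on an ordinal scale (e.g. U=0 ... E=4).

def propagate(children, aggregate):
    """children: (low, high) interval evaluations of the sub-nodes.
    aggregate: any procedure mapping a list of grades to a (low, high)
    pair (an ELECTRE-TRI-type sorting may itself return an interval)."""
    run_low = aggregate([lo for lo, _ in children])   # aggregation of the lower values
    run_high = aggregate([hi for _, hi in children])  # aggregation of the upper values
    # keep the lowest of the two lower values, the highest of the two upper values
    return (min(run_low[0], run_high[0]), max(run_low[1], run_high[1]))

# Toy stand-in aggregation: the span of the children's grades.
def span(grades):
    return (min(grades), max(grades))

print(propagate([(1, 2), (2, 3), (1, 3)], span))  # -> (1, 3)
```

Note how the rule can only widen intervals from one level to the next, which is the drawback mentioned in the text.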

We consider that the problem of interval evaluation on ordinal scales is an open theoretical problem that deserves future consideration (to our knowledge, very little literature on the subject is available: see Roubens and Vincke 1985, Vincke 1988, Pirlot and Vincke 1997, Tsoukiàs and Vincke 1999).

Another modification introduced in the aggregation procedure concerned the use of the veto concept. As already mentioned, a strong veto concept was used in the evaluation model, such that the presence of an "unacceptable" value on any node (among the ones endowed with such veto power) could result in a global "unacceptable" value. However, during the evaluation of the offers, weaker concepts of veto appeared necessary. The idea was that certain values could have a "limitation" effect of the type: "if an offer has the value x on a sub-node then it cannot be more than y on the parent node".

The results on node 7, concerning the performances of the prototypes, are presented in table 9.2.

      C7
O1    A-A
O2    G-G
O3    G-G
O4    A-A
O5    E-E
O6    A-A

Table 9.2: the values of the alternatives on the performance criterion (U: unacceptable, A: acceptable, G: good, VG: very good, E: excellent)

Remember that such a result is an ordinal scale obtained by aggregating the four scales defined as explained in the previous section. Therefore, it could be considered more as a ranking than as an absolute evaluation. For this reason the team of analysts decided to use this attribute only to rank the different offers after the sorting obtained using the six quality attributes. For this purpose the team of analysts tested three different aggregation scenarios, corresponding to three different hypotheses about the importance of the performance attribute.
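The strong veto and the weaker "limitation" effect can be contrasted in a small sketch. The helper names and the grade encoding (U=0, A=1, G=2, VG=3, E=4) are hypothetical; the case study applied these devices inside the hierarchical aggregation rather than as standalone functions.

```python
# Two veto-like devices on an ordinal scale encoded as U=0, A=1, G=2, VG=3, E=4.

def strong_veto(child_grades, veto_nodes):
    """Strong veto: an 'unacceptable' value on any node endowed with
    veto power forces a globally 'unacceptable' evaluation."""
    return any(child_grades[j] == 0 for j in veto_nodes)

def apply_limitation(parent_grade, child_grades, limits):
    """Weaker 'limitation' effect: if an offer has value x (or worse) on a
    sub-node, it cannot be more than y on the parent node.
    limits maps a sub-node index to the pair (x, y)."""
    for node, (x, y) in limits.items():
        if child_grades[node] <= x:
            parent_grade = min(parent_grade, y)
    return parent_grade

print(strong_veto({0: 2, 1: 0, 2: 3}, [1]))      # -> True (node 1 is 'unacceptable')
print(apply_limitation(3, {0: 1}, {0: (1, 2)}))  # -> 2 (parent capped at 'G')
```

The limitation device only caps the parent grade instead of annihilating the whole evaluation, which is exactly what made it attractive during the evaluation of the offers.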

1. The performance attribute is considered to have the same importance as the set of six quality attributes. This scenario represents the idea that the tests on the software performances correspond to the only "real" or "objective" measurement of the offers, and should therefore be viewed as a validation of the result obtained through the subjective measurement carried out on the six quality attributes. The aggregation procedure consisted in using the six quality attributes as criteria, each equipped with a weak order, from which to obtain a final ranking. Since the evaluations for some of the six attributes were in the form of an interval, an extended ordinal scale was defined in order to induce the weak order: E ≻ VG ≻ G-VG ≻ G ≻ A-VG ≻ A-G ≻ A ≻ U. The importance parameters are w(1.) = 2, w(2.) = 2, w(3.) = 4, w(4.) = 1, w(5.) = 4, w(6.) = 2 and the concordance threshold 12/15 (0.8). The six orders are the following ("x, y" standing for indifference between x and y):
- O5 ≻ O2 ≻ O3 ≻ O4 ≻ O1, O6;
- O2 ≻ O5 ≻ O3 ≻ O4 ≻ O6 ≻ O1;
- O2 ≻ O4 ≻ O3 ≻ O5, O1, O6;
- O2, O4 ≻ O3, O5 ≻ O1, O6;
- O2, O5 ≻ O3, O4 ≻ O1, O6;
- O3 ≻ O2 ≻ O6, O4 ≻ O5 ≻ O1.
The final result is presented in table 9.3. In order to rank the alternatives, a "score" is computed for each of them: the difference between the number of alternatives to which this specific alternative is preferred and the number of alternatives preferred to it. Then, the alternatives are ranked by decreasing value of this score. The final ranking thus obtained is given in figure 9.2 2a (it is worthwhile noting that the indifferences obtained in the final ranking correspond to incomparabilities obtained in the aggregation step). An intersection was therefore operated with the


[Figure 9.2 here: in panel 2a the chain is O2 ≻ O3, O4, O5 ≻ O6 ≻ O1; panel 2b refines it using the performance criterion.]

Figure 9.2: 2a: the ﬁnal ranking using the six quality criteria. 2b: the ﬁnal ranking as intersection of the six quality criteria and the performance criterion

ranking obtained on node 7, resulting in the final ranking reported in figure 9.2 2b.

2. The performance attribute is considered to be of secondary importance, to be used in order to distinguish among the alternatives assigned to the same class using the six quality attributes. In other words, the principal evaluation was considered to be the one using the six quality attributes, and the performance evaluation was only a supplement enabling a possible further distinction. Such an approach reflected the low confidence awarded to the performance evaluation and the undesirability of assigning it a high importance. A lexicographic aggregation was therefore applied, using the six quality criteria as in the previous scenario and applying the performance criterion to the equivalence classes of the global ranking. The final ranking is O2 ≻ O5 ≻ O3 ≻ O4 ≻ O6 ≻ O1.

3. A third approach consisted in considering the seven attributes as seven criteria to be aggregated to obtain a final ranking, assigning them reasoned importance parameters. The idea was that while the client could be interested in having the absolute evaluation of the offers (a result obtainable only using the six quality attributes), he could also be interested in a ranking of the alternatives that could help him in the final choice. From this point of

view, the absolute evaluations on the six quality attributes were transformed into rankings as in the first scenario, adding the seventh attribute as a seventh criterion. The two basic reasons were:
- while it was meaningful to interpret the ordinal measures for the six quality attributes as weak orders representing the client's preferences,
- it was not meaningful to translate the weak order obtained for the performance attribute as an ordinal measurement of the offers.
The importance parameters are w(1.) = 2, w(2.) = 2, w(3.) = 4, w(4.) = 1, w(5.) = 4, w(6.) = 2, w(7.) = 4 and the concordance threshold 16/19 (more than 0.8). The seven weak orders are the following:
- O5 ≻ O2 ≻ O3 ≻ O4 ≻ O1, O6;
- O2 ≻ O5 ≻ O3 ≻ O4 ≻ O6 ≻ O1;
- O2 ≻ O4 ≻ O3 ≻ O5, O1, O6;
- O2, O4 ≻ O3, O5 ≻ O1, O6;
- O2, O5 ≻ O3, O4 ≻ O1, O6;
- O3 ≻ O2 ≻ O6, O4 ≻ O5 ≻ O1;
- O5 ≻ O2, O3 ≻ O1, O4, O6.
The final result is reported in table 9.4. Using the same ranking procedure, the final ranking is now: O2 ≻ O5 ≻ O3, O4 ≻ O6 ≻ O1. Finally, and after some discussions with the client, the third scenario was adopted and used as the final result.

      O1  O2  O3  O4  O5  O6
O1     1   0   0   0   0   0
O2     1   1   1   1   1   1
O3     1   0   1   0   0   1
O4     1   0   0   1   0   1
O5     1   0   0   0   1   1
O6     1   0   0   0   0   1

Table 9.3: the outranking relation aggregating the six quality criteria (an entry 1 in row x, column y reads "x outranks y")

      O1  O2  O3  O4  O5  O6
O1     1   0   0   0   0   0
O2     1   1   1   1   0   1
O3     1   0   1   0   0   1
O4     1   0   0   1   0   1
O5     1   0   0   0   1   1
O6     1   0   0   0   0   1

Table 9.4: the outranking relation aggregating the seven criteria
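The score-based ranking procedure used in the scenarios can be replayed on the outranking table. This is a sketch under our assumptions: the 0/1 matrix encodes table 9.3 with rows read as "row outranks column", and the score is taken as (alternatives outranked) minus (alternatives outranking), which is the direction consistent with figure 9.2.

```python
# Score-based ranking from a 0/1 outranking table (rows read as
# "row outranks column"; the matrix below is table 9.3 as reconstructed).

def rank_by_score(names, S):
    n = len(names)
    # score = (alternatives this one outranks) - (alternatives outranking it),
    # self-comparisons excluded; a higher score means a better position
    score = {a: sum(S[i][j] for j in range(n) if j != i)
                - sum(S[j][i] for j in range(n) if j != i)
             for i, a in enumerate(names)}
    return sorted(names, key=lambda a: -score[a]), score

names = ["O1", "O2", "O3", "O4", "O5", "O6"]
S = [[1, 0, 0, 0, 0, 0],
     [1, 1, 1, 1, 1, 1],
     [1, 0, 1, 0, 0, 1],
     [1, 0, 0, 1, 0, 1],
     [1, 0, 0, 0, 1, 1],
     [1, 0, 0, 0, 0, 1]]

order, score = rank_by_score(names, S)
print(order)  # -> ['O2', 'O3', 'O4', 'O5', 'O6', 'O1']
```

O2 comes first, O3, O4 and O5 tie with the same score, followed by O6 and O1 — the chain of figure 9.2, panel 2a.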

The choice of the final aggregation was justified by a specific attitude towards the two basic evaluation "points of view": the quality information and the performance of the prototypes. As already reported, the first and second scenarios implicitly adopted two extreme positions concerning the importance of the performance attribute, corresponding to two different "philosophies" present in the team of analysts. The importance parameters and the concordance threshold adopted in the final version made it possible to define a compromise between these two extreme positions expressed during the decision aiding process. In fact the performance criterion is associated with an importance parameter of 4 which, combined with the concordance threshold of 16/19, implies that it is impossible for an alternative to outrank another if its value on the performance criterion is worse (and this satisfied the part of the team of analysts that considered the performance criterion as a critical evaluation of the offers). Giving a regular importance parameter to the performance criterion avoided the extreme situation in which all other evaluations could become irrelevant. The final ranking obtained respects this idea, and the outranking table could be understood by all the members of the team of analysts.

A major concern for people involved in complex decision processes is to be able to justify their behaviour, values, recommendations and decisions towards a director, a superior in the hierarchy of the company, an inspector, a committee etc. Such a justification applies both to how a specific result was obtained and to how the whole evaluation was conducted. In this case, the client considered the approach to be useful because "every activity was justified". It was extremely important for the client to be able to summarise the correspondence between an aggregation procedure and an operational attitude, because it enabled them to better argue against the possible objections of their client.

A final question that arose while the final recommendation was elaborated was whether it would be possible to provide a numerical representation of the values obtained by the offers and of the final ranking. It was soon clear that the question originated from the will of the final client to be able to negotiate with the AQ manager on a monetary basis, since it was expected that he would introduce the cost dimension into the final decision. For this purpose an appendix was included in the final recommendation where the following was emphasised:
- it is possible to give a numerical representation both to the ordinal measurement obtained using the six quality attributes and to the final ranking obtained using the seven criteria, but it was meaningless to use such a numerical representation in order to establish implicit or explicit trade-offs with a cost criterion;
- it is possible to compare the result with a cost criterion following two possible approaches: 1.) either induce an ordinal scale from the cost criterion and then, using an ordinal aggregation procedure, construct a final choice (then the negotiation should concentrate on defining the importance parameters, the thresholds etc.), or 2.) establish a value function of the client, using one of the usual protocols available in the literature (see also Chapter 6), to obtain the trade-offs between the

quality evaluations, the performance evaluations and the cost criterion (then the negotiations should concentrate on a value function);
- the team of analysts was also available to conduct this part of the decision aiding process if the client desired it.

The final client was very satisfied with the final recommendation and was also able to understand the reply about the numerical representation. In their view, the fact of being able to aggregate the ordinal information available in a correct and meaningful way was more than satisfactory, as they report in their ex-post remarks: "....pointed out that it was not necessary, as we thought before, to always use ratio scales and weighted sums, but that it was possible to use judgements and aggregate them....". He nevertheless decided to conduct the negotiations with the AQ manager personally, and so the team of analysts terminated its task with the delivery of the final recommendation.

A final consideration can be the fact that there was surely space (but no time) to experiment with more variants and methods for the aggregation procedure and the construction of the final recommendation. Valued relations, valued similarity relations, interval comparisons using extended preference structures, dynamic assignment of alternatives to classes and other innovative techniques, which in other decision aiding processes can be extremely useful, were considered too "new" by the client, who already considered the use of an approach different from the usual grid and weighted sum a revolution (compared with the company's standards).

9.4 Conclusions

Concluding this chapter we may try to summarise the lessons learned in this real experience of decision support.

The most important lesson perhaps concerns the process dimension of decision support. If the support had been limited to answering the client's demand on how to define a global evaluation (based on the weighted sum of their notes on the products), we may have provided them with an excellent multi-attribute value model that would have been of no interest for their problem: an answer to the demand, but not to the client's perception of the problem. What the client needed was continuous assistance and support during the decision process (the management of the call for tenders), enabling them to understand their role, the expected results and the way to provide a useful contribution. A careful analysis of the problem situation, a consensual problem formulation, a correct definition of the evaluation model and an understandable and legitimated final recommendation are the products that we have to provide in a decision aiding process. This is not against multi-attribute value based methods, but an emphasis on a process based decision aiding activity.

A second lesson learned concerns the "ownership" of the final recommendation. By this we want to indicate the fact that the client will be much more confident in the result and much more ready to apply it if he feels that he owns the result, in the sense that it is a product of his own convictions, computations, simulations and whatever else. Such ownership can be achieved if the client not only participates in elaborating the parameters of the evaluation model, but actually builds the model with the help of the analyst (which has been the case in our experience). Although the specific case may be considered exceptional (due to the specific dimension of the evaluation model and the double role of the client, being analyst for another client at the same time), we claim that it is always possible to include the client in the construction of the evaluation model in a way that allows him to feel responsible and to own the final recommendation. Such "ownership" greatly eases the legitimisation of the recommendation, since it is not just the "advice recommended by the experts who do not understand anything". It might be interesting to notice that a customised implementation of the model with the tools to which the client is accustomed (in our case the company spreadsheet) greatly improves the acceptance and legitimisation of the evaluation model.

A third lesson concerns the key issue of meaningfulness. The construction of the evaluation model must obey two dimensions of meaningfulness. The first is a theoretical and conceptual one and refers to the necessity to manipulate the information in a sound and correct way. The second is a practical one and refers to the necessity to manipulate the information in a way understandable by the client and corresponding to his intuitions and concerns. It is possible that these two dimensions may conflict. Therefore, the evaluation model has to satisfy both requirements. The existence of clear and sound theoretical results for the use of specific preference modelling tools, preference and/or measure aggregation procedures and other modelling tools definitely helps such a process.

A fourth lesson concerns the importance of the distinction between measures and preferences. The first refers to observations made on the set of alternatives, either through "objective" or through "subjective" measures. The second refers to the client's values. Knowing that a software has n function points while another has m function points does not imply any particular preference between them. Moving from one to the other might be possible, but is not obvious and has to be carefully studied.

A fifth lesson concerns the definition of the aggregation procedure in the evaluation model. The previous chapters of this book provide enough evidence that universal methods for aggregating preferences and/or measures do not exist. The choice of an aggregation procedure is always subjective and depends on the problem situation. Therefore, the aggregation procedures included in an evaluation model are choices that have to be carefully studied and justified, thus implying a process of adaptation guided by reciprocal learning for the client and the analyst.

A sixth lesson is about uncertainty. Even when the available information is considered reliable, uncertainty may appear (as in our case), possibly created by the method itself. Moreover, uncertainty can appear in a very qualitative way and not necessarily in the form of an uncertainty distribution. It is necessary to have a large variety of uncertainty representation tools in order to include the relevant one in the evaluation model.

Last, but not least, we emphasise the significant number of open theoretical problems the case study highlights (interval evaluation, hierarchical measurement, ordinal measurement, hesitation modelling, ordinal value theory etc.). We hope that the case study offered an introduction to this problem.

Appendix A

The basic concepts adopted in the procedure used (based on ELECTRE TRI) are the following.
- A set A of alternatives ai, i = 1 ... m.
- A set G of criteria gj, j = 1 ... n.
- Each criterion gj is equipped with an ordinal scale Ej with degrees e_j^l, l = 1 ... k.
- A set P of profiles ph, h = 1 ... t, ph = (e_1^h ... e_n^h) being a collection of degrees, such that if e_j^h belongs to profile ph, then e_j^{h+1} cannot belong to profile p{h-1}.
- A set C of categories cλ, λ = 1 ... t + 1, such that the profile ph is the upper bound of category ch and the lower bound of category c{h+1}.
- A set of preference relations Pj, Ij for each criterion gj, induced by the ordinal scale associated with criterion gj, such that:
  ∀x ∈ A: Pj(x, e_j^h) ⇔ gj(x) ≻ e_j^h;
  ∀x ∈ A: Pj(e_j^h, x) ⇔ e_j^h ≻ gj(x);
  ∀x ∈ A: Ij(x, e_j^h) ⇔ gj(x) ≈ e_j^h.
- A relative importance wj (usually normalised in the interval [0, 1]) attributed to each criterion gj.
- An outranking relation S ⊂ (A × P) ∪ (P × A), where s(x, y) should be read as "x is at least as good as y".

The procedure works in two basic steps.

1. Establish the outranking relation on the basis of the following rule: s(x, y) ⇔ C(x, y) and not D(x, y), where
∀x ∈ A, y ∈ P: C(x, y) ⇔ ( Σ_{j∈G±} wj ≥ c and Σ_{j∈G+} wj ≥ Σ_{j∈G−} wj ) or ( Σ_{j∈G+} wj > Σ_{j∈G−} wj );
∀y ∈ A, x ∈ P: C(x, y) ⇔ Σ_{j∈G±} wj ≥ c and ( Σ_{j∈G+} wj ≥ Σ_{j∈G−} wj );
∀(x, y) ∈ (A × P) ∪ (P × A): not D(x, y) ⇔ Σ_{j∈G−} wj ≤ d and ∀gj not vj(x, y),

where:
- G+ = {gj ∈ G : Pj(x, y)};
- G− = {gj ∈ G : Pj(y, x)};
- G= = {gj ∈ G : Ij(x, y)};
- G± = G+ ∪ G=;
- c: the concordance threshold, c ∈ [0, 1];
- d: the discordance threshold, d ∈ [0, 1];
- vj(x, y): veto of y on x, expressed on criterion gj.

2. When the relation S is established, assign any element ai on the basis of the following rules.
2.1 Pessimistic assignment:
- ai is iteratively compared with pt ... p1;
- as soon as s(ai, ph) is established, assign ai to category c{h+1} (if no profile satisfies the condition, ai is assigned to category c1).
2.2 Optimistic assignment:
- ai is iteratively compared with p1 ... pt;
- as soon as s(ph, ai) ∧ ¬s(ai, ph) is established, assign ai to category ch (if no profile satisfies the condition, ai is assigned to category c{t+1}).
The pessimistic procedure finds the profile for which the element is not the worst. The optimistic procedure finds the profile against which the element is surely the worse. If the optimistic and pessimistic assignments coincide, then no uncertainty exists for the assignment. Otherwise, an uncertainty exists and should be considered by the user.

In order to better understand how the procedure works, consider the following example.
- Four criteria g1 ... g4, of equal importance (∀j: wj = 1/4), each of them equipped with an ordinal scale A ≻ B ≻ C ≻ D.
- Two profiles p1 = (C, B, C, C) and p2 = (A, B, B, B), defining three categories: unacceptable (U), acceptable (A) and good (G) (p2 being the minimum profile for category G, p1 being the minimum profile for category A).
- Three alternatives: a1 = (D, B, B, B), a2 = (B, B, C, C), a3 = (A, B, B, C).
- Further on, fix c = 0.75, d = 0.40 and, ∀j: vj(x, y) ⇔ gj(x) = D (a value D on any criterion imposes a veto).

With such information it is possible to establish the outranking relation, which is S = {(p2, a1), (p2, a2), (p2, a3), (p2, p1), (a2, p1), (a3, p1)}. The reader can easily check that the pessimistic assignment puts alternative a1 in category U and alternatives a2 and a3 in category A, while the optimistic assignment puts all three alternatives in category A.
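The whole appendix procedure can be replayed on this example with a short script. It is a simplified sketch: one concordance rule is used instead of the two case-specific ones given above, the weights are assumed to sum to 1, and the helper names are ours.

```python
# ELECTRE-TRI-like assignment replayed on the example of this appendix.

VALUE = {"D": 0, "C": 1, "B": 2, "A": 3}       # ordinal scale A > B > C > D
W = [0.25, 0.25, 0.25, 0.25]                   # equal importance, sums to 1
C_THRESH, D_THRESH = 0.75, 0.40                # concordance / discordance thresholds

def outranks(x, y):
    g_plus = sum(w for w, a, b in zip(W, x, y) if VALUE[a] > VALUE[b])
    g_minus = sum(w for w, a, b in zip(W, x, y) if VALUE[a] < VALUE[b])
    g_pm = 1.0 - g_minus                       # weight of criteria not against x
    concordance = g_pm >= C_THRESH and g_plus >= g_minus
    veto = any(a == "D" for a in x)            # vj(x, y) <=> gj(x) = D here
    discordance = g_minus > D_THRESH or veto
    return concordance and not discordance

def pessimistic(a, profiles):                  # profiles from lowest to highest
    for h in range(len(profiles) - 1, -1, -1):
        if outranks(a, profiles[h]):
            return h + 1                       # category just above profile h
    return 0                                   # lowest category

def optimistic(a, profiles):
    for h, p in enumerate(profiles):
        if outranks(p, a) and not outranks(a, p):
            return h                           # category just below profile h
    return len(profiles)                       # highest category

CAT = ["U", "A", "G"]
p1, p2 = "CBCC", "ABBB"
alts = ["DBBB", "BBCC", "ABBC"]                # a1, a2, a3

print([CAT[pessimistic(a, [p1, p2])] for a in alts])  # -> ['U', 'A', 'A']
print([CAT[optimistic(a, [p1, p2])] for a in alts])   # -> ['A', 'A', 'A']
```

Note how a1 illustrates the endogenous uncertainty discussed in the chapter: its D on the first criterion vetoes any outranking of the profiles, so the pessimistic rule sends it to U while the optimistic rule still accepts it in A.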

Appendix B

The complete list of the attributes used in the evaluation model

1 LAND-BASE MANAGEMENT
- User interface: graphics type; graphics engine adequacy; interface personalisation
- Functionality: planes analysis functions; topological connectivity functions; graphical rendering functions
- Development environment: software configuration management; code browsing; libraries personalisation; development support tools; debugging support tools; code documentation
- Administration tools: user administration functions; performance data collection
- Documentation quality: availability (documentation support tools); adequacy (completeness; documentation support type; information retrieval ease; contextual help)
- Work flow connection
- Interoperability
- Integration between Land-base products and the Spatial Data Manager: interfaces integration; data sharing
- Integration among Land-base products: vectorial data products integration; descriptive data products integration; raster data products integration; Digital Terrain Model products integration

2 GEOMARKETING
- User interface: graphics type; graphics engine adequacy; interface personalisation
- Functionality: planes analysis functions; graphical rendering functions
- Development environment: software configuration management; code browsing; libraries personalisation; development support tools; debugging support tools; code documentation
- Administration tools
- Documentation quality: availability (documentation support tools); adequacy (completeness; documentation support type; information retrieval ease; contextual help)
- Interoperability
- Integration between Geomarketing products and the Spatial Data Manager: interfaces integration; data sharing
- Integration among Geomarketing products: vectorial data products integration; descriptive data products integration; raster data products integration

3 PLANNING, DESIGN, IMPLEMENTATION AND OPERATING SUPPORT
- User interface: graphics type; graphics engine adequacy; interface personalisation

- Functionality: planes analysis functions; topological connectivity functions; graphical rendering functions; network schema creation
- Development environment: software configuration management; code browsing; libraries personalisation; development support tools; debugging support; code documentation
- Administration tools: user administration functions; performance data collection
- Documentation quality: availability (documentation support tools); adequacy (completeness; documentation support type; information retrieval ease; contextual help)
- Work flow connection
- Interoperability
- Integration between this process products and the Spatial Data Manager: interfaces integration; data sharing
- Integration among this process products: vectorial data products integration; descriptive data products integration; raster data products integration; Digital Terrain Model products integration

4 DIAGNOSIS SUPPORT AND CUSTOMER CARE
- User interface: graphics type; graphics engine adequacy; interface personalisation

- Functionality: planes analysis functions; topological connectivity functions; graphical rendering functions; network schema creation
- Development environment: software configuration management; code browsing; libraries personalisation; development support tools; debugging support; code documentation
- Administration tools: performance data collection
- Documentation quality: availability (documentation support tools); adequacy (completeness; documentation support type; information retrieval ease; contextual help)
- Interoperability
- Integration between this process products and the Spatial Data Manager: interfaces integration; data sharing
- Integration among this process products: vectorial data products integration; descriptive data products integration; raster data products integration

5 SPATIAL DATA MANAGER
- Data base properties: fundamental properties; transaction typology support; data / function association; client data access libraries
- Basic properties of the Spatial Data Manager: data model; data management; data integration; spatial operators

- Further basic properties of the Spatial Data Manager: coordinate systems; vectorial data continuous management; independence from features structure
- Special properties of the Spatial Data Manager: public libraries for feature manipulation; server data access libraries; Structured Query Language to access descriptive data; data sharing constraints; feature versioning; feature life-cycle management; data distribution; database distribution; database access control; backup; data administration tools
- Integration between the Spatial Data Manager and the Data Layer: integration with Oracle; integration with Unix and MVS relational databases; integration with Oracle Designer 2000; logical scheme import capability; Spatial Data Manager platform

6 SOFTWARE QUALITY
- Robustness
- Maturity
- Easiness of installation and maintenance

7 PERFORMANCES
- Single transaction under different data volume
- Data Manager under different operation typology
- Data Manager under different concurrent transactions
- Graphical interfaces performances


10 CONCLUSION

10.1 Formal methods are all around us

The aim of this book was to provide a critical introduction to a number of "formal decision and evaluation methods". By this, we mean sets of explicit and well-defined rules used to collect, assess and process information in order to make recommendations in decision and/or evaluation processes. Such methods emanate from many different disciplines (Political Science, Economics, Operational Research, Statistics, Computer Science, Decision Theory, Engineering, Education Science, etc.) and are used to support numerous kinds of decision or evaluation processes. Although these methods may not be entirely formalised, their underlying logic should be explicit, contrary to, say, astrology or graphology. It is not an overstatement to say that nowadays nearly everyone is, implicitly or explicitly, confronted with such methods. We briefly summarise below the main methods presented in this book and the difficulties that have been encountered.

"Following a democratic election, Mr. X has been elected"

As citizens, we regularly have to cast several kinds of votes. As mentioned in chapter 2, elections are governed by "rules" that are very far from being innocuous. Similar votes may well lead to very different results depending on the rules used to process them. Therefore, under a slightly different electoral system, Mr. X might not have been elected. Such "electoral rules" contribute towards shaping the entire political debate in a country and, thus, influence the type of democracy we live in.
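The claim that identical ballots may elect different candidates under different rules is easy to check in a few lines of code. The three-candidate profile below is hypothetical, and plurality and Borda are two standard rules of the kind analysed in chapter 2; this is only an illustrative sketch, not a method advocated by the book.

```python
from collections import Counter

# Hypothetical profile: 21 voters, each entry is
# (number of voters, ranking from best to worst).
profile = [
    (10, ("a", "b", "c")),
    (6,  ("b", "c", "a")),
    (5,  ("c", "b", "a")),
]

def plurality_winner(profile):
    """Each voter votes for her top candidate; the most voted candidate wins."""
    tally = Counter()
    for count, ranking in profile:
        tally[ranking[0]] += count
    return tally.most_common(1)[0][0]

def borda_winner(profile):
    """A candidate ranked i-th among m candidates scores m - 1 - i points."""
    scores = Counter()
    for count, ranking in profile:
        m = len(ranking)
        for i, candidate in enumerate(ranking):
            scores[candidate] += count * (m - 1 - i)
    return scores.most_common(1)[0][0]

print(plurality_winner(profile))  # a (10 first places against 6 and 5)
print(borda_winner(profile))      # b (Borda scores: a = 20, b = 27, c = 16)
```

With the very same ballots, candidate a wins under plurality while candidate b wins under the Borda count (b also beats both rivals in pairwise majority contests), which is exactly the sensitivity to the rule discussed above.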

. etc. criteria into account when making a decision ? This area.238 CHAPTER 10. “Calculations show that it is not proﬁtable to equip this hospital with a maternity department” The quality of the roads on which we drive. The resulting numbers do not appear to be measured on some well-deﬁned type of scale. in most cases. the analyst has the choice between several “aggregation strategies” that could lead to diﬀerent results. the very notion of a ‘best buy’ is highly debatable. CONCLUSION have been signiﬁcantly diﬀerent depending on the grading policy and/or correction habits of some teachers. This raises many diﬃculties outside simple cases: how to convert the various consequences of a complex project into monetary units. the safety regulations applied to factories near our homes.) by using numbers. etc. Therefore. at best. Therefore. the comparison of these strategies raises many problems.. In chapter 4. are shown to have little (if any) clear meaning outside a well-deﬁned aggregation strategy. the quality of our social security system. “Based on numerous tests it appears that the ‘best buy’ is car Z” How to take several. “Things are going well since the ‘well-being’ index in our country raised by more than 10% over the last three years” Statisticians have elaborated an incredible number of indicators or indices aiming at capturing many aspects of reality (including the quality of the air we breeze. Furthermore. a very crude indication. generally conﬂicting. the fact that his exams were corrected late at night or on the way his various grades were aggregated. the richness of a country. claiming that the ‘well-being’ index has increased by 10% gives. like the “importance” of criteria. Not only are our newspapers full of these kinds of ﬁgures but they are also routinely used to make important political or economic decisions. We showed that. the way our electricity is produced. are highly dependent on numerous debatable hypotheses (e. 
the pricing of a number of statistical “delivery incidents” due to a longer transportation time for some mothers). It is not unlikely that other reasonable hypotheses may have led to an opposite decision. Cost-beneﬁt analysis evaluates such projects using money as a yardstick. known as Multiple Criteria Decision Making (MCDM) is the subject of chapter 6. we saw that such “measures” should not be confounded with the familiar “measurement operations” in Physics. the tariﬃng of public transportation.g. how to cope with equity considerations in the distribution of costs and beneﬁts. Their properties are sometimes intriguing and they surely should be manipulated with care. Therefore. Each of these strategies requires the assessment of more or less rich and precise “inter-criteria” information. depend on particular ways of assessing and summarising the costs and the beneﬁts of alternative projects. apparently familiar concepts. the apparently objective calculations invoked to refuse the creation of a maternity department in our hospital. how to take the distribution in time of these consequences into account? In chapter 5 we saw that cost-beneﬁt analysis can hardly claim to always solve all these diﬃculties in a satisfactory manner. Since such assessments shape preference information as much as they collect it. its state of development. because each potential buyer has his own preferences and interests and there are many diﬀerent and yet reasonable ways to aggregate them.
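The debatable character of a single "best buy" can be made concrete with a toy example. The cars, criteria, ratings and weights below are entirely hypothetical, and the two aggregation strategies shown (a weighted sum and a "worst criterion" rule) are merely two among the many reasonable ones mentioned above.

```python
# Hypothetical ratings of three cars on three criteria (0-10, higher is better).
cars = {
    "X": {"price": 9, "safety": 4, "comfort": 5},
    "Y": {"price": 5, "safety": 8, "comfort": 6},
    "Z": {"price": 2, "safety": 9, "comfort": 9},
}

def weighted_sum(ratings, weights):
    """Classical weighted average of the ratings."""
    return sum(weights[c] * v for c, v in ratings.items())

def worst_criterion(ratings):
    """A cautious buyer judges each car by its weakest criterion."""
    return min(ratings.values())

weights = {"price": 0.5, "safety": 0.3, "comfort": 0.2}  # a price-driven buyer
best_by_sum = max(cars, key=lambda name: weighted_sum(cars[name], weights))
best_by_min = max(cars, key=lambda name: worst_criterion(cars[name]))

print(best_by_sum)  # X: the weighted sum rewards the low price
print(best_by_min)  # Y: the cautious rule penalises X's poor safety rating
```

Both strategies are defensible, yet they crown different cars; the "best buy" depends on the aggregation strategy as much as on the test results.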

"Given what you told me about your preferences and beliefs, you should not invest in this project in view of its expected utility"

Standard decision analysis techniques (see e.g. Raiffa 1970) are often seen as synonymous with decision support methods in risky and/or uncertain situations. Using a real example in electricity production planning, we showed in chapter 8 why the implementation of these standard techniques may not be as straightforward as is often believed. Besides possible computational problems, the assessment and revision of (subjective) probability distributions in highly ambiguous environments and in situations involving a long period of time is an enormous task. Furthermore, important considerations, like the dynamic consistency of choices and the aggregation of consequences over time, were shown to be largely open questions. Alternative tools, such as possibilities, belief functions, fuzzy sets and other kinds of non-additive uncertainty measures, may appear as good contenders, although their theoretical basis may be seen as less firm than the one underlying standard Bayesian analysis.

"Relax, our new camera will choose the 'optimal focus' for you"

Our washing machines, our cameras, our TV sets often take decisions on their own: the amount of water or energy to use, the right focus, the clarity of an image, the supposedly "optimal" tuning of channels. The "decision modules" underlying such automatic decisions were studied in chapter 7. We saw that they are based on concepts and techniques that are very similar to the ones examined in chapter 6 and, thus, raise similar problems and questions. Contrary to the situation in chapter 6, however, they are used in real time without human intervention after the implementation stage. This raises new difficulties and issues. Therefore, relying on the automatic decisions taken by the new camera might not always be your best option.

Whether we like it or not, it seems difficult nowadays to escape from formal decision and evaluation methods. We may ignore them; the authors of this book believe, however, that it may be interesting and profitable to give them a closer look.

10.2 What have we learned?

Although the methods examined in this book are apparently very different and emanate from various disciplines, they appear to have a lot in common. This should not be much of a surprise, since these methods have the common objective of providing recommendations in complex decision and evaluation processes. What might be slightly more surprising is that most of these methods and tools are plagued with many difficulties. Furthermore, there might be more than one way to assess preferences and beliefs and to combine them in order to make a recommendation. The real case-study presented in chapter 9 has nevertheless shown that their proper use can have a significant impact on real complex decision or evaluation processes. Let us try to summarise the main findings and problems encountered in the preceding chapters.

• Objective and scope of formal decision/evaluation models

– Formal decision and evaluation models are implemented in complex decision/evaluation processes. Using them rarely amounts to solving a well-defined formal problem. Their usefulness not only depends on their intrinsic formal qualities but also on the quality of their implementation (structuration of the problem, communication with the actors involved in the process, transparency of the model, etc.). Having a sound theoretical basis is therefore a necessary but insufficient condition for their usefulness (see chapter 9).
– The objective of these models may be different from recommending the choice of a "best" course of action. More complex recommendations, e.g. ranking the possible courses of action or comparing them to standards, are also frequently needed (see chapters 3, 4, 6 and 7). Moreover, the usefulness of such models is not limited to the elaboration of several types of recommendations. When properly used, they may provide support at all steps of a decision process (see chapter 9).

• Collecting data

– All models imply collecting and assessing "data" of various types and qualities and manipulating these data in order to derive conclusions that will hopefully be useful in a decision or evaluation process. This more or less inevitably implies building "evaluation models" trying to capture aspects of "reality" that are difficult to define with great precision (see chapters 3, 4, 6 and 9).
– The numbers resulting from such "evaluation models" often appear as constructs that are the result of multiple options. The choice between these various possible options is only partly guided by "scientific considerations". Furthermore, such numbers are often plagued with imprecision, ambiguity and/or uncertainty.
– Implementing a decision/evaluation model only rarely implies capturing aspects of reality that can be considered as independent of the model (see chapters 6 and 9). The use of evaluation models greatly contributes to shaping and transforming the "reality" that we would like to "measure".
– The properties of the numbers manipulated in such models should be examined with care. They are measured on scales that are difficult to characterise properly and should not be confounded with numbers resulting from classical measurement operations in Physics. More often than not, these numbers seem, at best, to give an order of magnitude of what is intended to be captured (see chapters 3, 4 and 6). Therefore, using "numbers" may only be a matter of convenience and does not imply that any operation can be meaningfully performed on them (see chapters 3, 4, 6 and 7).
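The point that writing evaluations as numbers does not make every arithmetic operation on them meaningful can be illustrated with ordinal grades. The students and codings below are hypothetical; the sketch only shows that a statement such as "x has the better average" may not survive an order-preserving recoding of an ordinal scale.

```python
# Grades of two students on three courses, on an ordinal scale 1 < 2 < 3.
x = [3, 3, 1]
y = [2, 2, 2]

def mean(grades, coding):
    """Average of the grades under a given numerical coding of the levels."""
    return sum(coding[g] for g in grades) / len(grades)

coding1 = {1: 1, 2: 2, 3: 3}  # one admissible coding of the three levels
coding2 = {1: 1, 2: 5, 3: 6}  # another coding preserving the same order

# The comparison of averages is not invariant under order-preserving
# recodings, so it is not a meaningful statement on an ordinal scale.
print(mean(x, coding1) > mean(y, coding1))  # True  (2.33... > 2.0)
print(mean(x, coding2) > mean(y, coding2))  # False (4.33... < 5.0)
```

Since both codings respect the order of the levels equally well, neither verdict about the averages can claim any objective content.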

• Aggregating evaluations

– Aggregating the results of complex "evaluation models" is far from being an easy task. Although many aggregation models amount to summarising these numbers into a single one, this is not the only possible aggregation strategy (see chapters 3, 4, 5 and 6).
– Many different tools can be envisaged to model the preferences of an actor in a decision/evaluation process (see chapters 2 and 6).
– Devising an aggregation technique is not an easy task. Apparently reasonable principles can lead to a model with poor properties. A formal analysis of such models may therefore prove of utmost importance (see chapters 2, 4 and 6).
– The pervasive use of simple tools such as weighted averages may lead to disappointing and/or unwanted results. The use of weighted averages should in fact be restricted to rather specific situations that are seldom met in practice.
– Aggregation techniques often call for the introduction of "preference information", e.g. concerning the relative importance of the several points of view. The type of aggregation model that is used greatly contributes to shaping this information; assessment techniques therefore not only collect but shape and/or create preference information (see chapter 6). Intuitive preference information may, moreover, be difficult to interpret within a well-defined aggregation model (see chapter 6).
– Deriving robust conclusions on the basis of such aggregation models requires a lot of work and care. The search for robust conclusions may imply analyses much more complex than simple sensitivity analyses varying one parameter at a time in order to test the stability of a solution (see chapters 6 and 8).

• Dealing with imprecision, ambiguity and uncertainty

– In order to allow the analyst to derive convincing recommendations, the model should explicitly deal with imprecision, uncertainty and inaccurate determination. Modelling all these elements within the classical framework of Decision Theory using probabilities may not always lead to an adequate model. It is not easy, however, to create an alternative framework in which problems such as dynamic consistency or respect of (first order) stochastic dominance are dealt with in a satisfactory manner (see chapters 6 and 8).

We saw that the methods reviewed in chapters 2 to 8 are far from being without problems; indeed, these chapters can be seen as a collection of the defects of these methods. Some readers may think that, faced with such evidence, this type of method should be abandoned and that "intuition" or "expertise" are not likely to do much worse, at lower cost and with less effort. In our opinion, this would be a totally unwarranted conclusion.
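The remark above, that sensitivity analyses varying one parameter at a time may be insufficient, can be checked on a small constructed example (hypothetical scores and weights): the ranking of two projects survives every single reallocation of weight mass, yet flips when two weights are moved together.

```python
from itertools import permutations

# Hypothetical scores of two projects on three criteria (higher is better).
a = (8, 4, 6)
b = (6, 6, 5)
base = (1/3, 1/3, 1/3)  # initial weights
delta = 0.05            # size of the weight perturbation

def a_beats_b(weights):
    """True if project a has the larger weighted sum."""
    return sum(w * (ai - bi) for w, ai, bi in zip(weights, a, b)) > 0

def shift(weights, i, j, d):
    """Move a mass d from weight i to weight j (weights keep summing to 1)."""
    w = list(weights)
    w[i] -= d
    w[j] += d
    return w

# One-at-a-time analysis: move delta between every single pair of weights.
stable_one_at_a_time = all(a_beats_b(shift(base, i, j, delta))
                           for i, j in permutations(range(3), 2))

# Joint variation: strengthen criterion 2 at the expense of both others.
stable_joint = a_beats_b(shift(shift(base, 0, 1, delta), 2, 1, delta))

print(stable_one_at_a_time)  # True: a stays on top under every single shift
print(stable_joint)          # False: a is overtaken under the joint shift
```

A one-at-a-time analysis would have certified the conclusion "a is best" as robust, while a joint exploration of the weight space reveals that it is not.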

It is the firm belief and conviction of the authors of this book that the use of formal decision and evaluation tools is both inevitable and useful. Three main arguments can be proposed to support this claim.

First, whenever "intuition" or "expertise" has been subjected to close scrutiny, it has been more or less always shown that such types of judgements are based on heuristics that are likely to neglect important aspects of the situation and/or are affected by many biases (see the syntheses of Kahneman, Slovic and Tversky 1981, Hogarth 1987, Russo and Schoemaker 1989, Bazerman 1990, Thaler 1991, Poulton 1994).

Second, formal methods have a number of advantages that often prove crucial in complex organisational and/or social processes:
• they promote communication between the actors of a decision or evaluation process by offering them a common language;
• they require building models of certain aspects of "reality"; formal methods are thus often indispensable structuration instruments, and this implies concentrating efforts on crucial matters;
• they lend themselves easily to "what-if" types of questions; these exploration capabilities are crucial in order to devise robust recommendations.
Although these advantages may have little weight compared to the obvious drawbacks of formal methods in terms of effort, money and time consumed in some situations (e.g. a very simple decision/evaluation process involving a single actor), they appear to us fundamental in most social or organisational processes (see chapter 9). Furthermore, it should not be forgotten that formal tools lend themselves more easily to criticism and close examination than other kinds of tools.

Third, casual observation suggests that there is an increasing demand for such tools in various domains (going from executive information systems, decision support systems and expert systems to standardised evaluation tests and impact studies). It is our belief that the introduction of such tools may have quite a beneficial impact in many areas in which they are not commonly used. Although many companies use tools such as graphology and/or astrology in order to select between applicants for a given position, we are more than inclined to say that the use of more formal methods could improve such selection processes (let alone on issues such as fairness and equity) in a significant way. Similarly, the introduction of more formal evaluation tools in the evaluation of public policies, laws and regulations (e.g. fiscal policy, policy against crime and drugs, policy towards the carrying of guns, the establishment of environmental standards, etc.), an area in which they are strikingly absent in many countries, would surely contribute to a more transparent and effective government.

10.3 What can be expected?

Our plea for the introduction of more formal decision and evaluation tools may appear paradoxical in view of the content of this book. Have we been overly critical then? Certainly not. Indeed, a thorough critical examination of each of the methods covered in chapters 2 to 8 could be the subject of an entire book, and our willingness to keep mathematics and formalism to the lowest possible level has not allowed us to explore many technical details and difficulties.

The paradox between our conviction in the usefulness of formal methods and the content of this book is only apparent and results from a misunderstanding. Supporting a decision or an evaluation process should not be confounded with solving a "well-defined formal problem". Although it may make sense to associate a "good" method for solving it to such a problem, supporting real decision and evaluation processes should not be confounded with this formal exercise.

The fact that many decision and evaluation tools are plagued with serious difficulties is troublesome. Indeed, unless one believes that there is a single "best way" to provide support in each type of decision or evaluation process (and we doubt that this is a reasonable belief), the very way in which a "good" formal decision/evaluation method is defined is nothing but clear. Two main, non-exclusive, paths have often been suggested for this purpose. None of them appear totally convincing to us.

• the engineering route, which amounts to saying that a method is good because "it works", i.e. because it has been applied several times in real-world problems and has been well accepted by the actors in the process. First, it is important to remember that the "quality" of the support provided by a formal tool is very difficult to separate from considerations linked to the implementation of the method. Indeed, the formal tools used by an analyst are implemented in decision or evaluation processes that may be highly complex (involving many different actors, lasting a long time and being governed by complex rules and/or regulations). As should be apparent from chapter 9, the resulting decision/evaluation aid process is conditioned by many factors outside the realm of the formal method: the quality of the structuration of the problem, the availability of user-friendly softwares, the very presence of analysts, the quality of the communication with stakeholders, the timing and costs of the study, etc. All these elements are of utmost importance in the quality of a decision/evaluation aid process. Second, it should not be unexpected that, in practice, it is often difficult to know whether the proposed model "worked" or not. Even though the final decision is at variance with the recommendations derived from the model, the questions it raised and the type of reasoning it promoted could have had a significant impact on the decision process. Should we say then that the method has "worked" or not? Although we would definitely not favour a method that would be unable to pass such a test, we doubt that the "engineering" argument is sufficient to define what would distinguish "good" formal decision or evaluation methods. A close variant of the engineering route could be called the naive route.

It amounts to saying that a formal tool is adequate if it consistently leads to "good" decisions. The literature on "decision" (see Raiffa 1970, Russo and Schoemaker 1989, Keeney, Hammond and Raiffa 1999), however, has always insisted on the fact that "good decisions do not necessarily lead to good outcomes". This literature shows that it is very difficult to define what would constitute a "good decision" a priori (good in which state of nature? good for whom? good according to what criteria? at what moment in time? etc.) and that the essential idea is to promote a good "decision process". Analysts implementing formal decision and evaluation tools are in a position similar to that of an engineer; contrary to most engineers, however, these "decision engineers" often lack clear criteria for appreciating the "success" or "failure" of their models.

• the rational route, which amounts to saying that a method is adequate if it is backed by a sound theory of "rational choice". A striking example of the difficulty of this position can be found in the area of decision under risk and uncertainty. While, until the beginning of the eighties, expected utility theory was considered almost unanimously as the "rational theory of choice under risk", the proliferation of alternative theories since then (see e.g. Kahneman and Tversky 1979, Machina 1982, Quiggin 1982, Loomes and Sugden 1982, Yaari 1987, Fishburn 1988, Jaffray 1988, Gilboa and Schmeidler 1989, Jaffray 1989, Schmeidler 1989, Wakker 1989, McClennen 1990, Nau 1995, Dubois, Fargier and Prade 1997), fostered by the results of numerous empirical experiments (see e.g. Allais 1953, McCrimmon and Larsson 1979, Kahneman and Tversky 1979), presently results in a very complex situation in which it is not easy to discriminate between theories from either an empirical (see e.g. Sopher and Gigliotti 1993, Abdellaoui and Munier 1994, Harless and Camerer 1994, Hey and Orme 1994, Carbone and Hey 1995) or a normative point of view (see e.g. Hammond 1988, Machina 1989, Nau and McCardle 1991). This is true even though most, if not all, of these theories have been axiomatically characterised (i.e. a set of conditions is known that completely characterises the proposed choice or evaluation models). Having axioms is certainly useful in order to compare theories, but the "rational" content of the axioms and their interpretation remain much debated. Furthermore, the relations between a formal axiomatic theory and the assessment technologies derived from it are far from being obvious (see e.g. Hershey, Kunreuther and Schoemaker 1982, McCord and de Neufville 1982, Johnson and Schkade 1989, Bouyssou 1984). Although we find theories most useful, the criteria for separating sound from unsound theories of "rational choice" therefore do not appear obvious to us.

Can something be done then? In view of the many difficulties encountered with the models envisaged in this book and of the many fields in which no formal decision and evaluation tools are used, we do think that this area will be rich and fertile for future research. At this point it should be apparent, however, that research on formal decision and evaluation methods should not be guided by the hope of discovering models that would be ideal under certain types of circumstances.
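The kind of empirical difficulty alluded to above can be made concrete with the classical Allais example (a standard textbook formulation with payoffs in millions, not taken from this chapter). Whatever utility function is used, expected utility ranks the first pair of lotteries exactly as it ranks the second, so the frequently observed pattern "A over B together with D over C" cannot be accommodated.

```python
# Classical Allais lotteries, written as {payoff: probability}.
A = {1.0: 1.00}
B = {5.0: 0.10, 1.0: 0.89, 0.0: 0.01}
C = {1.0: 0.11, 0.0: 0.89}
D = {5.0: 0.10, 0.0: 0.90}

def eu(lottery, u):
    """Expected utility of a lottery under the utility function u."""
    return sum(p * u(x) for x, p in lottery.items())

# For ANY u, eu(A) - eu(B) and eu(C) - eu(D) both reduce to
# 0.11 u(1) - 0.10 u(5) - 0.01 u(0): the two choices must agree.
u = lambda x: x ** 0.5  # one (risk-averse) utility, for illustration
gap_AB = eu(A, u) - eu(B, u)
gap_CD = eu(C, u) - eu(D, u)
assert abs(gap_AB - gap_CD) < 1e-12  # identical up to rounding
```

Hence no expected utility maximiser can prefer A to B while preferring D to C, which is precisely the pattern reported in the experiments cited above.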


Freed from the idea that we will discover the method, we can, more modestly and more realistically, expect to move towards:
• structuring tools that will facilitate the implementation of formal decision and evaluation models in complex and conflictual decision processes;
• flexible preference models able to cope with data of poor or unknown quality and with conflicting or lacking information;
• assessment protocols and technologies able to cope with complex and unstable preferences, uncertain trade-offs, hesitation and learning;
• tools for comparing aggregation models, in order to know what they have in common and whether one is likely to be more appropriate in view of the quality of the data;
• tools for defining and deriving "robust" conclusions.
To summarise, the future as we see it: structuration methodologies allowing for an explicit involvement and participation of all stakeholders, flexible preference models tolerating hesitations and contradictions, flexible tools for modelling imprecision and uncertainty, evaluation models fully taking incommensurable dimensions into account in a meaningful way, assessment technologies incorporating framing effects and learning processes, and exploration techniques allowing robust recommendations to be built (see Bouyssou et al. 1993). Thus, "thanks to rigorous concepts, well-formulated models, precise calculations and axiomatic considerations, we should be able to clarify decisions by separating what is objective from what is less objective, by separating strong conclusions from weaker ones, by dissipating certain forms of misunderstanding in communication, by avoiding the trap of illusory reasoning, by bringing out certain counter-intuitive results" (Roy and Bouyssou 1991).
This "utopia" calls for a vast research programme requiring many different types of research (axiomatic analyses of models, experimental studies of models, clinical analyses of decision/evaluation processes, conceptual reflections on the notions of "rationality" and "performance", production of new pieces of software, etc.). The authors are preparing another book that will hopefully contribute to this research programme. It will cover the main topics that we believe to be useful in order to successfully implement formal decision/evaluation models in real-world processes:
• structuration methods and concepts,
• preference modelling tools,
• uncertainty and imprecision modelling tools,
• aggregation models,
• tools for deriving robust recommendations.


If we managed to convince you that formal decision and evaluation models are an important topic and that the hope of discovering “ideal” methods is somewhat chimerical, it is not unlikely that you will ﬁnd the next book valuable.

Bibliography

[1] Abbas, M., Pirlot, M. and Vincke, Ph. (1996). Preference structures and cocomparability graphs, Journal of Multicriteria Decision Analysis 5: 81–98.
[2] Abdellaoui, M. and Munier, B. (1994). The 'closing in' method: An experimental tool to investigate individual choice patterns under risk, in B. Munier and M.J. Machina (eds), Models and experiments in risk and rationality, Kluwer, Dordrecht, pp. 141–155.
[3] Adler, H.A. (1987). Economic appraisal of transport projects: A manual with case studies, Johns Hopkins University Press for the World Bank, Baltimore.
[4] Airaisian, P.W. (1991). Classroom assessment, McGraw-Hill, New York.
[5] Allais, M. and Hagen, O. (eds) (1979). Expected utility hypotheses and the Allais paradox, D. Reidel, Dordrecht.
[6] Allais, M. (1953). Le comportement de l'homme rationnel devant le risque : Critique des postulats et axiomes de l'école américaine, Econometrica 21: 503–546.
[7] Armstrong, W.E. (1939). The determinateness of the utility function, The Economic Journal 49: 453–467.
[8] Arrow, K.J. and Raynaud, H. (1986). Social choice and multicriterion decision-making, MIT Press, Cambridge.
[9] Arrow, K.J. (1963). Social choice and individual values, 2nd edn, Wiley, New York.
[10] Atkinson, A.B. (1970). On the measurement of inequality, Journal of Economic Theory 2: 244–263.
[11] Baldwin, J.F. (1979). A new approach to approximate reasoning using a fuzzy logic, Fuzzy Sets and Systems 2: 309–325.
[12] Balinski, M.L. and Young, H.P. (1982). Fair representation, Yale University Press, New Haven.
[13] Bana e Costa, C.A., Ensslin, L., Corrêa, E.C. and Vansnick, J.-C. (1999). Decision support systems in action: Integrated application in a multicriteria decision aid process, European Journal of Operational Research 113: 315–335.
[14] Barbera, S., Hammond, P. and Seidl, C. (eds) (1998). Handbook of utility theory, Vol. 1: Principles, Kluwer, Dordrecht.


[15] Bartels, R.H., Beatty, J.C. and Barsky, B.A. (1987). An introduction to splines for use in computer graphics and geometric modeling, Morgan Kaufmann, Los Altos.
[16] Barzilai, J., Cook, W.D. and Golany, B. (1987). Consistent weights for judgements matrices of the relative importance of alternatives, Operations Research Letters 6: 131–134.
[17] Bazerman, M.H. (1990). Judgment in managerial decision making, Wiley, New York.
[18] Bell, D., Raiffa, H. and Tversky, A. (eds) (1988). Decision making: Descriptive, normative and prescriptive interactions, Cambridge University Press, Cambridge.
[19] Belton, V., Ackermann, F. and Shepherd, I. (1997). Integrated support from problem structuring through alternative evaluation using COPE and V•I•S•A, Journal of Multi-Criteria Decision Analysis 6: 115–130.
[20] Belton, V. and Gear, A.E. (1983). On a shortcoming of Saaty's analytic hierarchies, Omega 11: 228–230.
[21] Belton, V. (1986). A comparison of the analytic hierarchy process and a simple multi-attribute value function, European Journal of Operational Research 26: 7–21.
[22] Béreau, M. and Dubuisson, B. (1991). A fuzzy extended k-nearest neighbor rule, Fuzzy Sets and Systems 44: 17–32.
[23] Bernoulli, D. (1954). Specimen theoriæ novæ de mensura sortis, Commentarii Academiæ Scientiarum Imperialis Petropolitanæ (5, 175–192, 1738), Econometrica 22: 23–36. Translated by L. Sommer.
[24] Bezdek, J., Chuah, S.K. and Leep, D. (1986). Generalised k-nearest neighbor rules, Fuzzy Sets and Systems 18: 237–256.
[25] Blin, M.-J. and Tsoukiàs, A. (1998). Multicriteria methodology contribution to the software quality evaluation, Technical report, Cahier du LAMSADE No 155, Université Paris-Dauphine, Paris.
[26] Boardman, A. (1996). Cost benefit analysis: Concepts and practices, Prentice-Hall, New York.
[27] Boiteux, M. (1994). Transports : Pour un meilleur choix des investissements, La Documentation Française, Paris.
[28] Bonboir, A. (1972). La docimologie, PUF, Paris.
[29] Borda, J.-Ch. (1781). Mémoire sur les élections au scrutin, Comptes Rendus de l'Académie des Sciences. Translated by Alfred de Grazia as "Mathematical derivation of an election system", Isis, Vol. 44, pp. 42–51.
[30] Bouchon, B. (1995). La logique floue et ses applications, Addison Wesley, New York.


[31] Bouchon-Meunier, B. and Marsala, C. (1999). Learning fuzzy decision rules, in J. Bezdek, D. Dubois and H. Prade (eds), Fuzzy sets in approximate reasoning and information systems, Vol. 3 of Handbook of Fuzzy Sets, Kluwer, Dordrecht, chapter 4, pp. 279–304.
[32] Bouyssou, D., Perny, P., Pirlot, M., Tsoukiàs, A. and Vincke, Ph. (1993). A manifesto for the new MCDM era, Journal of Multi-Criteria Decision Analysis 2: 125–127.
[33] Bouyssou, D. and Perny, P. (1992). Ranking methods for valued preference relations: A characterization of a method based on entering and leaving flows, European Journal of Operational Research 61: 186–194.
[34] Bouyssou, D. and Pirlot, M. (1997). Choosing and ranking on the basis of fuzzy preference relations with the 'Min in Favor', in G. Fandel and T. Gal (eds), Multiple criteria decision making – Proceedings of the twelfth international conference, Hagen, Germany, Springer Verlag, Berlin, pp. 115–127.
[35] Bouyssou, D. and Vansnick, J.-C. (1986). Noncompensatory and generalized noncompensatory preference structures, Theory and Decision 21: 251–266.
[36] Bouyssou, D. (1984). Decision-aid and expected utility theory: A critical survey, in O. Hagen and F. Wenstøp (eds), Progress in utility and risk theory, Kluwer, Dordrecht, pp. 181–216.
[37] Bouyssou, D. (1986). Some remarks on the notion of compensation in MCDM, European Journal of Operational Research 26: 150–160.
[38] Bouyssou, D. (1990). Building criteria: A prerequisite for MCDA, in C.A. Bana e Costa (ed.), Readings in multiple criteria decision aid, Springer Verlag, Berlin, pp. 58–80.
[39] Bouyssou, D. (1992). On some properties of outranking relations based on a concordance-discordance principle, in A. Goicoechea, L. Duckstein and S. Zionts (eds), Multiple criteria decision making, Springer-Verlag, Berlin, pp. 93–106.
[40] Bouyssou, D. (1996). Outranking relations: Do they have special properties?, Journal of Multi-Criteria Decision Analysis 5: 99–111.
[41] Brams, S.J. and Fishburn, P.C. (1982). Approval voting, Birkhäuser, Basel.
[42] Brans, J.-P. and Vincke, Ph. (1985). A preference ranking organization method, Management Science 31: 647–656.
[43] Brekke, K.A. (1997). The numéraire matters in cost-benefit analysis, Journal of Public Economics 64: 117–123.
[44] Brent, R.J. (1984). Use of distributional weights in cost-benefit analysis: A survey of schools, Public Finance Quarterly 12: 213–230.
[45] Brent, R.J. (1996). Applied cost-benefit analysis, Elgar, Aldershot, Hants.
[46] Broome, J. (1985). The economic value of life, Economica 52: 281–294.

[47] Carbone, E. and Hey, J.D. (1995). A comparison of the estimates of expected utility and non-expected utility preference functionals, Geneva Papers on Risk and Insurance Theory 20: 111–133.
[48] Cardinet, J. (1986). Évaluation scolaire et mesure, De Boeck, Brussels.
[49] Chatel, E. (1994). Qu'est-ce qu'une note : recherche sur la pluralité des modes d'éducation et d'évaluation, Les Dossiers d'Éducation et Formations 47: 183–203.
[50] Checkland, P. (1981). Systems thinking, systems practice, Wiley, New York.
[51] Condorcet, marquis de (1785). Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix, Imprimerie Royale, Paris.
[52] Cover, T.M. and Hart, P.E. (1967). Nearest neighbor pattern classification, IEEE Transactions on Information Theory IT-13 1: 21–27.
[53] Cross, L.H. (1995). Grading students, ERIC/AE Digest, Technical Report Series EDO-TM-95-5.
[54] Daellenbach, H.G. (1994). Systems and decision making: A management science approach, Wiley, New York.
[55] Dasgupta, P., Marglin, S. and Sen, A. (1972). Guidelines for project evaluation, UNIDO, New York.
[56] Dasgupta, A.K. and Pearce, D.W. (1972). Cost-benefit analysis: Theory and practice, Macmillan, Basingstoke.
[57] Davis, B.G. (1993). Tools for teaching, Jossey-Bass, San Francisco.
[58] de Jongh, A. Théorie du mesurage, agrégation des critères et application au décathlon, Master's thesis, Technical Report 129/J310, SMG, Université Libre de Bruxelles, Brussels.
[59] Dekel, E. (1986). An axiomatic characterization of preference under uncertainty: Weakening the independence axiom, Journal of Economic Theory 40: 304–318.
[60] Desrosières, A. (1995). Refléter ou instituer : l'invention des indicateurs statistiques, INSEE, Paris.
[61] de Ketele, J.-M. (1982). La docimologie, Cabay, Louvain-La-Neuve.
[62] de Landsheere, G. (1980). Évaluation continue et examens. Précis de docimologie, Labor-Nathan, Brussels.
[63] Dinwiddy, C. and Teal, F. (1996). Principles of cost-benefit analysis for developing countries, Cambridge University Press, Cambridge.
[64] Dorfman, R. (1996). Why benefit-cost analysis is widely disregarded and what to do about it?, Interfaces 26: 1–6.
[65] Drèze, J. and Stern, N. (1987). The theory of cost-benefit analysis, in A. Auerbach and M. Feldstein (eds), Handbook of public economics, Elsevier, Amsterdam, pp. 909–989.

C. R. (1961). (1994). Remarks on the analytic hierarchy process. Proceedings of the 14t h conference on uncertainty in artiﬁcial intelligence. and Frisbie. Fuzzy logic. (1997). Morgan Kaufmann. P. Wiley.B. P. in D. and Perny.L. Proceedings of the 15t h conference on uncertainty in artiﬁcial intelligence. [71] Dupuit. and Ughetto. Los Altos. P. [74] Ellsberg. Fuzzy algorithms for control. Verbruggen. pp. D. H.BIBLIOGRAPHY 251 [66] Dubois. Morgan Kaufmann. Equity considerations in public risks evaluation.. New-York. 17–58. Prentice-Hall. D. Quarterly Journal of Economics 75: 643–669. Bid management of software acquisition for cartography applications. Qualitative decision models under uncertainty without the commensurability hypothesis. Aosta. M. Decision-making under ordinal preferences and uncertainty. Los Altos. R. Qualitative decision theory with Sugeno integrals. and Sarin. in H.C. (1990). H. Prade. ambiguity and the Savage axioms. SIAM Journal on Applied Mathematics 33: 469–489. in K. Prentice-Hall.C. e [72] Dyer. and Prade. pp. pp. [73] Ebel. A. Fuzzy Sets and Systems 24: 279–300.J. [80] Fishburn. Dispersive equity and social risk. [75] Fargier. New-York. R.C. (1988). and Sabbadin. G. R. Maﬃoli. H.B. [77] Fiammengo. Zimmermann and R. De la mesure de l’utilit´ des travaux publics. [79] Fishburn. Fairness and social risk I: Unaggregated analyses. [70] Dubois. (1991). 157–164. D.K. Operations Research 37: 229–239. (1999).. Proceedings of the 13th conference on uncertainty in artiﬁcial intelligence. [83] Fishburn. and Prade. pp. (1987). H. Los Altos. (1999).S. and Prade.D. Plenum Press. J. (1989). Dordrecht. [69] Dubois.. Presented at AIRO ’97 Conference. (1977)..C. Utility theory for decision-making. (1997). New-York. D. H. Panarotto. Noncompensatory preferences. I. Morgan Kaufmann. Management Science 37: 751–769. and Straﬃn. Laskey and H. Possibility theory. and Turino. P. New York. Kluwer. Contemporary Political Studies. Condorcet social choice functions.P. D. 
(1844). Comparing electoral systems. P. D.A. P.C. . Essentials of educational measurement.. Iob. L.M. (1976). J. Geiger and P. [78] Fishburn. D. (1970). Babuska (eds). [68] Dubois. Management Science 40: 1174–1188.K. and Sarin. [76] Farrell. Management Science 36: 249–258. (1997). control engineering and artiﬁcial intelligence. P. 188–195. H. (1991). 121–128. [67] Dubois. P. Fargier. Risk. [81] Fishburn. Buosi. D. H. Prade (eds). Prade. [82] Fishburn. D.. (1998).. Annales des e Ponts et Chauss´es (8). Synthese 33: 393–403. H.. The mean value of a fuzzy number. P. Shenoy (eds).

D. Maxmin expected utility with a nonunique prior.C. [92] Folland. Goodman. Dordrecht. Les cahiers du Club CRIN . Technical report.C. (1973). (1991). Roubens (eds). [91] Fodor. (1993). . Paris. Zionts (ed. [93] French. (1981). e e [96] Gafni.C.V.C. P. S. (1993). M. M. and Stano. Nonconventional preference relations in decision making. Herm`s.). I. (1997). A. I. Nontransitive preferences in decision theory. Baltimore. Evaluation subjective. P.C. (1988a). W. (1951). Journal of Health Economics 10: 329–342. S. [87] Fishburn. (1984).. [89] Fishburn. Dordrecht. Kacprzyk and M. [99] Gilboa. D. and Perny. Journal of Risk and Uncertainty 4: 113–134. [94] French. New-York. (1982). USAF Scholl of aviation and medicine. [100] Gilboa. A. Kluwer. The economics of health and health care. Multicriteria problem solving. Measurement theory and examinations. J. S. F. Decision theory – An introduction to the mathematics of rationality. Reidel. El´ments de logique ﬂoue. London. (1978). P. (1994). Manipulation of voting schemes: A general result. 197. pp. in M. 4. Prentice-Hall. Theory and Decision 15: 161– [98] Gibbard. E. D. Berlin.L. British Journal of Mathematical and Statistical Psychology 34: 38–49.Association ECRIN. and Roubens. [85] Fishburn. M. and Hodges. Condorcet’s paradox. Johns Hopkins University Press. A. P. Operations Research 32: 901–908. (1983). Guely. (1997). Econometrica 41: 587–601. 469–489. [88] Fishburn.C. Springer Verlag. A survey of multiattribute/multicriteria evaluation theories. (1988b). Equity considerations in utility-based measures of health outcomes in economic appraisals: An adjustment algorithm.252 BIBLIOGRAPHY [84] Fishburn. (1989). Fuzzy preference modelling and multicriteria decision support. L. ´ [101] Grabisch. non-parametric discrimination: consistency properties.. Equity axioms for public risks. (1997). Ellis Horwood. Berlin. Normative theories of decision making under risk and under uncertainty. [97] Gehrlein.C. S. and Schmeidler. and Birch. 
Paris. P. 181–224. [86] Fishburn. (1997). Journal of Economic Theory 59: 33–49. Randolph Field. Nonlinear preference and utility theory. Springer Verlag. [90] Fix.C. Journal of Mathematical Economics 18: 141–153. P. J. [95] Gacogne. The foundations of expected utility. Discriminatory analysis. pp. P. Updating ambigous beliefs. in S. and Schmeidler.

(1994). P. and Schoemaker. Mathematics of Operations Research 20: 381–399. P. (1995). Theory and Decision 25: 25–78. The application of fuzzy integrals to multicriteria decision making.V.J.M. Amsterdam. C. Interfaces 22: 47–60. C. Corkindale (eds). 9–15.C. [117] Horn. [109] Harvey. and V´ri.M. Cambridge. (1993). [103] Hammond. C. M´thodes multicrit res non-compensatoires e pour la classiﬁcation ﬂoue d’objets. [116] Holland. (1982). and Orme. Adelshot Hants. and Vargas. Judgement and choice: The psychology of decision. Oxford. Management Science 28: 936–953. O. Relationships between decision making process and study process in OR interventions. J. [108] Harvey. Wiley. (1988). [105] Harker. (1993). A. (1993). [110] Henriet. P.C. H. (1994).T. Washington D. (1995). C. A slow-discounting model for energy conservation. Technical report. 21–38. J. [107] Harvey. Analysis and aiding a decision processes.G. Svenson. Standard for a software quality metrics methodology.C. P. [114] Hey. Proceedings of LFA’96. [120] International Atomic Energy Agency (1993). [119] IEEE 92 (1992). The Institute of Electrical and Electronics Engineers. Investigating generalizations of expected utility theory using experimental data.G. . L.. C. and Spash. D. Universit´ Paris Dauphine. pp. Cambridge University Press. Elgar. Sources of bias in assessment procedures for utility functions. Cost-beneﬁt analysis and the environment. [104] Hanley. L. The assumptions of cost-beneﬁt analysis: A philosopher’s view. [113] Heurgon.BIBLIOGRAPHY 253 [102] Grabisch. R. New York. e M´moire du dea 103. E. (1996). North-Holland. European Journal of Operational Research 89: 445–456. (1996). Econometrica 62: 1251–1289. CAB International. (1987). C. M.F. (1982). Econometrica 62: 1251–1289.C. in K. Willis and J. (1987). L.T.H. P. A. The reasonableness of non-constant discounting.M. Cost-beneﬁt aspects of food irradiation processing. Probl mes d’aﬀectation et m´thodes de classiﬁcation. 
European Journal of Operational Research 10: 230–236. e e [112] Hershey.D. (1994).L. and Perny. Journal of Public Economics 53: 31–51. [111] Henriet. Kunreuther. The utility of generalized expected utility theories. The theory of ratio scale estimation: Saaty’s analytic hierarchy process. pp.. Statistical indicators. [115] Hogarth. (1995). [106] Harless. Environmental valuation: New perspectives. Management Science 33: 1383–1403. Proportional discounting of future costs and beneﬁts. Consequentialist foundations for expected utility. (1992).J. R. and Camerer. Bernan Associates. N. [118] Humphreys.

J. (1990). [128] Johannesson. ISO. Berlin. Cambridge University Press. [137] Keeney. P. and Schkade. (1995a). Bias in utility assesments: Further evidence and explanations. quality characteristics and guidelines for their use. J. [127] Jaﬀray.A. (1988). Harvard University Press. Springer Verlag. D. [129] Johannesson. (1989a). Universit´ Paris-Dauphine. Smart choices: A guide to making better decisions. Health Policy 33: 59–66. (1999).A. Interactive assessment of preferences using holise tic judgments. A note on the depreciation of the societal perspective in economic evaluation in health care. R. [133] Kahneman. H. E. Choice under risk and the security factor: An axiomatic model. P. The relationship between cost-eﬀectiveness analysis and cost-beneﬁt analysis. [124] Jacquet-Lagr`ze. Slovic. Paris. The PREFCALC system.254 BIBLIOGRAPHY [121] ISO/IEC 9126 (1991). Decisions with multiple objectives: Preferences and value tradeoﬀs.-Y. and Raiﬀa. (1996).O. Some experimental ﬁndings on decision making under risk and their implications. (1982). e [122] Jacquet-Lagr`ze. Moscarola. Operations Research Letters 8: 107–112.B. Boston. e [123] Jacquet-Lagr`ze. E. Hammond. (1993). B. and Hirsch. Cambridge University Press. J.-Y. Econometrica 47: 263–291. Cambridge. E. G. New York. [125] Jaﬀray.. Management Science 35: 406–424. Judgement under uncertainty – Heuristics and biases. Utility theory for belief functions. European Journal of Operational Research 10: 151–164. Discounting of life-saving and other nonmonetary eﬀects. M. (1989b). Prospect theory: An analysis of decision under risk. (1989). Cahier du LAMSADE e No 13. [131] Johansson.. J. Technical report. [132] Johnson. European Journal of Operational Research 38: 301–306. [136] Keeney. S. M. (1995b). and Tversky. Bana e Costa (ed. D. Management Science 29: 300–306. J. J. Readings in multiple criteria decision aid. (1983). . Dordrecht.-Y. in C. [130] Johannesson. Kluwer. [126] Jaﬀray. 
Information technology – Software product evaluation.S. Cambridge. A. Wiley. Descripe tion d’un processus de d´cision. Theory and Decision 24: 169–200. (1976).J. H. A. M. [134] Kahneman. and Tversky. and Raiﬀa. D. Gen`ve.). (1978). pp.. E.L. [135] Keeler. R. Theory and methods of economic evaluation of health care. E. Technical report. Cost-beneﬁt analysis of environmental change. and Cretin. (1979). and Siskos.L.. 335–350. Roy. Social Science and Medicine 41: 483–489. (1981). Assessing a set of additive utility e functions for multicriteria decision making: The UTA method.

Regret theory: An alternative theory of rational choice under uncertainty. Gray.A.E. (1990).. J. Paris. Sage Publications. 3rd edn.D. [139] Kelly. J. [151] Loomes.D.H. Corkindale (eds). (1992). New York. D. [153] Loomis. Johns Hopkins University Press. Champ. [144] Krutilla. (1995). Basic books. [150] Little. K. Theory and Decision 25: 1–23. (1986). E. J.A. P. F. D. P. How to measure performance and use tests. J. Academic Press. Oxford. Luce. [142] Kohli. (1993). Willis and J. Environmental valuation: New perspectives. Social Choice and Welfare 8: 97–169. and Mirlees. L. T. IEEE Transactions on Systems Man and Cybernetics. and Givens. Diﬀerent experimental procedures for obtaining valuations of risky actions: Implications for utility theory. Holt. Multiple purpose river development.BIBLIOGRAPHY 255 [138] Keller. (1975). Charles C. and Tversky. Brown. (1988). and Mirlees. and Juarez. [149] Little. P. Thomas. [147] Lesourne.S. (1987).D. R. O. Economic Journal 92: 805–824. New York.M. in K. Foundations of measurement. [146] Laslett. Suppes. I. J.H.V. .N. B. [152] Loomes.N. [143] Krantz. O. Academic Press. Foundations of measurement. Grading and marking in American schools: Two centuries of debate. J.C. New York. Springﬁeld. G. G. 15: 580– 585. and Eckstein. M. and Sugden. [145] Laska.M. (1958). (1985). Cost-beneﬁt analysis and project appraisal in developing countries. Project appraisal and planning for developing countries. T.. Morris.. C. 1: Additive and polynomial representations. G. Oxford University Press for the Asian Development Bank..L. pp. and Weiss. (1974). Foundations of behavioral research. CAB International. J. 3: Representation. Manual of industrial project analysis in developing countries. A. The assumptions of cost-beneﬁt analysis. New York.G. 5–20. R. and Lucero. [141] Kirkpatrick. NorthHolland. Amsterdam. Oxford. Vol. A. Paired comparisons estimates of willingness to accept and contingent valuation estimates of willingness to pay. R. and Fitz-Gibbon. 
J.. Peterson. I. Economic analysis of investment projects: A practical approach... (1982). J. Baltimore.D. [148] Lindheim.T..D. Vol. Thousand Oaks. C. (1968). Suppes. Journal of Economic Behavior and Organisation 35: 501–515. A fuzzy k−nearest neighbor algorithm. R. Krantz. (1991). Elgar. Rinehart and Winston. Cost-beneﬁt analysis and economic theory. Adelshot Hants. J. axiomatisation and invariance. (1971).A.. (1996). [154] Luce.T. Social choice bibliography. and Tversky. [140] Kerlinger. (1998).

M. [158] Machina. Th. 27–145. PUF. Thomas and D. Journal of Economic Literature 27: 1622– 1688. A set of independent necessary and suﬃcient conditions for simple majority decisions. pp. [163] May. in S. L. M. Utility theory: Axioms versus paradoxes. Expected utility without the independence axiom. D. Multiobjective decision making. in B. London. (1982). [170] Mintzberg.. Games and Decisions. [165] McCord. (1979). Reidel. and Lockwood. Cambridge University Press.E. (1984). and Th´oret. [171] Mishan. A. Empirical demonstration that expected utility decision analysis is not operational. Stigum and F. P. Environment and Planning B 10: 47–62. Cambridge. Hartley.J. (1982). Dordrecht. [166] McCord.256 BIBLIOGRAPHY [155] Luce. London.O. H. E. E. [159] Machina.F. The representation of urban planning-processes: An exploratory review. K. [168] McLean. L’´valuation des ´l`ves. Paris. Academic Press. (1956). D. Administrative Science Quarterly 21: 246– 272. R. [160] Mamdani. Hagen (eds). Cost-beneﬁt analysis. Econometrica 20: 680–684. [157] Lysne. (1989). Academic Press. White (eds). (1983). Wiley. R. in M. Econometrica 24: 178–191. (1957). K. Reidel. M. (1990). New York. Grading of student’s attainement: Purposes and functions. Econometrica 50: 277–323. . (1996). D. J. R. (1996). and Larsson. 279– 305. H. New York. Semiorders and a theory of utility discrimination. The structure of une structured decision processes. A. 181–199. Valued relations aggregation with the Borda method. [162] Masser.C.E. [167] McCrimmon. Fundamental deﬁciency of expected utility analysis. I. (1981). Sage Publications. and Raiﬀa. and de Neufville.D. Enquˆte sur le jugement professoe ee e ral. Wenstøp (eds). (1982). Foundations of utility and risk theory. (1983).J. Expected utility hypotheses and the Allais paradox. (1996). S. pp. [169] Merle. Allais and O. R. French.R. Rationality and dynamic choice: Foundational explorations. pp. [156] Luce. Gaines fuzzy reasonning and its applications. 
Scandinavian Journal of Educational Research 28: 149–165. Thousand Oaks. Journal of Multi-Criteria Decision Analysis 5: 127–132. (1952). [161] Marchant. M. E. Allen and Unwin. Why and how should we assess students? The competing measures of student performance.. and de Neufville. H. (1976). Raisinghani. R. [164] McClennen. R.J. Dynamic consistency and non-expected utility models of choice under uncertainty.D.

e a e e e e PhD thesis. (1999). Action evaluation and action structuring – Diﬀerent decision aid situations reviewed through two actual cases. (1997). in R.C. in D. New models of decisions under uncertainty. Fuzzy sets in approximate reasoning and information systems. Poems in translation: Sappho to Val´ry. Vol. Dordrecht. Reidel. Modelling and control. Paris.M. M. Rethinking the process of operational research and systems analysis. Paris. T. K. European Journal of Operational Research 70: 67–82.-P.A. Dordrecht. Tomlinson and I. H. R. Kiss (eds). in C. How do you know they know what they know? A handbook of helps for grading and evaluating student progress. Arkansas. [175] Mousseau. (1998). Arbitrage. and Caverini. Pergamon Press. D. LAMSADE. A. [179] Nau. [184] Nurmi. (1993). and Tsouki`s. K. e PUF. pp. La psychologie de l’´valuation scolaire. Journal of Risk and Uncertainty 10: 71–91. Types of organizational decision processes. Administrative Science Quarterly 19: 414–450. e [176] Munier. J.BIBLIOGRAPHY 257 [172] Moom. Westminster. Some Norwegian politician’s use of cost-beneﬁt analysis. B. J. A. (1993). The University e of Arkansas Press. (1978). (1997). IUSWARE: A formal methodology for a software evaluation and selection. 305–333.F. (1984). R. [178] Nauck. Grove Publishing. [183] Noizet. Organizational decision processes and ORASA intervention. (1998). Theory and Decision 31: 199–240. rationality and equilibrium. (1991). P. chapter 5. R. Bezdek and H. [173] Morisio. Prade (eds). Kluwer. Sage Publications. (1984). (1987). Thousand Oaks. Public Choice 95: 381–401. [182] Nims. and Sugeno.F. pp.T. Bana . J. and Kruse. Dordrecht. (1996). European Journal of Operational Research 38: 307–317. [180] Nau. J. Neuro-fuzzy methods in fuzzy rule generation. (1990). D.F. (1995). [177] Nas. Universit´ Paris-Dauphine. Kluwer. 169– 186. Probl`mes li´s ` l’´valuation de l’importance en aide e e a e multicrit`re ` la d´cision : R´ﬂexions th´oriques et exp´rimentations. 
Cost-beneﬁt analysis: Theory and application. 3 of Handbook of Fuzzy Sets. V.F. (1989). and McCardle. [181] Nguyen. T. A. [188] Ostanello. Coherent decision analysis with inseparable probabilities and utilities. and Tsouki`s. [187] Ostanello. An explicative model of ‘public’ ina terorganizational interactions. G. [186] Nyborg. (1990). H. IEE Proceedings on Software Engineering 144: 162–174. [174] Moscarola. M. Oxford. D. [185] Nutt.F. Comparing voting systems. A.

Use of artiﬁcial intelligence multicriteria decision making. F.. Examens et docimologie. A common framework for describing some outranking procedures. e [196] Perrot. and Roubens. Ann Arbor. 61–74. (1998). Universit´ Paris-Dauphine.C. in J. Th. J. (1981). (1997). G. [195] Perny. W. (1997). Cambridge. PUF. e [192] Perny. Springer Verlag. Springer Verlag.43. in T. Validation aspects of a prototype solution implementation to solve a complex MC problem. [197] Perrot. representations. operations research n and statistics.. Prentice-Hall. Gal. [201] Popham. Journal of Multi-Criteria Decision Analysis 6: 86–93. pp. M. [198] Pi´ron.-Ch. e [199] Pirlot. and Zucker.J. 3–30. [203] Quiggin. (1999). PhD thesis. P. M. (1982). Cambridge University Press. E. [190] Ott. (1999). Sur le non-respect de l’axiome d’ind´pendance dans les e m´thodes de type ELECTRE. (1978). 279– 285. Trystram..1–15. New-York. J. Ecole Nationale Sup´rieure des Industries Agricoles e Alimentaires. Dordrecht. Ann Arbor Science. 15. Fuzzy preference modelling. Stewart and Th. Maˆ ıtrise des proc´d´s alimentaires et th´orie des enseme e e bles ﬂous. and Vincke. P. . (1963). algorithms. pp. Document du LAMSADE No 113. (1994). Journal of Economic Behaviour and Organization 3: 323–343. theory. Environmental indices: Theory and practice. Berlin.).J. A real world MCDA application: a Evaluating software. N. Collaborative ﬁltering methods based on fuzzy preference relations. Paris.D. and Tsouki`s. Cahiers du CERO 34: 211–232. Cl´ ımaco (ed. [202] Poulton. [194] Perny. comparison between bayesian and fuzzy approaches. Dordrecht. P. H. Paris. N.258 BIBLIOGRAPHY e Costa (ed. Behavioral decision theory: A new approach. pp. Journal of Food Engineering 29: 301–315. Hanne (eds). [200] Pirlot. Readings in multiple criteria decision aid. (1997). Dordrecht.). A. Le Guennec. (1992). Kluwer. Modern educational measurement. J. Ph. [193] Perny. Fuzzy sets in decision analysis. 36–57. in R. A theory of anticipated utility. P. 
[189] Ostanello. Multi-criteria analysis. Advances in MCDM models. Semiorders. pp. M. pp. (1996). D. A.R. (1997). W. and Pomerol.). and applications. Kluwer. [191] Paschetta. applications. and Guely. Sensor fusion for real time quality evaluation of biscuit during baking. Berlin. Slowi´ski (ed. Technical report. Kluwer. Properties. Proceedings of EUROFUSE-SIC’99. E. (1999).

Economica. [211] Roy. (1974). Springer Verlag. Barrett.J. (1993). 1985. (1991). C. [209] Roy.. Kluwer. (1984). Fuzzy Sets and Systems 49: 9–13. R. Conﬁdent decision making. Universit´ Paris-Dauphine. Decision analysis – Introductory lectures on choices under uncertainty. (1989). Investigaci´n Operativa 2: 95–110. [220] Salles. New York. J.. (1996). B. [217] Saaty. D. The analytic hierarchy process. [206] Riley. and Bouyssou.. B. [207] Rosenhead. Piatkus.S. M. and Pattanaik.BIBLIOGRAPHY 259 [204] Quiggin. ELECTRE IS : Aspects m´thodologiques e et guide d’utilisation. Science de la d´cision ou science de l’aide ` la d´cision ?. (1970). Grade inﬂation and course choice. Dordrecht. Technical report.C.-M.L. Economica. (1992). Decision-aid: an elementary introduction with emphasis on multiple criteria.C. P. . Kluwer.J.K. B. Document du LAMSADE No 30. Dordrecht.F. Generalized expected utility theory – The rank-dependent model. (1994). (1993). (1990).R. Aide multicrit`re ` la d´cision : M´thodes e a e e et cas. and Worthington. Eliminating grades in schools: An allegory for change. (1980). (1993). M. and Vincke. Preference modelling. Paris.J. European Journal of Operational Research 66: 184–204. Ph. London. Cahier du LAMSADE No 97. McGraw-Hill. (1989). Grades and grading practices: The results of the 1992 AACRAO survey. Paris. e [212] Roy. (1991). Washington D. e a e Technical report. C. Singer. and Wakeman. B. Checca. [218] Sabot. (1994). D. Berlin. J.. J. L. Rationality and aggregation of preferences in an ordinally fuzzy framework. (1985). [219] Sager. T. [215] Roy. [205] Raiﬀa. R. T. A S Q Quality Press. Journal of Economic Perspectives 5: 159–170. Paris. Rational analysis of a problematic world.H. Revue d’Economie Politique 1: 1–44. and Skalka. New York. H. Decision science or decision-aid science?. New York.E. Addison-Wesley. and Bouyssou. [208] Roubens. and Schoemaker. P. H. D. 
Crit`res multiples et mod´lisation des pr´f´rences : l’apport e e ee ´ des relations de surclassement. B. e [216] Russo. Milwaukee. [214] Roy. Multicriteria methodology for decision aiding. B. e Paris. J. o [210] Roy. Wiley. Universit´ Paris-Dauphine. B. [213] Roy. American Association of Collegiate Registrars and Admissions Oﬃcers. Original version in French “M´thodologie multicrit`re d’aide ` la e e a d´cision”.

North Holland. Oxford. Multiple criteria optimisation: Theory. (1977). [230] Sinn. Vol. Grading student writing: An annotated bibliography. H. R. Wiley. Social choice theory. B. B. [223] Schmeidler.. Faculty behavior. [237] Sugden. Fuzzy automata and decision processes. [222] Savage. (1986). London. A. pp. grades and student evaluations.) (1998). Journal of Economic Education 25: 5–15. [234] Stamelos. I. (1986). Schieber. Handbook of mathematical economics.K. 2nd revised edn. A.W. R. R. 1073–1181. Econometrica 57: 571–587. North-Holland. and Wiliams. Strategy proofness and Arrow’s conditions: Existence and correspondence theorems for voting procedures and social welfare functions. A. and Gollier.E. Journal of Economic Theory 10: 187–217. Fuzzy sets in decision analysis. Theory and Decision 43: 241–51. Dordrecht. (1997). Econometrica 65: 745– 779. Cahier du LAMSADE No 156. Intriligator (eds). in M.M. A. Oxford University Press. (1983). Gupta.N. (1954). Maximization and the act of choice. [233] Speck. R. and Tsouki`s. Journal of Economic Theory 37: 55–75. Myers. pp. S. M. [225] Schoﬁeld. (1993). New York.A. J. (ed. Economic decisions under uncertainty. D. (1983). New York. (1985). Kluwer. A behavioural model of rational choice in Models of man. The foundations of statistics. operations research n and statistics. (1997). Wiley. Eeckoudt. (1957). (1975). [238] Sugeno. Economics of radiation protection: Equity considerations.W. Paris. M. [227] Sen. S. Arrow and M. North-Holland. R. [236] Stratton. Fuzzy measures and fuzzy integrals: a survey.D. and Gigliotti. 89–102. (1994).260 BIBLIOGRAPHY [221] Satterthwaite. and application. 1972. Wiley. L.. The principles of practical cost-beneﬁt analysis. Unwin and Hyman. [231] Slowi´ski. G. [224] Schneider. Amsterdam. Universit´ Parise Dauphine. Th. Hedonic prices and cost-beneﬁt analysis. L. Amsterdam. and King. computation. 241–260.C. Technical report. Gains (eds).W. [232] Sopher. A test of generalized expected utility theory. 
[226] Scotchmer.J. Westport. . New York.R. C. 3.K. Software evaluation problem situaa tions. Amsterdam. Greenwood Publishing Group.H. (1998). [235] Steuer. H. (1989). pp. C. in K. (1989).. G. Theory and Decision 35: 75–106. Subjective probability and expected utility without additivity. (1998).A. Cost-beneﬁt analysis in urban and regional planning. [229] Simon. [228] Sen. Saridis and B.

A characterization of PQI interval a orders. Intransitivity of preferences. Springer Verlag. New York. Albert and Charles Boni. Ph. [253] Vassiloglou. A. (1986). (1991). A new axiomatic foundation of partial a comparability. Human Development Report 1997. R. Proceedings OSDA ’98. [241] Syndicat des Transports Parisiens (1998). (1969). and French. Processing Automation 4: 504–512. Syndicat des Transports Parisiens. Information Sciences 36: 59–83. G. a [246] Trystram. and Vincke. Application of fuzzy logic for the control of food processes.. Technical e report. M´thodes d’´valuation des projets e e d’infrastructures de transports collectifs en r´gion Ile-de-France. a e e Ricerca Operativa (40): 7–44. M. Consequences. [245] Toth. [255] Vincke. opportunities and procedures. P. Russell Sage Foundation. Arrow’s theorem and examination assessment. in J. [254] Vassiloglou.L. Non conventional preference relations in decision making.nl/locate/endm). Theory and Decision 39: 79–114. Ph. Brussels. (1999). Kacprzyk and M. An anthology of world poetry. De Borda et Condorcet ` l’agr´gation multicrit`re. N. pp. Urbana. [242] Tchudi. (1984). [249] Tversky. ´ [243] Teghem. (http://www. Programmation lin´aire. (1982). (1999). J. Roubens (eds). I preference structures. 72–81. [248] Tsouki`s. Some multi-attribute models in examination assessment. and Guely. Editions de l’Universit´ de e e ´ Bruxelles-Editions Ellipses. (1988). (1996). An introductory survey on fuzzy control. Oxford University Press. K. [250] United Nations Development Programme (1997). Ph. F. . (1928). M. [247] Tsouki`s.H. J. Cost-beneﬁt analysis of climate change: The broader perspectives. (1997). Quasi rational economics. (1997). [252] Vansnick. A. Electronic Notes on Discrete Mathematics. Berlin. and Vincke. [244] Thaler. Paris. British Journal of Mathematical and Statistical Psychology 37: 216– 233. [251] van Doren. Q. (1995). Basel. M. Alternatives to grading student writing. [240] Suzumura. F. Perrot. Birkh¨user. 
National Council of Teachers of English. Social Choice and Welfare 16: 17–40. Psychological Review 76: 31–48. British Journal of Mathematical and Statistical Psychology 35: 183–192. (1985).-C. S.elsevier. M. to appear also in Discrete Applied Mathematics. A. (1995). S. New York. Oxford.BIBLIOGRAPHY 261 [239] Sugeno. pp.

(1989). 1–2. [259] von Neumann. (1992b). F. Proceedings of EUROFUSE-SIC’99. Ph. P. Les nombres et leurs myst`res. and Stason. M. A review of cost-beneﬁt analysis as applied to the evaluation of new road proposals in the U. Elsevier.P..D. PhD thesis. and Harvey. Theory of games and economic behavior. [263] Watson. A theory of approximate reasoning.C. (1987). Fatal tradeoﬀs: Public and private responsibilities for risk. J. [260] von Winterfeldt. J. Cambridge University Press. from manipulation of measurement to manipulation of perceptions. [264] Weinstein. (1992). L. (1961). Hayes. W. (1999). e Paris.. (1977). Kluwer. Editions de e a e ´ l’Universit´ de Bruxelles-Editions Ellipses. Decision analysis and behavioral research. .K. Cambridge. [257] Vincke. A. (1994). Aide multicrit`re ` la d´cision dans le cadre de la e a e probl´matique du tri : M´thodes et applications. Mikulich (eds).B. Generalized Gini inequality indices. W.A. (1981). e [258] Viscusi. Princeton. New York. Oxford. Wiley. Garrod. [267] Willis. e [270] Zadeh. Origi´ nal version in French “L’Aide Multicrit`re ` la D´cision”. Oxford University Press. Multi-criteria decision aid. On the “environmental” discount rate. (1992). New England Journal of Medicine 296: 716–721. Econometrica 55: 95–115. The decathlon – A colorful history of track and ﬁeld’s most challenging event. (1998). [271] Zadeh. [269] Yu. in J.G. D. pp. M.K. Exploitation of a crisp binary relation in a ranking problem.A. From computing with numbers to computing with words. [266] Weymark. D.R. Amsterdam. and Edwards. (1979). Michie and L. [261] Wakker. pp. (1944). [272] Zarnowsky. Ph. Mathematical Social Sciences 1: 409–430. Paris. [265] Weitzman. (1989). L. D. [262] Warusfel. Leisure Press. (1986). W. Decision analysis as a replacement for cost-beneﬁt analysis. LAMSADE. 1989. Seuil.L. K. Theory and Decision 32: 221–241. M. Brussels. e e Universit´ Paris-Dauphine. Foundations of cost-effectiveness analysis for health and medical practices.I. 
Princeton University Press. and Morgenstern. Journal of Environmental Economics and Management 26: 200–209. [268] Yaari. Champaign. Machine intelligence. (1981). Transportation Research – D 3: 141–156. Dordrecht. Additive representations of preferences – A new foundation of decision analysis. 149–194.A. G. Points Sciences. O. W. (1992a).E. S. The dual theory of choice under risk.262 BIBLIOGRAPHY [256] Vincke.E.R. European Journal of Operational Research 7: 242–248.

Beneﬁt-cost analysis in theory and practice. . D. (1994).O. and Dively.BIBLIOGRAPHY 263 [273] Zerbe. Harper Collins. R. New York.D.

