Outliers in Statistical Data
Second Edition
VIC BARNETT
University of Sheffield
and
TOBY LEWIS
The Open University
John Wiley & Sons
Chichester • New York • Brisbane • Toronto • Singapore
Contents
CHAPTER 1 INTRODUCTION 1
1.1 Human error and ignorance 7
1.2 Outliers in relation to probability models 8
1.3 Outliers in more structured situations 11
1.4 Bayesian and non-parametric methods 17
1.5 Survey of outlier problems 18
CHAPTER 2 WHY DO OUTLYING OBSERVATIONS ARISE AND
WHAT SHOULD ONE DO ABOUT THEM? 20
2.1 Early informal approaches 20
2.2 Origin of outliers, statistical method, and aim 24
2.2.1 The nature and origin of an outlier 25
2.2.2 Relevant statistical procedures for handling outliers 27
2.2.3 Different aims in examining outliers 31
2.3 Models for contamination 35
CHAPTER 3 THE ACCOMMODATION APPROACH: ROBUST
ESTIMATION AND TESTING 45
3.1 Performance criteria 49
3.1.1 Efficiency measures for estimators 49
3.1.2 The qualitative approach: influence curves 55
3.1.3 Robustness of confidence intervals 60
3.1.4 Robustness of significance tests 61
3.2 General methods of accommodation: blanket procedures 63
3.2.1 Estimation of location 63
3.2.2 Estimation of dispersion 68
3.2.3 Hypothesis tests and confidence intervals 69
3.3 Specific accommodation procedures 72
CHAPTER 4 ACCOMMODATION PROCEDURES FOR UNIVARIATE SAMPLES 74
4.1 Estimation of location 74
4.1.1 Estimators based on trimming or Winsorization 74
4.1.2 L-estimators (linear order statistics estimators) 77
4.1.3 M-estimators (maximum likelihood type estimators) 79
4.1.4 R-estimators (rank test estimators) 82
4.1.5 Other estimators 82
4.2 Estimation of scale or dispersion 84
4.3 Hypothesis tests and confidence intervals 86
4.4 Accommodation of outliers in univariate normal samples 87
4.5 Accommodation of outliers in gamma (including exponential)
samples 100
CHAPTER 5 TESTING FOR DISCORDANCY: PRINCIPLES AND
CRITERIA 107
5.1 Construction of discordancy tests 111
5.1.1 Test statistics 111
5.1.2 Inclusive and exclusive measures 115
5.1.3 Statistical bases for construction of tests 116
5.1.4 Assessment of significance 127
5.2 Performance criteria of tests 132
5.3 The multiple outlier problem 136
5.3.1 Block procedures for multiple outliers in univariate samples 140
5.3.2 Consecutive procedures for multiple outliers in univariate
samples 142
CHAPTER 6 SPECIFIC DISCORDANCY TESTS FOR OUTLIERS IN
UNIVARIATE SAMPLES 144
6.1 Guide to use of the tests 144
6.2 Discordancy tests for gamma (including exponential) samples 146
6.2.1 Gamma samples: contents list and details of tests 147
6.3 Discordancy tests for normal samples 161
6.3.1 Normal samples: contents list and details of tests 163
6.4 Discordancy tests for samples from other distributions 192
6.4.1 Log-normal samples 193
6.4.2 Truncated exponential samples 193
6.4.3 Uniform samples 194
6.4.4 Gumbel, Frechet, and Weibull samples 196
6.4.5 Pareto samples 197
6.4.6 Poisson samples 198
6.4.7 Binomial samples 200
CHAPTER 7 OUTLIERS IN DIRECTIONAL DATA 203
7.1 Outliers on the circle 205
7.2 Outliers on the sphere 208
CHAPTER 8 OUTLYING SUB-SAMPLES: SLIPPAGE TESTS 211
8.1 Non-parametric slippage tests 213
8.1.1 Non-parametric tests for slippage of a single population 213
8.1.2 Non-parametric tests for slippage of several populations: multiple
outlying sub-samples 220
8.2 The slippage model 223
8.3 Parametric slippage tests 224
8.3.1 Normal samples 225
8.3.2 General slippage tests 234
8.3.3 Non-normal samples 236
8.3.4 Group parametric slippage tests 240
8.4 Other slippage work 241
CHAPTER 9 OUTLIERS IN MULTIVARIATE DATA 243
9.1 Principles for outlier detection in multivariate samples 243
9.2 Accommodation of multivariate outliers 246
9.2.1 Estimation of parameters in multivariate distributions 247
9.2.2 Outliers in multivariate analyses 253
9.3 Discordancy tests 254
9.3.1 Multivariate normal samples 255
9.3.2 Multivariate exponential samples 264
9.3.3 Multivariate Pareto samples 266
9.3.4 A transformation approach 267
9.4 Informal methods for multivariate outliers 268
9.4.1 Marginal outliers and linear constraints 269
9.4.2 Graphical and pictorial methods 270
9.4.3 Principal component analysis method 272
9.4.4 Use of reduction measures in the form of generalized distances 274
9.4.5 Function plots 275
9.4.6 Correlation methods and influential observations 277
9.4.7 A 'gap test' for multivariate outliers 278
CHAPTER 10 THE OUTLIER PROBLEM FOR STRUCTURED
DATA: REGRESSION, THE LINEAR MODEL, AND
DESIGNED EXPERIMENTS 281
10.1 Outliers in linear regression 286
10.1.1 Multiple regression 289
10.1.2 Outliers in linear structural and functional models 290
10.2 Outliers with general linear models 292
10.2.1 Residual-based methods 292
10.2.2 Non-residual-based methods 301
10.2.3 Accommodation of outliers for the linear model 306
10.3 Outliers in designed experiments 309
10.4 Graphical methods for linear model outliers 318
10.5 Non-parametric procedures 323
10.6 Outliers in contingency tables 324
10.7 Some further comments on influential observations 325
CHAPTER 11 OUTLIERS IN TIME SERIES: A LITTLE-EXPLORED
AREA 329
11.1 Detection and testing of outliers in time series 330
11.2 Accommodation of outliers in time series 333
11.2.1 Time-domain characteristics 334
11.2.2 Frequency-domain characteristics 337
11.3 Bayesian methods 340
11.4 Comment 341
CHAPTER 12 BAYESIAN APPROACHES TO OUTLIERS 343
12.1 Bayesian considerations 343
12.2 Bayesian accommodation of outliers 346
12.3 Bayesian assessment of contamination 353
CHAPTER 13 PERSPECTIVE 360
APPENDIX: STATISTICAL TABLES 364
Contents list 365
REFERENCES AND BIBLIOGRAPHY 424
INDEX 452