
TABLE OF CONTENTS

ACKNOWLEDGEMENTS ......................................................................................................... II
LIST OF TABLES ............................................................................................................. VII
LIST OF FIGURES ............................................................................................................ IX
ACRONYMS ...................................................................................................................... XI
CHAPTER 1 INTRODUCTION .......................................................................................1

CHAPTER 2 REVIEW OF LITERATURE ......................................................................3

2.1 WHAT IS AN OUTLIER? ....................................................................................................... 3
2.2 HISTORY OF OUTLIERS ...................................................................................................... 4
2.3 IMPORTANCE OF DETECTING OUTLIERS .......................................................................... 5
2.4 CAUSES OF OUTLIERS ........................................................................................................ 6
2.4.1 OUTLIERS IN SURVEY DATA SETS........................................................................................... 7
2.4.2 NATURAL VARIATION .......................................................................................................... 9
2.4.3 CONTAMINATION ................................................................................................................. 9
2.5 EFFECTS OF OUTLIERS ..................................................................................................... 10
2.5.1 DAMAGING EFFECTS OF OUTLIERS .................................................................................... 10
2.5.2 BENEFITS OF OUTLIERS IN THE DATA SET ......................................................................... 11
2.6 MASKING AND SWAMPING EFFECTS OF THE OUTLIERS ................................................. 11
2.6.1 MASKING EFFECT............................................................................................................... 12
2.6.2 SWAMPING EFFECT ............................................................................................................ 12
2.7 APPLICATIONS OF OUTLIER DETECTING TECHNIQUES................................................. 13
2.7.1 FRAUD DETECTION ............................................................................................................ 13
2.7.2 MEDICAL DATA.................................................................................................................. 13
2.7.3 COMMUNITY BASED DISEASES .......................................................................................... 14
2.7.4 SPORTS DATA ANALYSIS ................................................................................................... 14
2.7.5 DETECTING MEASUREMENT ERRORS ................................................................................ 14
2.8 PREVIOUS TECHNIQUES ................................................................................................... 15
2.8.1 TUKEY'S METHOD (BOXPLOT) .......................................................................................... 17
2.8.2 METHOD BASED ON MEDCOUPLE ...................................................................................... 18
CHAPTER 3 SPLIT SAMPLE SKEWNESS .................................................................. 20

3.1 INTRODUCTION ................................................................................................................. 20
3.2 SKEWNESS ......................................................................................................................... 21
3.3 VARIOUS MEASURES OF SKEWNESS ................................................................................ 22
3.3.1 MOMENT BASED MEASURE OF SKEWNESS ....................................................................... 22
3.3.2 PEARSON SKEWNESS .......................................................................................................... 23
3.3.3 QUARTILE SKEWNESS ........................................................................................................ 23
3.3.4 OCTILE SKEWNESS ............................................................................................................. 24
3.3.5 MEDCOUPLE ....................................................................................................................... 24
3.4 SPLIT SAMPLE SKEWNESS (SSS)...................................................................................... 25
3.5 METHODOLOGY: BOOTSTRAP TESTS FOR SKEWNESS ......................................................... 28
3.6 POWER AND SIZE OF THE TEST ....................................................................................... 31
3.7 MERITS OF THE NEW TECHNIQUE .................................................................................. 40
CHAPTER 4 SPLIT SAMPLE SKEWNESS BASED BOX PLOTS ............................... 41

4.1 INTRODUCTION ................................................................................................................. 41
4.2 INTRODUCTION ................................................................................................................. 41
4.3 PROBLEM STATEMENT ..................................................................................................... 43
4.4 PROPOSED TECHNIQUE .................................................................................................... 44
4.4.1 CONSTRUCTION ................................................................................................................. 45
4.4.2 BENEFITS/ADVANTAGES OF SPLIT SAMPLE SKEWNESS ADJUSTED TECHNIQUE .............. 46
4.5 HYPOTHETICAL DATA EXAMPLE .................................................................................... 48
4.6 HYPOTHESIS AND METHODOLOGY ................................................................................. 48
4.7 THEORETICAL APPROACH ............................................................................................... 50
4.8 CONVENTIONAL APPROACH: BEST AND WORST CASE IN CONTEXT OF PERCENTAGE OUTLIERS ........................................ 56
4.9 COMPARISON OF SSSBB TECHNIQUE WITH KIMBER'S APPROACH ............................. 58
4.9.1 COMPARISON IN SYMMETRIC DISTRIBUTION ........................................................................ 59
4.9.2 COMPARISON IN SKEWED DISTRIBUTIONS............................................................................ 60
4.10 CONCLUSION ..................................................................................................................... 64
CHAPTER 5 MODIFIED HUBERT VANDERVIEREN BOXPLOT ............................. 65

5.1 INTRODUCTION ................................................................................................................. 65
5.2 PROBLEM STATEMENT ..................................................................................................... 66
5.3 MODIFIED HUBERT VANDERVIEREN BOXPLOT (MHVBP) ........................................... 69
5.4 CONSTRUCTION OF TECHNIQUE BY PROPOSED MODIFICATION .................................. 69
5.5 HYPOTHESIS AND METHODOLOGY ................................................................................. 70
5.6 THEORETICAL APPROACH AND SIMULATION STUDY .................................................... 71
5.7 SIZE OF TESTS ................................................................................................................... 72
5.8 POWER OF THE TEST ........................................................................................................ 73
5.9 CONVENTIONAL APPROACH: BEST AND WORST CASE IN CONTEXT OF PERCENTAGE OUTLIERS ........................................ 79
5.10 ARTIFICIAL OUTLIER EXAMPLE ..................................................................................... 80
5.11 CONCLUSION ..................................................................................................................... 82

CHAPTER 6 MEDCOUPLE BASED SPLIT SAMPLE SKEWNESS ADJUSTED TECHNIQUE ............................................. 83

6.1 INTRODUCTION ................................................................................................................. 83
6.2 PROPOSED MODIFICATION .............................................................................................. 84
6.3 MONTE CARLO SIMULATION STUDY .............................................................................. 85
6.4 COMPARISON OF FENCES PRODUCED BY MCSSSBB AND HVBP TECHNIQUES WITH TRUE 95 PERCENT CENTRAL BOUNDARY .................... 85
6.5 CONVENTIONAL APPROACH: BEST AND WORST CASE IN CONTEXT OF PERCENTAGE OUTLIERS ........................................ 93
6.6 CONCLUSION ..................................................................................................................... 95
CHAPTER 7 APPLICATIONS....................................................................................... 96

SUMMARY...................................................................................................................................... 96
7.1 STOCK RETURN DATA SET............................................................................................... 96
7.2 BABY BIRTH WEIGHT DATA ............................................................................................ 99
7.3 COMPARISON OF TUKEY'S TECHNIQUE AND SSSBB IN BABY BIRTH WEIGHT DATA ................................ 101
7.4 COST BENEFIT ANALYSIS............................................................................................... 103
CHAPTER 8 CONCLUSIONS AND RECOMMENDATIONS .................................... 105

8.1 CONCLUSIONS ................................................................................................................. 105
8.2.1 ADVANTAGES OF SPLIT SAMPLE SKEWNESS BASED BOXPLOT ................................... 106
8.3 ADVANTAGES OF THE MODIFIED HUBERT VANDERVIEREN BOXPLOT ...................... 106
8.4 RECOMMENDATIONS ...................................................................................................... 106
8.5.1 FUTURE WORK ............................................................................................................... 107

BIBLIOGRAPHY................................................................................................................... 108
APPENDIX....................................................................................................................... 112
PREVIOUS TECHNIQUES ............................................................................................. 112
