Uploaded by niyati25

WHAT ARE OUTLIERS266.pptx

• A database may contain data objects that do not comply with the general behavior or model of the data. These data objects are outliers.
• Outliers can be caused by measurement or execution error.
• The outliers themselves may be of particular interest.

Applications:
• Fraud detection
• Medicine
• Public health
• Sports statistics
• Detecting measurement errors

OUTLIER DETECTION METHODS
• Statistical Distribution-Based Outlier Detection
• Distance-Based Outlier Detection
• Density-Based Local Outlier Detection
• Deviation-Based Outlier Detection

Statistical Distribution-Based Outlier Detection
• Assumes a distribution for the given data set
• Identifies outliers with respect to the model using a discordancy test
• Requires knowledge of the data set parameters (such as the assumed distribution), the distribution parameters (such as the mean and variance), and the expected number of outliers

How does the discordancy testing work?
• The test examines two hypotheses:
  • a working hypothesis
  • an alternative hypothesis

• A working hypothesis, H, is a statement that the entire data set of n objects comes from an initial distribution model, F; that is,
  H: o_i ∈ F, where i = 1, 2, …, n.
• The discordancy test verifies whether o_i is significantly large (or small) in relation to F.
• Assume T is some statistic used as the discordancy test.
• Assume the value of the statistic for object o_i is v_i.
• Then the distribution of T is constructed.
• The significance probability SP(v_i) = Prob(T > v_i) is evaluated.
• If SP(v_i) is sufficiently small, o_i is discordant and H is rejected.

• An alternative hypothesis, H', which states that o_i comes from another distribution model, G, is adopted.
• The result depends heavily on which model F is chosen, because o_i may be an outlier under one model and a perfectly valid value under another.

Kinds of alternative distributions
• Inherent alternative distribution: H': o_i ∈ G, where i = 1, 2, …, n
• Mixture alternative distribution: H': o_i ∈ (1 - μ)F + μG, where i = 1, 2, …, n
• Slippage alternative distribution
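
The significance-probability step of the discordancy test can be sketched as follows. This is a minimal illustration, not the slides' own procedure: it assumes F is a normal distribution with known mean and standard deviation, takes T to be the absolute standardized deviation from the mean, and uses a hypothetical threshold `alpha` for "small". The function names are made up for this example.

```python
from statistics import NormalDist

def significance_probability(value, mean, stdev):
    """SP(v) = Prob(T > v), with T = |X - mean| / stdev and X ~ Normal(mean, stdev).

    Assumption: F is normal with known parameters (not stated in the slides).
    """
    v = abs(value - mean) / stdev               # observed statistic v_i for object o_i
    return 2.0 * (1.0 - NormalDist().cdf(v))    # two-sided tail probability

def discordant(data, mean, stdev, alpha=0.01):
    """Return the objects for which the working hypothesis H is rejected."""
    return [x for x in data if significance_probability(x, mean, stdev) < alpha]

data = [9.8, 10.1, 10.0, 9.9, 10.2, 14.5]
outliers = discordant(data, mean=10.0, stdev=0.2)
print(outliers)
```

Choosing a different F (for example a heavier-tailed distribution) can change which objects are flagged, which is the model dependence noted above.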

Distance-Based Outlier Detection
• An object, o, in a data set, D, is a distance-based (DB) outlier with parameters pct and dmin, that is, a DB(pct, dmin)-outlier, if at least a fraction, pct, of the objects in D lie at a distance greater than dmin from o.
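
The DB(pct, dmin) definition translates almost directly into a brute-force check. The sketch below assumes Euclidean distance and points given as coordinate tuples; the function name `db_outliers` is hypothetical, and the quadratic scan is only meant to mirror the definition, not to be efficient.

```python
import math

def db_outliers(points, pct, dmin):
    """Return every object o such that at least a fraction pct of the
    objects in the data set lie farther than dmin from o."""
    n = len(points)
    result = []
    for i, o in enumerate(points):
        # Count the objects lying at a distance greater than dmin from o.
        far = sum(
            1 for j, p in enumerate(points)
            if j != i and math.dist(o, p) > dmin
        )
        if far >= pct * n:
            result.append(o)
    return result

points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
flagged = db_outliers(points, pct=0.75, dmin=3.0)
print(flagged)
```

Here four of the five objects lie more than 3 units from (10, 10), so it is the only DB(0.75, 3)-outlier in this toy data set.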

Algorithms for mining distance-based outliers
• Index-based algorithm
• Nested-loop algorithm
• Cell-based algorithm

Density-Based Local Outlier Detection
• Distance-based outlier detection relies on the global distance distribution of the data
• It has difficulty identifying outliers when the data are not uniformly distributed
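
The slides do not give a formula for the density-based approach. One common way to make "local" concrete is the local outlier factor (LOF), which compares an object's density with that of its nearest neighbours. The sketch below is a simplified version under several assumptions: Euclidean distance, exactly k neighbours per object (ties are broken arbitrarily), and no duplicate points. The name `lof_scores` and the parameter `k` are illustrative.

```python
import math

def lof_scores(points, k):
    """Simplified local outlier factor; scores near 1 mean 'as dense as
    the neighbours', scores well above 1 suggest a local outlier."""
    n = len(points)
    neighbours, k_dist = [], []
    for i, p in enumerate(points):
        ranked = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append([j for _, j in ranked[:k]])   # k nearest neighbours
        k_dist.append(ranked[k - 1][0])                 # distance to the k-th one

    def reach_dist(i, j):
        # Reachability distance of object i with respect to neighbour j.
        return max(k_dist[j], math.dist(points[i], points[j]))

    # Local reachability density: inverse of the mean reachability distance.
    lrd = [k / sum(reach_dist(i, j) for j in neighbours[i]) for i in range(n)]
    # LOF: mean ratio of the neighbours' density to the object's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]

points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = lof_scores(points, k=2)
print(scores)
```

Because each score is relative to the neighbourhood's own density, the same distance can be unremarkable in a sparse region and suspicious in a dense one, which a single global dmin cannot express.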

Deviation-Based Outlier Detection
• Identifies outliers by examining the main characteristics of objects in a group
• Two techniques for deviation-based outlier detection:
  • Sequential Exception Technique
  • OLAP Data Cube Technique

Sequential Exception Technique

OLAP Data Cube Technique
