Anomaly Detection

Uploaded by

Rashul Chutani


What is an anomaly?
Types of anomalies
Sample problem
• Suppose you want to track traffic flow in road segments. If the traffic
is anomalous, then it could potentially be due to an accident, water
logging, etc.

• How would you do that?

• Model traffic flow using a normal distribution. Compute the
probability of observing the current flow under that model.
Statistical methods

• Pipeline: Object → Null model → is P-value(Object) < 𝜃? If yes → Anomalous
• How do you build this model?

P-value
• P-value(v = o): probability of observing a value at least as extreme as o
• p(x > o)
• [Figure: distribution of a property, with the expected value and the observed value marked]
Univariate Normal distribution

• Compute the Z-score: how many standard deviations is the value from the mean? $z = (x - \mu)/\sigma$

• Use the standard normal table to compute p-values.
• ~68% of values lie within 1 std. dev. of the mean, ~95% within 2 and ~99.7% within 3.
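
The Z-score and the table lookup can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the traffic numbers (mean flow 50, standard deviation 10, observed flow 70) are hypothetical, and the standard normal tail is computed with `math.erfc` instead of a printed table.

```python
import math

def z_score(x, mean, std):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std

def upper_tail_p_value(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def fraction_within(k):
    """Probability mass within k standard deviations of the mean."""
    return 1.0 - math.erfc(k / math.sqrt(2))

# Hypothetical road segment: mean flow 50, std. dev. 10, observed flow 70.
z = z_score(70, 50, 10)
p = upper_tail_p_value(z)
print(f"z = {z:.2f}, upper-tail p-value = {p:.4f}")

# Mass within 1, 2 and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(fraction_within(k), 4))
```

With a threshold of 0.05, the hypothetical flow above would be flagged, since its upper-tail p-value is about 0.023.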
Anomalies at a global level
• A road is anomalous with 0.05 probability
• What is the probability that at least one road is anomalous in a 1000-
road network?
• $1 - 0.95^{1000} \approx 1$
• Finding an anomalous road in a day is not statistically significant
• You find 100 roads as anomalous. Is that an anomaly?
• Let us make the assumption that all roads are independent.
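
The "at least one anomalous road" calculation can be checked numerically. The sketch below assumes independent roads, as the slide does; the function name `prob_at_least_one` is made up for this example.

```python
import math

def prob_at_least_one(p, n):
    """P(at least one of n independent roads is flagged), each with probability p.

    Computes 1 - (1 - p)**n in log space to limit rounding error.
    """
    return -math.expm1(n * math.log1p(-p))

p_single = 0.05
for n_roads in (1, 10, 100, 1000):
    print(n_roads, prob_at_least_one(p_single, n_roads))

# Expected number of flagged roads in a 1000-road network.
expected_flagged = 1000 * p_single
print("expected flagged roads:", expected_flagged)
```

Because about 50 roads are expected to be flagged by chance alone, a single flagged road says little about the network as a whole.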
It’s like a coin toss…
• You toss a coin at each road. With 0.05 probability it is anomalous.
• Finding exactly 100 anomalous roads:
• $P(100) = \binom{1000}{100}\, 0.05^{100}\, 0.95^{900}$
• Finding 100 or more anomalous roads:
• $p\text{-value}(100) = \sum_{i=100}^{1000} \binom{1000}{i}\, 0.05^{i}\, 0.95^{1000-i}$
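
The binomial probability and its tail sum can be evaluated directly. The following is a sketch under the slide's independence assumption; the helper names are hypothetical, and the terms are computed in log space with `math.lgamma` so that the very large binomial coefficients and very small powers do not overflow or underflow prematurely.

```python
import math

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials), for 0 < p < 1."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def binom_upper_tail(k, n, p):
    """P(k or more successes in n independent trials): the one-sided p-value."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

n_roads, p_anomalous = 1000, 0.05
print("P(exactly 100):", binom_pmf(100, n_roads, p_anomalous))
print("p-value(100)  :", binom_upper_tail(100, n_roads, p_anomalous))
```

The tail probability is far below 0.05, so observing 100 anomalous roads out of 1000 would itself be an anomaly at the network level.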
Moving to multiple categories
• You are given 10 different road segments and their traffic category.
• Clogged:7, Slow:2, Moving:1, Smooth:0
• Each state happens with a certain probability
• Clogged: 0.25
• Slow: 0.4
• Moving: 0.25
• Smooth: 0.1
• On the whole, is it anomalous?
• P-value <0.05
• How is it different from the single category setting?
• Univariate to multivariate
Use Chi-square test
• You have multiple independent random variables.
• Chi-squared distribution
• Distribution of the sum of the squares of k independent standard normal random variables
• Normalized differences from the expected value in a multinomial (roughly) follow chi-square
• $\chi^2 = \sum \frac{(O - E)^2}{E}$, where O is the observed count and E the expected count of each category
• Derivation: https://www.stat.berkeley.edu/~stark/SticiGui/Text/chiSquare.htm
First the easy example
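• You toss a coin 50 times and find 28 heads and 22 tails. Is this a normal
occurrence or anomalous?

• E(heads) = E(tails) = 25
• $\chi^2 = \frac{(28-25)^2}{25} + \frac{(22-25)^2}{25} = \frac{9}{25} + \frac{9}{25} = \frac{18}{25} = 0.72$
• k: Degrees of freedom
• The number of independent ways in which the data can vary
• 2 for this example?
• No, 1: once the number of heads is known, the number of tails is fixed.
• Check the chi-square table to get the p-value: with k = 1, $\chi^2 = 0.72$ gives p ≈ 0.40, so the result is not anomalous.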
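
The coin example can be reproduced without a printed table. This sketch is an illustration rather than the slides' own method: it uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal, so its upper tail is `erfc(sqrt(x / 2))`. The function names are hypothetical.

```python
import math

def chi_square_statistic(observed, expected):
    """Sum over categories of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_p_value_df1(stat):
    """P(X >= stat) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# 50 tosses of a fair coin: 28 heads, 22 tails, 25 expected of each.
stat = chi_square_statistic([28, 22], [25, 25])
p_value = chi_square_p_value_df1(stat)
print(f"chi-square = {stat:.2f}, p-value = {p_value:.3f}")
```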
Going back to our problem…
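• E(clogged) = 2.5
• E(slow) = 4
• E(moving) = 2.5
• E(smooth) = 1
• $\chi^2 = \frac{(7-2.5)^2}{2.5} + \frac{(2-4)^2}{4} + \frac{(1-2.5)^2}{2.5} + \frac{(0-1)^2}{1} = 8.1 + 1 + 0.9 + 1 = 11$

• Anomalous: with k = 3 degrees of freedom, 11 exceeds the 0.05 critical value of 7.81 (p ≈ 0.012)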
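
The four-category traffic example can be checked with a small goodness-of-fit routine. This is a sketch using only the standard library: `chi2_sf` is a hypothetical helper that evaluates the chi-square upper tail for integer degrees of freedom through the recurrence Q(a+1, y) = Q(a, y) + y^a e^{-y} / Γ(a+1); a statistics library would normally be used instead. The chi-square approximation is also rough when expected counts are as small as they are here.

```python
import math

def chi2_sf(x, df):
    """P(X >= x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y)
    else:
        a, q = 1.0, math.exp(-y)              # Q(1, y)
    while a < df / 2.0:
        # Q(a + 1, y) = Q(a, y) + y^a * exp(-y) / Gamma(a + 1)
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q

def goodness_of_fit(observed, probabilities):
    """Chi-square statistic and p-value for observed counts against category probabilities."""
    n = sum(observed)
    expected = [n * p for p in probabilities]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)

# Clogged, Slow, Moving, Smooth counts over 10 road segments.
observed = [7, 2, 1, 0]
probabilities = [0.25, 0.4, 0.25, 0.1]
stat, df, p_value = goodness_of_fit(observed, probabilities)
print(f"chi-square = {stat:.2f}, df = {df}, p-value = {p_value:.4f}")
```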
Moving to multiple roads
• You are given 10 different road segments, and their traffic speeds
• Is it anomalous?
• What’s different?
• Don’t have categories
Multivariate normal distribution
• Vector of speeds $r = [r_1, \cdots, r_m]$
• Road $r_i \sim N(\mu_i, \sigma_i^2)$
• Distance from expected speeds
• $d(r) = \sqrt{\sum_i \frac{(r_i - \mu_i)^2}{\sigma_i^2}}$
• If d(r)≥ 𝜃, then anomalous
• How would you select 𝜃?
• What happens if the roads are not independent?
• Use Mahalanobis distance
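
Both distances can be compared on a toy example. The sketch below is an illustration with made-up numbers (two roads, mean speeds 40 and 60, standard deviations 5 and 10, correlation 0.8), not data from the slides. The Mahalanobis distance is computed as sqrt((r − μ)ᵀ Σ⁻¹ (r − μ)), using a small hand-written linear solver so that the example needs only the standard library.

```python
import math

def standardized_distance(r, mu, sigma):
    """d(r) for independent roads: root of the summed squared Z-scores."""
    return math.sqrt(sum(((ri - mi) / si) ** 2 for ri, mi, si in zip(r, mu, sigma)))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            f = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def mahalanobis(r, mu, cov):
    """sqrt((r - mu)^T cov^{-1} (r - mu)) for a covariance matrix cov."""
    diff = [ri - mi for ri, mi in zip(r, mu)]
    y = solve(cov, diff)
    return math.sqrt(sum(d * yi for d, yi in zip(diff, y)))

# Hypothetical speeds: road 1 is slower than usual, road 2 is faster than usual.
r, mu, sigma = [30.0, 70.0], [40.0, 60.0], [5.0, 10.0]
independent_cov = [[25.0, 0.0], [0.0, 100.0]]
correlated_cov = [[25.0, 40.0], [40.0, 100.0]]  # correlation 0.8

print(standardized_distance(r, mu, sigma))
print(mahalanobis(r, mu, independent_cov))
print(mahalanobis(r, mu, correlated_cov))
```

With a diagonal covariance matrix the two distances agree. When the roads usually speed up and slow down together, deviations in opposite directions are much less likely, and the Mahalanobis distance grows accordingly. One possible way to choose 𝜃, offered here as an assumption rather than the slides' answer: under the normal model d(r)² follows a chi-square distribution with m degrees of freedom, so 𝜃 can be set to the square root of the chi-square critical value for the desired p-value (about 2.45 for m = 2 at 0.05).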
