What is an anomaly?
Types of anomalies
Sample problem
• Suppose you want to track traffic flow in road segments. If the traffic
is anomalous, the cause could be an accident, waterlogging, etc.
[Figure: labels "Yes", "Observed Value", "Property"]
Univariate Normal distribution
• $\chi^2 = \sum \frac{(O - E)^2}{E}$, where O is the observed value and E is the expected value
• https://www.stat.berkeley.edu/~stark/SticiGui/Text/chiSquare.htm
First the easy example
• You toss a coin 50 times and find 28 heads and 22 tails. Is this a normal
occurrence or anomalous?
• E(heads)=E(tails)=25
• $\chi^2 = \frac{9}{25} + \frac{9}{25} = \frac{18}{25} = 0.72$
• k: Degrees of freedom
• The number of independent ways in which the data can vary
• Is it 2 for this example?
• No, it is 1: once the number of heads is known, the number of tails is fixed
• Look up the statistic in a chi-square table
• Get the p-value
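
The coin example can be checked numerically. The following is a minimal sketch, not the slides' own method: the function names are made up for illustration, and the p-value uses the closed form for 1 degree of freedom (a chi-square variable with k = 1 is the square of a standard normal) instead of a table lookup.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_one_dof(statistic):
    """Upper-tail probability P(X >= statistic) for 1 degree of freedom.

    For k = 1, X is the square of a standard normal variable, so the
    tail probability equals erfc(sqrt(statistic / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


observed = [28, 22]  # heads, tails from the example
expected = [25, 25]  # fair coin, 50 tosses

stat = chi_square_statistic(observed, expected)
p = p_value_one_dof(stat)
print(f"chi-square = {stat:.2f}, p-value = {p:.3f}")
```

The resulting p-value is roughly 0.4, far above a conventional 0.05 cutoff (the cutoff is an assumption here, not something the slides state), so 28 heads out of 50 would count as a normal occurrence.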
Going back to our problem…
• E(clogged)=2.5
• E(slow)=4
• E(normal)=2.5
• E(smooth)=1
• $\chi^2 = \frac{(7 - 2.5)^2}{2.5} + \frac{(4 - 2)^2}{4} + \frac{(1 - 2.5)^2}{2.5} + \frac{1}{1} = 8.1 + 1 + 0.9 + 1 = 11$
• Anomalous
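
The traffic computation can be sketched the same way. Assumptions, labeled loudly: the observed counts 7 (clogged) and 1 (normal) are read off the formula above, while 2 (slow) and 0 (smooth) are inferred from its squared terms; the degrees of freedom are taken as 4 categories minus 1 = 3, by analogy with the coin example; the 0.05 cutoff and all function names are illustrative only.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_sf(x, k):
    """P(X >= x) for a chi-square variable with integer k >= 1 degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if k % 2 == 0:
        # Even k: finite sum exp(-x/2) * sum_{j < k/2} (x/2)^j / j!
        return math.exp(-half) * sum(
            half ** j / math.factorial(j) for j in range(k // 2)
        )
    # Odd k: erfc term plus a finite sum over half-integer powers.
    tail = math.erfc(math.sqrt(half))
    tail += math.exp(-half) * sum(
        half ** (j - 0.5) / math.gamma(j + 0.5)
        for j in range(1, (k - 1) // 2 + 1)
    )
    return tail


# Order: clogged, slow, normal, smooth.
expected = [2.5, 4.0, 2.5, 1.0]
observed = [7, 2, 1, 0]  # 2 and 0 are inferred, see the note above

stat = chi_square_statistic(observed, expected)
dof = len(expected) - 1  # assumed: categories minus one
p = chi_square_sf(stat, dof)
is_anomalous = p < 0.05  # assumed significance level

print(f"chi-square = {stat:.1f}, dof = {dof}, p-value = {p:.4f}")
print("anomalous" if is_anomalous else "normal")
```

Under these assumptions the p-value comes out a little above 0.01, which is consistent with the "Anomalous" verdict.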
Moving to multiple roads
• You are given 10 different road segments, and their traffic speeds
• Is it anomalous?
• What’s different?
• Don’t have categories
Multivariate normal distribution
• Vector of road speeds $r = [r_1, \cdots, r_m]$
• Road $r_i \sim N(\mu_i, \sigma_i^2)$
• Distance from expected speeds
• $d(r) = \sqrt{\sum_i \frac{(r_i - \mu_i)^2}{\sigma_i^2}}$
• If d(r)≥ 𝜃, then anomalous
• How would you select 𝜃?
• What happens if the roads are not independent?
• Use Mahalanobis distance
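
Both distances can be sketched in a few lines. This is an illustration under stated assumptions: the speeds, means, standard deviations and covariance entries are made-up numbers, the function names are hypothetical, and the small linear solver assumes a positive-definite covariance matrix. With a diagonal covariance matrix the Mahalanobis distance reduces to d(r).

```python
import math


def standardized_distance(r, mu, sigma):
    """d(r) = sqrt(sum_i (r_i - mu_i)^2 / sigma_i^2), roads assumed independent."""
    return math.sqrt(
        sum(((ri - mi) / si) ** 2 for ri, mi, si in zip(r, mu, sigma))
    )


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for j in range(col, n + 1):
                m[row][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis_distance(r, mu, cov):
    """sqrt((r - mu)^T cov^{-1} (r - mu)); cov assumed positive definite."""
    diff = [ri - mi for ri, mi in zip(r, mu)]
    z = solve(cov, diff)  # z = cov^{-1} (r - mu)
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))


# Made-up example: three road segments.
r = [30.0, 42.0, 55.0]      # observed speeds
mu = [40.0, 45.0, 50.0]     # expected speeds
sigma = [5.0, 3.0, 5.0]     # per-road standard deviations

cov_independent = [[25.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 25.0]]
cov_correlated = [[25.0, 12.0, 0.0], [12.0, 9.0, 0.0], [0.0, 0.0, 25.0]]

d_indep = standardized_distance(r, mu, sigma)
d_diag = mahalanobis_distance(r, mu, cov_independent)
d_corr = mahalanobis_distance(r, mu, cov_correlated)
print(d_indep, d_diag, d_corr)
```

One possible way to pick θ, under the independence and normality assumptions above: the squared distance d(r)^2 then follows a chi-square distribution with m degrees of freedom, so θ can be taken as the square root of a chosen chi-square quantile.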