
DEMYSTIFYING MACHINE LEARNING

AND MEASURING SOC SUCCESS


ANDREW HOLLISTER, TECHNICAL DIRECTOR EMEA

2 OCTOBER 2018
WELCOME

Audio is streamed over your computer.
Dial-in numbers and codes are on the left.
Have a question for the speaker? Access the Q&A tab.
Technical issues? Access the Help tab.
Questions or suggestions? Visit https://support.isaca.org

To receive your CPE Credit:
1. Complete 3 Attendance Checkpoints
or
2. Watching the On-Demand recording? Watch from the beginning to the very end.
3. Don’t forget to take the survey!

Use the Credits tab to track your Checkpoints.
Use the Papers tab to find the following:
1. PDF copy of today’s presentation.
2. CPE Submission Guide.
TODAY’S SPEAKER

Andrew Hollister
Technical Director EMEA
LogRhythm
Some Definitions

Data Science
The discipline of extracting information from data.

Machine Learning (ML)
The science of enabling computers to learn without being explicitly programmed to do so.

Artificial Intelligence (AI)
The science of enabling a computer to automate something a human would normally do that requires intelligence, analysis, and decision making.

©LogRhythm 2018. All rights reserved. Company Confidential


From HAL to Now

1940s: Dawn of AI
1950s-1960s: Early Innovation
1970s: AI Winter
1980s-1990s: Narrow AI
2000s: Compute Power and Big Data
Today & Beyond: Towards Strong AI


The Hype and the Reality
ML/AI = The Security Silver Bullet!
ML/AI:
Learns all patterns
Recognises real security concerns
Problem solved, right?

Growing skepticism or debate around vendor claims of the “magic bullet” of machine learning for cyber security
• https://eugene.kaspersky.com/2016/05/25/darwinism-in-it-security-pt-2-inoculation-from-bs/
• https://seclab.cs.ucsb.edu/media/uploads/papers/2010_sommer_paxson_ssp_oakland10-ml.pdf
• https://techcrunch.com/2016/07/01/exploiting-machine-learning-in-cybersecurity/
Modern AI ≈ Machine Learning
Three Key AI/ML Success Factors

• Data
• Domain
• Data Science
What About the Data?

Awesome
AI/ML
Data Preparation

The accessing, organising, and structuring of unprocessed data assets to be used for data analysis.

Data Scientists spend ~80% of their time preparing data.
Data Preparation – Who, What, When?

09 28 2016 03:19:33 172.16.0.21 <LOC4:DBUG> Sep 28 03:18:09 probe LogRhythmDpi: EVT:001 4b03743f-8d06-4bdb-a9fe-63b6bb833376:00 172.16.0.106,172.16.0.35,1205,25,00:50:56:a7:00:df,00:50:56:a7:35:ad,6,956,339816/339816,11920/11920,501/501,1475029082,1475029089,7/7,dname=lrxm.uk.emea.logrhythm.com,command=EHLO|MAIL|RCPT|DATA,sender=invoicetracking@acme.com,recipient=bsmith@uk.emea.logrhythm.com,subject=Status Update For Tracking# 123412341234,object=250,objectname=Invoice.pdf
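
As a rough illustration of what "who, what, when" preparation can look like for the record above, here is a minimal Python sketch. It is not LogRhythm's parser. The function name `parse_record`, the output keys, and the reading of the leading timestamp as month, day, year are assumptions, as is treating the first two comma-separated values as source and destination addresses.

```python
from datetime import datetime

# Sample record from the slide, joined back onto one line.
RAW = (
    "09 28 2016 03:19:33 172.16.0.21 <LOC4:DBUG> Sep 28 03:18:09 probe "
    "LogRhythmDpi: EVT:001 4b03743f-8d06-4bdb-a9fe-63b6bb833376:00 "
    "172.16.0.106,172.16.0.35,1205,25,00:50:56:a7:00:df,00:50:56:a7:35:ad,"
    "6,956,339816/339816,11920/11920,501/501,1475029082,1475029089,7/7,"
    "dname=lrxm.uk.emea.logrhythm.com,command=EHLO|MAIL|RCPT|DATA,"
    "sender=invoicetracking@acme.com,"
    "recipient=bsmith@uk.emea.logrhythm.com,"
    "subject=Status Update For Tracking# 123412341234,"
    "object=250,objectname=Invoice.pdf"
)


def parse_record(raw):
    """Split one record into 'when', 'who' and 'what' (hypothetical layout)."""
    # Naive comma split: fine for this sample, but it would break on
    # values that themselves contain commas.
    head, *rest = raw.split(",")
    header_tokens = head.split()

    # Assumption: the first four tokens are "MM DD YYYY HH:MM:SS".
    when = datetime.strptime(" ".join(header_tokens[:4]), "%m %d %Y %H:%M:%S")

    # The last header token is the first comma-separated value.
    positional = [header_tokens[-1]]
    metadata = {}
    for item in rest:
        if "=" in item:
            key, value = item.split("=", 1)
            metadata[key] = value
        else:
            positional.append(item)

    return {
        "when": when,
        "who": {
            # Assumption: first two positional values are source/destination IPs.
            "src_ip": positional[0],
            "dst_ip": positional[1],
            "sender": metadata.get("sender"),
            "recipient": metadata.get("recipient"),
        },
        "what": {
            "command": metadata.get("command", "").split("|"),
            "subject": metadata.get("subject"),
            "objectname": metadata.get("objectname"),
        },
    }


record = parse_record(RAW)
print(record["when"], record["who"]["sender"], record["what"]["objectname"])
```

Even this small example needs several layout assumptions to turn one line into usable fields, which gives a feel for why preparation takes so much of the effort.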
LogRhythm Machine Data Intelligence: An Example

100+ Metadata Fields
Why AI/ML for Security?
Poll #1

What concerns you most?

a) Account compromise
b) Insider threats
c) Privilege account abuse
d) Data exfiltration
e) Other – none of the above
The Evolving Need

• Exponentially increasing threat surface
• Spectrum of attacks – “unknown unknowns”
• Improving detection requires improving accuracy and efficiency
• Moving beyond rules-based approaches

“Unfortunately, more security doesn’t necessarily mean better security….The status quo is not sustainable…Even as companies spend more on security, losses related to cybercrime have nearly doubled in the last five years.”
-Keith Weiss, head of U.S. software coverage for Morgan Stanley
Why AI/ML for Security?

69% of orgs report a recent insider data exfil attempt [1]
81% of breaches involved stolen or weak credentials [2]

[1], [2] https://www.verizonenterprise.com/verizon-insights-lab/dbir/
Skills shortage

80% of organisations are affected by the cyber security skills gap

Source: CyberEdge’s 2018 Cyberthreat Defense Report


Poll Results
Security Analytics Machine Learning Landscape

Analytics areas:
• Anomaly Detection
• Behavioural Profiling
• Peer Analysis
• Threat Classification
• Temporal Modeling
• Network Analytics

Techniques:
• One-Class SVMs
• Nearest Neighbor Search
• Deep Learning
• Multivariate Statistical Modeling
• Density Outlier Methods
• Bayesian Inference
• Subspace Models
• Clustering
• Spectral Graph Theory
• Markov Random Processes
• Time Series Analysis
• Adversarial Learning
Indications or Conclusions?

INSIGHT: “Give me insights to threats I wouldn’t otherwise know about”
CERTAINTY: “I don’t want false positives”

https://blogs.gartner.com/anton-chuvakin/2016/12/08/what-should-your-ueba-show-indications-or-conclusions/
Machine Learning: Non-Deterministic Security

Pick any two: CORRECT, FAST, EXPLAINABLE

Correct + Fast = Not Explainable
Fast + Explainable = Not Correct
Correct + Explainable = Not Fast

• Inherent to methods and algorithms used in ML
• Non-deterministic approaches augment deterministic approaches

https://blogs.gartner.com/anton-chuvakin/2015/03/03/killed-by-ai-much-a-rise-of-non-deterministic-security/
Analytics in Depth is Required

Spectrum of Attacks

Brute-force    Spear-phishing    Zero-day
Rootkit    Session hijacking    Insider threat
Commodity malware    Custom malware

Vulns: Known, Methods: Known
Vulns: Known, Methods: Unknown
Vulns: Unknown, Methods: Unknown

• Real-time threat detection via scenario analytics
• Anomaly detection via deep behavioural profiling
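
To make the contrast between the two layers concrete, here is a small hedged sketch in Python. The scenario rule is a fixed failed-login threshold, and the behavioural check is a plain z-score against an entity's own history. The z-score is a deliberately simple stand-in for behavioural profiling, not the technique any particular product uses, and the thresholds, function names and sample data are all hypothetical.

```python
from statistics import mean, stdev


def scenario_brute_force(failed_logins, threshold=10):
    """Scenario analytics: a known method expressed as a fixed rule."""
    return failed_logins >= threshold


def behavioural_anomaly(baseline, observed, z_limit=3.0):
    """Behavioural check: is the observation far outside this entity's norm?"""
    if len(baseline) < 2:
        return False  # not enough history to build a profile
    spread = stdev(baseline)
    if spread == 0:
        return observed != baseline[0]
    return abs(observed - mean(baseline)) / spread > z_limit


# Hypothetical daily outbound megabytes for one user over two weeks.
history = [120, 135, 110, 128, 140, 125, 118, 132, 127, 121, 138, 115, 130, 124]

# Only 3 failed logins, so the brute-force rule stays quiet.
rule_hit = scenario_brute_force(failed_logins=3)

# A 900 MB day is far outside this user's history, so it is flagged.
anomaly_hit = behavioural_anomaly(history, observed=900)

print("scenario rule fired:", rule_hit)
print("behavioural anomaly:", anomaly_hit)
```

The rule is fast and easy to explain but only covers a method someone has already described. The baseline check needs no prior description of the attack, at the cost of flagging anything unusual, benign or not.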
Attack Coverage with Analytics In Depth

[Chart. Labels: Percentage of Attacks, Attack Impact. Horizontal axis: Known to Unknown]
Measuring Results
Poll #2

How do you measure your SOC effectiveness today?

a) # alerts
b) # cases closed
c) Wider SLAs that are met
d) Time to triage
e) No effective way to measure
f) I don't know
How Do You Measure SOC Effectiveness?

# of Alerts, Open/Close Rates, % Met SLAs
• Helpful in understanding frequencies and if the team is overloaded
• Does not indicate issues in processes and a poor reflection of performance

Mean time to triage
• How quickly the team is able to respond to new concerns
• Does not recognise issues with false positives or your ability to respond to different threat types
Measuring the Security Operation Process

Earliest evidence → Alarm Creation → First Alarm Touch → Qualified Threat or Not → Mitigated → Resolved

• Time to Triage
• Time to Detect
• Time to Qualify
• Time to Investigate
• Time to Respond
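
As a sketch of how these timings could be derived from case data, the Python below computes each metric as the gap between two stage timestamps and averages it across cases. The slide's diagram did not survive extraction, so the pair of stages that bounds each metric in `METRICS` is an assumption to replace with your own definitions. The field names and sample timestamps are made up for illustration.

```python
from datetime import datetime
from statistics import mean

# Assumption: which two stages bound each metric. Adjust to your own
# definitions; the original diagram is not recoverable from the text.
METRICS = {
    "time_to_detect": ("earliest_evidence", "alarm_creation"),
    "time_to_triage": ("alarm_creation", "first_alarm_touch"),
    "time_to_qualify": ("first_alarm_touch", "qualified"),
    "time_to_investigate": ("qualified", "mitigated"),
    "time_to_respond": ("qualified", "resolved"),
}


def at(hhmm):
    """Build a timestamp on an arbitrary day from an 'HH:MM' string."""
    return datetime.strptime("2018-10-02 " + hhmm, "%Y-%m-%d %H:%M")


def case_metrics(case):
    """Return every metric, in minutes, for a single case."""
    return {
        name: (case[end] - case[start]).total_seconds() / 60
        for name, (start, end) in METRICS.items()
    }


def mean_metrics(cases):
    """Average each metric across all cases."""
    per_case = [case_metrics(c) for c in cases]
    return {name: mean(m[name] for m in per_case) for name in METRICS}


# Two hypothetical cases.
cases = [
    {"earliest_evidence": at("09:00"), "alarm_creation": at("09:10"),
     "first_alarm_touch": at("09:25"), "qualified": at("09:55"),
     "mitigated": at("11:55"), "resolved": at("13:55")},
    {"earliest_evidence": at("14:00"), "alarm_creation": at("14:30"),
     "first_alarm_touch": at("14:35"), "qualified": at("14:45"),
     "mitigated": at("15:45"), "resolved": at("16:45")},
]

averages = mean_metrics(cases)
for name, minutes in averages.items():
    print(f"{name}: {minutes:.0f} min")
```

Keeping the stage pairs in one table makes the definitions explicit and easy to change.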
Poll Results
Technology? Process? People?
Where/what is the bottleneck?

High # of Alarms
• Technology issue:
  • How many are qualified? Tune analytics to draw down false positives

Slow rate of Qualification
• Process issue:
  • Can external/internal contextual data be automatically collected?
• Technology issue:
  • Do you have access to contextual data? (e.g. vulnerability state, threat intel, CMDB, etc.)

Slow rate of Investigation
• Technology issue:
  • Access to endpoint or network forensic data?
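
One way to turn these questions into a repeatable check is sketched below in Python. It computes the share of alarms that were qualified and picks the slowest stage, then maps each finding to the matching question from the slide. The 20% cut-off, the stage names and the sample figures are hypothetical, chosen only to make the example run.

```python
def qualification_rate(alarms):
    """Share of alarms that analysts qualified as real threats."""
    if not alarms:
        return 0.0
    return sum(1 for a in alarms if a["qualified"]) / len(alarms)


def slowest_stage(mean_minutes):
    """Name of the stage with the highest mean latency."""
    return max(mean_minutes, key=mean_minutes.get)


def suggest_focus(alarms, mean_minutes, min_rate=0.2):
    """Map measurements to the slide's questions (min_rate is an assumption)."""
    hints = []
    if qualification_rate(alarms) < min_rate:
        hints.append("High # of alarms, few qualified: "
                     "tune analytics to draw down false positives")
    stage = slowest_stage(mean_minutes)
    if stage == "qualification":
        hints.append("Slow qualification: check access to, and automated "
                     "collection of, contextual data")
    elif stage == "investigation":
        hints.append("Slow investigation: check access to endpoint or "
                     "network forensic data")
    return hints


# Hypothetical week: 10 alarms, 1 qualified; qualification is the slowest stage.
alarms = [{"qualified": i == 0} for i in range(10)]
mean_minutes = {"triage": 10, "qualification": 45, "investigation": 30}

hints = suggest_focus(alarms, mean_minutes)
for hint in hints:
    print(hint)
```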
Takeaways

1. True AI isn’t here yet (for network security)
2. ML/AI is not a silver bullet
3. Both Scenario + behaviour-based approaches are required for analytics in depth
4. Security operations should be measured for bottlenecks, not frequencies
QUESTIONS?
This training content (“content”) is provided to you without warranty, “as is” and “with all
faults”. ISACA makes no representations or warranties express or implied, including
those of merchantability, fitness for a particular purpose or performance, and non-
infringement, all of which are hereby expressly disclaimed.

You assume the entire risk for the use of the content and acknowledge that: ISACA has
designed the content primarily as an educational resource for IT professionals and
therefore the content should not be deemed either to set forth all appropriate
procedures, tests, or controls or to suggest that other procedures, tests, or controls that
are not included may not be appropriate; ISACA does not claim that use of the content
will assure a successful outcome and you are responsible for applying professional
judgement to the specific circumstances presented to determine the appropriate
procedures, tests, or controls.

Copyright © 2018 by the Information Systems Audit and Control Association, Inc. (ISACA). All rights reserved. This webinar may not be used, copied, reproduced,
modified, distributed, displayed, stored in a retrieval system, or transmitted in any form by any means (electronic, mechanical, photocopying, recording or otherwise).
THANK YOU FOR ATTENDING THIS WEBINAR
