You are on page 1of 10

14/04/2020 Lessons learned from the Microsoft SOC—Part 2a: Organizing people

|
Security
Solutions All Microsoft

Products
Search 
Operations & Intelligence
Sign in

Partners
April 23, 2019

Resources
Lessons learned from the Microsoft SOC—Part 2a:
Organizing people
Trust Center 

Mark Simos Lead Cybersecurity Architect, Cybersecurity Solutions Group


Kristina Senior Director, SOC and IR, Digital Security & Risk Engineering
John Dellinger Chief Security Advisor, Cybersecurity Solutions Group

In the second post in our series, we focus on the most valuable resource in the
security operations center (SOC)—our people. This series is designed to share our
approach and experience with operations, so you can use what we learned to
improve your SOC. In Part 1: Organization, we covered the SOC’s organizational
role and mission, culture, and metrics.

The lessons in the series come primarily from Microsoft’s corporate IT security
operation team, one of several specialized teams in the Microsoft Cyber Defense
Operations Center (CDOC). We also include lessons our Detection and Response
Team (DART) have learned helping our customers respond to major incidents.

People are the most valuable asset in the SOC—their experience, skill, insight,
creativity, and resourcefulness are what makes our SOC effective. Our SOC
management team spends a lot of time thinking about how to ensure our people
are set up with what they need to succeed and stay engaged. As we’ve improved

https://www.microsoft.com/security/blog/2019/04/23/lessons-learned-microsoft-soc-part-2-organizing-people/ 1/10
14/04/2020 Lessons learned from the Microsoft SOC—Part 2a: Organizing people

our processes, we’ve been able to decrease the time it takes to ramp people up and
increase employee enjoyment of their jobs.

Today, we cover the first two aspects of how to set up people in the SOC for
success:

Empower humans with automation.

Microsoft SOC teams and tiers model.

Empower humans with automation

Rapidly sorting out the signal (real detections) from the noise (false positives) in the
SOC requires investing in both humans and automation. We strongly believe in the
power of automation and technology to reduce human toil, but ultimately, we’re
dealing with human attack operators and human judgement is critical to the
process.

In our SOC, automation is not about using efficiency to remove humans from the
process—it is about empowering humans. We continuously think about how we
can automate repetitive tasks from the analyst’s job, so they can focus on the
complex problems that people are uniquely able to solve.

Automation empowers humans to do more in the SOC by increasing response


speed and capturing human expertise. The toil our staff experiences comes mostly
from repetitive tasks and repetitive tasks come from either attackers or defenders
doing the same things over and over. Repetitive tasks are ideal candidates for
automation.

We also found that we need to constantly refine the automation because attackers
are creative and persistent, constantly innovating to avoid detections and
preventive controls. When an effective attack method is identified (like phishing),
they exploit it until it stops working. But they also continually innovate new tactics
to evade defenses introduced by the cybersecurity community. Given the profit

https://www.microsoft.com/security/blog/2019/04/23/lessons-learned-microsoft-soc-part-2-organizing-people/ 2/10
14/04/2020 Lessons learned from the Microsoft SOC—Part 2a: Organizing people

potential of attacks, we expect the challenges of evolving attacks to continue for


the foreseeable future.

When repetitive and boring work is automated, analysts can apply more of their
creative minds and energy to solving the new problems that attackers present to
them and proactively hunting for attackers that got past the first lines of defense.
We’ll discuss areas where we use automation and machine learning in “Part 3:
Technology.”

Microsoft SOC teams and tiers model

At Microsoft, we organized our SOC into specialized teams, allowing them to better
develop and apply deep expertise, which supports the overall goals of reducing
time to acknowledge and remediate.

This diagram represents the key SOC functions: threat intelligence, incident
management, and SOC analyst tiers:

Threat intelligence—We have several threat intelligence teams at Microsoft that


support the SOC and other business functions. Their role is to both inform business
stakeholders of risk and provide technical support for incident investigations,

https://www.microsoft.com/security/blog/2019/04/23/lessons-learned-microsoft-soc-part-2-organizing-people/ 3/10
14/04/2020 Lessons learned from the Microsoft SOC—Part 2a: Organizing people

hunting operations, and defensive measures for known threats. These strategic
(business) and tactical (technical) intelligence goals are related but distinctly
different from each other. We task different teams for each goal and ensure
processes are in place (such as daily standup meetings) to keep them in close
contact.

Incident management—Enterprise-wide coordination of incidents, impact


assessment, and related tasks are handled by dedicated personnel separate from
technical analyst teams. At Microsoft, these incident response teams work with the
SOC and business stakeholders to coordinate actions that may impact services or
business units. Additionally, this team brings in legal, compliance, and privacy
experts as needed to consult and advise on actions regarding regulatory aspects of
incidents. This is particularly important at Microsoft because we’re compliant with a
large number of international standards and regulations.

SOC analyst tiers—This three-tier model for SOC analysts will probably look
familiar to seasoned SOC professionals, though there are some subtleties in our
model we don’t see widely in the industry.

Our organization uses the term hot path and cold path to describe how we
discover adversaries and optimize processes to handle them.

https://www.microsoft.com/security/blog/2019/04/23/lessons-learned-microsoft-soc-part-2-organizing-people/ 4/10
14/04/2020 Lessons learned from the Microsoft SOC—Part 2a: Organizing people

Hot path—Reflects detection of confirmed active attack activity that must


be investigated and remediated as soon as possible. Managing and
remediating these incidents are primarily handled by Tier 1 and Tier 2,
though a small percentage (about 4 percent) are escalated to Tier 3.
Automation of investigations and remediations are also beginning to help
reduce hot path workloads.

Cold path—Refers to all other activities including proactively hunting for


adversary campaigns that haven’t triggered a hot path alert.

Roles and functions of the SOC analyst tiers

Tier 1—This team is the primary front line for and focuses on high-speed
remediation over a large volume of incidents. Tier 1 analysts respond to a very
specific set of alert sources and follow prescriptive instructions to investigate,
remediate, and document the incidents. The rule of thumb for alerts that Tier 1
handles is that it can be typically remediated within seconds to minutes. The
incidents will be escalated to Tier 2 if the incident isn’t covered by a documented
Tier 1 procedure or it requires involved/advanced remediation (for example, device
isolation and cleanup).

In addition:

The Tier 1 function is currently performed by full-time employees in our


corporate IT SOC. In the past and in other teams at Microsoft, we staffed
contracted employees or managed service agreements for Tier 1 functions.

A current initiative for the full-time employee Tier 1 team is to increase the
use of automated investigation and remediation for these incidents. One
goal of this initiative is to grow the skills of our current Tier 1 employees, so
they can shift to proactive work in other security assignments in SOC or
across the company.

Tier 1 (and Tier 2) SOC analysts may stay involved with an escalated incident
until it is remediated. This helps preserve context during and after
transferring ownership of an incident and also accelerates their learning and
skills growth.

https://www.microsoft.com/security/blog/2019/04/23/lessons-learned-microsoft-soc-part-2-organizing-people/ 5/10
14/04/2020 Lessons learned from the Microsoft SOC—Part 2a: Organizing people

The typical ratio of alert volumes is noted in the Tiers and Tools diagram
above. (We’ll share more details in “Part 3: Technology.”)

Tier 2—This team is focused on incidents that require deeper analysis and
remediation. Many Tier 2 incidents have been escalated from Tier 1 analysts, but
Tier 2 also directly monitors alerts for sensitive assets and known attacker
campaigns. These incidents are usually more complex and require an approach that
is still structured, but much more flexible than Tier 1 procedures. Additionally, some
Tier 2 analysts also proactively hunt for adversaries (typically using lower priority
alerts from the same Microsoft Threat Protection tools they use to manage reactive
incidents).

Tier 3—This team is focused primarily on advanced hunting and sophisticated


analysis to identify anomalies that may indicate advanced adversaries. Most
incidents are remediated at Tiers 1 and 2 (96 percent) and only unprecedented
findings or deviations from norms are escalated to Tier 3 teams. Tier 3 team
members have a high degree of freedom to bring their different skills,
backgrounds, and approaches to the goal of ferreting out red team/hidden
adversaries. Tier 3 team members have backgrounds as security professionals, data
scientists, intelligence analysts, and more. These teams use different tools
(Microsoft, custom, and third-party) to sift through a number of different datasets
to uncover hidden adversary activity. A favorite of many analysts is the use of Kusto
Query Language (KQL) queries across Microsoft Threat Protection tool datasets.

The structure of Tier 3 has changed over time, but has recently gravitated to four
different functions:

Major incident engineering—Handles escalation of incidents from Tier 2.


These virtual teams are created as needed to support the duration of the
incident and often include both reactive investigations, as well as proactive
hunting for additional adversary presence.

External adversary research and threat analysis—Focuses on hunting for


adversaries using existing tools and data sources, as well as signals from
various external intelligence sources. The team is focused on both hunting

https://www.microsoft.com/security/blog/2019/04/23/lessons-learned-microsoft-soc-part-2-organizing-people/ 6/10
14/04/2020 Lessons learned from the Microsoft SOC—Part 2a: Organizing people

for undiscovered adversaries as well as creating and refining alerts and


automation.

Purple team operations—A temporary duty assignment where Tier 3


analysts (blue team) are paired with our in-house attack team members (red
team) as they perform authorized attacks. We found this purple (red+blue)
activity results in high-value learning by both teams, strengthening our
overall security posture and resilience. This team is also responsible for the
critical task of coordinating with red team to deconflict whether a detection
is an authorized red team or a live attacker. At customer organizations,
we’ve seen failure to deconflict red team activity result in our DART teams
flying onsite to respond to a false alarm (an avoidable, expensive, and
embarrassing mistake).

Future operations team—Focuses on future-proofing our technology and


processes by building and testing new capabilities.

Learn more

For more insights into Microsoft’s approach to using technology to empower


people, watch Ann Johnson’s keynote at RSA 2019 and download our poster. For
information on organizational culture and goals, read Lessons learned from the
Microsoft SOC—Part 1: Organization. In addition, see our CISO series to learn more.

Stayed tuned for the second segment in “Lessons learned from the Microsoft SOC
—Part 2,” where we’ll cover career paths and readiness programs for people in our
SOC. And finally, we’ll wrap up this series with “Part 3: Technology,” where we’ll
discuss the technology that enables our people to accomplish their mission.

For more discussion on some of these topics, see John and Kristina’s session
(starting at 1:05:48) at Microsoft’s recent Virtual Security Summit.

Read more from this series

Lessons learned from the Microsoft SOC—Part 1: Organization

https://www.microsoft.com/security/blog/2019/04/23/lessons-learned-microsoft-soc-part-2-organizing-people/ 7/10
14/04/2020 Lessons learned from the Microsoft SOC—Part 2a: Organizing people

Lessons learned from the Microsoft SOC Part 2b: Career paths and
readiness

Filed under:

Automation, CISO series, Threat protection

You may also like these articles

June 6, 2019 January 31, 2019 February 21, 2019

Lessons learned CISO series: Lessons learned


from the Talking from the
Microsoft SOC cybersecurity Microsoft SOC—
Part 2b: Career with the board Part 1:
paths and of directors Organization
readiness
To maintain a In the first of our
In our second post board’s confidence, three part series, we
about people—our you need to engage provide tips on how
most valuable them in your to manage a
resource in the SOC strategy early and security operations
—we talk about our often. center (SOC) to be
investments into more responsive,
readiness programs, Read more 
effective, and
career paths, and collaborative.
recruiting for
success. Read more 

https://www.microsoft.com/security/blog/2019/04/23/lessons-learned-microsoft-soc-part-2-organizing-people/ 8/10
14/04/2020 Lessons learned from the Microsoft SOC—Part 2a: Organizing people

Read more 

Get started with


Microsoft Security

Microsoft is a leader in
cybersecurity, and we embrace our
responsibility to make the world a
safer place.

LEARN MORE 

Get all the news, updates, and more at @MSFTSecurity

What's new Microsoft Education Enterprise Developer Company


Store
Microsoft 365 Microsoft in Azure Microsoft Visual Careers
Account profile education Studio
Surface Pro X AppSource About Microsoft
Download Office for Windows Dev
Surface Laptop 3 Center students Automotive Center Company news

Surface Pro 7 Microsoft Store Office 365 for Government Developer Privacy at
support schools Network Microsoft
Windows 10 apps Healthcare
https://www.microsoft.com/security/blog/2019/04/23/lessons-learned-microsoft-soc-part-2-organizing-people/ 9/10
14/04/2020 Lessons learned from the Microsoft SOC—Part 2a: Organizing people

Returns Deals for TechNet Investors


Manufacturing
students &
Order tracking parents Microsoft Diversity and
Financial services
developer inclusion
Store locations Microsoft Azure program
Retail
in education Accessibility
Buy online, pick Channel 9
up in store Security
Office Dev
In-store events Center

Microsoft
Garage

 English (United States)

Sitemap Contact Microsoft Privacy & cookies Terms of use Trademarks Safety & eco About our ads
© Microsoft 2020

https://www.microsoft.com/security/blog/2019/04/23/lessons-learned-microsoft-soc-part-2-organizing-people/ 10/10

You might also like