
14/04/2020 Lessons learned from the Microsoft SOC—Part 1: Organization - Microsoft Security

February 21, 2019

Lessons learned from the Microsoft SOC—Part 1: Organization

Mark Simos Lead Cybersecurity Architect, Cybersecurity Solutions Group


John Dellinger Chief Security Advisor, Cybersecurity Solutions Group
Kristina Senior Director, SOC and IR, Digital Security & Risk Engineering

We’re frequently asked how we operate our Security Operations Center (SOC) at
Microsoft (particularly as organizations are integrating cloud into their enterprise
estate). This is the first in a three-part blog series designed to share our approach
and experience, so you can use what we learned to improve your SOC.

In Part 1: Organization, we start with the critical organizational aspects
(organizational purpose, culture, and metrics). In Part 2: People (Part 2a and 2b), we
cover how we manage our most valuable resource: human talent. Finally, Part 3:
Technology covers the technology that enables these people to accomplish their
mission.

Overall SOC model

Microsoft has multiple security operations teams that each have specialized
knowledge to protect the different technical environments at Microsoft. We use a
“fusion center” model with a shared operating floor, which we call our Cyber
Defense Operations Center (CDOC), to increase collaboration and facilitate rapid
communication among these teams. Each team operates according to the specific
needs of its environment.

In this three-part series, we focus on the operation of our corporate IT SOC team, as
it most closely reflects the challenges and approaches of our customers: many users
and endpoints, email attack vectors, and a hybrid of on-premises and cloud assets. In
addition, we include a few lessons learned from the other SOCs and our Detection
and Response Team (DART), which helps our customers respond to major incidents.

This SOC operates with three tiers of analysts plus automation as seen in Figure 1
below. (We’ll provide more details in Part 2: People.)

Figure 1. SOC analyst tiers plus automation.

The tooling in the SOC (Figure 2) is a mixture of centralized breadth capabilities and
specialized tools to enable high-quality alerts and an end-to-end investigation and
remediation experience. (Part 3: Technology will provide more details.)

Figure 2. SOC tooling.

Like all things in security, our SOC has evolved considerably over the years to its
current state and will continue to evolve. We recently noticed that our SOC had
sustained more than 100 percent growth in incidents handled over the past three
years with a nearly flat staffing level, which means the number of incidents handled
per analyst roughly doubled. While we don’t know if we can expect this astounding
trend to continue in the future, it validates that we are on the right track and should
share our learnings.

SOC organizational purpose

The first element we cover is the value of the SOC in the context of the overall
mission and risk of the organization. As with the traditional incarnations of crime
and espionage, we don’t expect there will be a straightforward “solution” to
cyberattacks. A SOC is often a crucial risk mitigation investment for an enterprise
because it is core to limiting how much time and access attackers have in the
organization. This ultimately increases the attacker’s cost and decreases the benefit,
which damages their return on investment (ROI) and motivation for attacking your
organization. Everything in the SOC should be oriented toward limiting the time
and access attackers can gain to the organization’s assets in an attack, which
mitigates business risk.

At Microsoft, our SOCs bear not just the responsibility of reducing risk to our
employees and investors, but also the weight of the trust that millions of customers
accessing our cloud services and products put in us.

We’ve learned that the SOC has four primary functional integration points with the
business:


Business context (to the SOC)—The SOC needs to understand what is
most important to the organization so the team can apply that context to
fluid real-time security situations. What would have the most negative
impact on the business? Downtime of critical systems? A loss of reputation
and customer trust? Disclosure of sensitive data? Tampering with critical
data or systems? We’ve learned it’s critical that key leaders and staff in the
SOC understand this context as they wade through the continuous flood of
information, triage incidents, and prioritize their time, attention, and
effort.

Joint practice exercises (with the SOC)—Business leaders should regularly
join the SOC in practicing response to major incidents. This builds the
muscle memory and relationships that are critical to fast and effective
decision making in the high pressure of real incidents, reducing
organizational risk. This practice also reduces risk by exposing gaps and
assumptions in the process that can be fixed prior to a real incident.

Major incident updates (from the SOC)—The SOC should provide
updates to business stakeholders for major incidents as they happen. This
allows business leaders to understand their risk and take both proactive and
reactive steps to manage that risk. For more learnings on major incidents from
DART, see the incident response reference guide.

Business intelligence (from the SOC)—Sometimes the SOC finds that
adversaries are targeting a system or data set that nobody expected to be a
target. As the SOC discovers the targets of attacks, it should share these with
business leaders, as these signals may trigger insight for them (for example,
outside awareness of a secret business initiative or the relative value of an
overlooked data set).

SOC culture

If you take one thing away from this post, it’s that the SOC culture is just as
important as the individuals you hire and the tools you use. Culture guides
countless decisions each day by establishing what the right answer looks and feels
like in ambiguous situations, which are plentiful in a SOC.


Our cultural elements focus on people, teamwork, and continuous learning, and
include these lessons:

Use your human talent wisely—Our people are the most valuable asset
we have in the SOC, and we can’t afford to waste their time on repetitive,
thoughtless tasks that can be automated. To combat the human threats we
face, we need knowledgeable and well-equipped humans who can apply
expertise, judgement, and creative thinking. This human factor affects
almost every aspect of SOC operations, including how we use tools and
automation to empower humans to do more (rather than replace them) and to
reduce toil on our analysts. (More on this topic in Part 2: People.)

Teamwork—We’ve learned that we can’t tolerate the “lone hero” mindset
in the SOC; nobody is as smart as all of us together. Teamwork makes a
high-pressure working environment like the SOC much more fun
and productive when everyone knows they’re on the same team and
everyone has each other’s back. We design our processes and tools to
divide up tasks into specialties and to encourage people to share insights,
coordinate and check each other’s work, and constantly learn from each
other.

Shift left mindset—To get and stay ahead of cybercriminals and hackers
who constantly evolve their techniques, we must continuously improve and
shift our activities “left” in the attack timeline. We focus on speed and
efficiency to try to get “faster than the speed of attack” by looking at ways
we could have detected attacks earlier and responded more quickly. This
principle is effectively an application of a continuous learning “growth
mindset” that keeps the team laser-focused on reducing risk for our
organization and our customers.
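
One way to make the "shift left" idea measurable is to record, in each post-incident review, when attacker activity began and when it was first detected, and then watch how that gap trends over time. The sketch below is illustrative only: the `IncidentReview` record, its field names, and the sample timestamps are assumptions made for this example, not something described in this post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class IncidentReview:
    """Hypothetical post-incident review record (field names are assumptions)."""
    incident_id: str
    first_attacker_activity: datetime  # earliest activity found in hindsight
    detected_at: datetime              # when the SOC first detected the attack


def detection_delay(review):
    """Time the attacker was active before the SOC detected the attack."""
    return review.detected_at - review.first_attacker_activity


def median_detection_delay(reviews):
    """Median delay across reviews; a falling value suggests detection is moving left."""
    delays = [detection_delay(r).total_seconds() for r in reviews]
    return timedelta(seconds=median(delays))


# Made-up sample data for illustration only.
reviews = [
    IncidentReview("INC-1", datetime(2019, 1, 7, 9, 0), datetime(2019, 1, 7, 13, 0)),
    IncidentReview("INC-2", datetime(2019, 1, 9, 22, 0), datetime(2019, 1, 10, 0, 0)),
    IncidentReview("INC-3", datetime(2019, 1, 15, 8, 30), datetime(2019, 1, 15, 9, 30)),
]
print(median_detection_delay(reviews))
```

The median is used here rather than the mean so that a single long-running intrusion does not dominate the trend; either choice is a judgment call.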

SOC metrics

The final organizational element is how we measure success, a critical element to
get right. Metrics translate culture into clear measurable objectives and have a
powerful influence on shaping people’s behavior. We’ve learned that it’s critical to
consider both what you measure and the way that you focus on and enforce
those metrics. We measure several indicators of success in the SOC, but we always
recognize that the SOC’s job is to manage significant variables that are out of our
direct control (attacks, attackers, etc.). We view deviations primarily as a learning
opportunity for process or tool improvement rather than a failure on the part of the
SOC to meet a goal.

These are the metrics we track, trend, and report on:

Time to acknowledge (TTA)—Responsiveness is one of the few elements
the SOC has direct control over. We measure the time between an alert
being raised (“light starts to blink”) and when an analyst acknowledges that
alert and begins the investigation. Improving this responsiveness requires
that analysts don’t waste time investigating false positives while another
true positive alert sits waiting. We achieve this with ruthless prioritization.
Any alert that requires an analyst response must have a track record of 90
percent true positives. We’ll talk more about the technology we use in Part 3:
Technology and will describe our use of “cold path” activities like proactive
hunting to supplement the “hot path” of alerts in Part 2: People.

Time to remediate (TTR)—Like many SOCs, we track the time to
remediate an incident to ensure we’re limiting the time attackers have
access to our environment, which drives effectiveness and efficiencies in our
SOC processes and tools.

Incidents remediated (manually/with automation)—We measure how
many incidents are remediated manually and how many are resolved with
automation. This ensures our staffing levels are appropriate and measures
the effectiveness of our automation technology.

Escalations between each tier—We track how many incidents are escalated
between tiers to ensure we accurately capture the workload for each tier.
For example, we need to ensure that Tier 1 work on an escalated incident
isn’t fully attributed to Tier 2.
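
The metrics above can be computed from fairly simple incident records. The following is a minimal sketch, assuming each incident stores when its alert was raised, acknowledged, and remediated, whether automation resolved it, and which tiers worked on it. The `Incident` type, its field names, the helper functions, and the sample data are all hypothetical and do not represent Microsoft's tooling. TTR is measured here from alert to remediation, which is also an assumption. Only the 90 percent true positive bar comes from the text.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    """Hypothetical incident record; all field names are assumptions."""
    raised_at: datetime        # alert raised ("light starts to blink")
    acknowledged_at: datetime  # analyst acknowledges and begins investigating
    remediated_at: datetime    # incident remediated
    automated: bool            # True if resolved with automation
    tiers: list = field(default_factory=list)  # analyst tiers involved, in order


def mean_tta(incidents):
    """Mean time to acknowledge (alert raised -> analyst acknowledgement)."""
    return timedelta(seconds=mean(
        [(i.acknowledged_at - i.raised_at).total_seconds() for i in incidents]))


def mean_ttr(incidents):
    """Mean time to remediate, measured here from alert raised (an assumption)."""
    return timedelta(seconds=mean(
        [(i.remediated_at - i.raised_at).total_seconds() for i in incidents]))


def automation_share(incidents):
    """Fraction of incidents resolved with automation."""
    return sum(i.automated for i in incidents) / len(incidents)


def escalations(incidents):
    """Count escalations as (from_tier, to_tier) pairs."""
    return Counter(pair for i in incidents for pair in zip(i.tiers, i.tiers[1:]))


def tier_workload(incidents):
    """Credit every tier that worked an incident, not only the last one."""
    return Counter(tier for i in incidents for tier in i.tiers)


def meets_true_positive_bar(true_positives, total_alerts, bar=0.90):
    """Check an alert source's track record against the 90 percent bar."""
    return total_alerts > 0 and true_positives / total_alerts >= bar


# Made-up sample data for illustration only.
t0 = datetime(2019, 2, 21, 9, 0)
minutes = lambda n: timedelta(minutes=n)
incidents = [
    Incident(t0, t0 + minutes(5), t0 + minutes(60), automated=True),
    Incident(t0, t0 + minutes(15), t0 + minutes(120), automated=False, tiers=[1, 2]),
    Incident(t0, t0 + minutes(10), t0 + minutes(180), automated=False, tiers=[1, 2, 3]),
]
print("Mean TTA:", mean_tta(incidents))
print("Mean TTR:", mean_ttr(incidents))
print("Automation share:", round(automation_share(incidents), 2))
print("Escalations:", dict(escalations(incidents)))
print("Tier workload:", dict(tier_workload(incidents)))
```

Counting every tier that worked on an incident, as `tier_workload` does, is one simple way to avoid attributing Tier 1 effort on an escalated incident entirely to Tier 2.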

Get started


Our biggest recommendation for the SOC organization is to define the culture you
want to inculcate. This will shape your team and attract the talent you want. In the
coming weeks, we’ll share our philosophy on managing people, career paths, skills,
and readiness, and what tools we use to enable our people to accomplish their
mission. In the meantime, head over to the CISO series to learn more.

Read more from this series

Lessons learned from the Microsoft SOC—Part 2a: Organizing people

Lessons learned from the Microsoft SOC—Part 2b: Career paths and readiness

Filed under:
Automation, CISO series, Cybersecurity, Endpoint security, Microsoft 365,
Microsoft Cloud App Security


https://www.microsoft.com/security/blog/2019/02/21/lessons-learned-from-the-microsoft-soc-part-1-organization/
