
United States

Department of Energy
National Nuclear Security Administration
International Nuclear Security

Advanced Testing of Physical Security Systems through AI/ML
Polling Questions
1. How applicable to physical security is AI/ML?
Learning Objectives
• Name four methods for physical protection analysis
• Name the two limitations where AI/ML can provide a benefit over
current analyses
• Name one ASD tool currently used
• Name an AI method that can learn to develop strategies
• Describe two limitations to AI/ML-based approaches using
reinforcement learning
Brief Refresh
AI/ML includes methods to analyze data for a specific purpose:
• Classification

• Pattern analysis

• Optimization (efficacy of strategy)


Machine Learning
• Three paradigms
◦ Supervised Learning
 Predictions have clear definitions, e.g., “Is Hostile” vs. “Is Not Hostile”
 Requires labeled training data for algorithm to learn
 Example: an intelligent thermostat makes an empirical fit of temperatures in a facility
◦ Unsupervised Learning
 Identifies structures and patterns in the data
 Useful for identifying groups and outliers
 Example: product marketing
◦ Reinforcement Learning
 Maps environments to actions to maximize a reward signal
 Requires no labeled training data
 Applicable to complex, real-world tasks like physical security
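The supervised-learning paradigm above can be made concrete with a minimal sketch: a one-nearest-neighbor classifier that learns "Is Hostile" vs. "Is Not Hostile" from labeled examples. The feature values and labels here are illustrative assumptions, not real sensor data.

```python
# Minimal sketch of supervised learning: a 1-nearest-neighbor classifier
# trained on labeled examples. Features and labels are illustrative only.

def nearest_neighbor_predict(train, query):
    """Return the label of the training point closest to `query`.

    `train` is a list of ((features...), label) pairs; distance is Euclidean.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(train, key=lambda pair: dist(pair[0], query))[1]

# Labeled training data: (speed_m_s, distance_to_fence_m) -> label
train = [
    ((0.5, 90.0), "Is Not Hostile"),
    ((1.0, 80.0), "Is Not Hostile"),
    ((6.0, 10.0), "Is Hostile"),
    ((7.5, 5.0), "Is Hostile"),
]

print(nearest_neighbor_predict(train, (6.5, 8.0)))   # near the hostile examples
print(nearest_neighbor_predict(train, (0.8, 85.0)))  # near the benign examples
```

Note the defining feature of the paradigm: the algorithm needs labeled training data, exactly as the bullet above states.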
Brief Refresh
A security system is defined here to include:

• Sensors: Audible, Cyber security, Door access, Guards, Infrared, LIDAR, Motion, Video
• Layout: Doors, Fences, Materials, Walls
• Procedures: Human Resources, Operations, Training
Security System Design and Evaluation

[Figure: design and evaluation cycle with stages: Identification of critical assets; Identification of threats; Assessment of security posture; Probabilistic ranking of deficiencies; Creation of operating procedures]
Design Basis Threat
• Design Basis Threat (DBT) describes the capabilities of potential adversaries
who might attempt removal of material or sabotage
• Includes external and internal adversaries
◦ Terrorists, criminals
◦ Authorized individuals who commit or aid malicious acts
• Developed by regulatory bodies, competent authorities and operators
• DBT provides the basis for security evaluation
◦ Effectiveness of security is measured against ability to protect against DBT

https://www.iaea.org/topics/security-of-nuclear-and-other-radioactive-material/design-basis-threat
Design Basis Threat
Discussion:
How does DBT get defined?
◦ What goes into process?
 External Physical Threat, Insider Threat
 Theft, sabotage
 New technologies, e.g., drones, cyber
How does DBT evolve?
◦ What type of data is used?
◦ Use of Open Source Intelligence (OSINT)?
Design Basis Threat Development
• Collect and analyze threat information
◦ Intelligence
◦ Past security events at facility
◦ Past security events at similar facilities
• Evaluate credibility of threat information
• Identify potential adversaries
◦ Characteristics, capabilities and attributes
◦ Relevance of capabilities to specific targets

Red Team Exercise (Y0900658)

https://www.iaea.org/topics/security-of-nuclear-and-other-radioactive-material/design-basis-threat
Evaluation of Security
• Administrative ($): low value, wide coverage
• Tabletop Exercises ($$)
• Live-Action Drills ($$$)
• Red Team Exercises ($$$$): high value, specific coverage

Red Team Exercise (Y0900658)
Polling Questions
2. At what level would you expect AI/ML
could make the largest impact?
Administrative
• Security can be evaluated by adherence to several security documents that
define the intended function and expected outcomes of the security
infrastructure
◦ Policies & Procedures
◦ Standards
◦ Response plans, performance metrics
◦ Security Concept of Operations
• Administrative security documents can be a strong indicator of security culture
Tabletop Exercises (TTX)
• Evaluate potential security incident scenario
◦ Simulation of a facility layout
◦ Detailed information on spatial or temporal scales is not required
• Participants gain situational awareness
◦ Explore effectiveness of response and infrastructure
• Security domain experts address vulnerabilities
◦ Taking adversary perspective identifies possible exploits
◦ Identify feasible mitigation solutions
Sandia LabNews. January 02, 2020
Live-Action Drills
• Coordinated, supervised, real-time
exercise simulations of security
scenarios:
◦ Exercise a single, specific operation
or function
◦ Exercise the coordination, command,
and control among various functions
◦ Full-scale exercise across all
functions and groups
• Often intended to probe response
to security incident

KnoxNews Sentinel. October 24, 2014


Red Teaming
• Physical penetration testing
◦ Provides real-world explorations of physical security effectiveness
• Red teams are highly trained to infiltrate various secure environments
◦ Provide a sophisticated intruder’s perspective
• Will employ attacks across a variety of vectors
◦ Responders have no prior knowledge of method or timing
◦ Broader scope than live-action drill, attack pathways are unknown to responders
• Objective: Uncover unanticipated vulnerabilities over the entire threat
surface
◦ Comprehensive picture of the level of security at a facility
Challenges, Costs and Logistics
• Rely on human experts to explore the space of possible attack vectors
• Labor and time constraints result in high costs
• Logistical constraints
◦ Security experts scheduled away from regular functions
◦ Sections of facility closed for live-action training
• New technologies rapidly expanding feasible attack vectors
◦ E.g., drones, access to open source intelligence, cyber-physical attacks
◦ Require specialized expertise
• Not feasible for human experts to exhaust all possible attacks or novel vectors
◦ Complex attack surface leads to overlooked threats in traditional security assessments
AI/ML Augmented Analysis
1. Meeting challenges, costs and logistics
a. Analysis personnel, funding, and time are limited resources
b. Optimization of the analysis process using data is a method for improvement
2. AI/ML methods are force multipliers
a. Analyses generate significant amounts of data that humans cannot process well
b. Applying AI/ML tools to existing and future analyses provides additional insights
c. New technological advancements can be rapidly simulated using tools
Applications of AI/ML to Security Analysis
1. Optimization methods for low level tasks
a. Sensor placement for maximum area coverage
b. Sensor selection to minimize false alarms
c. Randomization of guard routes
d. Mechanical engineering analysis for materials selection of walls and fences
2. More advanced
a. Facility layout with Adversary Sequence Diagrams
b. Simulation of force on force interactions
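The sensor-placement optimization in item 1a can be sketched as a greedy coverage heuristic: repeatedly pick the sensor that covers the most zones not yet covered. The candidate locations and coverage sets below are hypothetical.

```python
# Greedy heuristic for sensor placement (maximum area coverage).
# Candidate sensor locations and the zones each covers are hypothetical.

def greedy_placement(coverage, budget):
    """Pick up to `budget` sensors, each time taking the one that covers
    the most zones not yet covered (classic greedy set-cover heuristic)."""
    covered, chosen = set(), []
    for _ in range(budget):
        best = max(coverage, key=lambda s: len(coverage[s] - covered))
        if not (coverage[best] - covered):
            break  # no remaining sensor adds new coverage
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered

coverage = {
    "cam_north": {"zone1", "zone2"},
    "cam_gate":  {"zone2", "zone3", "zone4"},
    "ir_fence":  {"zone4", "zone5"},
}
chosen, covered = greedy_placement(coverage, budget=2)
print(chosen, sorted(covered))
```

The greedy heuristic is not guaranteed optimal, but it is a standard, well-understood baseline for coverage problems of this kind.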
Pathway Analysis
• Quantitative analysis of adversary actions, also known as Adversary Sequence Diagrams (ASD)
◦ Complements expert evaluations in TTX
◦ Used to determine the most vulnerable path for a given threat
◦ Variations of ML can perform these tasks
• Adversary accomplishes goal by moving along a path, both physical and temporal, while defeating the various aspects of physical security encountered.
• Many metrics may be recorded for AI/ML analysis later

[Figure: adversary path from Off Site through the Site Security Area and Controlled Building to the Adversary Target]
Adversary Sequence Diagram
• Pathways define transitions between each adjacent physical area
◦ Path elements are assigned a probability of detection (PD) and time delay (T) associated with each layer of security
• Overall result is a quantitative value of the total PD and estimated T across all segments
• Optimization via AI/ML would revise the facility design using analysis results

[Figure: adversary path with Path Element: Fence (PD = 0.4, T = 45s) at the Site Security Area and Path Element: Door (PD = 0.7, T = 30s) at the Controlled Building, ending at the Adversary Target]
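The total PD and total delay described above follow directly from the per-element values. A minimal sketch, using the fence and door numbers from the diagram and assuming detections at each element are independent:

```python
# Aggregate per-element detection probabilities (PD) and delays (T) along
# one adversary path, as in an Adversary Sequence Diagram.
# Assumes detection events at each element are independent.

def path_metrics(elements):
    """elements: list of (name, PD, delay_seconds) tuples."""
    p_no_detect = 1.0
    total_delay = 0.0
    for _, pd, t in elements:
        p_no_detect *= (1.0 - pd)
        total_delay += t
    return 1.0 - p_no_detect, total_delay

path = [("fence", 0.4, 45), ("door", 0.7, 30)]
total_pd, total_t = path_metrics(path)
print(f"total PD = {total_pd:.2f}, total delay = {total_t:.0f}s")
```

For this path the total PD is 1 - (1 - 0.4)(1 - 0.7) = 0.82 with a 75 s cumulative delay; real ASD tools layer additional detail (e.g., response timelines) on top of this basic aggregation.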
Security Assessments Tools & Methodology
• Pathway Analysis
◦ Quantitative analysis of adversary actions, also known as Adversary Sequence Diagrams
(ASD)
◦ ASD perspective: Adversary accomplishes goal by moving along a path, both physical and
temporal, while defeating the various aspects of physical security encountered.
◦ Quantitative assessment to complement expert evaluations in TTX
• Tools allow for automation and standard approaches
ASD Tools
• Sandia Physical Security Handbook *
◦ Evaluated times for various attack scenarios (e.g. breaching a wall)
• ASSESS
◦ Defines 2D area with features from handbook
◦ Automates attack sequence with respect to time
◦ Determines most probable methods of attack
◦ Provides table of neutralization probabilities
and time durations
◦ Typically used in lead up to exercise

*Physical Protection Technology for Nuclear Power Plants https://www.osti.gov/servlets/purl/10177423


https://www.osti.gov/servlets/purl/12261761
ASD Tools
• Scribe3D (DOE license – Sandia)
◦ Build and run TTXs in a video game-like
environment
 Based on gaming engine with advanced AI
capabilities
◦ Allows for more complex and immersive
scenarios than traditional TTX
 Inject realistic time delays, terrains, and
weapons systems
◦ Calculates Probability to Kill (PK) for an engagement
◦ Recording/replay for detailed after-action review
◦ Current AI is limited to agents moving along waypoints; more capable AI is in development

Scribe3D
ASD Tools
• Simajin (Commercial - Rhinocorps)
◦ Rhinocorps
◦ Build physical facility layouts and
process simulations
◦ Allows for detailed modeling of
physical environment
◦ Can simulate processes and
events over time
 Automate changes to the physical landscape and events, e.g., a radiological incident

RhinoCorps Simajin
ASD Tools
• Vanguard (Commercial - Rhinocorps)
◦ Build and run TTXs in a video
game-like environment
◦ Automated simulation of attack,
response and outcomes
◦ Built on top of Simajin
◦ Includes support for decision tree
analysis
◦ Generates significant amounts of data, providing a basis for AI/ML analysis to extend capabilities in the near term

RhinoCorps Vanguard
AI-Augmented Security Assessments
• AI: Methods that approximate human behavior
◦ ML: Algorithms that use data to learn patterns that are not explicitly programmed and improve through experience (improves PD)
• Intelligent automation reduces time on routine tasks
◦ Access control: Facial recognition speeds up identification
 Authorized access can be more efficiently managed
◦ Alarm adjudication: Reduce nuisance alarms at Central Alarm Station (CAS)
 CAS operator can better focus on alarms from actual threats
• Intelligent automation allows for larger threat spaces to be explored
◦ Attack simulation: All possible vectors can be considered
AI Physical Security: Offensive Testing
• Complex attack surfaces
• AI red teaming
◦ Model of the physical system
• Exhaustive with fewer logistics
◦ Through AI millions of attack simulations can
be automated to virtually test all possible
attack vectors
◦ Results on attack vectors or novel methods
easily aggregated for analysts
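The exhaustive virtual testing described above can be sketched as brute-force enumeration of every adversary path through a facility graph, ranking them by cumulative detection probability. The layout and per-transition PD values below are hypothetical.

```python
# Brute-force enumeration of all simple adversary paths through a small
# facility graph, ranked by cumulative probability of detection (PD).
# The facility layout and per-transition PD values are hypothetical.

def all_paths(graph, node, goal, visited=()):
    """Yield every simple path from `node` to `goal`."""
    visited = visited + (node,)
    if node == goal:
        yield visited
        return
    for nxt in graph.get(node, {}):
        if nxt not in visited:
            yield from all_paths(graph, nxt, goal, visited)

def cumulative_pd(graph, path):
    """Probability of at least one detection along the path (independent events)."""
    p_no_detect = 1.0
    for a, b in zip(path, path[1:]):
        p_no_detect *= 1.0 - graph[a][b]
    return 1.0 - p_no_detect

# node -> {neighbor: PD when crossing that transition}
graph = {
    "offsite":  {"fence": 0.4, "gate": 0.8},
    "fence":    {"building": 0.7},
    "gate":     {"building": 0.3},
    "building": {"target": 0.5},
}
paths = list(all_paths(graph, "offsite", "target"))
best = min(paths, key=lambda p: cumulative_pd(graph, p))  # most vulnerable path
print(best, round(cumulative_pd(graph, best), 3))
```

Real facilities have far larger graphs, which is exactly why automated simulation scales where human red teams cannot; the aggregation step here is the "results easily aggregated for analysts" point above.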
AI Physical Security: Offensive Testing
• Integrated cyber/physical attacks
◦ Multi-agent simulations can include cyber attack vectors
◦ Logistically difficult to simulate in traditional Red Team exercise
• Insider threat assessment
◦ Agent positioning and prior knowledge can represent insider capabilities
◦ Collusion of one or more insiders in real time aiding external threat
• Evaluation of new sensors
◦ Sensor capabilities directly added to physical model prior to deployment
Reinforcement Learning (RL)
• Goal: Agent to learn to interact with environment in
ways that result in a higher reward signal
• Agent is knowledgeable of and fully able to control
its behavior
◦ May not be fully knowledgeable of, or able to control, the
environment
• Example: Agent may know that it is able to breach a
metal fence and execute that behavior.
◦ It does not know optimal place to make breach or result of
breach (e.g., presence of intrusion alarms)

RL Navigation Paths
P. Mirowski, et al. arXiv:1611.03673
Reinforcement Learning
• The agent’s interaction with the environment is defined by:
◦ Actions: The choices made by the agent
◦ States: Environmental and internal feedback that form the basis for the agent making choices
 For example, a state defined by the portion of a fence line currently visible to the agent (environmental), and that the agent is currently walking (internal)
◦ Rewards: The basis for evaluating the agent’s choices; the agent learns to make choices that increase the overall reward received
◦ Policy: A rule, or set of rules, by which the agent selects actions as a function of states

R. Sutton and A. Barto. “Introduction to Reinforcement Learning.” 2018.


Reinforcement Learning

Exercise: Navigating a Maze


Identify the key components for an RL
algorithm
◦ Recognize internal and external states
◦ Available actions
◦ Reward
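One way to work the exercise is to write the components down as code: a minimal maze environment exposing states (agent position), actions (moves), and reward (reaching the exit). The 3x3 layout is a toy assumption for illustration.

```python
# Minimal maze environment exposing the RL components named in the exercise:
# states (agent position), actions (moves), and reward (reaching the exit).
# The 3x3 layout, wall, and goal position are toy assumptions.

class Maze:
    ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

    def __init__(self):
        self.walls = {(1, 1)}   # blocked cell
        self.goal = (2, 2)      # exit
        self.state = (0, 0)     # external state: agent position

    def step(self, action):
        """Apply an action; return (next_state, reward, done)."""
        dr, dc = self.ACTIONS[action]
        r, c = self.state
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3 and (nr, nc) not in self.walls:
            self.state = (nr, nc)   # legal move; otherwise the agent stays put
        done = self.state == self.goal
        return self.state, (1.0 if done else 0.0), done

env = Maze()
for a in ["down", "down", "right", "right"]:
    state, reward, done = env.step(a)
print(state, reward, done)
```

An internal state (e.g., "currently walking") could be added as another field of `state`; here only the external position is modeled.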
Reinforcement Learning
• Algorithm discovers actions that maximize reward via trial
and error
• Actions affect both immediate and future rewards
◦ Requires the agent to learn situations and sequences of
behavior
◦ Develops strategies similar to those observed in physical
protection
• Rewards
◦ Derived from agent goals, e.g., theft of material
◦ At each step agent receives a numerical reward from the
environment
◦ The reward provides the “reinforcement” of good or bad
behaviors
Reinforcement Learning Design
• Trial and Error
◦ Agent learns over millions of trials in simulated environment
◦ Early trials primarily explore the space of possible actions to
find reward
◦ Later trials exploit previously discovered actions that resulted in
reward
• Exploration vs. Exploitation
◦ Exploration: Algorithm chooses an action at random every so
often
◦ Exploitation: Algorithm learns to preference prior success in
choosing action
• Balance of the two is a key aspect of algorithm design
◦ Critical to ensure reward is not maximized only in a local space
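The exploration/exploitation balance above is commonly implemented as an epsilon-greedy rule: with probability epsilon choose a random action (explore), otherwise choose the highest-valued action so far (exploit). A minimal sketch; the action names and Q-values are illustrative.

```python
import random

# Epsilon-greedy action selection: explore (random action) with probability
# `epsilon`, otherwise exploit (pick the best-known action).
# Action names and Q-values are illustrative assumptions.

def epsilon_greedy(q_values, epsilon, rng=random):
    """q_values: dict mapping action -> estimated value."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))   # explore
    return max(q_values, key=q_values.get)  # exploit

q = {"breach_fence": 0.2, "use_gate": 0.7, "wait": 0.1}
print(epsilon_greedy(q, epsilon=0.0))  # pure exploitation -> "use_gate"
```

Decaying epsilon over training reproduces the pattern described above: early trials mostly explore, later trials mostly exploit.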
Reinforcement Learning Design
• Optimization: agent’s objective to maximize the
cumulative reward
◦ Delayed reward is a critical characteristic of RL
◦ An action that results in a low immediate reward may still be
chosen if it results in higher cumulative rewards across all
actions
• Reward is a critical and often difficult design element
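Delayed reward is usually captured by the discounted return, G = sum over t of gamma^t * r_t. A short sketch showing how a path with low immediate reward can still be preferred; the reward sequences are illustrative.

```python
# Discounted cumulative return: G = sum over t of gamma**t * r_t.
# Shows how an action with a low immediate reward can still be chosen
# when it leads to higher cumulative reward. Reward sequences are illustrative.

def discounted_return(rewards, gamma=0.9):
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

quick = [1.0, 0.0, 0.0]     # high immediate reward, nothing after
patient = [0.0, 0.0, 10.0]  # no immediate reward, large delayed payoff

print(discounted_return(quick))
print(discounted_return(patient))
```

Here the "patient" sequence returns 0.9^2 * 10 = 8.1 versus 1.0 for the "quick" one, so an agent maximizing cumulative reward chooses the delayed payoff, exactly the behavior described above.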
Reinforcement Learning

4. Polling Question
What type of reward might be useful in this case?
- Reward only at end
- Continuous reward with increased time
- Continuous reward with unexplored areas
- Number of layers accessed

[Figure: adversary path from Off Site through the Site Security Area and Controlled Building to the Adversary Target]
RL Example: Hide and Seek
• Environment
◦ Multiple competitive agents trained via RL: Hiders (blue) and Seekers (red)
◦ Open area with walls, boxes and ramps
◦ Hiders given time to move objects and hide before Seekers begin
• Actions:
◦ Move around area
◦ See each other and objects
◦ Grab and move objects: boxes and ramps
 Lock objects in place so that they cannot be moved
• Rewards:
◦ Hiders receive +1 if all are hidden, -1 if seen
◦ Seekers receive +1 if any hider is found, -1
if none are found
B. Baker, et al. https://arxiv.org/abs/1909.07528
RL Example: Hide and Seek
• Various strategies emerged
◦ Chasing
◦ Shelter construction
◦ Seekers learn to use ramps to jump walls and defenses
◦ Hiders learn to steal ramps as defense
◦ Seekers using boxes to jump off of (exploiting simulation mechanics): a novel vector

B. Baker, et al. https://arxiv.org/abs/1909.07528


Transfer Learning
• A technique where a model trained on one task is re-purposed on a second related task
◦ Teach an old AI new tricks
• Progressive complexity training
◦ Simple: Learn to walk through opening
◦ Complex: Learn to open door
◦ More complex: Learn to open door with a key
◦ Very complex: Learn to breach fortified door
• Reduces search space using prior knowledge

[Figure: prior knowledge narrows the search from all solutions to allowed solutions]

Transfer Learning, Lisa Torrey and Jude Shavlik
Transfer Learning
• Reusing agent trained on simple, generic tasks leads to faster training on
facility-specific, complex tasks

Agent first learns to find a key and unlock a door in a single room. Prior learning jumpstarts the ability to find a key and unlock a door in a larger office layout.
Reinforcement Learning Challenges
• Strategy of reward is key aspect of successful RL application
• Environment is unknown
◦ Agent does not necessarily have knowledge of layout or limitations of security
◦ Agent can only exploit knowledge of environment after enough experience is
collected via exploration
◦ Computation time for adequate training may be prohibitive
• Achieving generalization of model is key
• Adaptation of model to real world may not be perfect
◦ Human in the loop analysis still critical
Machine Learning

Discussion:
What intelligent security systems do you currently use?
◦ What security applications?
◦ What type of data is used?
AI Physical Security: Offensive Testing

Discussion: Trends in Attack Vectors


What additional attack vectors might be
trending and over the horizon?
◦ Are current techniques sufficient?
◦ Can current techniques be analyzed in a
reasonable amount of time?
What We’ve Learned
• Methods for physical protection analyses
• Approaches for improving systems within cost and logistical limitations
• Several tools and methods exist for semi-automated threat analysis
◦ Adversary Sequence Diagrams, Software-based Tabletop Exercises
• AI/ML methods can approximate human behavior
◦ Simulate threats to a nuclear facility
◦ Provide efficient, more thorough search of threat space
• Research and development are currently advancing AI/ML-based security
systems testing through Reinforcement Learning
• Challenges:
◦ Accurate simulation of the physical environment: Only as good as the input
◦ Design of action and reward space for RL algorithm
