Professional Documents
Culture Documents
Cybersafety A System-Theoretic Approach To Identify Cyber-Vulnerabilities Amp Mitigation Requirements in Industrial Control Systems
Cybersafety A System-Theoretic Approach To Identify Cyber-Vulnerabilities Amp Mitigation Requirements in Industrial Control Systems
5, SEPTEMBER/OCTOBER 2022
Abstract—Recent cyber-physical attacks, such as Stuxnet, Triton etc., have invoked an ominous realization about the vulnerability of
critical infrastructure, including water, power and gas distribution systems. Traditional IT security-biased protection methods that focus
on improving cyber hygiene are largely impotent in the face of targeted attacks by advanced cyber-adversaries. Thus, there is an urgent
need to analyze the safety and security of critical infrastructure in a holistic fashion, leveraging the physics of the cyber-physical
system. System-Theoretic Accident Model & Processes (STAMP) offers a powerful framework to analyze complex systems; hitherto,
STAMP has been used extensively to perform safety analyses but an integrated safety and cybersecurity analysis of industrial control
systems (ICS) has not been published. This paper uses the electrical generation and distribution system of an archetypal industrial
facility to demonstrate the application of a STAMP-based method – called Cybersafety – to identify and mitigate cyber-vulnerabilities in
ICS. The key contribution of this work is to differentiate the additional steps required to perform a holistic cybersecurity analysis for an
ICS of significant size and complexity and to present the analysis in a structured format that can be emulated for larger systems with
many interdependent subsystems.
Index Terms—CPS security design, industrial control system, STAMP, system security, cyber-physical damage
turbo-generator ($11M [5]) and subsequent outage costing by modeling attacker’s intentions in the analysis; the method
several million dollars in repairs and lost revenue. It also is known as the Extended Tree Analysis (ELT) [12].
provides several realistic scenarios that illustrate how the Despite these advances, there are inherent limitations of
interdependencies of the controlled process could be these methods. For instance, while FMEA is well-suited for
exploited to enable such an attack. Section 2 provides a liter- evaluating individual component failures and providing
ature review about the application of systems theory to reliability information, it is limited in its use as a safety tool
cybersecurity. Section 3 provides a brief overview about the because it considers single item failures without considering
Cybersafety method. Section 4 describes the key features of failures due to component interactions [13], [14].
an archetypal industrial plant that the method was applied Likewise, Xu et al. [15] argue that FTA is limited in its
to while Section 5 describes the bulk of the analysis. A dis- analysis of human factors, organizational and extra-organi-
cussion of the results, along with some proposed mitigation zational factors. It also fairs poorly as the complexity of the
requirements is provided in Section 6 followed by a short system increases [15]. Leveson [16] argues against the use of
conclusion in Section 7. probabilistic risk analyses (i.e. the underlying framework
for FTA) over system design analyses to improve system
2 LITERATURE REVIEW safety due to the inherent difficulty and uncertainty in
assigning probabilities to design and manufacturing flaws.
Traditional approaches to protect cyber-physical systems According to Dunj o et al. [17], the systems-based Hazard
are often strongly biased by practices and design principles and Operability (HAZOP) Analysis [13], [14] lies in between
prevalent in the information security world. These FTA and FMEA. Friedberg [18] argues that over the years,
protection mechanisms and principles broadly include researchers have tried to formalize HAZOP to achieve
authentication, access control, firewalls, intrusion detection, objective and quantifiable results, “but all approaches to quan-
antimalware, application whitelisting, flow whitelisting, tify results have led back to the use of FTA”.
cryptography, integrity verification, survivability etc. Loukas
[1] and Cardenas et al. [6] support the view that traditional
protection mechanisms in cyberspace are largely applicable 2.2 System-Theoretic Accident Model and
to cyber-physical systems. However, they note that impor- Processes (STAMP)
tant differences exist in implementation and effectiveness; An alternative to performing joint analysis of safety and
some of these are described next. security using extended versions of traditional hazard anal-
First, for cyber-physical systems, availability and integrity ysis methods (such as FTA/FMEA etc.), is to use the per-
of information is more crucial than confidentiality of infor- spective of modeling using systems theory. Leveson [19], [20]
mation. Second, for intrusion detection in cyber-physical developed a framework to understand causes of accidents
systems, sensor data from the physical space is an important using systems theory. This framework is called STAMP
input, unavailable to IT systems which rely purely on cyber- (System-Theoretic Accident Model and Processes).
space metrics. Third, an understanding of the consequences STAMP treats accidents as a ‘dynamic control problem’
of an attack in the physical world is required to design a emerging from violation of safety constraints rather than a
protection scheme for defense-in-depth of the cyber-physi- ‘reliability problem’ aimed at preventing component failures.
cal asset. And fourth, conventional security policies (such as Several analytical methods have been developed based on
patching) may in fact increase potential vulnerabilities, the STAMP framework such as STPA, CAST etc.
rather than decreasing them [7] – to elaborate this point, STPA is an acronym for System-Theoretic Process Analy-
note that for air-gapped Supervisory Control and Data Acquisi- sis; it is a forward-looking approach for identifying hazards
tion (SCADA) systems, the ‘air-gap’ improves cybersecurity in complex systems [19], [20]. Similar to STPA, but looking
as it limits opportunities for remote attacks. Paradoxically, backwards, CAST (Causal Analysis using Systems Theory),
however, this also means that the devices cannot be auto- is used to identify causal factors for past events or accidents
matically updated with malware signatures (blacklists) and using the STAMP framework [19], [20].
operators must manually install any updates on each iso- In his thesis, Thomas [21] provides a mathematical model
lated device, thereby increasing the risk of cross-contamina- underlying STPA and a method to perform the analysis sys-
tion due to frequent manual updates [7]. tematically which enables a more rigorous analysis with
more objective results. Since its creation, the STPA method
2.1 Safety Focused Approaches has been applied to a wide variety of industries and safety
Since one of the primary concerns with security of cyber- use-cases. It has been used in the automotive industry [22],
physical systems is its impact on system safety, a number of automation and workplace safety [23], nuclear power plants
hazard analysis frameworks and methods traditionally [24], ship navigation [25], medical applications [26] etc.
used for safety analyses have been adapted for security Laracy [27], [28] recognized the similarities between safety
analyses. and security and proposed an extension of STAMP to secu-
For instance, Schmittner et al. [8] extended Failure Modes rity problems of critical infrastructure, such as the Air Trans-
and Effects Analysis (FMEA) [9] to include security consider- portation System. This approach was called STAMP-Sec [27].
ation, by including vulnerabilities, threat agents and threat Salim [29] performed the first documented cybersecurity
modes as inputs for determining failure causes. The analysis using the STAMP-based CAST method by analyz-
extended method is called Failure Mode, Vulnerabilities and ing the TJX Cyberattack; this was the largest cyberattack in
Effects Analysis (FMVEA). Likewise, Steiner and Liggesmeyer history (by number of credit cards) when announced in
[10] proposed an extension of Fault-Tree Analysis (FTA) [11] 2007 and cost TJX $170 million. Nourian and Madnick [30]
3314 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 19, NO. 5, SEPTEMBER/OCTOBER 2022
5 ANALYSIS
5.1. Define Basis of the Analysis – Step 1
Cybersafety is a top-down, consequence-driven approach
that begins by establishing the boundaries of the system by
defining the goal of the system and identifying the critical
functions required to achieve that goal along with unaccept-
able system-level losses. The system problem statement pro-
vides a convenient framework for establishing the goal and
KHAN AND MADNICK: CYBERSAFETY: A SYSTEM-THEORETIC APPROACH TO IDENTIFY CYBER-VULNERABILITIES & MITIGATION... 3317
TABLE 3
Partial List of Controllers, Safety Responsibilities & Control Actions
condition is different from the overfluxing condition windings [40]. This condition is again distinct from over-
described earlier since one is caused by a high V/Hz ratio fluxing since a high voltage with a proportionally high fre-
while the other is caused by an overcurrent condition quency would not cause an overfluxing event, but it would
(UCA-AVR-10). result in an overvoltage condition (UCA-AVR-6). Excessive
voltages can damage and break down stator insulation in
the machine leading to a fault [43]. It can also stress insula-
5.3.3 Overvoltage tion in other connected components such as transformers,
Overvoltage occurs when the levels of electric field stress bushings and surge arrestors.
exceed the insulation capability of the generator stator
TABLE 5
List of Unsafe Control Actions for the Digital AVR, Operator and Management
factor is also penalized by the utility since it increases cur- leading to overheating and pre-mature failure of motors [45]
rent flow through the distribution network (UCA-AVR-3). (UCA-AVR-12).
If not corrected, the rotor field can weaken to the point The systematic approach described above to identify
that the gas turbine can cause the generator to ‘slip a pole’ UCAs for the AVR can be repeated for each of the control-
i.e. generator rotor would suddenly turn as much as one lers modeled in the functional control structure as docu-
complete revolution faster than it should be spinning and mented by Khan [34].
then violently come to a stop as it tries to magnetically link
up again with the stator magnetic field. Such an event 5.4 Generate Loss Scenarios – Step 4
would cause catastrophic failure of the coupling between In the previous subsection, various system states were iden-
the turbine and the generator [41], [42] (UCA-AVR-9). tified under which a given control action would be hazard-
If there is a complete loss of excitation and the generator ous. In this section, we determine causal factors that enable
breaker is not tripped, it can cause the synchronous genera- the issuance of the earlier identified unsafe control actions.
tor to operate as an induction generator, causing the rotor to According to Leveson [19], two types of causal scenarios
quickly overheat, leading to insulation damage, high vibra- must be considered (graphically shown in Fig. 7):
tion, and rotor striking the stator, causing catastrophic dam-
age [41], [42] (UCA-AVR-1). Apart from the generator, loss a. Scenarios that lead to the issuance of unsafe control
of excitation also impacts the power system; not only is a actions; these could be a result of (1) unsafe controller
source of reactive power lost, but the plant acts as a reactive behavior or (2) inadequate/malformed feedback.
current sink to meet its reactive power demand which has b. Scenarios in which safe control actions are improp-
the potential of triggering a system-wide voltage collapse erly executed or not executed altogether; these could
[44] (UCA-AVR-4). be a result of issues along the (1) control path or the
(1) controlled process itself.
For illustration purposes, we zoom into the functional con-
5.3.5 Under-voltage trol structure for the Digital AVR from Fig. 6 and superim-
When operating in islanded mode (i.e. independent of the pose it with guidewords from Schmittner et al. [32] signifying
grid), if the voltage drops too low, it has the potential to sample attack scenarios; the annotated control structure for
cause overheating of the motor loads at the plant due to an the AVR is presented in Fig. 8. Using the sample attack guide-
increase in current (to make up for the reduction in voltage), words as a starting point, scenarios are hypothesized which
3322 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 19, NO. 5, SEPTEMBER/OCTOBER 2022
Fig. 7. Factors that can result in a) unsafe control actions b) safe control actions not or improperly executed.
would cause the controller to issue an unsafe command. For Coordination & Communication Flaws
instance, the controller could issue an unsafe command Dynamics & Migration to Higher Risk
because it is fed tampered data about the process state, or it Interdependencies Impact
could have the wrong process model to begin with (i.e. the Finally, constraints are then defined to prevent the issu-
process has changed over time, but the controller’s process ance of the UCA as well as to mitigate the effects of the
model has not been updated to reflect that change) etc. attack. Table 6 presents several scenarios, associated causal fac-
Using this logic, each of the sample attack guidewords tors as well as refined safety/security constraints derived for the
around the control loop are carefully contemplated as AVR control loop. A detailed discussion about the new
potential scenarios. Next, for each scenario, potential causal insights gained by generating loss scenarios (in Table 6) is
factors are determined. To generate insightful causal factors, provided in the next section.
we focus on six different categories of flaws for each sce- To further demonstrate the methodology, a few scenarios
nario. These categories are: are generated for a higher-level controller as well i.e. the
plant operator and presented in Table 7. As before, two type
Process/Mental Model Flaws (assumptions) of scenarios are considered for the operator; either the oper-
Contextual Factors (system or environmental) ator provides an unsafe control input (i.e. pushing the
Structural Flaws (missing controls) wrong button) or does not adequately take the required
KHAN AND MADNICK: CYBERSAFETY: A SYSTEM-THEORETIC APPROACH TO IDENTIFY CYBER-VULNERABILITIES & MITIGATION... 3323
TABLE 6
List of Scenarios for the Digital AVR
corrective action (i.e. the corrective action is rendered control loop being analyzed (component failure flaws), vul-
ineffective). nerabilities emerging as a result of interactions between
By following a similar approach, we can move around components (based on interdependencies external to the
the control structure presented in Fig. 6 and evaluate each control loop) and unsafe control inputs from hierarchical
of the other controllers in order to generate a complete set controllers (or the violation of constraints due to ineffective
of causal factors that lead to system-level losses. implementation of controls by higher-level controllers).
Note that any of these vulnerabilities may be exploited by a
malicious actor.
6 DISCUSSION
The causal factors identified in Step 4 above (presented in 6.1 Types of Vulnerabilities Discovered
Tables 6 and 7) are indicative of different types of vulner- For instance, the analysis revealed that in the event that the
abilities. These include vulnerabilities that are local to the AVR malfunctions and causes the generator to overflux, the
3324 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 19, NO. 5, SEPTEMBER/OCTOBER 2022
TABLE 7
List of Scenarios for the Operator
protection scheme at the plant is not equipped with an over- Scenario SC-AVR-08-02 is similar but involves feedback
flux relay (ANSI Device Code 24) to control such a situation about a component state that is not part of the AVR control
– an example of a component-level flaw that was discovered loop – i.e. informational interdependency external to the
through the analysis. The fact that the protection scheme is control loop. Here, the assumption is that the generator
missing an overflux relay is not completely unexpected; breaker is closed when in reality it could simply have been
Scharlach [43] notes that traditionally the overflux protection spoofed by an attacker. This raises questions about the
is implemented in generators larger than 100 MW but cau- design of the process (in terms of information exchange
tions that “due to the serious effects that can result from an unde- integrity between components) and additional controls that
tected overexcitation event”, this protective element should be could prevent such an outcome. This scenario is an example
applied even on smaller machines, especially given the low of a loss scenario resulting from interaction between compo-
cost of such relays. nents (the generator breaker and the AVR).
Note that the overflux relay is required in the event that Scenario SC-AVR-08-03 is different from the other ones
the AVR fails to enforce the required constraints on the gen- presented thus far because it involves a change in the con-
erator terminal voltage. We will now explore the scenarios trol algorithm of the AVR controller. It shows how access to
that would cause it to violate its safety and security con- the AVR by an external contractor for legitimate business
straints in the first place. Scenario SC-AVR-08-01 (Scenario 1 purposes could be exploited by an adversary resulting in a
in Table 6) is rather intuitive – if the AVR is provided incor- loss. The associated causal factors highlight flaws in
rect information about the state of the terminal voltage, it assumptions and controls including beliefs about the vet-
would produce incorrect voltage outputs resulting in a loss ting process of contractors and 3rd parties, their level of
– it is doing what it is designed to do. In addition, the associ- access to critical assets (such as engineering workstations)
ated causal factors provide interesting insights about the and observeability of their actions in making changes to sen-
flaws in the system that would enable such a loss scenario sitive equipment. This scenario also highlights the tradeoff
to succeed. For instance, there is implicit trust without verifi- between functionality and security; remote access and con-
cation in the interaction between the sensor and the control- trol enables convenience, but also increases the attack sur-
ler. Coupled with the use of insecure industrial protocols, face. It also highlights the weakeness in process of storing
missing overfluxing protection and poor physical access more information that what is necessary on the engineering
controls, we can see how this is really a system problem workstation and providing access to an unvetted 3rd party
that needs a system level solution. via remote access.
KHAN AND MADNICK: CYBERSAFETY: A SYSTEM-THEORETIC APPROACH TO IDENTIFY CYBER-VULNERABILITIES & MITIGATION... 3325
Scenario SC-AVR-08-04 highlights an important assump- requirements such as interlocking generator voltage with
tion that can be exploited by an adversary; the design of the generator breaker and generator frequency to preclude an
process does not account for malicious intent. A spurious overfluxing event altogether i.e. to eliminate the vulnerabil-
loss of feedback condition from the AVR, by design, trans- ity through design, if possible.
fers the operation mode from automatic to manual which Furthermore, process changes are suggested in the form
can then be exploited by an attacker to manually increase of changes to the operator procedure when synchronizing
voltage of the generator beyond safe limits. the generator to the grid. New requirements also include
Finally, the last row in Table 6 hypothesizes the system changes to engineering responsibilities in terms of access
impact of the UCA i.e. the ability of the AVR to produce and storage of operating procedures in the control room.
effects that go beyond the AVR control loop. For instance, Finally, additional constraints are recommended on man-
hazardous function of the AVR would cause damage to the agement as well as outside vendors and contractors in the
generator which would impact the aggregate steam output form of policy changes for access to plant equipment. The
from the heating system, which in turn would require the important thing to note is that these functional requirements
steam-driven chillers to be shut-down in preference of the span all levels of the control structure – technical, process,
electric-driven chillers, which in turn would additionally human and organizational.
stress the electric distribution system because of additional
import from the grid. This again highlights the need for tak-
ing a systems perspective of the security problem. 6.3 Cybersafety Performance Evaluation
Table 7 lists three scenarios for the plant operator where As noted earlier, Cybersafety is a strictly qualitative method.
the operator either takes the wrong action (because of bad In this subsection we provide a subjective evaluation of
information or otherwise) or provides inadequate corrective Cybersafety’s performance in identifying and mitigating
action. Similar to the AVR’s process model, the operator cyber-vulnerabilities.
also has a mental model of the various processes in the Step 1 (define basis of the analysis) was the easiest to
plant, albeit at a higher level of abstraction. Scenario SC-OP- implement but required significant input from the key
02-01 shows how in the absence of out-of-band feedback, stakeholders (operators, engineers as well as management)
the operator may be convinced of AVR malfunction through to ensure that the target system was correctly identified and
malicious feedback injection, forcing the operator to manu- that critical losses and hazards were correctly defined. From
ally override the AVR and increase excitation resulting in a duration perspective, this step required an engagement of
the loss. One important point to note here is that this sce- a few hours with the key stakeholders.
nario is possible because of poor cybersecurity culture and Step 2 (model the functional control structure) required
training etc., that may emerge as a result of management’s several iterations to get right. It also required input from all
focus ‘on keeping the plant running’ – putting pressure on the the key stakeholders. This step was quite insightful for all
operator to try to resolve the issue without escalating it to parties involved as it helped to clarify the control structure
engineering/management. within the organization. It was interesting to observe that
Scenario SC-OP-02-02 describes how the ability to there was divergence in understanding about how the vari-
remotely alter field device mode (from auto to manual) can ous components of the system interacted with one another.
be exploited by an adversary to take over control of the Agreeing upon a functional control structure, was in of
physical process to launch an attack. itself, very valuable to the organization, since it enhanced
Scenario SC-OP-02-03 describes how the operator’s con- clarity of function for the various controllers in the organi-
trol algorithm may be compromised by altering voltage set- zation. From a duration perspective, this step took a few
points in the operating procedure – an indirect effect of not hours to complete (the actual duration was longer, as it was
having access to physical copies of operating procedures in iterated several times).
the control room. The causal factors for this scenario also Step 3 (identify unsafe control actions) was straight for-
highlight cultural aspects; lack of direct engagement ward but required individual engagements with subject-
between engineers and operators has made the process of matter experts (SME). It was observed that for the most part,
reporting errors convulted and difficult. This can result in a the SMEs were able to list UCAs fairly easily without actually
scenario where the operator follows the procedure at face- developing detailed process models. However, identifying
value even if the prescibred settings are incorrect. UCAs for higher-level controllers was found to be rather
abstract and subjective. From a duration perspective, this
step took several days to complete. It should be noted that
6.2 Mitigation Requirements this step is dependent upon the level of abstraction of the sys-
Note that despite the limited nature of the analysis, different tem. At the level of abstraction shown in this study we eli-
types of vulnerabilities have been uncovered. Fig. 9 illus- cited approximately 90 UCAs in the course of 4-5 days.
trates some high-level functional requirements and changes Step 4 (generate loss scenarios) was by far the most chal-
to the design of the control structure that could prevent or lenging to complete. There was very little guidance in the
mitigate the effects of an attack on the AVR. For instance, publicly available STAMP/STPA literature [19] for this
the functional requirements presented in Fig. 9 recommend phase of execution. However, by using the sample attack
addition of an out-of-band feedback loop for generator ter- guidewords and thinking about the causal factors in terms
minal voltage to the operator as well as the implementation of flaw categories and missing controls, we were able to
of an overflux relay into the protection scheme. In addition, generate rich causal factors that were able to capture the
some design changes are recommended as functional dynamics of the system and indirect interactions. We
3326 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 19, NO. 5, SEPTEMBER/OCTOBER 2022
generated scenarios and causal factors for only 10 percent of system. This was a first-of-a-kind analysis on the cyber vul-
the UCAs which took 2-3 weeks to complete. This step nerabilities of the electric generation and distribution sys-
required significant subject-matter knowledge but was a tem using a systems perspective. It was discovered that the
creative exercise that led to insightful discussions with the addition of a few additional steps, makes the method more
stakeholders about controls, threats and vulnerabilities. robust and repeatable and makes the analysis more compre-
Overall, each phase of execution highlighted different hensive. The effect of system interdependencies was
aspects of the system’s design and its weakenesses which included in the analysis which influenced each step of the
deepened understanding about the system, its vulnerabil- analysis; from the problem statement in Step 1, to the
ities and attack surface. modeling of the extended functional control structure in
Step 2, to the identification of process model variables in
Step 3 and finally generation of loss scenarios and impact
7 CONCLUSIONS on the system in Step 4.
The objective of this study was to demonstrate the ability of Using the analogy of the human body, just as it is impos-
the Cybersafety method to systematically and robustly sible to avoid all contact with infections and never catch a
uncover cyber vulnerabilities and mitigation strategies in an disease, it is impossible for an industrial control system to
industrial control system using a real-world example; spe- be under constant attack and never have its network
cifically, those vulnerabilities that emerge as a result of defenses breached. Therefore, the system has to be designed
interactions between components and interdependent so that it is resilient against the effects of the attack and
subsystems. Cybersafety provides a well-guided and structured analytical
We demonstrated the application of cybersafety to identify method to identify vulnerabilities and derive functional
cyber-vulnerabilities in an archetypal industrial control requirements to improve resilience against cyberattacks.
KHAN AND MADNICK: CYBERSAFETY: A SYSTEM-THEORETIC APPROACH TO IDENTIFY CYBER-VULNERABILITIES & MITIGATION... 3327
[42] G. Klempner and I. Kerszenbaum, Operation and Maintenance of Stuart Madnick (Member, IEEE) received the SB
Large Turbo-Generators. Hoboken, NJ, USA: Wiley, 2004. degree in electrical engineering, the SM degreee
[43] R. C. Scharlach and J. Young, “Lessons learned from generator event in management, and the PhD degree in computer
reports,” Accessed: Apr. 24, 2019. [Online]. Available: https://cms- science from MIT. He was the John Norris
cdn.selinc.com/assets/Literature/Publications/Technical% Maguire professor of information technology and
20Papers/6387_LessonsLearnedGeneratorRS-JY_20100303_Web2. a professor of engineering systems in 1960, and
pdf?v=20191007-202032#:~:text=Additionally%2C%20the% since 1972, he has been an MIT faculty member.
20effects%20of%20negative,each%20function%20to%20protect% He was the head with Information Technologies
20generators Group, Sloan School of Management, MIT, for
[44] W. Hartmann, “Generator protection overview,” Accessed: more than 20 years. He is the founding director of
Apr. 25, 2019. [Online]. Available: https://www.eiseverywhere. Cybersecurity at MIT Sloan (CAMS), the Interdis-
com/file_uploads/8b5452d7f9376912edcba156cd1a5112_WSU_ ciplinary Consortium for Improving Critical Infrastructure Cybersecurity.
GENPROTOVERVIEW_180305.pdf He is the author or coauthor of more than 350 books, articles, or reports,
[45] UST Power, “AVR guide: Voltage too high, too low j UST,” including the classic textbook on operating systems, and has three pat-
Accessed: Apr. 24, 2019. [Online]. Available: https://ustpower. ents. His current research interests include cybersecurity, information
com/comparing-automatic-voltage-regulation-technologies/ integration technologies, semantic web, database technology, software
need/avr-guide-voltage-high-low/ project management, and the strategic use of information technology.
He has been active in industry, as a key designer and developer of proj-
Shaharyar Khan received the BASc (Hons.) ects, such as IBM’s VM/ 370 operating system and Lockheed’s DIALOG
degree in mechanical engineering from the Uni- information retrieval system. He was a consultant to major corporations
versity of Waterloo in 2010 and the SM degree in and has been the founder or cofounder of five high-tech firms, and cur-
engineering and management from the Massa- rently operates a hotel in the 14th century Langley Castle in U.K.
chusetts Institute of Technology in 2019. He is
currently a research scientist with the MIT Sloan
School of Management. He was a seismic/struc- " For more information on this or any other computing topic,
tural design engineer with BWX Technologies, please visit our Digital Library at www.computer.org/csdl.
designing and analyzing critical components for
nuclear power plants. He was also the lead proj-
ect engineer with Nuclear Generating Station,
deploying tools for reactor inspections and maintenance. He is a regis-
tered professional engineer, Ontario, Canada.