
World Environmental and Water Resources Congress 2017 676

Detection of Cyber Physical Attacks on Water Distribution Systems via Principal Component Analysis and Artificial Neural Networks

Ahmed A. Abokifa, S.M.ASCE¹; Kelsey Haddad²; Cynthia S. Lo³; and Pratim Biswas⁴

¹Dept. of Energy, Environmental, and Chemical Engineering, Washington Univ., St. Louis. E-mail: ahmed.abokifa@wustl.edu
Downloaded from ascelibrary.org by U OF ALA LIB/SERIALS on 09/03/20. Copyright ASCE. For personal use only; all rights reserved.

²Dept. of Energy, Environmental, and Chemical Engineering, Washington Univ., St. Louis. E-mail: khaddad@wustl.edu
³Dept. of Energy, Environmental, and Chemical Engineering, Washington Univ., St. Louis. E-mail: clo@wustl.edu
⁴Dept. of Energy, Environmental, and Chemical Engineering, Washington Univ., St. Louis. E-mail: pbiswas@wustl.edu

Abstract
Automated monitoring and operation of modern Water Distribution Systems (WDSs) are largely
dependent on an interconnected network of computers, sensors, and actuators that are jointly
coordinated by a Supervisory Control and Data Acquisition (SCADA) system. Although the
implementation of such embedded systems enhances the reliability of the WDS, it also exposes it
to cyber-physical attacks that can disrupt the system’s operation or compromise critical
information. Hence, the development of attack detection algorithms that can efficiently diagnose
and identify such assaults is crucial for the successful application of these automated systems. In
this study, we developed an algorithm to identify anomalous behaviors of the different
components of a WDS in the context of the Battle of the Attack Detection Algorithms
(BATADAL). The algorithm relies on multiple layers of anomaly detection techniques to
identify both local anomalies that affect each sensor individually and global anomalies
that simultaneously affect more than one sensor. The first layer targets finding
statistical outliers in the data using simple outlier detection techniques. The second layer
employs a trained artificial neural network (ANN) model to detect contextual anomalies that
do not conform to the normal operational behavior of the system. The third layer uses principal
component analysis (PCA) to decompose the high-dimensional space occupied by the given set
of sensor measurements into two sub-spaces representing normal and anomalous network
operating conditions. By continuously tracking the projections of the data instances on the
anomalous conditions subspace, the algorithm identifies the outliers based on their influence on
the directions of the principal components. The proposed approach successfully predicted all of
the pre-labeled attacks in the validation data set with high sensitivity and specificity. However,
for all the detected attacks, the algorithm maintained a false “under attack” status for a few hours
after the threat no longer existed.
INTRODUCTION
Smart water networks are interconnected water distribution systems that integrate on-line
monitoring, data collection and transmission, real-time computation, and automated operation of
the physical components of the system. They rely on an integrated network of distributed sensors
and remote actuators, which are connected to Programmable Logic Controllers (PLCs) that

© ASCE


handle the data and control the functional processes of the distribution system. The transmitted
monitoring and process control data are typically collected by a Supervisory Control And Data
Acquisition (SCADA) system, which is a centralized computer that analyzes the data, performs
simulations and/or optimization computations, and coordinates the operation of the entire system
in real-time. Numerous water utilities have recently started to adopt smart network technologies
to improve the overall performance, efficiency, and reliability of their distribution systems. The
increased interest in embracing smart network technologies was accompanied by a
corresponding growth in the development of the related industrial tools and solutions, including
advanced metering and sensing technologies, data analytics tools, and automation systems.
Nevertheless, the adoption of these advanced technologies opened the door to a novel class of
security vulnerabilities.
Despite the numerous merits of implementing embedded networking technologies in the
sector of critical infrastructure systems, linking the physical assets of the system with the cyber-
space has transferred the matter of infrastructure security to the realm of cyber-based risks
(Rasekh et al. 2016). This linkage has expanded the domain of potential security threats from the
traditional risks associated with direct physical attacks that target sabotaging the elements of the
water infrastructure or compromising the water quality, to cyber-attacks that can remotely
perturb the performance of, or cause physical damage to, the system’s assets. From a national
security standpoint, water systems are particularly sensitive given the critical role they play
in the sustainable development of modern communities. This same role has made water
infrastructure systems a highly attractive target for cyber-attacks by terrorists, subversives, and
adversary states. In 2015, the US Department of Homeland Security reported that the Industrial
Control Systems-Cyber Emergency Response Team (ICS-CERT) received and responded to 25
cyber-related incidents that targeted water and wastewater systems, making it the third highest
targeted sector by cyber-attacks after critical manufacturing and energy (DHS ICS-CERT 2015).
Generally, these cyber-attacks target the SCADA system and the PLCs that locally operate pumps
and valves (Amin et al. 2013; Taormina et al. 2016a).
Although additional cyber-security measures can be imposed on the different components
of the cyber-physical system (including the remote sensors, the communications network, and
the SCADA layers) to enhance the system’s resilience in the face of cyber-attacks, the relatively
extended periods of system operation imply that the probability that one of these
components is attacked at least once during its lifetime is non-negligible (Taormina et al.
2016b). Recently, Taormina et al. (2016c) developed a modelling framework to assess the
hydraulic response of water networks to cyber-physical attacks. They concluded that the
response of the network to an attack depends on the specifications of the attack, the initial
conditions of the system, and the water demands. Yet, little is known about the design of attack
detection algorithms that recognize suspicious behaviors in pumps, valves, sensors, and other
components of the water network.
This study aims to develop an algorithm for the detection of cyber physical attacks on
drinking water distribution systems, as a part of the BATtle of the Attack Detection ALgorithms
(BATADAL). The design goals of the detection algorithm are: i) to determine the existence of an
ongoing attack with maximum speed and reliability; ii) to avoid issuing false alarms, and to
recognize when the system is no longer under attack, and iii) to identify which components of
the physical network have been compromised during the cyber-attack.


METHODOLOGY
Problem description
C-Town Public Utility (CPU) started experiencing cyber-physical attacks that disrupted the
operation of its water distribution system (WDS) shortly after the introduction of a smart-grid
technology consisting of a set of remote sensors and actuators connected to
nine PLCs that control the tanks, pumps, and valves in the system (Figure 1). The collected data
is transmitted to a SCADA system that supervises and coordinates the operation of the WDS in
real-time. Two sets of data were provided as a part of the design challenge, where the first set
comprised historical SCADA data collected before the deployment of the smart control
technology, and the second data set was collected after deployment. The first set is guaranteed to
have no cyber-attacks, while the second set contains a group of attacks labeled by the utility
engineers. The second set may also contain additional attacks that the engineers were not able to
identify and label. The provided data sets included the water level in each tank, status of each
pump and valve, flow through each pump and valve, and suction and discharge pressure for each
valve and pumping station. The ultimate objective of the detection algorithm is to identify the
attacks in a test set of unlabeled data with high speed and reliability.

Figure 1. C-Town water distribution system


Concept of attack detection


In an ideal case, the water utility will have an accurate hydraulic model of the WDS that is
routinely calibrated using the collected monitoring data from the system. The presence of such a
model would be of a great benefit for detecting abnormal system behaviors, and can provide
significant insights if used in parallel with an algorithm that analyzes the data from the SCADA
system. However, since the actual water demand patterns that were used to simulate the
generation of the attacks are not provided as a part of the competition, the attack detection
problem is implicitly considered solely dependent on the given SCADA data. Hence, the concept
of an attack detection algorithm can be readily reduced to an anomaly detection algorithm that
targets discovering the irregularities in the data induced by cyber-physical attacks. In other
words, the main aim of the detection algorithm is to search for extended inconsistencies in the
observations, which are interpreted as “fingerprints” or indicators of an ongoing attack. An
anomaly can be generally defined as a data point (or a series of points) that does not conform to a
well-defined notion of normal behavior (Chandola et al. 2009), and that “deviates so much from
other observations as to arouse suspicions that it was generated by a different mechanism”
(Hawkins 1980). Hence, a first step to detect anomalies in a given set of observations is to define
a region or a domain of observations that is believed to represent the normal behavior of the
system. In our case, this domain was defined by the given historical SCADA data of clean
observations with no attacks.
In the context of detecting cyber-physical attacks on WDSs, the definition of an anomaly
should not be limited to the identification of the “outliers”, i.e. data points that lie beyond certain
statistical fences defined based on a reference data set that represents normal (expected)
behavior. A cyber-physical attack can also interfere with the performance of one of
the network components in a way that alters its operational patterns relative to normal
conditions while keeping its performance characteristic (for example, tank level, pumping flow
rate, or pressure) within the normal historic min/max bounds. In this case, the anomalous pattern
is considered a contextual anomaly, which means that the suspicious data instances are only
anomalous within a specific temporal context determined by the regular time-series patterns.
To illustrate the difference between these two types of anomalies, Figure 2a shows a
series of anomalous data points observed in the water level data of Tank 1. These
anomalies correspond to the attack event that took place from Oct. 9th to Oct. 11th and disrupted
the performance of PU2, which led to suspiciously high water levels in Tank 1. As can be seen,
the suspicious data points are noticeably larger in magnitude compared to the maximum historic
bound recorded in the clean data set. On the other hand, Figure 2b shows an anomalous pattern
in the data of the discharge pressure of pump station 3, which corresponds to the attack that took
place on Nov. 27th, and interfered with the performance of PU7. In this case, the magnitudes of
the anomalous points are well within the previously defined historic min/max bounds. However,
the pump performance has been clearly interrupted as can be seen from the evident change in the
pattern of the discharge pressure in the period that immediately followed the attack.
Therefore, the anomaly detection algorithm should be designed not only to find
anomalous data points with suspiciously high or low magnitudes, but also to identify data points
that do not conform to the regular operational patterns of the system. The former task can be
done simply by comparing the magnitudes of the data points to certain statistical fences (for
example, a given number of standard deviations below and above the mean, or a number of
interquartile ranges below and above the first and third quartiles, respectively). However, the
latter is a relatively more challenging task since it requires the algorithm to first “learn” the
patterns representing the normal operation of the system. This can be accomplished using a
supervised machine-learning algorithm that is trained to predict a future data point (or a group of
points) from a series of previous observations. Due to their known capability of modelling non-
linear relationships, Artificial Neural Networks (ANNs) were previously used to forecast
hydraulic time series patterns (Maier and Dandy 2000), and more recently, ANNs were applied
to detect intrusion events caused by cyber-attacks on SCADA systems (Gao et al. 2010; Linda et
al. 2009). In this study, we used ANNs to model the data patterns of each individual sensor, and
anomalies were detected by comparing the observed data points against the values predicted by
the ANN model as described in the next section.

Figure 2. (a) Tank 1 level (point anomaly); and (b) Pump Station 3 discharge pressure
(contextual anomalies).
A cyber-physical attack on the WDS can interfere with the performance of multiple
components of the network at the same time without significantly altering the individual
characteristics of each affected component. For example, if a pumping station is being
attacked, we expect that the pumping flow rate of each individual pump, the suction pressure,
and the discharge pressure will be affected simultaneously. A detection algorithm that only
analyzes the data from the individual sensors reporting each of these values might miss such
attacks if the induced anomalies in each of the data arrays were below the detection limits
described above. Therefore, due to the potentially high dimensionality of the sensor data, and the
fact that a single attack can affect multiple sensors at the same time, the detection algorithm
should not only target finding local anomalies existing in the data obtained from each sensor
separately, but should also aim to discover global anomalies by analyzing the combined data
obtained from all sensors in the multi-dimensional space. In this study, we used
principal component analysis (PCA) to project the observed data points from all the sensors onto
a set of principal components that capture the highest variance in the data, which helped
differentiate between normal and anomalous data instances as described in the next section.
Algorithm development
The detection algorithm was designed in a multi-layered manner, with each layer targeting the
detection of a specific class of outliers. The first layer focused on finding simple statistical
outliers with excessively high or low magnitudes, while the second layer aimed at revealing
contextual anomalies using trained ANNs, and the third layer targeted discovering global
anomalies that simultaneously affect multiple sensors using PCA. The data arrays analyzed by
the algorithm are the water levels in each tank (7 arrays), the flow rate through each pump and
valve (12 arrays), and the suction and discharge pressures for each pumping station and valve (12
arrays). A final check is done to ensure that, at all times, the operational statuses of the pumps
and valves follow the appropriate control logic based on the observed water levels in the tanks.
i- Layer 1: Detection of statistical outliers
Let $X$ be the ($n \times p$) data matrix consisting of $n$ observations from $p$ sensors, and
containing an unknown number of outliers. Hence, $X = [x_1, x_2, \ldots, x_p]$, where $x_i$ is the
vector of observations from sensor $i$: $x_i = [x_{i,1}, x_{i,2}, \ldots, x_{i,n}]^{T}$. A simple
statistical approach to detect outliers in a given set of observations can be defined by specifying
certain boundaries above and below the mean value by multiples of the standard deviation, which
can be written as:
Upper fence: $U_i = \mu_i + \alpha_u \sigma_i$
Lower fence: $L_i = \mu_i - \alpha_l \sigma_i$
where $U_i$ and $L_i$ are the designated upper and lower boundaries of data array $x_i$; $\alpha_u$
and $\alpha_l$ are the standard deviation multipliers representing the upper and lower fences, and
$\mu_i$ and $\sigma_i$ are the mean and standard deviation of data vector $x_i$. An outlier point
can also be defined based on the interquartile range (IQR) as:
Upper fence: $U_i = Q3_i + \beta_u (Q3_i - Q1_i)$
Lower fence: $L_i = Q1_i - \beta_l (Q3_i - Q1_i)$
where $Q1_i$ and $Q3_i$ are the first and third quartiles of the data array $x_i$, respectively, and
$\beta_u$ and $\beta_l$ are the interquartile range multipliers for the upper and lower fences,
respectively. A good way to define the upper and lower multipliers of the standard deviation and
the IQR (i.e. $\alpha_u$, $\alpha_l$, $\beta_u$, and $\beta_l$) is to base them on the maximum and
minimum values recorded in the clean data set. This implies
that these multipliers are believed to enclose the natural variation in the data, and therefore, any
point lying beyond these fences, in addition to the historic max/min from the clean data set, can
be considered an anomaly. A tolerance margin of 3% above and below each of these fences was
allowed in order to minimize the number of false outliers detected when the model was applied
to the second data set with pre-labeled attacks.
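As an illustration of the first layer, the following Python sketch calibrates the standard-deviation fences so that they enclose the historic minimum and maximum of a clean sensor array, and then widens them by the 3% tolerance. The function names, the toy tank levels, and the choice to express the tolerance as a fraction of the clean data range are assumptions made for this example; the IQR-based fences could be handled in the same way.

```python
import statistics

def fit_fences(clean, tolerance=0.03):
    """Return (lower, upper) fences for one sensor from attack-free data."""
    mu = statistics.fmean(clean)
    sigma = statistics.pstdev(clean)
    lo, hi = min(clean), max(clean)
    # Standard-deviation multipliers calibrated so that the fences enclose
    # the historic minimum and maximum of the clean data.
    k_upper = (hi - mu) / sigma
    k_lower = (mu - lo) / sigma
    upper = mu + k_upper * sigma
    lower = mu - k_lower * sigma
    # Assumption: the tolerance margin is a fraction of the clean data range.
    margin = tolerance * (hi - lo)
    return lower - margin, upper + margin

def flag_outliers(values, fences):
    """True for every value lying outside the fences."""
    lower, upper = fences
    return [v < lower or v > upper for v in values]

clean_levels = [2.0, 2.5, 3.0, 3.5, 4.0]   # hypothetical tank levels (m)
fences = fit_fences(clean_levels)
flags = flag_outliers([3.1, 4.05, 4.5, 1.8], fences)
```

In this toy case the reading of 4.05 falls inside the tolerance band and is not flagged, while 4.5 and 1.8 are.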
ii- Layer 2: Detection of contextual anomalies via ANNs
The ANN model used in this study is a multi-layer perceptron (MLP), which is a computational
model consisting of multiple layers of interconnected artificial neurons (computation nodes).
Each neuron performs a nonlinear computation, and the weighted sum of the outputs from all the
neurons in one layer is fed to the neurons of the next layer (i.e. a feed-forward NN). An MLP
typically consists of an input layer, one or more hidden layers, and an output layer. By adjusting
the weights assigned to each neuron, one can establish the desired relationship between the
outputs and the inputs of the ANN. This process is known as training or learning, and in this
study, it was done using a backpropagation algorithm that adjusts the weights of the ANN
by minimizing the error between the predicted and desired outputs with the
Levenberg-Marquardt optimization algorithm.
In this study, the ANN was designed to predict a single future observation from a series
of previous observations. Hence, the output layer consisted of one neuron, while the input layer
consisted of a set of $m$ input neurons, which comprise the array of previous observations used to
predict the next observation. A single layer of hidden neurons was employed, with a number of
hidden neurons equal to the number of neurons in the input layer (Figure 3). Therefore,
each data array $x_i$ containing $n$ observations was divided into a group of ($n - m$) sets, with
each set consisting of $m$ inputs $[x_{i,j}, x_{i,j+1}, \ldots, x_{i,j+m-1}]$ and one output:
$[x_{i,j+m}]$. Different sizes for the input layer were tested, and a satisfactory training error
was achieved with $m = 40$, which is approximately 1% of the number of observations in the second
data set.
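The windowing described above can be sketched in a few lines of Python. The function name `make_windows` and the toy series are hypothetical; only the window length of 40 comes from the text.

```python
def make_windows(series, m=40):
    """Split a series of n observations into (n - m) input/target pairs."""
    inputs, targets = [], []
    for j in range(len(series) - m):
        inputs.append(series[j:j + m])   # m consecutive past observations
        targets.append(series[j + m])    # the observation that follows them
    return inputs, targets

series = list(range(100))                # hypothetical sensor readings
X, y = make_windows(series, m=40)
```

With 100 observations and a window of 40, this yields 60 training pairs, each mapping 40 past values to the next one.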
The output of each neuron can be written as:

$y_j = f\left(\sum_{i} w_{i,j}\, a_i + b_j\right)$

where $a_i$ are the inputs to neuron $j$, $w_{i,j}$ are the corresponding weights, $b_j$ is the
bias, and $f$ is the sigmoid activation function defined as:

$f(z) = \frac{1}{1 + e^{-z}}$


Figure 3. Schematic of the ANN
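To make the structure in Figure 3 concrete, the sketch below runs a forward pass through a network with one hidden layer and a single output neuron, where every neuron applies the sigmoid to a weighted sum of its inputs plus a bias. It is only an illustration: the names are hypothetical, the weights are untrained zeros, and applying the sigmoid at the output neuron assumes the sensor data have been scaled to the (0, 1) range.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_output(inputs, weights, biases):
    """One output per neuron: sigmoid(weighted sum of the inputs + bias)."""
    return [sigmoid(sum(w * a for w, a in zip(neuron_w, inputs)) + b)
            for neuron_w, b in zip(weights, biases)]

def predict_next(window, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: input window -> hidden layer -> single output neuron."""
    hidden = layer_output(window, w_hidden, b_hidden)
    return layer_output(hidden, [w_out], [b_out])[0]

# Toy network with 3 inputs and 3 hidden neurons; untrained all-zero weights.
m = 3
w_hidden = [[0.0] * m for _ in range(m)]
b_hidden = [0.0] * m
prediction = predict_next([0.2, 0.4, 0.6], w_hidden, b_hidden, [0.0] * m, 0.0)
```

Training would replace the zero weights with values fitted to the clean data; that step is not shown here.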


For each sensor, the ($n - m$) sets generated from the clean data array were randomly
divided into three groups: 70% for training, 15% for validation, and 15% for testing. The
trained ANN model was then applied to predict the time-series data for the second data set that
contained the attacks, and the error between the predicted and the observed data points was
obtained. Anomalies were defined as data points with a prediction error more than 20
times the maximum training error from the first clean data set.
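The thresholding rule can be sketched as follows. The function name and the sample numbers are hypothetical, and the use of the absolute error is an assumption, since the text does not state which error measure was used.

```python
def flag_contextual(observed, predicted, max_train_error, factor=20.0):
    """Flag points whose prediction error exceeds factor * max training error."""
    threshold = factor * max_train_error
    # Assumption: the error is the absolute difference between the two series.
    return [abs(o - p) > threshold for o, p in zip(observed, predicted)]

observed = [1.00, 1.02, 1.90]      # hypothetical smoothed sensor readings
predicted = [1.00, 1.00, 1.00]     # hypothetical ANN predictions
flags = flag_contextual(observed, predicted, max_train_error=0.01)
```

Here the threshold is 0.2, so only the last point, with an error of 0.9, is flagged.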
The hydraulic data for tank levels, flow rates, and pressures are relatively noisy, and are
characterized by excessive fluctuations taking place over short time intervals. This makes training
an ANN to learn their operational patterns a fairly complex and time-consuming task, and
reduces the accuracy of the generated ANN in predicting future observations. Therefore, instead
of using the raw time series data to train and test the ANN, we used a smoothed form of the data
that preserves the same structure but with less frequent fluctuations. To do that, the Fast Fourier
Transform (FFT) of the data signal was first obtained to decompose the time series data into its
underlying frequencies. Then, the raw data was filtered (smoothed) using a third-order low-pass
Butterworth filter, with a cutoff frequency equal to the frequency corresponding to 50% of
the cumulative amplitude of the signal. As can be seen from Figure 4, the filtered data captured
the same anomalies that existed in the raw data, while the fluctuations were significantly
smoothed.
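A possible implementation of this smoothing step, using NumPy and SciPy, is sketched below. Several details are assumptions rather than the authors' stated procedure: the cutoff is taken as the first frequency at which the cumulative FFT amplitude (excluding the zero-frequency term) reaches 50% of the total, the filter is applied forward and backward with `filtfilt` to avoid phase lag, and the test signal is synthetic.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def cutoff_from_spectrum(signal, fraction=0.5):
    """Normalized cutoff (1.0 = Nyquist) where the cumulative FFT amplitude
    first reaches `fraction` of the total amplitude."""
    x = np.asarray(signal, dtype=float)
    amplitude = np.abs(np.fft.rfft(x - x.mean()))[1:]   # skip the zero-frequency bin
    freqs = np.fft.rfftfreq(x.size)[1:]                  # cycles per sample, up to 0.5
    cumulative = np.cumsum(amplitude) / amplitude.sum()
    idx = min(int(np.searchsorted(cumulative, fraction)), freqs.size - 1)
    return min(freqs[idx] / 0.5, 0.99)                   # butter() needs 0 < Wn < 1

def smooth(signal, order=3, fraction=0.5):
    """Third-order low-pass Butterworth smoothing of one sensor series."""
    b, a = butter(order, cutoff_from_spectrum(signal, fraction), btype="low")
    # Assumption: forward-backward filtering is used to avoid phase lag.
    return filtfilt(b, a, signal)

# Hypothetical hourly pressure signal: a daily cycle plus measurement noise.
rng = np.random.default_rng(0)
hours = np.arange(24 * 30)
raw = np.sin(2 * np.pi * hours / 24) + 0.3 * rng.standard_normal(hours.size)
smoothed = smooth(raw)
```

The smoothed series keeps the daily cycle while the point-to-point fluctuations are reduced.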
iii- Layer 3: Detection of global anomalies via PCA
PCA is a coordinate transformation method that has been previously used to detect traffic
anomalies in network systems (Huang et al. 2007; Lakhina et al. 2004; Lee et al. 2013). It can be
used to re-map a given set of multi-dimensional data points onto new axes known as the
principal components (PCs). Each PC points in the direction representing the maximum variance
remaining in the data after accounting for the variance in the preceding components. Hence, the
first PC captures the maximum variance in the data that can be projected on a single axis, while
the following PCs capture the remaining variance, with each component capturing the largest
variance along the next orthogonal direction. Therefore, the set of principal components $v_k$
($k = 1, \ldots, p$) can be defined as:

$v_k = \arg\max_{\|v\| = 1} \left\| \left( Z - \sum_{i=1}^{k-1} Z v_i v_i^{T} \right) v \right\|$

which can be solved by evaluating the eigenvectors of the covariance data matrix:
$\Sigma = \frac{1}{n-1} Z^{T} Z$, where $Z$ is the standardized (z-score) version of the
observations matrix $X$.

PCA decomposition can be used to examine the intrinsic dimensionality of the data by
evaluating the first few PCs that capture the majority of the cumulative variance. The PCs can
hence be separated into two sets corresponding to the normal and anomalous subspaces. The first
set consists of the PCs that contain the most significant variation in the data, and therefore
capture the typical periodic patterns of the time series. In contrast, anomalous data points that
do not follow the normal pattern appear more distinctive when projected on the PCs that
capture less variability, which hence constitute the anomalous subspace (Lakhina et al. 2004). In
our case, the first 14 principal components captured 99% of the variance in the data matrix;
therefore, they were considered representative of the normal subspace, while the rest of the
principal components (PCs 15-31) represented the anomalous subspace. To clarify the difference
between the two sets of PCs, Figure 5 shows the projection of the second data set on PC1 and
PC16, where the anomalous points induced by the first three attacks clearly appear on the
projection on PC16, which belongs to the anomalous subspace, while PC1 shows the normal
periodic data pattern.
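The subspace split can be sketched with NumPy as shown below. The sketch assumes that the scaling and the PCs are fitted on the clean data and then applied to new observations, which the text does not state explicitly; the function names and the three-sensor synthetic data are hypothetical.

```python
import numpy as np

def fit_subspaces(clean, var_kept=0.99):
    """Fit z-score scaling and PCs on clean data (rows = observations)."""
    mean, std = clean.mean(axis=0), clean.std(axis=0, ddof=1)
    z = (clean - mean) / std
    cov = z.T @ z / (z.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)              # returned in ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cumulative, var_kept)) + 1  # PCs in the normal subspace
    return mean, std, eigvecs, k

def anomalous_score(data, mean, std, eigvecs, k):
    """Norm of each observation's projection on the anomalous subspace."""
    z = (np.atleast_2d(data) - mean) / std
    return np.linalg.norm(z @ eigvecs[:, k:], axis=1)

# Three hypothetical sensors driven by one common signal plus small noise.
rng = np.random.default_rng(0)
s = rng.standard_normal(500)
noise = 0.02 * rng.standard_normal((500, 2))
clean = np.column_stack([s, s + noise[:, 0], -s + noise[:, 1]])
mean, std, eigvecs, k = fit_subspaces(clean)
clean_scores = anomalous_score(clean, mean, std, eigvecs, k)
# An observation whose individual values are ordinary but whose combination is not.
attack_score = anomalous_score([0.5, -0.5, 0.5], mean, std, eigvecs, k)[0]
```

Each value of the injected observation lies well within the range of its own sensor, yet its projection on the anomalous subspace is far larger than that of any clean observation, which is the kind of global anomaly the third layer targets.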

Figure 4. Discharge pressure for pumping station 3: raw data vs. smoothed data with a
Butterworth filter
In this study, in addition to directly using the projections of the data points on the
principal components to find the anomalies, we also implemented the leave-one-out (LOO)
methodology developed by Lee et al. (2013), which examines the effect of adding the data
instances of interest on the direction of the principal components. We first
evaluate the initial PCs of the clean data set, which represent normal data points with no
anomalies. Then we iteratively add the data points from the second set one by one
to the first data set, and re-evaluate the principal components each time. The procedure is
repeated for all the observations in the second data set. When a new data point is added,
we expect the directions of the resulting PCs to deviate from the original directions. The
magnitude of the deviation depends on the outlier-ness of the added data point; therefore, an
anomalous data instance will yield a larger deviation than the one generated by adding a normal
data instance.
The deviation between the directions of the PCs before and after adding the data
point can be calculated from their cosine similarity as:

$s = 1 - \frac{\langle v, \hat{v} \rangle}{\|v\| \, \|\hat{v}\|}$

where $v$ and $\hat{v}$ are the principal components before and after adding the data instance.
Anomalous data points can then be simply identified by evaluating their effect on the direction of
the PCs representing the anomalous subspace relative to other data points. In our study, we
considered the points having a score $s$ more than two standard deviations above the
mean to be anomalous, while other points below this threshold were considered normal.
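The procedure can be illustrated with the following NumPy sketch, which adds candidate points one at a time to a clean set, recomputes one PC, and scores the change in its direction. It is a simplified two-sensor example under stated assumptions: the function names and data are hypothetical, a single low-variance PC is tracked, and the absolute value of the cosine is taken because the sign of an eigenvector is arbitrary.

```python
import numpy as np

def principal_direction(z, index):
    """Unit eigenvector of the covariance of z, ranked by decreasing variance."""
    eigvals, eigvecs = np.linalg.eigh(np.cov(z, rowvar=False))
    return eigvecs[:, np.argsort(eigvals)[::-1][index]]

def direction_deviation(clean, point, index):
    """1 - |cosine similarity| between a PC before and after adding `point`."""
    v = principal_direction(clean, index)
    v_hat = principal_direction(np.vstack([clean, point]), index)
    cosine = abs(v @ v_hat) / (np.linalg.norm(v) * np.linalg.norm(v_hat))
    return 1.0 - cosine

# Two hypothetical, strongly correlated sensors.
rng = np.random.default_rng(1)
s = rng.standard_normal(250)
data = np.column_stack([s, s + 0.1 * rng.standard_normal(250)])
clean, normal_points = data[:200], data[200:]
candidates = np.vstack([normal_points, [0.0, 5.0]])      # last row: injected anomaly
# index=1 tracks the low-variance PC, i.e. the anomalous subspace in this toy case.
scores = np.array([direction_deviation(clean, p, index=1) for p in candidates])
flags = scores > scores.mean() + 2 * scores.std()
```

Only the injected point rotates the PC enough to exceed the threshold of two standard deviations above the mean score.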

Figure 5. Projections of the second data set on (a) PC1 (normal subspace); and (b) PC16
(anomalous subspace).


SUMMARY OF THE RESULTS


The algorithm was applied to the two data sets provided as a part of the BATADAL competition.
The first (clean) data set comprised historic hourly sensor readings collected by the CPU from
its WDS for a 1-year period from January 2014 to January 2015 (8761 data points per
sensor), and was used to train the algorithm and to define the normal behavior of the system as
described in the previous sections. The second data set comprised sensor readings for a 6-month
period from July 2016 to December 2016 (4177 data points per sensor), and contained 5 labeled
attacks: 1) September 14th and 15th: pump control inconsistencies in PU10 and
PU11; 2) October 9th to October 11th and 3) October 30th to November 1st: pump PU2 control
inconsistencies; 4) November 27th: pump PU7 inconsistencies; and 5) December 6th to December
9th: pump control inconsistencies in PU6 and PU7. The trained algorithm was applied to detect
the pre-labeled attacks in the second data set.
Figure 6 shows plots of the attacks detected by the algorithm as compared to the pre-
labeled attacks in the second data set. As can be seen from the figures, the algorithm was able to
detect all the labeled attacks, in addition to three more potential attacks that were not originally
labeled in the given data set. For all the detected attacks, the algorithm predicted the existence of
a threat before the official start time of the attack, which meant that the time to detection (the
time needed by the algorithm to recognize an attack) was zero for all the labeled attacks. This
can be attributed to the fact that the algorithm mainly relies on detecting anomalies in the data,
hence the minor perturbations that precede the formal start of the attack are interpreted as a part
of the attack. In addition, the algorithm was able to maintain the attack status for the entire
period where labeled attacks existed, with the exception of attack 4 where two short lapse
periods during the attack were not labeled by the algorithm.
Nevertheless, for all the detected attacks, the algorithm tended to keep the system under
an attack status for a few hours after the formal end of the labeled attack. In addition, the
algorithm identified a few sporadic data points as anomalies, which given their very short
duration and their dispersed nature were highly unlikely to be part of an attack. Therefore,
isolated singly labeled data points that had more than four unlabeled data points before them and
after them were manually filtered out, which slightly reduced the number of false positives.
Another important parameter that required manual setting was the shortest expected attack
duration: attacks spanning less than seven data points (the minimum duration from the
given labeled attacks) were ignored (3 attacks with ≤ 7 data points in series).
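The two clean-up rules could be automated along the lines of the sketch below, although the text describes the filtering as manual. The function names and the example flag series are hypothetical. The sketch reads "more than four unlabeled data points" as at least five unflagged points on each side, treats points beyond the ends of the series as unflagged, and drops runs strictly shorter than seven points.

```python
def drop_isolated(flags, gap=4):
    """Clear single flagged points that have more than `gap` unflagged
    points immediately before and after them."""
    out = list(flags)
    for i, flagged in enumerate(flags):
        if not flagged:
            continue
        before = flags[max(0, i - gap - 1):i]
        after = flags[i + 1:i + gap + 2]
        if not any(before) and not any(after):
            out[i] = False
    return out

def drop_short_runs(flags, min_len=7):
    """Clear runs of consecutive flagged points shorter than `min_len`."""
    out = list(flags)
    i = 0
    while i < len(flags):
        if not flags[i]:
            i += 1
            continue
        j = i
        while j < len(flags) and flags[j]:
            j += 1
        if j - i < min_len:
            out[i:j] = [False] * (j - i)
        i = j
    return out

# Hypothetical detector output: one isolated point, one run of 8, one run of 3.
raw_flags = ([False] * 6 + [True] + [False] * 6 + [True] * 8
             + [False] * 3 + [True] * 3 + [False] * 2)
cleaned = drop_short_runs(drop_isolated(raw_flags))
```

In this example only the run of eight consecutive points survives both filters.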
The confusion matrix of the algorithm is shown in Table 1. Generally, the higher the
values of the diagonal elements (true positives and negatives), and the smaller the values of the
off-diagonal elements (false positives and negatives), the better the performance of the
algorithm. It should be noted that the additional attacks labeled by the algorithm were all
considered false positives when calculating the confusion matrix, which explains the
high value of FP compared to FN. The sensitivity of the algorithm in detecting the
attacks is represented by the True Positive Rate (TPR), which is the ratio of the true positives to
the sum of the true positives and false negatives, while the specificity is represented by the True
Negative Rate (TNR), which is the ratio of the true negatives, to the sum of true negatives and
false positives. The presented algorithm scored relatively high in both measures: TPR = 0.936
and TNR = 0.957, which indicates a satisfactory performance.
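The two rates follow directly from the counts reported in Table 1, as the short Python check below shows; the function name is illustrative.

```python
def rates(tp, fp, fn, tn):
    """Sensitivity (TPR) and specificity (TNR) from confusion-matrix counts."""
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return tpr, tnr

tpr, tnr = rates(tp=205, fp=171, fn=14, tn=3787)
print(f"TPR = {tpr:.3f}, TNR = {tnr:.3f}")   # TPR = 0.936, TNR = 0.957
```

The four counts sum to 4177, the number of data points per sensor in the second data set.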


Table 1. Confusion Matrix

                            Labeled (Actual)
                            Under Attack    No Attack
Predicted   Under Attack    TP = 205        FP = 171
            No Attack       FN = 14         TN = 3787

The first and second detection layers were able to localize the sources of the detected
anomalies, while the third layer only recognized global anomalies without specifying which
component was the actual source of the anomaly. For all the detected attacks, the algorithm
picked up anomalies in the data from more than one sensor at the same time; however, the
anomalies were typically concentrated in the data from one or two sensors. For attack 1, the
identified anomalies were mainly found in the data from the flow sensor of PU11, which was the
actual network component targeted during the attack (in addition to PU10). In this scenario, the
attacker might have directly targeted either the pump actuator (PLC5) or the connection between
PLC9 (Tank 7 level) and PLC5 (PU10 and PU11 actuators). For attack 2, the detected anomalies
were mainly found in the data of Tank 1 level, which controls the operation of PU2, the
actual targeted component during the attack. For attack 3, the algorithm could not attribute the
detected anomalies to a specific network component since the irregularities were simultaneously
observed in the data of the discharge pressure of valve 2 and pumping stations 4 and 5. For
attack 4, the anomalies were found in the data of Tank 4 level, which controls the
operation of PU7, the actual targeted component during the attack. For attack 5, the anomalies
were mainly found in the flow data of PU6 and PU7, which were the actual targeted
components during the attack. Therefore, for four of the five detected attacks, the algorithm
was able to identify either the actual targeted component (attacks 1 and 5) or the sensor that
controls its operation (attacks 2 and 4). For the third attack, the corresponding anomalies
were picked up from multiple elements distributed across more than one PLC; the algorithm
therefore identified a number of potentially affected components but was not able to distinguish
which one of these components was compromised.
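
One simple way to express this kind of attribution is to count, within a detected attack window, how many anomalies each sensor contributes and check whether the leading one or two sensors account for most of them. The sketch below is only an illustration of that idea and is not the procedure used in the study: the function name, the sensor identifiers, the anomaly counts in the usage example, and the 0.8 concentration threshold are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def rank_sources(flagged: List[Tuple[int, str]], top: int = 2,
                 min_share: float = 0.8):
    """Rank sensors by anomaly count inside one detected attack window.

    flagged:   (time_index, sensor_name) pairs for every local anomaly.
    top:       how many leading sensors may jointly explain the attack.
    min_share: assumed fraction of anomalies the leaders must cover for the
               attack to be attributed to them (hypothetical threshold).
    """
    counts = Counter(sensor for _, sensor in flagged)
    total = sum(counts.values())
    ranked = counts.most_common()
    share = sum(c for _, c in ranked[:top]) / total if total else 0.0
    return ranked, share, share >= min_share
```

For example, a window with ten anomalies on a hypothetical `PU11_flow` series, two on `T7_level`, and one on `PU10_flow` would be attributed to the first two sensors, whereas anomalies spread evenly over three components (as in attack 3) would not pass the threshold.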


Figure 6. Predicted vs Labeled attacks


While each of the three layers was independently able to partially detect each of the five
pre-labeled attacks (Figure 7), compiling the outliers detected by the three layers was
necessary to detect all of the attacks to a satisfactory level. The second layer (the trained
ANN model) was the most efficient in terms of accurately detecting all the labeled attacks with
the fewest false positives, whereas the third layer (PCA decomposition) was
crucial for identifying the correct start times of all the attacks. Nevertheless, the second layer was
the main source of the sporadic singly labeled outliers mentioned previously. This observation
sheds light on another important aspect of the proposed algorithm, which is the sensitivity of the
detected outliers to the cut-off limits used in each of the three layers. While using a more
stringent cut-off for the criteria defining an outlier typically improved the detection of the attacks
(fewer false negatives), it also led to a higher number of false positives. Therefore, a
sensitivity analysis might be required before applying the algorithm to new datasets in order to
draw the appropriate line between normal and anomalous data instances based on the
characteristics of the system.
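
The compilation of the three layers and the suggested sensitivity analysis can be sketched as below. This is a minimal sketch under stated assumptions: the layers are combined by a plain union of their flags, each layer is assumed to produce a numeric anomaly score that is flagged when it exceeds a cut-off, and the function names are hypothetical. None of these details are specified in this form in the paper.

```python
from typing import List, Sequence, Tuple


def combine_layers(*layers: Sequence[bool]) -> List[bool]:
    """Flag a time step if at least one detection layer flags it (union)."""
    return [any(step) for step in zip(*layers)]


def sweep_cutoff(scores: Sequence[float], labels: Sequence[bool],
                 cutoffs: Sequence[float]) -> List[Tuple[float, int, int, int]]:
    """For each cut-off, return (cutoff, TP, FP, FN) when score > cutoff is flagged."""
    rows = []
    for cutoff in cutoffs:
        predicted = [s > cutoff for s in scores]
        tp = sum(p and a for p, a in zip(predicted, labels))
        fp = sum(p and not a for p, a in zip(predicted, labels))
        fn = sum(a and not p for p, a in zip(predicted, labels))
        rows.append((cutoff, tp, fp, fn))
    return rows
```

Running `sweep_cutoff` on a labeled data set over a range of cut-offs shows how false negatives fall and false positives rise as more points are flagged, which is the trade-off a sensitivity analysis would need to balance for a given system.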


Figure 7. Anomalies detected by each layer


CONCLUSIONS
In this study, an algorithm was developed to detect cyber-physical attacks on smart water
distribution networks. The algorithm employed multiple techniques to detect different types of
anomalies in the SCADA data. An ANN model was trained to predict the regular patterns of the
system’s operation, and its predictions were used to detect anomalous data points that are
inconsistent with the normal behavior of the system. PCA was applied to decompose the
multi-dimensional sensor data matrix into two subspaces representing the normal and the anomalous
projections, which efficiently revealed global anomalies affecting multiple sensors at the same time. The
presented algorithm provides an integrated method that takes into consideration the naturally
existing correlation between the collected readings from different sensors in order to identify
anomalous behaviors that can go undetected when data from each sensor is analyzed
independently.


The algorithm performed well, detecting all the pre-labeled attacks in the validation
data set with no delays. However, it tended to keep the system under an attack status for a few
hours after the threat no longer existed. While the algorithm was generally able to localize the
network components that were compromised during the detected attacks, for one of the attacks
it could not identify the exact elements of the system that were being targeted
because the discovered anomalies were found in data originating from more than one
component at the same time.

The main focus of the current algorithm was on minimizing false negatives (i.e., detecting
all the labeled attacks) since the consequences of undetected attacks could in some cases be quite
severe if the attack was sophisticated enough to cause physical damage to the system or
compromise the water quality. Future work will focus on enhancing the algorithm’s ability to
distinguish the targeted components during an attack and on shortening the periods of
false-positive attack status.

REFERENCES

Amin, S., Litrico, X., Sastry, S., and Bayen, A. M. (2013). “Cyber security of water scada
systems-part I: Analysis and experimentation of stealthy deception attacks.” IEEE
Transactions on Control Systems Technology, 21(5), 1963–1970.
Chandola, V., Banerjee, A., and Kumar, V. (2009). “Anomaly Detection: A Survey.” ACM
computing surveys (CSUR), 41(3), 15–58.
DHS ICS-CERT. (2015). “U.S. Department of Homeland Security – Industrial Control Systems-
Cyber Emergency Response Team, Year in Review.” 24.
Gao, W., Morris, T., Reaves, B., and Richey, D. (2010). “On SCADA control system command
and response injection and intrusion detection.” eCrime Researchers Summit, IEEE 2010.
Hawkins, D. M. (1980). Identification of outliers. Springer.
Huang, L., Nguyen, X., Garofalakis, M., Jordan, M. I., Joseph, A., and Taft, N. (2007). “In-
Network PCA and Anomaly Detection.” Advances in Neural Information Processing
Systems, 19, 617–624.
Lakhina, A., Crovella, M., and Diot, C. (2004). “Diagnosing network-wide traffic anomalies.”
ACM SIGCOMM Computer Communication Review, 34(4), 219.
Lee, Y. J., Yeh, Y. R., and Wang, Y. C. F. (2013). “Anomaly detection via online oversampling
principal component analysis.” IEEE Transactions on Knowledge and Data Engineering,
25(7), 1460–1470.
Linda, O., Vollmer, T., and Manic, M. (2009). “Neural Network based Intrusion Detection
System for critical infrastructures.” 2009 International Joint Conference on Neural
Networks, 1827–1834.
Maier, H. R., and Dandy, G. C. (2000). “Neural networks for the prediction and forecasting of
water resources variables: A review of modelling issues and applications.” Environmental
Modelling and Software, 15(1), 101–124.
Rasekh, A., Hassanzadeh, A., Mulchandani, S., Modi, S., and Banks, M. K. (2016). “Smart
Water Networks and Cyber Security.” Journal of Water Resources Planning and
Management, 77843(7), 1816004.
Taormina, R., Galelli, S., Tippenhauer, N. O., Ostfeld, A., and Salomons, E. (2016a). “Assessing
the Effect of Cyber-Physical Attacks on Water Distribution Systems.” World Environmental
And Water Resources Congress 2016, (May), 436–442.
Taormina, R., Galelli, S., Tippenhauer, N. O., Salomons, E., and Ostfeld, A. (2016b).
“Simulation of cyber-physical attacks on water distribution systems with EPANET.”
Cryptology and Information Security Series, 14(January), 123–130.
Taormina, R., Galelli, S., Tippenhauer, N. O., Salomons, E., and Ostfeld, A. (2016c).
“Characterizing cyber-physical attacks on water distribution systems.” Journal of Water
Resources Planning and Management.
