Cybersecurity in SMEs: The Smart-Home/Office Use Case
Nikolaos Vakakis, Odysseas Nikolis, Dimosthenis Ioannidis, Konstantinos Votis, Dimitrios Tzovaras
CERTH-ITI, Thessaloniki, Greece
nikovaka@iti.gr, odynik@iti.gr, djoannid@iti.gr, kvotis@iti.gr, tzovaras@iti.gr
Abstract: Today, small and medium-sized enterprises (SMEs) can be considered the new big target for cyber attacks, while cybercrime prevention is often neglected within their environment. This paper investigates the characteristics of cybersecurity threats in the Digital Innovation Hub (DIH) ecosystem of a Smart-Home/Office environment constituted by SMEs, which contains various smart devices and IoT equipment, smart-grid components, employees' workstations and medium-sized networking equipment. Because cybersecurity in such an ecosystem is demanding and challenging, owing to the various communication layers and the different supported IoT devices, we introduce a more robust, resilient and effective cybersecurity solution that can be effortlessly tailored to each individual enterprise's evolving needs and can also adapt and respond quickly to the changing cyber threat landscape. This cybersecurity framework will be evaluated on three major types of Smart-Home/Office datasets and will be supported by SME/ICT clusters under the framework of the Secure and Private Smart Grid (SPEAR) H2020 project. The first promising results of our work indicate the potential of implementing strong defence mechanisms for SMEs' environments within DIHs.
Index Terms: cybersecurity, anomaly detection, SME, dataset, Smart-Home/Office environment, SPEAR, IoT, smart-devices, security, LSTM, Neural Networks

I. INTRODUCTION
Cybersecurity in SMEs is an ever-growing concern due to the increasing adoption of digital technologies like Cloud Computing and the Internet of Things (IoT), but also due to the wide range of devices in use (e.g. PCs, servers, mobile devices, etc.) and of business practices (e.g. Bring Your Own Device, remote access, use of cloud-based apps and services, etc.). In this context, the increased connectivity that comes with IoT technologies also entails a bigger attack surface, which adversaries can exploit by leveraging security gaps of interconnected IoT devices within a Smart-Home/Office environment. On the one hand, secure protocols can protect an organization's data from eavesdropping, its systems from unauthorized access or even denial of service attacks, and the organization itself from security breaches that could cost much in terms of capital or customer dissatisfaction. On the other hand, security imposes slower communications, so it is important to find the right balance between security and communication efficiency. It is therefore crucial to have security mechanisms that can detect attacks not only in the network layer but also in the application layer, and that compensate for any protocol vulnerabilities.
Recent studies show not only a high percentage of small or medium-sized businesses suffering from cyber attacks [12], but also that SMEs appear to have a significant perception gap when it comes to cyber awareness and preparedness [7]. Similarly, according to one FireEye claim [13], 77% of all cyber-crimes target SMEs. Simple endpoint protection through antivirus has become inadequate by far, due to the complexity and variety of cyber threats as well as the integration of a wealth of digital technologies in the business processes of even the smallest enterprises.
The fact that SMEs represent the vast majority of total businesses, and that hackers consider them an easier target than large enterprises, reinforces the need for new robust security mechanisms that achieve early and accurate detection of attack patterns. The SPEAR H2020 project [33] aims to develop tools that enhance security by design and introduce cyber security prevention mechanisms in the area of smart grids, including smart homes (like the smart home of CERTH in Thessaloniki, which operates both as a living and as a working environment). This dual nature renders the smart home an appropriate testbed for applying advanced anomaly detection methodologies and tools to data deriving from a real environment, which can be a powerful weapon in support of SMEs' security administrators.
The rest of this paper proceeds as follows: Section II presents relevant publications. Section III describes the different kinds of real datasets that have been collected from the Smart-Home/Office environment. Section IV analyses how anomaly detection techniques could be applied to these datasets and presents some advanced methods based on deep learning techniques. Section V presents the first results obtained from the previously defined models. Finally, Section VI concludes with a discussion and future work.

II. RELATED WORK
The work described in this paper focuses on cyber-security in the Smart-Home/Office use case. The latest advancements in this domain relate to two main research topics: the collection of valuable datasets from a commercial/industrial IoT network, and the exploitation of those datasets with state-of-the-art machine learning methods for intrusion/anomaly detection.
978-1-7281-1016-5/19/$31.00 ©2019 IEEE

For cyber-security, most of the publicly available datasets [37] are cyber-attack specific, or the network architecture behind the collected data is not clear, especially for IoT networks. This lack of suitable datasets motivated the collection and creation of the Smart-Home/Office dataset. The data come from three different layers: the network capture from common IP-network infrastructure (pcap files), commercial and industrial network protocols, and operational data (electricity and sensor measurements).
Similar datasets and anomaly detection algorithms are used in cyber-security for Industrial Control Systems (ICS). Papers [3], [30], [29] describe the different network layers and a collection of operational data that constitute the basis for training ICS intrusion detection systems to provide security and safety for industrial equipment [31], [4].
The profile of these types of datasets and the machine learning techniques [9] used for the Smart-Home/Office match the cyber-security requirements of modern SMEs. The increasing need to provide cyber-security in SMEs [28], [18], [17] gives more value to the collected Smart-Home/Office dataset and its application.
For example, a novel secure-by-design system for SMEs is described in [27]. The results of the current work are a proof-of-concept for state-of-the-art anomaly detection methods on network traffic datasets that could be implemented in programmable logic and enhance the FORTIKA accelerator gateways installed in the SMEs.

III. DATASETS
After studying the Smart-Home/Office's services, protocols and network connectivity, which are summarized in a 4-layer IoT communication stack [Fig. 1] as defined by [32], we arrived at the collection of three different types of datasets by utilizing the Smart-Home/Office's analysis framework:
• Raw and Network flows traffic: Collection of all the raw and network flows traffic data (port mirroring).
• Application layer network protocols: Collection of attributes and features from application layer network protocols derived from raw network traffic.
• Smart-devices & Sensors Measurements: Collection of multiple time-series measurements from smart-devices and sensors (e.g. smart-meters, smart-inverters, etc.).
The dataset is subject to a Non-Disclosure Agreement (NDA) governing its distribution to, and exploitation by, third parties for innovation and research actions.

Fig. 1: Smart Home/Office 4-layer IoT Communication Stack

A. Raw and network flows traffic dataset
This dataset contains all the network traffic going into and out of the Smart-Home/Office network. Through the port mirroring functionality on the main switch, the raw network traffic is captured in the form of .pcap files. The raw traffic packets (.pcap files) are processed with parsers (CICFlowMeter, Argus, tshark) for the extraction of the network flows. The network flows are extracted to .csv files with up to 84 network flow features (e.g. Source/Dest IP/Port, Flow duration, Total Forward Packets, Packets/sec, etc.) [22], [8], [1]. The produced dataset provides a label feature that can be used for manual annotation of a network traffic flow.
These data cover the network perception of the anomaly detection systems up to the transport layer in any IP network similar to those in SMEs and industries. The raw packet capture provides the flexibility to customize feature extraction. Combined with the already provided features, it allows different quality analysis methods to be applied to evaluate and extract the most representative information of the network traffic flows, so that state-of-the-art machine learning techniques can accurately predict the infection status of an individual network traffic flow.

B. Application layer network protocols dataset
In cyber-attack/anomaly detection, using only the “Raw and network flows traffic dataset” restricts the network analysis to the transport layer and below. As a result, cyber-attacks/anomalies targeting the upper network layers (up to L7) may remain undetected. The “Application layer network protocols dataset” works as a complementary dataset to cover the upper network layers.
The Smart-Home/Office use case includes smart devices that utilize more than 12 IoT protocols. The current version of the dataset focuses on three application layer network protocols commonly used in industries and SMEs:
• BACnet for the Smart-Home/Office’s HVAC system.
• MQTT for brokered messaging of sensor measurements.
• Modbus for the smart meters, smart-inverters, etc.
The features of this dataset are a wide variety of network protocol attributes. The protocol attributes are extracted with a network protocol analyzer (tshark with -T fields output) that analyzes the raw network traffic captured from the Smart-Home/Office network.
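As a rough illustration of how protocol attributes could be turned into model features, the sketch below parses the tab-separated text that tshark emits with -T fields. The command line in the comment, the choice of the two MQTT fields, and the helper name parse_tshark_fields are assumptions made for this example; the dataset description does not specify them. It also assumes tshark's default behaviour of joining repeated field values in one packet with commas.

```python
from typing import Iterable, List, Tuple

# Illustrative command (an assumption, not taken from the paper):
#   tshark -r capture.pcap -Y mqtt -T fields -e mqtt.msgtype -e mqtt.len

def parse_tshark_fields(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Turn tab-separated tshark field output into (msgtype, length) pairs."""
    rows: List[Tuple[int, int]] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # blank line, or a packet with both fields empty
        cols = line.split("\t")
        if len(cols) != 2 or not cols[0] or not cols[1]:
            continue  # skip packets where either field is missing
        # One TCP segment may carry several MQTT messages; tshark then
        # joins the values of each field with commas.
        types = cols[0].split(",")
        lengths = cols[1].split(",")
        if len(types) != len(lengths):
            continue
        for msg_type, msg_len in zip(types, lengths):
            rows.append((int(msg_type), int(msg_len)))
    return rows

# Hand-written sample lines standing in for real tshark output.
sample = ["3\t223\n", "4\t2\n", "3,3\t223,130\n", "\t\n"]
features = parse_tshark_fields(sample)
print(features)
```

Each resulting pair corresponds to one MQTT message, so the list can be fed directly to a sequence model after normalization.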
The network protocol analyzer dissects the protocols’ packets and extracts the features needed to train machine learning models that learn the common network traffic patterns of a specific environment (Smart-Home/Office, SMEs, other industries) in terms of the types or sequences of packets and exchanged messages. An indicative sample of the protocol attributes used in the Smart-Home/Office use case is shown in Tables I, II and III.

TABLE I: MQTT protocol attributes and description
Attribute                   Description
mqtt.msgtype                Specifies the type of an MQTT message.
mqtt.len                    Specifies the length of an MQTT publish message.

TABLE II: BACnet protocol attributes and description
Attribute                   Description
bacapp.confirmed_service    Indicates the type of application service selected in a service request message.
bacapp.type                 Defines the Application Protocol Data Unit (APDU) message type and the fields that appear.

TABLE III: Modbus protocol attributes and description
Attribute                   Description
mbtcp.len                   Identifies the remaining length of the packet.
mbtcp.modbus.func_code      Code that identifies the function to execute.
mbtcp.modbus.data           The actual transmitted data.

The Smart-Home/Office “Application layer network protocols dataset” contains all the available protocol attributes. Tables I, II and III present only the attributes selected by the quality analysis in the feature selection process for the anomaly detection methods described in Section IV.

C. Smart-Devices & Sensor Measurements dataset
The last dataset consists of time-series measurements collected from smart devices and sensors all around the Smart-Home/Office. More specifically, the dataset consists of electricity measurements related to smart devices such as smart-meters, smart-inverters, smart-battery controllers and PV-installation monitors. Each smart-device returns a set of more than 20 time-series measurements such as Total Active/Reactive/Apparent power, Voltage phase and Amperage phase.
The idea of collecting and analyzing this dataset for anomaly detection lies in the foundation of the IoT network architecture. Anomaly detection over time-series measurements may reveal a physical attack, a service/communication channel corruption, malfunctioning devices or fraudulent activity against the Smart-Home/Office equipment.
The three different datasets complement each other in order to provide different perceptions and features for anomaly detection in different layers of a Smart-Home/Office environment.

IV. ANOMALY DETECTION METHODS
This section presents the initial exploitation of the datasets described in Section III with two anomaly detection machine learning methods based on Long Short-Term Memory (LSTM) neural networks, a variation of Recurrent Neural Networks (RNNs) based on LSTM memory cells [Fig. 2] [14], [15], [16], [36]. The first anomaly detection method uses the dataset in sub-section III-B and focuses on the attributes of the MQTT protocol. This method can be adjusted easily to the other application layer network protocols (BACnet, Modbus). The second anomaly detection method operates on the dataset described in sub-section III-C and focuses on the time-series measurement of the Total Apparent Power of the Smart-Home/Office. This anomaly detection method can be applied to any other time-series measurement, provided the values do not form a strictly ascending or descending waveform.
The selection of these methods is derived from the sequential nature of the data. Both methods utilize LSTM layers, because this type of RNN exhibits remarkable results in learning sequences of data containing long/mid-term patterns by efficiently modeling complex multivariate sequences. The unsupervised learning approach of the former method and the semi-supervised learning approach of the latter were yet another reason for choosing them, since we are dealing with unlabeled datasets.
A. Stacked LSTM neural network
This section describes an unsupervised learning anomaly detection method that implements a stacked LSTM neural network topology, which can learn the normal patterns of the application layer network protocol packet traffic and detect any unusual change in the packet traffic flow. Stacked LSTM neural networks are capable of accurately detecting deviations from normal behaviour without requiring prior knowledge of abnormalities [24] [5] [35]. More specifically, the machine learning model is continuously trained with normal streaming data consisting of two MQTT protocol attributes, the message type and the message length.
The goal of this method is to learn the patterns of MQTT message sequences that appear in the Smart-Home/Office network, considering their type and length. The anomaly detection is based on the detection of local peaks in the training loss. If at some point in time the network receives as training input an unusual sequence of message types or message lengths, which could indicate an infiltration in the network or a Denial of Service (DoS) attack, then the training loss is expected to exhibit a local peak.
After experimenting with different model architectures, the model which achieved the best results is a neural network with 6 LSTM layers of 10 LSTM units each (Fig. 2) [2] and a time distributed output layer (Fig. 3). Furthermore, Adam [20] was chosen as the optimization method for training the model. As a first step, the length and the message type values are normalized in the range [0,1]. The neural network then receives sequences of 10 samples in its input layer, and each layer passes 1 hidden state output for each input time step to the
next layer. Finally, the time distributed layer applies a dense layer to each input and thus outputs one time step of the sequence for each input time step.

Fig. 2: LSTM cell [23]

Fig. 3: Stacked LSTM neural network model architecture.

B. Seq2seq LSTM Encoder-Decoder
The second model is a self-supervised sequence-to-sequence (seq2seq) LSTM network with an encoder-decoder architecture [25] [26] [11] [6] [34]. The seq2seq LSTM model maps a fixed-length input sequence to a fixed-length output sequence, where both sequences have the same length. The encoder part of the seq2seq LSTM neural network is trained with the input time-series and produces the internal LSTM cell states that are used to initialize the decoder part. Additionally, the decoder part has one more input, which is the same time-series as the encoder's, and predicts a time-series similar to the training data and the input data.
Experimenting with the dataset in Section III-C, the best results were achieved with the seq2seq LSTM model architecture presented in Figure 4. In detail, the seq2seq LSTM model consists of an encoder and a decoder, both with 2 LSTM layers, and a dense output layer. Each input and output sequence is the time-series of one day with 96 samples. The proposed model is trained on manually annotated “normal” days and tested on mixed “normal” and “abnormal” days. Data are fed to the seq2seq LSTM model in a 3D format, e.g. (100, 96, 1), which corresponds to 100 days, where each day consists of 96 samples and each sample represents a single Total Apparent Power measurement.
The anomaly detection is based on a user-specified threshold and the Mean Squared Error (MSE) between the input and the reconstructed output of the decoder [21] [10] [19]. Every day whose MSE is bigger than the specified threshold is marked as an anomaly.

Fig. 4: Seq2Seq LSTM encoder-decoder model architecture.

V. INITIAL RESULTS - ANOMALY DETECTION
This section presents the initial anomaly detection results of the ML methods described in Section IV. The obtained results are promising and pave the way for further improvement and research on ML models applied to the datasets of Section III.
A. Application layer network protocols - Anomaly detection results
The stacked LSTM neural network was tested for its efficacy by using real normal and synthetic abnormal MQTT protocol network traffic data. The real normal data come from the dataset in sub-section III-B. The synthetic abnormal data were produced from the normal data in two different ways:
1) randomly shuffling a portion of the MQTT message sequence.
2) modifying the length of a portion of the publish messages, which contain the payload of MQTT packets.
Specifically, the initial dataset consists of approximately 34000 messages and is split into chunks of 500 messages that were fed consecutively to the neural network in the 3D format (50, 10, 2). This means 50 batches of 10 timesteps each, where
each timestep corresponds to a message with two features, message type and length. In such a dataset, two artificial anomalies were produced:
1) the first is a shuffled chunk of 500 messages, to test whether the neural network would react to a different sequence of messages
2) the second is generated by replacing the publish messages with a specific length (223) with another random length (130), which indicates a different payload, to test whether the neural network would react to an unusual length of payload that could be caused by a compromised IoT device.

Fig. 6: Loss diagram indicating abnormal traffic due to unusual message length sequence
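The chunk layout and the two synthetic anomalies described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the fraction parameter, the fixed random seed and the toy chunk are all hypothetical, and the value 3 for publish messages is the MQTT PUBLISH control packet type.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

def to_batches(chunk: np.ndarray, timesteps: int = 10) -> np.ndarray:
    """Reshape a (500, 2) chunk into the (50, 10, 2) layout."""
    return chunk.reshape(-1, timesteps, chunk.shape[1])

def shuffle_anomaly(chunk: np.ndarray, fraction: float = 1.0) -> np.ndarray:
    """Shuffle the order of the first `fraction` of the messages."""
    out = chunk.copy()
    n = int(len(out) * fraction)
    out[:n] = rng.permutation(out[:n])  # permutes whole rows (messages)
    return out

def length_anomaly(chunk: np.ndarray, old_len: int = 223,
                   new_len: int = 130, publish_type: int = 3) -> np.ndarray:
    """Replace one publish-message length with another."""
    out = chunk.copy()
    mask = (out[:, 0] == publish_type) & (out[:, 1] == old_len)
    out[mask, 1] = new_len
    return out

# Toy stand-in for one chunk of real traffic: columns are (msgtype, length).
normal = np.tile(np.array([[3, 223], [4, 2]]), (250, 1))

shuffled = shuffle_anomaly(normal)
corrupted = length_anomaly(normal)
print(to_batches(normal).shape, to_batches(corrupted).shape)
```

Shuffling changes only the order of messages while keeping their contents, whereas the length replacement keeps the order and changes one feature, so the two anomalies probe different aspects of what the model has learned.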
Figures 5 and 6 visualize the training loss as an anomaly detection metric. As can be observed in both figures, in the beginning the neural network fits to the input message sequences that it considers normal, and the training loss converges to low values close to zero. In Figure 5, the first anomaly is visible as the first spike of the training loss, where the shuffled sequence of messages is fed to the neural network. This spike indicates that an “abnormal” sequence of messages was detected. After the spike, the weights of the neural network re-adapt to the “normal” input sequence of messages and the training loss falls again close to zero. Furthermore, the sensitivity of the anomaly detection model was tested by forming a second synthetic batch of abnormal data, in which only 10% (50 out of 500) of the messages were shuffled. This anomaly is detected in the training loss as the second spike at the end of the training loss waveform. This is a proof-of-concept that the model can detect even small changes in the network. In the same way, in Figure 6 the anomalous sequence of messages with the corrupted payload causes a sharp spike in the waveform.

Fig. 5: Loss diagram indicating abnormal traffic due to unusual message type sequence

B. Smart-devices & Sensors Measurements - Anomaly detection results
The pre-processing of the dataset described in Section III-C has four stages:
1) Time-axis harmonization: Some time-series measurements have a different number of samples per day because of different sampling frequencies or missing samples. As a result, the time-axis for each day was re-sampled with a frequency of 15 minutes (96 samples/day) and the existing measurement values were aligned to the closest time-stamp based on their time-index (using pandas.DataFrame.resample).
2) Interpolation: Time-axis re-sampling may introduce NaN values in time-series samples. A linear interpolation is used to fill the gaps.
3) Min-max normalization: Normalization of the measurement values to the range [0,1].
4) Manual annotation: There is no calendar with manual annotation of the normal and abnormal events in the measurements. Visual inspection, measurement interpretation and manual annotation were used to create the ground truth of the “normal” behavior.
The training set for the seq2seq LSTM model is a manually annotated dataset of more than 100 “normal” days. The test set consists of the rest of the days, which are a mix of “normal” and “abnormal” days. The seq2seq LSTM model is able to flag as anomalies the days whose time-series do not follow the pattern of a “normal” day.
Figure 7a presents in red the real data of a normal day and in blue the prediction of the trained model. The prediction follows the trend of a normal day. On the other hand, Figure 7b presents a day marked as anomalous. The anomaly is detected at the right side of Figure 7b, where the last samples do not follow the normal day pattern. This difference increases the MSE and results in an MSE score bigger than the threshold value. The threshold value is a hyper-parameter that is tuned for the ML application at hand (in this experiment, the threshold value is 0.1).
From a user’s perspective, the interpretation of the anomaly detection results indicates an excessive consumption of “Total Apparent Power” in the Smart-Home/Office during the night. This event detection could be exploited to increase energy efficiency and reduce the electricity costs in a Smart-Home/Office environment, or to trigger further investigation in case of a malfunction due to a cyber-attack.

VI. CONCLUSIONS AND FUTURE WORK
Securing SMEs’ ecosystems, in the era of IoT evolution, is a very challenging and extremely important task for the
cyber-security researchers. Nowadays, legacy signature-based anomaly detection techniques are no longer sufficient, as rapid technology improvements lead to new attack schemes. In the context of the SPEAR project, the Smart-Home/Office use case scenario offers the opportunity for testing and implementing novel anomaly detection methodologies that apply in smart environments similar to those of SMEs.

Fig. 7: Seq2Seq LSTM Encoder-Decoder - Anomaly Detection: Red - Original measurement values of the time-series. Blue - The predicted time-series measurement values from the seq2seq LSTM model.
(a) A “normal” day from the time-series of the “Total Apparent Power” consumed in the Smart-Home/Office.
(b) An “abnormal” day from the time-series of the “Total Apparent Power” consumed in the Smart-Home/Office. The anomaly here indicates a high consumption of “Total Apparent Power” during the night, visualized as a tail at the last samples of the day.

The initial experimental results present the potential value of the original datasets extracted from the Smart-Home/Office, which constitute a solid basis upon which effective and robust anomaly detection models can be built. Those datasets offer a wealth of useful information and features in different layers that can be leveraged not only by the models presented in this paper but also by other types of predictive models. One more step in the future work is the further improvement of, and experimentation with, the two LSTM-based models, their hyper-parameters and the features in different layers (raw packet traffic, application layer network protocols, multivariate measurements). Additionally, we aim to generate real attack scenarios against the Smart Home/Office equipment and test our methodologies with actual cyber-attack data. These are the next steps towards implementing a state-of-the-art security mechanism that operates on several different levels of a Smart-Home/Office environment and can efficiently detect different cyber-attack patterns.

VII. ACKNOWLEDGEMENT
The work presented in this paper is conducted under the framework of the SPEAR project, a Horizon 2020 program, funded by the European Union under the grant agreement No. 787011.

REFERENCES
[1] URL: http://www.netflowmeter.ca/netflowmeter.html (visited on 06/27/2019).
[2] Justin Bayer et al. “Evolving memory cell structures for sequence learning”. In: International Conference on Artificial Neural Networks. Springer. 2009, pp. 755–764.
[3] Justin Beaver, Raymond Borges, and Mark Buckner. “An Evaluation of Machine Learning Methods to Detect Malicious SCADA Communications”. In: vol. 2. Dec. 2013, pp. 54–59. DOI: 10.1109/ICMLA.2013.105.
[4] R. C. Borges Hink et al. “Machine learning for power system disturbance and cyber-attack discrimination”. In: 2014 7th International Symposium on Resilient Control Systems (ISRCS). Aug. 2014, pp. 1–8.
[5] Sucheta Chauhan and Lovekesh Vig. “Anomaly detection in ECG time signals via deep long short-term memory networks”. In: 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA). IEEE. 2015, pp. 1–7.
[6] Kyunghyun Cho et al. “Learning phrase representations using RNN encoder-decoder for statistical machine translation”. In: arXiv preprint arXiv:1406.1078 (2014).
[7] Chubb. 2018. URL: https://www.chubb.com/sg-en/articles/too-small-to-fail.aspx?utm_source=mediaroom&utm_medium=referral&utm_campaign=sme-cyber-report (visited on 06/27/2019).
[8] Gerard Draper-Gil et al. “Characterization of Encrypted and VPN Traffic using Time-related Features”. In: Proceedings of the 2nd International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, INSTICC. SciTePress, 2016, pp. 407–414. ISBN: 978-989-758-167-0. DOI: 10.5220/0005740704070414.
[9] Benedikt Eiteneuer and Oliver Niggemann. “LSTM for Model-based Anomaly Detection in Cyber-Physical Systems”. In: DX@Safeprocess. 2018.
[10] Cheng Fan et al. “Analytical investigation of autoencoder-based methods for unsupervised anomaly detection in building energy data”. In: Applied Energy 211 (2018), pp. 1123–1135. ISSN: 0306-2619. DOI: https://doi.org/10.1016/j.apenergy.2017.12.005. URL: http://www.sciencedirect.com/science/article/pii/S0306261917317166.
[11] Tharindu Fernando et al. “Soft+ hardwired attention: An lstm framework for human trajectory prediction and abnormal event detection”. In: Neural networks 108 (2018), pp. 466–478.
[12] Kelly Finnerty et al. “Cyber Security Breaches Survey 2018: Statistical Release”. In: (2018).
[13] FireEye. 5 reasons cyber attackers target SMEs. 2016. URL: https://www.fireeye.com/content/dam/fireeye-www/global/en/offers/pdfs/SME-Infographic_web.pdf (visited on 06/27/2019).
[14] Felix Gers and Jurgen Schmidhuber. “Recurrent nets that time and count”. In: vol. 3. Feb. 2000, 189–194 vol.3. ISBN: 0-7695-0619-4. DOI: 10.1109/IJCNN.2000.861302.
[15] Klaus Greff et al. LSTM: A Search Space Odyssey. arXiv.org. 2015.
[16] Rafal Jozefowicz, Wojciech Zaremba, and Ilya Sutskever. “An empirical exploration of recurrent network architectures”. In: Journal of Machine Learning Research (2015).
[17] Salah Kabanda, Maureen Tanner, and Cameron Kent. “Exploring SME cybersecurity practices in developing countries”. In: Journal of Organizational Computing and Electronic Commerce 28.3 (2018), pp. 269–282. DOI: 10.1080/10919392.2018.1484598.
[18] Cameron Kent, Maureen Tanner, and Salah Kabanda. “How South African SMEs address cyber security: The case of web server logs and intrusion detection”. In: 2016 IEEE International Conference on Emerging Technologies and Innovative Business Practices for the Transformation of Societies (EmergiTech) (2016), pp. 100–105.
[19] T. Kieu, B. Yang, and C. S. Jensen. “Outlier Detection for Multidimensional Time Series Using Deep Neural Networks”. In: 2018 19th IEEE International Conference on Mobile Data Management (MDM). June 2018, pp. 125–134. DOI: 10.1109/MDM.2018.00029.
[20] Diederik P. Kingma and Jimmy Ba. Adam: A Method for Stochastic Optimization. 2014. arXiv: 1412.6980 [cs.LG].
[21] Diederik P Kingma and Max Welling. “Auto-encoding variational bayes”. In: arXiv preprint arXiv:1312.6114 (2013).
[22] Arash Habibi Lashkari et al. “Characterization of Tor Traffic using Time based Features”. In: Proceedings of the 3rd International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, INSTICC. SciTePress, 2017, pp. 253–262. ISBN: 978-989-758-209-7. DOI: 10.5220/0006105602530262.
[23] LSTM cell figure. URL: http://wp.firrm.de/index.php/2018/04/13/building-a-lstm-network-completely-from-scratch-no-libraries/ (visited on 06/20/2019).
[24] Pankaj Malhotra et al. “Long short term memory networks for anomaly detection in time series”. In: Proceedings. Presses universitaires de Louvain. 2015, p. 89.
[25] Pankaj Malhotra et al. “LSTM-based encoder-decoder for multi-sensor anomaly detection”. In: arXiv preprint arXiv:1607.00148 (2016).
[26] Pankaj Malhotra et al. “Multi-sensor prognostics using an unsupervised health index based on lstm encoder-decoder”. In: arXiv preprint arXiv:1608.06154 (2016).
[27] Evangelos Markakis et al. “Acceleration at the edge for supporting SMEs Security: The FORTIKA paradigm”. English. In: IEEE Communications Magazine 57.2 (Feb. 2019), pp. 41–47. ISSN: 0163-6804. DOI: 10.1109/MCOM.2019.1800506.
[28] Alaa Mohasseb et al. “Predicting CyberSecurity Incidents Using Machine Learning Algorithms: A Case Study of Korean SMEs”. In: Feb. 2019. DOI: 10.5220/0007309302300237.
[29] Thomas H. Morris, Zach Thornton, and Ian P. Turnipseed. “Industrial Control System Simulation and Data Logging for Intrusion Detection System Research”. In: 2015.
[30] S. Pan, T. Morris, and U. Adhikari. “Classification of Disturbances and Cyber-Attacks in Power Systems Using Heterogeneous Time-Synchronized Data”. In: IEEE Transactions on Industrial Informatics 11.3 (June 2015), pp. 650–662. ISSN: 1551-3203. DOI: 10.1109/TII.2015.2420951.
[31] S. Pan, T. Morris, and U. Adhikari. “Developing a Hybrid Intrusion Detection System Using Data Mining for Power Systems”. In: IEEE Transactions on Smart Grid 6.6 (Nov. 2015), pp. 3104–3113. ISSN: 1949-3053.
[32] B. Russell and D. Van Duren. Practical Internet of Things Security. Packt Publishing, 2016. ISBN: 9781785880292. URL: https://books.google.gr/books?id=Tv5vDQAAQBAJ.
[33] SPEAR project H2020 official website. URL: https://www.spear2020.eu/ (visited on 06/27/2019).
[34] Ilya Sutskever, Oriol Vinyals, and Quoc V Le. “Sequence to sequence learning with neural networks”. In: Advances in neural information processing systems. 2014, pp. 3104–3112.
[35] Adrian Taylor, Sylvain Leblanc, and Nathalie Japkowicz. “Anomaly detection in automobile control network data with long short-term memory networks”. In: 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA). IEEE. 2016, pp. 130–139.
[36] Understanding LSTM networks. URL: http://colah.github.io/posts/2015-08-Understanding-LSTMs/ (visited on 06/27/2019).
[37] Ozlem Yavanoglu and Murat Aydos. “A Review on Cyber Security Datasets for Machine Learning Algorithms”. In: Dec. 2017. DOI: 10.1109/BigData.2017.8258167.
