
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/332464143

Application of Big Data Analytics pertaining to Power system Security

Conference Paper · April 2019


2 authors:

R. Thamizhselvan Ramesh Kumar Selvaraju


Annamalai University Annamalai University


All content following this page was uploaded by R. Thamizhselvan on 17 April 2019.



International Conference on Communication and Electronics Systems ( ICCES 2018)

Application of Big Data Analytics pertaining to Power system Security
Thamizhselvan Ramalingam, Assistant Professor (on Deputation), Department of Electrical Engineering, Annamalai University, Tamilnadu, India-608002, tamil2012au@gmail.com
Rameshkumar Selvaraju, Assistant Professor (on Deputation), Department of Electrical Engineering, Annamalai University, Tamilnadu, India-608002, rameshkumar.au@gmail.com
Karthikeyan Shanmugam, Assistant Professor (on Deputation), Department of Electrical Engineering, Annamalai University, Tamilnadu, India-608002, karthikaueee79@gmail.com
Thirunavukkarasu Jayaraman, Assistant Professor (on Deputation), Department of Electrical Engineering, Annamalai University, Tamilnadu, India-608002, arasujkm@yahoo.co.in
Saravanan Manikam, Assistant Professor, Department of Electronics and Communication Engineering, Annamalai University, Tamilnadu, India-608002, saravananm180982@gmail.com

Abstract— Today, the power industry is experiencing tremendous changes in both the scale of the power grid and the complexity of the system, along with the rapid installation of computers and communicating smart devices. In building a smart power system, big data analytics has become essential to the energy industry, where it serves interconnected and distributed resources, consumption calculation and future prediction. A huge amount of measurement data covering production, operation, control, trading and consumption is continuously collected, communicated and processed at remarkable speed. Big data is a rising technology that applies to data sets whose size is so large that common data tools struggle to generate, manage and operate on them under multiple limits. Existing data analysis schemes based on conventional methods do not correlate data in space and time, and it is difficult to design a unified and generalized power system model. This paper reviews the application of big data analytics using machine learning algorithms to indicate uncertainties in the power system through its feature characteristics.

Keywords— Big Data Analytics, Feature selection, Machine Learning algorithms, Power system Security

I. INTRODUCTION

Based on Big Data Analytics (BDA) and cloud technology, an integrated architecture can be used in the smart power system to optimize power transmission, control power consumption, and indicate changes in power demand and quality of supply [1]. The main advantage of a smart power network over a standard power system is its support for two-way communication between functions, which enables dynamic data exchange between utilities. The huge volume of captured data is therefore stored automatically, either offline or under cloud management [2] via the internet, which is necessary to build data knowledge; this is referred to as 'data storage'. Within this automated BDA process, a fast predictive methodology is needed to develop a model that can analyze the system status based on the data knowledge. These are the main assets in power system analysis when supporting decision making for the control, monitoring, protection, reliability and efficiency of power system operation. It is clear, however, that these data are hardly utilized, and that considerable insecurity is associated with them [3], [4].

Nowadays, big data is attracting increasing attention from academia, industry and government [5], [6]. 'Big Data' is a term for a huge volume of data that is growing exponentially [7]. The process of data analytics includes extracting useful information from the data by building all possible relations among various data, which makes big data appear even bigger. A volume of data at the scale called 'Big Data' provides too much information for a human data analyst in any field [8]. In the power system context, collecting huge volumes of data through PMUs in SCADA and EMS is an uncomfortable and time-consuming process. This makes it much harder to gain insights by correlating the data, crunching information and understanding patterns in the data [9], [10].

Machine Learning (ML) has therefore been proposed to train models by feeding them datasets, so that algorithms can carry out problem solving and decision making within the automated BDA process [11]. Along with newer technologies such as the Internet of Things (IoT), big data analytics with ML-based prediction analysis will make electric utilities smarter [12]. To develop smart big data applications using IoT, the required tools are the Hadoop ecosystem, including the Hadoop Distributed File System, Hadoop MapReduce, Apache Sqoop, Spark, Hive, Zeppelin, Play Framework, HTML and web-based visualization libraries such as Amcharts.js. Sqoop is a tool designed for efficiently transferring bulk data between relational databases and Apache Hadoop [13]. Spark provides four main libraries that can be combined in the same application: SQL and DataFrames for data manipulation, MLlib for machine learning algorithms, GraphX for graph computations, and Spark Streaming for streaming workloads. Spark offers developer APIs in Java, Python, Scala and R [14]. Apache Hive is a data warehouse infrastructure that provides data summarization, querying, and analysis [15]. This paper does not aim to justify all of the advantages of BDA. It addresses only the effectiveness of machine learning based classification and how it helps to improve data exchange among electrical utilities.

IEEE Xplore ISBN:978-1-5386-4765-3



II. CONCEPT OF BIG DATA

A. Characteristics of Big Data
The main characteristics of big data are volume, velocity, veracity, variety and value [7], as shown in Fig. 1. The details are given below.

Figure 1: Characteristics of Big Data (volume, velocity, veracity, variety and value)

Volume: The term 'Big data' itself implies volume. The required data is collected from various sources, online and offline. A huge volume of good quality data makes the analysis healthier, but storing and managing this amount of data can become difficult.
Velocity: The rate at which data is generated in the database. It describes how fast data arrives from various real-world sources, online and offline.
Variety: Data is available in many formats such as numbers, text, images and videos. These account for the variety of data.
Value: Big data analytics enables utilities to derive great value from data to support real-time operations, strategic decisions and system planning.
Veracity: The reliability and quality of the data.

B. Concept of Big Data Analytics in Electrical Utilities
Electrical power systems are extremely complex, with an immediate need to match millions of demand requirements with supply [3], [7]. Big Data Analytics and advanced information technologies hold the promise of improved system reliability, greater energy efficiency, and lower costs for consumers.

Big Data Analytics allows the massive amounts of data generated by electronic sensors, smart grid technologies, electricity supply, grid operations, and customer demands to be coordinated, analyzed, understood and effectively utilized in control centres, as shown in Fig. 2. Its main uses are as follows [16]:

• Develop models and simulations of the electrical grid and infrastructure to improve their reliability, resilience, technology adoption, and energy demand management.
• Predict equipment failures and power outages, allowing utilities to optimize the level of protection.
• Improve the operating efficiency of electrical generation, transmission, and distribution.
• Integrate intermittent power sources (i.e. renewable) more efficiently and effectively.
• Enable control engineers and consumers to make better decisions, based on data and empirical investigation rather than on intuition or past practice.

Figure 2: Features of Big data analytics in Power system

C. Sources of Data Generation in Big Data Analytics
In this paper, the objective of PSBDA is to assess the very large volumes of data from various components in the smart networks and grids, using sensors such as PMUs integrated with intelligent power electronic devices as shown in Fig. 3, and to transform these data into meaningful inputs for power system security analysis [17]. The outcome of these learning applications may identify operating trends that help to anticipate future failures and, consequently,

provide timely and accurate insights for predictive analysis.

Figure 3: Sources of Power system data generation in BDA


III. APPLICATION OF MACHINE LEARNING ALGORITHMS FOR POWER SYSTEM SECURITY

The main aim of using BDA in power systems is to extract useful information (value) from the data. The field of Machine Learning provides a ground for scientists to explore learning models and learning algorithms that can help machines (e.g., computers) learn the system from data [11].

[Block diagram. Training path: data samples, feature selection process, training, ML classifier. Testing path: testing with classifier, classification, security index.]

Figure 4: Pattern classification using ML Algorithms

The first step is to generate the data set by performing load flow analysis, either in real time or from a repository under cloud management. Once the data are collected, the next step is data pre-processing. In this step, data that come from various sources and in a variety of formats, possibly containing missing or erroneous values, are extracted using feature selection algorithms. The data are then transformed to the target repository's format, after which they are loaded into a repository under cloud management. Informed actions or decisions about system security can then be made through the training and testing process of pattern classification, as shown in Fig. 4.

A. Support Vector Machine
In this paper, the design of pattern classification is based on the Support Vector Machine (SVM). SVM performs classification by first mapping the input data to a multidimensional feature space and then constructing an optimal hyperplane classifier that separates the two classes with maximum margin. SVM minimizes an error function with an iterative training algorithm to construct the optimal hyperplane [18].

Consider a training set $T = \{(x_i, y_i)\}$, $i = 1, 2, \ldots, N$, where $x_i$ is a real-valued n-dimensional input vector and $y_i \in \{+1, -1\}$ is a label that determines the class of the data instance $x_i$. SVMs are employed for such two-class problems. The optimal hyperplane (the line between the two classes) is determined by an orthogonal vector $w$ and a bias $b$. The points closest to the optimal separating hyperplane with the largest margin $\rho$ are called Support Vectors (SVs), as shown in Fig. 5. To construct this optimal separating hyperplane, the SVM classifier solves the following primal optimization problem:

Figure 5: The linear separating hyper plane with optimal margin

$$\min_{w,\, b,\, \xi} \; \frac{1}{2} w^{T} w + C \sum_{i=1}^{N} \xi_i \tag{1}$$

subject to the constraints

$$y_i \left( w^{T} \phi(x_i) + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, 2, \ldots, N \tag{2}$$

In Eq. 1 and Eq. 2, $w$ is the weight vector of the hyperplane, $C$ is the penalty parameter proportional to the amount of constraint violation, $\xi_i$ is the slack variable, $\phi(\cdot)$ is the mapping function associated with the 'kernel' function, and $b$ is the threshold. The kernel function maps data in the input space to a feature space where they are linearly separable [19], [20].

B. Feature Selection Process
Not all the variables that portray the power system network contain information useful for classification. Removing the unwanted variables from the data set therefore improves classifier performance [21]. The F-value is used as the criterion function for selecting a variable as a feature, as given in Eq. 3.

$$F = \frac{m_s - m_i}{\sigma_s^{2} + \sigma_i^{2}} \tag{3}$$

where $m_s$ is the mean of the variable in the secure class, $m_i$ is the mean of the variable in the insecure class, $\sigma_s^{2}$ is the variance of the variable in the secure class, and $\sigma_i^{2}$ is the variance of the variable in the insecure class.

The selection of features begins with the computation of F-values for all components (variables) of the pattern vector in the training set. The variable with the largest F-value is selected as the first feature. Let this variable be $z_1$. When selecting other features, redundant information is omitted by discarding the variables that are correlated with $z_1$, i.e. those variables having a correlation coefficient greater than 0.8 [22].

IV. RESULTS AND DISCUSSION
The proposed SVM-based classification has been implemented on the IEEE 118-bus power system [23], which is considered as a smart network for BDA.

A. Data Set Generation in Real Time
The process of data set generation can be initiated at any appropriate time period chosen in Energy Management Systems (EMS) or control centres. Fluctuations in generation and load have been accounted for up to ±30%. The security limits for bus voltages across the entire network were set to 0.90 to 1.10 p.u. The preset MVA limit of transmission lines in this network is assumed to be up to 130% of loading [22]. Based on the real-time analysis, the patterns (variables) are generated from the load flow results under normal conditions and under line faults occurring on one line or on multiple lines at a time. The details of data set generation are shown in Table 1.
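The security limits of Section IV-A and the F-value criterion of Section III-B can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The four operating states, the variable names (`V1`, `V2`, `S1`, `S2`) and all function names are hypothetical. Two choices are assumptions of the sketch: the numerator of Eq. 3 is taken in absolute value so that ranking by largest F does not depend on sign, and a candidate is checked against every feature already selected rather than against $z_1$ only.

```python
from statistics import mean, pvariance

# Limits quoted in the text (Sections IV-A and III-B).
V_MIN_PU, V_MAX_PU = 0.90, 1.10   # bus-voltage band in per unit
MAX_LINE_LOADING_PCT = 130.0      # line MVA loading limit in percent
CORR_LIMIT = 0.8                  # correlation cut-off for redundancy


def is_secure(bus_voltages_pu, line_loadings_pct):
    """Return True when every bus voltage and line loading is within limits."""
    voltages_ok = all(V_MIN_PU <= v <= V_MAX_PU for v in bus_voltages_pu)
    lines_ok = all(s <= MAX_LINE_LOADING_PCT for s in line_loadings_pct)
    return voltages_ok and lines_ok


def f_value(secure_vals, insecure_vals):
    """F-value of Eq. 3 (absolute difference is an assumption of this sketch)."""
    diff = abs(mean(secure_vals) - mean(insecure_vals))
    spread = pvariance(secure_vals) + pvariance(insecure_vals)
    if spread == 0:
        # Both classes constant: perfectly separating if the means differ.
        return float("inf") if diff else 0.0
    return diff / spread


def correlation(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    norm = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return cov / norm if norm else 0.0


def select_features(columns, labels, corr_limit=CORR_LIMIT):
    """Rank variables by F-value, skipping those correlated with a kept one.

    columns: dict mapping a variable name to its values over all samples.
    labels:  list of booleans, True for samples in the secure class.
    """
    scores = {}
    for name, values in columns.items():
        secure = [v for v, ok in zip(values, labels) if ok]
        insecure = [v for v, ok in zip(values, labels) if not ok]
        scores[name] = f_value(secure, insecure)

    selected = []
    for name in sorted(scores, key=scores.get, reverse=True):
        if all(abs(correlation(columns[name], columns[kept])) <= corr_limit
               for kept in selected):
            selected.append(name)
    return selected


# Four hypothetical operating states: two bus voltages and two line loadings.
states = [
    {"v": [1.00, 0.98], "s": [80.0, 95.0]},
    {"v": [1.02, 0.97], "s": [90.0, 100.0]},
    {"v": [0.88, 0.95], "s": [120.0, 110.0]},   # under-voltage at bus 1
    {"v": [0.93, 0.91], "s": [140.0, 125.0]},   # line 1 above 130 %
]
labels = [is_secure(st["v"], st["s"]) for st in states]

columns = {
    "V1": [st["v"][0] for st in states],
    "V2": [st["v"][1] for st in states],
    "S1": [st["s"][0] for st in states],
    "S2": [st["s"][1] for st in states],
}
selected = select_features(columns, labels)
print(labels, selected)
```

Because Eq. 3 is not scale-free, variables measured in different units (per-unit voltages and percentage loadings here) produce F-values of very different magnitude; normalizing the variables before ranking may be advisable in practice.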


Table 1 Generation of Data set

Generated Variables                                       Feasible no. of Variables
Vi and δi                                                 118 each
PGi                                                       19
QGi                                                       54
PDi and QDi                                               99 each
Pi-j, Pj-i, Qi-j, Qj-i, Si-j, Ploss and Qloss             186 each

Initially, 1001 feasible input variables were recorded; 410 of these were then selected by F-value feature selection. Both sets of input variables have been considered separately for SVM-based classification.

Out of 1063 possible operating scenarios, 744 are found to be secure and the remaining 319 are found to be insecure. This data set is then randomly split into training and testing sets.

The performance of the SVM classifier in BDA has been evaluated by adjusting the SVM parameters and trying different kernels. The results of the SVM classifier with and without feature selection are presented in Table 2.

Table 2 Performance evaluation of ML classifier

Performance Evaluation    Training         Testing (All Variables)    Testing (With feature selection)
Accuracy (%)              100 (850/850)    92.48 (197/213)            100 (213/213)
Error (%)                 0 (0/850)        7.51 (16/213)              0 (0/213)

During the testing phase, the classification accuracy of the classifier with feature selection (FS) is 100%, compared with 92.48% when all variables are used. The performance of the SVM classifier is therefore satisfactory, and deploying machine learning models to analyze these data can help to generate useful information about power system security. Building on this, BDA, and especially its operational and enterprise analytics, can indicate features such as asset maintenance, outage management, distribution optimization, real-time monitoring and visualization of data in the smart power network.

V. CONCLUSION

This paper has presented a unique feature of big data analytics using machine learning algorithms for power system security. The smart network, which includes generation units, transmission lines and distribution systems, provides sufficient data to create powerful analytics for managing system security. By combining these techniques, it is possible to know and anticipate load demand, generation volume and system disturbances, and then to regulate the control automatically and quickly, so that the grid and other electrical utilities can be improved and instabilities avoided. In this work, the IEEE 118-bus system has been used to demonstrate some of the key features of big data analytics.

ACKNOWLEDGMENT

We wish to thank the authorities of Annamalai University, Annamalainagar, and the authorities of the deputed institutions, Tamilnadu, India, who provided the time to work on and prepare this paper.

REFERENCES

[1] V. S. Thiyagarajan, A. Ayyasamy, "Privacy Preserving Over Big Data Through VSSFA and MapReduce Framework in Cloud Environment", Wireless Personal Communications, pp. 1-25, 2017.
[2] K. Venkatachalapathy, V. S. Thiyagarajan, A. Ayyasamy and K. Ranjani, "Big data with cloud virtualization for effective resource handling", International Journal of Control Theory and Applications, vol. 9, no. 2, pp. 435-444, May 2016.
[3] B. A. Schuelke-Leech, B. Barry, M. Muratori, and B. J. Yurkovich, "Big Data issues and opportunities for electric utilities", Renewable and Sustainable Energy Reviews, Vol. 52, pp. 937-947, 2015.
[4] U.S. Department of Energy, "Smart Grid system report", 2009. [Online]. Available: https://www.smartgrid.gov/files/systems_report.pdf, 2009.
[5] J. Shibily and E.A. Jasmin, "Big Data Analytics for Distribution System Monitoring in Smart Grid", International Journal of Smart Home, Vol. 11, No. 5, pp. 21-32, 2017.
[6] L. Hua, Z. Junguo and L. Fantao, "Internet of things technology and its applications in smart grid", Indonesian Journal of Electrical Engineering and Computer Science, Vol. 12, No. 2, pp. 940-946, 2014.
[7] P. Zikopoulos and C. Eaton, "Understanding big data: Analytics for enterprise class hadoop and streaming data", McGraw-Hill Osborne Media, 2011.
[8] E. Giglioli, C. Panzacchi and L. Senni, "How Europe is approaching the smart grid", McKinsey & Company, 2010.
[9] A. Faruqui, S. Sergici and A. Sharif, "The impact of informational feedback on energy consumption—A survey of the experimental evidence", Energy, Vol. 35, No. 4, pp. 1598-1608, 2010.
[10] M. Hassanalieragh, A. Page, T. Soyata, G. Sharma, G. Mateos, and S. Andreescu, "Health monitoring and management using Internet-of-Things (IoT) sensing with cloud-based processing: Opportunities and challenges", IEEE International Conference on Services Computing (SCC), pp. 285-292, 2015.
[11] T. G. Dietterich, "Machine-learning research: Four current directions", AI Magazine, vol. 18, no. 4, pp. 97-136, 1997.
[12] F. Xia, L.T. Yang, L. Wang, and A. Vinel, "Internet of things", International Journal of Communication Systems, Vol. 25, No. 9, pp. 1101-1102, 2012.
[13] The Apache Software Foundation, "Apache Spark: Lightning-fast cluster computing", 2016. [Online]. Available: http://spark.apache.org/.
[14] The Apache Software Foundation, "APACHE HIVE TM", The Apache Software Foundation, 2014. [Online]. Available: https://hive.apache.org.
[15] The Apache Software Foundation, "Apache Sqoop", The Apache Software Foundation. [Online]. Available: http://sqoop.apache.org/, 2016.
[16] H. Jiang, K. Wang, Y. Wang, M. Gao, and Y. Zhang, "Energy Big Data: A Survey, Special Section On Theoretical Foundations For Big Data Applications: Challenges And Opportunities", IEEE Access, Vol. 4, 2016.
[17] C.C. Chang and C. J. Lin, "LIBSVM: a Library for Support Vector Machines", 2001. (Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.)
[18] N. Christianini, J. Shawe-Taylor, "An Introduction to Support Vector Machines and Other Kernel-based Learning Methods", MIT Press, Cambridge, 2000.
[19] R. Herbrich, "Learning Kernel Classifiers: Theory and Algorithms", MIT Press, 2001.
[20] J. Min and Y. Lee, "Bankruptcy Prediction Using Support Vector Machine with Optimal Choice of Kernel Function Parameter", Expert Systems with Applications, Vol. 28(4), pp. 603-614, 2005.
[21] M. Dash and H. Liu, "Feature Selection for Classification", Intelligent Data Analysis, Vol. 1(3), pp. 131-156, 1997.


[22] C.K. Pang, A.J. Koivo, and A.H. El-Abiad, "Application of Pattern Recognition to Steady-State Security Evaluation in a Power System", IEEE Transactions on Systems, Man and Cybernetics, Vol. 3, No. 6, pp. 622-631, 1973.
[23] http://www.ee.washington.edu/research/pstca (Power System Test Case Archive), 1996.
