
Intelligent Data Mining and Analysis in Power and Energy Systems: Models
and Applications for Smarter Efficient Power Systems
1st Edition
Zita A. Vale
Intelligent Data Mining and Analysis in Power and Energy Systems
IEEE Press
445 Hoes Lane
Piscataway, NJ 08854

IEEE Press Editorial Board


Sarah Spurgeon, Editor in Chief

Jón Atli Benediktsson
Anjan Bose
Adam Drobot
Peter (Yong) Lian
Andreas Molisch
Saeid Nahavandi
Jeffrey Reed
Thomas Robertazzi
Diomidis Spinellis
Ahmet Murat Tekalp
Intelligent Data Mining and Analysis in Power and
Energy Systems

Models and Applications for Smarter Efficient Power Systems

Edited by

Zita Vale
GECAD Research Group on Intelligent Engineering and Computing for Advanced Innovation
and Development (GECAD), Polytechnic Institute of Porto (ISEP/IPP), Porto, Portugal

Tiago Pinto
GECAD Research Group on Intelligent Engineering and Computing for Advanced Innovation
and Development (GECAD), Polytechnic Institute of Porto (ISEP/IPP), Porto, Portugal, and
University of Trás-os-Montes e Alto Douro, Vila Real, Portugal

Michael Negnevitsky
School of Engineering, University of Tasmania, Hobart, Tasmania, Australia

Ganesh Kumar Venayagamoorthy
Holcombe Department of Electrical and Computer Engineering, Real-Time Power and
Intelligent Systems Laboratory, Clemson University, Clemson, SC, USA, and School of
Engineering, University of KwaZulu-Natal, Durban, South Africa
Copyright © 2023 by The Institute of Electrical and Electronics Engineers, Inc.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. All rights reserved.
Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any
means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section
107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or
authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222
Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com.
Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons,
Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/
go/permission.

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or
its affiliates in the United States and other countries and may not be used without written permission. All other
trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product
or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing
this book, they make no representations or warranties with respect to the accuracy or completeness of the contents
of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose.
No warranty may be created or extended by sales representatives or written sales materials. The advice and
strategies contained herein may not be suitable for your situation. You should consult with a professional where
appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages,
including but not limited to special, incidental, consequential, or other damages. Further, readers should be aware
that websites listed in this work may have changed or disappeared between when this work was written and when it
is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages,
including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer
Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317)
572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be
available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data

Names: Vale, Zita, editor. | Pinto, Tiago, PhD, editor. | Negnevitsky,
Michael, editor. | Venayagamoorthy, Ganesh Kumar, editor.
Title: Intelligent data mining and analysis in power and energy systems :
models and applications for smarter efficient power systems / edited by
Zita Vale, Tiago Pinto, Michael Negnevitsky, Ganesh Kumar
Venayagamoorthy.
Description: Hoboken, New Jersey : Wiley-IEEE Press, [2023] | Series: IEEE
press series on power and energy systems
Identifiers: LCCN 2022043311 (print) | LCCN 2022043312 (ebook) | ISBN
9781119834021 (cloth) | ISBN 9781119834038 (adobe pdf) | ISBN
9781119834045 (epub)
Subjects: LCSH: Electric power systems. | Data mining.
Classification: LCC TK1001 .I577 2023 (print) | LCC TK1001 (ebook) | DDC
621.31–dc23/eng/20220909
LC record available at https://lccn.loc.gov/2022043311
LC ebook record available at https://lccn.loc.gov/2022043312

Cover Design: Wiley


Cover Image: © metamorworks/Shutterstock

Set in 9.5/12.5pt STIXTwoText by Straive, Chennai, India


“To our dear Parents and Children”

Contents

About the Editors xix


List of Contributors xxi
Foreword xxvii

Introduction 1
References 3

Part I Data Mining and Analysis Fundamentals 5

1 Foundations 7
Ansel Y. Rodríguez-González, Angel Díaz-Pacheco, Ramón Aranda, and Miguel Á.
Álvarez-Carmona
Acronyms 7
1.1 Data Mining: Why and What? 7
1.2 Data Mining into KDD 8
1.3 The Data Mining Process 9
1.3.1 Data Cleaning 10
1.3.2 Data Integration 10
1.3.3 Data Reduction 11
1.3.4 Data Transformation 12
1.4 Data Mining Task and Techniques 12
1.4.1 Techniques 14
1.4.1.1 Techniques in the “Description” Branch 14
1.4.1.2 Regression Techniques 14
1.4.1.3 Classification Techniques 15
1.4.2 Applications 17
1.5 Data Mining Issues and Considerations 18
1.5.1 Scalability of Algorithms 18
1.5.2 High Dimensionality 18
1.5.3 Improving Interpretability 18
1.5.4 Handling Uncertainty 19
1.5.5 Privacy and Security Concerns 19
1.6 Summary 19
References 20

2 Data Mining and Analysis in Power and Energy Systems: An Introduction to
Algorithms and Applications 25
Fernando Lezama
Acronyms 25
2.1 Introduction 25
2.2 Data Mining Technologies 26
2.2.1 Supervised Methods 26
2.2.1.1 Regression-Based Methods 27
2.2.1.2 Classification-Based Methods 27
2.2.2 Unsupervised Methods 27
2.2.2.1 Association Rule Mining 28
2.2.2.2 Clustering-Based Methods 28
2.3 Data Mining Applications in Power Systems 28
2.3.1 Profiling 29
2.3.2 Forecasting 31
2.3.3 Fault Detection and Diagnosis 33
2.3.4 Other Applications 34
2.4 Discussion and Final Remarks 35
References 37

3 Deep Learning in Intelligent Power and Energy Systems 45
Bruno Mota, Tiago Pinto, Zita Vale, and Carlos Ramos
Acronyms 45
3.1 Introduction 46
3.2 Deep Learning 49
3.2.1 Regression Problems 49
3.2.1.1 Photovoltaic Energy Forecast 49
3.2.1.2 Wind Power Forecast 50
3.2.1.3 Building Energy Consumption Prediction 50
3.2.1.4 Electricity Price Forecast 51
3.2.1.5 Other Regression Works 52
3.2.2 Classification Problems 52
3.2.2.1 Power Quality Disturbances Detection/Classification 53
3.2.2.2 Fault Detection/Classification 54
3.2.2.3 Feature Engineering 55
3.2.2.4 Other Classification Works 55
3.2.3 Decision-Making Problems 56
3.2.3.1 Energy Management 56
3.2.3.2 Demand Response 57
3.2.3.3 Electricity Market 57
3.2.3.4 Other Decision-Making Works 58
3.3 Accomplishments, Limitations, and Challenges 58
3.4 Conclusions 60
References 60

Part II Clustering 69

4 Data Mining Techniques Applied to Power Systems 71
Sérgio Ramos, João Soares, Zahra Forouzandeh, and Zita Vale
Acronyms 71
4.1 Introduction 71
4.1.1 Data Selection 72
4.1.2 Data Pre-processing 73
4.1.3 Data Mining 73
4.1.4 Analysis and Interpretation 74
4.2 Data Mining Techniques 75
4.2.1 Clustering Algorithms 76
4.2.2 Clustering Validity Indices 79
4.2.3 Classification Algorithms 80
4.3 Data Mining Techniques Applied to Power Systems 82
4.3.1 Electrical Consumers Characterization 83
4.3.1.1 Typical Load Profile 83
4.3.2 Electrical Consumers Characterization – Classification 86
4.3.3 Conclusions 89
4.4 Electrical Tariffs Design Based on Data Mining Techniques 90
4.4.1 Electrical Tariffs Design 90
4.4.2 Conclusions 93
4.5 Data Mining Contributions to Characterize Zonal Prices 93
4.5.1 Zonal Prices Characterization 93
4.5.2 Conclusions 97
4.6 Data Mining-Based Methodology for Wind Forecasting 98
4.6.1 Wind Forecasting 98
4.6.2 Conclusions 100
4.7 Final Remarks 101
References 101

5 Synchrophasor Data Analytics for Anomaly and Event Detection,
Classification, and Localization 105
Sajan K. Sadanandan, Arman Ahmed, Shikhar Pandey, and Anurag K. Srivastava
5.1 Introduction 105
5.2 Synchrophasor Data Quality Issues and Challenges 106
5.2.1 PMU Data Flow: Data Quality Issues 107
5.2.2 PMU Data Anomalies 108
5.3 ML-Based Anomaly Detection, Classification, and Localization (ADCL) Over Data
Drifting Multivariate Synchrophasor Data Streams 108
5.3.1 Data Drift in Synchrophasor Measurements 109
5.3.2 PMUNET Framework 110
5.3.2.1 Data Pre-Processing (DPP) Module 110
5.3.2.2 Data-Drift (DD) Module 110
5.3.2.3 Save-Load (SL) Module 112
5.3.3 Anomaly Detector (AD) Module 112
5.3.3.1 Anomaly Classification 113
5.3.3.2 Anomaly Localization 113
5.3.4 Distributed Deep Autoencoder Learning 113
5.4 Synchrophasor Data Anomaly and Event Detection, Localization, and
Classification (SyncAED) 114
5.4.1 Synchrophasor Data Anomaly Detection (SyncAD) 114
5.4.1.1 Base Detectors 115
5.4.1.2 Ensemble Method 115
5.4.1.3 Prony-Based Transient Window Estimation 115
5.4.2 Event Detection, Classification, and Localization 116
5.4.2.1 Event Detection 117
5.4.2.2 Event Classification 117
5.4.2.3 Event Localization 117
5.5 Test-Bed and Test Cases 119
5.5.1 Cyber-Power Test-Bed Architecture 119
5.5.1.1 Test Case 119
5.6 Results and Discussion 120
5.6.1 Simulation Results for PMUNET 120
5.6.1.1 Performance Evaluation Metrics 120
5.6.1.2 Experimental Analysis 121
5.6.2 Simulation Results for SyncAED 122
5.6.2.1 Anomaly Detection 122
5.6.2.2 Event Detection and Classification Using Clustering and Decision Tree 122
5.7 Summary 125
Acknowledgments 125
References 125

6 Clustering Methods for the Profiling of Electricity Consumers Owning Energy
Storage System 129
Cátia Silva, Pedro Faria, Juan M. Corchado, and Zita Vale
Acronyms 129
6.1 Introduction 129
6.2 Methodology Definition 131
6.3 Clustering of Consumers with ESS 135
6.3.1 Optimal Number of Clusters 135
6.3.1.1 Average Silhouette Method 136
6.3.1.2 Elbow Method 136
6.3.1.3 Gap Statistic Method 137
6.3.2 Clustering Methods 137
6.3.2.1 Partitional Clustering 138
6.3.2.2 Fuzzy Clustering 141
6.3.2.3 Hierarchical Clustering 142
6.4 Conclusion 145
Acknowledgments 146
References 146

Part III Classification 149

7 A Novel Framework for NTL Detection in Electric Distribution Systems 151
Chia-Chi Chu, Nelson Fabian Avila, Gerardo Figueroa, and Wen-Kai Lu
Acronyms 151
7.1 Introduction 151
7.1.1 State-of-the-Art 152
7.1.2 Proposed Framework 153
7.2 Data Acquisition and Pre-Processing 154
7.2.1 Data Acquisition 154
7.2.2 Pre-Processing 155
7.3 Feature Extraction 156
7.3.1 Overview 156
7.3.2 MODWPT 156
7.3.3 Feature Extraction Mechanism 156
7.4 Classification Strategies 158
7.4.1 Random Under-Sampling (RUS) and Random Over-Sampling (ROS) Techniques 158
7.4.2 Adaptive Boosting Algorithm 158
7.4.3 Random Under-Sampling Boosting Algorithm 159
7.5 Evaluation 160
7.6 Experiments 161
7.6.1 Outlier Detection Using Smoothing Splines 161
7.6.2 MODWPT-Based Signal Decomposition 163
7.6.3 RusBoost NTL Detection Technique 163
7.6.4 Comparison with Existing Approaches 164
7.7 Conclusion 166
References 167

8 Electricity Market Participation Profiles Classification for Decision Support in
Market Negotiation 171
Tiago Pinto and Zita Vale
Acronyms 171
8.1 Introduction 171
8.2 Bilateral Negotiation 172
8.3 Decision Support for Bilateral Negotiations 174
8.3.1 Clustering of Players Profiles 176
8.3.2 Classification of New Players 177
8.3.2.1 Artificial Neural Networks 177
8.3.2.2 Support Vector Machines 177
8.4 Illustrative Results 178
8.5 Conclusion 183
References 184

9 Socio-demographic, Economic, and Behavioral Analysis of Electric
Vehicles 187
Rúben Barreto, Tiago Pinto, and Zita Vale
Acronyms 187
9.1 Introduction 187
9.2 Electric Vehicle Outlook 188
9.2.1 Electric Mobility Market 188
9.2.2 Economic Aspects 189
9.2.3 Socio-demographic Aspects 190
9.2.4 Recommendations for Policymakers 191
9.3 Data Mining Models for EVs 191
9.3.1 Charging Behavior 191
9.3.2 EV User Behavior 192
9.3.3 Driving Range 193
9.3.4 Speed 194
9.3.5 Electric Vehicle Battery 195
9.3.6 Charging Station Planning 195
9.3.7 Summary 196
9.4 Conclusions 197
References 197

Part IV Forecasting 201

10 A Multivariate Stochastic Spatiotemporal Wind Power Scenario
Forecasting Model 203
Wenlei Bai, Duehee Lee, and Kwang Y. Lee
Acronyms 203
Nomenclature 203
10.1 Introduction 204
10.2 Generalized Dynamic Factor Model 206
10.2.1 Derivation of the GDFM 206
10.2.2 Estimation of the GDFM 208
10.2.3 Forecast of the GDFM 210
10.2.4 Verification of the GDFM 212
10.2.5 Application of the GDFM 216
10.3 Conclusion 219
References 221

11 Spatiotemporal Solar Irradiance and Temperature Data Predictive
Estimation 223
Chirath Pathiravasam and Ganesh Kumar Venayagamoorthy
Acronyms 223
11.1 Introduction 223
11.2 Virtual Weather Stations 225
11.3 Distributed Weather Forecasting 227
11.3.1 Spatiotemporal Prediction Network 227
11.3.2 Computational Units 228
11.4 Results and Discussion 228
11.4.1 Weather Data Estimation 229
11.4.2 Weather Data Prediction 230
11.5 Summary 232
Acknowledgment 234
References 234

12 Application of Decomposition-Based Hybrid Wind Power Forecasting in
Isolated Power Systems with High Renewable Energy Penetration 237
Evgenii Semshikov, Michael Negnevitsky, James Hamilton, and Xiaolin Wang
12.1 Introduction 237
12.2 Decomposition Techniques 238
12.2.1 Variational Mode Decomposition 239
12.2.2 Decomposition of Wind Power Time Series 239
12.3 Decomposition-Based Neural Network Forecasting 241
12.3.1 Theory Behind LSTM 242
12.3.2 VMD-LSTM for Wind Power Forecasting 242
12.4 Forecast-Based Dispatch in Isolated Power Systems 243
12.4.1 Control Strategy 244
12.4.2 Regulation and Load Following Reserves 246
12.5 Case Studies 249
12.5.1 King Island Isolated Power System 249
12.5.2 Case Study I (Control Strategy with No RE Forecast) 250
12.5.3 Case Study II (Control Strategy Involving Persistence Model RE Forecast) 251
12.5.4 Case Study III (Control Strategy Involving VMD-LSTM-Based RE
Forecast) 251
12.5.5 Economic Assessment Over a Year of Operation 252
12.6 Conclusions and Discussions 253
References 253

Part V Data Analysis 257

13 Harmonic Dynamic Response Study of Overhead Transmission Lines 259
Dharmbir Prasad, Rudra P. Singh, Irfan Khan, and Sushri Mukherjee
Acronyms 259
Nomenclature 259
13.1 Introduction to Methodology 260
13.1.1 General 261
13.1.2 Selection Aspects of Dampers 261
13.1.3 Literature Review 262
13.2 Problem Formulation 264
13.2.1 Design Aspect 265
13.2.2 Mathematical Modeling 265
13.3 Numerical Analysis 266
13.3.1 Simulation Inputs 267
13.3.1.1 Model Description 267
13.3.1.2 Load Excitation 268
13.3.1.3 Span Wise Phase Lag 268
13.3.2 Analysis Findings 271
13.4 Conclusion 273
13.A Appendix 274
References 277

14 Evaluation of Shortest Path to Optimize Distribution Network Cost and
Power Losses in Hilly Areas: A Case Study 281
Subho Upadhyay, Rajeev K. Chauhan, and Mahendra P. Sharma
Acronyms 281
14.1 Introduction 282
14.2 Design of Power Distribution Network 282
14.3 Digital Elevation Map 283
14.4 Placement of Generators and Load Centers 283
14.5 Single Line Diagram of 9-Bus System 285
14.6 Finding Shortest Path Between Load/Generating Centers 286
14.6.1 Objective Function 287
14.6.2 Distribution Network Distance 289
14.7 Selection of Conductor Using Newton Raphson Method 290
14.7.1 Estimation of Conductor Cost 292
14.8 Calculation of CO2 Emission Cost Saving 293
14.9 Overall Cost Estimation of Distribution System 294
14.10 Sensitivity Analysis 294
14.10.1 Change in Diesel Fuel Price 295
14.10.2 Change in Solar Radiation 295
14.10.3 Change in Demand 295
14.10.4 Change in Energy Index Ratio 295
14.11 Conclusion 295
References 296

15 Intelligent Approaches to Support Demand Response in Microgrid
Planning 299
Rahmat Khezri, Amin Mahmoudi, and Hirohisa Aki
Acronyms 299
15.1 Introduction 299
15.2 Microgrid Planning 300
15.2.1 Problem Overview 301
15.2.2 Objective Functions 302
15.2.3 Data Analysis 303
15.2.3.1 Weather Data 304
15.2.3.2 Load Data 304
15.2.3.3 Electricity Price 304
15.2.4 Microgrid Components 305
15.2.4.1 Distributed Energy Resources 305
15.2.4.2 Energy Storage Systems 305
15.2.5 Microgrid Operation 306
15.3 Demand Response in Microgrids 306
15.3.1 Overview on Demand Response Application for Microgrids 306
15.3.2 Demand Response: Types and Characteristics 307
15.3.3 Incentive DR 308
15.3.4 Time-Based DR 309
15.4 Intelligent Approaches to Support Demand Response 309
15.4.1 Data Mining Methods in DR 310
15.4.1.1 Supervised Data Mining 310
15.4.1.2 Unsupervised Data Mining 312
15.4.2 Fuzzy Logic-Based DR 313
15.4.3 Applications in Microgrid Planning 313
15.4.3.1 Cost Reduction 313
15.4.3.2 Resiliency Enhancement 314
15.4.3.3 Flexibility Improvement 314
15.4.3.4 Battery Capacity Reduction 314
15.5 Conclusion 315
References 315

16 Socioeconomic Analysis of Renewable Energy Interventions: Developing
Affordable Small-scale Household Sustainable Technologies in
Northern Uganda 319
Jens B. Holm-Nielsen, Achora P.O. Mamur, and Samson Masebinu
Acronyms 319
16.1 Introduction 319
16.2 Renewable Energy Technologies 321
16.2.1 Bio-oil 321
16.2.2 Bio-pellets 322
16.2.3 Biogas 322
16.2.4 Solar Cookers 322
16.2.5 Solar PV 323
16.3 Methodology 323
16.3.1 Driver Pressure Impact State Response Framework 323
16.3.2 Cost–Benefit Analysis 324
16.4 Application of the Method 324
16.5 Case Study Results for Product Development 327
16.5.1 Field Study 327
16.5.2 Source of Energy for Cooking 328
16.5.3 Source of Energy for Lighting 328
16.5.4 Household Income Level 328
16.5.5 Challenges for Firewood and Charcoal Use 329
16.5.6 The Rank of Adoption Toward Sustainable Renewable Energy Technologies 329
16.5.7 Household Opinions for Modern Energy Technologies 330
16.5.8 Level of Awareness of the Population 331
16.5.9 Medium of Information 331
16.5.10 Promotion to Purchase Alternative Renewable Energy Technologies 331
16.5.11 Sources of Fund for Investment in Northern Uganda Toward Renewable Energy
Technologies for Households 332
16.6 Cost–Benefit Analysis (CBA) 333
16.6.1 Benefits to Better Health 333
16.6.2 Benefits on Greenhouse Gas Emissions Reduction 334
16.6.3 Benefits of the District Forest Resources Preservation 334
16.6.4 Outcomes of Cost–Benefit Analysis 334
16.7 Conclusion 337
References 338

Part VI Other Machine Learning Applications 343

17 Non-Intrusive Load Monitoring Using A Parallel Bidirectional Long
Short-Term Memory Model 345
Victor Andrean and Kuo-Lung Lian
Nomenclature 345
17.1 Introduction 346
17.1.1 Optimization-Based Approach 346
17.1.2 Learning-Based Approach 347
17.2 NILM System and Data Preprocessing 349
17.2.1 Data Scaling 349
17.2.2 Window Length Selection 350
17.2.3 Input-to-Output Relation (IOR) 350
17.2.3.1 IOR1 351
17.2.3.2 IOR2 351
17.2.3.3 IOR3 351
17.2.3.4 IOR4 351
17.3 Proposed Method 352
17.3.1 Feature Extractor 353
17.3.2 Elements of the PBLSTM 355
17.3.2.1 Convolution Neural Network (CNN) 355
17.3.2.2 Bidirectional Long Short-Term Memory (BLSTM) 356
17.3.2.3 Dense Layer 357
17.3.2.4 Deep Neural Network Training 358
17.4 Validation 358
17.5 Conclusion 368
References 368

18 Reinforcement Learning for Intelligent Building Energy Management
System Control 371
Olivera Kotevska and Philipp Andelfinger
Chapter Objectives 371
18.1 Introduction 371
18.2 Reinforcement Learning 372
18.2.1 Deep Reinforcement Learning 374
18.2.2 Advanced Reinforcement Learning 375
18.3 Applications of Deep Reinforcement Learning in Building Energy Management
Systems Control 376
18.3.1 Heating, Ventilation, and Air Conditioning 377
18.3.2 Water Heater 379
18.3.3 Other Devices 380
18.4 Challenges and Research Directions 380
18.5 Conclusions 383
References 383

19 Federated Deep Learning Technique for Power and Energy Systems
Data Analysis 387
Hamed Moayyed, Arash Moradzadeh, Behnam Mohammadi-Ivatloo, and
Reza Ghorbani
Nomenclature 387
Acronyms 387
Symbols 387
19.1 Introduction 388
19.2 Federated Learning (FL) 388
19.2.1 Federated Learning Motivation 389
19.2.2 Performance Evaluation Metrics 390
19.2.3 Federated Learning vs. Distributed Machine Learning Approaches 391
19.2.4 The Federated Averaging Algorithm 391
19.2.5 Applications of Federated Learning 393
19.2.6 Challenges of Federated Learning 395
19.3 Power Systems Challenges and the Performance of Artificial Intelligence
Techniques in It 396
19.3.1 AI-Based Forecasting in Power Systems 396
19.3.2 AI-Based Condition Monitoring in Power Systems 397
19.4 Application of Federated Deep Learning in Power and Energy Systems 399
19.4.1 Electric Vehicle Networks 399
19.4.2 False Data Injection Attacks in Solar Farms 399
19.4.3 Solar Irradiation Forecasting 400
19.4.4 Heating Load Demand Forecasting 400
19.5 Conclusion 400
References 401

20 Data Mining and Machine Learning for Power System Monitoring,
Understanding, and Impact Evaluation 405
Xinda Ke, Huiying Ren, Qiuhua Huang, Pavel Etingov, and Zhangshuan Hou
Acronyms 405
20.1 Introduction 406
20.2 Power System Monitoring with Phasor Measurement Unit Data 407
20.2.1 PMU Anomaly Detection Framework 407
20.2.2 Anomaly Detection and Classification 408
20.3 Power System Mechanistic and Predictive Understanding 411
20.3.1 Spatiotemporal Pattern Recognition in PMU Signals 412
20.3.1.1 Time Series Pattern Recognition 412
20.3.1.2 Similarities and Variations Across Units 415
20.3.1.3 Similarities/Discrepancies Between Days/Months 417
20.3.2 Events Classification and Localization Through Convolutional Neural Network 417
20.3.2.1 Polish System Testbed and Data Preparation 417
20.3.2.2 Fault Types and Implementation 419
20.3.2.3 CNN Model Development 419
20.3.2.4 CNN Model Evaluation 420
20.3.2.5 Fault Localization 421
20.3.2.6 Fault Classification 422
20.4 Characterization and Modelling of Weather and Power Extremes 423
20.4.1 Data Sources 424
20.4.2 Spatiotemporal Analysis 425
20.4.3 Probabilistic Modelling of Lines Outage 428
20.5 Conclusion 430
References 430

Conclusions 433
Zita Vale, Tiago Pinto, Michael Negnevitsky, and Ganesh Kumar Venayagamoorthy

Index 435

About the Editors

Zita Vale, IEEE Senior Member, graduated in electrical engineering in 1986, received the PhD
degree in electrical and computer engineering in 1993, and the Agregação title (Habilitation) in
2003 from the University of Porto, Portugal. She is a full professor in the School of Engineering,
Polytechnic of Porto. She leads the research activities on Intelligent Power and Energy Systems
at GECAD – Research Group on Intelligent Engineering and Computing for Advanced Innovation
and Development. She has been involved in more than 60 R&D projects and published
more than 200 papers in international scientific journals. Her scientific research activities
mainly focus on Power and Energy Systems Operation, Electricity Markets, Demand Response,
Renewables, Electric Vehicles, and Distributed Generation and Storage. She has been developing
artificial-intelligence-based models, methods, and applications for power and energy, using agents
and multiagent systems, knowledge-based systems, semantics, machine learning, data mining,
and evolutionary computation.
She actively participates in several technical working groups and committees. She is the chair
of the IEEE PES Intelligent Data Mining and Analysis (IDMA) Working Group and of the Open
Data Sets (ODS) Task Force. She is the chair of the board of directors of ISAP – Intelligent Systems
Application to Power Systems. She is involved in editing activities for different journals and books
and is a regular reviewer and evaluator for papers and for project proposals and monitoring from
different funding agencies around the world.
Tiago Pinto is an assistant professor at the Universidade de Trás-os-Montes e Alto Douro
(UTAD), Portugal, and a researcher at INESC-TEC. He completed his BSc and MSc at
the School of Engineering of the Polytechnic of Porto, where he also carried out research
for more than 10 years at GECAD – Research Group on Intelligent Engineering
and Computing for Advanced Innovation and Development. He received his PhD from UTAD in 2016,
after which he held a postdoctoral position at the University of Salamanca, in collaboration with the
University of Oxford and ENGIE. His main research interests focus on artificial intelligence and
its application in power and energy systems, particularly machine learning, multi-agent systems,
decision support systems, and metaheuristic optimization. Through his involvement in more than
30 research projects in these fields, he has authored over 200 publications in international journals
and conferences, and has co-edited several books and special issues in journals related to power
and energy systems and artificial intelligence.
Michael Negnevitsky received the BE (Hons) degree in Electrical Engineering and the PhD degree
from Byelorussian University of Technology, Minsk, Belarus, in 1978 and 1983, respectively. From
1978 to 1980, he worked at the Electrical Maintenance, Construction and Commissioning Company,
and from 1984 to 1991, he was Senior Research Fellow at the Department of Electrical Engineering,
Byelorussian University of Technology, Minsk. After his arrival in Australia, he worked
at Monash University, Melbourne, and then the University of Tasmania. Currently he is Professor
and Chair in Power Engineering and Computational Intelligence, and Director of the Centre for
Renewable Energy and Power Systems.
His research interests include power system analysis and control, micro-grids with distributed
and renewable energy resources, smart grids, power quality, and applications of artificial intelligence
in power systems. He has published more than 450 papers in high-quality journals and
refereed conference proceedings, authored 14 chapters in several books, and received 4 patents
for inventions. His book Artificial Intelligence (Addison Wesley 2002, 2005, 2011) has been translated
into Mandarin, Cantonese, Korean, Greek, and Vietnamese and adopted in many universities
around the world.
He is a Fellow of Engineers Australia, a Fellow of the Japan Society for the Promotion of Science, and a
Member of CIGRE AP C4 (System Technical Performance) and AP C6 (Distribution Systems and
Dispersed Generation), Australian Technical Committee. Dr. Negnevitsky has served as Secretary
and Deputy Chair of IEEE PES Energy Development and Power Generation Committee, Chair
of IEEE PES International Practices Subcommittee, and Chair of the IEEE PES Working Group
on High Renewable Energy Penetration in Remote and Isolated Power Systems. In 2018, he
received the Joint Australasian Committee for Power Engineering and CIGRE Australia Award
for “outstanding career-long contributions to research and teaching in electric power engineering
as well as contribution to industry and CIGRE activities.”
Ganesh Kumar Venayagamoorthy has been the Duke Energy Distinguished Professor of Power Engineering
and Professor of Electrical and Computer Engineering at Clemson University since January
2012. Prior to that, he was a Professor of Electrical and Computer Engineering at the Missouri
University of Science and Technology (Missouri S&T), Rolla, USA, from 2002 to 2011.
From 1996 to 2002, he was a Senior Lecturer in the Department of Electronic Engineering, Durban
University of Technology, Durban, South Africa. Dr. Venayagamoorthy
is the Founder and Director of the Real-Time Power and Intelligent Systems Laboratory at Missouri
S&T and Clemson University.
Dr. Venayagamoorthy received his PhD and MScEng degrees in Electrical Engineering from
the University of Natal, Durban, South Africa, in April 2002 and April 1999, respectively. He
received his BEng degree with First-Class Honors in Electrical and Electronics Engineering from
Abubakar Tafawa Balewa University, Bauchi, Nigeria, in March 1994. He holds an MBA degree in
Entrepreneurship and Innovation from Clemson University, SC (August 2016).
Dr. Venayagamoorthy’s interests are in research, development and innovation of power systems,
smart grid, and artificial intelligence technologies. Dr. Venayagamoorthy is a Fellow of the IEEE,
IET (UK), the South African Institute of Electrical Engineers (SAIEE) and Asia-Pacific Artificial
Intelligence Association (AAIA), and a Senior Member of the International Neural Network
Society.

List of Contributors

A. Ahmed
Smart Grid Demonstration and Research Investigation Lab
Washington State University
Pullman, WA
USA

Hirohisa Aki
Faculty of Engineering, Information and Systems
University of Tsukuba
Tsukuba, Ibaraki
Japan

Philipp Andelfinger
Institute for Visual and Analytic Computing
University of Rostock
Rostock
Germany

Victor Andrean
Department of Electrical Engineering
National Taiwan University of Science and Technology
Taipei
Taiwan

Ramón Aranda
Tepic Technology Transfer Unit, Center for Scientific Research and Higher Education of Ensenada
Tepic, Nayarit, Mexico
and
National Council of Science and Technology
Mexico City, Mexico City
Mexico

Nelson F. Avila
Independent Electricity System Operator
Toronto, ON
Canada

Wenlei Bai
Oracle Energy and Water
Austin, TX
USA

Rúben Barreto
Research Group on Intelligent Engineering and Computing for Advanced Innovation and
Development (GECAD), Institute of Engineering, Polytechnic of Porto
Porto
Portugal

Miguel A. Carmona
Tepic Technology Transfer Unit, Center for Scientific Research and Higher Education of Ensenada
Tepic
Nayarit
Mexico
and
National Council of Science and Technology
Mexico City
Mexico City
Mexico

Rajeev K. Chauhan
Department of Electrical Engineering, Faculty of Engineering
Dayalbagh Educational Institute
Agra, Uttar Pradesh
India

Chia-Chi Chu
Department of Electrical Engineering
National Tsing Hua University
HsinChu
Taiwan, R.O.C.

Juan M. Corchado
BISITE research group
University of Salamanca
Salamanca
Spain

Pavel Etingov
Pacific Northwest National Laboratory
Richland, WA
USA

Pedro Faria
GECAD – Research Group on Intelligent Engineering and Computing for Advanced
Innovation and Development
Porto
Portugal

Gerardo Figueroa
Sentiance NV
Antwerpen
Belgium

Zahra Forouzandeh
GECAD – Research Group on Intelligent Engineering and Computing for Advanced
Innovation and Development, Polytechnic of Porto, School of Engineering (ISEP)
Porto
Portugal

Reza Ghorbani
Renewable Energy Design Laboratory (REDLab), Department of Mechanical Engineering
University of Hawaii at Manoa
Honolulu, HI
USA

James Hamilton
School of Engineering, University of Tasmania
Hobart
Tasmania
Australia

Jens B. Holm-Nielsen
Department of Energy Technology
Center for Bioenergy and Green Engineering
Aalborg University
Esbjerg
Denmark

Zhangshuan Hou
Pacific Northwest National Laboratory
Richland, WA
USA

Qiuhua Huang
Pacific Northwest National Laboratory
Richland, WA
USA

Xinda Ke
Pacific Northwest National Laboratory
Richland, WA
USA

Irfan Khan
Supreme & Co. Pvt. Ltd.
Kolkata, West Bengal
India

Rahmat Khezri
College of Science and Engineering
Flinders University
Adelaide, SA
Australia

Olivera Kotevska
Computer Science and Mathematics
Oak Ridge National Laboratory
Tennessee
Oak Ridge
USA

Duehee Lee
Electrical Engineering Department
Konkuk University
Seoul
Korea

Kwang Y. Lee
Electrical and Computer Engineering Department
Baylor University
Waco, TX
USA

Fernando Lezama
Polytechnic of Porto (ISEP/IPP), Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development (GECAD)
Porto
Portugal

Kuo-Lung Lian
Department of Electrical Engineering
National Taiwan University of Science and Technology
Taipei
Taiwan

Wen-Kai Lu
Department of Information Management
National Taiwan University
Taipei
Taiwan, R.O.C.

Amin Mahmoudi
College of Science and Engineering, Flinders University
Adelaide, SA
Australia

Achora P.O. Mamur
Faculty of Sociology, Environmental and Business Economics
University of Southern Denmark
Esbjerg
Denmark

Samson Masebinu
Department of Energy Technology, Center for Bioenergy and Green Engineering
Aalborg University
Esbjerg
Denmark

Hamed Moayyed
GECAD – Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development, Polytechnic of Porto (P.PORTO)
Porto
Portugal

Behnam Mohammadi-Ivatloo
Faculty of Electrical and Computer Engineering
University of Tabriz
Tabriz
Iran

Arash Moradzadeh
Faculty of Electrical and Computer Engineering
University of Tabriz
Tabriz
Iran

Bruno Mota
GECAD Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development (GECAD)
Polytechnic Institute of Porto (ISEP/IPP)
Porto
Portugal

Sushri Mukherjee
Indian Institute of Technology Delhi
Hauz Khas
New Delhi
India

Michael Negnevitsky
School of Engineering, University of Tasmania
Hobart
Tasmania
Australia

Angel D. Pacheco
Tepic Technology Transfer Unit, Center for Scientific Research and Higher Education of Ensenada
Tepic
Nayarit
Mexico

S. Pandey
Smart Grid and Technology
ComEd
Oakbrook Terrace, IL
USA

Chirath Pathiravasam
Holcombe Department of Electrical and Computer Engineering, Real-Time Power and Intelligent Systems Laboratory
Clemson University
Clemson, SC
USA
and
Department of Electrical Engineering
University of Moratuwa
Katubedda
Sri Lanka

Tiago Pinto
GECAD Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development (GECAD)
Polytechnic Institute of Porto (ISEP/IPP)
Porto
Portugal
and
University of Trás-os-Montes e Alto Douro
Vila Real
Portugal

Dharmbir Prasad
Asansol Engineering College
Asansol, West Bengal
India

Carlos Ramos
GECAD Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development (GECAD)
Polytechnic Institute of Porto (ISEP/IPP)
Porto
Portugal

Sérgio Ramos
GECAD – Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development, Polytechnic of Porto, School of Engineering (ISEP)
Porto
Portugal

Huiying Ren
Pacific Northwest National Laboratory
Richland, WA
USA

Ansel Y. Rodríguez González
Tepic Technology Transfer Unit, Center for Scientific Research and Higher Education of Ensenada
Tepic
Nayarit
Mexico
and
National Council of Science and Technology
Mexico City
Mexico City
Mexico

Sajan K. Sadanandan
Smart Grid Integration
R&D Center, Dubai Electricity & Water Authority (DEWA)
Dubai
UAE

Evgenii Semshikov
School of Engineering, University of Tasmania
Hobart, Tasmania
Australia

Mahendra P. Sharma
Department of Hydro and Renewable Energy
Indian Institute of Technology
Roorkee, Uttarakhand
India

Cátia Silva
GECAD – Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development
Porto
Portugal

Rudra P. Singh
Asansol Engineering College
Asansol, West Bengal
India

João Soares
GECAD – Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development, Polytechnic of Porto, School of Engineering (ISEP)
Porto
Portugal

Anurag K. Srivastava
Smart Grid Resiliency and Analytics Lab
West Virginia University
Morgantown, WV
USA

Subho Upadhyay
Department of Electrical Engineering, Faculty of Engineering
Dayalbagh Educational Institute
Agra, Uttar Pradesh
India

Zita Vale
GECAD Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development (GECAD)
Polytechnic Institute of Porto (ISEP/IPP)
Porto
Portugal

Ganesh Kumar Venayagamoorthy
Holcombe Department of Electrical and Computer Engineering, Real-Time Power and Intelligent Systems Laboratory
Clemson University
Clemson, SC
USA
and
School of Engineering
University of KwaZulu-Natal
Durban
South Africa

Xiaolin Wang
School of Engineering, University of Tasmania
Hobart, Tasmania
Australia

Foreword

Recent machine learning and data analytics methods have proliferated into most areas of science,
engineering, and commerce. There are excellent reasons for their increasing popularity and appli-
cations. Many real-world problems are too complex to come up with closed-form analytical solu-
tions. However, such challenges did not make practitioners idle; instead, they have created working
models, prototypes and even built systems with a careful understanding of critical components of
the systems as a first step. The data generated from such systems are then analyzed by machine
learning and data analytics methods to have a more comprehensive understanding of the systems.
This book titled Intelligent Data Mining and Analysis in Power and Energy Systems makes a huge
leap in this direction in providing a better understanding of power and energy systems. Compiled
by Zita Vale, Tiago Pinto, Michael Negnevitsky, and Ganesh Kumar Venayagamoorthy, the book
begins with an introduction to machine learning and data analytics methods and then lays out
state-of-the-art methods for addressing various topics in power and energy system design,
clustering, classification, forecasting, and analysis with the latest machine learning and data
analytics methods.
The book is self-contained and written for both novices and experts on the topic. The topics are
discussed in a simple manner with adequate references and details, so that readers can understand
the current state of the art and also find relevant past studies in a single volume.
If you are working in power and energy systems either as a researcher or a practitioner, this is a
must-have book to stay ahead in the game. Authors are experts in their own fields. The book will
save your efforts in searching for materials on the topic, provide you with the latest methodologies,
and direct you to other similar past studies.
Kudos to the editors for this compilation and authors for their contributions.
Kalyanmoy Deb
University Distinguished Professor
Withrow Senior Distinguished Research Scholar
Koenig Endowed Chair Professor
IEEE CIS Evolutionary Computation Pioneer
IEEE Fellow
Department of Electrical and Computer Engineering
Michigan State University, East Lansing, MI, USA

Introduction

In an era of ever-increasing data, there is also an increasing need for the development of suitable
intelligent data mining and analysis solutions that enable taking the value out of these data.
Fostered by this increasing need and also boosted by the recent worldwide boom in artificial
intelligence interest and development, we have been witnessing a significant development of a
wide array of new advanced data mining models and methods. These models and methodologies
have been instrumental in dealing with real problems, especially in highly complex domains such
as power and energy systems [1].
The challenges in power and energy systems have changed completely during the past years,
especially because of the increase in the distributed renewable energy sources and the consequently
required transformations in power systems’ operation, management, and planning, and also in
electricity markets [2]. New players are emerging, such as prosumers, electric vehicles, new types of
aggregators, energy communities, new local market operators, energy managers of different kinds,
among many others [3, 4]. Consequently, new business models are also being proposed, tested,
and implemented as a way to actively involve such new players in the sector
while creating a new value for these players and for the system, e.g. through the enhancement of
the use of local generation and the fostering of consumption and generation flexibility trading [5].
Such significant changes in a traditionally conservative sector require an unprecedented adap-
tation and foresight capacity from the entire energy value chain, including from policy makers,
regulators, operators, planners, and even from the smaller players. This is where the role of new
and intelligent data mining and analysis models and methodologies becomes crucial, contributing
to overcoming multiple problems with distinct characteristics. Some relevant examples are power
system planning, state estimation, energy resource profiling, aggregation and forecasting, market
negotiation, and energy management at multiple levels, including building, microgrid, smart grid,
energy community, and distribution grid levels.
This book provides a comprehensive review of intelligent data mining and analysis applications
in power and energy systems. This book is organized in six complementary parts, each focusing
on a specific topic within the data mining and analysis domain, namely, data mining and analy-
sis fundamentals, clustering, classification, forecasting, data analysis, and other machine learning
applications. Each of the six parts is briefly described as follows:
Part I: Data mining and analysis fundamentals provides an overview of data mining and analysis
foundations as a means to introduce the reader to the main concepts of the domain and
facilitate the deeper understanding of the works described in the rest of the book. Besides the
main concepts behind data mining and analysis, the first chapter is dedicated to highlighting
the importance of data pre-processing and feature engineering as a means to enable a suitable
application of the state-of-the-art models dedicated to the diverse traditional problems related
to data mining. This introductory overview provides the means for a deeper understanding of
data mining and analysis, bridging the reader into the power and energy systems application
domain through the presentation of two systematic reviews, namely, on data mining and analy-
sis applications in power and energy systems and on the contributions of deep learning in power
system problems. These two chapters address different problems within power and energy sys-
tems, which benefit from the advances of data mining models related to clustering, classification,
forecasting, and other common approaches.
Part II: Clustering presents a description of works that apply clustering models and methods to
address power and energy system problems. These include standard clustering approaches, as
well as the combination of clustering with other models, e.g. classification-based, to solve dif-
ferent types of problems. Specifically, the power and energy system problems addressed by this
part include consumer-directed problems related to consumer clustering and demand profil-
ing. Aggregation problems considering not only the aggregation of consumers but also of their
consumption flexibility are explored as a means to solve demand response challenges in wider
scales. Synchrophasor data analytics that take advantage of clustering models and their combination
with other approaches are also addressed, aiming at anomaly detection, localization, and
classification.
Part III: Classification includes the description of works that apply classification models such as
artificial neural networks, support vector machines, K-nearest neighbors, among others, to solve
problems using labeled data. The application cases of these works are related to non-technical
loss detection in electric distribution systems, electric vehicle integration in the power system
under multiple worldwide perspectives and considering different types of technologies, and
electricity market participation and decision support, namely, in the scope of bilateral contract
negotiations using historical and observed data from multiple negotiators’ negotiation processes.
Part IV: Forecasting is devoted to the description of works related to the forecasting of energy
resources with distinct characteristics. These works are mainly focused on the forecasting of
highly variable renewable energy sources; besides the application of traditional regression-based
algorithms, they describe the advantages of specific approaches such as multivariate stochastic
models, spatiotemporal models, and decomposition-based models. The forecasting models are
applied to solar irradiance and temperature estimation and to wind and solar power forecasting
under distinct scenarios regarding historical data, power system characteristics, and overall
renewable energy penetration.
Part V: Data analysis presents the application of data analysis models of distinct natures to address
different types of power system-related problems. These problems concern issues such as the
vibration of transmission line conductors through the analysis of harmonic dynamic response
and the design of power distribution networks in hilly areas with the purpose of enabling off-grid
electrification. The application of intelligent demand response models as part of microgrid
planning is another of the addressed problems. This part concludes with a chapter focusing on
socioeconomic analysis of renewable energy interventions toward affordable and sustainable
household technologies.
Part VI: Other machine learning applications describes applications that use distinct types of
machine learning approaches such as reinforcement learning, federated learning, and probabilistic
modeling, addressing a varied set of challenges of distinct natures. Such challenges include the
state estimation of power electronic converters, using both white box and black box approaches.
The problem of intelligent building energy management and control is addressed using
reinforcement learning. Federated deep learning is applied to generate global supermodels for
power system data analysis, and risk assessment of power system outages is performed through
probabilistic modeling, considering weather and climate extremes.
Overall, this book comprises the description of a wide set of intelligent data mining and analysis
models, methodologies, and applications, addressing problems of distinct natures within the field
of power and energy systems, while highlighting the advantages of the already achieved break-
throughs in the domain and pointing out the main gaps that have not yet been solved, as pointers
for future paths of continuous research and development.

Zita Vale
Polytechnic of Porto, Portugal
Tiago Pinto
Polytechnic of Porto, Portugal
University of Trás-os-Montes e Alto Douro
Portugal
Michael Negnevitsky
University of Tasmania, Australia
Ganesh Kumar Venayagamoorthy
Clemson University, USA

References

1 Ibrahim, M.S., Dong, W., and Yang, Q. (2020). Machine learning driven smart electric power sys-
tems: current trends and new perspectives. Applied Energy 272: 115237.
2 Pinto, T., Vale, Z., and Widergren, S. (eds.) (2021). Local Electricity Markets, 1e. Academic
Press 384 pp. https://www.elsevier.com/books/local-electricity-markets/pinto/978-0-12-820074-2.
3 Koirala, B.P., Koliou, E., Friege, J. et al. (2016). Energetic communities for community energy: a
review of key issues and trends shaping integrated community energy systems. Renewable and
Sustainable Energy Reviews 56: 722–744.
4 de São José, D., Faria, P., and Vale, Z. (2021). Smart energy community: a systematic review with
metanalysis. Energy Strategy Reviews 36: 100678.
5 Hall, S. and Roelich, K. (2016). Business model innovation in electricity supply markets: the role
of complex value in the United Kingdom. Energy Policy 92: 286–298.

Part I

Data Mining and Analysis Fundamentals



Foundations
Ansel Y. Rodríguez-González¹, Angel Díaz-Pacheco², Ramón Aranda³, and
Miguel Á. Álvarez-Carmona⁴
¹ Unidad de Transferencia Tecnológica Tepic, Centro de Investigación Científica y de Educación Superior de Ensenada, Nayarit, México
² Departamento de Ingeniería en Electrónica, Campus Irapuato-Salamanca, Universidad de Guanajuato, Guanajuato, México
³ Unidad Mérida, Centro de Investigación en Matemáticas, Yucatán, México
⁴ Unidad Monterrey, Centro de Investigación en Matemáticas, Nuevo León, México

Acronyms
ANN artificial neural networks
IoT Internet of Things
KDD knowledge discovery in databases
KNN k-nearest neighbors
PCA principal component analysis
SVM support vector machine
SVR support vector regression
tSNE T-distributed stochastic neighbor embedding
TWD three-way decision

1.1 Data Mining: Why and What?


Storing data and analyzing it to make better decisions is a process that humanity has performed
at least since the creation of mathematics for commerce, particularly double-entry book-keeping,
initially known in the Renaissance as book-keeping “alla veneziana” [1]. This accounting system
consisted of recording transactions using a general memorandum and a second, more detailed and
organized record. Additionally, transactions were recorded twice: on the one hand, the ledgers;
on the other, debtors. Since each income must be balanced with a counterpart, the system makes it
possible to find errors, understand where the costs and profits come from, and thus guide decision-making.
With the development of larger and more complex business relationships stemming from the
first and second industrial revolutions, the need for more powerful analysis tools arose. Thus,
econometrics emerged, a branch of economics that uses mathematical and statistical models to
analyze, interpret and make predictions about economic systems, predict variables, and find rela-
tionships between them and trends [2]. In general, econometrics is based on the construction of
formal models with which it is possible to verify hypotheses, measure statistical variables, and carry
out simulation tests.

Intelligent Data Mining and Analysis in Power and Energy Systems: Models and Applications for Smarter Efficient
Power Systems, First Edition. Edited by Zita Vale, Tiago Pinto, Michael Negnevitsky, and Ganesh Kumar Venayagamoorthy.
© 2023 The Institute of Electrical and Electronics Engineers, Inc. Published 2023 by John Wiley & Sons, Inc.

The computing revolution made it inexpensive to carry out many hypothesis tests and, as a
consequence, encouraged the search for the model that best fits the data. To describe
this process, terms such as data mining, data dredging, data snooping, and data fishing emerged
[3, 4]. Additionally, the term data miner was coined to name researchers who, given a set of
data, fit alternative equations, since there are alternative subsets of possible explanatory variables,
and choose the best equation. The term data miner was also used to differentiate them from classical
statistics researchers [5].
However, it is possible to find a model that fits a data set well even if the model is false (e.g. a
model obtained from completely random data). This situation complicates the interpretation of
hypothesis test results (significance levels) [6]. Because of this problem, the 1980s
were a dark decade for the term data mining, which had a negative connotation in econometrics.
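
The point about models fitted to random data can be illustrated with a small simulation. The following is a minimal sketch, not taken from the cited works: it assumes a target variable and a set of candidate explanatory variables that are all independent Gaussian noise, and reports the largest absolute Pearson correlation found. The function names, sample size, and number of candidates are illustrative choices.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def best_spurious_correlation(n_samples=20, n_candidates=200, seed=0):
    """Largest |r| between a random target and purely random candidates."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_candidates):
        # Each candidate is noise, so any correlation is due to chance alone.
        candidate = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(candidate, target)))
    return best

single = best_spurious_correlation(n_candidates=1)
many = best_spurious_correlation(n_candidates=200)
print(f"best |r| with 1 candidate:    {single:.2f}")
print(f"best |r| with 200 candidates: {many:.2f}")
```

Because the best candidate is chosen after looking at the data, its correlation with the target looks far stronger than that of a single pre-specified variable, although no real relationship exists. This is why significance levels computed for the selected model are hard to interpret.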
Even so, a new dawn for the term data mining turned up in the early 1990s [7–9]. Computer
scientists began to use the term to describe algorithms that search for previously unexpected
structures and patterns in data. Usually, the data sets involved are large. Moreover, data sets are getting
bigger and bigger and continue to grow with the growth of the internet, storage and processing
capabilities, the widespread use of personal computers and mobile devices, and more recently,
the Internet of Things (IoT). In this context, data mining is characterized by focusing on algorith-
mic and computational problems in data analysis, such as memory requirements and algorithm
response time (computational cost), and not so much on more statistical problems, such as infer-
ence and estimation.

1.2 Data Mining into KDD

A research field closely related to data mining is knowledge discovery in databases (KDD). On
the one hand, data mining consists in “applying data analysis and discovery algorithms that, under
acceptable computational efficiency limitations, produce a particular enumeration of patterns (or
models) over the data”. A pattern is an expression that describes a subset of the data using some
language. On the other hand, KDD is defined as “the non-trivial process of identifying valid, novel,
potentially useful and ultimately understandable patterns in data”. In other words, KDD consists
of “mapping low-level data into other forms that might be more compact, more abstract, or more
useful” [10].
Thus, KDD is a more general interactive and iterative process that includes data mining at its
core. In addition, KDD includes other steps such as data selection, data preprocessing, data
transformation, and pattern interpretation/evaluation. Figure 1.1 presents the outline of the KDD process
from the data viewpoint.

[Figure: Data → (Selection) → Target data → (Preprocessing) → Preprocessed data → (Transformation) → Transformed data → (Data mining) → Patterns → (Interpretation/evaluation) → Knowledge]

Figure 1.1 The outline of the KDD process. Source: Adapted from Fayyad et al. [10].

However, from a more general viewpoint than the data viewpoint, the KDD involves the following
nine steps:

1. Understanding the application domain: An understanding of the relevant prior knowledge is
developed, and the goal of the KDD process is defined from the customer’s viewpoint.
2. Creating a target data set: A data set to perform the discovery is selected, focusing on a subset
of variables or data samples.
3. Data cleaning and preprocessing: The data set is preprocessed to identify and clean
inconsistent and noisy data. Also, strategies for handling noisy data and missing data are
developed.
4. Data reduction and projection: The data set is transformed in order to facilitate data mining.
Depending on the goal of the KDD process, the effective number of variables under consideration,
or invariant representations for the data, is found by means of dimensionality reduction or
transformation methods.
5. Choosing the suitable data mining task: The data mining task (e.g. summarization, clas-
sification, regression, and clustering) is chosen according to the defined goals for the KDD
process.
6. Choosing the suitable data mining algorithm: The data mining algorithm is chosen. Also,
the appropriate parameters for the said algorithm are decided.
7. Employing data mining algorithm: The patterns of interest (e.g. associations, clusters, and
classification or regression rules) are discovered by employing the chosen algorithm.
8. Interpreting mined patterns: The patterns mined are interpreted. In addition, they can be
visualized, and irrelevant or redundant patterns can be removed. Patterns can also be translated
in terms understandable to users. The process can continue to the next step or return to any
previous step.
9. Using discovered knowledge: The knowledge discovered is used in at least one of the follow-
ing three ways: it is documented and reported to interested parties, it is incorporated into another
system that uses it to make decisions, or it is compared with previously believed knowledge to
check and resolve potential conflicts.
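
The steps above can be sketched as a chain of small functions. This is only an illustrative toy under stated assumptions, not the procedure of Fayyad et al.: the meter records, the choice of clustering as the task, and the one-dimensional two-cluster k-means used as the algorithm are all hypothetical.

```python
from statistics import mean

# Hypothetical raw records: (meter id, daily consumption in kWh, tariff code).
raw = [
    ("m1", 4.0, "A"), ("m2", 5.0, "A"), ("m3", None, "B"),
    ("m4", 21.0, "B"), ("m5", 19.0, "A"), ("m6", 6.0, "B"),
]

def select(records):
    # Step 2: keep only the variables relevant to the goal (id, consumption).
    return [(rid, kwh) for rid, kwh, _ in records]

def clean(records):
    # Step 3: here, tuples with a missing consumption value are simply dropped.
    return [(rid, kwh) for rid, kwh in records if kwh is not None]

def transform(records):
    # Step 4: min-max scaling to [0, 1]; assumes the values are not all equal.
    values = [kwh for _, kwh in records]
    lo, hi = min(values), max(values)
    return [(rid, (kwh - lo) / (hi - lo)) for rid, kwh in records]

def mine(records, iterations=10):
    # Steps 5-7: task = clustering, algorithm = 1-D k-means with k = 2.
    values = [v for _, v in records]
    centers = [min(values), max(values)]

    def nearest(v):
        return 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1

    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            groups[nearest(v)].append(v)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return {rid: nearest(v) for rid, v in records}

def interpret(labels):
    # Step 8: translate cluster indices into terms understandable to users.
    names = {0: "low consumption", 1: "high consumption"}
    return {rid: names[label] for rid, label in labels.items()}

knowledge = interpret(mine(transform(clean(select(raw)))))
print(knowledge)
```

In practice the chain is rarely run once from start to finish. As step 8 notes, the analyst may return to any earlier step, for example to change how missing values are handled after inspecting the mined patterns.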

1.3 The Data Mining Process

Although the data mining core consists of choosing the task and algorithm and extracting
interesting patterns, other relevant preceding and subsequent steps also make up the data mining process.
Thus, the data mining process comprises three blocks of steps: data preprocessing, data mining
core, and post-data mining.
Data preprocessing aims to obtain quality data from the raw data. This block includes data
cleaning, data integration, data reduction, and data transformation.
On the other hand, post-data mining focuses on pattern evaluation and on representing knowledge
from the mined patterns. Pattern evaluation identifies interesting patterns that
represent the knowledge, using interestingness measures evaluated over the mined patterns, while
knowledge representation summarizes and visualizes the data mining result in
reports, tables, sets of rules, and graphs.
The following subsections (1.3.1 to 1.3.4) address the steps of data preprocessing,
while Section 1.4 addresses the data mining tasks and techniques involved in the data mining core
block.

1.3.1 Data Cleaning


Data cleaning aims to detect and correct corrupt or inaccurate data set records and deals with miss-
ing values and noisy data.
Missing values can be handled in the following ways:
● Ignore the tuple: This is commonly used when the data mining task is classification and the
class label is missing from the tuple, or when the tuple has several missing values.
● Fill in the missing value manually: This is time-consuming and requires human effort.
Consequently, it is not used on large data sets.
● Use a global constant: All missing values are replaced by the same constant value.
Commonly, the value refers to the concept “Unknown.” However, this can be a problem because
the data mining algorithms used later must know the meaning of the term and incorporate a
mechanism to deal with it.
● Use the attribute mean: Each missing value is replaced by the average value of its
attribute. Another choice, when the data mining task is classification, is to replace each
missing value of a given tuple by the average value of its attribute computed only over the subset
of tuples with the same class label as the given tuple.
● Use estimated values: Each missing value is replaced by an estimated value. The estimated
value can be the most probable value of its attribute or the result of regression techniques.
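
Three of these options can be sketched in a few lines. The example below is a minimal illustration with hypothetical data, in which each tuple holds one numeric attribute (None when missing) and a class label. The function names are illustrative and do not come from any particular library.

```python
from statistics import mean

# Hypothetical tuples: (attribute value, or None if missing; class label).
rows = [(2.0, "a"), (4.0, "a"), (None, "a"),
        (10.0, "b"), (None, "b"), (14.0, "b")]

def ignore_tuples(rows):
    """Drop every tuple whose attribute value is missing."""
    return [(v, label) for v, label in rows if v is not None]

def impute_attribute_mean(rows):
    """Replace missing values by the mean of all observed values."""
    fill = mean(v for v, _ in rows if v is not None)
    return [(fill if v is None else v, label) for v, label in rows]

def impute_class_mean(rows):
    """Replace missing values by the mean over tuples of the same class."""
    by_class = {}
    for v, label in rows:
        if v is not None:
            by_class.setdefault(label, []).append(v)
    fills = {label: mean(values) for label, values in by_class.items()}
    return [(fills[label] if v is None else v, label) for v, label in rows]

print(ignore_tuples(rows))
print(impute_attribute_mean(rows))
print(impute_class_mean(rows))
```

On this data the attribute mean is 7.5, which lies between the two classes and is typical of neither, whereas the class-conditional means (3.0 for class "a" and 12.0 for class "b") stay close to the observed values of each class.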
Noisy data is understood as errors in the data or values that deviate from the normal. It can be
handled in the following ways:
● Use noise filters: The noisy tuples are identified and removed from the data set. Removing
noisy tuples is advantageous [11], but their attributes can contain valuable information that can
help build a model (e.g. a classifier) [12]. In addition, distinguishing between noisy
examples and true exceptions is difficult.
● Use data polishing methods: Assuming that a subset of the attributes is a good predictor of
the value of the remaining attributes, inappropriate values in a particular tuple (values far
from the predicted ones) are identified and replaced by non-noisy values (the predicted values) [13].
The main weakness of these methods is the time complexity involved.
● Use robust algorithms: The handling of noisy data is transferred to the data mining core
block, where a data mining algorithm that is little influenced by noisy data is used. For example,
if the data mining task is classification, the Credal-C4.5 algorithm [14] is a choice.
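
As a hedged illustration of the first option, the sketch below implements one very simple noise filter for a single numeric attribute. It is not the method of the cited works: it flags values whose modified z-score, based on the median and the median absolute deviation (MAD), exceeds a threshold. The constants 0.6745 and 3.5 are the conventional choices for this rule, and the voltage readings are hypothetical.

```python
from statistics import median

def mad_filter(values, threshold=3.5):
    """Split values into (kept, removed) using the modified z-score."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably.
        return list(values), []
    kept, removed = [], []
    for v in values:
        score = 0.6745 * abs(v - med) / mad
        (removed if score > threshold else kept).append(v)
    return kept, removed

# Hypothetical voltage readings with one corrupted measurement.
readings = [230.1, 229.8, 230.4, 231.0, 229.5, 400.0, 230.2]
kept, removed = mad_filter(readings)
print("kept:", kept)
print("removed:", removed)
```

The filter removes the reading of 400.0 and keeps the rest. Whether such a value is an error or a true exception cannot be decided from the numbers alone, which is the difficulty noted in the first bullet above.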

1.3.2 Data Integration


Data integration consists of combining data from multiple heterogeneous data sources and provid-
ing a unified view of the data [15]. Having a unified view of the data helps data mining algorithms
discover useful information. However, the unified view must be free of inconsistencies, discrepan-
cies, redundancies, and disparities. Otherwise, the knowledge extracted could be wrong, negatively
impacting decision-making.
The main issues to deal with in data integration are the following:
● Entity identification problem: Two or more tuples from the data sources can refer to the same
real-world entity but use different id attributes (e.g. customer_id, customer_number, or
passport_number). Then the question is: How to match real-world entities from data? The metadata
of each attribute can be used to avoid errors in schema integration.
● Redundancy and correlation analysis: Redundant data means data that can be derived or
calculated from one another. For example, if a data source provides the attribute age and another
provides the attribute date of birth, the attribute age will be redundant in the integrated data set.
Redundancy can be discovered by analyzing the correlation between attributes.
● Tuple duplication: Duplicate tuples can occur when some tables in the data sources are denor-
malized.
● Data conflict detection and resolution: Values that represent the same real-world quantity may
not match because the same information is defined differently in the data sources. Such conflicts
can be detected. For example, an inconsistency between two attributes can arise because,
although they represent the same magnitude, they use different units (e.g. kilometers and
miles, degrees Fahrenheit and degrees Celsius, or dollars and euros).
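
The redundancy check mentioned in the list above can be sketched with the Pearson correlation coefficient. The example is a minimal illustration: the column names, the values, the reference year used to derive age, and the 0.95 threshold are all assumptions made for the sketch.

```python
from itertools import combinations
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def redundant_pairs(columns, threshold=0.95):
    """Pairs of attributes whose absolute correlation reaches the threshold."""
    return [
        (a, b)
        for a, b in combinations(columns, 2)
        if abs(pearson(columns[a], columns[b])) >= threshold
    ]

# Hypothetical integrated data set: age was derived from the year of birth.
birth_year = [1960, 1975, 1988, 1990, 2001]
columns = {
    "birth_year": birth_year,
    "age": [2023 - y for y in birth_year],
    "income": [52, 35, 61, 40, 47],
}
print(redundant_pairs(columns))
```

Here age and year of birth are perfectly (negatively) correlated, so one of them can be dropped. A high correlation is only a hint of redundancy, and it detects linear relationships only, so the decision to drop an attribute still needs domain knowledge.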
The following techniques are used to address the problems mentioned above:
● Manual integration: No automation is used for data integration, and all the integration is done
manually by the data analyst. This technique is time-consuming and requires much human
effort. It is used to integrate small data sources.
● Middleware integration: Data from data sources are collected, normalized, and merged by
middleware software. This technique is commonly used to integrate data from legacy systems
into a unified dataset for a modern system.
● Application-based integration: A specific software application is designed and developed to
integrate the data sources into a unified dataset. Although this technique saves more time and
effort than manual integration, the development time and the technical knowledge required
should be considered.
● Uniform access integration: A unified view is designed and created. This view represents the
integrated data, but the location of the data does not change. It is only a view; data stay in the
original data source.
● Common data storage: A system that keeps a copy of the data from the data source to store
and manage it independently is used. Although this technique can integrate very different data
sources like relational databases and flat files, the main problem is how to handle the vast vol-
umes of data.

1.3.3 Data Reduction


When the instances of the data have a large number of variables (features or characteristics), the
data are said to have high dimensionality. Due to the “curse of dimensionality,” there are many
difficulties in working with high-dimensionality data, such as visualization, assumptions validation,
and computational processing time, among others [16].
To deal with the high-dimensionality issues, there are different dimensionality reduction
techniques. The idea behind all techniques is to generate a new data set with the same
number of instances but with a reduced dimension (lower number of variables). The new
low-dimensionality data must preserve some aspects or properties of the original data. For
example, two distant instances in the original data must remain distant in the new data, and two
nearby instances from the original space must remain nearby in the new data [17]. Some popular
techniques in the literature are:
● Principal Component Analysis (PCA): By far the most popular method used for dimension-
ality reduction. It was developed by Hotelling [18] and Pearson [19]. This method is a form of
unsupervised learning and is based on the covariance analysis between all the variables from
the data. The new data set is generated by a linear transformation that projects the data on the
coordinates with the greatest variances.
● T-distributed Stochastic Neighbor Embedding (t-SNE): This technique was developed by
van der Maaten and Hinton [20]. It is a nonlinear dimensionality reduction method that transforms
the original instances of the data into a two- or three-dimensional map, which encodes the
probabilities of observing other instances within a certain distance.
● Autoencoders: Autoencoders are based on artificial neural networks (ANNs) [21, 22]. These
methods encode the data by training a neural network. The network inputs are given by the data
variables (each input neuron is associated with one variable). Each subsequent layer of the
network has fewer neurons than the previous one. The output of the last layer returns the
encoded data, whose dimension is given by the number of neurons in that layer.
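As a minimal sketch of the PCA idea described above, the following plain-Python example centers the data, builds the covariance matrix, and estimates the direction of greatest variance. Power iteration is used here only to keep the example dependency-free; it is an assumption of this sketch, not the procedure prescribed by [18, 19], and the `readings` values are made up for illustration.

```python
import math

def center(data):
    """Subtract the mean of each variable so the data are centered at zero."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in data]

def covariance_matrix(centered):
    """Sample covariance between every pair of variables."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
            for a in range(d)]

def first_principal_component(cov, iterations=200):
    """Power iteration: approximates the eigenvector with the largest eigenvalue.
    Assumes the covariance matrix is non-zero and the start vector is not
    orthogonal to that eigenvector."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(centered, direction):
    """Linear projection of each instance onto the chosen direction."""
    return [sum(x * u for x, u in zip(row, direction)) for row in centered]

# Hypothetical two-variable data lying on the line y = 2x.
readings = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
centered = center(readings)
pc1 = first_principal_component(covariance_matrix(centered))
reduced = project(centered, pc1)  # same number of instances, one variable
```

The reduced data keep one value per instance, so the number of instances is unchanged while the number of variables drops from two to one.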

1.3.4 Data Transformation


Data transformation aims to convert raw data into a suitable format for data mining algorithms
to work efficiently. There are several data transformation techniques, which are presented below.
Some of them have already been mentioned in previous data mining steps.

● Data smoothing: It is used to remove noise from the dataset. Methods such as binning,
regression, and clustering are used. Binning splits the sorted data into bins and smooths the
data values considering their neighboring values. Regression identifies relations between
attributes and uses them to predict one attribute from another. Clustering obtains groups of
similar values, which allows finding outliers, that is, values that do not belong to any group.
● Attribute construction: New attributes that can help the data mining are created (computed)
from the existing attributes. For example, a new area attribute can be created from the height and
width attributes.
● Data aggregation: Summary or aggregation operations are applied to the data. For example, the
number of transactions per hour can be aggregated to get a daily summary.
● Data normalization: Data values are scaled to a smaller range such as [−1, 1] or [0.0, 1.0].
● Data discretization: Continuous data are converted into a set of data intervals. When
discretizing, the cardinality of the resulting set (i.e. the number of intervals) is much smaller
than the cardinality of the original set, so data mining algorithms work faster. However, care
must be taken because discretizing can change the nature of the data.
● Data generalization: Low-level data attributes are converted to high-level data attributes using
concept hierarchy. For example, the attribute city can be generalized as a country.
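Three of the transformations listed above (smoothing by binning, normalization, and discretization) can be sketched in a few lines of plain Python. The function names and the `loads_kw` sample values are hypothetical, and bin means and equal-width intervals are just one possible choice for each step.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of bin_size, and replace
    each value by the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]
        mean = sum(chunk) / len(chunk)
        smoothed.extend([mean] * len(chunk))
    return smoothed

def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Scale values linearly into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [new_min for _ in values]
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]

def equal_width_bins(values, n_bins):
    """Discretize: return the interval index (0..n_bins-1) of each value."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width for constant data
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Hypothetical load measurements in kW.
loads_kw = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed = smooth_by_bin_means(loads_kw, 3)
normalized = min_max_normalize(loads_kw)
intervals = equal_width_bins(loads_kw, 3)
```

With three equal-width intervals, the nine continuous values collapse into only three distinct symbols, which is the reduction in cardinality that makes later mining steps faster.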

1.4 Data Mining Task and Techniques

Data mining comprises a set of techniques and methods that can be applied to a wide range of
purposes and goals. To understand this complex paradigm, we need to establish a few categories.
Broadly speaking, data mining is used to achieve two main goals: discovering interesting patterns
and testing hypotheses. For ease of understanding, Figure 1.2 provides a diagram of the data
mining tasks in hierarchical order.
A verification task consists of classical statistical methods that test a predefined hypothesis.
Methods under this category include analysis of variance (ANOVA), the t-test, and goodness of
fit, among others. The discovery category spans more levels than verification. A discovery task
does not need a hypothesis to begin with; instead, analytical techniques are tasked with finding
the right ones from the data. Discovery methods automatically find interesting patterns in the
data, so the data provide us with facts that, on occasion, are not obvious.

Figure 1.2 Hierarchy of data mining tasks. Data mining splits into discovery and verification
(t-test, ANOVA, goodness of fit). Discovery splits into description (clustering, summarization,
association) and prediction; prediction splits into classification (support-vector machines,
artificial neural networks, k-nearest neighbors, …) and regression (linear regression, support
vector regression, artificial neural networks, …).

Within the discovery category, there are two main families: description and prediction.
Descriptive data mining attempts to understand relations among data instances, uncovering hidden
patterns that provide the analyst with useful information for the task at hand. For some
analytical tasks, when information is scarce or absent, these methods are preferred as the first
approach. Two important tasks in this category are frequent pattern mining and association rule
mining. Frequent patterns [23] are substructures that appear in a data set with a certain
frequency, such as milk and bread in a grocery transaction. Finding those patterns is very useful
for many data mining applications [24]. Association rules [25] aim to capture hidden patterns by
detecting the co-occurrence of events in the data and then trying to establish a correlation
between them. These rules are given in the form “If X, then Y is likely to occur 30% of the
time” [26].
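To make the milk-and-bread example concrete, the sketch below computes the two quantities most commonly attached to such a rule: support (how often the items co-occur) and confidence (how often Y appears when X does). The transactions are invented for illustration, and support and confidence are standard measures assumed here rather than taken from [25, 26].

```python
# Hypothetical grocery transactions, each a set of purchased items.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "eggs"},
    {"milk", "bread", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated probability of the consequent given the antecedent."""
    joint = support(antecedent | consequent, transactions)
    return joint / support(antecedent, transactions)

rule_support = support({"milk", "bread"}, transactions)
rule_confidence = confidence({"milk"}, {"bread"}, transactions)
print(f"If milk, then bread: support={rule_support:.2f}, "
      f"confidence={rule_confidence:.2f}")
```

Here milk and bread appear together in three of the five transactions, and three of the four milk purchases also contain bread.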
As for the predictive branch, the methods in this category analyze the data to construct a model
that imitates the unknown model assumed to have generated the data. With this model, we can
automatically predict new instances from one or more characteristics of the samples. The
prediction family is divided into classification and regression tasks. Both are forms of
forecasting, but the target variable to be predicted differs in form and, therefore, their
purposes also diverge.
Regression is a method for investigating functional relationships among variables. This
relationship is expressed as a model linking the response variable and one or more explanatory
variables. Formally, let $y$ be the response variable and $x_1, x_2, \ldots, x_p$ a set of $p$
explanatory variables. The real relationship between $y$ and $x_1, x_2, \ldots, x_p$ can be
approximated by the regression model $y = f(x_1, x_2, \ldots, x_p) + \varepsilon$, where
$\varepsilon$ is a random error representing the discrepancy in the approximation [27].
Classification, on the other hand, can be viewed as a special case of regression in which the
response variable is discrete rather than continuous. In a classification task, the problem is
defined by a distribution $D$ over $X \times Y$, where $Y = \{0, 1, \ldots, L\}$ is the finite set
of class labels. The goal is to find a model $h : X \to Y$ that minimizes the error rate on $D$
[28].
From this point, the reader can appreciate the wide range of tasks that data mining can address;
however, we still need to give some background about some of the most representative techniques
under its umbrella.

1.4.1 Techniques
Having defined the main data mining branches, we can provide some details about the main tech-
niques in each of them. Much has been written on classical statistical techniques, so the approaches
under the “Verification” branch will not be explained in this book.

1.4.1.1 Techniques in the “Description” Branch


Clustering is one of the most representative methods in this category. This technique aims to
group objects so that the objects within a group are more similar to one another than to those in
other groups (see Figure 1.3).
Formally, given a dataset $D = \{t_1, t_2, \ldots, t_n\}$ of tuples and an integer $k$, the number
of clusters to create, we want to define a mapping $f : D \to \{1, 2, \ldots, k\}$ in which each
$t_i$ is assigned to one cluster $K_j$, $1 \le j \le k$. A cluster $K_j$ contains just those
tuples mapped to it; that is, $K_j = \{t_i \mid f(t_i) = j,\ 1 \le i \le n,\ t_i \in D\}$ [29].
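One way to obtain such a mapping is k-means, shown below as a rough sketch. The text does not name a specific algorithm, so k-means, the naive initialization from the first k points, and the sample `points` are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=20):
    """Return the cluster number (1..k) assigned to each point."""
    # Naive deterministic initialization: the first k points are the centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: map each tuple to its closest centroid.
        labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return [label + 1 for label in labels]

# Hypothetical two-dimensional tuples forming two visible groups.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.9), (0.9, 1.1), (4.8, 5.1)]
assignment = k_means(points, k=2)

# Recover each cluster K_j as the set of tuples mapped to j.
clusters = {j: [p for p, a in zip(points, assignment) if a == j] for j in (1, 2)}
```

The returned list plays the role of the mapping f, and `clusters` collects the sets K_j from the definition above. Real uses would need a more careful initialization, since the result of k-means depends on the starting centroids.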
Since clustering can reduce the dimensionality of data points, we can use this technique to
produce compact representations of such data. Likewise, applying this approach to textual data,
such as a collection of documents, lets us extract the most representative terms from each group
to summarize the collection [30].

1.4.1.2 Regression Techniques


Predicting how much a real estate property could cost or future stock prices are the kind of problems
regression algorithms aim to solve.
Linear regression is a powerful technique used to uncover relationships between the response
variable (there can be more than one) and the considered set of explanatory variables (see
Figure 1.4). In its basic form, it models the relationship between a response variable $y$ and a
single descriptor variable $x$, which can represent concepts like caloric intake ($x$) and weight
($y$), or study time ($x$) and grades ($y$).
This relationship is written $y = \beta_0 + \beta_1 x + \varepsilon$, where $y$ is the response
variable, $\beta_0$ is the y-intercept, $\beta_1$ is the slope, $x$ is the descriptor, and
$\varepsilon$ is the random error [31]. An extension of this approach that considers more than
one independent variable is known as multiple linear regression.
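The intercept and slope of the simple model can be estimated from data in several ways. The sketch below uses ordinary least squares, which is an assumption here since the text does not state an estimation method; the study-time figures are invented.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares estimates of (beta0, beta1) for y = beta0 + beta1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1

# Hypothetical data: hours of study (x) and grade obtained (y).
hours = [1, 2, 3, 4, 5]
grades = [52, 58, 65, 69, 77]
beta0, beta1 = fit_simple_linear_regression(hours, grades)
predicted_grade = beta0 + beta1 * 6  # prediction for six hours of study
```

The residual between each observed grade and the fitted line corresponds to the random error term in the model.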
Two well-known techniques for regression tasks are support vector regression (SVR) and ANN.
These approaches are also present in the classification branch and will be discussed in Section
1.4.1.3.

Figure 1.3 Graphical representation of a clustering task.



Figure 1.4 Graphical representation of a regression task.

Figure 1.5 Graphical representation of the KNN algorithm for classification.

1.4.1.3 Classification Techniques


One of the simplest and most powerful techniques for classification tasks is k-nearest neighbors
(KNN). It operates on the intuitive principle that samples from the same class should lie close
together in the feature space and far from samples of other classes. Accordingly, for an unknown
sample, it calculates the distance to every point in the data set and assigns the class that is
most common among its k nearest neighbors (see Figure 1.5). Formally, consider a dataset of $n$
points with their classes $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, where $x_i$ is the
feature vector and $y_i$ its corresponding target class. Then, for a new data point $x$ and
$k = 1$, the most likely class is $\mathrm{knn}(x, 1) = y_p$, where $p \in \{1, 2, \ldots, n\}$
is the index of the nearest training point, determined as
$p = \arg\min_i \lVert x - x_i \rVert_2$ [32].
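The rule above translates almost directly into code. The following sketch handles general k by majority vote among the k closest training points; the function name and the labeled points are hypothetical.

```python
import math
from collections import Counter

def knn_classify(train, x, k=1):
    """Classify x by majority vote among its k nearest training points.
    train is a list of (feature_vector, label) pairs."""
    # Sort the training points by Euclidean distance to x and keep the k closest.
    neighbours = sorted(train, key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical labeled points in a two-dimensional feature space.
train = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.5), "B"),
    ((5.5, 6.5), "B"),
]

label_k1 = knn_classify(train, (1.2, 1.4), k=1)  # nearest point is (1.0, 1.0)
label_k3 = knn_classify(train, (5.4, 5.6), k=3)  # all three neighbours are "B"
```

With k = 1 the function reduces to the argmin rule in the text. Larger k makes the decision less sensitive to a single noisy neighbor, at the cost of computing and sorting all distances for every query.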
Support Vector Machine (SVM), proposed by Vapnik [33], is another widely studied algorithm for
classification tasks. The basic notion behind this approach is to maximize the distance between
two classes, which is usually defined by their nearest samples. Consider a two-class problem
$z = \{(x_i, y_i)\}_{i=1}^{m}$, where $y_i \in \{-1, 1\}$. Then $z$ induces two sets of indexes,
one for each class: $I = \{i \mid y_i = 1\}$ and $II = \{i \mid y_i = -1\}$. Considering a
hyperplane $H$ given by $w^T x + b = 0$, we say that $I$ and $II$ are separable by $H$ if
$w^T x_i + b > 0$ for all $i \in I$ and $w^T x_i + b < 0$ for all $i \in II$. The distance of each
class to the hyperplane $H$ is then defined as

$$t_I(w, b) = \min_{i \in I} \{ y_i (w^T x_i + b) \}$$

$$t_{II}(w, b) = \min_{i \in II} \{ y_i (w^T x_i + b) \}$$

The corresponding classification hyperplane is obtained by
$\max_{\lVert w \rVert = 1,\, b} \{ t_I(w, b) + t_{II}(w, b) \}$ [34] (see Figure 1.6).
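The two class distances can be evaluated for any candidate hyperplane, which helps to see what the maximization is comparing. The sketch below only computes the margins for given w and b; it does not solve the optimization problem, and the sample points are invented.

```python
import math

def class_margins(samples, w, b):
    """Return (t_I, t_II) for the hyperplane w^T x + b = 0.
    samples is a list of (x, y) pairs with y in {-1, +1}."""
    # Rescale so that ||w|| = 1, as required by the maximization constraint.
    norm = math.sqrt(sum(c * c for c in w))
    w = [c / norm for c in w]
    b = b / norm

    def score(x, y):
        return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

    t_pos = min(score(x, y) for x, y in samples if y == 1)
    t_neg = min(score(x, y) for x, y in samples if y == -1)
    return t_pos, t_neg

# Hypothetical separable data set.
samples = [((2, 0), 1), ((3, 1), 1), ((-1, 0), -1), ((-2, 2), -1)]

vertical = class_margins(samples, [1, 0], 0)   # hyperplane x1 = 0
diagonal = class_margins(samples, [1, 1], 0)   # hyperplane x1 + x2 = 0
```

For this data the vertical hyperplane gives a larger total margin than the diagonal one, so it would be preferred under the criterion above. A negative value in either position would indicate that the hyperplane does not separate the classes.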
An interesting category of classification methods comprises those that attempt to imitate
nature, as is the case of ANNs. ANNs are inspired by the functioning of the human brain, whose
interconnected neurons are at the origin of different processes such as learning. For this
reason, this brain abstraction is well suited for classification and regression tasks, among
others. There are several types of ANNs according to their complexity. One of the most basic
forms is the perceptron. It was created by Rosenblatt [35], and it represents the abstraction of
a single neuron (see Figure 1.7).

Figure 1.6 Graphical representation of the SVM, showing the decision boundary (hyperplane).

Figure 1.7 In (a) a Perceptron, with inputs X1, X2, X3 and weights W1, W2, W3, compared to (b) an
organic neuron, with dendrites, nucleus, and axon terminal.



The perceptron is made up of two layers: input nodes and output nodes. The first layer
$x = (x_1, x_2, \ldots, x_n)$ holds the input values, while the output nodes
$(y_1, y_2, \ldots, y_m)$ return the result of the calculations. The function used to calculate
the results is called the activation function, and the sign function is one of the most used.
With it, the output is given by $\hat{y} = \operatorname{sign}\left(\sum_{i=1}^{n} w_i x_i\right)$,
where $w = (w_1, w_2, \ldots, w_n)$ is the vector of weights assigned to the connections between
the input and output nodes [36]. The weight vector stores the strength of the connections between
the nodes. These values are determined automatically during the learning process on labeled data.
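A single-output perceptron with the sign activation can be sketched as follows. The update rule used for learning is the classic perceptron rule, which the text does not spell out, and the constant first input acting as a bias term is an assumption added so that the logical AND example is separable.

```python
def sign(value):
    """Sign activation; zero is mapped to +1 by convention in this sketch."""
    return 1 if value >= 0 else -1

def predict(weights, x):
    """Output of the perceptron: sign of the weighted sum of the inputs."""
    return sign(sum(w * xi for w, xi in zip(weights, x)))

def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Learn the weights from labeled samples (x, y) with y in {-1, +1}."""
    weights = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, y in samples:
            if predict(weights, x) != y:
                # Misclassified: move the weights toward the correct side.
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
    return weights

# Logical AND; the first input is a constant 1 that plays the role of a bias.
and_samples = [
    ((1, 0, 0), -1),
    ((1, 0, 1), -1),
    ((1, 1, 0), -1),
    ((1, 1, 1), 1),
]
weights = train_perceptron(and_samples)
outputs = [predict(weights, x) for x, _ in and_samples]
```

Because the AND function is linearly separable, the rule settles on weights that classify all four samples correctly. A single perceptron cannot do the same for problems that are not linearly separable, such as XOR.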

1.4.2 Applications
So far, we have discussed some important aspects of data mining. To offer the reader a better
understanding, we now turn to a central aspect that highlights the importance of this discipline:
its applications. The possible uses of these tools are limited only by our imagination, and data
mining has already been used in a wide range of applications. Some of the most representative
are described below.

Marketing: The possibility of automating these techniques to search data for patterns makes it
possible to harvest information from massive collections of unstructured data. In the marketing
sector, data mining has been used successfully for different objectives such as customer
segmentation and the range of approaches known as business intelligence, among others. Customer
segmentation aims to separate customers into distinct, homogeneous groups based on their
characteristics. This analysis helps organizations to understand their customers better and to
design better strategies based on the common features of the different segments. Business
intelligence helps organizations sort and structure the information extracted from raw data into
tables, graphs, and various interactive dashboards. These tools enable managers to overcome the
complexity of the different business aspects and make strategic decisions with real-time,
updated information [37].
Fraud detection: In recent years, the detection of financial fraud has received a great deal of
attention. With the availability of vast amounts of data from electronic transactions, customer
preferences, and patterns, data mining techniques have proved to be adequate for the detection,
and therefore prevention, of fraud. The most successful fraud detection systems are based on
extensive customer historical data where patterns of both fraudulent and nonfraudulent customers
are characterized, allowing the data mining algorithms to learn how to differentiate between
them [38].
Applications to medicine: Data mining has been used for different applications in the medical
field. Using the extensive historical data from patients, noninvasive analyses are conducted to
predict the existence of different conditions such as cancer [39, 40], diabetes [41–43], and
different disorders in newborns [44–46]. Furthermore, text mining techniques are applied to
patients’ historical records in order to automate the search for specific data such as
prescribed procedures, conditions, and drugs [47]. In addition, text summarization approaches
have been employed to produce compact representations of medical records, making their
time-consuming analysis by doctors easier [48].
Tourism data analysis: In recent years, data mining has become essential in tourism [49]. The
data used are mainly the movements of travelers and the probabilities of movement of the flow of
visitors in specific destinations [50]. The management of different types of information has
also given important advantages to the tourism sector; for example, text has been used to a large
extent to determine the collective polarity of visitors to a site, in order to detect whether
tourists had a good or bad experience and, based on this, make decisions for improvement [51].
In addition, image processing has become an essential tool to detect important events that could
affect the destination’s image [52]. Finally, various tourist recommenders take advantage of
data mining to generate better-targeted recommendations [53].

1.5 Data Mining Issues and Considerations


From everything discussed in this chapter, we can agree that data mining is a remarkable tool for
many fields. However, many aspects still need to be improved or kept in a permanent upgrade
cycle. These challenges vary from one study to another; the ones we consider the main issues to
be addressed are presented below.

1.5.1 Scalability of Algorithms


Over the past decade, the popularization of social networks, the development of affordable
storage technologies, the IoT, and many other factors have led to unprecedented data growth. It
was estimated that in 2015, 8 zettabytes were created, mainly from unstructured data such as
Facebook posts, emails, and Tweets, among others [54]. This abundance of data implies benefits
for organizations that can obtain business intelligence through its analysis, but conventional
data mining algorithms are designed for data that can be loaded completely into main memory. In
response to this problem, Dean and Ghemawat [55] introduced the MapReduce programming model.
MapReduce is based on the divide-and-conquer paradigm to parallelize data analysis operations
across processing nodes. Although MapReduce is currently the industry standard for the
management and analysis of massive datasets, the problem is not solved: at some point in the
future, data growth will exceed the capabilities of our latest solutions, so this challenge is
one of those that need to be periodically revisited.
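The divide-and-conquer structure of MapReduce can be imitated on a single machine to show the flow of data. The word-count task and all function names below are illustrative assumptions; this is not the API of any particular MapReduce framework, and a real deployment would run the map and reduce calls on different nodes.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)

def map_reduce(chunks):
    # Each chunk is mapped independently, so this loop could run in parallel.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())

# Hypothetical data set split into three chunks.
chunks = ["smart grid data", "grid load data", "data"]
word_counts = map_reduce(chunks)
```

Because each chunk is processed without reference to the others, no single node needs to hold the whole data set in main memory.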

1.5.2 High Dimensionality


As mentioned above, the curse of dimensionality is a recurring problem in data mining. In such
scenarios, many irrelevant dimensions can be present, hindering the task of finding patterns. A
common step in overcoming this setback is the use of dimensionality reduction techniques, such
as those mentioned in Section 1.3.3. They map the original data from a high-dimensional space to
a low-dimensional space by a mathematical transformation that preserves as many of the original
essential features as possible [56]. Although there are many approaches to overcoming this
limitation, there is no universal solution, and the analyst typically has to test many
techniques on the dataset under analysis without guaranteed results.

1.5.3 Improving Interpretability


Data mining techniques have traditionally been compared to black boxes: we feed them with data
and obtain the result we want as output, but without knowledge of the process that occurs
inside. For many domains, this opaque functioning is not ideal because the model predictions
lack intuitiveness and explanation. This lack of explanation is a major shortcoming in critical
decision-making processes, where models must have two characteristics: understandability and
interpretability. Understandability is related to the question of how an explanation is
comprehended by the observer [57]. Interpretability, on the other hand, is related to the
intuition behind the results generated by the model. Many efforts have been made to alleviate
this shortcoming. According to Linardatos et al. [58], interpretability approaches can be
categorized according to different features. If their application is restricted to a specific
family of models, they are called model-specific; model-agnostic methods are those that can be
applied to any model. The scale of interpretation is also taken into account: if a method
explains only specific instances, it is local, and if it explains the entire model, it is
global. Another category concerns the data types to which the method is applied, and the last
one concerns the purpose of interpretability, such as explaining existing complex black-box
models [58].

1.5.4 Handling Uncertainty


Uncertainty is a common problem in all stages of the data mining process. There are two
categories of uncertainty: aleatoric and epistemic. Aleatoric uncertainty refers to the noise in
the data captured by the model, while epistemic uncertainty represents the ambiguity that the
model presents when dealing with operational inputs [59]. Different techniques have been
proposed to deal with uncertainty. One of the most widespread is fuzzy logic. This theory was
introduced by Zadeh in 1975 and uses fuzzy sets to try to reproduce human behavior under certain
conditions [60]. Over the past decade, many studies have used type-1 fuzzy sets to address
uncertainty, but significant improvements have been achieved through type-2 fuzzy sets in many
recent works. Another important effort to control uncertainty is the three-way decision (TWD)
model. TWD is used to identify and consider uncertain instances (in the input data) and to
manage uncertainty in the output of data mining models [61]. Despite the many studies that
address uncertainty, this challenge is far from solved, and reaching an optimal solution will
require much more effort.
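The core idea of a three-way decision can be illustrated with two thresholds on a model's estimated probability: confident cases are accepted or rejected, and uncertain ones are set aside. This is a simplified sketch of the general idea; the threshold values, the function name, and the scores are hypothetical and are not taken from [61].

```python
def three_way_decision(probability, alpha=0.7, beta=0.3):
    """Accept, reject, or defer an instance based on its estimated probability.
    alpha and beta are illustrative thresholds with beta < alpha."""
    if probability >= alpha:
        return "accept"
    if probability <= beta:
        return "reject"
    return "defer"  # uncertain region: postpone or gather more evidence

# Hypothetical probabilities produced by some classifier.
scores = [0.92, 0.55, 0.10, 0.71, 0.30]
decisions = [three_way_decision(p) for p in scores]
```

Instances that fall between the two thresholds are the uncertain ones, which can then be examined separately rather than being forced into one of the two classes.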

1.5.5 Privacy and Security Concerns


One of the main concerns when analyzing data sets that contain personal information on online
transactions, social interactions, and preferences, among others, is the possibility of
recovering fragments of sensitive personal information. Although the data can be encrypted, data
mining algorithms require decrypted data to work, and organizations are not willing to share
sensitive information (like medical records) decrypted and ready for analysis or snooping. To
overcome this shortcoming, many studies have been based on an approach known as homomorphic
encryption. This technique is a form of encryption that allows certain types of analysis to be
performed on encrypted data, generating an encrypted result which, once decrypted, matches the
result of performing the same analysis on the plaintext [62]. This approach has been
successfully applied in the medical field in combination with the naive Bayes classifier [63].
It has also been used to classify encrypted communications (SSH and Skype) [64] and encrypted
images [65]. This approach consumes a lot of computing power, so more work needs to be done to
overcome this problem.
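The homomorphic property can be demonstrated with a deliberately tiny example. Unpadded textbook RSA happens to be multiplicatively homomorphic, so it is used below purely as an illustration; the parameters are toy values, the scheme is insecure in this form, and it is not the scheme used in [62, 63]. The modular inverse via `pow` requires Python 3.8 or later.

```python
# Toy textbook-RSA parameters (insecure; for illustration only).
P, Q = 61, 53
N = P * Q                            # public modulus
E = 17                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (modular inverse of E)

def encrypt(message):
    return pow(message, E, N)

def decrypt(cipher):
    return pow(cipher, D, N)

a, b = 7, 6
# Multiply the two ciphertexts without ever decrypting them.
product_cipher = (encrypt(a) * encrypt(b)) % N
# Decrypting the combined ciphertext yields the product of the plaintexts.
recovered = decrypt(product_cipher)
```

The party that multiplies the ciphertexts never sees `a` or `b`, yet the owner of the private key recovers their product (modulo N). Schemes used in practice support richer operations, which is part of why they demand so much computing power.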

1.6 Summary
With the ease of use of modern computers, countless models and predictions have been generated
from various data sets. This has given rise to what we now know as data mining. The term became
popular in the 1990s. From that point on, the main efforts to describe search algorithms for
discovering unexpected structures and patterns in data were carried out. One of the keys to this
impact is that data sets are getting bigger and bigger and continue to grow with the growth of
the internet, storage and processing capabilities, the widespread use of personal computers and
mobile devices, and, more recently, the IoT. In this context, data mining is characterized by
focusing on algorithmic and computational problems in data analysis, such as memory
requirements, rather than on more statistical problems, such as inference and estimation.
A research field closely related to data mining is knowledge discovery in databases (KDD). It
consists of mapping low-level data “into other forms that might be more compact, more abstract,
or more useful.” In general, the KDD process involves the following nine steps: (i)
understanding the application domain, (ii) creating a target data set, (iii) data cleaning and
preprocessing, (iv) data reduction and projection, (v) choosing the suitable data mining task,
(vi) choosing the suitable data mining algorithm, (vii) employing the data mining algorithm,
(viii) interpreting mined patterns, and (ix) using discovered knowledge.
Data mining, in turn, comprises a set of techniques and methods that can be applied to a wide
range of purposes and goals. Two categories stand out because they allow data mining to be
applied to various problems. The first is descriptive, which attempts to understand relations
among data instances, uncovering hidden patterns that provide the analyst with helpful
information for the task at hand. Clustering algorithms are an example. The second is
predictive, which analyzes the data to construct a model that imitates the unknown model assumed
to have generated the data. The prediction family is divided into classification, with
algorithms like KNN, SVM, or ANN, and regression, with techniques such as linear regression.
Finally, it is crucial to remark on some critical issues. Several aspects of data mining are
constantly being improved, both in research and in implementation, and they should be considered
whenever a data mining process is going to be carried out. The main considerations are
scalability of algorithms, high dimensionality, improving interpretability, handling
uncertainty, and privacy and security concerns. Each of these issues could significantly affect
the success of the data mining process, so each of them deserves careful attention.

References

1 Pacioli, L. (1494). Summa de Arithmetica geometria proportioni: et proportionalita... N.p.:
Paganino de paganini.
2 Goldberger, A.S. (1964). Econometric theory. Journal of the American Statistical Association
60 (312): 818–887.
3 Lovell, M.C. (1983). Data mining. Review of Economics and Statistics 65 (1): 12.
4 Selvin, H. and Stuart, A. (1966). Data dredging procedures in survey analysis. American Statisti-
cian 20 (3): 20–23.
5 Denton, F.T. (1985). Data mining as an industry. The Review of Economics and Statistics 67 (1):
124–127.
6 Caudill, S.B. (1988). The necessity of mining data. Atlantic Economic Journal 16 (3): 11.
7 Agrawal, R., Imielinski, T., and Swami, A. (1993a). Database mining: a performance perspective.
IEEE Transactions on Knowledge and Data Engineering 5 (6): 914–925.
8 Frawley, W.J., Piatetsky-Shapiro, G., and Matheus, C.J. (1992). Knowledge discovery in
databases: an overview. AI Magazine 13 (3): 57–70.

9 Michalski, R.S., Kerschberg, L., and Kaufman, K.A. (1992). Mining for knowledge in databases:
the INLEN architecture, initial implementation and first results. Journal of Intelligent Informa-
tion Systems 1: 85–113.
10 Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P. (1996). From data mining to knowledge discov-
ery in databases. AI Magazine 17 (3): 37–54.
11 Gamberger, D., Boskovic, R., Lavrac, N., and Groselj, C. (1999). Experiments with noise filtering
in a medical domain. Proceedings of the 16th International Conference on Machine Learning.
Morgan Kaufmann Publishers, pp. 143–151.
12 Zhu, X. and Wu, X. (2004). Class noise vs. attribute noise: a quantitative study. Artificial Intelli-
gence Review 22: 177–210.
13 Teng, C.M. (1999). Correcting noisy data. ICML, pp. 239–248.
14 Mantas, C.J. and Abellán, J. (2014). Credal-C4.5: decision tree based on imprecise probabilities
to classify noisy data. Expert Systems with Applications 41 (10): 4625–4637.
15 Lenzerini, M. (2002). Data integration: a theoretical perspective. Proceedings of the 21st ACM
SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp. 233–246.
16 Altman, N. and Krzywinski, M. (2018). The curse(s) of dimensionality. Nature Methods 15:
399–400. https://doi.org/10.1038/s41592-018-0019-x.
17 Lee, J.A. and Verleysen, M. (2014). Two key properties of dimensionality reduction methods.
2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM), pp. 163–170.
https://doi.org/10.1109/CIDM.2014.7008663.
18 Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components.
Journal of Educational Psychology 24: 417–441.
19 Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical
Magazine 2 (11): 559–572.
20 van der Maaten, L.J. and Hinton, G.E. (2008). Visualizing high-dimensional data using t-SNE.
Journal of Machine Learning Research 9: 2579–2605.
21 Kramer, M. (1991). Nonlinear principal component analysis using autoassociative neural net-
works. AIChE Journal 37: 233–243. https://doi.org/10.1002/aic.690370209.
22 Yang, H. and Schell, K.R. (2022). QCAE: a quadruple branch CNN autoencoder for real-time
electricity price forecasting. International Journal of Electrical Power & Energy Systems 141:
108092.
23 Agrawal, R., Imieliński, T., and Swami, A. (1993b). Mining association rules between sets of
items in large databases. 1993 ACM SIGMOD International Conference on Management of Data,
pp. 207–216.
24 Han, J., Cheng, H., Xin, D., and Yan, X. (2007). Frequent pattern mining: current status and
future directions. Data Mining and Knowledge Discovery 15 (1): 55–86.
25 Rodríguez-González, A.Y., Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A., and Ruiz-Shulcloper,
J. (2013). Mining frequent patterns and association rules using similarities. Expert Systems with
Applications 40 (17): 6823–6836.
26 Tamayo, P., Berger, C., Campos, M. et al. (2005). Oracle data mining. In: Data Mining and
Knowledge Discovery Handbook, (ed. Maimon, O. and Rokach, L.), 1315–1329. Boston, MA:
Springer. https://doi.org/10.1007/0-387-25465-X_63.
27 Chatterjee, S. and Hadi, A.S. (2006). Regression Analysis by Example, Wiley Series in Prob-
ability and Statistics. Wiley. ISBN: 9780470055458. https://books.google.com.mx/books?
id=uiu5XsAA9kYC.
28 Liu, Z. and Xia, C.H. (2008). Performance Modeling and Engineering. New York, NY: Springer.
ISBN: 9780387793610. https://books.google.com.mx/books?id=Zda0LxbnCgoC.
Another random document with
no related content on Scribd:
CHAPTER V.
CAMPAIGN OF KHALID AGAINST THE FALSE PROPHET
TOLEIHA.

A.H. XI. Nov. A.D. 632.

The materials for our story at this point


are few, obscure, and disconnected. The Materials for the first epoch
scene of confusion that reigned throughout imperfect.
Arabia is presented to our view in but dim and hazy outline. With the
Prophet’s life, Tradition proper ends. The prodigious stores of oral
testimony, which light up in minutest detail the career of Mahomet,
suddenly stop. The grand object of tradition was, from the oral
teaching and example of the Prophet, to supplement by authoritative
rulings what was wanting in the Corân. That motive ceased with the
death of Mahomet, and with it tradition, as such, ceases also.[24]
What history we have for the period immediately succeeding is in the
form of loose fragments—the statements, it may be, of
eyewitnesses, or gathered as hearsay from the memory of Arab
tribes, or from legends in the neighbouring conquered lands. Hence
it is that, after the death of Mahomet, we are left for a time to grope
our way by evidence always scanty and often discrepant. The further
back we go, the obscurity is the greater; and it is most so while, in
the first year of Abu Bekr’s Caliphate, Islam was struggling for
existence. There was little room then for thought beyond the safety
of the moment; and when at length the struggle was over, nothing
was left but the sense of relief from a terrible danger, and the
roughest outline of the way in which it had been achieved. No date is
given for any one of the many battles fought throughout the year.
Here and there we may be guided by the apparent sequence of
events; but as the various expeditions were for the most part
independent of one another, and proceeding simultaneously all over
the peninsula, even this indication too often fails.[25]
Arrangement of narrative of campaigns against apostate tribes.
Such being the case, the thread of our narrative here must run an
arbitrary course. Taking Tabari as our guide, we begin with the
campaign of Khâlid against
Toleiha in the north-east, and follow him thence southward to
Yemâma. We shall then take up the provinces assigned to other
leaders, as they lie geographically around the coasts—Bahrein,
Omân, Hadhramaut and Yemen.
Khâlid ibn Welîd.
After Abu Bekr and Omar, the most prominent figure in the story of
the early Caliphate is without doubt that of Khâlid,
son of Welîd. More to him than to any other is it due that Islam
spread with such marvellous rapidity. A dashing soldier, and brave
even to rashness, his courage was tempered by a cool and ever-
ready judgment. His conduct on the battle-fields which decided the
fate of the Persian empire and of the Byzantine rule in Syria, must
rank him as one of the greatest generals of the world. Over and
again he cast the die in crises where loss would have been
destruction to Islam, but always with consummate skill and heroism
which won the victory. The carnage following his arms gained for him
the title of The Sword of God; and so little regard had he for loss of
life even amongst his own followers, that he could wed the freshly-
made widow of his enemy on the field yet moistened by his people’s
blood. He had already distinguished himself in the annals of Islam.
Fighting, at the first, on the side of the Coreish, the defeat of the
Prophet at Ohod was due mainly to his prowess. At the capture of
Mecca, now in the ranks of the faithful, his was the only column
which shed blood; and shortly after, the cruel massacre of an
unoffending tribe brought down upon him the stern reproof of
Mahomet.[26] At the battle of Mûta, three years before, he had given
a signal proof of his generalship, when, the Moslem army having
been routed by Roman legions, and its leaders one after another
slain, he saved the shattered remnants by skilful and intrepid tactics
from destruction.[27] It was this Khâlid whom Abu Bekr now sent forth
against the rebel prophets Toleiha and Moseilama.
Khâlid marches towards the Beni Tay.
His column, by far the strongest of the eleven, was composed of the
flower of the Refugees from Mecca, as well as of the
men of Medîna, which latter marched under their own officer, Thâbit
son of Cays.[28] To divert the enemy’s attention, Abu Bekr gave out
that the destination was Kheibar, and (to strike the greater terror into
the insurgents) that he intended himself to join it there with a fresh
contingent. Khâlid, however, was not long in quitting the northern
route. Striking off to the right, he made direct for the mountain range
of Ajâ and Salmâ, the seat of the Beni Tay, and not distant from the
scene of Toleiha’s revolt among the Beni Asad.
Toleiha, the false prophet.
Of the doctrines of Toleiha, as of the other pretenders to the
prophetic office, we know little, nor indeed anything at all to
show wherein the secret of influence lay. A few doggrel verses and
dark or childish sayings are all that the contemptuous voice of
tradition has transmitted of their teaching, if such it can be called. So
far as appears, it was a mere travesty of Islam. Toleiha forbad
prostration during worship. ‘The Lord,’ he said, ‘hath not commanded
that ye should soil your foreheads in the dust, neither that ye should
double up your backs in prayer.’ Similarly Moseilama and Sajâh
remitted two of the five daily times of prayer. That four pretenders
(for Sajâh the prophetess was also such) should have arisen in
different parts of Arabia, and, even before the death of Mahomet,
drawn multitudes after them, would seem to imply something in their
doctrine deeper than senseless rhymes and more specious than
petty variations of the Moslem rite.[29] So much is clear, that the
spiritual sense of Arabia had been quickened by the preaching of
Mahomet, and that his example had not only suggested the claims of
others, but also contributed to their success. Jealousy of Mecca and
Medîna, moreover, and impatience of the trammels of Islam, were
powerful incentives for Bedouins to cast in their lot with these
pretenders. Thus the Beni Ghatafân, who before their submission to
Mahomet were in league with the Tay and Asad tribes, had recently
fallen out with them and lost some of their pasture-lands. Oyeina,[30]
chief of the Ghatafân, now counselled a return to their old relations
with the Beni Asad. ‘Let us go back,’ he said, ‘to our ancient alliance
which we had before Islam with them, for never since we gave it up
have I known the boundaries of our pasture-lands. A prophet of our
own is better than a prophet of the Coreish. Besides, Mahomet is
dead, but Toleiha is alive.’ So saying, Oyeina, followed by 700
warriors of his tribe, joined the false prophet at Bozâkha.
Khâlid reclaims the Beni Tay.
When first he heard of the heresy, Mahomet had deputed Dhirâr to the
Beni Asad, with instructions to rally the faithful
amongst them, and with their aid to crush Toleiha. The two
encountered one another, and the sword of Dhirâr, we are told,
glanced off from the person of his adversary. On this, a rumour
spread abroad that Toleiha led a charmed life, and thenceforward his
cause prospered. After their defeat at Abrac, the insurgents, as we
have seen, flocked to Toleiha at Bozâkha, and he was further
strengthened by the adhesion of two influential branches of the Beni
Tay.[31] Dhirâr found his position at last so insecure that he fled to
Medîna. The great family of the Beni Tay, however, was not wholly
disloyal, for Adî (as above mentioned) had already presented the
legal dues to Abu Bekr on behalf of some part of it. Adî therefore
was now sent forward by Khâlid to his people, in the hope of
detaching them from Toleiha’s cause. He found them in no friendly
humour. ‘The Father of the Foal!’ they cried (for such was the
sobriquet contemptuously used for Abu Bekr[32]); ‘thou shalt not
persuade us to do homage to him.’ ‘Think better of it,’ replied Adî; ‘an
army approacheth which ye cannot withstand. Ye shall know full
soon that he is no foal, but a lusty stallion. Wherefore see ye to it.’
Alarmed at his words, they begged for time that they might recall the
two branches which had joined Toleiha, ‘For,’ said they, ‘he will
surely hold them hostages, or else put them to death.’ So Khâlid
halted three days, and in the end they not only tendered submission,
but joined him with 1,000 horse, ‘the flower of the land of Tay, and
the bravest of them.’
Battle of Bozâkha.
Thus reinforced, Khâlid advanced against Toleiha. On the march his
army was exasperated by finding the bodies of
two of their scouts—one a warrior of note named Okkâsha—who
had been slain, and left by Toleiha to be trampled on the road.[33]
The armies met at Bozâkha, and the combat is said to have been hot
and long. At last (so we are told) the tide of battle was turned by
certain utterances of Toleiha, who was on the field in his prophetic
garb of hair. Oyeina fought bravely with his 700 of the Beni Fezâra.[34]
The situation becoming critical, he turned to Toleiha, saying,
‘Hath any message come to thee from Gabriel?’ ‘Not yet,’ answered
the prophet. A second time he asked, and received the same reply.
‘Yes,’ said Toleiha, a little after, ‘a message now hath come.’ ‘And
what is it?’ inquired Oyeina eagerly. ‘Thus saith Gabriel to me, Thou
shalt have a millstone like unto his, and an affair shall happen that
thou wilt not forget.’ ‘Away with thee!’ cried Oyeina scornfully; ‘no
doubt the Lord knoweth that an affair will happen that thou shalt not
forget! Ho, ye Beni Fezâra, every man to his tent!’ So they turned to
go; and thereupon the army fled. Toleiha escaped with his wife to
Syria. His subsequent history proved him a brave warrior; but he had
a poor cause, and the combat could hardly have been very severe,
as no mention is made of loss on either side.
Toleiha’s sequel.
His sequel is curious. At the first, Toleiha took refuge with the
Beni Kelb on the Syrian frontier; then when the Beni
Asad were pardoned, he returned to them and again embraced
Islam. Passing Medîna soon after on pilgrimage, he was seized and
carried to Abu Bekr, who set him at liberty, saying, ‘Let him alone.
What have I to do with him? The Lord hath now verily guided him
into the right path.’ When Omar succeeded to the Caliphate, he
presented himself to take the oath of allegiance. At first Omar spoke
roughly to him: ‘Thou art he that killed Okkâsha and his comrade. I
love thee not.’ ‘Was it not better,’ answered Toleiha, ‘that they by my
hand should obtain the crown of martyrdom, rather than that I by
theirs should have perished in hell-fire?’ When he had sworn
allegiance, the Caliph asked him concerning his oracular gift,[35] and
whether anything yet remained of it. ‘Ah,’ he replied, ‘it was but a puff
or two, as from a pair of bellows.’ So he returned to his tribe, and
went forth with them to the wars in Irâc, where, in the great struggle
with Persia, he became a hero of renown.
Beni Asad and other tribes received back into Islam.
After the battle of Bozâkha, the Beni Asad, fearing lest their
families should fall into the conqueror’s hand, tendered their
submission. The Beni Aámir, Suleim, and Hawâzin, tribes which had
stood aloof watching the event, now came in, and received from
Khâlid the same terms as the Beni Asad. They resumed the
profession of Islam with all its obligations, and in proof thereof
brought in the tithe. A full amnesty was then accorded, on condition
only that those who during the apostasy had taken the life of any
Moslem should be delivered up. These were now (to carry out the
Caliph’s vow) put to the like death as that which they had inflicted. If
they had speared their victims, cast them over precipices, drowned
them in wells, or burned them in the fire, the persecutors were now
subjected to the same barbarous and cruel fate.
A body of malcontents under Omm Siml discomfited.
Khâlid stayed at Bozâkha for a month, receiving the submission of the
people in the vicinity and their tithes. Troops of horse
scoured the country, and struck terror into the vacillating tribes
around. In only one direction was serious opposition met. Certain
malcontents from amongst the penitent and returning people, unable
to brook submission, gathered themselves together in a defiant
attitude. They had yet to learn that the grip of Islam was stern and
crushing. Their restless and marauding spirit preferred, perhaps,
even as a forlorn hope, to hold their enemy at bay; or they had
sinned beyond the hope of grace. Thus they assembled in a great
multitude around Omm Siml, daughter of a famous chieftain of the
Ghatafân. This lady’s mother, Omm Kirfa, had been captured and
put to a cruel death by Mahomet. She herself had waited upon
Ayesha as a captive maid in the Prophet’s household; but the
haughty spirit of her race survived the servitude. Mounted on her
mother’s war-camel, she led the force herself, and incited the
insurgents to a bold resistance. Khâlid proclaimed the reward of one
hundred camels to him who should maim her camel. It was soon
disabled; and, Omm Siml slain, the rout was easy.[36]
Oyeina, Corra, and Alcama released by Abu Bekr.
In this campaign the only persons taken captive were those who had
deeply compromised themselves as leaders in
rebellion. They were sent by Khâlid to Abu Bekr. The chief were
Oyeina, Corra, and Alcama. The story of this last, a chief of the Beni
Aámir, is curious. After the surrender of Tâyif he had fled to Syria.
On the death of Mahomet he returned, and incited his people to
rebellion. An expedition sent in pursuit of him had seized his family,
and carried them off captive to Medîna. He fled; but as all the
country-side had now submitted, there was no longer any way of
escape, and he was seized and delivered up to Khâlid. Corra, of the
same tribe, was one of those whom Amru, on his journey from
Omân, had found vacillating, and of whom he brought an evil report
to Abu Bekr. Oyeina, the marauding chieftain of the Fezâra, had
often been the terror of Medîna. When the city was besieged by the
Coreish, he offered his assistance on certain humiliating terms,
which the Prophet was near accepting; and he was one of the many
influential leaders ‘whose hearts,’ after the battle of Honein and
siege of Tâyif, ‘had been reconciled’ by the Prophet’s largesses. He
was now led into Medîna with the rest in chains, his hands tied up
behind his back. The citizens crowded round to gaze at the fallen
chief, and the very children smote him with their hands, crying out,
‘Enemy of the Lord, and apostate!’ ‘Not so,’ said Oyeina bravely; ‘I
am no apostate; I never was a believer until now.’[37] The Caliph
listened patiently to the appeal of the captives. He forgave them, and
commanded their immediate release.
Fujâa, a freebooter, burned alive.
Abu Bekr, as a rule, was mild in his judgments, and even generous to
the fallen foe. But on one occasion the
treachery of a rebel chief irritated him to an act of barbarous cruelty.
Fujâa, a leader of some note amongst the Beni Suleim, under
pretence of fighting against the insurgents in his neighbourhood,
obtained from the Caliph arms and accoutrements for his band. Thus
equipped, he abused the trust, and, becoming a freebooter, attacked
and plundered Moslem and Apostate indiscriminately. Abu Bekr
thereupon wrote letters to a loyal chief in that quarter to raise a force
and go against the brigand. Hard pressed, Fujâa challenged his
adversary to a parley, and asserted that he held a commission from
the Caliph not inferior to his. ‘If thou speakest true,’ answered the
other, ‘then lay aside thy weapons and accompany me to Abu Bekr.’
He did so, and followed, without further resistance, to Medîna. No
sooner did he appear than the Caliph, enraged at his treachery, cried
aloud: ‘Go forth with this traitor to the burial-ground, and there burn
him with fire.’ So, hard by in Backî, the graveyard of the city, they
gathered wood, and heaping it together at the Mosalla, or place of
prayer, kindled the pile, and cast Fujâa on it.
Abu Bekr regrets the act.
If the charges were well founded, which we have no ground for
doubting, Fujâa deserved the fate of a bandit; but to cast
him alive into the flames was a savage act, for which Abu Bekr was
sorry afterwards. ‘It is one of the three things,’ he used to say, ‘which
I would I had not done.’[38]
CHAPTER VI.
STORY OF MALIK IBN NOWEIRA.

A.H. XI. A.D. 632.

Khâlid advances south. A.H. XI. November (?) A.D. 632.
Having subdued the Beni Asad, and other tribes inhabiting the hills
and desert to the north-west of Medîna, Khâlid now
bent his steps southward, against the Beni Temîm who occupied the
plateau towards the Persian Gulf.
The Beni Temîm.
This great tribe had from time immemorial spread itself with
multitudinous branches over the pasture-lands and
settlements lying between Yemâma and the delta of the Euphrates.
Some of its clans professed Christianity, but the greater portion were
heathen. They used in past times to have frequent passages, often
of a hostile character, with Persia.[39] Most part of this people had
submitted to the claims of Mahomet, and the oratorical contest
between their embassy and the poets of Medîna forms a curious
episode in the Prophet’s life.[40] His death had produced amongst
them the same unsettlement and apostasy as elsewhere. Abu Bekr’s
first early success resulted, as we have seen, in bringing some of
their chiefs to Medîna with the tithes. Meanwhile a strange
complication had arisen which embroiled the Beni Yerbóa, one of
their clans, commanded by the famous Mâlik ibn Noweira, and
eventually brought Khâlid on the scene.
Sajâh the prophetess gains over Mâlik ibn Noweira, chief of Beni Yerbóa.
It was no less than the advent of Sajâh, a prophetess, at the
head of a great host from Mesopotamia. She was descended from
the Beni Yerbóa, but her family had migrated north, and joined the
Beni Taghlib, among whom in Mesopotamia she had been brought
up as a Christian. How long and by what steps she had assumed the
prophetic office, and what (if any) were her peculiar tenets, we do
not know; for nothing of hers excepting some childish verses has
been preserved. At the head of the Taghlib and other
Christian tribes,[41] each led by its own
captain, she had crossed into Arabia, hoping to profit by the
confusion that followed on the death of Mahomet, and was now on
her way to attack Medîna. Reaching the seats of the Beni Temîm,
she summoned to her presence the Beni Yerbóa, her own clan, and
promised them the kingdom, should victory crown her arms. They
joined her standard, with Mâlik ibn Noweira at their head. The other
clans of the Beni Temîm refused to acknowledge the prophetess;
and so, diverted from her design upon Medîna, she turned her arms
against them. In a series of combats, though supported by Mâlik, she
was worsted. Then, having made terms and exchanged prisoners,
she bethought her of attacking the rival prophet, Moseilama of
Yemâma, whose story I must here in some part anticipate.
Sajâh, having married Moseilama, retires to Mesopotamia.
Moseilama was strongly supported by his own people, the Beni Hanîfa,
in his claim to be their prophet and ruler; but he
now felt that the meshes of Abu Bekr were
closing round him. The Caliph’s officers were rallying the yet loyal or
vacillating chiefs in Hejer; and Khâlid, whom Moseilama dreaded
most of all, was behind. Tidings of the approach of a new enemy at
this crisis added to his perplexity; and he therefore sent a friendly
message to the prophetess to come and meet him. She came, and
they found their sentiments so much in unison that they cemented
the alliance by marriage. Moseilama conceded to her one half-share
of the revenues of Yemâma—the share, he said, which belonged to
the Coreish, but which, by their tyranny and violence, they had
forfeited. After a few days she departed again to her own country,
leaving a party with three of her officers to collect the stipulated
tribute. Like a meteor, this strange personage disappeared as soon
almost as she had startled Arabia by her advent; and we hear no
more of her.[42]
Mâlik ibn Noweira and the Beni Yerbóa attacked by Khâlid.
Khâlid, flushed with victory, was now drawing near, and most of
the branches of the Temîm were forward in tendering their
submission to him. At this critical juncture, the withdrawal of
Sajâh, and his own previous doubtful attitude, left Mâlik ibn
Noweira at the head of the Beni Yerbóa in
a position of some perplexity, and he was undecided how to act.[43]
On the other hand, conflicting news divided the Moslem camp. For
some reason Khâlid was bent on attacking the Beni Yerbóa. The
men of Medîna[44] were equally opposed to the design, for which
they alleged that Khâlid had from the Caliph no authority. It would
have been better for him had he listened to the remonstrance. But he
replied haughtily, ‘I am commander. In the absence of orders, it is for
me to decide. I will march against Mâlik ibn Noweira with the men of
Mecca, and with such others as choose to follow me. I compel no
man.’ So he went forward and left the malcontents behind. These,
however, thought better of it, and rejoined the army. Khâlid marched
straight upon Bitâh, the head-quarters of Mâlik, but he found not a
soul upon the spot. It was utterly deserted.
Mâlik brought a prisoner into Khâlid’s camp;
In fact, Mâlik had resolved on submission, though his proud spirit
rebelled against presenting himself before
Khâlid. He knew the ordinance of Abu Bekr, that none but they who
resisted his arms, and refused the call to prayer, should be molested.
So he told his people that there was no longer use in opposing this
new way, but that, bowing down, they should suffer the wave to pass
over them: ‘Break up your camp,’ he said, ‘and depart every one to
his house.’ Khâlid finding things thus, was not content, but, treating
the neighbourhood as enemy’s land, sent forth bands everywhere to
slay and plunder, and take captive all that offered opposition or failed
to respond to the call for prayer. Amongst others, Mâlik was brought
in with his wife and a party of his people. When challenged, they had
replied that they too were Moslems. ‘Why, then, these weapons?’ it
was asked. So they laid aside their arms and were led as captives to
the camp. As they passed by Khâlid, Mâlik cried aloud to him, ‘Thy
master never gave command for this.’ ‘“Thy master,” sayest thou?’
was the scornful reply of Khâlid; ‘then, rebel, by thine own
admission, he is not thine!’
and, with other prisoners, put to death.
The captors differed in their evidence. Some averred that the
prisoners had offered resistance. Others, with Abu
Catâda, a citizen of Medîna, at their head, deposed that they had
declared themselves Moslems, and at once complied with the call to
prayer. So they were remanded till morning under an armed guard.
The night set in cold and stormy, and Khâlid (such is his
explanation), with the view of protecting them from its inclemency,
gave the guard command ‘to wrap their prisoners.’ The word was
ambiguous, signifying in another dialect[45] not ‘to wrap,’ but ‘to slay,’
and Dhirâr, commandant of the guard, taking it in that sense, put the
prisoners, and with them Mâlik, forthwith to the sword. Khâlid,
hearing the uproar, hurried forth; but all was over, and he retired,
exclaiming, ‘When the Lord hath determined a thing, the same
cometh verily to pass.’ But the fate of Mâlik was not thus easily to be
set at rest. He was a chief of name and influence, and a poet of
some celebrity. The men of Medîna who had opposed the advance
were shocked at his cruel fate. Abu Catâda roundly asserted the
responsibility of Khâlid. ‘This is thy work!’ he said; and, though
chided for it, he persisted in the charge. He declared that never
again would he serve under Khâlid’s banner. In company with
Motammim, Mâlik’s brother, he set out at once for Medîna, and there
laid a formal complaint before the Caliph. Omar, with his native
impetuosity, took up the cause of the Yerbóa chief. Khâlid had given
point to the allegations of his enemies by marrying Leila, the
beautiful widow of his victim, on the spot. From this scandalous act,
Omar drew the worst conclusion. ‘He hath conspired to slay a
believer,’ he said, ‘and hath gone in unto his wife.’ He was instant
with Abu Bekr that the offender should be degraded and put in
bonds, saying, ‘The sword of Khâlid, dipped thus in violence and
outrage, must be sheathed.’ ‘Not so,’ replied the Caliph (of whom it is
said that he never degraded one of his commanders); ‘the sword
which the Lord hath made bare against the heathen, shall I sheathe
the same? That be far from me.’ Nevertheless, he summoned Khâlid
to answer for the charge.
Khâlid exonerated by Abu Bekr;
Khâlid lost no time in repairing to Medîna. He went up straightway to
the Great Mosque, and entered it in his rough
field costume, his clothes rubbed rusty with his girded armour, and
his turban coiled rudely about the head with arrows stuck in it. As he
passed along the courtyard towards the Caliph’s place, Omar could
not restrain himself, but seizing the arrows from his turban, broke
them over his shoulders, and abused him as hypocrite, murderer,
and adulterer. Khâlid, not knowing but that Abu Bekr might be of the
same mind, answered not a word, but passed into the Caliph’s
presence. There he told his story, and the explanation was accepted
by Abu Bekr;—only he chided him roughly for having thus
incontinently wedded his victim’s widow, and run counter to the
custom and feelings of the Arabs in celebrating his nuptials on the
field. As Khâlid again passed Omar, he lightly rallied him in words
which showed that he had been exonerated. Motammim then
pressed the claim, as one of honour, for payment of his brother’s
blood-money, and release of the prisoners that remained. For the
release Abu Bekr gave command, but the payment he declined.
but held guilty by Omar.
Omar remained unconvinced of the innocence of Khâlid, and still was
of opinion that he should be withdrawn from
his command. He persevered in pressing this view upon Abu Bekr,
who would reply, ‘Omar, hold thy peace! Refrain thy tongue from
Khâlid. He gave an order, and the order was misunderstood.’ But
Omar heeded not. He neither forgave nor forgot, as in the sequel we
shall see.
Mâlik’s death commemorated in verse by his brother.
The scandal was the greater, because Mâlik ibn Noweira was a chief
renowned for his generosity and princely virtues, as well as for
poetic talent. His brother,
Motammim, a poet likewise of no mean fame, commemorated his
tragic end in many touching verses. Omar loved to listen to his
elegies; and he used to tell Motammim that if he had himself
possessed the poetic gift, he would have had no higher ambition
than to mourn in such verse over the fate of his own brother Zeid,
who shortly after fell at Yemâma.[46]
The affair leaves a stain on Khâlid’s fame.
The materials are too meagre to judge conclusively whether the right
in this grave matter is on the side of Omar or of the
Caliph, Abu Bekr. Although the hostile bias of Khâlid against Mâlik
led undoubtedly to the raid upon his tribe and the harsh treatment
which followed thereupon, still, with the conflicting evidence, we may
hold the deeper charge unproven. But in wedding the widow of his
enemy while his blood (shed as we are to believe in misconception
of his order) was fresh upon the ground, Khâlid, if he gave no colour
to darker suspicions, yet transgressed the proprieties even of Arab
life, and justified the indictment of unbridled passion and cold-
blooded self-indulgence.[47]
CHAPTER VII.
BATTLE OF YEMAMA.

End of A.H. XI. Beginning of 633 A.D.

Campaign of Khâlid against Moseilama. January, A.D. 633.[48]
But sterner work was in reserve for Khâlid. In the centre of Arabia,
and right in front of his army, some marches east, lay
Yemâma. There resided the Beni Hanîfa, a
powerful branch of the great tribe Bekr ibn Wâil. Partly Christian and
partly heathen, the Beni Hanîfa had submitted to Mahomet; but they
were now in rebellion, 40,000 strong, around their prophet
Moseilama. It was against these that Khâlid next directed his steps.
Moseilama’s previous story.
The beginning of Moseilama’s story belongs to the life of
Mahomet.[49] Small in
stature, and of a mean countenance, he had yet qualities which fitted
him for command. He visited Medîna with a deputation from his
people, and it was pretended that words had then fallen from
Mahomet signifying that he would yet be a sharer with him in the
prophetic office. Building thereon, Moseilama advanced his claim,
and was accepted by his people as their prophet. When summoned
by Mahomet to abandon his impious pretensions, he sent an insolent
answer claiming to divide the land. Mahomet replied in anger, and
drove the ambassadors from his presence. To counteract his
teaching, he deputed Rajjâl, a convert from the same tribe, who had
visited Medîna, and there been instructed in the Corân.[50] On
returning to his people, however, this man also was gained over by
the pretender to espouse his claims as founded on the alleged
admission of Mahomet himself. Moseilama, we are told, deceived
the people by tricks and miracles; aped, in childish terms, the
language of the Corân; and established a system of prayers similar
to those of Mahomet. In short, his religion, so far as we can tell, was
but a wretched imitation of Islam.[51] At the period we have now
reached, he had just rid himself of Sajâh, the rival prophetess, by the
singular expedient of taking her to wife, and then bribing her by half
the revenues of Yemâma to return from whence she came. Parties of
Mesopotamian horse were still about the country collecting her dues,
when Khâlid’s approach changed the scene; and Moseilama,
marching out with a great army to meet him, pitched his camp at
Acraba.
Ikrima suffers a reverse.
Ikrima and Shorahbîl were the commanders originally despatched by Abu
Bekr to quell the rising at Yemâma,[52] and
both suffered at the hands of Moseilama from a hasty and
unguarded advance. Ikrima, anxious to anticipate his fellow, hurried
forward, and was driven back with loss. The details (as generally the
case when tradition deals with a defeat) are wanting; but the reverse
was so serious that Abu Bekr, in reply to the despatch reporting it,
wrote angrily to Ikrima. ‘I will not see thy face,’ he said, ‘nor shalt
thou see mine, as now thou art. Thou shalt not return hither to
dishearten the people. Depart unto the uttermost coasts, and there
join the armies in the east of the land, and then in the south.’ So,
skirting Yemâma, he went forward to Omân, there to retrieve his
tarnished reputation. Shorahbîl, meanwhile, was directed to halt and
await the approach of Khâlid.[53]
Khâlid sets out for Yemâma.
It was after the reverse of Ikrima that Khâlid, on being summoned to
Medîna on the affair of Mâlik ibn Noweira, received the
commission to attack Moseilama. In anticipation of serious
opposition, the Caliph promised to strengthen his army by a fresh
column composed of veterans from amongst the men of Mecca and
Medîna. So Khâlid returned to his camp at Bitâh, and when these
reinforcements came up, he marched in strength to meet the enemy.
It was now that Shorahbîl, whose troop formed the vanguard,
hastening forward like Ikrima, met with a like reverse, and was
severely handled by Khâlid for his temerity.
Mojâa, a chief of the Beni Hanîfa, taken prisoner.
While yet a march from Acraba, Khâlid surprised a mounted body of the
Beni Hanîfa under command of the chief Mojâa. They were returning from
a raid against a neighbouring tribe, unaware of the approach of the
Mussulman army. But they belonged to the enemy, and as such were
all put to the sword, excepting Mojâa, whom Khâlid spared, as he
said he promised to be useful on the coming eventful day, and kept
chained in his tent under charge of Leila, his lately espoused wife.
Battle of Acraba or Yemâma.
On the morrow, the two armies met upon the sandy plain of Acraba. The
enemy rushed on with desperate bravery.
‘Fight for your loved ones!’ cried the son of Moseilama; ‘it is the day
of jealousy and vengeance; if ye be worsted, your maidens will be
ravished by the conqueror, and your wives dragged to his foul
embrace!’ So fierce was the shock that the Moslems were driven
back, and their camp uncovered. The tent of Khâlid was entered by
the wild Bedouins; and, but for the chivalry of her captive, who
conjured his countrymen to spare a lady of such noble birth, Leila
would have perished by their swords. ‘Go, fight against men,’ Mojâa
cried, ‘and leave this woman;’ so they cut the tent-ropes and
departed. There was danger for Islam at the moment. Defeat would
have been disastrous; indeed, the Faith could hardly have survived
it. But now the spirit of the Moslems was aroused. Khâlid, knowing
the rivalry between the Bedouin and the city Arabs, separated them
to fight apart. On this they rallied one the other; and the sons of the
desert cried: ‘Now we shall see the carnage wax hot amongst the
raw levies of the town. We will teach them how to fight!’ Prodigies of
valour were fought all round. The heroic words and deeds of the
leaders, as one after another fell in the thick of battle, are dwelt on
by the historian with enthusiasm. Zeid, the favourite brother of Omar,
who led the men of Mecca, singled out Rajjâl, and, reproaching his
apostasy, despatched him forthwith. A furious south wind, charged
with the desert sand, blew into the faces of the Moslems, and,
blinding them, caused a momentary pause. Upbraiding them for their
slackness, Zeid cried out: ‘I shall follow them that have gone before;
not a word will I utter more, till we beat the apostates back, or I
appear to clear myself before my Lord. Close your eyes and clench
your teeth. Forward like men!’ So saying, he led the charge and fell.
Abu Hodzeifa, another Companion of note, calling out ‘Fight for the
Corân, ye Moslems, and adorn it by your deeds!’ followed his
example and shared his fate. Seeing this, Abu Hodzeifa’s freedman,
Sâlim, seized the banner from his dying master, and exclaiming, ‘I
were a craven bearer of the Corân if I feared for my life,’ plunged into
the battle and was slain.[54] Nor were the citizens of Medîna behind
their fellows. Their commander, Thâbit ibn Cays, reproached them
indignantly: ‘Woe be to you,’ he said, ‘because of this backsliding.
Verily, I am clear of ye, even as I am clear of these,’ and he pointed
to the enemy as he flung himself and perished in their midst.
Animated thus, the rank and file charged furiously. Backwards and
forwards swayed the line, and heavy was the carnage. But urged by
Khâlid’s valiant arm,[55] and raising the grand battle-cry ‘Yâ
Mohammedâ!’ the Moslem arms at length prevailed. The enemy
broke and began to give. ‘To the garden!’ cried Mohakkem, a brave
leader of the Beni Hanîfa; ‘to the garden, and close the gate!’ Taking
his stand, he guarded their retreat as they fled into an orchard
surrounded by a strong wall, and Moseilama with them. The Moslem
troops, following close, soon swarmed all round the wall, but found
no entrance anywhere.
[Sidenote: The Garden of Death.]
At last Berâa, one of the Twelve,[56] cried, ‘Lift me aloft upon
the wall.’ So they lifted him up. For a moment, as he looked on the
surging mass below, the hero hesitated; then, boldly leaping down,
he beat right and left, until he reached the gate, and threw it open.
Like waters pent up, his comrades rushed in; and, as beasts of the
forest snared in a trap, so wildly struggled the brave Beni Hanîfa in
the Garden of Death. Hemmed in by the narrow space, and
hampered by the trees, their arms useless from their very numbers,
they were hewn down, and perished to a man. The carnage was
fearful, for besides the slain within the walls, an equal number were
killed on the field, and again an equal number in the flight.[57]
[Sidenote: The Beni Hanîfa discomfited, with great slaughter on both sides.]
The Moslems, too, despite their splendid victory, had cause to remember
the Garden of Death and the battle of Yemâma, for their loss was beyond
all previous
experience. Besides those killed hand to hand in the garden, great
numbers fell in the battle when their ranks wavered and gave way.
