
Information Fusion 50 (2019) 92–111

Contents lists available at ScienceDirect

Information Fusion
journal homepage: www.elsevier.com/locate/inffus

Full Length Article

Data fusion and machine learning for industrial prognosis: Trends and
perspectives towards Industry 4.0
Alberto Diez-Olivan a, Javier Del Ser a,b,c,∗, Diego Galar a,d, Basilio Sierra e
a TECNALIA, Donostia-San Sebastián 20009, Spain
b Department of Communications Engineering, University of the Basque Country (UPV/EHU), Bilbao 48013, Spain
c Basque Center for Applied Mathematics (BCAM), Bilbao, Bizkaia 48009, Spain
d Department of Civil, Environmental and Natural Resources Engineering, Operation, Maintenance and Acoustics, Luleå University of Technology, Luleå, Sweden
e Department of Computer Sciences and Artificial Intelligence, University of the Basque Country (UPV/EHU), Donostia-San Sebastián 20018, Spain

Article info

Keywords: Data-driven prognosis; Data fusion; Machine learning; Industry 4.0

Abstract

The so-called “smartization” of manufacturing industries has been conceived as the fourth industrial revolution or Industry 4.0, a paradigm shift propelled by the upsurge and progressive maturity of new Information and Communication Technologies (ICT) applied to industrial processes and products. From a data science perspective, this paradigm shift allows extracting relevant knowledge from monitored assets through the adoption of intelligent monitoring and data fusion strategies, as well as by the application of machine learning and optimization methods. One of the main goals of data science in this context is to effectively predict abnormal behaviors in industrial machinery, tools and processes so as to anticipate critical events and damage, eventually causing important economical losses and safety issues. In this context, data-driven prognosis is gradually gaining attention in different industrial sectors. This paper provides a comprehensive survey of the recent developments in data fusion and machine learning for industrial prognosis, placing an emphasis on the identification of research trends, niches of opportunity and unexplored challenges. To this end, a principled categorization of the utilized feature extraction techniques and machine learning methods will be provided on the basis of its intended purpose: analyze what caused the failure (descriptive), determine when the monitored asset will fail (predictive) or decide what to do so as to minimize its impact on the industry at hand (prescriptive). This threefold analysis, along with a discussion on its hardware and software implications, intends to serve as a stepping stone for future researchers and practitioners to join the community investigating on this vibrant field.

1. Introduction

Industry 4.0 is a global modernization movement in the manufacturing industry towards the adaptation of recent advances in the ICT realm: new communication systems and protocols, cyber security standards, multi-device displays, mobile and compact communication devices with ever-growing computational capabilities and deployable Artificial Intelligence methods, among many others. In parallel with the development of this worldwide trend, the Internet has grown at unprecedented scales to become ubiquitous in all economic and social aspects of human life. The industrial manufacturing sector has also been clearly affected by this change of paradigm, resulting in the widespread adoption of new digital technologies within its processes and assets. Indeed, the merging of the physical and digital worlds lies at the core of this industrial revolution, establishing the basis for the smart factories of the future.

This paradigm shift has been defined as the fourth industrial revolution or Industry 4.0 (Industrie 4.0 in Germany [1–3] or Industrial Internet in USA [4]), on the basis of the end-to-end deployment of the aforementioned ICT advances in production processes, covering from

Abbreviations: ANFIS, Adaptive Neuro-Fuzzy Inference System; ANNs, Artificial Neural Networks; BPNN, Back Propagation Neural Networks; DBN, Deep Belief
Networks; DWT, Discrete Wavelet Transformation; EM, Expectation Maximization; EWMA, Exponentially Weighted Moving Average; FFT, Fast Fourier Transform;
FPCA, Functional Principal Component Analysis; GMM, Gaussian Mixture Models; GRBMs, Gaussian–Bernoulli Restricted Boltzmann Machines; GRNN, General
Regression Neural Network; HMM, Hidden Markov Model; kNN, k-Nearest Neighbors; KDE, Kernel Density Estimator; LAD, Logical Analysis of Data; LOF, Local
Outlier Factor; PCA, Principal Component Analysis; PoF, Physics of failure; RBM, Restricted Boltzmann Machines; RNN, Recurrent Neural Networks; SARMA, Seasonal
Autoregressive Moving Average; SBM, Similarity Based Modeling; SOM-MQE, Self-Organizing Map Minimize Quantization Error; SVMs, Support Vector Machines;
VCM, Vibration-based Condition Monitoring.

Corresponding author at: TECNALIA Research & Innovation. P. Tecnologico, Ed. 700., Derio, Bizkaia 48170, Spain.
E-mail address: javier.delser@tecnalia.com (J. Del Ser).

https://doi.org/10.1016/j.inffus.2018.10.005
Received 6 July 2018; Received in revised form 25 September 2018; Accepted 14 October 2018
Available online 15 October 2018
1566-2535/© 2018 Elsevier B.V. All rights reserved.

the product design phase to the product life-cycle management through manufacturing and related logistics phases. Traditional production systems are typically static, hierarchical processes that incur disruptive changes and important costs when adapting production policies and product portfolios to the requirements imposed by the market. Current markets demand more flexible solutions, with high levels of production customization to be met while ensuring the profitability of smaller product runs. After-sales services are mainly focused on product maintenance, hence this aspect is of equal concern at this point.

Based on the definition of Industry 4.0 introduced above, we can identify three levels of implementation of technology from a production perspective:

• Vertical integration: in the context of production and automation, this concept refers to the integration of diverse ICT systems into different hierarchical levels, from the very basic ones (e.g. sensors and actuators) to the highest levels of production management, execution, planning and scheduling. This level of integration supports manufacturing processes, making them more flexible.
• Horizontal integration: this level includes the integration of ICT technologies into mechanisms and agents involved in the different stages of the manufacturing processes and business planning; this means exchanging energy and information within a company (e.g. input and output logistics, production and commercialization), and between companies and entities (value networks).
• Circular integration: vertical and horizontal integrations are joined to link the end user and the product life cycle. This integration ends the production loop; therefore, a whole end-to-end digitalization is fully achieved, from the initial design stages, to planning and manufacturing, the logistics and resources management mechanisms and, finally, to the end user and product related services.

The above concepts are increasingly embraced in strategic plans of entities and companies all over Europe, America and Asia [5]. Examples abound: to mention a few, the technological prototype coined as Digital Factory or Industry 4.0 Demonstrator was created in the UK to embody a living laboratory in which industrial stakeholders can explore and assess the potentiality of smart ICT technologies for their production processes [6]. In further detail, the demonstrator consists of a real production line connected to a 3D virtual factory, designed to demonstrate the capabilities of customization and personalization. Another exemplifying case is the Basque Country region in Spain, with a clear governmental push towards prioritizing science, technology and innovation efforts of its research ecosystem towards advanced manufacturing and Industry 4.0 so as to meet the goals of Horizon 2020, as specified in the Basque Industry 4.0 strategy [7]. Likewise, in recent years the German and US governments have promoted separate yet similar initiatives to accelerate the adoption of the Internet of Things (IoT) and smart analytics in manufacturing industries towards improving the overall performance, quality, and controllability of their manufacturing processes [8]. Other contributions in the literature have also stressed the crucial role played nowadays by IoT and cyber physical systems as technology enablers for predictive production systems, i.e. intelligent manufacturing systems wherein networked assets are equipped with self-awareness to predict, find root causes, and reconfigure faulty events automatically [8,9].

The increasing amount of information available in industrial plants motivates the adoption of data fusion and machine learning methods for addressing specific industrial requirements and needs [10]. A special focus is placed on prognosis, namely, the capability to estimate and anticipate events of interest regarding industrial assets and production processes [11]. There lies indeed the core challenge of the Industry 4.0 paradigm from a data science perspective: data-driven prognostic approaches aim at predicting when an abnormal behavior is likely to arise within the monitored process, providing further insights such as its severity and impact on the plant performance. For this reason it becomes particularly interesting to characterize normality properly towards unveiling degradation patterns or trends. Following the generic diagram in Fig. 1, the idea is to characterize behavioral patterns of interest on the basis of the data monitored from the process or asset under study (training data) by means of mathematical algorithms (machine learning models). This acquired knowledge can then be applied to new unseen data (test data) to tackle a wide variety of problems (hypotheses), including prediction, classification and anomaly detection, among others. This task is especially challenging nowadays, since it involves processing and analyzing huge amounts of data and additional information coming from different, diverse monitoring systems and smart devices.

When data models are to be designed and deployed in an industrial setup, the de facto methodology for data-driven industrial prognosis is the so-called Cross Industry Standard Process for Data Mining (CRISP) shown in Fig. 2. This methodology builds upon the standard process cycle for data mining towards conceiving it as a set of steps along a workflow: from business and data understanding to the evaluation and deployment of the produced models, going through data preparation and modeling phases. Once models have been deployed on an online monitoring platform, their output can be a recommendation, a warning, a critical alarm or even an optimal planning and scheduling of maintenance operations.

Aiming to achieve good performance scores, the modeling step must be enriched with diverse data collected from different sources of information. In this context multiple sensor data fusion is usually performed to provide the prognostic models with consolidated information correlated with the condition of the industrial assets and production processes under study [12] or the activity of the workers that interact with them [13]. Of special interest are the operational parameters and the specific process conditions, e.g. asset load, working hours or failure rates. Additional relevant sources of information are contextual variables, e.g. external conditions, such as temperature or humidity, and expert knowledge about the process under study. Whenever this latter source of information is available and it can be modeled and integrated into the prognostic approach, it can support the learning phase and enhance the obtained results significantly. Thus, more complex and human-centered frameworks can be found beyond physical and contextual monitoring data, combining lower level information and high level knowledge [14]. Another approach is to artificially obtain partial predictions that can be smartly combined by means of data-driven model hybridization or ensemble methods to produce a more robust prediction, and even to provide a recommendation or maintenance operation to avoid the envisaged faulty condition [15].

This paper capitalizes on the great momentum that industrial prognostic models have gained within the Industry 4.0 paradigm by examining the most recent and influential literature related to data fusion and machine learning methods for this particular class of data-based modeling problems. For the sake of a structured survey we will hereafter classify prognostic models depending on the hypothesis or goal for which they are designed. As such, we will deal with descriptive prognostic models (e.g. those for unsupervised pattern classification and health management), predictive prognostic models (correspondingly, embracing those for condition-based and predictive maintenance) and prescriptive prognosis models (namely, those whose output drives optimized production schedules, life cycle optimization and supply chain management and logistics). Given the strong interdependencies found among assets and processes in complex industrial plants, it is often the case that prognostic methods are applied not to a single asset, a product or a process, but to several assets at the same time, thus involving the confluence of very heterogeneous data sources in the model design. Therefore, the survey also places a special emphasis on the role that data fusion has taken in the advent of industrial prognosis, emerging lately as a data preprocessing phase of utmost necessity in upsurging data-intensive industrial ecosystems. Our literature analysis concludes by identifying a set of challenges in this field that remain insufficiently addressed to date, which are further discussed in detail so as to stimulate research efforts invested in such directions.


Fig. 1. Generic schematic diagram of a machine learning process.

Fig. 2. The CRISP methodology for data-driven industrial prognosis.
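
The generic process of Fig. 1 can be illustrated with a minimal sketch, assuming two hypothetical sensors (vibration and temperature) whose synthetic windows are summarized by simple statistics, fused at the feature level by concatenation, and used to train a nearest-centroid classifier that is then applied to unseen test windows. The sensor names, the signal parameters and the choice of classifier are illustrative assumptions and are not taken from any of the surveyed works.

```python
import random
import statistics


def window_features(signal):
    """Summarize one sensor window by its mean and standard deviation."""
    return [statistics.fmean(signal), statistics.pstdev(signal)]


def fuse(vibration, temperature):
    """Feature-level fusion: concatenate the per-sensor feature vectors."""
    return window_features(vibration) + window_features(temperature)


def make_window(rng, faulty, length=50):
    """Synthetic (hypothetical) data: faults raise vibration spread and temperature."""
    vib_sigma = 2.0 if faulty else 1.0
    temp_mean = 75.0 if faulty else 60.0
    vibration = [rng.gauss(0.0, vib_sigma) for _ in range(length)]
    temperature = [rng.gauss(temp_mean, 1.0) for _ in range(length)]
    return fuse(vibration, temperature)


def fit_centroids(features, labels):
    """Training step: one centroid per class in the fused feature space."""
    centroids = {}
    for label in set(labels):
        rows = [f for f, y in zip(features, labels) if y == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, feature):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(feature, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))


rng = random.Random(0)

# Training data: labeled windows (True = faulty, False = normal).
train_labels = [i % 2 == 1 for i in range(40)]
train_X = [make_window(rng, y) for y in train_labels]
model = fit_centroids(train_X, train_labels)

# Test data: new, unseen windows evaluated with the learned model.
test_labels = [i % 2 == 1 for i in range(20)]
test_X = [make_window(rng, y) for y in test_labels]
predictions = [predict(model, x) for x in test_X]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(f"test accuracy: {accuracy:.2f}")
```

Features are left unscaled for brevity; fused features with different units would typically be normalized before distances are computed.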

The remainder of this manuscript is structured according to the overarching goals of the paper as exposed above: Section 2 introduces and elaborates on the literature analysis central to our study by following the adopted threefold classification in descriptive (Section 2.1), predictive (Section 2.2) and prescriptive (Section 2.3) models for industrial prognosis. Implications of the implementation and deployment of data fusion and prognostic models on industrial hardware and communications are identified and discussed in Section 3. A critical argumentation and analysis of the research niches, opportunities and open challenges stemming from the study is provided in Section 4. Finally, Section 5 ends the paper by drawing concluding remarks and summarizing the research paths that deserve most efforts from the community in the near future.

2. Data fusion and machine learning for industrial prognosis

When inspecting the amount of literature related to industrial prognosis, several criteria can be embraced for its classification and analysis, such as the industrial sector under scope, the nature of the data handled by the models or the type of asset/process that benefits from the application of prognostic models. However, this study starts from the conception of prognosis as a data-based workflow aimed at solving one out of three different goals:

1. Describe the use case under study based on the data captured in the industrial plant, without making any assumptions on the root cause of the problem and/or the presence of patterns of interest within the retrieved data. Therefore, descriptive prognostic models ignore any a priori assumption that could bias their obtained insights from data, hence focusing on the extraction of added value in a blind, unbiased fashion. This is the case of unsupervised machine learning, with clustering techniques and outlier detection methods lying at the heart of many practical cases of industrial prognosis reported so far.
2. Predict the time at which a fault in monitored equipment will occur, and eventually its severity and coverage over the production chain. In this case predictive prognostic models rely on a dataset of fault events that occurred in the past, from which a learning algorithm learns the pattern correlating the captured data from the monitored asset to a target variable characterizing the fault to be predicted (e.g. a probability of occurrence, a measure of severity or its location within the process chain). From a machine learning perspective, supervised learning models are at the forefront of this category.
3. Prescribe optimal actions as a result of a fault alarm over the plant. When the alarm is raised by a predictive model before the fault occurs, the actions of prescriptive prognosis models aim at reducing its chances to occur by modifying working parameters and variables of the industrial process eventually affected by the fault. On the contrary, if the alarm results from a confirmed fault, models from this category are rather used to minimize its impact over the production of the industry, e.g. by optimally rerouting assets to non-faulty production lines, or by allocating human resources for unexpected maintenance operations. Such a casuistry is often modeled as an optimization problem, whose objective(s) are often driven by the outcome of predictive prognostic models. Therefore, optimization solvers prevail within this category.

It is noteworthy to clarify that the above classification does not imply that contributions in the literature related to industrial prognosis must be discriminated and categorized exclusively as an instance of descriptive, predictive or prescriptive prognosis. Indeed, it is often the case that a certain application scenario requires hybridizing models of different kinds for manifold purposes. The aforementioned example, which combines predictive prognosis (e.g. to predict the probability of a machine undergoing a fault) with prescriptive prognosis (to tune the machine configuration so that faults become less likely), is among the most representative and intuitive scenarios that illustrate this mixture of approaches.

With this methodological criterion in mind, we now comprehensively examine the latest literature related to industrial prognosis. Each of the analyses provided in what follows identifies and highlights technological trends in terms of models and data fusion techniques, as well as industrial sectors where the study of data-based prognosis has been particularly notable in recent years. Fig. 3 overviews the scenarios and data-driven methods for industrial prognostics reviewed in this paper.

Fig. 3. Classification of industrial scenarios and data-driven analytical methods for industrial prognosis reviewed in this work, along with the most representative references for each category. Sections 2.1–2.3 elaborate on each of these identified categories.

2.1. Descriptive prognosis

Briefly stated, the main purpose of descriptive prognostics is to summarize data towards unveiling unknown patterns beneath them. In the context of intelligent monitoring of complex industrial processes and/or assets, descriptive models also imply statistically inferring insights from data. As a result, this information gained from data can help to detect events of interest or to estimate the health status of the industrial asset, product or process under study. Indeed, one of the big challenges facing Industry 4.0 revolves around how to optimally and automatically infer patterns of interest and characterize knowledge and critical events from the monitored data [110]. The aim is to use such patterns to establish the health status of the assets in an online fashion for fault detection and diagnosis [111,112]. Current techniques and procedures still hinge on manual inspections and basic control systems, neither fully exploiting the available data nor considering the advantages of data analytics and processing capabilities [113].

In this regard, selected contributions for descriptive prognosis used in Industry 4.0 setups are presented in Fig. 4 and Table 1, which are discussed in more detail in the following subsections.

2.1.1. Pattern recognition and classification

Machine learning algorithms and data fusion strategies are usually employed to find patterns in data and use this knowledge in industrial scenarios [114]. The most common approach deals with modeling behaviors of interest from operational data [115,116].

Data-driven behavior characterization consists of grouping similar data into sets that physically represent the same operational condition. Within the formed groups, some data points lie far from the identified pattern, which corresponds to a distinctive property of the group (e.g. its mean point or its distribution). Such patterns could be very significant to identify behaviors linked to data themselves, or to detect or infer possible faults or anomalous operational conditions. Large groups, or groups that are close together, usually imply normal behavioral patterns, whereas small groups or events that are far from the pattern (of the same group or a big group) imply anomalies or outliers (e.g. noise and transient data). Unfortunately, the industrial scenarios for data-driven prognosis do not always provide a proper tracking of past abnormal behaviors or maintenance operations performed to prevent or correct a faulty condition, thus the learner is only given unlabeled examples. Therefore, the characterization problem must be addressed from an unsupervised learning perspective, by which the dataset does not contain a priori a target to be predicted [117]. In these circumstances we deal with a dataset $\mathbf{X}$ composed of $N$ data samples $[\mathbf{x}_n]_{n=1}^{N}$ such that $\mathbf{x}_n \doteq [x_{n1}, \ldots, x_{nM}]$, with each feature or predictor $x_m$ taking values from a discrete or continuous alphabet $\mathcal{X}_m$. In an unsupervised approach to industrial prognosis, clustering is the most typically used technique, whereby instances are grouped by their similarity to each other given a metric of similarity $\mathrm{SIM}(\mathbf{x}_n, \mathbf{x}_{n'})$. Such a metric can be particularized to meet the specificities of the prognosis problem at hand.
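
The unsupervised setting described in Section 2.1.1 can be sketched as follows, assuming a hypothetical similarity metric (the inverse of one plus the Euclidean distance) and a simple leader-style grouping in which a sample joins the first group whose leader is similar enough. The similarity function, the threshold, the minimum group size and the toy operational data are illustrative assumptions; they do not reproduce the method of any surveyed work.

```python
import math


def sim(x_a, x_b):
    """Similarity in (0, 1]: 1 for identical samples, decreasing with distance."""
    return 1.0 / (1.0 + math.dist(x_a, x_b))


def leader_clustering(samples, threshold):
    """Group samples: each one joins the first group whose leader is similar enough."""
    leaders, members = [], []
    for index, x in enumerate(samples):
        for k, leader in enumerate(leaders):
            if sim(x, leader) >= threshold:
                members[k].append(index)
                break
        else:
            # No sufficiently similar leader was found: start a new group.
            leaders.append(x)
            members.append([index])
    return members


def flag_anomalies(members, min_size):
    """Samples that fall in groups smaller than min_size are candidate anomalies."""
    return sorted(i for group in members if len(group) < min_size for i in group)


# Hypothetical unlabeled operational data: [load, temperature].
X = [
    [0.50, 60.0], [0.52, 60.4], [0.49, 59.8], [0.51, 60.1],  # operating regime A
    [0.90, 80.0], [0.91, 80.3], [0.89, 79.7],                # operating regime B
    [0.50, 95.0],                                            # isolated sample
]

groups = leader_clustering(X, threshold=0.4)
anomalies = flag_anomalies(groups, min_size=2)
print(groups)
print(anomalies)
```

Here the two large groups play the role of normal behavioral patterns, while the single-sample group is flagged as a candidate anomaly. Features are not scaled, so temperature dominates the distance in this toy example.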


Fig. 4. Solutions and industrial sectors addressed for descriptive prognosis.

In [18], for instance, the above schema is adopted by applying decision trees and fuzzy modeling to vibration signals in rotary equipment. A data simulator is used to generate faults under certain controlled conditions, which also corresponds to the strategy followed by the authors in [16]. Different features and patterns are then extracted from condition monitoring data and process variables, fused and finally used to train the models. In [19] a density-based approach (by which instances are grouped based on a similarity threshold imposed on the value of $\mathrm{SIM}(\mathbf{x}_n, \mathbf{x}_{n'})$) is applied to learn patterns that correspond to similar groups of data in a reduced feature space; then the distances between the patterns and new process signals are studied. In a similar framework [23], a multi-sensor fusion is performed at the feature level for cutting parameters and vibration signals; the aim is to characterize and recognize different machining conditions in a milling machine. A Support Vector Machine (SVM) approach is adopted due to its regularization and generalization properties, as well as its good accuracy and flexibility when modeling behaviors of interest.

The pattern classification approach is often applied to characterize training data. In this case failure mode analysis aims at identifying the most probable causes of confirmed abnormal behaviors and failure episodes so as to predict and avoid them in the future [118]. When labels are available, supervised learning and reinforcement learning can be applied to train a classifier and act on the monitored asset depending on the predicted outcome of the model. This learning setup corresponds to a supervised framework, in which the dataset comprises not only the aforementioned set of examples $\mathbf{X}$, but also values for the target feature associated to each of such examples, namely, $\mathbf{Y} = [y_1, \ldots, y_N]$, with $\mathcal{Y}$ denoting its alphabet [119]. The target feature can be continuous or discrete, usually representing a characteristic related to the diagnosis. A label can also be estimated by tracking events of interest or based on maintenance operations carried out in the past. Data from such kinds of events are assumed to correspond to normality. Similarly, when a corrective action is performed, the preceding data can potentially represent abnormal behaviors. Under this premise, fault diagnosis is usually applied in a straightforward fashion, once features have been inferred, selected and combined. In some contributions environmental conditions are used as context to better describe and understand the faulty condition to be classified [17]. In [24], quantile regression forests are utilized to optimize production in broiler farming on the basis of environmental indicators and production and welfare parameters. In [20,21], fault detection and classification on wind turbines is performed by combining acoustic and vibratory signals and texture features from time domain signals, respectively. Similarly, the work in [22] adopts a Deep Learning approach (specifically, RBM) to fuse fault evidence and reason vectors for fault diagnosis of high-speed train equipment.

All in all, finding patterns within monitored data usually requires a deep knowledge of the problem and the underlying physics of the process. This stringent requirement in descriptive prognosis paves the way towards one of the niches of opportunity in this field (specialized feature engineering), which is later discussed in the corresponding section.

2.1.2. Health management

Health management of industrial assets and production processes requires accurately estimating their health status. The survey by Schwabacher and Goebel [120] identifies fault detection, fault diagnostics and failure prognostics as key elements in Integrated Systems Health Management, further emphasizing the potential of Artificial Intelligence for their implementation. To this end, feature extractors, problem descriptors and Key Performance Indicators (KPIs) are developed to reduce the complexity of raw data, making patterns related to domain knowledge more discriminable for the learning algorithms to be subsequently applied. The resulting models are more meaningful and can accurately trace the health status of the assets over time. Nevertheless, this process is difficult and very time consuming to perform, because data preprocessing, feature extraction/engineering and results assessment usually call for the involvement of domain experts, as highlighted in [25] with relation to environmental assets.

Automatic selection of features is applied when the support of domain experts cannot be put into practice. A ranking of the most relevant features can be obtained given their importance when solving a problem, e.g. variance explanation or impurity decrease given a target feature and a set of input features. Some machine learning algorithms can deal with feature transformation and extraction in an automatic manner, as part of their learning framework. This is the case of Deep Learning models and Kernel methods, for instance. They perform data transformations and learn high-level features to operate in a high-dimensional, implicit feature space that, except for image data, cannot be straightforwardly interpreted [121]. This noted drawback stimulates a focus shift onto the recent advances made with regard to automated feature engineering and model construction [122], which are certainly among the most promising research paths to follow in the near future.

Notwithstanding the advent of new feature extraction techniques such as the one cited above, most contributions have addressed feature construction by applying elaborated signal processing and feature extrac-

96
A. Diez-Olivan et al. Information Fusion 50 (2019) 92–111

Table 1
Comparison of representative works on descriptive prognostics.

| Ref. | Method | Data used | Industrial sector | Assets involved | Main goal |
|---|---|---|---|---|---|
| [16] | LAD | Combination of controlled, manipulated and measured variables (i.e. valves’ positions, reactor agitator speed and process variables like pressures, temperatures, flowrates or concentrations) | Chemical processes | Process components (e.g. the two-phase reactor, a condenser and a recycle compressor) | Fault diagnosis |
| [17] | Outlier detection through EM | Failures, environmental measures (the context) and events described by type, timestamp, subsystem, disturbance, duration, severity and description | Railway | Train doors | Fault detection and explanation |
| [18] | Decision trees and fuzzy classifier | Feature extraction and combination of data from condition monitoring vibration signals | Manufacturing (generic) | Spur gears in rotary machines | Pattern recognition and fault diagnostics |
| [19] | KDE and distance-based classification | Combining fastener features (i.e. diameter and height of the formed heads) with torque-rotation signals | Manufacturing (aerospace) | Blind fasteners | Pattern classification |
| [20] | Deep Random Forest | Fusion of acoustic emissions and vibratory signals | Wind energy | Wind turbine gearbox | Fault diagnosis |
| [21] | Bag Tree | Statistical, wavelet, granulometric and Gabor features from time domain signals acquired from the operating wind turbine (e.g. acceleration at tower bottom) | Wind energy | Wind turbines | Fault detection and classification |
| [22] | DBN based on RBM | Fusion of fault evidence and reason vectors | Railway | Vehicle on-board equipments (VOBEs) for high speed trains | Fault diagnosis |
| [23] | DWT and SVMs | Multi-sensor fusion at feature level (i.e. cutting force and vibration signals and cutting parameters) | Manufacturing (generic) | Cutting tool and milling machine | Machining condition recognition |
| [24] | Quantile regression forests | Environmental indicators and production and welfare parameters (i.e. weights and welfare data) | Agriculture | Farms | Production optimization and animal welfare |
| [26] | Framework based on hybrid adaptive resonant theory ANN and adaptive fuzzy inference | GPS and wind velocity sensors output | Construction (civil structures) | Bridges | Structural status estimation and risk analysis |
| [27] | Bayesian Inference | Multistage data fusion at component and global levels (e.g. acceleration, current, voltage and temperature) | Manufacturing (generic) | Electric motor, two gearboxes and a load | Health status assessment |
| [28] | FFT, kNN and K-means | Fusion of vibration-based features from accelerometer data and locations of the monitored areas | Construction (civil structures) | Bridges | Damage detection |
| [29] | GRBMs | Deep statistical feature learning from vibration measurements | Manufacturing (generic) | Rotary machinery | Fault diagnosis and health status estimation |
| [30] | KDE and SVMs | Operational data (e.g. speed, temperatures, load, humidity, pressure, voltage or intensity) and environmental conditions | Maritime | Diesel engine subsystems | Health status estimation |
| [31] | GMM, SOM-MQE and PCA | CMS data and SCADA variables (i.e. wind speed and direction, output power, pitch angle and vibration signals) | Wind energy | Wind turbine components (e.g. rotor, gearbox and generator) | Health assessment |
| [32] | Model-based reasoning | Sensor fusion of real-time monitoring of system status (e.g. control parameters, mission status and aircraft structure) | Aerospace | Civil aircrafts | Health assessment and fault diagnosis |

tion techniques to extract sensitive features symptomatically detecting changes in the health condition of the asset, as proposed in [32] in regards to the physical subsystems of an aircraft. In [26], for instance, a composite structure health index for risk analysis is computed using a hybrid adaptive resonant theory of neural networks and adaptive fuzzy inference, as well as a data fusion framework. Li et al. [29] propose a deep statistical learning of features from vibration measurements combined to establish the health status of rotary machinery by means of a GRBM. Alternatively, the statistical analysis of vibration data from the industrial asset in time and/or frequency domain has been widely explored [123], combining Condition Monitoring System (CMS) data and Supervisory Control And Data Acquisition (SCADA) variables [31]. Time series analysis is commonly applied to extract damage and fault-sensitive features from data. Time series models are used to fit the vibration data; damage indicators are then obtained by comparing new data to the learned models. The authors in [27] propose a multistage combination of time-domain and frequency-domain features to assess the health status of critical equipment holistically. Similarly, in [28], vibration signals are combined with the locations of the monitored areas, making the damage identification more accurate than the same model without location information.

In several industrial scenarios, available labels are related to critical events that occurred in the past. As mentioned, the strategy most commonly used when dealing with time series data consists of selecting a set of data before and after the registered label (e.g. in a time window of one or several months). Then, each set of data can be contextualized on the basis of the corresponding event. For instance, if a maintenance operation or an overhaul occurred at a given time instant, the set of data selected after that event in time can be considered to categorize normality, whereas the set of data selected before it must be further analyzed in order to infer and model the trend leading to the event of interest. This allows combining labeled and unlabeled instances in training data under a semi-supervised learning framework, also referred to as weakly supervised learning [124]. In general the instance-label relationship determines the problem to be addressed having, for instance, multi-label frameworks in which instances are associated to one or more labels at the same time, i.e. several symptoms arising at the same time instant, or instances known to belong to different categories, i.e. several unknown symptoms that occur at different time instants.
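The event-based windowing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any cited work: the function name `label_windows`, the 30-day window and the synthetic daily readings are all hypothetical, and samples falling outside both windows are simply left unlabeled.

```python
from datetime import datetime, timedelta


def label_windows(samples, event_time, window):
    """Split timestamped samples around one registered event.

    samples: list of (timestamp, value) tuples.
    Returns (timestamp, value, label) tuples, where label is
    'pre_event' (data before the event, whose trend must be analyzed),
    'normal' (data after the event, assumed to represent normality),
    or None (outside both windows, i.e. unlabeled).
    """
    labeled = []
    for ts, value in samples:
        if event_time - window <= ts < event_time:
            label = "pre_event"
        elif event_time <= ts <= event_time + window:
            label = "normal"
        else:
            label = None
        labeled.append((ts, value, label))
    return labeled


# Hypothetical data: one reading per day for 120 days, overhaul on day 60.
start = datetime(2018, 1, 1)
samples = [(start + timedelta(days=d), 0.1 * d) for d in range(120)]
overhaul = start + timedelta(days=60)

labeled = label_windows(samples, overhaul, timedelta(days=30))

# Count how many samples received each label.
counts = {}
for _, _, label in labeled:
    counts[label] = counts.get(label, 0) + 1
```

Samples tagged `pre_event` would then feed the trend analysis, those tagged `normal` a normality model, and the unlabeled remainder is what makes the setting semi-supervised.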


Spurious and transient data can be related to unstable asset conditions. These should be filtered out to train reliable normality models. Under such circumstances, one of the most challenging issues when learning diagnosis and prognosis models from data is modeling the normal behavior of the assets. The problem can be seen as a positive-unlabeled framework in which positive samples must be automatically selected to learn the model. Learning a model from monitoring sensor data that characterize normality implies the absence of outliers and operational faults. Outliers can be defined as patterns in data that do not conform to a previously well-established notion of normal, or frequent, behavior [125,126]. Assuming a small percentage of outliers are present in data, their frequency normally ranges from 5% to less than 0.01% depending on the application. An automatic outlier detection process can be done on the basis of density estimation [127] and on deviations from expected normal, common behavior [128]. Once the outliers have been isolated, a model that fits the resulting normal data set can be learned. Therefore, more accurate normality models are obtained and the number of false negatives when performing fault detection tasks is reduced. Such models can be applied in an online fashion to check real-time data for health status assessment and fault prediction. One-class SVM, or 𝜈-SVM, for example, allows controlling the false positive rate given by 𝜈 and can therefore be used to model normality on the basis of a small percentage of anomalies assumed to be present in data, as proposed in [30].

Nevertheless, in most cases, success is restricted to simulations, laboratory studies and well-controlled experiments. Thus, there is limited evidence of real structural faults; therefore, the effectiveness of the approaches remains to be validated for operational data.

2.2. Predictive prognosis

Predictive analytics is the next step up in the data processing schema. It utilizes a variety of data fusion, statistical, modeling, and machine learning techniques to study recent and historical data, to learn prognostic models, which make accurate predictions about the future status of the monitored asset.

The intelligent maintenance of industrial assets and production lines is one of the most critical parts of the Industry 4.0 paradigm. A traditional preventive strategy may obtain high reliability levels if it is well designed [129]. However, this sometimes implies over-maintaining the assets and production lines. Equipment manufacturers are always conservative in their maintenance policies so that reliability is achieved, but they assume high maintenance costs. It is well known that the failure probability of many components is high at the beginning and end of their operational life, following the bathtub failure pattern [130]. Therefore, unnecessary maintenance tasks increase failure rate when a defective item is installed or when a human mistake occurs. Moreover, preventive strategies do not take into account operational contexts, such as load profiles, number of starts or environmental parameters, and these strongly affect components’ lifetime. Finally, preventive maintenance is erroneously based on the idea that the probability of occurrence of operational faults increases exponentially at a certain time. In preventive strategies, components are replaced or repaired before that moment occurs. This assumption is not true in many cases; there are several failure patterns in which failure probability never increases [131]. In such cases, failure probability is constant in time. Thus, a component could fail at any time. Especially relevant examples of this phenomenon are failure patterns of electrical and electronic components, where repair and substitution tasks at planned periods of time do not imply an improvement in reliability. For all these reasons, reliability increase and cost reduction margins remain important.

Several predictive prognostics-based research projects have recently been proposed to address complex Industry 4.0 related problems in various industrial sectors. The most relevant ones are presented in Table 2 and Fig. 5, and described in more detail in the next subsections.

2.2.1. Condition-based maintenance

Condition-based maintenance (CBM) aims to anticipate a maintenance operation based on evidence of degradation and deviations from normal asset behavior [132]. When the condition of a particular system is being observed, a set of monitoring devices and sensors must be considered. Intelligent monitoring of equipment by using sensors is essential to acquire relevant data containing the characterization of operational faults in physical signals; acoustic and ultrasonic sensors, accelerometers, current measurements or thermocouples are usually employed for this purpose [133,134]. In addition to these data, environmental conditions and contextual information, such as temperature, pressure or humidity, provide very useful information to enrich the modeling process [135]. From such information, specific KPIs are calculated and analysed to discover trends that can lead to a potential critical fault.

When the aim is to achieve maximum reliability, an appropriate CBM system with monitoring capabilities must be adopted, gathering and combining all kinds of useful sources of information simultaneously and providing the prognostics needed to assure the correct operation of the assets. In the work proposed by Kadri et al. [35] early alerts in the event of abnormal situations in complex production systems (e.g. the critical components of a paediatric emergency department) are provided by a seasonal autoregressive moving average (SARMA)-based exponentially weighted moving average (EWMA) anomaly detection schema. Likewise, the prognostic system implemented in [36] predicts the wear of railway braking systems from condition monitoring data; an online prognostics control for maintenance optimization is also provided. The resulting CBM system must include data acquisition and processing, diagnostics and prognostics and decision-making functionalities [136]. Generated data-driven models for diagnostics and prognostics must be deployed in a monitoring platform with online data acquisition and inspection capabilities [137]. Several commercial CBM systems and eMaintenance frameworks are already available, most of them separately using a wide variety of potential failure indicators and data fusion strategies that are able to integrate vibration data and operating parameters to finally provide effective condition monitoring of the assets [38,39].

Data-driven prognostic models are the core of the whole process since they apply the behavioral and statistical methods for fault prediction and classification [138]. ANNs [139] and SVMs [140,141] are usually applied to analyze data and infer such models, not only for maintenance purposes but also to optimize asset operation and reduce emissions, as proposed in [33]. The use of projection methods (e.g. linear, nonlinear and orthogonal projections to latent structures, kernel methods, or PCA) for dimensionality reduction and regression can highly support the feature extraction process [41] and make the prediction more precise and accurate [142–144]. Nevertheless, depending on the application and whenever it is possible, it may be beneficial to incorporate specific knowledge directly into whichever algorithm is applied.

One of the most challenging objectives is to explicitly and automatically represent and model expert knowledge [145], characterizing different behaviors of interest and linking them to critical faults in assets and production processes. In [34], for instance, a constrained K-means clustering based on the engine load is first applied to establish normality based on load ranges. In [37] the maintainer experience is integrated into the proposed intelligent maintenance system. Both approaches eventually provide comprehensive behavior modeling using fuzzy logic. These types of methodologies are not yet commonly integrated in many industrial sectors, either because companies doubt their benefits or there are integration drawbacks in terms of both time and cost [146]. However, as new research appears and commercial systems are developed, demonstrating important improvements in reliability over traditional strategies and showing attractive Return on Investment (RoI) levels, companies and maintenance suppliers will show an increasing acceptance of these novel technologies.

However, in many industrial scenarios, the limited information on real faults makes it challenging to obtain accurate fault prediction models [147] and simulated data are usually employed [40]. Therefore,


Table 2
Comparison of representative works on predictive prognostics.

Ref. | Method | Data used | Industrial sector | Assets involved | Main goal
[33] | ANNs | Operational parameters, environmental conditions and the electric energy consumption of the alternator | Maritime | Diesel engines | Maintenance
[34] | Constrained K-means, fuzzy logic and LOF | Operational parameters and environmental conditions | Maritime | Diesel engines | Fault prediction
[36] | SBM | Condition monitoring data (i.e. acceleration, current and force) and hybrid optimization from online prognostic control | Railway | Braking systems | PHM, wear prediction and maintenance optimization
[37] | Rule-based fuzzy semantic inference | Vibration and current signals | Railway | Electric multiple units (EMU) trains | Intelligent maintenance
[38] | GRNN, BPNN and ANFIS | Operating parameters (static data) fused with asset (i.e. current, load and temperature) and fault conditions | Electrical power | Gearboxes in electric motors | Condition monitoring
[40] | Interval-valued fuzzy reasoning | Combination of the system performance decay based on physical gas turbine measurements and fault types | Manufacturing (aerospace) | Gas turbines | Condition-based maintenance optimization
[41] | PCA and kNN | Multiple sensors fusion at accelerometer and load cell data, feature and decision level | Manufacturing (generic) | Rolling element bearings | Condition-based monitoring and diagnosis
[43] | PoF and expert judgment modeling | Integration of condition assessment, RUL estimation based on pressure, thickness and corrosion data and life extension decision making | Oil and gas (offshore) | A three-phase separation system on a platform | Risk based condition and RUL estimation
[45] | kNN and discrete Bayesian filter | Health indicators inferred from process variables and operational data | Aerospace | Battery and turbofan engine | RUL prediction
[46] | GMM and L2-regularized linear SVM | Low Power Refueling (LPR) states and their corresponding vibration profiles | Oil and gas | Gas circulators | LPR state and RUL estimation
[47] | LAD and Kaplan-Meier estimation | Condition monitoring data and indicators based on failure times and the corresponding covariates | Aerospace | Turbofan engine | RUL prediction
[48] | RNN-based health indicator | Related-similarity features combined with time-frequency features from vibration signals | Aerospace | Bearings | RUL prediction
[49] | Statistical approach and ANNs | Knowledge extraction from condition monitoring data at single product and fleet levels | Electrical power | Medium Voltage and High Voltage Circuit Breakers | RUL calculation
[50] | Adaptive functional (log)-location-scale regression modeling | Multi-sensor signal fusion (i.e. physical and performance signals) using Multivariate FPCA | Aerospace | Aircraft turbofan engine | RUL prediction
[52] | HMM-based log-likelihood regression | Observation sequences of data during the drilling process (i.e. thrust-force and torque signals) and health-states labels | Manufacturing (generic) | Cutting tools | Health state estimation and RUL prediction
[54] | Physics based models and data-driven analytics (Bayesian inference) | Sensor, physics, and data model fusion, including extracted features | Aerospace | Aircraft | Predictive maintenance, repair and overhaul

Fig. 5. Solutions and industrial sectors addressed for predictive prognostics.


maintenance strategies are focused on conservative, preventive operations [148]. The main goal is to avoid costly corrective interventions, since the consequences of an unexpected failure could be catastrophic [149]. In addition, incidental faults may imply an important impact in terms of risks, costs, resources and service loss that must be minimized.

2.2.2. Predictive maintenance

To improve reliability and reduce costs, an optimal maintenance strategy should provide a set of predictive, preventive and corrective procedures as a result of a technical and economic analysis of every failure mode, taking into consideration the related consequences [150]. Reliability Centered Maintenance (RCM) strategies include Failure Tree Analysis and Failure Mode, Effects and Criticality Analysis (FMECA). Once failure modes are identified and criticality classified, maintenance tasks are established to avoid faults [151]. When predictive maintenance is technically and economically possible, it is applied. Maximum reliability is obtained when a robust and trustworthy failure indicator parameter is monitored. When it is not technically possible or it is not affordable, preventive maintenance strategies are adopted. Corrective maintenance is only used if predictive and preventive maintenance strategies are not feasible. In those situations, using safety devices to apply appropriate troubleshooting tasks or redesigning the affected asset (e.g. installing a standby component) is required.

Prognostics and Health Management (PHM) has seen a resurgence, with new service offerings in industry for guaranteed uptime for cost-containing CBM implementations. A chief component of PHM is prognostics, but this is also its least mature element. Prognostics attempts to estimate remaining component life, given that an abnormal condition has been detected. The key to useful prognostics is not only an accurate estimate of remaining life but also an assessment of the estimate's confidence [152]; uncertainty poses challenges to the prediction, as it must account for differences in measurements, state estimation, model inaccuracies and future load uncertainty. To this end, fuzzy theory is usually employed to better represent uncertainties in prediction [42].

The Remaining Useful Life (RUL) prediction of assets is a key concept in reducing the maintenance and life-cycle management cost and increasing their availability. It can be accomplished by different strategies, e.g. by applying a multivariate pattern matching process from the data to the remaining life, or by first estimating damage and then extrapolating its progression over time until it intersects the failure criterion. The future degradation state is predicted based on the model and the identified features [44], including the uncertainties inherent to predictions made in future monitoring system states. As we get further in time from the current state, the uncertainty increases and, consequently, the prediction accuracy decreases. In many cases, the RUL prediction is achieved by health indicators that best represent the health condition of the asset by mainly analyzing process and operational parameters, as in [45,46,52]. However, some other works are focused on the use of registered failure times [47]. In [51] a novel strategy for predicting the RUL on a real-time basis is proposed by simultaneously considering economic and stochastic dependences and a dynamic condition monitoring strategy for multi-component systems.

In predictive maintenance, time series analysis using condition monitoring data is crucial to anticipate anomalies and malfunctions in industrial assets and processes. Temporal anomaly prediction approaches usually learn models that best fit time series to compute errors when comparing new, incoming data to predicted values. Traditional strategies use statistical measures, such as moving average over a time window, ARIMA, Kalman filter and cumulative sum [153]. Regression models fitted to non-stationary data can better represent more complex, nonlinear dependencies with other related features. Gaussian process regression [154] and Multilayer Perceptron (MLP) networks for regression [155] are two very popular examples of prognostic models. In [49], a neural network-based approach is proposed to efficiently predict the RUL by taking as inputs the last observations coming from condition monitoring data, e.g. the time instants and their related health condition. Zhou et al. [53] present a time window based preventive maintenance model for multi-component systems with stochastic failures and the disassembly sequence included.

Much like MLP networks, deep learning methods can be seen as a cascade of many layers of processing units that combine the predictor features to approximate the target feature [121]. In [48], the authors use health indicators fusing statistical features. An RNN-based health indicator is proposed to overcome drawbacks related to the computation of bearing health indicators, e.g. considering different ranges and failure thresholds of the statistical indicators used and failure times, respectively. RNNs present some interesting properties for time series forecasting; their loops allow information to persist [156]. They are powerful and increasingly popular models for learning from varying-length sequence data, particularly those using LSTM hidden units [157]. LSTM networks for anomaly/fault detection in time series have demonstrated very good accuracy [48,158]. Some recent research has dealt with LSTM networks for anomaly detection in time series [159,160]. However, this has not yet been combined with an understandable physical modeling of condition monitoring data for prognostics to anticipate anomalous data sequences over time.

Data fusion techniques range from multi-sensor signal combinations to a more complex integration of condition assessment, RUL estimation and decision making, as proposed in [50] by using a multivariate FPCA and in [43] by integrating expert knowledge. The novel Digital Twin concept is also attracting great attention in the scientific community, fusing information from sensors, physics and data-driven models [54]. Unfortunately, there is often no record or clear evidence of maintenance operations, faults or malfunctions which can be used to select subsets of labeled data [161].

2.3. Prescriptive prognosis

The emerging technology of prescriptive analytics goes beyond descriptive and predictive models by recommending one or more courses of action and showing the likely outcome of each decision. Despite the undoubted relevance of prescribing actions to gain a competitive advantage or an increased business value from the captured industrial data, research on prescriptive prognosis has received less attention from the community working on Industry 4.0 than its predictive and descriptive counterparts. Understood as the recommendation of one or more courses of action based on the outcomes of models for descriptive and predictive prognosis, prescriptive prognostic methods have suffered the direct implications of the relative lack of maturity of digitalization in the industrial sector, reflected in an inherent difficulty to build practical prescriptive models [162]. Furthermore, most decision making processes are strongly linked to the particularities of the use case itself, making it very unlikely that the few prescriptive models reported so far can be replicated across different industrial scenarios. Research activity has thus been bounded to ad-hoc model developments for specific setups, awaiting the digital maturity of previous prognosis methods in the processing chain that stimulate new advances in regards to the models themselves [163] (Table 3).

Notwithstanding the persistent shortage of prescriptive prognosis models noted above, an analysis of recent research effort around general prescriptive analytics in industrial data must be made so as to evince, in hindsight, the importance of addressing this niche of research in the short term. Therefore, this section is devoted to this analysis, capitalizing on those examples where prognosis models and data fusion have been utilized jointly with optimization solvers, rule inference systems, fuzzy reasoning and other prescriptive algorithms alike. While this hybridization of models could be a priori straightforward to implement in theory, in practice the conversion from predictive/descriptive outcomes to actionable information goes beyond the mere connection of such outcomes in the criteria driving the optimality of the evaluated recommendations and actions. Many practical considerations often jeopardize the adoption of naive schemes to blend together models for descriptive, pre-

Table 3
Comparison of representative works on prescriptive prognostics.

Ref. | Method | Data used | Industrial sector | Assets involved | Main goal
[58] | CPLEX (no further details given) | Predicted aircraft health metrics, amount of maintenance workload and cost, prediction uncertainties | Aerospace | Aircraft, yearly maintenance workforce | Optimally schedule maintenance operations of an aircraft fleet given flying plan and aircraft demands
[60] | Evolutionary optimization | Maintenance schedule, energy consumption of production machinery, production commit | Semiconductor manufacturing | Machinery on-off duty cycle, production scheduling | Minimization of the energy consumption given a production commit and maintenance schedule
[67] | Ant Colony Optimization | Makespan, processing times of manufacturing and maintenance operations, availability period of machines | Manufacturing (generic) | Production and maintenance scheduling (assign each operation to an appropriate machine, and sequence operations over time) | Flexible job shop scheduling with machine unavailability constraints due to preventive maintenance
[68] | Bi-objective Ant Colony Optimization | Job processing times, failure and repair rate of machines | Manufacturing (generic) | Schedule of multiple maintenance services, assignment of jobs to machines | Minimization of the system unavailability and minimization of the production makespan
[71] | Multi-objective evolutionary algorithms | Production preschedule, processing time of manufacturing operations, MTTR and MTBF of the machinery | Manufacturing (generic) | Production schedule (job to machine assignment, time sequencing) | Minimization of the production makespan versus maximization of stability under random machine breakdowns
[72] | Multi-objective evolutionary algorithm | Flexibility of maintenance operations, health index of machinery produced by prognosis modeling, job processing times, machine deterioration model | Manufacturing (generic) | Maintenance plan, production schedule (job to machine assignment, time sequencing) | Minimization of maximum completion time versus minimization of maintenance costs
[73] | Multi-objective evolutionary algorithm | MTTR and MTBF of machinery, production commit, job processing times | Manufacturing (generic) | Maintenance schedule (time and system on which maintenance is made), production schedule (job to machine assignment, time sequencing) | Minimization of completion time versus minimization of average machinery unavailability versus minimization of average mould unavailability
[75] | Weighted-sum single-objective Genetic Algorithm | Groups of production lots, machine degradation model, cost model, job processing times | Manufacturing (generic) | Production scheduling and maintenance planning | Minimization of total completion time and maintenance costs given delivery date and cost constraints
[76] | Multi-objective evolutionary algorithm | Reliability model calibrated by neural networks | Manufacturing (generic) | Maintenance schedule, job sequence | Makespan minimization versus system unavailability minimization
[77] | Multi-objective evolutionary algorithm | Marginal profit of every product lot, robustness model, maximum demand per product, production capacity of every line | Manufacturing of cleaning products | Production plan | Profitability versus robustness of the plan against failures over the production lines
[79] | Ant Colony Optimization, Genetic Algorithm, Tabu Search, hybrid methods | Job arrival, due and processing times, minimum and maximum gaps between maintenance operations, processing time of a maintenance operation | Manufacturing (generic) | Joint executing sequence of production and maintenance tasks on machines | Minimization of manufacturing total time (makespan) and its robustness with respect to ideal maintenance periods
[82] | Fuzzy logic | Output of vibration and temperature monitoring systems deployed on the machinery | Textile manufacturing | Predictive maintenance schedule of the textile machines | Perform maintenance activities prior to the machinery failure
[83] | Multi-criteria Fuzzy Decision Making | Past information about failure cases, portfolio of applicable maintenance policies | Rolling element bearings of paper mills | Best suited maintenance approach | Cost-effective maintenance given the current machine condition, product quality and other factors
[85] | GAMS/CPLEX (no further details given) | Maximum RUL, batch capacity, operation modes of the processing unit, cost models, production targets per material | Chemical process in steel making industry | Multiscale production and maintenance scheduling plans | Cost-effective joint production and maintenance scheduling considering residual useful life and operation modes of the plant
[86] | Monte Carlo sampling with surrogate measures | Job processing times, machinery failure model (time between failures) | Manufacturing (generic) | Joint production and schedule plans | Minimization of expected production makespan versus minimization of deviation between actual and initially planned schedules
[97] | Fuzzy Petri Nets and multi-objective Particle Swarm Optimization | Design factors and cost models for every subsystem of the product to be manufactured | CNC machinery | Relative importances given to reliability and cost for the design of every subsystem | Cost-effective, max-reliability design of produced assets

dictive and prescriptive analysis, ranging from ethical implications of they can also become part of the optimization goal, prescribing which
the produced rules and actions to the economic feasibility of the pre- personnel should be allocated to which maintenance tasks to reduce the
scribed rules [164]. Furthermore, interplays also appear back and forth impact of predicted failures on the production of the industrial plant,
between such models: as to mention, certain actions dictated by a pre- mostly in terms of the time for which it is interrupted.
scriptive model can change the working regime of a given machinery, From the algorithmic perspective, most contributions so far have re-
thus impacting on the stationarity of the data captured from the equip- volved on the use of different heuristic solvers to optimally allocate pro-
ment and eventually demanding a retraining phase of the predictive duction resources. Among them, bio-inspired optimization techniques
prognosis models. Therefore, models must be elaborated much further, have grasped a good deal of attention, either those inspired by evolu-
carefully inspecting how the prescribed decisions propagate along the tionary concepts [61–63] or emulating other behavioral patterns and
data processing chain. phenomena observed in Nature [64–66]. Elements from Swarm Intelli-
The overview provided in what follows discriminates recent contri- gence, encompassing multi-agent schemes, have also emerged as compu-
butions in the broad topic of prescriptive methods for industry by the ap- tationally efficient means to address complex task scheduling problems
plication for which the model was developed. A focus will be also placed fed with predictive maintenance needs [67–70]. Another research trend
on the techniques and data utilized, highlighting those cases where de- is noted around approaching the production scheduling problem as a
scriptive and predictive models were employed. In those cases where problem comprising multiple conflicting optimization criteria, which
such a hybridization is missing, possibilities in the form of research calls for the adoption of solvers capable of inferring a set of feasible
hypothesis will be postulated briefly in order to encourage future re- Pareto-optimal task schedules [71–76]. The ever-growing dynamism of
search efforts. the industrial production environment has also spurred the exploration
of flexible on-line optimization approaches capable of reworking sched-
2.3.1. Production scheduling ules incrementally (also known as rescheduling) to accommodate the
To begin with, production scheduling emerges as the most intuitive contextual variability of the plant in different aspects, including power
actionable axis on which prescriptive models have been used in the In- outages and unpredictable failures [77,78]. Uncertainty in the predic-
dustry 4.0 [165,166]. In this context scheduling refers to all decision tive estimation of the maintenance needs of physical assets and the time
support system aimed at efficiently managing the assets, tools and re- required for repair has been also managed at the algorithmic level using
sources needed in production to increase its optimality, gauged in terms robust optimization algorithms [79,80]. In those works where heteroge-
of one or many criteria. This wide optimization can be pursued at very neous information is collected from the plant and exploited for produc-
different levels of the production chain, either inside the manufactur- tion scheduling and maintenance planning, data fusion has been kept
ing plant (stock, raw material, human resources, machinery, produc- relative apart from the optimization process, consolidating all opera-
tion line, intermediate buffers, in-plant logistics, intra-departmental ex- tional parameters and signals of relevance for the scheduling problem
change) or between different factories performing distinct albeit related at hand in the formulation of the fitness functions themselves, or dele-
production phases [167]. Not only resources and assets may vary among gating this fusion to models deployed in previous processing stages (i.e.
scheduling problems, but also the measure of fitness under which pre- inside the middleware/Big Data platform, or embedded in models for
scriptions are evaluated: to mention a few, maximum productivity, min- predictive prognosis).
imum make span, maximum energy efficiency and minimum scrap rate Besides the heuristic approach to maintenance and production
appear as the most frequently considered criteria in recent works grav- scheduling, concepts related to fuzzy logic have taken also a paramount
itating on this topic. The complexity of scheduling problems is usually role in production scheduling with predictive maintenance and other
exacerbated by the establishment of operational constraints to reflect sources of uncertainty [81–84]. By mining the expert knowledge of plant
practical limitations that can be anticipated beforehand, often by virtue operators using fuzzy logic, it is possible to plan predictive maintenance
of non-obvious expert knowledge not necessarily reflected nor inferable activities over industrial machinery by optimally converting the mon-
from the retrieved data. itored parameters to a fuzzy domain where expert rules are defined.
Within the huge research activity on production scheduling observed Other contributions in maintenance-aware scheduling include models
in the last years, a fraction of contributions have considered criteria re- to emulate the response of the plant operation under such as State Task
lated to maintenance planning for faulty systems and processes, which Networks [85] or surrogate models [86] to avoid time-consuming per-
can be conceived as a variant of production scheduling with particular formance estimations of the production chain under different schedules.
optimization metrics and constraints [55]. In this context, the outcomes
of previous predictive models can be fed to the prescription of schedul- 2.3.2. Life cycle optimization
ing actions in manifold ways. When energy efficiency comes into play, Another research branch in which prognostic information has been
the use of regression models is mandatory in order to generalize and esti- exploited is the optimization of the life cycle of industrial products and
mate the power consumption of the monitored machine under different assets. This area spans beyond the inherent use of predictive mainte-
working regimes, whose control parameters are optimized to yield the nance estimations for extending the productivity of industrial machin-
lowest consumption level at a given production rate [56,57]. In this ex- ery, which can be certainly thought of as the extension of its life cycle.
emplifying case prognosis does not take an explicit role in the prescrip- When inferred from other tools and the final products themselves, pre-
tion of actions, but benefits indirectly from a lower power consumption dictive information on the performance, quality and in-service operation
of the machinery, which potentially reduces its chances to enter an oper- can help managing optimally the life cycle of products and industrial
ational failure. Fortunately, more explicit examples of how prognosis be- tools, to the extent of prescribing and imprinting core changes from
come an inner part of production scheduling problems abound in the lit- the conception and design of the product to its delivery, service, dis-
erature: many contributions dealing with job shop scheduling problems posal/disassembling and recycling. This holistic management of the life
include in their problem statement the possibility for a machine to be cycle must several criteria must be assessed jointly with those related
broken down or in maintenance state, which can be indeed anticipated to productivity and operation estimated by prognostic models: quality,
by predictive prognostic models [58–60]. This predicted unavailability economics, flexibility and sustainability, among others [168].
of machines within a plant can be exploited in the optimization process In all cases, simulation tools and prediction models play a crucial role
to reduce, for instance, the transfer time of production tasks between in the determination of how the prescription of different actions affect
machines. Preprogrammed maintenance and/or engineering tasks are the life cycle of products and processes. The sharply rising momentum
also included within the pool of production tasks and tackled in the of these technologies all over the product cycle has given rise to popular
overall scheduling optimization, often resorting to heuristics of very di- terms handled nowadays in the Industry 4.0 realm, such as Digital Twins
verse nature. When maintenance resources are below the required level, [169], Virtual Factories [170] or Soft Sensing [171]. Notably, the char-

acterization of processes and systems of the industrial plant based on the captured data is at the core of the technological portfolio underneath all such paradigms. Specific examples of the application of prediction models to life cycle assessment include, to mention a few, the use of ANNs and linear regression models for the estimation of the RUL of renewed products incorporating used parts [87] and the environmental impact of different product compositions and designs [88], or the utilization of machine learning models to tailor the composition of raw materials in manufacturing environments to match a set of desired properties (e.g. hardness, fatigue, deflection, etc.) or a target measure of performance [89–91]. A fresh view on inclusive life cycle assessment is provided in [92], where production and customer service data are analyzed using Random Forest models and ANNs to discover and understand causal relationships of failure patterns in industrial products. The unprecedented scales at which products undergo life cycle assessment and optimization have lately steered the attention of the community towards the adoption of Big Data frameworks. This rapprochement not only serves to ingest, manage and store multi-sensory, heterogeneous data sources related to the entire manufacturing life cycle and the status of the production assets, but also enables more enriched, informed prescriptive decisions with regard to energy efficiency, productivity, marketing and fault diagnosis [93]. As a result of the provision of more computing capabilities, the literature is lately witnessing the timid incursion of deep (reinforcement) learning models for life cycle assessment in industrial systems, particularly for process control [94] and RUL estimation [95].

At this point it is worth highlighting the concept of prescriptive information fusion recently coined in [96]. This contribution postulates and proves that prescriptive analytics in manufacturing should benefit from the surplus of data provided by simulation models. Recommendations and strategies should be driven by an augmented dataset, fusing together real-world industrial data and the outcomes produced by simulation models within a closed-loop iterative framework. As a result the decision making process can explore regions of a design space that are not necessarily covered by the captured real-world data or cannot be reliably generalized by predictive models. Not in vain, the materials-aware design of product manufacturing processes (in essence, an instance of life cycle optimization) is presented as one of the pieces of empirical evidence that shed light on the benefits and generality of the proposed information fusion methodology.

The prescribed rules and actions for improving the product life cycle can operate on variables and parameters of very diverse nature. The early design of the product can be optimized to minimize the criticality of eventual failures along the production chain, as done in [97] using a mixture of multi-criteria optimization heuristics and fuzzy Petri nets, and in [98,99] resorting to Bayesian decision networks for reasoning and mapping design decisions over products and processes subject to uncertainty. However, at the beginning of the life cycle, machine learning models can also be extensively utilized for other industrial purposes, from the determination of demanded product specifications based on information retrieved from the client portfolio or the marketing/sales warehouse, to the optimization of the product details and specifications balancing a trade-off between miscellaneous objectives (e.g. costs, performance, reliability, safety and environmental impact, among others) [100]. Tolerances, materials, assembling procedures, item dimensions and other factors alike are among those most frequently considered. Simulation tools are often hybridized with machine learning and optimization techniques to find optimal values for such factors. In the middle of the life cycle (production), predicted faults in the machinery can be analyzed and may serve as the managerial trigger to decide changes in the design of the product or in several stages of the production process. This can be regarded as an extension of the regular quality inspection of products while they are manufactured, wherein not the product itself is inspected, but rather the causal consequences of a bad design or an unexpected lack of compliance with the production process are inferred by virtue of predictive prognosis. When detected, the reason for failure triggers new design constraints that are fed back for subsequent product redesign. Finally, prescriptive analytics have also been at the forefront of optimized disassembling strategies, waste management policies and product recycling procedures in the final stages of the life cycle [101,102]. However, findings in this latter stage have not considered prognostic information except for those related to the prediction of the remaining lifetime of reused parts for remanufacturing plants [103].

2.3.3. Supply chain management and logistics

The inherent benefits of fault diagnosis have also fed into other stages of the industrial cycle, related to, yet not necessarily embedded into, the production process [172]. For instance, the propagation of effects of a given machinery fault (e.g. an interruption between tasks or delays in the production chain) has not only been analyzed with regard to the net productivity of the plant, but also with respect to the supply, management and storage of raw materials [104–106]. The end-to-end profusion of data collected in manufacturing environments, the progressive adoption of data-based modeling as a core driver of industrial operations and their implementation in manifold sectors have given rise to the term Supply Chain Analytics (SCA), which has lately settled in the limelight to the detriment of traditional non-data-driven approaches [107–109]. In this matter the literature has been especially rich in the last couple of years, with contributions related to fault prognosis observed around several axes:

• The predictability of failures within the industrial machinery can prescribe decisions even in the initial stages of the supply chain. For instance, the selection of the supplier can be driven by an estimation of possible deviations of the delivered raw material from its expected performance, or by the effect of a change in the properties of the supplied goods on the quality of the product itself. It is interesting to highlight that a close connection between predictive and prescriptive prognosis could help manufacturers refine the criteria used to evaluate and select suppliers. The prescription of decisions at this early stage can be profound enough to imprint changes in the design of the supply chain network so as to accommodate demand uncertainties, unreliable supply schedules (due to e.g. the scarcity of raw material, difficulties in the transportation, supply volatility or economic circumstances) or disruptions in the production chain, such as those due to machinery faults.
• Inventory management can also be steered by information on the faults that occurred within the production process. This is particularly crucial in multi-echelon inventory optimization processes, where the variability of the demand and storage at a certain inventory level can propagate downstream over other related inbound/outbound stock buffers, desynchronizing decisions over the whole inventory hierarchy (from the emission of replenishment orders to the adjustment of intra-echelon lead times).
• Finally, logistics can also become affected by eventual production disruptions in the plant due to maintenance tasks. A lower productivity of the plant should be contemplated in the management and optimization of delivery logistics. Therefore, it is essential to connect the output of prognostic models for failure prediction and predictive maintenance to problems modeling logistics operations so as to optimize the management and planning of vehicles, crew and ancillary aspects related to the delivery and shipment of products to demanding users and companies. This broad family of optimization problems considers, among others, multiple vehicles with varying capacities, delivery and pickup time windows, the placement of intermediate depots or the cost margin of the transportation.

3. Implications on industrial hardware, software and communications

The spectrum of technologies underlying IoT and cyber-physical systems enables the digitalization of several key stakeholders in the Industry 4.0 ecosystem: the factory, the process, the asset, the product, the


operator and the optimal management and information valuation of the product. This digital transformation allows gathering unprecedented amounts of data over the plant, automating processes, connecting and implementing digital interfaces with customers, to the point of imprinting essential changes to traditional industrial business models.

For this transformation process to succeed it is of utmost importance that the data life cycle in its entirety (from sensing to data fusion and prognostic modeling) is supported efficiently by hardware and software platforms especially suited for this purpose. In this context, physical hardware systems must first ensure that data requirements imposed by the prognostic problem and subsequent modeling are fulfilled efficiently, wherein efficiency obeys different albeit related criteria (e.g. economic costs, energy efficiency, cross-compatibility, scalability). As such, approaches such as hardware virtualization, smart sensors and micro-electro-mechanical systems (MEMS) [173] have been proven to proficiently deal with connectivity and computational issues stemming from industrial scenarios so as to allow the entire IoT stack to be distributed differently over different parts of the communications architecture, giving rise to what is nowadays referred to as Cloud/Edge computing. Similarly, implemented data-based models must be deployed on software platforms capable of accommodating the volumes and speed at which industrial data are generated, with Big Data technologies at the forefront of the latest reported use cases of industrial prognosis.

For a proper selection of these supporting technologies several aspects must be taken into consideration, some of which are relevant and strongly connected to data fusion and modeling approaches. These specific aspects are discussed in what follows.

3.1. Edge Computing versus Cloud Computing

Edge Computing allows data produced by IoT devices to be processed locally, closer to where it is acquired instead of sending it to data centers or Clouds [174,175]. It helps filter and reduce (fuse) the amount of data that needs to be sent to the cloud, and takes advantage of multi-cloud and distributed computing strategies. Local processing is far more cost-effective, requiring less ongoing bandwidth and storage cost, and reduces latency and the amount of information traversing the network, since there is no need to instantaneously send the data streams to the cloud to be fused and processed by the machine learning models.

Beyond the convenience of Edge Computing in terms of communication resources, this computation paradigm has important implications not only on the selection of the hardware platform for the use case at hand, but also on the modeling stage. Data fusion and analysis are implemented locally, so radically new model design strategies must be devised and developed so as to distribute, share and incrementally learn from the prognostic knowledge gained from locally captured data. This need lies at the heart of several research trends in data science:

• Federated Learning, which addresses how to decentralize the learning process of a data-based model over a large number of client models, each observing a partial fraction of the data acquired locally from its local context [176]. By allowing such clients to communicate with each other, the knowledge learned locally by each client model can be shared with every other counterpart, and exploited therein so as to enrich its locally gained knowledge. This distributed computing paradigm unveils interesting aspects to consider in a practical industrial scenario, ranging from those related purely to communication aspects (latency, scheduling of knowledge exchanges, bandwidth requirements to encode and exchange the knowledge learned by a model, e.g. layer weights of a neural network) to the implications in terms of data modeling (incremental model update, training latency in real-time prognosis, knowledge representation for federating heterogeneous data-based models). All in all, advances in Federated Learning should be embraced by industrial scenarios opting for Edge Computing to alleviate the computational costs that a centralized prognosis modeling process would imply.
• Transfer Learning and Domain Adaptation, which aim at the derivation of new data-based methods for extrapolating knowledge gained when solving a (predictive) task to solve a different yet related task in another context [177], thereby enabling edge devices to learn from each other even if their targeted problem is not the same [178]. Although this paradigm was first addressed in isolation with respect to Federated Learning, it is clear that both trends are closely interconnected: Transfer Learning can be a technical path to follow when the learning algorithm behind the federated prognostic models is not unique and/or the local data from which they learn are not produced by the same industrial asset, nor do they correspond to the same monitored physical parameter. In this practical situation the transfer of predictive knowledge between different prognosis contexts can yield significant performance gains, particularly when the captured data suffer from class imbalance, weak supervision and other data-based shortfalls that jeopardize the learning process of the local model.
• Online Learning, which relates to the extraction of predictive knowledge from data streams characterized by stringent processing requirements, and possibly subject to exogenous non-stationary effects in the data patterns to be learned. Online Learning assumes that data instances are produced at fast rates, and fed in a sequential order and only once for incrementally updating the model. These assumptions clash with those of traditional batch learning, where the entire training dataset is available at once for the training algorithm [179]. In this regard, the choice and design of the hardware platform and the communication protocols play a crucial role in determining the precise timing requirements of an incremental update of the local models at the edge: parameters such as the data rate at which sensors operate, the scheduling period by which knowledge is exchanged within a federation of models or the latency incurred by the communication protocol, and the processing capability of the sensor itself can be decisive in the design of a practical incremental prognosis model over streaming data.

In light of the above it is straightforward to conclude that a proper selection of smart sensing IoT devices and communication protocols is imperative when dealing with an industrial prognosis problem, with design implications that span up to the data-based modeling stage. Interestingly, most recent smart IoT sensors are able to locally extract features from raw data and even to deploy simple models to produce estimations, providing data-level fusion, feature-level fusion and decision-level fusion capabilities [13]. This also opens interesting opportunities to study the trade-off between the precision of the model and the transmission bandwidth. Regarding the latter, the use of certain sensor materials (i.e. metal composites and nanocomposites or piezoelectric polymers and transducers), propagation means (namely, radio frequency or acoustic waves) and communication protocols and standards (correspondingly, Industrial Ethernet, HART, Fieldbus, PROFINET or PROFIBUS) are of special interest when implementing centralized (Cloud) prognosis architectures in industrial environments, due to their complexity and significance in terms of bandwidth and quality of information transmission. A hybrid solution to balance computation and communication and to meet task-specific requirements when dealing with seamless connectivity of billions of smart devices is the so-called Fog Computing [180,181], which can be conceived as an extension of Cloud Computing services towards the edge of the network without implementing all data processing functionalities in distributed IoT edge nodes, but instead on intermediate processing gateways.

3.2. Traditional databases versus Big Data technologies

In line with the increasing number of distributed information sources available in industrial environments, data-intensive technologies are becoming increasingly prevalent to ingest, store, manage and process massive data [182]. Under such circumstances, the real-time re-


sponse needed in many industrial scenarios requires that data processing and analysis be performed in an online manner over Stream Processing Engines (SPE). Different generations of SPEs have been developed in recent years, from extensions to traditional Database Management Systems (DBMS) to highly distributed, edge and cloud computing solutions [183].

DBMSs provide data storage capabilities for performing relational operations on data structures and batch analysis, whereas Data Stream Management Systems (DSMS) and edge technologies are mainly focused on fast data management and quick processing tasks. Similarly, Complex Event Processing (CEP) systems support the analysis of series of events and thus the detection of, for instance, time relationships among them by means of correlation rules. Another useful application of CEP systems in distributed IoT architectures and networks of edge devices is the smart fusion of sensor data [184,185].

Among those solutions that are mainly intended to address prognosis-related problems, it is worth mentioning ESPER (Event Series Intelligence [186]), a CEP-based solution designed to analyze series of events and implemented on the basis of an event-driven architecture, and InfluxDB [187], which is an open-source time series DBMS that is able to handle and perform real-time analytics over large amounts of timestamped data, including IoT sensor data.

4. Future trends and perspectives

In light of the literature reviewed in previous sections, there is no doubt about the research momentum around data fusion and analysis for industrial prognosis. Indeed, almost every proposal for new architectural solutions related to Industry 4.0 conceives prognosis as a core part of utmost relevance for the smart operation of the industrial asset under focus. The use of data fusion techniques and machine learning algorithms to exploit all the available information allows incorporating intelligence into improved, cloud-based hands-on machines and production lines, through software integration and deployment. Complex behaviors and prognostic models can be learned from historical data, large volumes of data can be analyzed in real time and industrial assets and production processes can be intelligently monitored in an on-line fashion. Cloud-powered data processing and Big Data management are also key technological ingredients in this regard.

However, the community still faces a number of research niches and challenges demanding further investigation and development in the near future. We next describe such challenges in detail by providing a reasoned rationale and by sketching potential research paths to follow, aimed at stimulating the interest and steering the efforts of early researchers and newcomers to this exciting research field. For the sake of clarity the identified challenges will be arranged in increasing order of their level of abstraction, from those related to the models themselves to the ones that connect closely with their practicality in real industrial environments.

4.1. Descriptive prognosis: Visual analytics for an enhanced understandability

One of the most recurrently encountered handicaps for the widespread adoption of data-based prognosis is the assimilation of information by the operator of the industrial plant. When it comes to descriptive prognosis, it is often the case that the information produced by the deployed models cannot be processed straightforwardly by non-specialized personnel unless some sort of preprocessing is devised for an improved, more intuitive understanding of the captured patterns. This is particularly relevant in legacy industrial facilities taking their first transition steps towards a digital mode of operation, production and management. In this stage descriptive modeling should resort to simple approaches targeted at a twofold objective: 1) to crosscheck that the captured data is in accordance with the knowledge and historical experience acquired by the personnel during their working years in the plant; and 2) to provide different levels of abstraction and complexity in the representation of data, optimally matched to the technical competences and needs of the managing staff of the industrial company. Once the industrial staff verifies that the prognostic information provided by basic descriptive models matches their intuition, a path is cleared towards embracing more advanced algorithms and methods for describing the normal operation of the industrial setup under analysis. To this end, data fusion techniques, when needed, must be designed with extreme care not to overlook the expertise of the personnel with regard to the number and temporal resolution of the monitored signals. Visual analytics, understood as the study and development of new ways of data representation fostering interpretability and understandability of the displayed information flows [188], has recently emerged as a promising discipline to visually adapt the discovered insights and optimally present results to different human profiles [189,190]. These aspects will be crucial in real use cases in which models for descriptive prognosis are to be deployed with a minimum guarantee of usability and practical utility, along with other technological approaches aimed at this same purpose (e.g. human-machine interfaces).

4.2. Predictive prognosis: Class imbalance, non-stationarity and transfer learning

When industrial prognosis is formulated as a classification or regression problem, the relatively low incidence of faults in the industrial machine or asset being monitored is a circumstance that hinders the proper construction of a predictive model to undertake the classification task. When training a model with little or no evidence of the events of interest (e.g. operational faults or changing operational conditions), it is likely that the model becomes biased towards the so-called majority class. In other words, the learning algorithm focuses on predicting the most frequent class (namely, normal operation) with high accuracy, while misclassifying or simply ignoring the least frequent class (correspondingly, faulty operation) which, in turn, is the one whose detection provides the most practical value for the industry. This is actually a very recurrent problem in predictive prognosis, particularly when cast as a binary classification problem [191]. Workarounds abound in the form of preprocessing methods such as class under/oversampling techniques, specialized balanced ensembles or embedded modifications of the model learning algorithm devised to account for the class imbalance present in the training dataset [192,193]. However, even though there have been notable advances in class imbalance for multilabel and multiclass classification from an application-agnostic perspective [194], most real use cases where predictive prognosis is put into practice oversimplify the underlying problem to its binary version, despite the immediate benefits that could derive from the discrimination of the type of fault predicted to occur (e.g. tailored predictive maintenance or a more resilient design of the processes and machinery involved in production). Extrapolating the aforementioned findings to industrial prognosis would by itself provide an increased predictive awareness of the fault patterns of the monitored assets. This would call for interesting synergies with visual analytics so as to help managers upon an alarm comprising different types of fault.

Another research area in data fusion and analysis emerging in industrial prognosis is Online Learning over data streams. Following the outline around this paradigm in Section 3, Online Learning implies deep changes not only with respect to the learning algorithm (e.g. incremental model update), but also with regard to the obsolescence of the knowledge retained by the model under phenomena that are not necessarily symptomatic of the failure to be predicted (e.g. a change of working regime of the machinery, lack of calibration, sensing drift and other factors alike). In such a case, the adoption of elements from concept drift detection and adaptation [195] for the industrial setting has lately come into scientific debate, as they can be efficient means for prognosis over time-evolving data streams [196]. Indeed, subtle changes in the distribution of the data streams under faulty and faultless operation can make the predictive knowledge captured in the model become

catastrophically obsolete at a point in time, eventually triggering maintenance alarms when there is no such need in practice. The detection of this drift and the consequent adaptation of the learning algorithm (either actively or passively) could minimize its impact and maintain the detection performance of the prognostic solution within admissible levels of practicality. In this context, industrial applications requiring prognosis over data streams should particularly inspect the latest advances for recurrent concept drifts [197,198], since the phenomena that make data streams drift usually occur repeatedly in industrial setups (e.g. recalibration or a change of the machine operator). Online predictive models capable of learning from data streams subject to uncertainty should also be at the core of future research in industrial prognosis, due to the high level of uncertainty and noise characterizing certain sources of data [199].

Finally, as stated in Section 3, Transfer Learning and Domain Adaptation are also trends in Data Science deserving further attention for industrial prognosis, since this portfolio of techniques can be an effective workaround for the scarcity of labeled prognosis data in industrial setups. In manufacturing industries with presence in different countries, the deployed machinery features a high level of similarity between plants, although designs may differ with the provider and contexts may vary in aspects such as maintenance policies, personnel skills or quality of the processed raw material. Transfer Learning could allow a predictive prognosis model developed for a certain industrial plant to be reused in part as a starting point for predicting failures in another plant, even if differences exist between the contexts in which such plants operate.

4.3. Prescriptive prognosis: Complex constraints and realistic objectives

When turning the focus to prescriptive prognosis, the most challenging issue encountered in practice remains the match between the formulated optimization problem and the decision making process that such a problem aims to model. Industries, particularly those related to the manufacturing of goods, are complex environments where humans and machinery coexist and interact, often without a holistically centralized management. In this context it is often the case that actions triggered by a prescriptive prognosis model do not conform to the practical criteria and/or constraints under which such actions would be manually enforced. In this case, the developed prescriptive models would fail to apply when deployed over the industrial plant, thus being left aside from managerial processes. Therefore, new working methodologies are needed to ensure that the prescriptive research hypothesis is aligned with the real requirements of the industry process at hand. Besides, such methodologies should also account for other practical aspects that could eventually affect the design of efficient solvers for their resolution, including the variability of metrics and/or constraints over time, cost implications of decisions made by the model or the presence of conflicting objectives in the criteria guiding such decisions (such as productivity and reliability when prescribing maintenance operations in a job shop scheduling problem).

4.4. Integration of expert knowledge and physics in hybrid prognostic models

In terms of data fusion, there has been little discussion on efficient procedures for representing and integrating expert knowledge towards its consideration in subsequent modeling phases. Beyond techniques for fusing the information captured at different scales and temporal resolutions from the industrial machinery and warehouse platforms, there is a common belief that the aggregated knowledge collected over years of experience of the personnel is a valuable informational asset and a key factor for success in prognosis modeling [200]. When the health or status indicator computed from data is representative enough to address the problem under study, the modeling stage becomes rather straightforward. More precisely, if expert knowledge of the problem to be solved and of the involved assets is available in advance, it is advisable to place the stress on the fusion of data and the definition of the KPIs to be measured. If expert knowledge is limited or not easily representable as a continuous or discrete variable, the emphasis must instead be placed on the modeling phase, attempting to address the analytical task in a more exhaustive manner. This can be seen as a trade-off between model complexity and a priori knowledge. Based on this principle, the attention of the research community should be directed towards the development of hybrid models capable of seamlessly incorporating the expertise fed back from the industry personnel within their learning algorithms. For this purpose, models suited to deal with multidimensional time-domain data instances should lie at the core of this research niche, such as recently reported recurrent models for sequence prediction with uncertainty [201] and distance-based classification for time series [202].

An open challenge related to the above remains when blending together Data Science and principles stemming from Mechanics, Thermodynamics and other physical principles linked to the failure of specific industrial processes, particularly those related to the manufacture of materials (e.g. metallurgy, polymers and plastics). The hybridization of theoretical concepts with evidence learned from historical data has shown itself to be highly profitable for energy efficiency [203] or battery life prediction [204], coining the so-called gray-box or semi-physical modeling concept [205]. Both complement each other and may help reduce the impact of label scarcity, lack of data or insufficiently generalizable theoretical approaches to the prognosis task under analysis. However, the integration of this theoretical knowledge in prognostic models is made in an ad-hoc fashion, being fully determined by the use case at hand. More principled studies are needed to establish under which conditions this hybridization yields significant performance gains for the model, delving into new ways to quantify the degree of innovation provided by theoretical concepts over a given prognostic dataset.

4.5. Prognosis towards flexible, cost-effective production

The digital revolution faced by industries in recent times can be thought of as an enabler for a better adaptability of manufacturing production processes and industrial assets to the dynamic conditions and requirements demanded by their consumers. Companies can even produce different products by communicating specifications to the machine. Thus, product variations can be automatically and flexibly manufactured by using well-defined standards. To this end, every single part involved in the process produces data and processes data delivered by other parts, including information related to quality, inventory and, most relevant to the current study, health monitoring. Parts continuously report their own status and that of the phase of the production line where they are installed, which requires gathering all such information through an IoT platform and centralizing it in a cloud-based system able to store and process large volumes of data. The intelligent prognostic analysis of this collected information is crucial to ensure that manufacturing industries operate robustly and cost-efficiently within highly competitive markets. By ensuring optimized prognostic decisions in terms of maintenance, highly customized and reconfigurable products can be manufactured, closely matching their specifications to customers’ needs. However, a closer look at the compatibility of maintenance decisions and flexible production needs and schedules is still lacking in the literature.

In this regard, the cost effectiveness of prognostic decisions is rarely addressed in the literature, even though this criterion can determine their practical feasibility. Instead, the optimality of decisions is rather formulated as productivity, energy savings, inventory saturation and other technological KPIs. In regards to flexibility, the ability of a manufacturing plant to dynamically produce small yet highly customized lots by virtue of informed, data-based decisions in operations and maintenance may clash with excessive economic investments if the prognosis modeling problem is formulated without considering cost effectiveness among its objectives. The inclusion of this metric into the design of prognostic models (particularly those prescribing predictive maintenance actions)
is promising in light of a strand of contributions related to the effectiveness of maintenance investments [206].

4.6. Hardware and communications: Industrial IoT networks using fog computing

The IoT revolution within Industry extends computing and network capabilities and minimizes the need for human interaction within industrial processes and operations. However, implementing IoT solutions is a huge transformation process, involving not only technology and products, but also a change of mindsets.

Interestingly, several challenges faced by companies willing to deploy IoT networks composed of distributed edge devices are connected with the risks derived from this sharp transformational process [207–209]. One key challenge has to do with the investment costs required for the IoT deployment, which could be daunting for Small and Medium Enterprises (SMEs) due to the unpredictability of the future value chain. The lack of skills and experience of the current Information Technology (IT) staff must be carefully considered as well, since it is often insufficient to deal with the vast amount of hardware and software solutions required for IoT-empowered Industry 4.0. In terms of interoperability and available standards, it turns out that current IoT ecosystems suffer from the fragmentation of conventional solutions and implementation standards [210]. Moreover, industrial IoT sensors must coexist with legacy equipment that is already deployed on the plant, which must be integrated into distributed IoT architectures as seamlessly and efficiently as possible [211,212]. Data security, privacy and governance are also important [212,213] given the vast amount of data generated by a wide variety of sources. This raises a major concern about the ownership of data, so that secure access must be guaranteed, particularly for industries whose products and assets are critical in these terms. In this regard, prognostic models must be complemented by schemes and mechanisms for authenticated sensor access and data encryption/verification/integrity assurance so that the operation of the prognostic model is robust against attacks based on unauthorized modification/injection/removal of industrial data along their life cycle.

Data capture mechanisms are already in place and the digitalization of industrial assets, products, processes and services is ready to improve productivity, satisfaction and income through data-driven solutions. However, there still remains a wide gap to be bridged between real industrial equipment and their digital twins, which are required, among other uses, to develop optimized maintenance/operation decisions in regards to their predicted prognosis. Furthermore, in many cases data are used only when the equipment undergoes servicing by a field engineer. Moreover, the full integration of Edge and Cloud Computing technologies is still uncertain in many industrial sectors. Therefore, the development and relatively higher maturity of Fog Computing frameworks can ignite the digitalization process of Industry 4.0 and efficiently support IoT applications [214].

4.7. New services, business models and specialized jobs

Another key aspect of the Industry 4.0 revolution is the concept of smart services or “servitization”, which is reinventing the maintenance of assets [215]. The main issue is to enable early prognosis of system errors and thus to accurately anticipate when an asset on the production line is going to fail, why, and how to prevent it, and even to act autonomously in consequence, including self-healing capabilities, to cause minimum impact on production. This new after-sales business includes the intelligent and proactive maintenance of the production assets, based on neither preventive nor corrective operations. The vertical integration of data monitored at the asset level with service processes residing in back-end systems in the cloud will provide a suitable environment for the development of cloud-based services to remotely offer customized prognostic approaches. Such highly integrated cloud-based systems will reduce both the amount of unexpected machine downtimes and maintenance costs. Production and processes will be interconnected, embedding intelligence not only in every single part of the workbench but also in the product or asset itself. As a consequence, this decentralization will ignite even further the need for addressing emerging distributed computing paradigms, such as the aforementioned edge analytics [216] and privacy-aware Federated Learning [217], with profound design implications for data fusion techniques and prognostic modeling such as those outlined in Section 3.

The advent of all these new data-based technologies in industry requires very specific technical worker skills. Data scientists, engineers and architects, database administrators and business analysts are becoming more frequently demanded by manufacturing companies, a demand propelled by the progressive digitalization of this sector [218]. The challenge in this regard resides in the attraction and retention of talented individuals, a matter hindered by practical issues such as the relatively low digital maturity of this domain, and the shortage of experienced profiles capable of fully exploiting all asset information and manufacturing data. As more staff training courses are completed and academic degrees in industrial prognosis become available, this issue will be progressively resolved.

5. Concluding remarks

This article has discussed the manifold directions of data fusion strategies and machine learning algorithms for data-driven prognosis within the Industry 4.0 paradigm. Three main categories have been discussed, namely, descriptive, predictive and prescriptive prognostics, which differ from each other with regard to the main objective targeted by the scheme in question. For the sake of an informed analysis of the research activity in this field, a comparative overview has been provided of different methods within each category, stressing the industrial problems and sectors where such reported approaches have been applied. Finally, in light of the surveyed literature we have outlined research trends and directions that will attract the attention of the research community regarding industrial prognosis from a data perspective. Some major questions and open technical challenges have been identified, not only related to data-based modeling and fusion (namely, highly imbalanced data, nonstationarity, heterogeneity of information and the transferability of the captured prognostic knowledge across tasks), but also to the consequences of their application to practical industrial setups (namely, cost efficiency, flexibility, and the need for specialized training). It is unquestionable that such issues will be fully addressed in years to come with new advances in data-driven prognosis such as those reviewed in this survey.

Acknowledgments

The authors would like to thank the Basque Government for its funding support through the EMAITEK program.

References

[1] Industrie 4.0 Working Group, Recommendations for Implementing the Strategic Initiative INDUSTRIE 4.0, Technical Report, 2013.
[2] M. Blanchet, T. Rinn, G. von Thaden, G. De Thieulloy, Industry 4.0: The New Industrial Revolution – How Europe will Succeed, Roland Berger Strategy Consultants, 2014.
[3] T. Devezas, J. Leitão, A. Sarygulov, Industry 4.0: Entrepreneurship and Structural Change in the New Digital Landscape, Springer, 2017.
[4] D. Serpanos, M. Wolf, Industrial Internet of Things, in: Internet-of-Things (IoT) Systems, Springer, 2018, pp. 37–54.
[5] M. Rüßmann, M. Lorenz, P. Gerbert, M. Waldner, J. Justus, P. Engel, M. Harnisch, Industry 4.0: The Future of Productivity and Growth in Manufacturing Industries, Boston Consulting Group, 2015.
[6] Y. Lu, Industry 4.0: A Survey on Technologies, Applications and Open Research Issues, J. Ind. Inf. Integr. 6 (2017) 1–10.
[7] Gobierno Vasco (Basque Government), PCTI Euskadi 2020: Una Estrategia de Especialización Inteligente, 2015.
[8] S. Jeschke, C. Brecher, T. Meisen, D. Özdemir, T. Eschert, Industrial internet of things and cyber manufacturing systems, in: Industrial Internet of Things, Springer, 2017, pp. 3–19.
[9] P. Leitão, L. Ribeiro, J. Lee, Guest editorial: special section on smart agents and cyber-physical systems for future industrial systems, IEEE Trans. Ind. Inform. 13 (2) (2017) 657–659.
[10] R.S. Michalski, J.G. Carbonell, T.M. Mitchell, Machine Learning: An Artificial Intelligence Approach, Springer Science & Business Media, 2013.
[11] A. Diez Oliván, Machine Learning for Data-driven Prognostics: Methods and Applications, 2017 Ph.D. thesis.
[12] W. Elghazel, J.M. Bahi, C. Guyeux, M. Hakem, K. Medjaher, N. Zerhouni, Dependability of sensor networks for industrial prognostics and health management, arXiv:1706.08129 (2017) preprint.
[13] R. Gravina, P. Alinia, H. Ghasemzadeh, G. Fortino, Multi-sensor fusion in body sensor networks: state-of-the-art and research challenges, Inf. Fusion 35 (2017) 68–80.
[14] C. Emmanouilidis, P. Pistofidis, A. Fournaris, M. Bevilacqua, I. Durazo-Cardenas, P.N. Botsaris, V. Katsouros, C. Koulamas, A.G. Starr, Context-based and human-centred information fusion in diagnostics, IFAC-PapersOnLine 49 (28) (2016) 220–225.
[15] S. Rawat, S. Rawat, Multi-sensor data fusion by a hybrid methodology – a comparative study, Comput. Ind. 75 (2016) 27–34.
[16] A. Ragab, M. El-Koujok, B. Poulin, M. Amazouz, S. Yacout, Fault diagnosis in industrial chemical processes using interpretable patterns based on logical analysis of data, Expert Syst. Appl. 95 (2018) 368–383.
[17] G. Manco, E. Ritacco, P. Rullo, L. Gallucci, W. Astill, D. Kimber, M. Antonelli, Fault detection and explanation through big data analysis on sensor streams, Expert Syst. Appl. 87 (2017) 141–156.
[18] A. Krishnakumari, A. Elayaperumal, M. Saravanan, C. Arvindan, Fault diagnostics of spur gear using decision tree and fuzzy classifier, Int. J. Adv. Manuf. Technol. 89 (9-12) (2017) 3487–3494.
[19] A. Diez-Olivan, M. Penalva, F. Veiga, L. Deitert, R. Sanz, B. Sierra, Kernel density-based pattern classification in blind fasteners installation, in: International Conference on Hybrid Artificial Intelligence Systems, Springer, 2017, pp. 195–206.
[20] C. Li, R.-V. Sanchez, G. Zurita, M. Cerrada, D. Cabrera, R.E. Vásquez, Gearbox fault diagnosis based on deep random forest fusion of acoustic and vibratory signals, Mech. Syst. Signal Process. 76 (2016) 283–293.
[21] M. Ruiz, L.E. Mujica, S. Alférez, L. Acho, C. Tutivén, Y. Vidal, J. Rodellar, F. Pozo, Wind turbine fault detection and classification by means of image texture analysis, Mech. Syst. Signal Process. 107 (2018) 149–167.
[22] J. Yin, W. Zhao, Fault diagnosis network design for vehicle on-board equipments of high-speed railway: a deep learning approach, Eng. Appl. Artif. Intell. 56 (2016) 250–259.
[23] C. Liu, Y. Li, G. Zhou, W. Shen, A sensor fusion and support vector machine based approach for recognition of complex machining conditions, J. Intell. Manuf. (2016) 1–14.
[24] A. Diez-Olivan, X. Averós, R. Sanz, B. Sierra, I. Estevez, Quantile regression forests-based modeling and environmental indicators for decision support in broiler farming, Comput. Electron. Agric. (2018) in press https://www.sciencedirect.com/science/article/pii/S0168169917314394.
[25] E.E. Peterson, S.A. Cunningham, M. Thomas, S. Collings, G.D. Bonnett, B. Harch, An assessment framework for measuring agroecosystem health, Ecol. Indic. 79 (2017) 265–275.
[26] D. Sun, V.C. Lee, Y. Lu, An intelligent data fusion framework for structural health monitoring, in: 11th IEEE Conference on Industrial Electronics and Applications (ICIEA), 2016, pp. 49–54.
[27] V.H. Jaramillo, J.R. Ottewill, R. Dudek, D. Lepiarczyk, P. Pawlik, Condition monitoring of distributed systems using two-stage bayesian inference data fusion, Mech. Syst. Signal Process. 87 (2017) 91–110.
[28] A. Diez, N.L.D. Khoa, M.M. Alamdari, Y. Wang, F. Chen, P. Runcie, A clustering approach for structural health monitoring on bridges, J. Civil Struct. Health Monitoring 6 (3) (2016) 429–445.
[29] C. Li, R.-V. Sánchez, G. Zurita, M. Cerrada, D. Cabrera, Fault diagnosis for rotating machinery using vibration measurement deep statistical feature learning, Sensors 16 (6) (2016) 895.
[30] A. Diez-Olivan, J.A. Pagan, N.L.D. Khoa, R. Sanz, B. Sierra, Kernel-based support vector machines for automated health status assessment in monitoring sensor data, Int. J. Adv. Manuf. Technol. 95 (2018) 327–340.
[31] J. Lee, C. Jin, Z. Liu, H.D. Ardakani, Introduction to data-driven methodologies for prognostics and health management, in: Probabilistic Prognostics and Health Management of Energy Systems, Springer, 2017, pp. 9–32.
[32] F. Wang, S. Pan, Y. Xiong, H. Fang, D. Wang, Research on software architecture of prognostics and health management system for civil aircraft, in: International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC), 2017, pp. 510–513.
[33] O.C. Basurko, Z. Uriondo, Condition-based maintenance for medium speed diesel engines used in vessels in operation, Appl. Therm. Eng. 80 (2015) 404–412.
[34] A. Diez-Olivan, J.A. Pagan, R. Sanz, B. Sierra, Data-driven prognostics using a combination of constrained k-means clustering, fuzzy modeling and lof-based score, Neurocomputing 241 (2017) 97–107.
[35] F. Kadri, F. Harrou, S. Chaabane, Y. Sun, C. Tahon, Seasonal arma-based spc charts for anomaly detection: application to emergency department systems, Neurocomputing 173 (2016) 2102–2114.
[36] G. Niu, J. Jiang, Prognostic control-enhanced maintenance optimization for multi-component systems, Reliab. Eng. Syst. Saf. 168 (2017) 218–226.
[37] G. Niu, H. Li, IETM centered intelligent maintenance system integrating fuzzy semantic inference and data fusion, Microelectron. Reliab. 75 (2017) 197–204.
[38] M. Baqqar, Machine Performance and Condition Monitoring Using Motor Operating Parameters Through Artificial Intelligence Techniques, University of Huddersfield, 2015 Ph.D. thesis.
[39] A. Yunusa-Kaltungo, J.K. Sinha, Effective vibration-based condition monitoring (evcm) of rotating machines, J. Qual. Maint. Eng. 23-3 (2017) 279–296.
[40] A. Kumar, R. Shankar, L.S. Thakur, A Big Data driven sustainable manufacturing framework for condition-based maintenance prediction, J. Comput. Sci. (2017) in press.
[41] M. Safizadeh, S. Latifi, Using multi-sensor data fusion for vibration fault diagnosis of rolling element bearings by accelerometer and load cell, Inf. Fusion 18 (2014) 1–8.
[42] H. Li, H.-Z. Huang, Y.-F. Li, J. Zhou, J. Mi, Physics of failure-based reliability prediction of turbine blades using multi-source information fusion, Appl. Soft Comput. (2018) in press https://www.sciencedirect.com/science/article/pii/S1568494618302783.
[43] I. Animah, M. Shafiee, Condition assessment, remaining useful life prediction and life extension decision making for offshore oil and gas assets, J. Loss Prev. Process Ind. 53 (2018) 17–28.
[44] J. Wu, Y. Su, Y. Cheng, X. Shao, C. Deng, C. Liu, Multi-sensor information fusion for remaining useful life prediction of machining tools by adaptive network based fuzzy inference system, Appl. Soft Comput. 68 (2018) 13–23.
[45] A. Mosallam, K. Medjaher, N. Zerhouni, Data-driven prognostic method based on bayesian approaches for direct remaining useful life prediction, J. Intell. Manuf. 27 (5) (2016) 1037–1048.
[46] J.J. Costello, G.M. West, S.D. McArthur, Machine learning model for event-based prognostics in gas circulator condition monitoring, IEEE Trans. Reliab. 66 (4) (2017) 1048–1057.
[47] A. Ragab, M.-S. Ouali, S. Yacout, H. Osman, Remaining useful life prediction using prognostic methodology based on logical analysis of data and kaplan–meier estimation, J. Intell. Manuf. 27 (5) (2016) 943–958.
[48] L. Guo, N. Li, F. Jia, Y. Lei, J. Lin, A recurrent neural network based health indicator for remaining useful life prediction of bearings, Neurocomputing 240 (2017) 98–109.
[49] L. Cristaldi, G. Leone, R. Ottoboni, S. Subbiah, S. Turrin, A comparative study on data-driven prognostic approaches using fleet knowledge, in: IEEE International Conference on Instrumentation and Measurement Technology (I2MTC), 2016, pp. 1–6.
[50] X. Fang, N.Z. Gebraeel, K. Paynabar, Scalable prognostic models for large-scale condition monitoring applications, IISE Trans. 49-7 (2017) 698–710.
[51] H. Shi, J. Zeng, Real-time prediction of remaining useful life and preventive opportunistic maintenance strategy for multi-component systems considering stochastic dependence, Comput. Ind. Eng. 93 (Suppl C) (2016) 192–204.
[52] A. Kumar, R.B. Chinnam, F. Tseng, An HMM and polynomial regression based approach for remaining useful life and health state estimation of cutting tools, Comput. Ind. Eng. (2018) in press https://www.sciencedirect.com/science/article/pii/S0360835218302183.
[53] X. Zhou, K. Huang, L. Xi, J. Lee, Preventive maintenance modeling for multi-component systems with considering stochastic failures and disassembly sequence, Reliab. Eng. Syst. Saf. 142 (Suppl C) (2015) 231–237.
[54] Z. Liu, N. Meyendorf, N. Mrad, The role of data fusion in predictive maintenance using digital twin, in: AIP Conference, 1949, 2018, p. 020023.
[55] S.-H. Ding, S. Kamaruddin, Maintenance policy optimization – literature review and directions, Int. J. Adv. Manuf. Technol. 76 (5-8) (2015) 1263–1283.
[56] C. Gahm, F. Denz, M. Dirr, A. Tuma, Energy-efficient scheduling in manufacturing companies: a review and research framework, Eur. J. Oper. Res. 248 (3) (2016) 744–757.
[57] F. Shrouf, J. Ordieres-Meré, A. García-Sánchez, M. Ortega-Mier, Optimizing the production scheduling of a single machine to minimize total energy consumption costs, J. Cleaner Prod. 67 (2014) 197–207.
[58] Z. Li, J. Guo, R. Zhou, Maintenance scheduling optimization based on reliability and prognostics information, in: Annual Reliability and Maintainability Symposium (RAMS), 2016, pp. 1–5.
[59] Y. Zeng, A. Che, X. Wu, Bi-objective scheduling on uniform parallel machines considering electricity cost, Eng. Optim. 50 (1) (2018) 19–36.
[60] C. Garcia-Santiago, J. Del Ser, C. Upton, F. Quilligan, S. Gil-Lopez, S. Salcedo-Sanz, A random-key encoded harmony search approach for energy-efficient production scheduling with shared resources, Eng. Optim. 47 (11) (2015) 1481–1496.
[61] M. Gen, W. Zhang, L. Lin, Y. Yun, Recent advances in hybrid evolutionary algorithms for multiobjective manufacturing scheduling, Comput. Ind. Eng. 112 (2017) 616–633.
[62] I. Moon, S. Lee, M. Shin, K. Ryu, Evolutionary resource assignment for workload-based production scheduling, J. Intell. Manuf. 27 (2) (2016) 375–388.
[63] C. Bierwirth, D.C. Mattfeld, Production scheduling and rescheduling with genetic algorithms, Evol. Comput. 7 (1) (1999) 1–17.
[64] H. Wang, W. Wang, H. Sun, Z. Cui, S. Rahnamayan, S. Zeng, A new cuckoo search algorithm with hybrid strategies for flow shop scheduling problems, Soft Comput. 21 (15) (2017) 4297–4307.
[65] M. Zandieh, A. Khatami, S.H.A. Rahmati, Flexible job shop scheduling under condition-based maintenance: improved version of imperialist competitive algorithm, Appl. Soft Comput. 58 (2017) 449–464.
[66] Y. Fu, J. Ding, H. Wang, J. Wang, Two-objective stochastic flow-shop scheduling with deteriorating and learning effect in industry 4.0-based manufacturing system, Appl. Soft Comput. 68 (2018) 847–855.
[67] F. El Khoukhi, J. Boukachour, A.E.H. Alaoui, The dual-ants colony: a novel hybrid [99] M. Hanafy, H. ElMaraghy, Co-design of products and systems using a bayesian
approach for the flexible job shop scheduling problem with preventive mainte- network, Procedia CIRP 17 (2014) 284–289.
nance, Comput. Ind. Eng. 106 (2017) 236–255. [100] B. Malakooti, Operations and Production Systems with Multiple Objectives, John
[68] M. Khatami, S.H. Zegordi, Coordinative production and maintenance scheduling Wiley & Sons, 2013.
problem with flexible maintenance time intervals, J. Intell. Manuf. 28 (4) (2017) [101] Y.A. Alamerew, D. Brissaud, Evaluation of remanufacturing for product recovery:
857–867. multi-criteria decision tool for end-of-life selection strategy, in: 3rd International
[69] M. Ventresca, B.M. Ombuki, Ant colony optimization for job shop scheduling prob- Conference on Remanufacturing, 2017.
lem, in: 8th IASTED International Conference on Artificial Intelligence and Soft [102] C. Diallo, U. Venkatadri, A. Khatab, S. Bhakthavatchalam, State of the art review
Computing, 2004. 451–152. of quality, reliability and maintenance issues in closed-loop supply chains with
remanufacturing, Int. J. Prod. Res. 55 (5) (2017) 1277–1296.
[70] R.-H. Huang, T.-H. Yu, An effective ant colony optimization algorithm for multi-objective job-shop scheduling with equal-size lot-splitting, Appl. Soft Comput. 57 (2017) 642–656.
[71] E. Ahmadi, M. Zandieh, M. Farrokh, S.M. Emami, A multi objective optimization approach for flexible job shop scheduling problem under random machine breakdown by evolutionary algorithms, Comput. Oper. Res. 73 (2016) 56–66.
[72] W. Liao, M. Chen, X. Yang, Joint optimization of preventive maintenance and production scheduling for parallel machines system, J. Intell. Fuzzy Syst. 32 (1) (2017) 913–923.
[73] S. Wang, M. Liu, Multi-objective optimization of parallel machine scheduling integrated with multi-resources preventive maintenance planning, J. Manuf. Syst. 37 (2015) 182–192.
[74] D. Sha, H.-H. Lin, A multi-objective pso for job-shop scheduling problems, Expert Syst. Appl. 37 (2) (2010) 1065–1070.
[75] W. Liao, X. Zhang, M. Jiang, Multi-objective group scheduling optimization integrated with preventive maintenance, Eng. Optim. (2017) 1–15.
[76] H. Seidgar, M. Zandieh, I. Mahdavi, An efficient meta-heuristic algorithm for scheduling a two-stage assembly flow shop problem with preventive maintenance activities and reliability approach, Int. J. Ind. Syst. Eng. 26 (1) (2017) 16–41.
[77] J.E. Diaz, J. Handl, D.-L. Xu, Evolutionary robust optimization in production planning–interactions between number of objectives, sample size and choice of robustness measure, Comput. Oper. Res. 79 (2017) 266–278.
[78] D. Gupta, C.T. Maravelias, J.M. Wassick, From rescheduling to online scheduling, Chem. Eng. Res. Des. 116 (2016) 83–97.
[79] A. Boudjelida, On the robustness of joint production and maintenance scheduling in presence of uncertainties, J. Intell. Manuf. (2017) 1–16.
[80] N.H. Lappas, C.E. Gounaris, Multi-stage adjustable robust optimization for process scheduling under uncertainty, AIChE J. 62 (5) (2016) 1646–1667.
[81] O.A. Arık, M.D. Toksarı, Multi-objective fuzzy parallel machine scheduling problems under fuzzy job deterioration and learning effects, Int. J. Prod. Res. (2017) 1–18.
[82] C.F. Baban, M. Baban, M.D. Suteu, Using a fuzzy logic approach for the predictive maintenance of textile machines, J. Intell. Fuzzy Syst. 30 (2) (2016) 999–1006.
[83] B. Al-Najjar, I. Alsyouf, Selecting the most efficient maintenance approach using fuzzy multiple criteria decision making, Int. J. Prod. Econ. 84 (1) (2003) 85–100.
[84] M. Sakawa, R. Kubota, Fuzzy programming for multiobjective job shop scheduling with fuzzy processing time and fuzzy duedate through genetic algorithms, Eur. J. Oper. Res. 120 (2) (2000) 393–407.
[85] M. Biondi, G. Sand, I. Harjunkoski, Optimization of multipurpose process plant operations: a multi-time-scale maintenance and production scheduling approach, Comput. Chem. Eng. 99 (2017) 325–339.
[86] W. Cui, Z. Lu, C. Li, X. Han, A proactive approach to solve integrated production scheduling and maintenance planning problem in flow shops, Comput. Ind. Eng. 115 (2018) 342–353.
[87] M. Mazhar, S. Kara, H. Kaebernick, Remaining life estimation of used components in consumer products: Life cycle data analysis by weibull and artificial neural networks, J. Oper. Manage. 25 (6) (2007) 1184–1193.
[88] M. Klein, Calculating life cycle impact assessment of chemicals with neural networks, Chemie Ingenieur Technik 86 (9) (2014) 1631–1631.
[89] T. Shiraiwa, F. Briffod, Y. Miyazawa, M. Enoki, Fatigue performance prediction of structural materials by multi-scale modeling and machine learning, in: 4th World Congress on Integrated Computational Materials Engineering (ICME 2017), Springer, 2017, pp. 317–326.
[90] X. Gao, Y. Chen, D. You, Z. Xiao, X. Chen, Detection of micro gap weld joint by using magneto-optical imaging and kalman filtering compensated with RBF neural network, Mech. Syst. Signal Process. 84 (2017) 570–583.
[91] G.X. Gu, C.-T. Chen, M.J. Buehler, De novo composite design based on machine learning algorithm, Extreme Mech. Lett. 18 (2018) 19–28.
[92] S. Kang, E. Kim, J. Shim, S. Cho, W. Chang, J. Kim, Mining the relationship between production and customer service data for failure analysis of industrial products, Comput. Ind. Eng. 106 (2017) 137–146.
[93] Y. Zhang, S. Ren, Y. Liu, T. Sakao, D. Huisingh, A framework for big data driven product lifecycle management, J. Cleaner Prod. 159 (2017) 229–240.
[94] S. Spielberg, R. Gopaluni, P. Loewen, Deep reinforcement learning approaches for process control, in: 6th IEEE International Symposium on Advanced Control of Industrial Processes (AdCONIP), 2017, pp. 201–206.
[95] X. Li, Q. Ding, J.-Q. Sun, Remaining useful life estimation in prognostics using deep convolution neural networks, Reliab. Eng. Syst. Saf. 172 (2018) 1–11.
[96] G. Shroff, P. Agarwal, K. Singh, A.H. Kazmi, S. Shah, A. Sardeshmukh, Prescriptive information fusion, in: 17th IEEE International Conference on Information Fusion (FUSION), 2014, pp. 1–8.
[97] Z. Hong, Y. Feng, Z. Li, G. Tian, J. Tan, Reliability-based and cost-oriented product optimization integrating fuzzy reasoning petri nets, interval expert evaluation and cultural-based dmopso using crowding distance sorting, Appl. Sci. 7 (8) (2017) 791.
[98] M. Hanafy, H. ElMaraghy, Integrated products–systems design environment using bayesian networks, Int. J. Comput. Integr. Manuf. 30 (7) (2017) 708–723.
[103] S. Kara, M. Mazhar, H. Kaebernick, A. Ahmed, Determining the reuse potential of components based on life cycle data, CIRP Ann. Manuf. Technol. 54 (1) (2005) 1–4.
[104] L.V. Snyder, Z. Atan, P. Peng, Y. Rong, A.J. Schmitt, B. Sinsoysal, OR/MS models for supply chain disruptions: a review, IIE Trans. 48 (2) (2016) 89–109.
[105] F. Badurdeen, M. Shuaib, K. Wijekoon, A. Brown, W. Faulkner, J. Amundson, I. Jawahir, T.J. Goldsby, D. Iyengar, B. Boden, Quantitative modeling and analysis of supply chain risks using bayesian theory, J. Manuf. Technol. Manage. 25 (5) (2014) 631–654.
[106] R. Sarrate, F. Nejjari, F.D. Mele, J. Quevedo, L. Puigjaner, Event-based approach for supply chain fault analysis, Comput. Aided Chem. Eng. 20 (2005) 1261–1266.
[107] G. Wang, A. Gunasekaran, E.W. Ngai, T. Papadopoulos, Big data analytics in logistics and supply chain management: certain investigations for research and applications, Int. J. Prod. Econ. 176 (2016) 98–110.
[108] A. Gunasekaran, T. Papadopoulos, R. Dubey, S.F. Wamba, S.J. Childe, B. Hazen, S. Akter, Big data and predictive analytics for supply chain and organizational performance, J. Bus. Res. 70 (2017) 308–317.
[109] R.Y. Zhong, S.T. Newman, G.Q. Huang, S. Lan, Big data for supply chain management in the service and manufacturing sectors: Challenges, opportunities, and future perspectives, Comput. Ind. Eng. 101 (2016) 572–591.
[110] E.S. Madsen, A. Bilberg, D.G. Hansen, Industry 4.0 and digitalization call for vocational skills, applied industrial engineering, and less for pure academics, in: 5th World Conference on Production and Operations Management P&OM, 2016.
[111] S. Yin, S.X. Ding, D. Zhou, Diagnosis and prognosis for complicated industrial systems – part I, IEEE Trans. Ind. Electron. 63 (4) (2016) 2501–2505.
[112] S. Yin, S.X. Ding, D. Zhou, Diagnosis and prognosis for complicated industrial systems – part II, IEEE Trans. Ind. Electron. 63 (5) (2016) 3201–3204.
[113] K. Severson, P. Chaiwatanodom, R.D. Braatz, Perspectives on process monitoring of industrial systems, Annu. Rev. Control 42 (2016) 190–200.
[114] F. Rousseaux, Big data and data-driven intelligent predictive algorithms to support creativity in industrial engineering, Comput. Ind. Eng. 112 (2017) 459–465.
[115] A.T. Azar, S. Vaidyanathan, Computational Intelligence Applications in Modeling and Control, Springer, 2015.
[116] Q. Zhu, A.T. Azar, Complex System Modelling and Control Through Intelligent Soft Computations, Springer, 2015.
[117] T. Hastie, R. Tibshirani, J. Friedman, Unsupervised learning, in: The Elements of Statistical Learning, Springer, 2009, pp. 485–585.
[118] J. Banks, J. Hines, M. Lebold, R. Campbell, C. Begg, Failure Modes and Predictive Diagnostics Considerations for Diesel Engines, Technical Report, The Pennsylvania State University Park, Applied Research Laboratory, 2001.
[119] S.B. Kotsiantis, I. Zaharakis, P. Pintelas, Supervised Machine Learning: A Review of Classification Techniques, 2007.
[120] M. Schwabacher, K. Goebel, A survey of artificial intelligence for prognostics, in: AAAI Fall Symposium, 2007, pp. 107–114.
[121] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (7553) (2015) 436.
[122] M. Feurer, A. Klein, K. Eggensperger, J. Springenberg, M. Blum, F. Hutter, Efficient and robust automated machine learning, in: Advances in Neural Information Processing Systems, 2015, pp. 2962–2970.
[123] E.P. Carden, P. Fanning, Vibration based condition monitoring: a review, Struct. Health Monitoring 3 (4) (2004) 355–377.
[124] O. Chapelle, B. Schölkopf, A. Zien, Semi-supervised learning, The MIT Press, 2010 https://dl.acm.org/citation.cfm?id=1841234, ISBN: 0262514125, 9780262514125.
[125] V. Barnett, T. Lewis, et al., Outliers in Statistical Data, 3, Wiley New York, 1994.
[126] P.J. Rousseeuw, A.M. Leroy, Robust Regression and Outlier Detection, 589, John Wiley & Sons, 2005.
[127] L.J. Latecki, A. Lazarevic, D. Pokrajac, Outlier detection with kernel density functions, in: Machine Learning and Data Mining in Pattern Recognition, Springer, 2007, pp. 61–75.
[128] V. Hautamaki, I. Karkkainen, P. Franti, Outlier detection using k-nearest neighbour graph, in: 17th International Conference on Pattern Recognition, 3, 2004, pp. 430–433.
[129] M. Rausand, J. Vatn, Reliability centred maintenance, in: Complex System Maintenance Handbook, Springer, Berlin, Heidelberg, 2008, pp. 79–108.
[130] C. Milkie, A.N. Perakis, Statistical methods for planning diesel engine overhauls in the us coast guard, Naval Eng. J. 116 (2) (2004) 31–42.
[131] M. John, Reliability Centered Maintenance, 1997.
[132] A. Hernandez, D. Galar, Techniques of Prognostics for Condition-Based Maintenance in Different Types of Assets, Luleå Tekniska Universitet, 2014.
[133] C. James Li, S. Li, Acoustic emission analysis for bearing condition monitoring, Wear 185 (1) (1995) 67–74.
[134] W.T. Peter, Y. Peng, R. Yam, Wavelet analysis and envelope detection for rolling element bearing fault diagnosis: their effectiveness and flexibilities, J. Vib. Acoust. 123 (3) (2001) 303–310.
[135] M. Braglia, G. Carmignani, M. Frosolini, F. Zammori, Data classification and MTBF prediction with a multivariate analysis approach, Reliab. Eng. Syst. Saf. 97 (1) (2012) 27–35.


[136] A. Coraddu, L. Oneto, A. Ghio, S. Savio, M. Figari, D. Anguita, Machine learning for wear forecasting of naval assets for condition-based maintenance applications, in: 2015 International Conference on Electrical Systems for Aircraft, Railway, Ship Propulsion and Road Vehicles (ESARS), 2015, pp. 1–5.
[137] B.S.J. Costa, P.P. Angelov, L.A. Guedes, Fully unsupervised fault detection and identification based on recursive density estimation and self-evolving cloud-based classifier, Neurocomputing 150 (2015) 289–303.
[138] H. Yang, J. Mathew, L. Ma, Intelligent diagnosis of rotating machinery faults-a review, in: 3rd Asia-Pacific Conference on Systems Integrity and Maintenance (ACSIM), 2002, pp. 385–392.
[139] W. Li, Z. Zhu, F. Jiang, G. Zhou, G. Chen, Fault diagnosis of rotating machinery with a novel statistical feature extraction and evaluation method, Mech. Syst. Signal Process. 50 (2015) 414–426.
[140] Z. Zhou, J. Zhao, F. Cao, A novel approach for fault diagnosis of induction motor with invariant character vectors, Inf. Sci. 281 (2014) 496–506.
[141] R. Jegadeeshwaran, V. Sugumaran, Fault diagnosis of automobile hydraulic brake system using statistical features and support vector machines, Mech. Syst. Signal Process. 52 (2015) 436–446.
[142] S. Yin, G. Wang, H. Gao, Data-driven process monitoring based on modified orthogonal projections to latent structures, IEEE Trans. Control Syst. Technol. 24 (4) (2015) 1480–1487.
[143] E. Kokiopoulou, J. Chen, Y. Saad, Trace optimization and eigenproblems in dimension reduction methods, Numer. Linear Algebra Appl. 18 (3) (2011) 565–602.
[144] F. Zhou, J.H. Park, Y. Liu, Differential feature based hierarchical PCA fault detection method for dynamic fault, Neurocomputing 202 (2016) 27–35.
[145] R. Akerkar, P. Sajja, Knowledge-Based Systems, Jones & Bartlett Publishers, 2010.
[146] W.J. Verhagen, R. Curran, Knowledge-based engineering review: conceptual foundations and research issues, in: New World Situation: New Directions in Concurrent Engineering, Springer, Berlin, Heidelberg, 2010, pp. 267–276.
[147] K. Tidriri, N. Chatti, S. Verron, T. Tiplica, Bridging data-driven and model-based approaches for process fault diagnosis and health monitoring: a review of researches and future challenges, Annu. Rev. Control 42 (2016) 63–81.
[148] I. Steinwart, D.R. Hush, C. Scovel, A classification framework for anomaly detection, J. Mach. Learn. Res. (2005) 211–232.
[149] W.C. Greene, Evaluation of Non-intrusive Monitoring for Condition Based Maintenance Applications on US Navy Propulsion Plants, Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2005 Ph.D. thesis.
[150] G.W. Vogl, B.A. Weiss, M. Helu, A review of diagnostic and prognostic capabilities and best practices for manufacturing, J. Intell. Manuf. (2016) 1–17.
[151] J.M. Brown, J.A. Coffey III, D. Harvey, J.M. Thayer, Characterization and prognosis of multirotor failures, in: Structural Health Monitoring and Damage Detection, 7, Springer, Berlin, Heidelberg, 2015, pp. 157–173.
[152] K. Goebel, B. Saha, A. Saxena, J.R. Celaya, J.P. Christophersen, Prognostics in Battery Health Management, 2008.
[153] G.E. Box, G.M. Jenkins, G.C. Reinsel, G.M. Ljung, Time Series Analysis: Forecasting and Control, John Wiley & Sons, 2015.
[154] C.E. Rasmussen, Gaussian Processes for Machine Learning, 2006.
[155] R. Rojas, Neural Networks: A Systematic Introduction, Springer Science & Business Media, 2013.
[156] T. Mikolov, M. Karafiát, L. Burget, J. Cernockỳ, S. Khudanpur, Recurrent neural network based language model, in: Interspeech, 2, 2010, p. 3.
[157] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Comput. 9 (8) (1997) 1735–1780.
[158] P. Hayton, S. Utete, D. King, S. King, P. Anuzis, L. Tarassenko, Static and dynamic novelty detection methods for jet engine health monitoring, Philos. Trans. R. Soc. Lond. A: Math. Phys. Eng. Sci. 365 (1851) (2007) 493–514.
[159] P. Malhotra, L. Vig, G. Shroff, P. Agarwal, Long short term memory networks for anomaly detection in time series, in: Proceedings, Presses universitaires de Louvain, 2015, p. 89.
[160] D.T. Shipmon, J.M. Gurevitch, P.M. Piselli, S.T. Edwards, Time series anomaly detection; detection of anomalous drops with limited features and sparse examples in noisy highly periodic data, arXiv:1708.03665 (2017) preprint.
[161] J. Hernández-González, I. Inza, J.A. Lozano, Weak supervision and other non-standard classification problems: a taxonomy, Pattern Recognit. Lett. 69 (2016) 49–55.
[162] P. O'Donovan, K. Leahy, K. Bruton, D.T. O'Sullivan, Big data in manufacturing: a systematic mapping study, J. Big Data 2 (1) (2015) 20.
[163] K. Zhou, T. Liu, L. Zhou, Industry 4.0: towards future industrial opportunities and challenges, in: 12th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), 2015, pp. 2147–2152.
[164] K.E. Martin, Ethical issues in the big data industry, MIS Q. Executive 14:2 (2015) 67–85.
[165] J. Zhang, G. Ding, Y. Zou, S. Qin, J. Fu, Review of job shop scheduling research and its new perspectives under industry 4.0, J. Intell. Manuf. (2017) 1–22.
[166] R.Y. Zhong, X. Xu, E. Klotz, S.T. Newman, Intelligent manufacturing in the context of industry 4.0: a review, Engineering 3 (5) (2017) 616–630.
[167] J. Behnamian, S.F. Ghomi, A survey of multi-factory scheduling, J. Intell. Manuf. 27 (1) (2016) 231–249.
[168] J. Li, F. Tao, Y. Cheng, L. Zhao, Big data in product lifecycle management, Int. J. Adv. Manuf. Technol. 81 (1–4) (2015) 667–684.
[169] F. Tao, J. Cheng, Q. Qi, M. Zhang, H. Zhang, F. Sui, Digital twin-driven product design, manufacturing and service with big data, Int. J. Adv. Manuf. Technol. (2017) 1–14.
[170] T. Tolio, M. Sacco, W. Terkaj, M. Urgo, Virtual factory: an integrated framework for manufacturing systems design and analysis, Procedia CIRP 7 (2013) 25–30.
[171] P. Kadlec, B. Gabrys, S. Strandt, Data-driven soft sensors in the process industry, Comput. Chem. Eng. 33 (4) (2009) 795–814.
[172] M. Colledani, T. Tolio, A. Fischer, B. Iung, G. Lanza, R. Schmitt, J. Váncza, Design and management of manufacturing systems for production quality, CIRP Ann. Manuf. Technol. 63 (2) (2014) 773–796.
[173] S. Nihtianov, A. Luque, Smart Sensors and MEMS: Intelligent Sensing Devices and Microsystems for Industrial Applications, Woodhead Publishing, 2018.
[174] N. Abbas, Y. Zhang, A. Taherkordi, T. Skeie, Mobile edge computing: a survey, IEEE Internet Things J. 5 (1) (2018) 450–465.
[175] W. Shi, J. Cao, Q. Zhang, Y. Li, L. Xu, Edge computing: vision and challenges, IEEE Internet Things J. 3 (5) (2016) 637–646.
[176] T. Nishio, R. Yonetani, Client selection for federated learning with heterogeneous resources in mobile edge, arXiv preprint arXiv:1804.08333 (2018).
[177] K. Weiss, T.M. Khoshgoftaar, D. Wang, A survey of transfer learning, J. Big Data 3 (1) (2016) 9.
[178] T. Hou, G. Feng, S. Qin, W. Jiang, Proactive content caching by exploiting transfer learning for mobile edge computing, Int. J. Commun. Syst. 31 (11) (2018) e3706.
[179] T. Chen, Q. Ling, Y. Shen, G.B. Giannakis, Heterogeneous online learning for thing-adaptive fog computing in iot, IEEE Internet Things J. (2018).
[180] Y. Jiang, Z. Huang, D.H. Tsang, Challenges and solutions in fog computing orchestration, IEEE Netw. 32 (3) (2018) 122–129.
[181] M. Mukherjee, L. Shu, D. Wang, Survey of fog computing: fundamental, network applications, and research challenges, IEEE Commun. Surv. Tutorials (2018).
[182] C.P. Chen, C.-Y. Zhang, Data-intensive applications, challenges, techniques and technologies: a survey on big data, Inf. Sci. 275 (2014) 314–347.
[183] M.D. de Assuncao, A. da Silva Veith, R. Buyya, Distributed data stream processing and edge computing: a survey on resource elasticity and future directions, J. Network Comput. Appl. 103 (2018) 1–17.
[184] R. Dautov, S. Distefano, Distributed data fusion for the internet of things, in: International Conference on Parallel Computing Technologies, Springer, 2017, pp. 427–432.
[185] C. Esposito, A. Castiglione, F. Palmieri, M. Ficco, C. Dobre, G.V. Iordache, F. Pop, Event-based sensor data exchange and fusion in the internet of things environments, J. Parallel Distrib. Comput. 118 (2018) 328–343.
[186] O. Etzion, P. Niblett, D.C. Luckham, Event Processing in Action, Manning Greenwich, 2011.
[187] S.N.Z. Naqvi, S. Yfantidou, E. Zimányi, Time Series Databases and Influxdb, Université Libre de Bruxelles, 2017.
[188] D. Keim, G. Andrienko, J.-D. Fekete, C. Görg, J. Kohlhammer, G. Melançon, Visual analytics: definition, process, and challenges, in: Information visualization, Springer, 2008, pp. 154–175.
[189] J. Posada, C. Toro, I. Barandiaran, D. Oyarzun, D. Stricker, R. De Amicis, E.B. Pinto, P. Eisert, J. Döllner, I. Vallarino, Visual computing as a key enabling technology for industrie 4.0 and industrial internet, IEEE Comput. Graph. Appl. 35 (2) (2015) 26–40.
[190] A. Stork, Visual computing challenges of advanced manufacturing and industrie 4.0, IEEE Comput. Graph. Appl. (2) (2015) 21–25.
[191] P. Lade, R. Ghosh, S. Srinivasan, Manufacturing analytics and industrial internet of things, IEEE Intell. Syst. 32 (3) (2017) 74–79.
[192] B. Krawczyk, Learning from imbalanced data: open challenges and future directions, Prog. Artif. Intell. 5 (4) (2016) 221–232.
[193] V. López, A. Fernández, S. García, V. Palade, F. Herrera, An insight into classification with imbalanced data: empirical results and current trends on using data intrinsic characteristics, Inf. Sci. 250 (2013) 113–141.
[194] F. Charte, A.J. Rivera, M.J. del Jesus, F. Herrera, Addressing imbalance in multilabel classification: measures and random resampling algorithms, Neurocomputing 163 (2015) 3–16.
[195] J. Gama, I. Žliobaitė, A. Bifet, M. Pechenizkiy, A. Bouchachia, A survey on concept drift adaptation, ACM Comput. Surv. (CSUR) 46 (4) (2014) 44.
[196] I. Žliobaitė, M. Pechenizkiy, J. Gama, An overview of concept drift applications, in: Big Data Analysis: New Algorithms for A New Society, Springer, 2016, pp. 91–114.
[197] P. Li, X. Wu, X. Hu, Mining recurring concept drifts with limited labeled streaming data, in: 2nd Asian Conference on Machine Learning, 2010, pp. 241–252.
[198] C. Alippi, G. Boracchi, M. Roveri, Just-in-time classifiers for recurrent concepts, IEEE Trans. Neural Netw. Learn. Syst. 24 (4) (2013) 620–634.
[199] B. Krawczyk, A. Cano, Online ensemble learning with abstaining classifiers for drifting and noisy data streams, Appl. Soft Comput. 68 (2018) 677–692.
[200] J. Sikorska, M. Hodkiewicz, L. Ma, Prognostic modelling options for remaining useful life estimation by industry, Mech. Syst. Signal Process. 25 (5) (2011) 1803–1836.
[201] Z. Che, S. Purushotham, K. Cho, D. Sontag, Y. Liu, Recurrent neural networks for multivariate time series with missing values, Sci. Rep. 8 (1) (2018) 6085.
[202] A. Abanda, U. Mori, J.A. Lozano, A review on distance based time series classification, arXiv:1806.04509 (2018) preprint.
[203] H. Viot, A. Sempey, L. Mora, J. Batsale, J. Malvestio, Model predictive control of a thermally activated building system to improve energy management of an experimental building: part I: modeling and measurements, Energy Build. 172 (2018) 94–103.
[204] L. Liao, F. Köttig, Review of hybrid prognostics approaches for remaining useful life prediction of engineered systems, and an application to battery life prediction, IEEE Trans. Reliab. 63 (1) (2014) 191–207.
[205] J. Glassey, M. von Stosch, Benefits and challenges of hybrid modeling in the process industries: an introduction, in: Hybrid Modeling in Process Industries, CRC Press, 2018, pp. 1–12.
[206] C. Lundgren, A. Skoogh, J. Bokrantz, Quantifying the effects of maintenance – a literature review of maintenance models, Procedia CIRP 72 (2018) 1305–1310.


[207] L. Farhan, R. Kharel, O. Kaiwartya, M. Quiroz-Castellanos, A. Alissa, M. Abdulsalam, A concise review on internet of things (iot) problems, challenges and opportunities, 11th International Symposium Communication System Networks, Digital Signal Processing, Hungary, 2018.
[208] K.E. Jeon, J. She, P. Soonsawad, P.C. Ng, Ble beacons for internet of things applications: Survey, challenges, and opportunities, IEEE Internet Things J. 5 (2) (2018) 811–828.
[209] T. Mazali, From industry 4.0 to society 4.0, there and back, AI & Society 33 (3) (2018) 405–411.
[210] S.A. Al-Qaseemi, H.A. Almulhim, M.F. Almulhim, S.R. Chaudhry, Iot architecture challenges and issues: lack of standardization, in: Future Technologies Conference (FTC), IEEE, 2016, pp. 731–738.
[211] M. Chiang, T. Zhang, Fog and iot: an overview of research opportunities, IEEE Internet Things J. 3 (6) (2016) 854–864.
[212] E. Sisinni, A. Saifullah, S. Han, U. Jennehag, M. Gidlund, Industrial internet of things: challenges, opportunities, and directions, IEEE Trans. Ind. Inform. (2018).
[213] A.-R. Sadeghi, C. Wachsmann, M. Waidner, Security and privacy challenges in industrial internet of things, in: Design Automation Conference (DAC), 2015 52nd ACM/EDAC/IEEE, IEEE, 2015, pp. 1–6.
[214] S. Sarkar, S. Chatterjee, S. Misra, Assessment of the suitability of fog computing in the context of internet of things, IEEE Trans. Cloud Comput. 6 (1) (2018) 46–59.
[215] J. Huxtable, D. Schaefer, On servitization of the manufacturing industry in the UK, Procedia CIRP 52 (2016) 46–51.
[216] M. Satyanarayanan, P. Simoens, Y. Xiao, P. Pillai, Z. Chen, K. Ha, W. Hu, B. Amos, Edge analytics in the internet of things, IEEE Pervasive Comput. (2) (2015) 24–31.
[217] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H.B. McMahan, S. Patel, D. Ramage, A. Segal, K. Seth, Practical secure aggregation for privacy-preserving machine learning, in: ACM SIGSAC Conference on Computer and Communications Security, 2017, pp. 1175–1191.
[218] A. Ancarani, C. Di Mauro, Successful digital transformations need a focus on the individual, in: Digitalisierung im Einkauf, Springer, 2018, pp. 11–26.
