Renee Swischuk
Department of Mathematics
Texas A&M University
College Station, Texas, 77843
Douglas Allaire
Assistant Professor
Department of Mechanical Engineering
Texas A&M University
College Station, Texas, 77843
ABSTRACT
Sensors are crucial to modern mechanical systems. The location of these sensors can often make them vulnerable to outside interference and failures, and the use of sensors over a lifetime can cause degradation and lead
to failure. If a system has access to redundant sensor output, it can be trained to autonomously recognize errors
in faulty sensors and learn to correct them. In this work, we develop a novel data-driven approach to detecting
sensor failures and predicting corrected sensor data using machine learning methods in an offline/online paradigm.
Autocorrelation is shown to provide a global feature of failure data capable of accurately classifying the state of
a sensor to determine if a failure is occurring. Feature selection of redundant sensor data in combination with
k-nearest neighbors regression is used to predict corrected sensor data rapidly, while the system is operational. We
demonstrate our methodology on flight data from a four-engine commercial jet that contains failures in the pitot static system resulting in inaccurate airspeed measurements.
1 Introduction
Sensors are found in almost all modern mechanical systems. Continuous Glucose Monitoring sensors placed under the
skin of a diabetic patient provide early warnings for possible hypo-/hyperglycemic events [1]. Monitors used in ambient assisted living environments, or smart homes, help provide support to the elderly and allow them to live safely and
independently. Safety-critical sensor systems can be found in all modern automobiles, and the air data system (ADS) on an aircraft, consisting of hundreds of sensors, provides constant measurements of the performance and behavior of engines,
flight path progress and surrounding environmental conditions. Many mechanical systems rely on sensors to monitor their condition; thus, the ability of a sensor to produce accurate and reliable measurements is crucial to good decision making.
Moreover, the ability of the system to autonomously detect when one of its own sensors becomes unreliable is important
when a failure occurs during operation, or at a time when immediate maintenance is not available. There are two essential
steps in enabling an autonomous operating system. First, the system needs to be able to detect and identify a failure, referred
to in many fields as sensor failure detection and identification/isolation (SFDI) or sensor fault detection and diagnosis (FDD).
Second, the system needs to be able to rapidly correct or replace the faulty measurement values, known as sensor failure
accommodation (SFA) or correction. These two tasks are often performed simultaneously, and referred to as sensor failure
detection, identification and accommodation (SFDIA), by making predictions or model based estimations of sensor values
and computing real time residuals to detect discrepancies.
JCISE-18-1301 1 Swischuk
conditions to derive a fault index and determine the probability of a future fault for a given component. Digital twins and
machine learning approaches have been used to develop predictive deterioration models for flow valves [11]. Support vector
machines are used to predict future failures of rock bolts in mines using a data-driven approach in [12]. Smart homes are
becoming increasingly common, and with that comes the need for well functioning sensors. FailureSense [13] is a failure
detection system for electronic appliance sensors in the home. The system takes advantage of known correlations between
other sensors in the home, for example, if an appliance is turned on, it is very likely that a nearby motion sensor will be
triggered by the individual who turned on the appliance. The system utilizes Gaussian mixture models to describe typical
intervals of sensor firings and has shown success in detecting fail-stop, obstructed view and moved location failures. Other
approaches in the field of failure detection in smart home sensors include SMART [14], which uses physical redundancy to
detect failures based on multiple classifiers trained to identify the same set of activities using different sensors as features.
A reputation-based framework for sensor networks was proposed in [15] as a way to design a system of trustworthy sensor
nodes to improve security of sensor networks with a Bayesian fusion approach. All of the above works require either a
constant prediction of sensor values or previous knowledge of the relationships between sensors to make decisions on when
failures are occurring. Further, when using residual analysis, a system specific threshold must be defined, and will need to
be specified for each new application.
We propose a simple, data-driven approach, where the behavior of a failing sensor is learned from the sensor data in the
form of a feature that can efficiently be detected during operation. The behavior of a failing sensor is different depending
on the application, for example, a failing sensor may produce a constant signal, stop producing a signal, or simply provide
incorrect values. By learning the behavior of sensors directly from the data, we can avoid requiring domain knowledge of the
system, allowing for a more general framework that can be applied to any type of sensor system. The work in [16] takes a
similar approach by using neural networks to learn the behavior of a healthy sensor from data, but the approach also requires
knowledge of the physics of the particular system to be incorporated into the decision making. In our work, we also aim to learn the behavior of a healthy/faulty operating system; however, instead of learning the error signature of a sensor by directly comparing sensor measurements, we compute the autocorrelation of the sensor over time, a value which serves as a global feature of a time series. The use of autocorrelation to compare the structure of multiple time series is well established in the literature; its ability to provide a single value that can represent the dynamics of an entire portion of a time series enables quick and effective learning [17]. An interesting application of the power of autocorrelation
as a global descriptor of a time series is shown in [18], where autocorrelation of continuous wavelet coefficients is used to
detect gear failure in motorcycle gear boxes. We take a similar approach to detecting sensor failures, with the addition of a
sensor readings. The most successful of these approaches are physics-based, and often use variants of the Kalman filter
for continuous state estimation of sensor systems [19, 20]. A number of nonlinear state estimation techniques exist in the
literature, such as the unscented Kalman filter [21] and extended Kalman filter [22]. The aerospace community has a large
amount of work in the area of sensor failure detection and correction related to the on-board ADS. Using nuclear reactor
technology, the Integrated Pitot Health Monitor [23] transforms pressure readings into electronic signals and senses when
signals become static to detect obstructions in pitot tubes. It is shown in [24] that fault tolerant air data inertial reference units
(ADIRU) supported on-board an aircraft are able to calculate highly reliable ground speed readings, which in conjunction
with an accurate measure of wind speed can help define the load factor and capability of an aircraft. By sampling weather
forecasts at various altitudes and locations, an estimate of current wind speed can be found by interpolation as done in
the P.I.L.O.T.’s software designed by Klockowski et al. [25] to help correct faulty readings from compromised pitot tubes.
However, all of these techniques require knowledge of the state dynamics, a model to represent them or outside information
such as weather data.
testing stage of query-style algorithms like k-nearest neighbors (KNN) and decision trees can outperform that of neural networks. In our approach, the simple, distance-based KNN regression algorithm is used to make rapid predictions of new
sensor measurements as a replacement for a failed sensor. All sensors which exist in the system are originally considered
as candidates for making predictions, and a simulated annealing based feature selection scheme chooses the most effective
sensors to aid in making predictions, allowing the system to be explicitly data-driven.
1.3 Contributions and Outline
Employing two independent algorithms, one for failure detection and one for sensor prediction, is not a common theme
in the literature. Many approaches use a prediction algorithm to enable residual analysis for failure detection, then fall back on the predictions as replacements. Our approach aims to avoid any bias that may cause issues in the detection of failure due to the choice of threshold for monitoring the residual values, an extremely application-driven value. Training an algorithm
to perform one specific task quickly and effectively will provide a better environment for success. Further, the data-driven
approach to learning the behavior of a failing sensor as well as relationships between sensors provides a more general
framework, independent of the application. To demonstrate the framework, we have applied our data-driven failure detection
and sensor correction to the pitot static system of an aircraft. The remainder of the paper is organized as follows. Section
2 discusses the algorithms for performing our two tasks, including the feature selection and regression algorithm used for
prediction. Section 3 discusses the pitot static failure case study. Section 4 shows the application of our methodology to the
case study and the results. Finally, Section 5 provides concluding remarks and avenues for future work.
Our system uses an offline/online approach as developed in [32–35]. An illustration of the approach is shown, specific to
our aircraft application, in Figure 1. The offline state occurs prior to operation, when historical data is collected, analyzed
and stored in offline libraries. During the online state, when the system is operating, decisions about the status of sensors are
made by comparing current sensor data to historical data in an offline library. Then, in the case of a failure, a separate offline
library is used to make predictions of new sensor measurements. We begin by discussing the failure detection framework,
followed by sensor prediction.
The first step to creating a functioning failure detection scheme is to determine how a sensor behaves when it is failing.
For example, do the sensors simply quit taking measurements, or do they report incorrect measurements? If the latter, determining what the incorrect measurements look like for certain situations is important. In many cases, sensor data is reported over
time, as a time series. There are many ways to describe how a faulty sensor measurement behaves, for example, it becomes
constant, has excessive fluctuations or follows certain incorrect trends. While these are useful for humans to interpret,
we need a simple and informative way to describe the behavior of a time series to our algorithm. A common approach
to describing the shape and trend of a time series is through autocorrelation [18, 36, 37]. Autocorrelation of a stream of
measurements Y = {y_1, ..., y_n} is defined as

ac_j(Y) = \frac{\sum_{k=1}^{n-j} (y_k - \bar{y})(y_{j+k} - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2}, \qquad (1)
Fig. 1. The offline (upper) and online (lower) portions of our approach.
where j is the autocorrelation lag index and ȳ is the sample mean of the entire signal, Y [36]. To construct our offline library
of data, also known as training data in the machine learning community, we analyze historical data pertaining to the specific
application. Failure data is collected or simulated and the associated autocorrelations are computed. These autocorrelations
and their failure type are stored together in the offline library. Online, during operation, we continuously compute the
autocorrelation of the sensor we are monitoring and determine its state by making comparisons to the offline data. It is
assumed that only the predefined failure types in our offline library can occur and that their behavior will be consistent; therefore, a simple comparison of our current state to the offline failure states should provide an efficient and accurate identification of
failure.
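The offline/online comparison described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function and variable names (`autocorrelation`, `classify_state`, `max_lag`) are our own, and the offline library is reduced to a list of stored lag-value vectors with labels.

```python
import numpy as np

def autocorrelation(y, max_lag):
    """Autocorrelation lag values ac_j(Y), j = 1..max_lag, as in Eq. (1)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    ybar = y.mean()
    denom = np.sum((y - ybar) ** 2)
    if denom == 0.0:
        # A perfectly constant signal: every lag value is taken as zero.
        return np.zeros(max_lag)
    return np.array([np.sum((y[:n - j] - ybar) * (y[j:] - ybar)) / denom
                     for j in range(1, max_lag + 1)])

def classify_state(window, library_acs, library_labels, max_lag=10):
    """Label a sensor window by its nearest neighbor in the offline library."""
    ac = autocorrelation(window, max_lag)
    dists = [np.linalg.norm(ac - ref) for ref in library_acs]
    return library_labels[int(np.argmin(dists))]
```

Online, `classify_state` would be called on each new window of sensor data, with the library holding lag values computed offline from historical failure and safe data.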
We utilize an analytic redundancy technique in our work, which introduces the need to ensure that none of our redundant data could be affected by the sensor failure. In many cases, there can be a large number of redundant sensors
that have a relationship to the failing sensor and only some of them will be useful in predicting the corrected measurements.
To remedy this issue, we consider data-driven feature selection to choose which sensors are best at predicting a certain
sensor based on the historical relationship between the values. In this section, we first present a data-driven feature selection
\text{prediction error}(v, \hat{v}) = \max_t |v_t - \hat{v}_t|, \qquad (2)

where v_t is the true measurement and \hat{v}_t is the predicted measurement at time t. Minimizing this function in a data-driven
way requires finding the set of features from our training data that can produce a minimal prediction error. This results in
a combinatorial optimization problem where our solutions will be discrete sets of features. To solve this problem we must
search through the finite set of possible features and move towards sets that produce low prediction error.
This optimization problem is approached using simulated annealing. Simulated annealing was founded on the idea of
heating and cooling of materials proposed by Kirkpatrick, Gelatt and Vecchi [40]. The method randomly moves along the
objective function, calculating costs and assigning acceptance probabilities to solutions, then searches in a “neighborhood”
of these solutions to determine if there is a “nearby” solution that produces a lower cost. A neighborhood around a point \vec{x} is defined as the collection of points that share all but two components with \vec{x}. Any point in this collection can be chosen as a
nearby point. The solutions to our optimization problem are the indices of the features that minimize the error from Eq. 2.
This error is calculated by predicting the sensor measurement with the chosen set of features using a predictive model. Each
dimension size, d, is treated as its own optimization problem, where the solution domain consists of all subsets of features of
size d. The set of features that results in the lowest prediction error is chosen as the set to be used for future predictions.
The key feature selection functions are shown in Algorithm 1.
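A runnable sketch of the simulated annealing loop is given below. It assumes, as in Algorithm 1, a geometric cooling schedule T = T_initial · 0.95^k, a neighborhood that swaps two components (so d ≥ 2), and a caller-supplied cost function (in our setting, the KNN prediction error of Eq. 2); the names are our own and this is not the authors' exact implementation.

```python
import math
import random

def neighbor(solution, n_features):
    """Replace two randomly chosen components of the current feature set."""
    new = list(solution)
    a, b = random.sample(range(len(new)), 2)
    new[a] = random.randrange(n_features)
    new[b] = random.randrange(n_features)
    return new

def acceptance_probability(cost_old, cost_new, temperature):
    """Always accept improvements; accept worse moves with prob. exp(-delta/T)."""
    if cost_new < cost_old:
        return 1.0
    return math.exp((cost_old - cost_new) / temperature)

def anneal(cost, d, n_features, t_initial=1.0, cooling=0.95, n_iter=200):
    """Search subsets of d feature indices for one minimizing `cost`."""
    solution = random.sample(range(n_features), d)
    cost_old = cost(solution)
    temperature = t_initial
    for k in range(n_iter):
        candidate = neighbor(solution, n_features)
        cost_new = cost(candidate)
        if acceptance_probability(cost_old, cost_new, temperature) > random.random():
            solution, cost_old = candidate, cost_new  # accept the move
        temperature = t_initial * cooling ** k        # geometric cooling schedule
    return solution, cost_old
```

Early on, the high temperature lets the search accept worse feature sets and escape local minima; as the temperature decays, the search settles on low-error sets.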
To compute the prediction error of different sets of features, we need to define a prediction algorithm. Each of our
redundant sensors will represent a feature (an input) of the prediction algorithm and the output will be a measurement
to replace the failed sensor. To refrain from making any assumptions about the sensor data, we restrict our choices to
those that are non-parametric. Non-parametric simply implies that we do not assume a certain parametric form for our predictive function; instead, it is learned entirely from the data. Non-parametric approaches provide an environment to learn
relationships within data, allowing our approach to remain data-driven and avoid needing any domain knowledge to guide our
feature selection process. Gaussian process regression, decision tree regression and KNN are all examples of non-parametric
regression algorithms. The basic assumption of non-parametric regression is that there exists a function, f , between pairs of
points (xi , yi ) of the form
y_i = f(x_i) + \varepsilon,
where x_i ∈ R^d represents a single measurement from d redundant sensors, y_i ∈ R represents the output of our sensor of interest (non-faulty values) and ε ∈ R is a zero mean random variable representing noise. The idea is to allow the training
data to determine the form of f () as opposed to using domain knowledge to define it. Our criteria for choosing good features
relies on the choice of regression algorithm used. Since we need to learn the relationship between a large number of inputs
to a given output, this algorithm needs to be able to capture all possible trends, for example, be able to determine when
two features alone cannot predict well, but when used together cooperatively, do predict well. Due to its simplicity, we have
chosen to implement KNN regression. KNN has shown tremendous success in areas of computer vision and the collaborative
filtering approach to recommender systems [41].
The KNN regression algorithm takes a test input, finds the k nearest neighbors from the training set, and returns as its prediction the weighted average of those k neighbors' outputs. This is in essence a localized linear regression model. Our local region is defined by the number of neighbors chosen and our predicted output is defined as \hat{v} = \sum_{i=1}^{k} v_i / w_i, where v_i is the measurement associated with the ith nearest neighbor and w_i is the Euclidean distance between the test point and the ith neighbor. The
11:
12:     ap = AcceptanceProbability(cost_old, cost_new, T)
13:     if ap > random then            ▷ If accepted, update cost and solution
14:         y = y_new
15:         cost_old = cost_new
16:         i++
17:     T = T_initial · (0.95^k)       ▷ Cooling rate
18:     return y, cost_old
19: function Neighbor(y)
20:     # INPUT: y - current solution
21:     # OUTPUT: y - a solution in the neighborhood of the input (two components randomly replaced)
22:     a, b = random(0, len(y))       ▷ Randomly select which two components of y to replace
23:     x, z = random(0, n_features)   ▷ Draw two replacement feature indices
24:     y[a] = x                       ▷ New features to analyze
25:     y[b] = z
26:     return y
27: function AcceptanceProbability(cost_old, cost_new, T)
28:     # INPUT: cost_old - previous solution objective value, cost_new - current solution objective value,
29:     #        T - temperature parameter
30:     # OUTPUT: a - the acceptance probability
31:     a = exp((cost_old − cost_new) / T)
32:     return a
33: function Cost(y)
w_i(\vec{s}_i, \vec{s}_j) = \sqrt{\sum_{k=1}^{d} (s_{i,k} - s_{j,k})^2}\,.
Our training input set consists of a collection of vectors in Rd containing measurements from d sensors over a time period
and the training output set contains a collection of measurements from our sensor of interest.
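The KNN predictor described above can be sketched as follows. Note one deliberate deviation: this version normalizes by the total weight, a common variant of inverse-distance weighting, and it guards against a zero distance (an exact training match), both implementation choices of ours rather than details from the paper.

```python
import numpy as np

def knn_predict(x_test, X_train, y_train, k=3, eps=1e-12):
    """Inverse-distance-weighted k-nearest-neighbor regression."""
    d = np.linalg.norm(X_train - x_test, axis=1)   # Euclidean distances to all training points
    idx = np.argsort(d)[:k]                        # indices of the k nearest neighbors
    if d[idx[0]] < eps:                            # exact training match: return its output
        return float(y_train[idx[0]])
    w = 1.0 / d[idx]                               # closer neighbors weigh more
    return float(np.sum(w * y_train[idx]) / np.sum(w))
```

For a test point midway between two training points of a linear function, the prediction is simply their average, matching the localized-linear-regression intuition above.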
To validate KNN’s ability to determine when different levels of peer relationships exist between inputs to outputs, we
consider its performance on some sample datasets. The high dimensional model representation (HDMR) of a function,
f(x_1, x_2) = f_0 + f_1(x_1) + f_2(x_2) + f_{12}(x_1, x_2) + \varepsilon,
where f0 , f1 , and f2 represent additive effects of inputs, f12 represents cooperative effects and ε is zero mean Gaussian
noise [42]. Analysis of variance HDMR, or ANOVA-HDMR, is a particular form of this representation where [43]
f_0 = \int_0^1 \int_0^1 f(x_1, x_2) \, dx_1 \, dx_2

f_1(x_1) = \int_0^1 f(x_1, x_2) \, dx_2 - f_0

f_2(x_2) = \int_0^1 f(x_1, x_2) \, dx_1 - f_0

f_{12}(x_1, x_2) = f(x_1, x_2) - f_0 - f_1 - f_2 .
This setup allows us to easily create functions that exhibit different types of relationships in order to determine the effectiveness of KNN. The first function we create has no dependence on the inputs, g_0(x_1, x_2) = ε. Second, we define a function that depends only on x_1, g_1(x_1, x_2) = x_1 − 0.5 + ε, which is the case when f_0, f_1 ≠ 0. Lastly, we would like a function such that x_1 and x_2 individually have no correlation with the output but cooperatively are highly correlated with the output, g(x_1, x_2) = x_1 x_2 − 0.5 x_1 − 0.5 x_2 + 0.25 + ε, which is the case when f_0, f_{12} ≠ 0. Figure 2 shows the input-output relationships between each of the three test functions.
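The three test functions can be reproduced directly from the ANOVA-HDMR terms. The sketch below generates them and confirms their correlation structure; the noise level and sample size are our own choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
x1, x2 = rng.uniform(0, 1, n), rng.uniform(0, 1, n)
eps = rng.normal(0, 0.01, n)

g0 = eps.copy()                                    # no input dependence
g1 = x1 - 0.5 + eps                                # additive effect of x1 only
g12 = x1 * x2 - 0.5 * x1 - 0.5 * x2 + 0.25 + eps   # purely cooperative effect

def corr(a, b):
    """Absolute Pearson correlation."""
    return abs(np.corrcoef(a, b)[0, 1])

print(corr(x1, g1))                          # x1 alone explains g1 (near 1)
print(corr(x1, g12), corr(x2, g12))          # neither input alone explains g12 (near 0)
print(corr((x1 - 0.5) * (x2 - 0.5), g12))    # the interaction term does (near 1)
```

Note that g12 factors as (x_1 − 0.5)(x_2 − 0.5) + ε, which is why each input is individually uncorrelated with the output even though the pair determines it almost exactly.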
The options KNN has to use as features for predicting the output of these test functions are x_1 only, x_2 only, or both x_1 and x_2. We have set k = 3, the number of training points and the number of test points are both set to 1000, and the process is repeated for the same test points over 500 different training sets. For predicting g_0, there is no dependence on
(a) g_0(x_1, x_2) vs. x_1 and x_2    (b) g_1(x_1, x_2) vs. x_1 and x_2
Fig. 3. Prediction accuracy for the three functions using different subsets of the features.
an aircraft. This application has gained interest due to the number of incidents of commercial flights related to the failure
of the pitot static system. In 1995, ice began forming on an X-31 aircraft, producing incorrect airspeeds that caused the aircraft's computers to reconfigure the aircraft for lower airspeeds, leading to the crash of one of only two existing X-31s. Air France
flight 447 experienced icing of the pitot tubes that caused that aircraft to crash into the Atlantic Ocean in 2009 [44]. In 2018
a Malaysia Airlines flight reported that its airspeed indicators were not working because the covers on the pitot tubes were
not removed, causing the aircraft to make an unplanned landing [45]. In 2006 and 2013, two flights from Brisbane airport
had rejected take-offs due to wasp nests blocking pitot tubes [46]. In this section, we explain how the pitot static system of
an aircraft works, and what information it provides. We also describe two types of failures that can occur in the pitot static
system and the effects they have on airspeed. Following this, we discuss the data set in detail, analyze how the redundant
sensor data relates to airspeed and discuss how faulty sensor data was simulated.
Table 1 (excerpt): True Track Angle (degrees), Magnetic Track Angle (degrees), Phase (climb, cruise, approach)
3.1 Pitot Static System Failures
The pitot static system consists of two ports: the pitot tube and the static port, both located outside the aircraft. The pitot
tube measures total pressure and the static port measures static pressure, and together they determine an aircraft’s airspeed.
Airspeed can also be calculated analytically using Bernoulli’s equation and is proportional to the difference between these
pressures, as shown below:
V^2 = \frac{2(P_t - P_s)}{r} = \frac{2 P_d}{r} , \qquad (3)
where V is airspeed (m/s), r is air density (kg/m^3), P_t is total pressure (pascals), P_s is static pressure (pascals) and P_d is dynamic
pressure (pascals). These ports are located outside the aircraft, completely exposed and vulnerable to damage. Damage to
the pitot static system can often lead to faulty airspeed measurements and, in certain situations, this airspeed data can be lost
entirely. We consider two types of failure: (1) the pitot tube becoming blocked and (2) the static port becoming blocked.
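Eq. (3) translates directly into code. The sketch below assumes SI units throughout; the default air density of 1.225 kg/m^3 (sea level) is our assumption, not a value from the paper.

```python
import math

def airspeed(p_total, p_static, rho=1.225):
    """Airspeed (m/s) from total and static pressure (Pa) via Eq. (3).

    rho defaults to sea-level air density, 1.225 kg/m^3 (our assumption).
    """
    p_dyn = p_total - p_static            # dynamic pressure P_d = P_t - P_s
    if p_dyn < 0:
        raise ValueError("total pressure must not be below static pressure")
    return math.sqrt(2.0 * p_dyn / rho)
```

In normal operation P_t exceeds P_s by exactly the dynamic pressure, so freezing either port skews p_dyn and hence the indicated airspeed, which is the failure mode examined next.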
Consider the first case where the pitot tube becomes blocked. No air will be moving in or out of the tube, preventing any
new air pressure measurements and causing total pressure to remain constant. As the aircraft increases in altitude, air pressure
will decrease and during the pitot tube block, this will only have an effect on the static pressure. This results in a dynamic
pressure that is higher than the true value, producing an airspeed that is higher than the true value. As the aircraft decreases
in altitude, the air pressure will increase resulting in a dynamic pressure that is lower than the true value and thus an airspeed
that is lower than the true value. For the second case, when the static port is blocked, the opposite situation occurs. When
the aircraft increases in altitude the airspeed produced is lower than the true value and when the aircraft decreases in altitude,
the airspeed produced is higher than the true value. An intuitive way to detect if these blocks are occurring is to detect when
either of these pressure streams become constant. If we do detect some type of failure, our airspeed is unreliable and a new,
more accurate airspeed should be calculated. In the next section we discuss the redundant sensors that are considered for
predicting this new airspeed using data collected on-board a single type of four engine aircraft that has been made available
by NASA [47].
3.2 Dataset
The dataset used in this work consists of data collected from a four-engine commercial aircraft, made available by NASA [47]. The original dataset contained 186 different features, although much of the data, such as fire alarm data, aircraft identification numbers, and date/time information, was irrelevant to our task and was removed before analysis. After this removal, the dataset consisted of 24 sensors, shown in Table 1. We collect these sensors' measurements for multiple past flights as time series and store them in an offline library. Figure 4 shows a sample of a few sensor measurements and how they relate
(a) Body longitudinal acceleration vs. airspeed during climb. (b) Angle of attack vs. airspeed during climb.
(c) Thrust command vs. airspeed during cruise. (d) Fuel flow vs. airspeed during climb.
Fig. 4. Relationship between various sensor outputs during different portions of flight.
during flight but is shown to be related to actions the aircraft performs such as changes in thrust, angle of attack, acceleration,
etc. Further, airspeed is defined as the difference between the aircraft velocity vector (ground speed) and the wind speed
vector. The velocity can be explained by the forces applied to the aircraft, such as thrust from the engine, and the angle of
these forces. Another value related to this velocity is acceleration. We can visualize how changes in acceleration correspond
to changes in airspeed in Figure 4(a), which shows how low values of longitudinal acceleration correspond to high values
of airspeed during the climb portion of flight. The devices that measure acceleration are called accelerometers. These
accelerometers are entirely contained within the inertial navigation system and are safe from outside interference. Thus,
climb corresponds to an increase in airspeed as the aircraft gains in altitude. Figure 4(e) shows how a decrease in altitude
during approach corresponds to a decrease in airspeed as the aircraft slows to prepare for landing. The effects of wind speed
on an aircraft should also be considered. Although wind speed cannot be collected directly in flight, the drift angle can be
recorded to determine the effect of wind on the direction of flight. In addition, particularly during cruise, there are many
other sensors on an aircraft that help to capture the trends in airspeed in a nonlinear way, making the relationship difficult to
visualize.
Based on this analysis, our redundant sensor data should provide excellent predictors of airspeed. We collect the redun-
dant sensor measurements for multiple past flights and store them in a library offline. While Figure 4 shows a relationship
between aircraft sensor output and airspeed, none of these alone are capable of predicting airspeed throughout an entire
flight. A specific sensors output may be correlated with airspeed during one portion of flight, but uncorrelated during another
and cooperative effects may exist between them. As a result, our hypothesis is that airspeed can be predicted using a com-
bination of sensor outputs that are selected specifically for use during a particular portion of flight (climb, cruise, approach).
As shown in Section 2.2.2, KNN is capable of identifying any degree of cooperative effects between the sensors.
3.2.2 Simulation of Faulty Sensor Data
We manually created two offline libraries of faulty pitot static system data to simulate the two types of blocks. The
library was created by holding the values of the total or static pressure constant and recalculating airspeed using Bernoulli’s
equation (Eq. 3). Each of the blocks was simulated for 100 seconds at four different times during flight. For a pitot tube block,
we hold total pressure constant and for a static port block, we hold static pressure constant. The effects of these blocks on
airspeed can be seen in Figure 5. The following section discusses how autocorrelation of this data is computed, stored and
Fig. 5. The effects of holding static pressure (left) and total pressure (right) constant for 100 seconds during 4 sections of flight. The black
line denotes the true airspeed and the dashed red line denotes the airspeed computed from the pitot static system (using Equation 3).
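The block simulation described above (holding one pressure stream constant and recomputing airspeed from Eq. (3)) can be sketched as follows. The array layout, argument names, and fixed air density are our own simplifications of the procedure, not the authors' code.

```python
import numpy as np

def simulate_block(p_total, p_static, start, duration, blocked="pitot", rho=1.225):
    """Hold one pressure stream constant over [start, start + duration) and
    recompute the (now faulty) indicated airspeed from Eq. (3)."""
    pt = np.array(p_total, dtype=float)
    ps = np.array(p_static, dtype=float)
    window = slice(start, start + duration)
    if blocked == "pitot":
        pt[window] = pt[start]    # pitot tube block: total pressure frozen
    else:
        ps[window] = ps[start]    # static port block: static pressure frozen
    p_dyn = np.clip(pt - ps, 0.0, None)
    return np.sqrt(2.0 * p_dyn / rho)
```

During a climb, where static pressure falls, a frozen pitot tube inflates the dynamic pressure and hence the indicated airspeed, matching the behavior described in Section 3.1.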
Fig. 6. Three general ways a total pressure stream behaves during flight.
3.1. In the case of a pitot static failure, the task will be to predict a new, more reliable airspeed. We begin by analyzing the
autocorrelation of pressure data. During flight, pressure data only behaves in a few distinct ways. Figure 6 illustrates the
total pressure throughout an entire flight. From the figure, we can see that the pressure will be trending with slight amounts
of noise during takeoff and approach; noisy but somewhat steady during cruise; or constant when a block is occurring.
As discussed in Section 2.1, autocorrelation provides a useful description of the behavior of a time series. This is further
verified in Figure 7, where we see distinctive differences between the different states of the pitot static system. There is
a clear distinction between the constant signal vs. a trending or noisy signal. The autocorrelation of a constant signal will
quickly become zero as the data stream converges to the sample mean, providing an easy way to distinguish when a block is
occurring.
Offline, we treat each pressure stream separately, computing the autocorrelation lag values for total and static pressure
for both types of failures and safe flights over 100 second time windows. The lag values for static pressure are stored in
one library and the lag values for total pressure are stored in a different library. Each entry in the libraries is assigned a
label of pitot tube block, static port block or safe. The two types of blocks we consider only affect one pressure stream at a time. During a pitot tube block, only total pressure is affected and during a static port block only static pressure is affected.
Thus, during a pitot tube block, the label associated with the total pressure autocorrelation is a pitot tube block and the
corresponding static pressure autocorrelation is assigned a label of safe since this data is unaffected by the block. Similarly,
during a static port block, the total pressure autocorrelation is assigned a label of safe and the static pressure autocorrelation
is assigned a label of static port block. To detect pitot static failures online, while the aircraft is in flight, autocorrelation lag
values of incoming total and static pressure are each calculated in a moving window of size n. At each timestep, the state of
the system is determined by the labels of their nearest neighbors in the two offline libraries. A pitot tube block is reported
when the total pressure is labelled “blocked” and the static pressure is labelled “safe”. Similarly, we only indicate a static
port block when total pressure is labelled “safe” and static pressure is labelled “blocked”. All other outputs are considered
safe. Autocorrelation and comparison to the offline library can be computed online in linear time, O(tnm), where t is the
total time of flight, n is the size of the moving window, and m is the size of the offline library.
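The online decision logic (two per-stream labels combined into a single system state) can be sketched as follows. The library layout and the use of a single nearest neighbor are our own simplifications; in practice the windowing, lag count, and neighbor count would be tuned for the application.

```python
import numpy as np

def lag_values(window, max_lag=5):
    """Autocorrelation lag values of one pressure window, as in Eq. (1)."""
    y = np.asarray(window, dtype=float)
    n, ybar = len(y), float(np.mean(window))
    denom = np.sum((y - ybar) ** 2)
    if denom == 0.0:
        return np.zeros(max_lag)   # constant (blocked-looking) signal
    return np.array([np.sum((y[:n - j] - ybar) * (y[j:] - ybar)) / denom
                     for j in range(1, max_lag + 1)])

def nearest_label(ac, library):
    """library: list of (lag_values, label) pairs built offline."""
    return min(library, key=lambda entry: np.linalg.norm(ac - entry[0]))[1]

def detect(total_window, static_window, total_lib, static_lib):
    """Combine the two per-stream labels into one system state."""
    label_t = nearest_label(lag_values(total_window), total_lib)
    label_s = nearest_label(lag_values(static_window), static_lib)
    if label_t == "blocked" and label_s == "safe":
        return "pitot tube block"
    if label_t == "safe" and label_s == "blocked":
        return "static port block"
    return "safe"
```

The two-library structure mirrors the text: a block is only reported when exactly one pressure stream looks blocked while the other still looks safe.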
If an aircraft determines that a failure is occurring, it must be able to estimate new flight data and decide which sensors
to follow. The pitot static system controls the airspeed indicator, so a failure in the system will result in faulty airspeed
measurements, which must be predicted rapidly, in flight. Due to the differences in engine power necessary in different
portions of flight, before performing feature selection we first divide our offline flight data into three sections: climb, cruise, and approach. The airspeed changes at very different rates in each of these sections of flight, so a generic
feature selection over the entire flight will produce suboptimal airspeed predictions. To account for this, we perform feature
selection in each of the three sections of flight found in the training data. That is, we divide our data into three sub-datasets,
JCISE-18-1301 12 Swischuk
Fig. 7. Autocorrelation behavior of three types of data signals.
then perform simulated annealing feature selection as described in Section 2.2.1 for each of our three datasets. This results
in three sets of features, shown in Table 2, that will be used to predict airspeed during the three different sections of flight.
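The simulated-annealing subset search applied to each phase might look like the sketch below. The scoring callback, cooling schedule, and iteration budget are illustrative assumptions; the paper's Section 2.2.1 defines the actual procedure:

```python
import math
import random

def sa_feature_selection(features, score, iters=500, temp0=1.0, seed=0):
    """Search feature subsets by simulated annealing; `score` returns the
    prediction error of a subset (lower is better)."""
    rng = random.Random(seed)
    current = set(rng.sample(features, max(1, len(features) // 2)))
    cur_err = score(current)
    best, best_err = set(current), cur_err
    for i in range(iters):
        temp = temp0 * (1.0 - i / iters) + 1e-9  # linear cooling
        cand = set(current)
        cand.symmetric_difference_update({rng.choice(features)})  # toggle one feature
        if not cand:
            continue  # never evaluate the empty subset
        err = score(cand)
        # always accept improvements; accept worse subsets with Boltzmann probability
        if err < cur_err or rng.random() < math.exp((cur_err - err) / temp):
            current, cur_err = cand, err
            if err < best_err:
                best, best_err = set(cand), err
    return best, best_err
```

In practice `score` would wrap the leave-one-flight-out KNN prediction error for that phase of flight, so the annealer directly minimizes the quantity reported in Table 2.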
Table 2. Features selected for each phase of flight and the prediction error for each.

Flight Phase | Features | Max Error (knots) | Average Error (knots) | Average Percent Error (%)
Approach | Thrust Command, Thrust Target, Drift Angle, Fuel Flow | 84.5 | 23.9 | 10.8
The error values shown in Table 2 were calculated by iteratively holding out one flight from our training data and making
airspeed predictions for each second of the held out flight using the denoted set of features. This was done for all 17 flights.
The max error is the maximum prediction error (predicted minus true airspeed) over every second of predicted airspeed across all 17 flights tested. The average error and percent error were computed by averaging over all 17 flights. The predicted
airspeed was within 41 knots of the true airspeed during climb and cruise, while it reached a max error of 84.5 knots in the
approach portion. Although the approach portion of flight produced a large max error, the average performance was very
similar to that of climb and cruise. The predictions for a single flight using these features are shown in Figure 8.
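The hold-one-flight-out evaluation described above can be sketched as below. The data layout (one feature matrix and airspeed vector per flight) and the value of k are illustrative assumptions:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=5):
    """KNN regression: average the airspeeds of the k nearest training rows."""
    dists = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argsort(dists)[:k]].mean()

def leave_one_flight_out(flights, k=5):
    """Hold out each flight in turn, predict its airspeed from the rest,
    and report the max and mean absolute error over all held-out seconds."""
    errors = []
    for i, (X_test, y_test) in enumerate(flights):
        X_tr = np.vstack([X for j, (X, _) in enumerate(flights) if j != i])
        y_tr = np.concatenate([y for j, (_, y) in enumerate(flights) if j != i])
        preds = np.array([knn_predict(X_tr, y_tr, x, k) for x in X_test])
        errors.append(np.abs(preds - y_test))
    errors = np.concatenate(errors)
    return errors.max(), errors.mean()
```

Running this once per phase of flight, with the phase's selected features as the columns of each X, yields the max and average errors of the kind reported in Table 2.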
Fig. 8. Airspeed prediction for a single flight (airspeed in knots vs. time in seconds).
The performance of the overall system can be found in Figure 9. In this figure, the top plot shows the error detection
system, which indicates a pitot tube block (PTB), a static port block (SPB), or safe. Two 300-second pitot tube blocks and one 300-second static port block were simulated during flight, and these blocks were detected within 20 seconds of the time
they were encountered. The true time frame of each block is denoted by the shaded segments. Whenever this system predicts a block is occurring, a new airspeed is predicted; otherwise, airspeed is not actively predicted. The bottom plot shows the
error in the airspeed produced by the pitot static system compared to the predicted airspeed during each of the blocks. Note
that when the system does not indicate a block, airspeed is not being predicted.
When the aircraft is changing altitude, error in airspeed produced by a blocked pitot static system can quickly reach
above 100 knots. Figure 10 illustrates how quickly the airspeed error compounds with altitude,
which shows the error produced by the pitot static airspeed and the predicted airspeed for a pitot tube block lasting 510
seconds. This error jumps well above the error produced by our prediction method and results in a very unsafe situation for
library, the aircraft is able to accurately identify what failures, if any, are occurring within the pitot static system. Using
manually produced errors in our test flights, our detection system was able to identify the correct failure mode within 20
seconds of the initial block. Feature subset selection was performed on the large number of redundant sensors for airspeed
prediction. From previous flight data, we collect and store the output of those selected features in an offline library. Then in
flight, the aircraft monitors only those features and uses them for airspeed predictions. Using KNN regression, the aircraft
is able to predict airspeed within 41 knots of the true airspeed during the climb and cruise portions of flight. When a block
is encountered during flight, the fault in the airspeed readings will progressively worsen. Having an airspeed prediction that
can be incorrect by up to 41 knots is not ideal, but a block in the pitot static system can quickly produce airspeed error of
over 50 knots in only 90 seconds. One possibility for an increase in accuracy would be to use more flight data. This may
increase the presence of certain trends and help reduce the effects of abnormalities in the small number of flights currently
used.
This work can be generalized to various other sensors. First, we must be able to define the behavior of a failing sensor and determine which measurements it affects. With this information, a system can learn to classify these failures and detect
them online by referencing an offline library of sensor failure signatures. Once this is completed, a set of independently
functioning sensors can be selected that have some relationship to the unreliable sensor we would like to correct. Using these
redundant sensor measurements as the features of our dataset, feature selection can be performed and the faulty data can be
corrected using a prediction model. An interesting extension to the offline phase of failure detection would be to learn the
Fig. 9. Upper: Detection of two pitot tube blocks (PTB) and one static port block (SPB). Lower: Airspeed error. Shaded columns denote actual duration of block.
Fig. 10. Error produced by the pitot static and predicted airspeeds as a function of altitude for a pitot tube block.
failures through clustering, or some other unsupervised method, instead of manually labeling the offline library based on prior knowledge of the failure type.
Future work should be done in analyzing the responsiveness and accuracy required to maintain operational safety. A
more in-depth look at the sensor data that we have available in flight may provide better predictions of airspeed. While
many other regression algorithms would perform well in online prediction, KNN was chosen for its simplicity, speed, and the minimal assumptions it makes about the data. A useful extension of this work would be learning to detect partial blocks, making it possible to predict whether complete blocks may happen in the future and to incorporate preventative measures. This step would
Acknowledgements
This work was supported by FA9550-16-1-0108, under the Dynamic Data-Driven Application Systems Program, Pro-
gram Manager Erik Blasch.
References
[1] Zhao, C., and Fu, Y., 2015. “Statistical analysis based online sensor failure detection for continuous glucose monitoring
in type I diabetes”. Chemometrics and Intelligent Laboratory Systems, 144, pp. 128–137.
[2] ElHady, N., and Provost, J., 2018. “A systematic survey on sensor failure detection and fault-tolerance in ambient
assisted living”. Sensors, 18(7), p. 1991.
[3] Jiang, L., 2011. Sensor fault detection and isolation using system dynamics identification techniques.
[4] Isermann, R., 1984. “Process fault detection based on modeling and estimation methods—a survey”. Automatica, 20(4), pp. 387–404.
[5] Frank, P. M., 1990. “Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy: A survey
and some new results”. Automatica, 26(3), pp. 459–474.
[6] Willsky, A. S., 1976. “A survey of design methods for failure detection in dynamic systems”. Automatica, 12(6),
pp. 601–611.
[7] Zolghadri, A., 2012. “Advanced model-based FDIR techniques for aerospace systems: Today challenges and opportunities”. Progress in Aerospace Sciences, 53, pp. 18–29.
[8] Napolitano, M. R., Neppach, C., Casdorph, V., Naylor, S., Innocenti, M., and Silvestri, G., 1995. “Neural-network-
based scheme for sensor failure detection, identification, and accommodation”. Journal of Guidance, Control, and
Dynamics, 18(6), pp. 1280–1286.
[9] Liu, L., Liu, D., Zhang, Y., and Peng, Y., 2016. “Effective sensor selection and data anomaly detection for condition
[11] Pandya, D., Suursalu, S., Lam, B., and Kwaspen, P., 2018. “Predicting valve failure with machine learning.”. In
Chemical Engineering Progress, Vol. 114, p. 26.
[12] Jiang, P., Craig, P., Crosky, A., Maghrebi, M., Canbulat, I., and Saydam, S., 2018. “Risk assessment of failure of rock
bolts in underground coal mines using support vector machines”. Applied Stochastic Models in Business and Industry,
34(3), pp. 293–304.
[13] Munir, S., and Stankovic, J. A., 2014. “Failuresense: Detecting sensor failure using electrical appliances in the home”.
In Mobile Ad Hoc and Sensor Systems (MASS), 2014 IEEE 11th International Conference on, IEEE, pp. 73–81.
[14] Kapitanova, K., Hoque, E., Stankovic, J. A., Whitehouse, K., and Son, S. H., 2012. “Being smart about failures:
assessing repairs in smart homes”. In Proceedings of the 2012 ACM Conference on Ubiquitous Computing, ACM,
pp. 51–60.
[15] Ganeriwal, S., Balzano, L. K., and Srivastava, M. B., 2008. “Reputation-based framework for high integrity sensor
Networks, 1996., IEEE International Conference on, Vol. 4, IEEE, pp. 2270–2275.
[17] Lin, J., Williamson, S., Borne, K., and DeBarr, D., 2012. “Pattern recognition in time series”. Advances in Machine
Learning and Data Mining for Astronomy, 1, pp. 617–645.
[18] Rafiee, J., and Tse, P., 2009. “Use of autocorrelation of wavelet coefficients for fault diagnosis”. Mechanical Systems
and Signal Processing, 23(5), pp. 1554–1572.
[19] Rhudy, M. B., Fravolini, M. L., Gu, Y., Napolitano, M. R., Gururajan, S., and Chao, H., 2015. “Aircraft model-
independent airspeed estimation without pitot tube measurements”. IEEE Transactions on Aerospace and Electronic
Systems, 51(3), pp. 1980–1995.
[20] Rhudy, M. B., Fravolini, M. L., Porcacchia, M., and Napolitano, M. R., 2019. “Comparison of wind speed models
within a pitot-free airspeed estimation algorithm using light aviation data”. Aerospace Science and Technology.
in building technologies”. In 12th International Conference on Machine Learning and Applications (ICMLA), IEEE,
pp. 305–308.
[28] Ha, J.-H., Kim, Y.-H., Im, H.-H., Kim, N.-Y., Sim, S., and Yoon, Y., 2018. “Error correction of meteorological data
obtained with Mini-AWSs based on machine learning”. Advances in Meteorology, 2018.
[29] Napolitano, M. R., An, Y., and Seanor, B. A., 2000. “A fault tolerant flight control system for sensor and actuator
failures using neural networks”. Aircraft Design, 3(2), pp. 103–128.
[30] Bentley, J. L., 1975. “Multidimensional binary search trees used for associative searching”. Commun. ACM, 18(9),
pp. 509–517.
[31] Omohundro, S. M., 1989. Five balltree construction algorithms. Tech. rep.
[32] Allaire, D., Chambers, J., Cowlagi, R., Kordonowy, D., Lecerf, M., Mainini, L., Ulker, F., and Willcox, K., 2013. “An
offline/online dddas capability for self-aware aerospace vehicles”. Procedia Computer Science, 18(0), pp. 1959 – 1968.
2013 International Conference on Computational Science.
[33] Burrows, B. J., and Allaire, D., 2017. “A comparison of naive bayes classifiers with applications to self-aware aerospace
vehicles”. In 18th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, p. 3819.
[34] Burrows, B., Isaac, B., and Allaire, D. L., 2016. “A dynamic data-driven approach to multiple task capability estimation
for self-aware aerospace vehicles”. In 17th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference,
p. 4125.
[35] Burrows, B. J., Isaac, B., and Allaire, D., 2017. “Multitask aircraft capability estimation using conjunctive filters”.
Journal of Aerospace Information Systems, pp. 625–636.
[36] Box, G. E., Jenkins, G. M., Reinsel, G. C., and Ljung, G. M., 2015. Time series analysis: forecasting and control. John
[40] Kirkpatrick, S., Gelatt, C. D., and Vecchi, M. P., 1983. “Optimization by simulated annealing”. Science, 220(4598),
pp. 671–680.
[41] Jannach, D., Zanker, M., Felfernig, A., and Friedrich, G., 2010. Recommender systems: An introduction. Cambridge University Press.
[42] Sobol, I. M., 2003. “Theorems and examples on high dimensional model representation”. Reliability Engineering &
System Safety, 79(2), pp. 187–193.
[43] Sobol, I. M., 1993. “Sensitivity estimates for nonlinear mathematical models”. Mathematical modelling and computa-
2012.
[45] Preliminary report: Airspeed indication failure on take-off involving Airbus A330, 9M-MTK, Brisbane Airport, Queensland.