D37
1
INDEX
1. Introduction
2. Objectives
3. Data
3.1 Data Scraping
3.2 Data Description
4. Model Approach
5. Exploratory Data Analysis
5.1 Network Graph
5.2 Radar Plot
5.3 Distribution Analysis
5.4 Feature Relationships
5.5 Weather Contributions
6. Time Series Analysis
6.1 Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF)
6.2 FFT Analysis
7. Feature Engineering
7.1 Flight Record Encoding
8. Model Training
8.1 Using Time Series Models
8.2 Using ML Models
8.3 Using Flight Record Encoder
8.4 Using Neural Networks
9. Explainability
10. Flight Rescheduling
11. Results
12. Conclusion
Annexure
2
INTRODUCTION
Airlines face a complex challenge in maximizing profitability while managing multiple
operational areas, such as flight scheduling, ground crew efficiency, fuel management, and
passenger services. Given strict regulations and fluctuating fuel costs, predicting these delays
and optimizing these processes is essential for minimizing delays, improving turnaround
times, and boosting overall performance. Airlines must carefully plan maintenance schedules
to minimize downtime without compromising safety. Balancing preventive maintenance with
revenue-generating flights is delicate.
OBJECTIVES
1. Predict Flight Delays: Develop predictive models to forecast potential delays in
flight schedules, enabling proactive resource adjustments to minimize disruptions.
2. Optimize Flight Rescheduling: Design a rescheduling model that minimizes
cumulative delays by considering constraints like aircraft availability, crew schedules,
and regulatory requirements.
3. Enhance Operational Efficiency: Use data-driven techniques to improve overall
turnaround times and reduce fuel consumption without compromising safety or
service quality
DATA
3.1 Data Scraping Process
● Monthly data on flight operations, delays, and airline ratings of the different airlines were
extracted in CSV file format from the Bureau of Transportation Statistics (BTS) website:
https://www.transtats.bts.gov.
● For extra weather data, Python packages such as Geopy, OpenCage, and Meteostat were
utilized. These tools were used to gather longitude and latitudes of the airports, to further
extract location-specific weather details, including average temperature, atmospheric
pressure, wind speed, and precipitation, from reliable meteorological sources.
3.2 Data Description
Dataset Overview:
3
● We have used the dataset corresponding to flight schedule and delay delays for the
United States of America.
● Number of records (rows):We have a total of 5,48,685 rows,which signify the
monthly data over a period from 2022 to 2024.
Data Variables:
● List of variables(column) in the dataset: [‘Year’, ’FlightDate’,
‘DOT_ID_Reporting_Airline’, ’Tail_Number’, ’Flight_Number_Reporting_Airline’,
‘OriginAirportID’, ’OriginCityName’, ’OriginWac’, ’DestAirportID’,
’DestCityName’, ‘DestWac’, ’CRSDepTime’, ’DepTime’, ’DepDelayMinutes’,
‘DepartureDelayGroups’, ’TaxiOut’, ’WheelsOff’, ’WheelsOn‘, ’TaxiIn’,
’CRSArrTime’, ‘ArrTime’, ’ArrDelayMinutes’, ’Diverted’, ’CRSElapsedTime’,
’ActualElapsedTime’, ‘AirTime’, ’FlightsDistance’, ’CarrierDelay’, ‘WeatherDelay‘,
’NASDelay’, ‘SecurityDelay’, ’LateAircraftDelay’, ‘DivAirportLandings’,
‘CancellationCode_encoded’, ‘index’, ’tavg’, ‘tmin’, ’tmax’, ‘prcp’, ‘snow’, ’wdir’,
‘wspd’, ’wpgt’, ‘pres’, ’tsun’, ‘airline_ratings’].
MODEL APPROACH
4
EXPLORATORY DATA ANALYSIS
5.1 Network Graph
Network Graph of Delayed Flights
Description: The network graph shows how the
airports (yellow coloured nodes) are connected via
flight routes. It has been shown on the map of the USA
for clearer understanding.
Observations:
1. The colors of the edges between two nodes depict the average delay along those routes.
a.The purple-colored edges denote a delay between 0-50 minutes.
b.The pink edges show an average delay of 50-100 minutes.
c.The yellow edge denotes a delay of more than 200 minutes.
5
5.2 Radar Plot
Description: A radar plot (or spider plot), showcasing the magnitudes by which different
features affect the value of Departure Delay has been shown. Observations:
1.For each flight, the factors affecting departure delay differ. So, K-Means Clustering is used
to cluster the data into different clusters which have different factors affecting the departure
delays with different weights.
2. An optimum value of 7 clusters was chosen using the elbow method.
3. In Cluster 0, as expected, arrival delay, arrival time, and departure time are significant in
predicting departure delays.
4. A unique observation is that the Destination Airports, Reporting Airlines and Aircrafts are
of higher importance. Similarly, Clusters 1 to 6 are given in the Annexure.
6
5.3 Distribution Analysis
Description: The distribution of departure delay has been shown. Observations:
1. More than 2,50,000 flights faced a delay between 0 to 100 minutes.
2. A very small percentage of flights got delayed by more than 100 minutes.
5.4 Feature Relationships
7
Description: This plot visualizes the relationships between ArrDelayMinutes (arrival delay),
‘tavg’ (average temperature), ‘prcp’ (precipitation), snow (snowfall), and ‘wspd’ (wind
speed).
Observations:
1. Each subplot shows the scatter plot for every pair of variables, highlighting potential
correlations or trends between flight delays and weather factors.
2. The diagonal plots display kernel density estimates (KDE) for each individual
variable, indicating their distributions.
5.5 Weather Contributions
Description: We can see
that the departure delays
are less when the
atmospheric pressure is
within the normal range
(1000 HectoPascals to
1020 HectoPascals), and
slowly increases as the
pressure increases above
the normal range or
decreases below it.
Description: The
departure delays are
less when the
temperature ranges
between 0℃ to 28℃.
The delay in flights
increases as the
temperature goes
beyond normal ranges.
8
TIME SERIES ANALYSIS
6.1 Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF):
The ACF and PACF graphs for scheduled departure time are shown. They show no correlation of
delay with the input features. Hence, there is no seasonality trend followed. The plots for other
features are given in the Annexure.
6.2 Fast Fourier Transforms Analysis
Fast Fourier Transforms are applied on the dataset to find any trends in the frequency
domain. As can be seen above, there are no significant trends.
9
FEATURE ENGINEERING
7.1 Flight Record Encoding
The flight record encoder is used to encode the features of the target flight record Rj and the
features related with the delay of its pre-order flight Fk. The inputs of the flight record
encoder are: the flight record Rj = < Fj,Wj >, the arrival delay dkt of the pre-order flight Fk,
and the pre-order flight gap g(Fk → Fj) between the scheduled departure time tak of Fj and the
actual arrival time of flight Fk.
The features R , dkt and g(Fk → Fj) can be separated into discrete and continuous features. As
shown in the figure, the flight record encoder encodes the discrete features into
high-dimensional dense vectors and then concatenates the embeddings of discrete features
with the continuous features. The concatenation is fed into a fusion layer to obtain the
representation of the flight record.
10
Evaluating Prediction Accuracy with MAE Score
MAE SCORE: Mean Absolute Error (MAE) is to predict continuous numerical values. It
measures the average absolute difference between the predicted values and the actual target
values. Unlike other metrics, MAE doesn’t square the errors, which means it gives equal
weight to all errors, regardless of their direction. This helps to understand the magnitude of
errors without considering whether they are overestimations or underestimations.
MAE is particularly well suited for predicting delay in flight departures, due to its
proficiency in quantifying the magnitude or errors in the predictive model. It quantifies the
prediction inaccuracy by considering both the size and direction of errors. Its suitability for
flight delay prediction arises from its capability to explain the error in the units of delay,
making the performance of the model more understandable.
Train-Test Split: Our dataset contains flight schedule and delay details starting from
01-01-2022 to 31-08-2024. For the evaluation of our model, we have used the data
corresponding to 2022 and 2023 as the training set, and the data from 2024 as the testing set.
The testing set contains 309 new routes, 6 new airports and 149 new aircrafts that were not
present in the training set. Thus, our prediction model generalizes well and can perform well
on data containing new routes, airports, and aircrafts.
MODEL TRAINING
8.1 Using Time Series Models
Utilized Time Series Models to predict the departure delays, focusing on datasets with
frequency encoded features. Models such as ARIMA and SARIMAX, as expected, gave poor
performances, owing to the lack of seasonal trend, as discussed in Exploratory Data Analysis
(Section 4).
MODEL MAE
ARIMA 50.4102
SARIMAX 73.9801
11
8.2 Using ML Models
Step 1: Individual feature prediction (X):
● For each feature column, we apply machine learning algorithms separately.
● The outcome of this step is the predicted values of each feature, which becomes our
new set of feature variables (X_pred).
Step 2: Departure delay prediction
● With the predicted feature values (X_pred), we subsequently apply another predictive
model or technique to predict the departure delays (y) based on these estimated
feature values.
We can observe that Machine Learning models give very high values of Mean Absolute
Errors. This could be due to the lack of flexibility in the models, due to which complex
non-linear relationships in the data could not be captured. ML models rely upon feature
engineering to a large extent.
MODEL MAE
LINEAR 53.1465
DECISION TREE 73.9801
ELASTICNET 53.5216
EXTRATREES 56.0891
RIDGE 51.7051
LGBM 53.4285
RANDOM FOREST 49.3948
LASSO 51.3386
XGBOOST 50.6907
GRADIENT BOOSTING 48.4953
12
8.3 Using Flight Record Encoder
Step 1: Individual feature prediction (X):
● The categorical columns like Tail Number, Origin City and Destination City were
converted to embeddings, as discussed in Feature Engineering (Section 7)
● For each feature column, we apply neural networks separately, and the outcome of
this step is the predicted values of each feature, which becomes our new set of feature
variables (X_pred).
Step 2: Departure delay prediction
● Since Convolutional Neural Network gives the lowest MAE score among all the
models, it is used to predict the features (X_pred).
● With the predicted feature values (X_pred), we subsequently apply another neural
network model to predict the departure delays (y) based on these estimated feature
values.
MAE Scores of Neural Networks for Step 1
MODEL MAE
LSTM 9.2648
CNN 8.0155
DNN 9.8621
CNN-LSTM 11.9860
GRU 9.8151
RESNET50 12.0592
13
MAE Scores of Neural Networks for Step 2
MODEL MAE
LSTM 24.0982
CNN 25.5936
DNN 26.5173
CNN-LSTM 25.4449
GRU 23.1388
RESNET50 22.5774
DL models gave better MAE values. As compared to ML models, Deep neural networks
perform better on large datasets.
Employing Flight Record Encoding on categorical features further improves the MAE scores
and reduces it to X, due to its proficiency in capturing the categorical data and their
relationships with the delay in departure time.
8.3 Using Neural Networks
Several Neural Networks were trained and evaluated using the 2-step procedure, as described
while using Machine Learning Models (Section 8.2).
MODEL MAE
CNN 26.1384
LSTM 23.9869
DNN 21.1841
CNN-LSTM 21.1173
GRU 24.5893
14
MODEL MAE
RESNET50 21.4540
The CNN-LSTM model is ideal for this dataset because it captures both spatial and temporal
dependencies, which are critical for accurately predicting delays. The convolutional layers in
the CNN component excel at extracting patterns from the feature set, isolating important
predictors like weather conditions, airport factors, and operational delays. This enhances the
feature representation before passing it to the LSTM component. The LSTM, designed for
sequential data, is well-suited to model the temporal dependencies in flight delay patterns,
such as cascading delays and seasonal trends. By combining CNN for feature extraction and
LSTM for sequence learning, this hybrid model can capture complex interactions and
time-dependent relationships that improve delay prediction accuracy over simpler models.
Additionally, it can leverage variations in daily and seasonal patterns, which are prevalent in
flight delay data.
FLIGHT RESCHEDULING
9.3 Simple Genetic Algorithm
A Simple Genetic Algorithm (SGA) is an optimization method inspired by the principles of
natural selection and genetics. SGA has been employed for rescheduling the flight, as it is
effective for scheduling and delay minimization tasks. With numerous constraints such as
aircraft availability, crew scheduling, and regulatory requirements, SGA can efficiently
explore a large search space of possible schedules by iteratively refining solutions. SGA’s
adaptability and explorations allow the algorithm to generate feasible, optimized schedules
that consider both immediate and future scenarios.
1. The algorithm uses a fitness function, to calculate the total delay for a given schedule,
given the scheduled and rescheduled times.
2. A random sample of 20 possible flight schedules is created using Population
Initialization.
3. The tournament selection method, using 3 tournaments, is used to choose parents for
crossover, which selects the best individuals from a subset.
15
4. The single-point crossover generates a new child schedule by combining two parent
schedules. A simple swap mutation is applied with a 0.1 mutation rate.
5. The algorithm runs for 100 generations, continually evolving the population to
improve fitness.
The optimization problem was solved for the entire test dataset (1st January 2024 to 31st
August 2024), last month of the test dataset (August 2024), and last week of the test dataset
(25th August 2024 to 31st August 2024). The total amount of initial delay, delay after
optimization (in minutes) and percentage improvement has been shown in the table below.
DATASET SIZE INITIAL OPTIMIZED IMPROVEMENT
DELAY DELAY (%)
(mins) (mins)
1 week 838 58,166 106 99.82
1 month 4322 3,06,241 3,100 98.99
8 months 32224 22,78,199 28,350 98.76
RESULTS AND CONCLUSION:
16
In the final analysis of our models' performance, the Mean Absolute Error
(MAE) scores for different approaches have been compared. The following
table displays the MAE scores for the top performing models:
1. Network graphs depict how the airports (yellow coloured nodes) are
connected via flight routes. We infer from it that most of the delays
are of less than 50 mins.
2. Radar plots derived from clustering show the impact of various
features on delay time. This helps us understand which features are
essentially important in predicting the delay.
3. The CNN-LSTM model captures spatial and temporal dependencies,
making it ideal for accurately predicting delays in complex,
time-sequenced data.
4. The test data contains 309 new routes, 6 new airports and 149 new
aircrafts that were not present in the training data. An MAE of 5
mins on the testing data shows that the model generalizes well on
new routes, airports and aircrafts.
5. A Simple Genetic Algorithm has been used to reschedule flights to
minimize delay, achieving a 98.8% reduction in delay within the test
dataset.
17
ANNEXURE:
18
Effect of Categorical Features on Departure Delay
19
Flights and Flight Delay Distribution across the Year
20
Effect of Weather Parameters on Departure Delay
21
Box Plots and Scatter Plots Suggesting Coherency between Departure and Arrival Delays
22
Effect of Categorical Features on Arrival Delay
23
Elbow Method To Find K and perform K-Means Clustering for Radar Plot
24
Radar Plots for Clusters 1 to 6
25