
Forecasting Stability Categories Using Neural Networks
Steevo Xavier

Abstract—A feed-forward neural network is used to construct a prediction model for forecasting stability categories. Exploratory data analysis is performed to understand the provided data.

Index Terms—Forecasting stability categories using neural networks, Deep Learning, EDA

I. Introduction

Deep Learning, a subset of machine learning, has revolutionized various industries by enabling computers to learn and make decisions from complex and unstructured data. At its heart are neural networks: interconnected layers of artificial neurons that perform specific computations on the input data and pass the results through the network to generate an output.

A remarkable feature of deep learning is its ability to automatically extract relevant features from raw data, eliminating the need for manual feature engineering, which can be difficult when working with high volumes of data such as images or text.

II. LIBRARIES USED

Several libraries and frameworks were utilized to perform various tasks such as data manipulation, model construction, visualization, and evaluation. The following is a list of key libraries used throughout the project:

• pandas: A versatile data manipulation library used for loading, cleaning, and organizing datasets. It provides data structures for efficient data handling and transformation.

• numpy: A fundamental library for numerical computations in Python. It offers support for arrays and matrices, enabling efficient mathematical operations on large datasets.

• matplotlib: A popular data visualization library that aids in creating various types of plots and charts to visually represent data distributions, relationships, and trends.

• seaborn: Built on top of matplotlib, seaborn is another data visualization library that provides an interface for creating aesthetically pleasing and informative statistical graphics.

• sklearn (scikit-learn): A comprehensive machine learning library that offers tools for data preprocessing, model selection, training, and evaluation. It includes various algorithms for classification, regression, clustering, and more.

• keras: A high-level neural network library built on top of TensorFlow and Theano. It simplifies the process of constructing neural network architectures and facilitates rapid experimentation.

• tensorflow: An open-source deep learning framework developed by Google. It provides a flexible platform for building and training various types of neural networks.

• ann_visualizer: A library that aids in visualizing the architecture of neural networks, including activations and connections between layers.

Figure 1: Libraries Used

III. DATA ANALYTIC STEPS

Data analytics involves extracting valuable insights and knowledge from raw data to make informed decisions. To predict stability categories using neural networks, the following data analytics steps were performed on the data:

a. Data Loading and Understanding
b. Data Preprocessing
c. Exploratory Data Analysis (EDA)
d. Data Preparation
e. Neural Network Architecture
f. Model Compilation and Training
g. Model Evaluation and Visualization
h. Evaluation Metrics

A. Data Loading and Understanding


The necessary libraries, such as pandas, numpy, and matplotlib, were imported first. The training and testing datasets were then loaded using the read_csv function from pandas. The info() method provided an overview of the datasets, including the number of entries, data types, and memory usage.
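A minimal sketch of this loading step, assuming illustrative file names for the training and testing CSVs:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# File names are placeholders; substitute the actual dataset paths
train_df = pd.read_csv('train.csv')
test_df = pd.read_csv('test.csv')

# Overview of entries, data types, and memory usage (see Figure 2)
train_df.info()
test_df.info()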

Figure 2: Dataframe details

B. Data Preprocessing

Data preprocessing is crucial to ensure data quality and suitability for analysis. The datasets were examined for missing values using the isnull().sum() method. Any missing values were addressed by techniques such as dropping rows or imputing values, which ensured that the data was ready for exploration and modelling. We use the dropna function to drop the rows with NaN values at a later stage of the analysis, as we have enough records for training and testing the model.

Figure 3: Details of the Null values in the train and test dataset
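A short sketch of the missing-value check and row dropping described above, assuming the train_df and test_df dataframes from the loading step:

# Count missing values per column in both datasets (see Figure 3)
print(train_df.isnull().sum())
print(test_df.isnull().sum())

# Rows containing NaN values are dropped later in the analysis
train_df = train_df.dropna()
test_df = test_df.dropna()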

C. Exploratory Data Analysis (EDA)

EDA involves analyzing and visualizing data to gain insights into its characteristics. Various visualizations were created to understand the distribution of stability categories, explore feature distributions, and examine correlations between numerical features. Histograms, count plots, heatmaps, and descriptive statistics were used to uncover patterns and relationships in the data.

Figure 4: Stability Category Distribution
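The EDA plots can be reproduced along the following lines; a hedged sketch that assumes the target column is named 'SCMstability_category' and uses train_df from above:

import seaborn as sns
import matplotlib.pyplot as plt

# Distribution of the target stability categories (Figure 4)
sns.countplot(x='SCMstability_category', data=train_df)
plt.title('Stability Category Distribution')
plt.show()

# Histograms of the numerical features (Figure 6)
train_df.hist(figsize=(12, 8))
plt.show()

# Correlation heatmap of the numerical features (Figure 7)
corr = train_df.select_dtypes(include='number').corr()
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.show()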

The correlation matrix gives us insight into the relationships between the features in the dataset. Through the EDA we gained an understanding of the data and of the features of our dataset.

Figure 5: Box plot for the risk indexes

Figure 6: Feature Distribution

Figure 7: Correlation Matrix

D. Data Preparation

The data was divided into input features and target labels. Categorical labels were converted to one-hot encoded format using the to_categorical function from Keras. Unnecessary columns, such as 'Timestamp' and 'SCMstability_category', were removed from the datasets to isolate relevant features. The rows containing null values were deleted using the dropna function.

Figure 8: Code for Data Preparation
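Figure 8 shows the project's own code; a minimal sketch of the preparation described above, assuming the column names given in the text and that the label column is already integer-encoded:

from keras.utils import to_categorical

# Separate features and labels; column names follow the text above
X_train = train_df.drop(columns=['Timestamp', 'SCMstability_category'])
X_test = test_df.drop(columns=['Timestamp', 'SCMstability_category'])

# One-hot encode the categorical labels
y_train = to_categorical(train_df['SCMstability_category'])
y_test = to_categorical(test_df['SCMstability_category'])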

The X_train dataframe contains the features of our data for training the model.

Figure 9: X_Train Dataframe

The Y_train dataframe contains the categorical data, which has been converted to one-hot encoded labels.

Figure 10: Y_train Dataframe array



E. Neural Network Architecture

A neural network model was constructed using the Keras library. The architecture included hidden layers with ReLU activation functions and an output layer with softmax activation. The model's summary and architecture visualization were generated to understand its structure. The code for the construction of the neural network is given below; a feed-forward neural network (FFNN) is used for the model.
# Build the feed-forward neural network
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
# Two hidden layers with ReLU activation
model.add(Dense(64, activation='relu', input_dim=X_train.shape[1]))
model.add(Dense(32, activation='relu'))
# Softmax output layer with one unit per stability category
model.add(Dense(y_train.shape[1], activation='softmax'))
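The model summary (Figure 11) and model plot (Figure 12) shown below can be produced with Keras utilities; a minimal sketch, where the output file name is illustrative and plot_model requires pydot and graphviz to be installed:

from keras.utils import plot_model

# Textual summary of layers and parameter counts (Figure 11)
model.summary()

# Graphical plot of the architecture (Figure 12); the file name is an assumption
plot_model(model, to_file='model_plot.png', show_shapes=True)

# ann_visualizer rendering of the network (cf. Figure 13); treat this call as illustrative
from ann_visualizer.visualize import ann_viz
ann_viz(model, title='Stability Category FFNN')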
Figure 11: Model Summary

Figure 12: Model Plot

Figure 13: Neural Network Model

The neural network comprises an input layer, two hidden layers, and one output layer. There are 5 input features and 5 output categories. Each of the 5 input neurons is connected to the 64 neurons of the 1st hidden layer, so the 1st hidden layer receives 5 inputs and produces 64 outputs. These 64 outputs are connected to the 32 neurons of the 2nd hidden layer, which therefore receives 64 inputs from the 1st hidden layer and produces 32 outputs. The output layer receives these 32 inputs from the 2nd hidden layer and has 5 outputs, corresponding to the 5 stability categories.
F. Model Compilation and Training

The neural network model was compiled using an optimizer (Adam) and a loss function (categorical cross-entropy). The model was then trained using the training data. Techniques like early stopping and learning rate reduction were applied using Keras callbacks to prevent overfitting and optimize training.

Figure 14: Compilation and Training of the model
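Figure 14 shows the project's own code; a sketch of the compilation and training step with the callbacks described, where the hyperparameter values (epochs, batch size, patience, validation split) are assumptions:

from keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Adam optimizer with categorical cross-entropy loss and accuracy metric
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Callbacks to prevent overfitting and to reduce the learning rate on plateaus
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5)

history = model.fit(X_train, y_train,
                    validation_split=0.2,
                    epochs=100,
                    batch_size=32,
                    callbacks=[early_stop, reduce_lr])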



G. Model Evaluation and Visualization

Model evaluation involved various steps, sketched in code after the figures below:

• ROC Curve: An ROC curve was plotted to visualize the true positive rate vs. the false positive rate for a specific class.

Figure 15: ROC curve

• Accuracy and Loss Curves: Training and validation accuracy and loss curves were plotted using matplotlib to track model performance during training. The accuracy of the model was around 98 percent.

Figure 16: Model Accuracy Train VS Test

Figure 17: Model Loss Train VS Test
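A sketch of the ROC curve and accuracy/loss plots listed above, assuming the model, history, X_test, and y_test objects from the earlier sketches; the class index used for the ROC curve is illustrative:

from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt

y_prob = model.predict(X_test)

# ROC curve for one chosen class (Figure 15); index 0 is illustrative
fpr, tpr, _ = roc_curve(y_test[:, 0], y_prob[:, 0])
plt.plot(fpr, tpr, label='AUC = %.3f' % auc(fpr, tpr))
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.legend()
plt.show()

# Training vs. validation accuracy and loss (Figures 16 and 17);
# older Keras versions use the keys 'acc' and 'val_acc'
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.legend()
plt.show()

plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.legend()
plt.show()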
H. Evaluation Metrics

Further evaluation metrics were calculated, including the confusion matrix, F1 score, and accuracy. These metrics provided a comprehensive understanding of the model's performance. Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) were also plotted to assess the error values of the model.

Figure 18: Confusion Matrix & F1 Score

Figure 19: MSE & RMSE Values
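A sketch of these metrics using scikit-learn, assuming y_test holds the one-hot encoded true labels and y_prob the predicted probabilities from the previous sketch; the weighted F1 averaging is an assumption:

import numpy as np
from sklearn.metrics import confusion_matrix, f1_score, accuracy_score, mean_squared_error

y_true = np.argmax(y_test, axis=1)
y_pred = np.argmax(y_prob, axis=1)

# Confusion matrix, F1 score, and accuracy (Figure 18)
print(confusion_matrix(y_true, y_pred))
print('F1 score:', f1_score(y_true, y_pred, average='weighted'))
print('Accuracy:', accuracy_score(y_true, y_pred))

# MSE and RMSE between predicted probabilities and one-hot labels (Figure 19)
mse = mean_squared_error(y_test, y_prob)
print('MSE:', mse, 'RMSE:', np.sqrt(mse))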

IV. CONCLUSION

The neural network for stability category prediction showcased the efficacy of deep learning in complex data analysis. The workflow comprised data preprocessing, neural network architecture creation, model training, thorough evaluation, and insightful interpretation. Key evaluation metrics, such as ROC curves and F1 scores, underscored the model's competence in forecasting stability categories. Visualizations and comprehensive reporting facilitated clear communication of the findings.

V. REFERENCES

Pedregosa, F., et al. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825-2830.

Seaborn Development Team. (2020). Seaborn: statistical data visualization. https://seaborn.pydata.org/

The TensorFlow Authors. (2019). TensorFlow. https://github.com/tensorflow/tensorflow

