
PERFORMANCE EVALUATION OF AIRLINES BASED ON CUSTOMER RATING
Pre-Processing
Data Cleaning
• The “Unnamed:0” column in the dataset is set as the
index column.
• Column “id” is dropped since it has no
significance in the analysis.
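
The two cleaning steps above can be sketched with pandas as follows. This is a minimal illustration, not the original notebook: the inline CSV and its values are made up, and it assumes the unnamed first column appears under pandas' default name "Unnamed: 0".

```python
import io
import pandas as pd

# Tiny made-up CSV standing in for the real file (values are illustrative only).
csv_text = """Unnamed: 0,id,Gender,Age
0,70172,Male,13
1,5047,Male,25
2,110028,Female,26
"""

# Use the first column ("Unnamed: 0") as the index while reading.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Drop the "id" column, which carries no information for the analysis.
df = df.drop(columns=["id"])

print(df.columns.tolist())
```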
Unique values
• The dataset is checked for unique values in each
column.
– The majority of columns (14) have values ranging from 0 to
5.
– Four columns take only two distinct values.
– The target column also has two values.
– The remaining columns have the following numbers of unique values:
• Age – 75
• Class – 3
• Flight Distance – 3281
• Departure Delay in Minutes – 313
• Arrival Delay in Minutes – 320
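
A unique-value check of this kind could be done with `DataFrame.nunique()`. The sketch below uses a small made-up frame, so the counts it prints are not the ones reported above.

```python
import pandas as pd

# Made-up sample rows; the real dataset is far larger.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Class": ["Eco", "Business", "Eco Plus", "Eco"],
    "Age": [13, 25, 26, 25],
})

# Number of distinct values per column.
unique_counts = df.nunique()
print(unique_counts)
```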
Null values
• Null values are present only in “Arrival Delay in
Minutes”.
• These null values are replaced with the mean value
of the column.
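
Mean imputation of this column might look like the sketch below. The delay values are invented for illustration; only the column name comes from the slides.

```python
import pandas as pd

col = "Arrival Delay in Minutes"

# Made-up delays with one missing entry.
df = pd.DataFrame({col: [10.0, None, 20.0, 0.0]})

# Count the missing entries before imputing.
n_missing = int(df[col].isna().sum())

# Series.mean() skips NaN by default, so this is the mean of the known values.
mean_delay = df[col].mean()

# Replace the missing entries with that mean.
df[col] = df[col].fillna(mean_delay)

print(n_missing, df[col].tolist())
```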
Handling outliers in the 'Flight Distance' column
• Outliers are present in 584 rows.
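
The slides do not say how outliers were identified. One common choice is the 1.5 × IQR rule, sketched below as an assumption; the helper name `iqr_outlier_mask` and the sample distances are hypothetical.

```python
import pandas as pd

def iqr_outlier_mask(series, k=1.5):
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return (series < lower) | (series > upper)

# Made-up flight distances with one extreme value.
distances = pd.Series([200, 300, 400, 500, 600, 5000], name="Flight Distance")

mask = iqr_outlier_mask(distances)

# Number of rows flagged as outliers under this rule.
n_outlier_rows = int(mask.sum())
print(n_outlier_rows)
```

Counting the flagged rows before deciding whether to cap, drop, or keep them makes the impact of each choice visible.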
Handling the outlier in the 'CheckIn service' column
• CheckIn service ratings normally range from 1 to 5,
and the value flagged as an outlier is 1.
• Rows with this value account for 12.1 percent of the total
dataset.
• Because it is a valid rating shared by a large part of the data, we
concluded that it is not an unwanted outlier and retained it.
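
The share of rows holding a flagged value can be checked before deciding to keep it. The ratings below are made up, so the percentage differs from the 12.1 percent reported above.

```python
import pandas as pd

# Made-up ratings; only the column name is taken from the slides.
ratings = pd.Series([1, 3, 4, 5, 3, 4, 2, 4], name="CheckIn service")

# Percentage of rows whose rating equals the flagged value 1.
flagged_value = 1
share_percent = (ratings == flagged_value).mean() * 100

print(share_percent)
```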
Handling outliers in the 'Departure Delay in Minutes' column
Handling outliers in the 'Arrival Delay in Minutes' column
Encoding
• The categorical columns are encoded as follows.
– Gender – Male (1), Female (0)
– Customer Type – Loyal Customer (0), Disloyal
Customer (1)
– Type of Travel – Business travel (0), Personal
travel (1)
– Class – Business (0), Eco (1), Eco Plus (2)
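
The encoding listed above can be applied with explicit mapping dictionaries, as in the sketch below. The category strings follow the spelling used in these slides; the exact strings in the raw file may differ in capitalisation, and the sample rows are made up.

```python
import pandas as pd

# Mapping tables taken from the encoding listed in the slides.
encodings = {
    "Gender": {"Female": 0, "Male": 1},
    "Customer Type": {"Loyal Customer": 0, "Disloyal Customer": 1},
    "Type of Travel": {"Business travel": 0, "Personal travel": 1},
    "Class": {"Business": 0, "Eco": 1, "Eco Plus": 2},
}

# Made-up sample rows.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female"],
    "Customer Type": ["Loyal Customer", "Disloyal Customer", "Loyal Customer"],
    "Type of Travel": ["Personal travel", "Business travel", "Business travel"],
    "Class": ["Business", "Eco", "Eco Plus"],
})

# Replace each category with its integer code; unmapped strings become NaN,
# which makes spelling mismatches easy to spot.
for column, mapping in encodings.items():
    df[column] = df[column].map(mapping)

print(df)
```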
