
COMSATS University, Islamabad, Attock Campus

Department Of Computer Science


Assignment # 2
Subject: Sprob
Program: BSSE-5

Submitted To : Sir Attaullah


Submitted by : Wajahat Ali(033)
Abdullah Khan(041)
Muhammad Faizan(027)
Awais Aksar(023)
Zain Ul Hassan(034)
Airline Passenger Satisfaction

Description of the Dataset:


Customer satisfaction scores from 120,000+ airline passengers, including additional information
about each passenger, their flight, and type of travel, as well as their evaluation of different
factors like cleanliness, comfort, service, and overall experience.

There Are 24 Attributes in this dataset:

Quantitative Data:
• ID
• Age
• Flight Distance
• Departure Delay
• Arrival Delay
• Departure and Arrival Time Convenience
• Ease of Online Booking
• Check-in Service
• Online Boarding
• Gate Location
• On-board Service
• Seat Comfort
• Leg Room Service
• Cleanliness
• Food and Drink
• In-flight Service
• In-flight Wifi Service
• In-flight Entertainment
• Baggage Handling

Qualitative Data:
• Gender
• Customer Type
• Type of Travel
• Class
• Satisfaction
Details of Attributes:
Age: Age of the passenger
Customer Type: Type of airline customer (First-time/Returning)
Type of Travel: Purpose of the flight (Business/Personal)
Class: Travel class in the airplane for the passenger seat
Flight Distance: Flight distance in miles
Departure Delay: Flight departure delay in minutes
Arrival Delay: Flight arrival delay in minutes
Departure and Arrival Time Convenience: Satisfaction level with the convenience of the flight
departure and arrival times from 1 (lowest) to 5 (highest) - 0 means "not applicable"
Ease of Online Booking: Satisfaction level with the online booking experience from 1 (lowest)
to 5 (highest) - 0 means "not applicable"
Check-in Service: Satisfaction level with the check-in service from 1 (lowest) to 5 (highest) - 0
means "not applicable"
Online Boarding: Satisfaction level with the online boarding experience from 1 (lowest) to 5
(highest) - 0 means "not applicable"
Gate Location: Satisfaction level with the gate location in the airport from 1 (lowest) to 5
(highest) - 0 means "not applicable"
On-board Service: Satisfaction level with the on-boarding service in the airport from 1 (lowest)
to 5 (highest) - 0 means "not applicable"
Seat Comfort: Satisfaction level with the comfort of the airplane seat from 1 (lowest) to 5
(highest) - 0 means "not applicable"
Leg Room Service: Satisfaction level with the leg room of the airplane seat from 1 (lowest) to 5
(highest) - 0 means "not applicable"
Cleanliness: Satisfaction level with the cleanliness of the airplane from 1 (lowest) to 5 (highest)
- 0 means "not applicable"
Food and Drink: Satisfaction level with the food and drinks on the airplane from 1 (lowest) to 5
(highest) - 0 means "not applicable"
In-flight Service: Satisfaction level with the in-flight service from 1 (lowest) to 5 (highest) - 0
means "not applicable"
In-flight Wifi Service: Satisfaction level with the in-flight Wifi service from 1 (lowest) to 5
(highest) - 0 means "not applicable"
In-flight Entertainment: Satisfaction level with the in-flight entertainment from 1 (lowest) to 5
(highest) - 0 means "not applicable"
Baggage Handling: Satisfaction level with the baggage handling from the airline from 1
(lowest) to 5 (highest) - 0 means "not applicable"
Satisfaction: Overall satisfaction level with the airline (Satisfied/Neutral or unsatisfied)
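Because a rating of 0 means "not applicable" rather than a very low score, including those zeros in a mean or standard deviation pulls the result down. One way to handle this, sketched below, is to convert the zeros to missing values first so that pandas skips them. The tiny DataFrame and the `rating_columns` list are made up for illustration; they are not taken from the real file.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: two rating columns on the 1-5 scale, where 0 = "not applicable".
sample = pd.DataFrame({
    "Cleanliness": [5, 0, 3, 4],
    "Seat Comfort": [0, 2, 4, 0],
})
rating_columns = ["Cleanliness", "Seat Comfort"]

# Replace 0 with NaN so that mean(), median(), std() and similar methods skip it.
cleaned = sample[rating_columns].replace(0, np.nan)

mean_with_zeros = sample["Cleanliness"].mean()      # zeros counted as scores
mean_without_zeros = cleaned["Cleanliness"].mean()  # zeros ignored

print("Mean with zeros    :", mean_with_zeros)
print("Mean without zeros :", mean_without_zeros)
```

Whether the zeros should be dropped depends on the question being asked, so this is a choice to state in the report rather than a fixed rule.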

Source:
https://www.kaggle.com/datasets/mysarahmadbhat/airline-passenger-satisfaction?select=data_dictionary.csv
Measures of Central Tendency
Age:
import pandas as pd
import statistics
from scipy.stats.mstats import gmean
from statistics import harmonic_mean
path="/content/drive/MyDrive/airline_passenger_satisfaction.csv"
df=pd.read_csv(path)
df
column_name1='Age'
mean1=df[column_name1].mean()
median1 =df[column_name1].median()
mode1=statistics.mode(df[column_name1])
geometric_mean1=gmean(df[column_name1])
Harmonic_mean1=harmonic_mean(df[column_name1])
print("Mean : ",mean1)
print("Median : ",median1)
print("Mode : ",mode1)
print("Geometric Mean : ",geometric_mean1)
print("Harmonic Mean : ",Harmonic_mean1)
Flight Distance:
column_name2='Flight Distance'
mean2=df[column_name2].mean()

median2 =df[column_name2].median()

mode2=statistics.mode(df[column_name2])
geometric_mean2=gmean(df[column_name2])

Harmonic_mean2=harmonic_mean(df[column_name2])

print("Mean : ",mean2)
print("Median : ",median2)
print("Mode : ",mode2)
print("Geometric Mean : ",geometric_mean2)
print("Harmonic Mean : ",Harmonic_mean2)
Departure Delay:
column_name3='Departure Delay'
mean3=df[column_name3].mean()

median3 =df[column_name3].median()

mode3=statistics.mode(df[column_name3])

geometric_mean3=gmean(df[column_name3])

Harmonic_mean3=harmonic_mean(df[column_name3])

print("Mean : ",mean3)
print("Median : ",median3)
print("Mode : ",mode3)
print("Geometric Mean : ",geometric_mean3)
print("Harmonic Mean : ",Harmonic_mean3)
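Delay columns usually contain many zeros (flights that left on time), and that affects two of the measures above. `statistics.harmonic_mean` returns 0 as soon as one value is 0, and a geometric mean is based on a product, so a single 0 makes it 0 as well. The sketch below shows this on made-up delay values. It uses `statistics.geometric_mean` from the standard library instead of SciPy's `gmean` only to stay self-contained. Restricting the calculation to positive delays is one possible workaround, and it changes what the number means (an average over delayed flights only).

```python
from statistics import geometric_mean, harmonic_mean

# Hypothetical delays in minutes; 0 means the flight left on time.
delays = [0, 10, 40, 0, 20]

# harmonic_mean() returns 0 when any value is 0.
hm_all = harmonic_mean(delays)

# Keep only the flights that were actually delayed.
positive_delays = [d for d in delays if d > 0]
hm_positive = harmonic_mean(positive_delays)
gm_positive = geometric_mean(positive_delays)

print("Harmonic mean, all flights     :", hm_all)
print("Harmonic mean, delayed flights :", hm_positive)
print("Geometric mean, delayed flights:", gm_positive)
```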
Arrival Delay:
column_name4='Arrival Delay'
mean4=df[column_name4].mean()

median4 =df[column_name4].median()

mode4=statistics.mode(df[column_name4])

geometric_mean4=gmean(df[column_name4])

Harmonic_mean4=harmonic_mean(df[column_name4])

print("Mean : ",mean4)
print("Median : ",median4)
print("Mode : ",mode4)
print("Geometric Mean : ",geometric_mean4)
print("Harmonic Mean : ",Harmonic_mean4)
Departure and Arrival Time Convenience:
column_name5='Departure and Arrival Time Convenience'
mean5=df[column_name5].mean()

median5 =df[column_name5].median()

mode5=statistics.mode(df[column_name5])

geometric_mean5=gmean(df[column_name5])

Harmonic_mean5=harmonic_mean(df[column_name5])
print("Mean : ",mean5)
print("Median : ",median5)
print("Mode : ",mode5)
print("Geometric Mean : ",geometric_mean5)
print("Harmonic Mean : ",Harmonic_mean5)
Ease of Online Booking:
column_name6='Ease of Online Booking'
mean6=df[column_name6].mean()

median6 =df[column_name6].median()

mode6=statistics.mode(df[column_name6])

geometric_mean6=gmean(df[column_name6])

Harmonic_mean6=harmonic_mean(df[column_name6])

print("Mean : ",mean6)
print("Median : ",median6)
print("Mode : ",mode6)
print("Geometric Mean : ",geometric_mean6)
print("Harmonic Mean : ",Harmonic_mean6)
Check-in Service:
column_name7='Check-in Service'
mean7=df[column_name7].mean()

median7 =df[column_name7].median()
mode7=statistics.mode(df[column_name7])

geometric_mean7=gmean(df[column_name7])

Harmonic_mean7=harmonic_mean(df[column_name7])

print("Mean : ",mean7)
print("Median : ",median7)
print("Mode : ",mode7)
print("Geometric Mean : ",geometric_mean7)
print("Harmonic Mean : ",Harmonic_mean7)
Online Boarding:
column_name8='Online Boarding'
mean8=df[column_name8].mean()

median8 =df[column_name8].median()

mode8=statistics.mode(df[column_name8])

geometric_mean8=gmean(df[column_name8])

Harmonic_mean8=harmonic_mean(df[column_name8])

print("Mean : ",mean8)
print("Median : ",median8)
print("Mode : ",mode8)
print("Geometric Mean : ",geometric_mean8)
print("Harmonic Mean : ",Harmonic_mean8)

Gate Location:
column_name9='Gate Location'
mean9=df[column_name9].mean()

median9 =df[column_name9].median()

mode9=statistics.mode(df[column_name9])

geometric_mean9=gmean(df[column_name9])

Harmonic_mean9=harmonic_mean(df[column_name9])

print("Mean : ",mean9)
print("Median : ",median9)
print("Mode : ",mode9)
print("Geometric Mean : ",geometric_mean9)
print("Harmonic Mean : ",Harmonic_mean9)
On-board Service:
column_name10='On-board Service'
mean10=df[column_name10].mean()

median10 =df[column_name10].median()

mode10=statistics.mode(df[column_name10])

geometric_mean10=gmean(df[column_name10])
Harmonic_mean10=harmonic_mean(df[column_name10])

print("Mean : ",mean10)
print("Median : ",median10)
print("Mode : ",mode10)
print("Geometric Mean : ",geometric_mean10)
print("Harmonic Mean : ",Harmonic_mean10)
Seat Comfort:
column_name11='Seat Comfort'
mean11=df[column_name11].mean()

median11 =df[column_name11].median()

mode11=statistics.mode(df[column_name11])

geometric_mean11=gmean(df[column_name11])

Harmonic_mean11=harmonic_mean(df[column_name11])

print("Mean : ",mean11)
print("Median : ",median11)
print("Mode : ",mode11)
print("Geometric Mean : ",geometric_mean11)
print("Harmonic Mean : ",Harmonic_mean11)
Leg Room Service:
column_name12='Leg Room Service'
mean12=df[column_name12].mean()
median12 =df[column_name12].median()

mode12=statistics.mode(df[column_name12])

geometric_mean12=gmean(df[column_name12])

Harmonic_mean12=harmonic_mean(df[column_name12])

print("Mean : ",mean12)
print("Median : ",median12)
print("Mode : ",mode12)
print("Geometric Mean : ",geometric_mean12)
print("Harmonic Mean : ",Harmonic_mean12)
Cleanliness:
column_name13='Cleanliness'
mean13=df[column_name13].mean()

median13 =df[column_name13].median()

mode13=statistics.mode(df[column_name13])

geometric_mean13=gmean(df[column_name13])

Harmonic_mean13=harmonic_mean(df[column_name13])

print("Mean : ",mean13)
print("Median : ",median13)
print("Mode : ",mode13)
print("Geometric Mean : ",geometric_mean13)
print("Harmonic Mean : ",Harmonic_mean13)
Food and Drink:
column_name14='Food and Drink'
mean14=df[column_name14].mean()

median14 =df[column_name14].median()

mode14=statistics.mode(df[column_name14])
geometric_mean14=gmean(df[column_name14])

Harmonic_mean14=harmonic_mean(df[column_name14])

print("Mean : ",mean14)
print("Median : ",median14)
print("Mode : ",mode14)
print("Geometric Mean : ",geometric_mean14)
print("Harmonic Mean : ",Harmonic_mean14)
In-flight Service:
column_name15='In-flight Service'
mean15=df[column_name15].mean()

median15 =df[column_name15].median()

mode15=statistics.mode(df[column_name15])

geometric_mean15=gmean(df[column_name15])
Harmonic_mean15=harmonic_mean(df[column_name15])

print("Mean : ",mean15)
print("Median : ",median15)
print("Mode : ",mode15)
print("Geometric Mean : ",geometric_mean15)
print("Harmonic Mean : ",Harmonic_mean15)
In-flight Entertainment:
column_name16='In-flight Entertainment'
mean16=df[column_name16].mean()

median16 =df[column_name16].median()

mode16=statistics.mode(df[column_name16])

geometric_mean16=gmean(df[column_name16])

Harmonic_mean16=harmonic_mean(df[column_name16])

print("Mean : ",mean16)
print("Median : ",median16)
print("Mode : ",mode16)
print("Geometric Mean : ",geometric_mean16)
print("Harmonic Mean : ",Harmonic_mean16)
In-flight Wifi Service:
column_name17='In-flight Wifi Service'
mean17=df[column_name17].mean()

median17 =df[column_name17].median()

mode17=statistics.mode(df[column_name17])

geometric_mean17=gmean(df[column_name17])

Harmonic_mean17=harmonic_mean(df[column_name17])

print("Mean : ",mean17)
print("Median : ",median17)
print("Mode : ",mode17)
print("Geometric Mean : ",geometric_mean17)
print("Harmonic Mean : ",Harmonic_mean17)
Baggage Handling:
column_name18='Baggage Handling'
mean18=df[column_name18].mean()

median18 =df[column_name18].median()

mode18=statistics.mode(df[column_name18])

geometric_mean18=gmean(df[column_name18])

Harmonic_mean18=harmonic_mean(df[column_name18])
print("Mean : ",mean18)
print("Median : ",median18)
print("Mode : ",mode18)
print("Geometric Mean : ",geometric_mean18)
print("Harmonic Mean : ",Harmonic_mean18)

Result In Table Form:


from tabulate import tabulate

headers = ["Column_Name", "Median", "Mode","Mean","Harmonic.Mean","Geometric.Mean"]

data = [
[column_name1, median1, mode1, mean1, Harmonic_mean1, geometric_mean1],
[column_name2, median2, mode2, mean2, Harmonic_mean2, geometric_mean2],
[column_name3, median3, mode3, mean3, Harmonic_mean3, geometric_mean3],
[column_name4, median4, mode4, mean4, Harmonic_mean4, geometric_mean4],
[column_name5, median5, mode5, mean5, Harmonic_mean5, geometric_mean5],
[column_name6, median6, mode6, mean6, Harmonic_mean6, geometric_mean6],
[column_name7, median7, mode7, mean7, Harmonic_mean7, geometric_mean7],
[column_name8, median8, mode8, mean8, Harmonic_mean8, geometric_mean8],
[column_name9, median9, mode9, mean9, Harmonic_mean9, geometric_mean9],
[column_name10, median10, mode10, mean10, Harmonic_mean10, geometric_mean10],
[column_name11, median11, mode11, mean11, Harmonic_mean11, geometric_mean11],
[column_name12, median12, mode12, mean12, Harmonic_mean12, geometric_mean12],
[column_name13, median13, mode13, mean13, Harmonic_mean13, geometric_mean13],
[column_name14, median14, mode14, mean14, Harmonic_mean14, geometric_mean14],
[column_name15, median15, mode15, mean15, Harmonic_mean15, geometric_mean15],
[column_name16, median16, mode16, mean16, Harmonic_mean16, geometric_mean16],
[column_name17, median17, mode17, mean17, Harmonic_mean17, geometric_mean17],
[column_name18, median18, mode18, mean18, Harmonic_mean18, geometric_mean18],
]

table = tabulate(data, headers, tablefmt="grid")


print(table)
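The blocks above differ only in the column name, so the same table can also be produced with one loop. The following is a minimal sketch of that idea, not a replacement for the code above: `central_tendency` is a hypothetical helper name, the small DataFrame stands in for the real CSV, and the geometric mean is computed with NumPy logarithms instead of SciPy's `gmean` to keep the example self-contained.

```python
import statistics

import numpy as np
import pandas as pd


def central_tendency(series):
    """Return the five measures used above for one numeric column."""
    values = series.dropna()
    as_list = values.tolist()  # plain Python numbers for the statistics module
    return {
        "Median": values.median(),
        "Mode": statistics.mode(as_list),
        "Mean": values.mean(),
        "Harmonic.Mean": statistics.harmonic_mean(as_list),
        # exp(mean(log x)) is the geometric mean; valid only for positive values.
        "Geometric.Mean": float(np.exp(np.log(values).mean())),
    }


# Hypothetical stand-in for the real dataset.
sample = pd.DataFrame({
    "Age": [20, 30, 30, 40],
    "Flight Distance": [100, 200, 400, 400],
})

# One row per dataset column, one table column per measure.
summary = pd.DataFrame(
    {name: central_tendency(sample[name]) for name in sample.columns}
).T
print(summary)
```

With the real file, `sample` would be replaced by `df` restricted to the numeric columns listed earlier.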
Measures of Dispersion
Age
import pandas as pd

path="/content/drive/MyDrive/airline_passenger_satisfaction.csv"
df=pd.read_csv(path)
df

column_name1='Age'
column_data1=df[column_name1]
mean1=df[column_name1].mean()

coefficient_of_range1=(column_data1.max()-column_data1.min())/(column_data1.max()
+column_data1.min())
mean_Absolute_deviation1=(column_data1-mean1).abs().mean()
coefficient_of_mean_deviation1=mean_Absolute_deviation1/mean1
std_deviation1=column_data1.std()
coefficient_of_variation1=(std_deviation1/mean1)*100
variance1=df[column_name1].var()
std_deviation1=df[column_name1].std()
range_value1=df[column_name1].max()-df[column_name1].min()
mean_deviation1=(df[column_name1]-mean1).abs().mean()
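One detail worth knowing when reading these numbers: pandas `var()` and `std()` use the sample formulas (dividing by n - 1) by default. If the population formulas (dividing by n) are wanted instead, `ddof=0` can be passed. With 120,000+ rows the difference is negligible, but it is visible on a small, made-up series such as the one below.

```python
import pandas as pd

# Hypothetical values, chosen so that the population variance is exactly 4.
values = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

sample_variance = values.var()            # divides by n - 1 (pandas default, ddof=1)
population_variance = values.var(ddof=0)  # divides by n
population_std = values.std(ddof=0)

print("Sample variance     :", sample_variance)
print("Population variance :", population_variance)
print("Population std. dev.:", population_std)
```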
Flight Distance
column_name2='Flight Distance'
column_data2=df[column_name2]
mean2=df[column_name2].mean()

coefficient_of_range2=(column_data2.max()-column_data2.min())/(column_data2.max()
+column_data2.min())
mean_Absolute_deviation2=(column_data2-mean2).abs().mean()
coefficient_of_mean_deviation2=mean_Absolute_deviation2/mean2
std_deviation2=column_data2.std()
coefficient_of_variation2=(std_deviation2/mean2)*100
variance2=df[column_name2].var()
std_deviation2=df[column_name2].std()
range_value2=df[column_name2].max()-df[column_name2].min()
mean_deviation2=(df[column_name2]-mean2).abs().mean()
Departure Delay
column_name3='Departure Delay'
column_data3=df[column_name3]
mean3=df[column_name3].mean()
coefficient_of_range3=(column_data3.max()-column_data3.min())/(column_data3.max()
+column_data3.min())
mean_Absolute_deviation3=(column_data3-mean3).abs().mean()
coefficient_of_mean_deviation3=mean_Absolute_deviation3/mean3
std_deviation3=column_data3.std()
coefficient_of_variation3=(std_deviation3/mean3)*100
variance3=df[column_name3].var()
std_deviation3=df[column_name3].std()
range_value3=df[column_name3].max()-df[column_name3].min()
mean_deviation3=(df[column_name3]-mean3).abs().mean()
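A note on the coefficient of range for this kind of column: it is defined as (max - min) / (max + min), so whenever the minimum is 0 the result is exactly 1, whatever the maximum is. Delay columns normally have a minimum of 0 (on-time flights), and so does any rating column in which a passenger answered 0, so the coefficient cannot distinguish between them. The short sketch below illustrates this with made-up numbers; `coefficient_of_range` is a hypothetical helper, not part of the code above.

```python
def coefficient_of_range(values):
    """(max - min) / (max + min) for a non-empty sequence of numbers."""
    highest, lowest = max(values), min(values)
    return (highest - lowest) / (highest + lowest)


short_delays = [0, 5, 12, 30]     # hypothetical delays in minutes
long_delays = [0, 45, 300, 1500]  # much wider spread, same minimum of 0
ages = [7, 25, 40, 85]            # minimum above 0

print(coefficient_of_range(short_delays))
print(coefficient_of_range(long_delays))
print(coefficient_of_range(ages))
```

For such columns the standard deviation or the mean deviation is usually the more informative measure of spread.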
Arrival Delay
column_name4='Arrival Delay'
column_data4=df[column_name4]
mean4=df[column_name4].mean()

coefficient_of_range4=(column_data4.max()-column_data4.min())/(column_data4.max()
+column_data4.min())
mean_Absolute_deviation4=(column_data4-mean4).abs().mean()
coefficient_of_mean_deviation4=mean_Absolute_deviation4/mean4
std_deviation4=column_data4.std()
coefficient_of_variation4=(std_deviation4/mean4)*100
variance4=df[column_name4].var()
std_deviation4=df[column_name4].std()
range_value4=df[column_name4].max()-df[column_name4].min()
mean_deviation4=(df[column_name4]-mean4).abs().mean()
Departure and Arrival Time Convenience
column_name5='Departure and Arrival Time Convenience'
column_data5=df[column_name5]
mean5=df[column_name5].mean()

coefficient_of_range5=(column_data5.max()-column_data5.min())/(column_data5.max()
+column_data5.min())
mean_Absolute_deviation5=(column_data5-mean5).abs().mean()
coefficient_of_mean_deviation5=mean_Absolute_deviation5/mean5
std_deviation5=column_data5.std()
coefficient_of_variation5=(std_deviation5/mean5)*100
variance5=df[column_name5].var()
std_deviation5=df[column_name5].std()
range_value5=df[column_name5].max()-df[column_name5].min()
mean_deviation5=(df[column_name5]-mean5).abs().mean()
Ease of Online Booking
column_name6='Ease of Online Booking'
column_data6=df[column_name6]
mean6=df[column_name6].mean()

coefficient_of_range6=(column_data6.max()-column_data6.min())/(column_data6.max()
+column_data6.min())
mean_Absolute_deviation6=(column_data6-mean6).abs().mean()
coefficient_of_mean_deviation6=mean_Absolute_deviation6/mean6
std_deviation6=column_data6.std()
coefficient_of_variation6=(std_deviation6/mean6)*100
variance6=df[column_name6].var()
std_deviation6=df[column_name6].std()
range_value6=df[column_name6].max()-df[column_name6].min()
mean_deviation6=(df[column_name6]-mean6).abs().mean()
Check-in Service
column_name7='Check-in Service'
column_data7=df[column_name7]
mean7=df[column_name7].mean()

coefficient_of_range7=(column_data7.max()-column_data7.min())/(column_data7.max()
+column_data7.min())
mean_Absolute_deviation7=(column_data7-mean7).abs().mean()
coefficient_of_mean_deviation7=mean_Absolute_deviation7/mean7
std_deviation7=column_data7.std()
coefficient_of_variation7=(std_deviation7/mean7)*100
variance7=df[column_name7].var()
std_deviation7=df[column_name7].std()
range_value7=df[column_name7].max()-df[column_name7].min()
mean_deviation7=(df[column_name7]-mean7).abs().mean()
Online Boarding
column_name8='Online Boarding'
column_data8=df[column_name8]
mean8=df[column_name8].mean()

coefficient_of_range8=(column_data8.max()-column_data8.min())/(column_data8.max()
+column_data8.min())
mean_Absolute_deviation8=(column_data8-mean8).abs().mean()
coefficient_of_mean_deviation8=mean_Absolute_deviation8/mean8
std_deviation8=column_data8.std()
coefficient_of_variation8=(std_deviation8/mean8)*100
variance8=df[column_name8].var()
std_deviation8=df[column_name8].std()
range_value8=df[column_name8].max()-df[column_name8].min()
mean_deviation8=(df[column_name8]-mean8).abs().mean()
Gate Location
column_name9='Gate Location'
column_data9=df[column_name9]
mean9=df[column_name9].mean()

coefficient_of_range9=(column_data9.max()-column_data9.min())/(column_data9.max()
+column_data9.min())
mean_Absolute_deviation9=(column_data9-mean9).abs().mean()
coefficient_of_mean_deviation9=mean_Absolute_deviation9/mean9
std_deviation9=column_data9.std()
coefficient_of_variation9=(std_deviation9/mean9)*100
variance9=df[column_name9].var()
std_deviation9=df[column_name9].std()
range_value9=df[column_name9].max()-df[column_name9].min()
mean_deviation9=(df[column_name9]-mean9).abs().mean()
On-board Service
column_name10='On-board Service'
column_data10=df[column_name10]
mean10=df[column_name10].mean()

coefficient_of_range10=(column_data10.max()-column_data10.min())/(column_data10.max()+column_data10.min())
mean_Absolute_deviation10=(column_data10-mean10).abs().mean()
coefficient_of_mean_deviation10=mean_Absolute_deviation10/mean10
std_deviation10=column_data10.std()
coefficient_of_variation10=(std_deviation10/mean10)*100
variance10=df[column_name10].var()
std_deviation10=df[column_name10].std()
range_value10=df[column_name10].max()-df[column_name10].min()
mean_deviation10=(df[column_name10]-mean10).abs().mean()

Seat Comfort
column_name11='Seat Comfort'
column_data11=df[column_name11]
mean11=df[column_name11].mean()

coefficient_of_range11=(column_data11.max()-column_data11.min())/(column_data11.max()+column_data11.min())
mean_Absolute_deviation11=(column_data11-mean11).abs().mean()
coefficient_of_mean_deviation11=mean_Absolute_deviation11/mean11
std_deviation11=column_data11.std()
coefficient_of_variation11=(std_deviation11/mean11)*100
variance11=df[column_name11].var()
std_deviation11=df[column_name11].std()
range_value11=df[column_name11].max()-df[column_name11].min()
mean_deviation11=(df[column_name11]-mean11).abs().mean()
Leg Room Service
column_name12='Leg Room Service'
column_data12=df[column_name12]
mean12=df[column_name12].mean()

coefficient_of_range12=(column_data12.max()-column_data12.min())/(column_data12.max()+column_data12.min())
mean_Absolute_deviation12=(column_data12-mean12).abs().mean()
coefficient_of_mean_deviation12=mean_Absolute_deviation12/mean12
std_deviation12=column_data12.std()
coefficient_of_variation12=(std_deviation12/mean12)*100
variance12=df[column_name12].var()
std_deviation12=df[column_name12].std()
range_value12=df[column_name12].max()-df[column_name12].min()
mean_deviation12=(df[column_name12]-mean12).abs().mean()
Cleanliness
column_name13='Cleanliness'
column_data13=df[column_name13]
mean13=df[column_name13].mean()

coefficient_of_range13=(column_data13.max()-column_data13.min())/(column_data13.max()+column_data13.min())
mean_Absolute_deviation13=(column_data13-mean13).abs().mean()
coefficient_of_mean_deviation13=mean_Absolute_deviation13/mean13
std_deviation13=column_data13.std()
coefficient_of_variation13=(std_deviation13/mean13)*100
variance13=df[column_name13].var()
std_deviation13=df[column_name13].std()
range_value13=df[column_name13].max()-df[column_name13].min()
mean_deviation13=(df[column_name13]-mean13).abs().mean()
Food and Drink
column_name14='Food and Drink'
column_data14=df[column_name14]
mean14=df[column_name14].mean()

coefficient_of_range14=(column_data14.max()-column_data14.min())/(column_data14.max()+column_data14.min())
mean_Absolute_deviation14=(column_data14-mean14).abs().mean()
coefficient_of_mean_deviation14=mean_Absolute_deviation14/mean14
std_deviation14=column_data14.std()
coefficient_of_variation14=(std_deviation14/mean14)*100
variance14=df[column_name14].var()
std_deviation14=df[column_name14].std()
range_value14=df[column_name14].max()-df[column_name14].min()
mean_deviation14=(df[column_name14]-mean14).abs().mean()
In-flight Service
column_name15='In-flight Service'
column_data15=df[column_name15]
mean15=df[column_name15].mean()

coefficient_of_range15=(column_data15.max()-column_data15.min())/(column_data15.max()+column_data15.min())
mean_Absolute_deviation15=(column_data15-mean15).abs().mean()
coefficient_of_mean_deviation15=mean_Absolute_deviation15/mean15
std_deviation15=column_data15.std()
coefficient_of_variation15=(std_deviation15/mean15)*100
variance15=df[column_name15].var()
std_deviation15=df[column_name15].std()
range_value15=df[column_name15].max()-df[column_name15].min()
mean_deviation15=(df[column_name15]-mean15).abs().mean()
In-flight Wifi Service
column_name16='In-flight Wifi Service'
column_data16=df[column_name16]
mean16=df[column_name16].mean()

coefficient_of_range16=(column_data16.max()-column_data16.min())/(column_data16.max()+column_data16.min())
mean_Absolute_deviation16=(column_data16-mean16).abs().mean()
coefficient_of_mean_deviation16=mean_Absolute_deviation16/mean16
std_deviation16=column_data16.std()
coefficient_of_variation16=(std_deviation16/mean16)*100
variance16=df[column_name16].var()
std_deviation16=df[column_name16].std()
range_value16=df[column_name16].max()-df[column_name16].min()
mean_deviation16=(df[column_name16]-mean16).abs().mean()
In-flight Entertainment
column_name17='In-flight Entertainment'
column_data17=df[column_name17]
mean17=df[column_name17].mean()
coefficient_of_range17=(column_data17.max()-column_data17.min())/(column_data17.max()+column_data17.min())
mean_Absolute_deviation17=(column_data17-mean17).abs().mean()
coefficient_of_mean_deviation17=mean_Absolute_deviation17/mean17
std_deviation17=column_data17.std()
coefficient_of_variation17=(std_deviation17/mean17)*100
variance17=df[column_name17].var()
std_deviation17=df[column_name17].std()
range_value17=df[column_name17].max()-df[column_name17].min()
mean_deviation17=(df[column_name17]-mean17).abs().mean()
Baggage Handling
column_name18='Baggage Handling'
column_data18=df[column_name18]
mean18=df[column_name18].mean()

coefficient_of_range18=(column_data18.max()-column_data18.min())/(column_data18.max()+column_data18.min())
mean_Absolute_deviation18=(column_data18-mean18).abs().mean()
coefficient_of_mean_deviation18=mean_Absolute_deviation18/mean18
std_deviation18=column_data18.std()
coefficient_of_variation18=(std_deviation18/mean18)*100
variance18=df[column_name18].var()
std_deviation18=df[column_name18].std()
range_value18=df[column_name18].max()-df[column_name18].min()
mean_deviation18=(df[column_name18]-mean18).abs().mean()

Result In Tabular Form:


from tabulate import tabulate

headers = ["Column_Name", "Range", "Mean deviation", "Variance", "Standard Deviation",
           "Coefficient of Range", "Coefficient of M.D", "Coefficient of Variation"]

data = [
[column_name1, range_value1, mean_Absolute_deviation1, variance1, std_deviation1,
 coefficient_of_range1, coefficient_of_mean_deviation1, coefficient_of_variation1],
[column_name2, range_value2, mean_Absolute_deviation2, variance2, std_deviation2,
 coefficient_of_range2, coefficient_of_mean_deviation2, coefficient_of_variation2],
[column_name3, range_value3, mean_Absolute_deviation3, variance3, std_deviation3,
 coefficient_of_range3, coefficient_of_mean_deviation3, coefficient_of_variation3],
[column_name4, range_value4, mean_Absolute_deviation4, variance4, std_deviation4,
 coefficient_of_range4, coefficient_of_mean_deviation4, coefficient_of_variation4],
[column_name5, range_value5, mean_Absolute_deviation5, variance5, std_deviation5,
 coefficient_of_range5, coefficient_of_mean_deviation5, coefficient_of_variation5],
[column_name6, range_value6, mean_Absolute_deviation6, variance6, std_deviation6,
 coefficient_of_range6, coefficient_of_mean_deviation6, coefficient_of_variation6],
[column_name7, range_value7, mean_Absolute_deviation7, variance7, std_deviation7,
 coefficient_of_range7, coefficient_of_mean_deviation7, coefficient_of_variation7],
[column_name8, range_value8, mean_Absolute_deviation8, variance8, std_deviation8,
 coefficient_of_range8, coefficient_of_mean_deviation8, coefficient_of_variation8],
[column_name9, range_value9, mean_Absolute_deviation9, variance9, std_deviation9,
 coefficient_of_range9, coefficient_of_mean_deviation9, coefficient_of_variation9],
[column_name10, range_value10, mean_Absolute_deviation10, variance10, std_deviation10,
 coefficient_of_range10, coefficient_of_mean_deviation10, coefficient_of_variation10],
[column_name11, range_value11, mean_Absolute_deviation11, variance11, std_deviation11,
 coefficient_of_range11, coefficient_of_mean_deviation11, coefficient_of_variation11],
[column_name12, range_value12, mean_Absolute_deviation12, variance12, std_deviation12,
 coefficient_of_range12, coefficient_of_mean_deviation12, coefficient_of_variation12],
[column_name13, range_value13, mean_Absolute_deviation13, variance13, std_deviation13,
 coefficient_of_range13, coefficient_of_mean_deviation13, coefficient_of_variation13],
[column_name14, range_value14, mean_Absolute_deviation14, variance14, std_deviation14,
 coefficient_of_range14, coefficient_of_mean_deviation14, coefficient_of_variation14],
[column_name15, range_value15, mean_Absolute_deviation15, variance15, std_deviation15,
 coefficient_of_range15, coefficient_of_mean_deviation15, coefficient_of_variation15],
[column_name16, range_value16, mean_Absolute_deviation16, variance16, std_deviation16,
 coefficient_of_range16, coefficient_of_mean_deviation16, coefficient_of_variation16],
[column_name17, range_value17, mean_Absolute_deviation17, variance17, std_deviation17,
 coefficient_of_range17, coefficient_of_mean_deviation17, coefficient_of_variation17],
[column_name18, range_value18, mean_Absolute_deviation18, variance18, std_deviation18,
 coefficient_of_range18, coefficient_of_mean_deviation18, coefficient_of_variation18],
]

table = tabulate(data, headers, tablefmt="grid")


print(table)
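Hand-numbered variables are easy to mix up in a table this wide, so it can help to compute all seven measures in one function and loop over the columns. The sketch below assumes the same definitions as the code above (sample variance and standard deviation, mean deviation about the mean); `dispersion_measures` is a hypothetical helper and the small DataFrame is invented test data.

```python
import pandas as pd


def dispersion_measures(series):
    """Return the seven measures tabulated above for one numeric column."""
    values = series.dropna()
    mean = values.mean()
    std = values.std()  # sample standard deviation (ddof=1)
    highest, lowest = values.max(), values.min()
    mean_deviation = (values - mean).abs().mean()
    return {
        "Range": highest - lowest,
        "Mean deviation": mean_deviation,
        "Variance": values.var(),
        "Standard Deviation": std,
        "Coefficient of Range": (highest - lowest) / (highest + lowest),
        "Coefficient of M.D": mean_deviation / mean,
        "Coefficient of Variation": std / mean * 100,
    }


# Hypothetical stand-in for the real dataset.
sample = pd.DataFrame({
    "Age": [20, 30, 40, 50],
    "Flight Distance": [100, 300, 500, 700],
})

# One row per dataset column, one table column per measure.
summary = pd.DataFrame(
    {name: dispersion_measures(sample[name]) for name in sample.columns}
).T
print(summary)
```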
Simple Bar Chart

Age:
import pandas as pd
import matplotlib.pyplot as plt
path="/content/drive/MyDrive/airline_passenger_satisfaction.csv"
df=pd.read_csv(path)
df
label=df['Age']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Age")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()
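With one bar per passenger ID, the chart above has to draw more than 120,000 bars, which is slow and hard to read. A common alternative, sketched here under the assumption that the goal is to see how ages are distributed, is to count passengers per group and draw one bar per group. The age bands and the tiny sample below are invented for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample; in the assignment this would be df['Age'].
ages = pd.Series([8, 23, 25, 31, 38, 44, 47, 52, 61, 70])

# Group ages into bands and count the passengers in each band.
bands = pd.cut(ages, bins=[0, 20, 40, 60, 100],
               labels=["0-20", "21-40", "41-60", "61+"])
counts = bands.value_counts().sort_index()

labels = [str(label) for label in counts.index]
fig, ax = plt.subplots()
ax.bar(labels, counts.values, color="red")
ax.set_xlabel("Age band")
ax.set_ylabel("Number of passengers")
ax.set_title("Passengers per age band")
# In a script, call plt.show() here to display the figure.
```

The same pattern works for the rating columns by replacing `pd.cut` with a plain `value_counts()` on the column.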
Flight Distance:
label=df['Flight Distance']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Flight Distance")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()

Departure Delay:
label=df['Departure Delay']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Departure Delay")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()
Arrival Delay:
label=df['Arrival Delay']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Arrival Delay")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()
Departure and Arrival Time Convenience:
label=df['Departure and Arrival Time Convenience']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Departure and Arrival Time Convenience")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()

Ease of Online Booking:


label=df['Ease of Online Booking']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Ease of Online Booking")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()
Check-in Service:
label=df['Check-in Service']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Check-in Service")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()
Online Boarding:
label=df['Online Boarding']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Online Boarding")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()

Gate Location:
label=df['Gate Location']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Gate Location")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()
On-board Service:
label=df['On-board Service']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("On-board Service")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()
Seat Comfort:
label=df['Seat Comfort']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Seat Comfort")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()

Leg Room Service:


label=df['Leg Room Service']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Leg Room Service")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()
Cleanliness:
label=df['Cleanliness']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Cleanliness")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()
Food and Drink:
label=df['Food and Drink']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Food and Drink")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()

In-flight Service:
label=df['In-flight Service']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("In-flight Service")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()
In-flight Wifi Service:
label=df['In-flight Wifi Service']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("In-flight Wifi Service")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()
In-flight Entertainment:
label=df['In-flight Entertainment']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("In-flight Entertainment")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()

Baggage Handling:
label=df['Baggage Handling']
values=df['ID']
plt.bar(values,label, color='Red')
plt.xlabel("ID")
plt.ylabel("Baggage Handling")
plt.title("Simple Bar Chart")
plt.xticks(rotation=45)
plt.show()
Histogram
Age
import pandas as pd
import matplotlib.pyplot as plt

path="/content/drive/MyDrive/airline_passenger_satisfaction.csv"
df=pd.read_csv(path)
df
column_name ='Age'
# Check if the column exists in the DataFrame
if column_name in df.columns:
    # Plot a histogram of the selected column
    plt.hist(df[column_name], bins=20) # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
Flight Distance
column_name ='Flight Distance'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20) # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
Departure Delay
column_name ='Departure Delay'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
Arrival Delay
column_name ='Arrival Delay'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')

Departure and Arrival Time Convenience


column_name ='Departure and Arrival Time Convenience'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
Ease of Online Booking
column_name ='Ease of Online Booking'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
Check-in Service
column_name ='Check-in Service'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')

Online Boarding
column_name ='Online Boarding'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
Gate Location
column_name ='Gate Location'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
On-board Service
column_name ='On-board Service'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')

Seat Comfort
column_name ='Seat Comfort'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
Leg Room Service
column_name ='Leg Room Service'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
Cleanliness
column_name ='Cleanliness'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')

Food and Drink


column_name ='Food and Drink'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
In-flight Service
column_name ='In-flight Service'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
In-flight Wifi Service
column_name ='In-flight Wifi Service'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')

In-flight Entertainment
column_name ='In-flight Entertainment'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
Baggage Handling
column_name ='Baggage Handling'
if column_name in df.columns:
    plt.hist(df[column_name], bins=20)  # Adjust the number of bins as needed
    plt.xlabel(column_name)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {column_name}')
    plt.show()
else:
    print(f'Column "{column_name}" not found in the CSV file.')
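The histogram blocks above repeat the same steps for every column, so they could be folded into one helper that loops over a list of column names. The following is only a sketch: the function name `plot_histograms` and the small random `demo_df` are hypothetical stand-ins, not part of the assignment; in the notebook the DataFrame loaded from the CSV would be passed instead.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_histograms(df, columns, bins=20):
    """Draw one histogram per column and return the names actually plotted."""
    plotted = []
    for column_name in columns:
        # Same existence check as in the individual blocks
        if column_name not in df.columns:
            print(f'Column "{column_name}" not found in the DataFrame.')
            continue
        fig, ax = plt.subplots()
        ax.hist(df[column_name].dropna(), bins=bins)
        ax.set_xlabel(column_name)
        ax.set_ylabel("Frequency")
        ax.set_title(f"Histogram of {column_name}")
        plotted.append(column_name)
    return plotted


# Hypothetical stand-in data (assumption), used only to make the sketch runnable
rng = np.random.default_rng(0)
demo_df = pd.DataFrame({
    "Age": rng.integers(7, 86, size=200),
    "Flight Distance": rng.integers(31, 5000, size=200),
})

# "Seat Comfort" is missing from demo_df on purpose to exercise the check
plotted = plot_histograms(demo_df, ["Age", "Flight Distance", "Seat Comfort"])
# In a notebook, call plt.show() here to display the figures.
```

Creating a separate figure for each column keeps the plots from being drawn on top of each other when several are produced in one cell.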
Multiple Bar Chart
Leg Room Service and Departure and Arrival Time Convenience:
import pandas as pd
import matplotlib.pyplot as plt
path="/content/drive/MyDrive/airline_passenger_satisfaction.csv"
df=pd.read_csv(path)
df
categories = [
'Leg Room Service',
'Departure and Arrival Time Convenience'
]
for category in categories:
    plt.bar(df.index, df[category], label=category)
plt.xlabel("Index")
plt.ylabel("Values")
plt.title("Multi-Bar Chart for Airline Passenger Satisfaction")
plt.legend(loc='upper left', bbox_to_anchor=(1, 1))
plt.show()
On-Board Service and Gate Location:
categories = [
'On-board Service',
'Gate Location'
]
for category in categories:
    plt.bar(df.index, df[category], label=category)
plt.xlabel("Index")
plt.ylabel("Values")
plt.title("Multi-Bar Chart for Airline Passenger Satisfaction")
plt.legend(loc='upper left', bbox_to_anchor=(1, 1))
plt.show()
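Because both series above are drawn at the same x positions, the bars of the second category cover those of the first. One possible alternative, sketched here under assumptions, is to count how often each rating occurs in each column and draw the counts side by side. The helper name `rating_counts` and the tiny `demo_df` are hypothetical; the rating range 0 to 5 follows the attribute descriptions at the start of this report.

```python
import pandas as pd
import matplotlib.pyplot as plt


def rating_counts(df, categories, ratings=range(0, 6)):
    """Return a table: one row per rating (0-5), one column per category."""
    counts = {
        category: df[category].value_counts().reindex(ratings, fill_value=0)
        for category in categories
    }
    return pd.DataFrame(counts)


# Hypothetical stand-in data (assumption); use the real df in the notebook
demo_df = pd.DataFrame({
    "On-board Service": [1, 3, 3, 5, 4, 3],
    "Gate Location": [2, 2, 3, 5, 1, 0],
})

counts = rating_counts(demo_df, ["On-board Service", "Gate Location"])

# DataFrame.plot with kind="bar" places the columns next to each other
ax = counts.plot(kind="bar", rot=0)
ax.set_xlabel("Rating")
ax.set_ylabel("Number of passengers")
ax.set_title("Rating counts per category")
ax.legend(loc="upper left", bbox_to_anchor=(1, 1))
# In a notebook, call plt.show() here to display the chart.
```

With counts on the y-axis the chart has at most six groups of bars, regardless of how many passengers are in the dataset.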
Skewness of Data

!pip install pandas scipy


import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import skew

path="/airline_passenger_satisfaction.csv"
df=pd.read_csv(path)
df
df = df.apply(pd.to_numeric, errors='coerce')

# Drop missing values per column first; skew() returns NaN if any value is NaN
skewness_per_column = df.apply(lambda x: skew(x.dropna()))

print("Skewness of each column:")


print(skewness_per_column)
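To make clear what the skewness number measures, the sketch below computes it by hand with NumPy. With its default settings, `scipy.stats.skew` returns the third central moment divided by the second central moment raised to the power 1.5, and the function below follows that definition. The function name `sample_skewness` and the small example lists are illustrative assumptions, not values from the dataset.

```python
import numpy as np


def sample_skewness(values):
    """Skewness g1 = m3 / m2**1.5, where m2 and m3 are central moments."""
    x = np.asarray(values, dtype=float)
    x = x[~np.isnan(x)]            # ignore missing values
    deviations = x - x.mean()
    m2 = np.mean(deviations ** 2)  # second central moment (variance)
    m3 = np.mean(deviations ** 3)  # third central moment
    return m3 / m2 ** 1.5


# One large value pulls the right tail out, so the skewness is positive
right_skewed = sample_skewness([1, 2, 3, 4, 10])

# Values spread evenly around the mean give a skewness of zero
symmetric = sample_skewness([1, 2, 3, 4, 5])

print(right_skewed, symmetric)
```

A positive result indicates a longer right tail, a negative result a longer left tail, and a value near zero a roughly symmetric distribution.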

Output:
