19/09/2021, 00:06    ASM Assignment 7.ipynb - Colaboratory

!pip install pyreadstat

Collecting pyreadstat
  Downloading pyreadstat-1.1.2-cp37-cp37m-manylinux2014_x86_64.whl (2.5 MB)
Requirement already satisfied: pandas>0.24.0 in /usr/local/lib/python3.7/dist
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/pytho
Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.7/dist
Requirement already satisfied: numpy>=1.15.4 in /usr/local/lib/python3.7/dist
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-pack
Installing collected packages: pyreadstat
Successfully installed pyreadstat-1.1.2

import pyreadstat
import pandas as pd
import numpy as np

df = pd.read_csv("/content/Airline satisfaction.csv")
df.head()

[Output: first five rows of df. The scan shows only the columns Online boarding, Seat comfort, Inflight entertainment, On-board service, Leg room service, Baggage handling, Checkin service and Inflight service; the cell values are not legible.]

[Cell not legible in the scan; its output is 103904.]

# The rest of this list is cut off in the scan. The dtypes output below shows
# "Inflight wifi service" and "satisfaction" as category as well.
cols = ["Gender", "Customer Type", "Age", "Type of Travel", "Class", "Flight Distance", ...
df[cols] = df[cols].astype('category')
df.dtypes

Unnamed: 0                              int64
id                                      int64
Gender                               category
Customer Type                        category
Age                                  category
Type of Travel                       category
Class                                category
Flight Distance                      category
Inflight wifi service                category
Departure/Arrival time convenient       int64
Ease of Online booking                  int64
Gate location                           int64
Food and drink                          int64
Online boarding                         int64
Seat comfort                            int64
Inflight entertainment                  int64
On-board service                        int64
Leg room service                        int64
Baggage handling                        int64
Checkin service                         int64
Inflight service                        int64
Cleanliness                             int64
Departure Delay in Minutes              int64
Arrival Delay in Minutes              float64
satisfaction                         category
dtype: object

df2 = df.drop(['Departure/Arrival time convenient'], axis=1)

# Note: `mode` holds the whole column here, not its mode, and is not used afterwards.
mode = df2['Type of Travel']
# Fill missing values with the most frequent category.
df2['Type of Travel'] = df2['Type of Travel'].fillna(df2['Type of Travel'].mode()[0])
df2['Type of Travel']

0         Personal Travel
1         Business travel
2         Business travel
3         Business travel
4         Business travel
               ...
103899    Business travel
103900    Business travel
103901    Business travel
103902    Business travel
103903    Business travel
Name: Type of Travel, Length: 103904, dtype: category
Categories (2, object): ['Business travel', 'Personal Travel']

mode = df2['Flight Distance']
df2['Flight Distance'] = df2['Flight Distance'].fillna(df2['Flight Distance'].mode()[0])
df2['Flight Distance']

0          460
1          235
2         1142
3          562
4          214
          ...
103899     192
103900    2347
103901    1995
103902    1000
103903    1723
Name: Flight Distance, Length: 103904, dtype: category
Categories (3802, int64): [31, 56, 67, 73, ..., 4502, 4817, 4963, 4983]

mode = df2['satisfaction']
df2['satisfaction'] = df2['satisfaction'].fillna(df2['satisfaction'].mode()[0])
df2['satisfaction']

0         neutral or dissatisfied
1         neutral or dissatisfied
2                       satisfied
3         neutral or dissatisfied
4                       satisfied
                   ...
103899    neutral or dissatisfied
103900                  satisfied
103901    neutral or dissatisfied
103902    neutral or dissatisfied
103903    neutral or dissatisfied
Name: satisfaction, Length: 103904, dtype: category
Categories (2, object): ['neutral or dissatisfied', 'satisfied']

df2.describe(include='all')

           Unnamed: 0             id  Gender  Customer Type       Age  Type of Travel   Class
count   103904.000000  103904.000000  103904         103904  103904.0          103904  103904
unique            NaN            NaN       2              2      75.0               2       3
[Remaining rows and columns are not legible in the scan.]

df.dtypes

[Output identical to the df.dtypes listing above.]

import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

df2[['Type of Travel', 'On-board service']].groupby(by='Type of Travel').mean()

                 On-board service
Type of Travel
Business travel          3.431233
Personal Travel          3.273776

fig1, ax1 = plt.subplots(figsize=(12,7))
ax1.pie(df2[['Type of Travel', 'On-board service']].groupby(by='Type of Travel').mean(), labels=['Business travel', 'Personal Travel'], autopct='%1.2f%%')
plt.title("Ratio of On-board services between Business travel and Personal Travel")

/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:3: MatplotlibDep
  This is separate from the ipykernel package so we can avoid doing imports u
Text(0.5, 1.0, 'Ratio of On-board services between Business travel and Person

[Figure: pie chart titled "Ratio of On-board services between Business travel and Personal Travel", with slices labelled Business travel and Personal Travel.]

df2.boxplot('On-board service', by='Type of Travel', figsize=(12, 8))

/usr/local/lib/python3.7/dist-packages/numpy/core/_asarray.py:83: VisibleDepr
  return array(a, dtype, copy=False, order=order)

[Figure: "Boxplot grouped by Type of Travel", panel title "On-board service".]

!pip install statsmodels scipy
!pip install statsmodels==0.10.0rc2 --pre

Requirement already satisfied: statsmodels in /usr/local/lib/python3.7/dist-p
Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-package
Requirement already satisfied: patsy>=0.4.0 in /usr/local/lib/python3.7/dist-
Requirement already satisfied: pandas>=0.19 in /usr/local/lib/python3.7/dist-
Requirement already satisfied: numpy>=1.11 in /usr/local/lib/python3.7/dist-p
Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.7/dist-
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/pytho
Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages
Collecting statsmodels==0.10.0rc2
  Downloading statsmodels-0.10.0rc2-cp37-cp37m-manylinux1_x86_64.whl (8.1 MB)
Installing collected packages: statsmodels
  Attempting uninstall: statsmodels
    Found existing installation: statsmodels 0.10.2
    Uninstalling statsmodels-0.10.2:
      Successfully uninstalled statsmodels-0.10.2
Successfully installed statsmodels-0.10.0rc2
WARNING: The following packages were previously imported in this runtime:
  [statsmodels]
You must restart the runtime in order to use newly installed versions.

import scipy.stats as stats
import statsmodels.api as sm
from statsmodels.formula.api import ols

/usr/local/lib/python3.7/dist-packages/statsmodels/tools/_testing.py:19: Futu
  import pandas.util.testing as tm
/usr/local/lib/python3.7/dist-packages/statsmodels/compat/pandas.py:23: Futur
  data_klasses = (pandas.Series, pandas.DataFrame, pandas.Panel)

aloss = df2[['Type of Travel', 'On-board service']].groupby(by='Type of Travel').mean()
fig1, ax1 = plt.subplots(figsize=(12,7))
# The next line is cut off in the scan; the visible part matches the earlier pie-chart cell.
ax1.pie(df2[['Type of Travel', 'On-board service']].groupby(by='Type of Travel').me ...
plt.title("Ratio of On-board services between Business travel and Personal Travel")

/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:3: MatplotlibDep
  This is separate from the ipykernel package so we can avoid doing imports u
Text(0.5, 1.0, 'Ratio of On-board services between Business travel and Person

[Figure: the same pie chart, "Ratio of On-board services between Business travel and Personal Travel".]

sns.barplot(x='Type of Travel', y='On-board service', data=df2)

[Figure: bar plot of On-board service (y axis) by Type of Travel (x axis: Business travel, Personal Travel).]

# The end of this line is cut off in the scan. inplace=True is inferred from the
# later cells, which read 'Traveltype' and 'Onboardservice' from df2.
df2.rename(columns={"Type of Travel": "Traveltype", "On-board service": "Onboardservice"}, inplace=True)

# One-way ANOVA: does the mean on-board service rating differ by travel type?
mod = ols('Onboardservice ~ Traveltype', data=df2).fit()
aov_table = sm.stats.anova_lm(mod, typ=2)
aov_table

                   sum_sq        df          F        PR(>F)
Traveltype     551.384767       1.0  333.25036  2.462413e-74
Residual    171912.732870  103902.0        NaN           NaN

from sklearn import preprocessing
Label_Encoder = preprocessing.LabelEncoder()
df2["Travel_type"] = Label_Encoder.fit_transform(df2["Traveltype"])
df2["Customer_type"] = Label_Encoder.fit_transform(df2["Customer Type"])

df_Traveltype = pd.get_dummies(df2['Traveltype'], drop_first=True)
df_Traveltype.head()

   Personal Travel
0                1
1                0
2                0
3                0
4                0

df_Customertype = pd.get_dummies(df2['Customer Type'], drop_first=True)
df_Customertype.head()

   disloyal Customer
0                  0
1                  1
2                  0
3                  0
4                  0

# Both ends of the next line are cut off in the scan; `data_final = pd.concat(` is
# inferred from the visible "at([" and from the use of data_final below.
data_final = pd.concat([df2[['Customer Type', 'Traveltype', 'Onboardservice', 'Gender', 'Age', 'satisfaction' ...
data_final.head()

[Output: first rows of data_final, with columns including Traveltype, Onboardservice, Gender, Age and satisfaction; most values are not legible in the scan.]

from statsmodels.formula.api import ols

fit = ols('Onboardservice ~ Gender + Traveltype', data=data_final).fit()
fit.summary()

OLS Regression Results
Dep. Variable:      Onboardservice    R-squared:                 0.003
Model:              OLS               Adj. R-squared:            0.003
Method:             Least Squares     F-statistic:               170.4
Date:               Sat, 18 Sep 2021  Prob (F-statistic):     1.26e-74
Time:               19:05:11          Log-Likelihood:      -1.7359e+05
No. Observations:   103904            AIC:                   3.472e+05
Df Residuals:       103901            BIC:                   3.472e+05
Df Model:           2
Covariance Type:    nonrobust

                                  coef  std err        t  P>|t|  [0.025  0.975]
Intercept                       3.4204    0.006  551.972  0.000   3.408   3.433
Gender[T.Male]                  0.0220    0.008    2.759  0.006   0.006   0.038
Traveltype[T.Personal Travel]  -0.1576    0.009  -18.274  0.000  -0.175  -0.141

Omnibus:         15870.929    Durbin-Watson:         1.988
Prob(Omnibus):       0.000    Jarque-Bera (JB):   6415.187
Skew:               -0.420    Prob(JB):               0.00
Kurtosis:            2.119    Cond. No.               2.89

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
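The ANOVA table in the notebook reports sum_sq, df and F for Traveltype without showing how F is derived. The sketch below is a hand-rolled illustration using only the standard library, not the statsmodels implementation: it computes a classical one-way ANOVA F-statistic on a small made-up sample, then recomputes F from the sums of squares printed in the notebook's table. The function names `one_way_anova` and `f_from_table` and the `toy` ratings are hypothetical and introduced here only for illustration.

```python
from statistics import mean


def one_way_anova(groups):
    """Classical one-way ANOVA for a dict mapping group label -> list of scores.

    Returns (ss_between, df_between, ss_within, df_within, f_stat).
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Within-group (residual) sum of squares: squared distance of each
    # observation from its own group mean.
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, df_between, ss_within, df_within, f_stat


def f_from_table(ss_effect, df_effect, ss_resid, df_resid):
    """F = mean square of the effect divided by the residual mean square."""
    return (ss_effect / df_effect) / (ss_resid / df_resid)


# Made-up ratings, NOT taken from the airline data set.
toy = {
    "Business travel": [4, 3, 5, 4],
    "Personal Travel": [3, 2, 4, 3],
}
ss_b, df_b, ss_w, df_w, f_toy = one_way_anova(toy)
print("toy data:", ss_b, df_b, ss_w, df_w, f_toy)

# Sums of squares and degrees of freedom as printed in the notebook's table.
f_notebook = f_from_table(551.384767, 1.0, 171912.732870, 103902.0)
print("F recomputed from the notebook table:", round(f_notebook, 2))
```

With a single two-level factor such as Traveltype, the residual degrees of freedom are the number of observations minus the number of groups, which is consistent with the 103902 shown in the table for 103904 rows.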
