Customer Analysis Project1
PROJECT DESCRIPTION
IN THIS PROJECT THE GIVEN DATASET DESCRIBES THE COMPLAINTS FILED BY THE POLICE
DEPARTMENT IN THE U.S. AND THE TIME TAKEN TO RESPOND TO EVERY COMPLAINT AND CLOSE IT. THE NULL
VALUES AND UNUSED COLUMNS ARE TO BE CLEANED TO PERFORM STATISTICAL ANALYSIS, AND THE ROWS WITH NO CLOSED
DATE ARE TO BE REMOVED FROM THE DATASET IN ORDER TO CHECK THE MEAN TIME TAKEN FOR THE
COMPLAINTS TO BE RESOLVED.
APPROACH:
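import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Any further read_csv arguments are cut off in the source.
cust_dat=pd.read_csv('311_Service_Requests_from_2010_to_Present.csv', header=0)
# Count the null values in each column and show them as a bar chart.
null_values=cust_dat.isna().sum()
plt.figure(figsize=(30,10))
x=cust_dat.columns
y=null_values
plt.xticks(rotation=70)
plt.bar(x,y)
plt.show()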
cust_dat.dropna(subset=['Closed Date'],inplace=True)
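The project description also calls for cleaning unused columns. A minimal sketch of one way to do that, assuming a column counts as unused when more than half of its values are null. The 0.5 threshold, the `sample` frame and its values are illustrative assumptions, not taken from the project.

```python
import pandas as pd

# Toy stand-in for the 311 data (values are made up for illustration).
sample = pd.DataFrame({
    "Unique Key": [1, 2, 3, 4],
    "Closed Date": ["01/01/2015", None, "01/02/2015", "01/03/2015"],
    "School Name": [None, None, None, "Unspecified"],
})

# Share of null values per column, between 0 and 1.
null_share = sample.isna().mean()

# Assumed rule: a column with more than 50% nulls is treated as unused.
mostly_empty = null_share[null_share > 0.5].index

# Drop those columns, then drop the rows that have no closed date.
cleaned = sample.drop(columns=mostly_empty).dropna(subset=["Closed Date"])
print(cleaned)
```

On the real data the same two steps would be applied to `cust_dat`, with the threshold chosen after looking at the null-value bar chart.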
THE COMPLAINT DURATION IS DIVIDED INTO TWO PARTS, RESPONSE TIME AND CLOSING TIME.
THE RESPONSE TIME IS THE TIME BETWEEN THE CREATED TIME AND THE RESOLUTION ACTION
UPDATED TIME.
THE CLOSING TIME IS THE TIME BETWEEN THE RESOLUTION ACTION UPDATED TIME AND THE
CLOSED TIME.
LATER, THE AVERAGE RESPONSE TIME AND THE AVERAGE CLOSING TIME ARE CALCULATED FOR
THE STATISTICAL ANALYSIS.
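The two durations defined above can be sketched as follows. The column names `Created Date`, `Resolution Action Updated Date` and `Closed Date`, the date format, and the choice of hours as the unit are assumptions. The source only shows the resulting `RESPONSE_TIME` and `CLOSING_TIME` columns, not how they were built.

```python
import pandas as pd

# Toy stand-in for the 311 data; column names are assumed to match the CSV.
sample = pd.DataFrame({
    "Created Date": ["01/01/2015 10:00:00 AM", "01/01/2015 11:00:00 AM"],
    "Resolution Action Updated Date": ["01/01/2015 12:00:00 PM", "01/01/2015 02:00:00 PM"],
    "Closed Date": ["01/01/2015 01:00:00 PM", "01/01/2015 02:30:00 PM"],
})

# Parse the three timestamp columns (12-hour clock with AM/PM assumed).
for col in ["Created Date", "Resolution Action Updated Date", "Closed Date"]:
    sample[col] = pd.to_datetime(sample[col], format="%m/%d/%Y %I:%M:%S %p")

# Response time: created -> resolution action updated, in hours.
sample["RESPONSE_TIME"] = (
    sample["Resolution Action Updated Date"] - sample["Created Date"]
).dt.total_seconds() / 3600

# Closing time: resolution action updated -> closed, in hours.
sample["CLOSING_TIME"] = (
    sample["Closed Date"] - sample["Resolution Action Updated Date"]
).dt.total_seconds() / 3600

avg_response = sample["RESPONSE_TIME"].mean()
avg_closing = sample["CLOSING_TIME"].mean()
print(avg_response, avg_closing)
```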
cust_dat['City']=cust_dat['City'].fillna('UNKNOWN CITY')
THE CITY COLUMN HAS NULL VALUES WHICH ARE IMPUTED WITH THE VALUE 'UNKNOWN CITY'
USING THE FILLNA FUNCTION.
cust_dat['Complaint Type'].value_counts().plot(kind='bar',figsize=(10,5))
new_df=pd.crosstab(index=cust_dat['Complaint Type'],columns=cust_dat['City'])
TO SHOW THE DIFFERENT COMPLAINT TYPES IN DIFFERENT COLORS, WE PLOT THE CROSSTAB AS A STACKED BAR CHART
new_df1=pd.crosstab(index=cust_dat['City'],columns=cust_dat['Complaint Type'])
new_df1.plot(kind='bar',figsize=(30,10),stacked=True,colormap="Paired")
plt.show()
Noise - Street/Sidewalk and Blocked Driveway are the major complaints in most of the cities.
PIE GRAPH TO SHOW THE BOROUGH:
cust_dat['Borough'].nunique()
THE ABOVE CODE DISPLAYS THE NUMBER OF UNIQUE VALUES IN BOROUGH, WHICH HELPS IN
MAKING THE PIE GRAPH
plt.figure(figsize=(20,10))
explode=(0.15,0.05,0,0,0)
# value_counts() already supplies the borough names as the slice labels.
cust_dat['Borough'].value_counts().head(5).plot(kind='pie',explode=explode,autopct='%1.1f%%',startangle=70)
plt.axis('equal')
AVERAGE CLOSING TIME IS CALCULATED FROM THE MEAN OF CLOSING TIME OR RESOLUTION
TIME
A HISTOGRAM IS PLOTTED TO CHECK THE FREQUENCY DISTRIBUTION OF THE RESPONSE TIME
plt.figure(figsize=(10,6))
sns.histplot(cust_dat['RESPONSE_TIME'],kde=False)
plt.title('RESPONSE TIME')
plt.show()
A HISTOGRAM IS PLOTTED TO CHECK THE FREQUENCY DISTRIBUTION OF THE CLOSING TIME OR THE
RESOLUTION TIME
plt.figure(figsize=(10,6))
sns.histplot(cust_dat['CLOSING_TIME'],kde=False)
plt.title("CLOSING TIME")
plt.show()
THE SIGNIFICANT VARIABLES ASSOCIATED WITH THE RESOLUTION TIME:
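# numeric_only=True restricts the correlation to numeric columns (needed on recent pandas versions).
COLS=cust_dat.corr(numeric_only=True).nlargest(10,'Resolution_Time')["Resolution_Time"].index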
WE DRAW THE HEAT MAP TO SEE THE CORRELATION BETWEEN THE VARIABLES
plt.figure(figsize=(10,6))
sns.heatmap(cust_dat[COLS].corr(),annot=True)
plt.show()
THERE ARE SEVEN SIGNIFICANT VARIABLES ASSOCIATED WITH THE RESOLUTION TIME.
IN ORDER TO CHECK THE AVERAGE RESOLUTION TIME AND COMPARE THE MEANS OF THE COMPLAINT TYPES, THE FOLLOWING TESTS ARE CONDUCTED:
ONE-WAY-ANOVA
ONE-WAY ANALYSIS OF VARIANCE IS CONDUCTED BETWEEN THE COMPLAINT TYPES WHICH SHOW
MAJOR VARIATIONS IN THE AVERAGE RESOLUTION TIME:
DERELICT VEHICLE
AGENCY ISSUES
NOISE - STREET/SIDEWALK
POSTING ADVERTISEMENT
4. fvalue, pvalue = stats.f_oneway(PA,NSS)
pvalue
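A minimal sketch of one such pairwise test with `scipy.stats.f_oneway`. It assumes that `PA` and `NSS` hold the resolution times for the Posting Advertisement and Noise - Street/Sidewalk complaints. The source does not show how these arrays were built, so the toy frame, its values and the 0.05 significance level are illustrative assumptions.

```python
import pandas as pd
from scipy import stats

# Toy stand-in: resolution times in hours for two complaint types (made-up values).
sample = pd.DataFrame({
    "Complaint Type": ["Posting Advertisement"] * 5 + ["Noise - Street/Sidewalk"] * 5,
    "Resolution_Time": [1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 4.5, 5.0, 5.5, 6.0],
})

# One array of resolution times per complaint type (assumed meaning of PA and NSS).
PA = sample.loc[sample["Complaint Type"] == "Posting Advertisement", "Resolution_Time"]
NSS = sample.loc[sample["Complaint Type"] == "Noise - Street/Sidewalk", "Resolution_Time"]

# H0: both complaint types have the same mean resolution time.
fvalue, pvalue = stats.f_oneway(PA, NSS)

# Reject H0 when the p-value is below the chosen significance level.
reject_h0 = pvalue < 0.05
print(fvalue, pvalue, reject_h0)
```

With only two groups the one-way ANOVA is equivalent to a two-sample t-test with equal variances; passing more than two arrays to `f_oneway` compares all the groups in a single test.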
ANOVA TABLE:
cust_dat['Complaint_Type']=cust_dat['Complaint Type']
anova_table
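The code that builds `anova_table` is not shown in the source. As an illustration only, the quantities in a one-way ANOVA table can be computed by hand with pandas, which may differ from the method the project actually used. The toy values are assumptions; `Complaint_Type` and `Resolution_Time` follow the names that appear in the source.

```python
import pandas as pd
from scipy import stats

# Toy stand-in: three complaint types with three resolution times each (made-up values).
data = pd.DataFrame({
    "Complaint_Type": ["Derelict Vehicle"] * 3 + ["Agency Issues"] * 3 + ["Posting Advertisement"] * 3,
    "Resolution_Time": [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
})

grand_mean = data["Resolution_Time"].mean()
groups = data.groupby("Complaint_Type")["Resolution_Time"]

# Between-group sum of squares: group size times squared distance of group mean from grand mean.
ss_between = (groups.count() * (groups.mean() - grand_mean) ** 2).sum()
# Within-group sum of squares: squared distance of each value from its own group mean.
ss_within = groups.apply(lambda g: ((g - g.mean()) ** 2).sum()).sum()

df_between = groups.ngroups - 1
df_within = len(data) - groups.ngroups

# F statistic is the ratio of the two mean squares; the p-value is the upper tail of F.
f_stat = (ss_between / df_between) / (ss_within / df_within)
p_value = stats.f.sf(f_stat, df_between, df_within)

anova_table = pd.DataFrame(
    {
        "sum_sq": [ss_between, ss_within],
        "df": [df_between, df_within],
        "F": [f_stat, float("nan")],
        "PR(>F)": [p_value, float("nan")],
    },
    index=["Complaint_Type", "Residual"],
)
print(anova_table)
```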
CHI-SQUARE TEST
dof=1166
[[4.48019550e-03 1.27298646e-01 1.43773546e-02 ... 4.92006924e-02
7.10110987e-02 2.38264942e-03]
[5.70328887e+00 1.62051176e+02 1.83023725e+01 ... 6.26324814e+01
9.03971286e+01 3.03311272e+00]
[7.46699250e-04 2.12164410e-02 2.39622577e-03 ... 8.20011540e-03
1.18351831e-02 3.97108237e-04]
...
[3.30265078e+00 9.38403184e+01 1.05985066e+01 ... 3.62691104e+01
5.23470149e+01 1.75640973e+00]
[4.38312460e-01 1.24540508e+01 1.40658453e+00 ... 4.81346774e+00
6.94725249e+00 2.33102535e-01]
[2.80534908e+00 7.97101687e+01 9.00262024e+00 ... 3.08078336e+01
4.44647829e+01 1.49193565e+00]]
probability=0.950, critical=1246.552, stat=122561.494
Dependent (reject H0)
significance=0.050, p=0.000
Dependent (reject H0)
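The code behind the output above is not shown in the source. A minimal sketch that produces output of the same shape with `scipy.stats.chi2_contingency`, assuming the test is run on a crosstab of complaint type against city. That pairing of variables, the toy data and the 0.95 probability level are assumptions.

```python
import pandas as pd
from scipy.stats import chi2, chi2_contingency

# Toy stand-in for the 311 data (made-up counts).
sample = pd.DataFrame({
    "Complaint Type": ["Blocked Driveway"] * 30 + ["Illegal Parking"] * 30,
    "City": ["BROOKLYN"] * 25 + ["BRONX"] * 5 + ["BROOKLYN"] * 5 + ["BRONX"] * 25,
})

# Contingency table of observed counts.
table = pd.crosstab(sample["Complaint Type"], sample["City"])

# H0: complaint type and city are independent.
stat, p, dof, expected = chi2_contingency(table)
print("dof=%d" % dof)
print(expected)

# Decision by comparing the test statistic with the critical value.
prob = 0.95
critical = chi2.ppf(prob, dof)
print("probability=%.3f, critical=%.3f, stat=%.3f" % (prob, critical, stat))
print("Dependent (reject H0)" if stat >= critical else "Independent (fail to reject H0)")

# Equivalent decision by comparing the p-value with the significance level.
alpha = 1.0 - prob
print("significance=%.3f, p=%.3f" % (alpha, p))
print("Dependent (reject H0)" if p <= alpha else "Independent (fail to reject H0)")
```

Both decision rules always agree, so the two "Dependent (reject H0)" lines in the output above are the same conclusion reached in two ways.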
RESULT