Final Elucidate AI project (Batch_ID) 11/14/21, 9:01 PM

Christina's 5 Analytical Questions

Q1: Who were our most loyal customers?

Q2: Did longer calls yield higher sales?

Q3: On average, were males more likely to call
than females? If so, how much more? Knowing
this could help target the company's
marketing campaigns and offer a more effective
way to tailor the messages according to one's
gender. For example, one could use specific
wording aimed at listeners' occupations, at
mothers, at businessmen who like things to be
concise, or at people who need extra time to
understand the full scope of the products or
services offered before making an informed
decision. This would help the telemarketer
relate better to the listener, and thus lead to
more successful sales calls.
Q4: Did married couples close more sales due
to their combined incomes, or was there an
equal distribution between singles and married
couples in the number of calls made?
Q5: Out of the 30-40 year olds who were
mainly targeted, and the 50-60 year olds who
http://localhost:8888/nbconvert/html/Final%20Elucidate%20AI%20project%20(Batch_ID)%20.ipynb?download=false Page 1 of 10
were the next group to be targeted, which
coverage plan gained the most popularity?
Was it selected for its price or for an added
health feature that compelled clients to
choose one over the other?
#Note to self: See my notebook on Google Drive.
Graphs to Consider:
#1. Bar graph - great way to show relative sizes: could depict the most popular vs. least popular cover plan
sns.barplot

#2. Box and Whisker plot - great for depicting numerical data (such as number of sales made) through their quartiles
sns.boxplot(x=df["Sale_Status"], y=df["Verified_Date"])

#3. Heat map - appropriate to use for conversion rate and revenue for Qs 2, 3, 4
A graphical representation of data where each value of a matrix is represented as a color.
Create a dataset:
df = pd.DataFrame(np.random.random((5,5)), columns=["a","b","c","d","e"])
Default heatmap:
p1 = sns.heatmap(df)

#4. Violin plot - comparing Marital_Status vs Age. Formula:
sns.violinplot(data=df, x="Marital_Status_x", y="Age")

#5. Plot bar graph. Example:
df_calls = df.groupby(['CampaignID']).count()
df_calls = df_calls.drop(df_calls.columns.difference(['Cust_ID']), axis=1)
df_calls = df_calls.rename(columns={"Cust_ID": "Total # of calls"})
df_calls.sort_values("Total # of calls", ascending=False, inplace=True)
df_calls.plot.bar()
df_calls.head(20).plot.bar()

#6. Correlogram
# library & dataset
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv('data_post2021.csv')  # sns.load_dataset only fetches seaborn's bundled example datasets
# Basic correlogram
sns.pairplot(df)
plt.show()

#Useful column names to consider: 1) Cover_Level 2) Family_To_Cover 3) Cust_Sex
4) Policy_Status 5) Premium 6) Product_Category (use this one) 7) Benefit_Level
8) HistoryID 9) Sale_Status (use this one) 10) Verified_Date
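
For graph #3, the matrix behind a conversion-rate heat map could be built with a pivot table. The sketch below is only an illustration on made-up rows: the column names Marital_Status_x, Cust_Sex and Sale_Status come from the notes above, but the values (including the "Sold" label) and the helper column is_sale are assumptions, not the real data.

```python
import pandas as pd

# Made-up sample rows; the real values in data_post2021.csv may differ.
sample = pd.DataFrame({
    "Marital_Status_x": ["Married", "Married", "Single", "Single", "Single", "Married"],
    "Cust_Sex": ["M", "F", "M", "F", "M", "F"],
    "Sale_Status": ["Sold", "Not sold", "Not sold", "Sold", "Not sold", "Sold"],
})

# 1 when the call ended in a sale, 0 otherwise ("Sold" is an assumed label).
sample["is_sale"] = (sample["Sale_Status"] == "Sold").astype(int)

# The mean of is_sale per group is the conversion rate for that cell of the matrix.
conversion = sample.pivot_table(
    index="Marital_Status_x", columns="Cust_Sex", values="is_sale", aggfunc="mean"
)
print(conversion)
# The matrix can then be drawn with sns.heatmap(conversion, annot=True).
```

Each cell holds the share of calls in that group that ended in a sale, which is the kind of matrix sns.heatmap expects.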

Keep: 'Call_Result', 'avg_est_income', 'avg_bal_01', 'avg_bal_avail',
'Marital_Status_x', 'Postal_Code', 'Cust_Sex', 'Batch_ID', 'Age',
'ListSegment', 'Policy_no'

'Batch_ID' is super helpful. This column has 58 unique entries and indicates the sequence in
which we dialled the leads for the campaign. The exact definition of this column and each of
its entries will be beneficial.

'Policy_no' is useful because it is part of the Customer Data History dataset, which represents
the information of the policies sold successfully to clients over the phone. This is the history
of the customers with policy information that has previously been sold to the customer; it will
indicate if the policy is active or not, along with other relevant information.

Policy_Status was the only one that needed clarification:
A – Active policy: based on feedback from the client, these policies are still active (premium
paying) on their policy admin system.
C – Cancelled policy: based on feedback from the client, these policies have either lapsed or
have been cancelled on their policy admin system.

Drop: 'CampaignID', 'Cust_ID', 'Call_Start', 'Call_End', 'Connection_ID', 'Emp_ID',
'Call_Time_seconds', 'wage_earner', 'ID_No', 'Lang_x', 'InceptionDateCorrected',
'Campaign_Type', 'Team_ID', 'EmploymentDate', 'Employee_Gender', 'Race'
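
The Keep list above could be applied by selecting only the wanted columns and reporting any that are absent. This is a minimal sketch on a hypothetical miniature frame; the names keep_cols, present and missing are illustrative, and the real file's columns may differ.

```python
import pandas as pd

# Hypothetical miniature frame with a few of the columns named above.
df = pd.DataFrame({
    "Cust_ID": [1, 2],
    "CampaignID": [10, 10],
    "Age": [34, 57],
    "Cust_Sex": ["F", "M"],
    "Batch_ID": [3, 7],
})

keep_cols = ["Call_Result", "Cust_Sex", "Batch_ID", "Age"]

# Split the wanted names into those that exist and those that do not,
# so a misspelled or absent column is reported instead of raising KeyError.
present = [c for c in keep_cols if c in df.columns]
missing = [c for c in keep_cols if c not in df.columns]

kept = df[present]
print(list(kept.columns))
print("not found:", missing)
```

Selecting the Keep columns this way makes the separate Drop list unnecessary, since everything not named is left out.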
In [16]:
# Q1) Who were our most loyal customers?

In [33]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

In [34]:
Batch_ID = pd.read_csv('data_post2021.csv')
Batch_ID.head() #This column has 58 unique entries. The exact definition of this column and each of its entries will be beneficial.
#This column indicates the sequence we dialled the leads for the campaign

---------------------------------------------------------------------------
IsADirectoryError Traceback (most recent call last)
<ipython-input-34-72523357077b> in <module>
----> 1 Batch_ID = pd.read_csv('data_post2021.csv')
2 Batch_ID.head() #This column has 58 unique entries. The exact definition of this column and each of its entries will be beneficial.
3 #This column indicates the sequence we dialled the leads for the campaign

/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
608 kwds.update(kwds_defaults)
609
--> 610 return _read(filepath_or_buffer, kwds)
611
612

/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
460
461 # Create the parser.
--> 462 parser = TextFileReader(filepath_or_buffer, **kwds)
463
464 if chunksize or iterator:

/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
817 self.options["has_index_names"] = kwds["has_index_names"]
818
--> 819 self._engine = self._make_engine(self.engine)
820
821 def close(self):

/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py in _make_engine(self, engine)
1048 )
1049 # error: Too many arguments for "ParserBase"
-> 1050 return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
1051
1052 def _failover_to_python(self):

/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py in __init__(self, src, **kwds)

1865
1866 # open handles
-> 1867 self._open_handles(src, kwds)
1868 assert self.handles is not None
1869 for key in ("storage_options", "encoding", "memory_map", "compression"):

/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py in _open_handles(self, src, kwds)
1360 Let the readers open IOHanldes after they are done with their potential raises.
1361 """
-> 1362 self.handles = get_handle(
1363 src,
1364 "r",

/opt/anaconda3/lib/python3.8/site-packages/pandas/io/common.py in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
640 errors = "replace"
641 # Encoding
--> 642 handle = open(
643 handle,
644 ioargs.mode,

IsADirectoryError: [Errno 21] Is a directory: 'data_post2021.csv'
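
The IsADirectoryError above says that 'data_post2021.csv' is a folder, not a file, so pd.read_csv has nothing to parse. Because the load failed, Batch_ID was never assigned, which is why the later cells raise NameError. One possible guard, sketched under assumptions: the helper resolve_csv and the inner file name part-0000.csv are hypothetical, and the actual contents of the folder are not known from this notebook.

```python
from pathlib import Path
import tempfile

import pandas as pd


def resolve_csv(path):
    """Return a CSV file path; if `path` is a directory, pick the single CSV inside it."""
    p = Path(path)
    if p.is_dir():
        candidates = sorted(p.glob("*.csv"))
        if len(candidates) != 1:
            raise FileNotFoundError(
                f"expected exactly one CSV in {p}, found {len(candidates)}"
            )
        return candidates[0]
    return p


# Demo in a temporary folder that mimics a directory named like a CSV file.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "data_post2021.csv"
    folder.mkdir()
    (folder / "part-0000.csv").write_text("Batch_ID,Age\n1,34\n2,57\n")
    loaded = pd.read_csv(resolve_csv(folder))

print(loaded.shape)
```

Checking Path('data_post2021.csv').is_dir() in the notebook before loading would confirm whether this is what happened.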

In [35]:
Batch_ID.shape #loading and inspecting data

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-35-373762c67413> in <module>
----> 1 Batch_ID.shape #loading and inspecting data

NameError: name 'Batch_ID' is not defined

In [20]:
Batch_ID.dtypes #loading and inspecting data

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-20-148267869b91> in <module>
----> 1 Batch_ID.dtypes #loading and inspecting data

NameError: name 'Batch_ID' is not defined

In [21]:
Batch_ID.columns #loading and inspecting data

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-21-ef58acf42f5c> in <module>
----> 1 Batch_ID.columns #loading and inspecting data

NameError: name 'Batch_ID' is not defined

In [22]:
Batch_ID.apply('nunique') #loading and inspecting data

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-22-8ec1553fa572> in <module>
----> 1 Batch_ID.apply('nunique') #loading and inspecting data

NameError: name 'Batch_ID' is not defined

In [23]:
Batch_ID = Batch_ID.drop([['CampaignID', 'Cust_ID', 'Effective_Date','Call_Start', 'Verified_Date','Call_End', 'Connection_ID', 'Emp_ID', 'Call_Time_seconds', 'wage_earner', 'ID_No', 'Lang_x','InceptionDateCorrected','Campaign_Type', 'Team_ID', 'EmploymentDate', 'Employee_Gender', 'Race'], axis=1)

File "<ipython-input-23-a94b346ac738>", line 1
Batch_ID = Batch_ID.drop([['CampaignID', 'Cust_ID', 'Effective_Date','Call_Start', 'Verified_Date','Call_End', 'Connection_ID', 'Emp_ID', 'Call_Time_seconds', 'wage_earner', 'ID_No', 'Lang_x','InceptionDateCorrected','Campaign_Type', 'Team_ID', 'EmploymentDate', 'Employee_Gender', 'Race'], axis=1)
^
SyntaxError: invalid syntax
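
The SyntaxError in this cell comes from the brackets: the call opens two square brackets but closes only one, so axis=1 ends up inside the list. A minimal sketch of the same drop with one flat list, shown on a made-up frame (the variable names frame and to_drop are illustrative):

```python
import pandas as pd

# Made-up single-row frame with a few of the columns from the cell above.
frame = pd.DataFrame({"CampaignID": [1], "Cust_ID": [2], "Race": ["x"], "Age": [40]})

# One flat list of labels, with every bracket closed.
to_drop = ["CampaignID", "Cust_ID", "Race"]
frame = frame.drop(columns=to_drop)
print(frame.columns.tolist())
```

Building the list in its own variable first makes an unbalanced bracket easier to spot.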

In [24]:
Batch_ID = Batch_ID.drop(['CampaignID'], axis=1)

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-24-99d56e58c6a8> in <module>
----> 1 Batch_ID = Batch_ID.drop(['CampaignID'], axis=1)

NameError: name 'Batch_ID' is not defined

In [25]:
Batch_ID = Batch_ID.rename(columns={"Cust_Sex": "Cust_Gender"})
Batch_ID.head()

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-25-2d577942fb8f> in <module>
----> 1 Batch_ID = Batch_ID.rename(columns={"Cust_Sex": "Cust_Gender"})
2 Batch_ID.head()

NameError: name 'Batch_ID' is not defined

In [26]:
Batch_ID = Batch_ID.rename(columns={"Avg_est_income": "Avg_income"})
Batch_ID.head()

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-26-e69b42e36748> in <module>
----> 1 Batch_ID = Batch_ID.rename(columns={"Avg_est_income": "Avg_income"})
2 Batch_ID.head()

NameError: name 'Batch_ID' is not defined
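
Separately from the NameError, note that the Keep list spells this column 'avg_est_income' while the cell renames "Avg_est_income". Which spelling the real file uses is not known here, but by default rename skips keys that match no column without any warning. A small sketch of that behaviour on a made-up frame:

```python
import pandas as pd

# Made-up frame that uses the lowercase spelling from the Keep list.
frame = pd.DataFrame({"Cust_Sex": ["F"], "avg_est_income": [1200.0]})

# Default behaviour: a key that matches no column is skipped silently.
unchanged = frame.rename(columns={"Avg_est_income": "Avg_income"})
print(unchanged.columns.tolist())

# errors="raise" turns the silent miss into a KeyError.
try:
    frame.rename(columns={"Avg_est_income": "Avg_income"}, errors="raise")
    caught = False
except KeyError:
    caught = True
print("KeyError raised:", caught)

# Both renames in one call, using the spelling that exists in this frame.
renamed = frame.rename(
    columns={"Cust_Sex": "Cust_Gender", "avg_est_income": "Avg_income"},
    errors="raise",
)
print(renamed.columns.tolist())
```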

In [27]:
print(Batch_ID.shape) #removing duplicates
duplicate_rows_df = df[df.duplicated()] #rows containing duplicate data

print(duplicate_rows_df.shape)

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-27-c4c486b6b206> in <module>
----> 1 print(Batch_ID.shape) #removing duplicates
2 duplicate_rows_df = df[df.duplicated()] #rows containing duplicate data
3
4 print(duplicate_rows_df.shape)

NameError: name 'Batch_ID' is not defined

In [28]:
Batch_ID = Batch_ID.drop_duplicates(keep='Verified_Date')
print(Batch_ID.shape)

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-28-db91953c01d4> in <module>
----> 1 Batch_ID = Batch_ID.drop_duplicates(keep='Verified_Date')
2 print(Batch_ID.shape)

NameError: name 'Batch_ID' is not defined
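
Two details in the duplicate-handling cells would still fail once the data loads: In [27] mixes the names df and Batch_ID, and drop_duplicates only accepts 'first', 'last' or False for keep; a column name belongs in subset. A hedged sketch on made-up rows (the frame calls and its values are illustrative only):

```python
import pandas as pd

# Made-up rows; only the column names come from the notebook.
calls = pd.DataFrame({
    "Policy_no": ["P1", "P1", "P2", "P2", "P3"],
    "Verified_Date": ["2021-01-05", "2021-01-05", "2021-02-01", "2021-02-01", "2021-03-09"],
    "Premium": [100, 100, 250, 300, 80],
})

# Rows that repeat an earlier row in every column.
exact_dupes = calls[calls.duplicated()]
print(exact_dupes.shape)

# One row per Verified_Date: `subset` names the column, `keep` picks which copy survives.
deduped = calls.drop_duplicates(subset=["Verified_Date"], keep="first")
print(deduped.shape)
```

Whether Verified_Date alone is the right key for deduplication depends on the data; a combination such as Policy_no plus Verified_Date may be safer.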

In [29]:
Batch_ID.dtypes #data types

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-29-1cdb81cf53b0> in <module>
----> 1 Batch_ID.dtypes #data types

NameError: name 'Batch_ID' is not defined

In [30]:
Batch_ID = Batch_ID.drop(["Verified_Date", "Postal_Cde","Effective_Date"], axis=1)
Batch_ID.head()

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-30-964043ab9a00> in <module>
----> 1 Batch_ID = Batch_ID.drop(["Verified_Date", "Postal_Cde","Effective_Date"], axis=1)
2 Batch_ID.head()

NameError: name 'Batch_ID' is not defined

In [31]:
Batch_ID['Verified_Date'] = pd.to_datetime(Batch_ID['Verified_Date']) #needed to be renamed
Batch_ID.info()

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-31-946ff7e96323> in <module>
----> 1 Batch_ID['Verified_Date'] = pd.to_datetime(Batch_ID['Verified_Date']) #needed to be renamed
2 Batch_ID.info()

NameError: name 'Batch_ID' is not defined
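
Note that In [30] drops Verified_Date and In [31] then tries to convert it, so the conversion would need to run first (or the column would need to be kept). A minimal sketch of the conversion itself, assuming the column holds date-like strings with some bad entries; the sample values are made up:

```python
import pandas as pd

# Made-up values: one valid timestamp, one bad string, one missing entry.
frame = pd.DataFrame({"Verified_Date": ["2021-03-01 10:15:00", "not a date", None]})

# errors="coerce" turns unparseable entries into NaT instead of raising.
frame["Verified_Date"] = pd.to_datetime(frame["Verified_Date"], errors="coerce")

print(frame["Verified_Date"].dtype)
print(frame["Verified_Date"].isna().sum(), "values could not be parsed")
```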

In [32]:
Batch_ID.Postal_Code = pd.to_int(Batch_ID["Postal_Code"]) #needed to be renamed
print(Batch_ID.dtypes)

---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-32-6cdeb9dea063> in <module>
----> 1 Batch_ID.Postal_Code = pd.to_int(Batch_ID["Postal_Code"]) #needed to be renamed
2 print(Batch_ID.dtypes)

/opt/anaconda3/lib/python3.8/site-packages/pandas/__init__.py in __getattr__(name)
242 return _SparseArray
243
--> 244 raise AttributeError(f"module 'pandas' has no attribute '{name}'")
245
246

AttributeError: module 'pandas' has no attribute 'to_int'

In [40]:
Batch_ID["Postal_Code"] = Batch_ID["Postal_Code”].astype(int)

File "<ipython-input-40-1b2da4f23f34>", line 1


Batch_ID["Postal_Code"] = Batch_ID["Postal_Code”].astype(int)
^
SyntaxError: EOL while scanning string literal

In [41]:
Batch_ID["Postal_Code"] = Batch_ID["Postal_Code"].astype(int)

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-41-bb0351a0cb47> in <module>
----> 1 Batch_ID["Postal_Code"] = Batch_ID["Postal_Code"].astype(int)

NameError: name 'Batch_ID' is not defined
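
pandas has no to_int function, and the SyntaxError in In [40] comes from the curly closing quote (”) in place of a straight one. Once the data loads, astype(int) would still raise if the column contains missing or non-numeric entries. One possible alternative, sketched on made-up values:

```python
import pandas as pd

# Made-up postal codes, including a missing value and a non-numeric entry.
frame = pd.DataFrame({"Postal_Code": ["2196", "8001", None, "n/a"]})

# Unparseable entries become NaN instead of raising.
codes = pd.to_numeric(frame["Postal_Code"], errors="coerce")

# Nullable Int64 (capital I) can hold whole numbers alongside missing values.
frame["Postal_Code"] = codes.astype("Int64")

print(frame["Postal_Code"].dtype)
print(frame["Postal_Code"].isna().sum(), "missing")
```

Integer conversion removes leading zeros, so if any postal codes start with 0, keeping the column as a string may be the better choice.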

In [ ]:
Batch_ID["Cover_Amount"] = Batch_ID["Cover_Amount"].astype("int64") #convert to 64-bit integer (pandas has no to_int64)
print(Batch_ID.dtypes)

In [ ]:
print(Batch_ID.isnull().sum()) #missing values
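
The missing-value count could be extended to a small summary table that also shows the share of rows affected. This is a sketch on made-up data; the names summary, missing and percent are illustrative.

```python
import pandas as pd

# Made-up frame with a few gaps.
frame = pd.DataFrame({
    "Age": [34, None, 57, 41],
    "Cust_Sex": ["F", "M", None, None],
    "Premium": [100.0, 80.0, 95.0, 120.0],
})

missing = frame.isnull().sum()

# Count and percentage of missing values per column, worst column first.
summary = pd.DataFrame({
    "missing": missing,
    "percent": (missing / len(frame) * 100).round(1),
}).sort_values("missing", ascending=False)

print(summary)
```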

In [36]:
! pip install missingno

Requirement already satisfied: missingno in /opt/anaconda3/lib/python3.8/site-packages (0.5.0)
Requirement already satisfied: matplotlib in /opt/anaconda3/lib/python3.8/site-packages (from missingno) (3.3.4)
Requirement already satisfied: scipy in /opt/anaconda3/lib/python3.8/site-packages (from missingno) (1.6.2)
Requirement already satisfied: seaborn in /opt/anaconda3/lib/python3.8/site-packages (from missingno) (0.11.1)
Requirement already satisfied: numpy in /opt/anaconda3/lib/python3.8/site-packages (from missingno) (1.20.1)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in /opt/anaconda3/lib/python3.8/site-packages (from matplotlib->missingno) (2.4.7)
Requirement already satisfied: pillow>=6.2.0 in /opt/anaconda3/lib/python3.8/site-packages (from matplotlib->missingno) (8.2.0)
Requirement already satisfied: cycler>=0.10 in /opt/anaconda3/lib/python3.8/site-packages (from matplotlib->missingno) (0.10.0)
Requirement already satisfied: python-dateutil>=2.1 in /opt/anaconda3/lib/python3.8/site-packages (from matplotlib->missingno) (2.8.1)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/anaconda3/lib/python3.8/site-packages (from matplotlib->missingno) (1.3.1)
Requirement already satisfied: six in /opt/anaconda3/lib/python3.8/site-packages (from cycler>=0.10->matplotlib->missingno) (1.15.0)
Requirement already satisfied: pandas>=0.23 in /opt/anaconda3/lib/python3.8/site-packages (from seaborn->missingno) (1.2.4)
Requirement already satisfied: pytz>=2017.3 in /opt/anaconda3/lib/python3.8/site-packages (from pandas>=0.23->seaborn->missingno) (2021.1)

In [37]:
import missingno as msno

msno.matrix(Batch_ID);

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-37-9fa12fbf4e1c> in <module>
1 import missingno as msno
2
----> 3 msno.matrix(Batch_ID);

NameError: name 'Batch_ID' is not defined

In [39]:
Batch_ID = Batch_ID([])

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-39-3200018f45d5> in <module>
----> 1 Batch_ID = Batch_ID([])

NameError: name 'Batch_ID' is not defined

In [38]:
Batch_ID = Batch_ID.drop(["Verified_Date"], axis=1 #Verified_Date - doesnt look lik
Batch_ID = Batch_ID.drop(["Effective_Date"], axis=1 #Effective_Date -had 00:00.0 in entire column
Batch_ID = Batch_ID.drop(["Date_of_Debit"], axis=1 #Date_of_Debit had 00:00.0 in ev

File "<ipython-input-38-816c887da580>", line 2


Batch_ID = Batch_ID.drop(["Effective_Date"], axis=1 #Effective_Date -had 00:00.0 in entire column
^
SyntaxError: invalid syntax
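
The SyntaxError here is reported on line 2 because the drop( call on line 1 is never closed: the comment starts before the closing parenthesis. Beyond closing each call, columns that hold one repeated value (such as 00:00.0 throughout) could be found programmatically rather than listed by hand. A sketch on made-up data, with constant_cols as an illustrative name:

```python
import pandas as pd

# Made-up frame: two date columns filled with the same placeholder value.
frame = pd.DataFrame({
    "Effective_Date": ["00:00.0", "00:00.0", "00:00.0"],
    "Date_of_Debit": ["00:00.0", "00:00.0", "00:00.0"],
    "Age": [40, 33, 58],
})

# A column with at most one distinct value carries no information.
constant_cols = [c for c in frame.columns if frame[c].nunique(dropna=False) <= 1]

frame = frame.drop(columns=constant_cols)
print(constant_cols)
print(frame.columns.tolist())
```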

In [42]:
Batch_ID.dtypes

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-42-459ddd2e979d> in <module>
----> 1 Batch_ID.dtypes

NameError: name 'Batch_ID' is not defined

In [ ]:
df.

In [ ]:
