BY
SOURAV KUMAR PANIGRAHI
Project Description
● This case study shows how to apply exploratory data analysis (EDA) to a real-life business scenario.
● EDA techniques help us develop a basic understanding of risk analysis in banking and financial services, and of how to minimise the risk of losing money when providing loans.
Approach
● First, understand the datasets so that the questions are easy to work through.
● Detect the missing values and handle them so that the analysis can proceed.
● Import the datasets into Google Colab and start coding.
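The missing-value step above can be sketched as follows. This is a minimal illustration on a toy DataFrame, not the notebook's actual code: the column names, the 50% drop threshold, and the median fill are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the loan application data; column names are hypothetical.
df = pd.DataFrame({
    "AMT_INCOME": [100000.0, 150000.0, np.nan, 90000.0],
    "OCCUPATION": ["Laborers", None, None, None],
    "AMT_CREDIT": [500000.0, 300000.0, 250000.0, 400000.0],
})

# Percentage of missing values per column, highest first.
missing_pct = df.isnull().mean().mul(100).sort_values(ascending=False)
print(missing_pct)

# Drop columns whose missing share exceeds an assumed 50% cutoff.
threshold = 50.0
cols_to_drop = missing_pct[missing_pct > threshold].index.tolist()
cleaned = df.drop(columns=cols_to_drop)

# Fill the remaining numeric gaps with each column's median.
cleaned = cleaned.fillna(cleaned.median(numeric_only=True))
print(cleaned)
```

Whether to drop, fill, or keep a column depends on how much of it is missing and how relevant it is to the business question, so the cutoff is best chosen after looking at `missing_pct`.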
Tech-Stack used
● Microsoft Excel
● Google Colab
● Google Drive (To import the dataset)
1) Present the overall approach of the analysis. Mention the problem statement and the analysis approach briefly.
● We plot box plots to identify outliers in the quantitative variables of application_data.
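A box plot conventionally flags points lying more than 1.5 × IQR beyond the quartiles. The same rule can be applied numerically to list the outliers, as in this small sketch. The values and the series name are made up for illustration and do not come from application_data.

```python
import pandas as pd

# Hypothetical income values (in thousands); the last one is an obvious extreme.
income = pd.Series([90, 100, 110, 120, 130, 140, 150, 1000], name="AMT_INCOME")

# Quartiles and interquartile range.
q1 = income.quantile(0.25)
q3 = income.quantile(0.75)
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles, the usual box-plot whisker rule.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Values outside the fences are treated as outliers.
outliers = income[(income < lower) | (income > upper)]
print(outliers)
```

Flagged values are candidates for review rather than automatic removal: a very high income may be a genuine record rather than a data error.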
4) Explain the results of univariate, segmented univariate, bivariate analysis, etc. in business terms.
univariate
import matplotlib.pyplot as plt

# Distribution and summary statistics of DAYS_BIRTH
Df1_extract = Df3[['DAYS_BIRTH']]
Df1_extract.hist()
plt.show()
Df1_extract.describe()
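To make the DAYS_BIRTH distribution easier to read in business terms, the values can be converted to ages and grouped into bands. The sketch below assumes that DAYS_BIRTH stores negative day counts relative to the application date. The sample values and the ten-year bands are illustrative choices, not taken from the dataset.

```python
import pandas as pd

# Assumption: DAYS_BIRTH is a negative number of days before the application date.
df = pd.DataFrame({"DAYS_BIRTH": [-9461, -16765, -19046, -12005, -23000]})

# Convert to age in whole years (ignoring leap days for simplicity).
df["AGE_YEARS"] = (df["DAYS_BIRTH"].abs() // 365).astype(int)

# Group ages into ten-year bands; each band includes its lower edge only.
bins = [20, 30, 40, 50, 60, 70]
labels = ["20-30", "30-40", "40-50", "50-60", "60-70"]
df["AGE_GROUP"] = pd.cut(df["AGE_YEARS"], bins=bins, labels=labels, right=False)

# Count of clients per age band.
print(df["AGE_GROUP"].value_counts().sort_index())
```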
segmented univariate
bivariate analysis
5) Identify if there is data imbalance in the data. Find the ratio of data imbalance.
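# Row-to-row difference of the imbalance table
f1 = Df3_Imbalance.diff(periods=1, axis=0)
# Maximum of the last column of the imbalance table
difvalue = Df3_Imbalance[[list(Df3_Imbalance.columns)[-1]]].max()
difvalue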
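One common way to express data imbalance is the ratio of majority-class rows to minority-class rows. The sketch below is an alternative illustration, not the notebook's method. It assumes a column named TARGET in which 1 marks clients with payment difficulties and 0 marks all other cases, and it uses toy data rather than the real counts.

```python
import pandas as pd

# Toy data: nine clients without payment difficulties, one with.
# Assumption: TARGET == 1 means payment difficulties, 0 means all other cases.
df = pd.DataFrame({"TARGET": [0] * 9 + [1]})

# Number of rows in each class.
counts = df["TARGET"].value_counts()

# Share of each class as a percentage.
shares = df["TARGET"].value_counts(normalize=True).mul(100)

# Imbalance ratio: rows in class 0 per row in class 1.
imbalance_ratio = counts.loc[0] / counts.loc[1]
print(counts)
print(shares)
print(f"Imbalance ratio (0 : 1) = {imbalance_ratio:.2f} : 1")
```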
6) Find the top 10 correlations for the clients with payment difficulties and for all other cases (Target variable)
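One possible way to find the top correlations per segment is to split the data on the target variable, compute the correlation matrix of each segment, and rank the unique variable pairs by absolute correlation. The sketch below uses synthetic data. The helper name `top_correlations` and the column names are hypothetical, and TARGET == 1 is assumed to mean payment difficulties.

```python
import numpy as np
import pandas as pd

def top_correlations(frame, n=10):
    """Return the n variable pairs with the highest absolute correlation."""
    corr = frame.corr().abs()
    # Keep only the upper triangle so each pair appears once
    # and the diagonal of self-correlations is excluded.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    return pairs.sort_values(ascending=False).head(n)

# Synthetic stand-in data with one deliberately correlated pair.
rng = np.random.default_rng(0)
n_rows = 200
credit = rng.normal(500000, 100000, n_rows)
df = pd.DataFrame({
    "TARGET": rng.integers(0, 2, n_rows),
    "AMT_CREDIT": credit,
    "AMT_GOODS_PRICE": credit * 0.9 + rng.normal(0, 1000, n_rows),
    "AMT_INCOME": rng.normal(150000, 30000, n_rows),
})

# Segment on the target variable, then rank correlations in each segment.
features = df.drop(columns="TARGET")
top_difficulty = top_correlations(features[df["TARGET"] == 1])
top_others = top_correlations(features[df["TARGET"] == 0])
print(top_difficulty)
print(top_others)
```

Comparing the two rankings side by side shows whether any pair of variables behaves differently for clients with payment difficulties than for the rest.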
Heat map of the correlation between critical quantitative variables for clients having difficulty in payment
7) Include visualizations and summarize the most important results in the presentation. You are free to choose the graphs which explain the numerical/categorical variables. Insights should explain why the variable is important for differentiating the clients with payment difficulties from all other cases.