
INDUSTRY INTERNSHIP REPORT

On
Exploring Banking Trends using Business Analytics
Report submitted in partial fulfillment of the requirement for the award of
Post Graduate Diploma in Management
(2022-2024)

Title of On Job Training (OJT)


Marketing and Business Analytics
Mozohunt
New-Delhi

Submitted by

Name:- A.Sekhar

Program:- PGDM-Business Analytics

Roll No:- BA3-22

Faculty Guide: Subash Tej, Assistant Professor, Dept. of Data Sciences
Corporate Guide: Gourav Singh, Manager - HR

Siva Sivani Institute of Management


Kompally, Secunderabad-500100

i
INDUSTRY INTERNSHIP REPORT
On
Exploring Banking Trends using Business Analytics
Report submitted in partial fulfillment of the requirement for the award of Post
Graduate Diploma in Management
(2022-2024)

Title of On Job Training (OJT)


Marketing and Business Analytics
Mozohunt
New-Delhi

Submitted by

Name:- A.Sekhar

Program:- PGDM-Business Analytics

Roll No:- BA3-22

Faculty Guide: Subash Tej, Assistant Professor, Dept. of Data Sciences
Corporate Guide: Gourav Singh, Manager - HR

ii
INTERNSHIP CERTIFICATE

iii
DECLARATION

I, A. SEKHAR, declare that this project titled “Exploring Banking Trends using Business
Analytics” is the original work done by me under the guidance of Mr. SUBASH TEJ, Siva Sivani
Institute of Management, Secunderabad.

I further declare that it is the original work made by me as a part of my Post Graduate
Diploma in Management.

Date: Signature of the student:


Place: Hyderabad Name: A.Sekhar
Roll No: BA3-22

iv
ACKNOWLEDGEMENTS

I express my deep sense of gratitude to Chairman Dr. Sailesh Sampathy and Director
Dr. Ramana Rao, Siva Sivani Institute of Management, for giving me the opportunity to be
part of this institution and for their generous help in completing the project.

This project would not have been completed successfully without the help of my mentor and
guide, Mr. T. Subash Tej, Assistant Professor, Dept. of Data Sciences, who reviewed the
entire project and provided invaluable advice, ongoing support, helpful recommendations,
and encouragement from the beginning to the end.

A project's success depends on cooperation and teamwork. Considering this, I would like to
express my heartfelt appreciation to every faculty member who helped make this project a
reality by offering their wise guidance.

Finally, I would like to express my gratitude to my family and friends for helping me finish
the project and for their encouragement.

Date: Name of the Student: A.Sekhar


Place: Roll No: BA3-22

v
vi
vii
TABLE OF CONTENTS
CHAPTER NO     CONTENT                               PAGE NO
               Industry and Brief Report On OJT      1-2
               Company Introduction                  3
Chapter I      Introduction                          5-6
Chapter II     Review Of Literature                  8-10
Chapter III    Research Methodology                  12-13
Chapter IV     Data Analysis                         15-33
Chapter V      Findings And Conclusion               34-35
               Reference                             36
LIST OF FIGURES
SL.NO   TITLE OF THE FIGURE                                         PAGE NO
1       Histogram Distribution                                      15
2       Average Duration                                            16
3       Marital Status Distribution                                 17
4       Trend Line Of Age And Marriage                              18
5       Relationship Between A Campaign And Duration And Age        19
        Loan Duration Group By Age                                  20
6       Top 5 Bank Clients Age And Job Boxplot                      21
7       Distribution Of Some Numerical Feature In Two Categories    22
8       Distribution Of Customers By Months                         23
9       Line Chart                                                  25
10      Job Features By Clients                                     26
11      Marital Status                                              26
13      Education Feature                                           27
14      Loan                                                        28
15      Month                                                       28
16      Age And Balance Distribution                                29
17      Random Forest Model                                         30, 31
18      Random Forest Model-2                                       32, 33

viii
INDUSTRY PROFILE:
The e-learning and knowledge-sharing industry has exploded from its early days of mail-
based courses and virtual classrooms to encompass online degrees and widespread adoption
by schools, businesses, and individuals alike. Driven by factors like increased internet access,
mobile learning, and upskilling needs, the market is projected to reach $375 billion by 2025.
Key segments include corporate training, education for all ages, and professional
development. Advancements in AI, VR, and personalized learning are shaping the future of
education, making it more accessible, engaging, and tailored to individual needs. Whether it's
for career advancement, skill development, or simply pursuing knowledge, e-learning offers a
flexible and effective solution for learners in today's dynamic world.
Historical Growth:

• E-learning has evolved from early forms like distance courses and mail-based learning to
  virtual classrooms and online degrees.
• The term "e-learning" was coined in 1998 and its adoption has surged since then, reaching
  77% of companies by 2011.
Current Landscape:

• E-learning is used by schools, colleges, businesses, and individuals for education and
  training.
• It offers benefits like accessibility, flexibility, cost-effectiveness, variety, and
  engaging experiences.
• The global market size is estimated to reach $375 billion by 2025, driven by factors like
  increased internet penetration, mobile learning, and upskilling/reskilling needs.
Key Segments and Trends:

• Key segments include corporate training, K-12 education, higher education, and
  professional development.
• Growth drivers include technological advancements like AI, VR, and AR, microlearning,
  personalization, blended learning, and gamification.
• Focus is on tailoring content to individual needs, providing bite-sized learning modules,
  mobile learning, social learning, and AI-powered features.
Benefits for Learners:

• E-learning offers accessibility and flexibility, allowing learning anytime, anywhere, at
  your own pace.
• It can be cost-effective compared to traditional learning and provides a variety of
  content and formats to cater to different needs.
• Interactive and engaging experiences with gamification, simulations, and collaborative
  tools enhance learning and skill development.

1
ON JOB TRAINING (OJT)
About the company:
Mozo Hunt Pvt Ltd is a cloud-based digital publishing and distribution platform that helps
publishers, authors, students, teachers, content aggregators, service providers, educational
institutes, corporates, and others to produce, import, sell, manage, and deliver content
across devices in digitally accessible formats, in a secure environment. Mozo Hunt supports
rich, interactive content, including fixed-layout and reflowable ePub with rich media, and
provides a seamless user experience in both online and offline modes.

Introduction: Mozo Hunt Pvt Ltd

Mozo Hunt Pvt Ltd is a company established in 2021 with the aim of facilitating knowledge
sharing. They have grown to become a top destination for professionals, with more than
100,000 installations in various categories.

Mission:
Mozo Hunt’s mission is to empower individuals and organizations to easily access knowledge
and information. They strive to create a platform where users can discover, share, and learn
from each other in a collaborative environment.

Vision:
Mozo Hunt envisions a world where everyone has access to just the right amount of
educational and learning resources. They believe that knowledge should be immediately
available to everyone who seeks it, regardless of their background or circumstances.

Values:
Mozo Hunt is committed to the following values.
Innovation: They are constantly working to develop new and innovative ways to make
knowledge sharing more efficient and effective.
Quality: They are committed to providing users with high-quality information that is
accurate, reliable, and up to date.
Access: They believe that knowledge should be accessible to everyone regardless of their
background or circumstances.
Discussion: Users are encouraged to share their knowledge and expertise with others.
Community: Users are given a sense of community by providing a space where they can
connect and interact with each other.

“Mozo Hunt- Finds everything under same UMBRELLA”

2
Role and Responsibilities:
During my internship at Mozo Hunt Pvt Ltd as a Marketing and Business Analyst, I had the
opportunity to gain valuable experience in marketing, research, and analysis.

Task 1: Subscription Management

• Customer Acquisition: Identifying potential subscribers for the digital library, which
  may include businesses, institutions, or individuals.
• Customer Support: Addressing any inquiries or issues related to subscriptions and
  ensuring a positive customer experience.

Task 2: Secondary Research - Data Collection

• In addition to subscription management, my internship involved conducting secondary
  research. This task required me to collect data on 100 companies across various
  industries.
• Data Collection: Gathering relevant data and information about these companies, such as
  their promotional strategy, collaborations, marketing strategy, mission, and vision.
• Industry Diversification: Researching companies from a wide range of industries,
  including the book, magazine, newspaper, and study material industries.

Key Skills and Learning Outcomes:

• Marketing and customer acquisition techniques.
• Data collection, analysis, and research skills.
• Communication and presentation skills for reporting findings.
• Understanding of subscription management and customer service.

3
CHAPTER-1
INTRODUCTION

4
INTRODUCTION

In the dynamic landscape of the banking industry, the utilization of advanced analytics has
become indispensable for uncovering valuable insights and navigating the evolving needs of
customers. Visual analytics, a potent tool in this realm, allows banks to explore trends and
patterns in customer data, offering a deeper understanding of client behavior. This research
delves into the extensive dataset obtained from Kaggle, employing visual analytics
techniques to shed light on crucial aspects of the banking sector.
The exploration begins with histograms and boxplots, revealing the distribution of customer
age and the correlation between customer duration and loan default status. These
visualizations uncover trends, emphasizing the significance of age and duration in predicting
loan default. Marital status distribution, presented through bar graphs and line graphs, further
enriches our understanding, showcasing the predominant presence of married individuals and
dynamic shifts in the distribution with age.
The exploration extends to the impact of marketing campaigns on different age groups,
illustrated through scatter plots. Additionally, the relationship between loan duration and
default status is analyzed, highlighting a nuanced correlation that demands attention. The
investigation delves into the top five client categories, offering insights into the distribution
of age and job roles through boxplots.
A heatmap emerges as a powerful tool, unveiling strong relationships between various
customer characteristics. The correlation between age and loan default, as well as income and
loan amount, surfaces as key insights with implications for strategic decision-making. The
study further explores the distribution of customers across months, suggesting potential
seasonal variations.
The analysis culminates in the development and utilization of Random Forest models,
emphasizing the role of machine learning techniques in predicting outcomes and discerning
patterns within the dataset. The findings, presented in Chapter 5, underscore essential trends,
including a positive correlation between age and loan default, the influence of marital status,
and the impact of larger loan amounts on default likelihood.

BACKGROUND OF THIS STUDY

Changes in Banking:
Banks have shown great change over time. The old ways of banking have shifted to
dynamic methods, backed by technology.

Understanding Customers:
In this competitive market, understanding what customers want and how they behave is
essential. Banks are turning to tools that break down customer data.

Risk Management and Loan Default Prediction:


With the complexities of the financial landscape, banks face the challenge of managing
risks efficiently. Loan default prediction has become a critical area of focus,
necessitating a deep dive into customer characteristics, demographics, and historical
behaviors.

Strategic Decision-Making:
In an environment where strategic decision-making is fundamental to a bank's success,
visual analytics provides a holistic approach to interpreting complex datasets.

Technological Advancements and Machine Learning:


The integration of machine learning techniques, specifically demonstrated through
Random Forest models, underscores the technological advancements embraced by the
banking sector.

Seasonal Trends and Marketing Campaigns:


Seasonal variations in customer behavior and the impact of marketing campaigns
represent additional dimensions explored in this study.

6
CHAPTER-2
REVIEW OF LITERATURE

7
David Jonker; Scott Langevin; Peter Schretlen; Casey Canfield (2012)
This paper outlines the rapid development of Aperture, a specialized cyber situational
awareness and analysis application designed for the 2012 IEEE VAST Mini-Challenge 1
(MC1) on Cyber Situation Awareness. The noteworthy aspect of this project lies in its focus
on creating a tailored solution for a "big data" application. Aperture stands out as an open,
adaptable, and extensible Web 2.0 visualization framework, enabling the generation of
visualizations for analysts and decision-makers accessible through common web browsers.
The framework employs a unique layer-based approach to visualization assembly and
features a data mapping API, streamlining the transformation of data or analytic results into
visual representations with specific properties.

Alaa Abu-Srhan, Sanaa Al Zghoul, Bara’a Alhammad, Rizik Al-Sayyed (2019)
Bank direct marketing, a process aimed at fostering beneficial stakeholder relationships,
focuses on effective multichannel communication by studying customer characteristics and
behavior. The primary goal, aside from profit growth and increased customer loyalty, is to
enhance response rates in direct promotion campaigns. Researchers often analyze available
datasets to identify target customer groups interested in specific products. Imbalanced
datasets in this context require resampling approaches such as undersampling and
oversampling techniques to mitigate negative effects and improve prediction accuracy for
machine learning classification algorithms. The study also delves into the role of data
visualization in financial data analysis, mining, and market analysis, emphasizing its use in
extracting insights from voluminous data for decision-making. The research specifically aims
to provide a visualization mechanism for simple classification tasks, conducting experiments
on an imbalanced dataset from a Portuguese bank's direct marketing campaign.

Mohammad H. Allaymoun, Saleh Qaradh, Mohammed Salman & Mustafa Hasan (2022)
The increasing importance of data in various forms and sizes is a defining feature of the
modern era. This case study focuses on the use of digital transformation strategies,
particularly in the context of Islamic banks, to recover from the consequences of the COVID-
19 pandemic. The study explores the big data analysis cycle for five Islamic banks during the
2019/2020 period, emphasizing the development of models and the use of Google Data
Studio for graphical representation of results. The research aims to provide visualizations and
reports that aid decision-makers and investors in evaluating bank performance, both pre and
during the pandemic. Special attention is given to the impact of graphical reporting on
understanding statistical results, and the study considers the necessity of hypotheses to
capture these results visually.

Preet Singh, Fahim Islam Anik, Rahul Senapati, Arnav Sinha, Nazmus Sakib, Eklas Hossain
In the highly competitive banking industry, customer attrition poses a significant challenge as
users discontinue using a bank's services and sever their connections. Recognizing the
importance of customer retention, our research focuses on analyzing bank data to forecast
potential client attrition and identify strategies for retaining customers. Leveraging various
machine learning algorithms, we conduct a comparative analysis using different evaluation
metrics. Additionally, we have developed a Data Visualization RShiny App for Data Science
and Management, specifically designed for customer churn analysis. This tool facilitates a
comprehensive understanding of the data trends, enabling the bank to proactively address
customer attrition and implement targeted retention efforts.

Gabriele Sampagnaro (2022)
This article delves into the intellectual research domains within the Italian community of
banking and finance researchers. Examining 1,450 scientific papers published over 20 years,
the study aims to uncover research fronts, prominent topics, and interconnections among
these themes. A notable feature is the introduction of a journals density map, combining
researcher group publications with journal impact metrics to gauge research quality within
specific subject areas. The study also employs co-authorship analysis to assess the intensity
of scientific interaction within the identified 22 clusters of authors. This comprehensive
approach provides valuable insights into the landscape of research in Italian banking and
finance, facilitating a deeper understanding of prevalent topics and collaboration dynamics
within the community.

Jesmi Latheef, S. Vineetha (2021)
This paper addresses the critical issue of customer churn in the banking sector, emphasizing
its significance due to the high cost of acquiring new customers compared to retaining
existing ones. Leveraging Python data visualization packages such as Matplotlib, Seaborn,
and Plotly, the study explores churn data to identify and visually represent factors
contributing to customer attrition. The objective is to build a prediction model using
Ensemble Learning Algorithm, comparing system accuracy across models and visualizing the
results. The ultimate goal is to enable organizations to proactively target customers at risk of
churn, optimizing retention strategies and preventing revenue loss.

Luigi Coppolino, Salvatore D’Antonio, Luigi Romano, Ferdinando Campanile & Alexandre Valle
de Carvalho (2015)
Data analysis and monitoring is currently carried out within enterprises using Business
Intelligence tools that are subject to major limitations (as outlined in the state of the art
analysis that we perform). Effective visualization support is a very much needed feature in
Big Data applications. In this paper we examine the visualisation requirements of a real world
banking application, and identify generic visualisation tasks that are essential for doing
effective analysis of a complex process that produces amazingly large amounts of data. The
requirements for the visualization support that we propose are modelled using an application
wireframe that acts as a storyboard. The effectiveness of the visualization facilities that we
propose is demonstrated through their application to the Big Data banking use-case.

Surendranadha Reddy Byrapu Reddy (2023)
This study introduces a framework for proactive and intelligent continuous control
monitoring (CCM) to enhance executives' confidence in their company's operations and
alleviate the challenge of managing overwhelming data. Developed by the Continuous
Control Monitoring Consortium (CCMC), the framework employs a design science
perspective in creating CCM artifacts, including displays of operational and internal control
violations and the identification of multidimensional abnormalities. A real-world case study
involving a company providing accounting services and a healthcare industry client illustrates
the framework's application, specifically in improving the reliability of payroll audits. The
paper contributes to CCM literature by advocating for the use of machine learning and
interactive data visualization to address data overload for managers. It also presents evidence
supporting the economic and behavioral advantages of the proposed control monitoring
approach, showcasing how advanced technology enhances risk assessment, anomaly
identification, and loss prevention with increased efficiency and accuracy. Guidelines for
artifact production and utilization further contribute to the field of control monitoring.

Violeta Cvetkoska, Gordana Savic (2021)
This study delves into the extensive application of Data Envelopment Analysis (DEA) in the
banking industry, examining 791 DEA articles published in peer-reviewed journals listed in
the Scopus database over a 34-year period (1986-2019). Through bibliometric analysis and
visualization, the research highlights annual trends in published DEA banking articles,
identifies top journals and authors, conducts citation analysis, and explores country co-
authorship. Additionally, the study provides an in-depth analysis and visualization of
keywords, shedding light on the prevalent themes in DEA literature within the banking sector.
The findings contribute valuable insights for researchers and practitioners, guiding future
trends in the application of DEA in banking.

10
CHAPTER-3
RESEARCH METHODOLOGY

11
OBJECTIVES

The main objectives of this research study are to:

• Explore the banking industry using visual analytics
• Identify customer data trends
• Find the correlation between customer age group and loan default
• Develop a regression model to predict loan default

METHODOLOGY

The following methodology was used to achieve the objectives of this study:

DATA COLLECTION:

Secondary data was collected from various sources, such as bank and government websites and
a data repository platform (Kaggle). The data set contains 49,732 samples of customer data,
including age, gender, occupation, loan amount, and loan default status.

DATA PREPARATION

The data was prepared by removing null values, missing values, unnecessary columns, and
outliers. The following steps were taken:

1. Null value removal: Null values were removed from the data set.

2. Missing value imputation: Missing values were imputed using the mean or median of the
corresponding variable.

3. Unnecessary column removal: Unnecessary columns, such as the customer ID column, were
removed from the data set.

4. Outlier detection and removal: Outliers were detected using the interquartile range (IQR)
method. A value was removed from the data set if it was more than 1.5 IQRs below the
first quartile or more than 1.5 IQRs above the third quartile.
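
The preparation steps above can be sketched in pandas roughly as follows. This is a minimal illustration and not the notebook code used in the study: the function name `prepare`, the column names `customer_id`, `age`, and `job`, and the toy values are all assumptions. The sketch also imputes numeric gaps with the median before dropping the rows that still contain nulls, which is one possible ordering of steps 1 and 2.

```python
import numpy as np
import pandas as pd


def prepare(df: pd.DataFrame, id_col: str = "customer_id") -> pd.DataFrame:
    """Apply the four preparation steps to a copy of df (hypothetical helper)."""
    # Step 3: drop an identifier column that carries no analytical value.
    out = df.drop(columns=[id_col], errors="ignore")

    num_cols = out.select_dtypes(include="number").columns
    # Step 2: impute missing numeric values with the column median.
    out[num_cols] = out[num_cols].fillna(out[num_cols].median())
    # Step 1: remove rows that still contain nulls (for example in text columns).
    out = out.dropna()

    # Step 4: keep rows inside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] for every numeric column.
    q1 = out[num_cols].quantile(0.25)
    q3 = out[num_cols].quantile(0.75)
    iqr = q3 - q1
    inside = ((out[num_cols] >= q1 - 1.5 * iqr) & (out[num_cols] <= q3 + 1.5 * iqr)).all(axis=1)
    return out[inside].reset_index(drop=True)


# Toy data (illustrative only): one missing age, one missing job, one extreme age.
toy = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5, 6],
    "age": [25, 30, np.nan, 35, 40, 95],
    "job": ["admin.", "technician", "management", None, "services", "retired"],
})
clean = prepare(toy)
print(clean)
```

In the toy data the quartiles of age are 30 and 40, so the IQR is 10 and the accepted range is 15 to 55; the record with age 95 falls outside it and is removed.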

12
DATA ANALYSIS

Python was used to analyze the data in Jupyter Notebook. The following libraries were used:

• NumPy for numerical computing
• Pandas for data manipulation and analysis
• Matplotlib for data visualization
• Seaborn and Plotly for additional data visualization capabilities

The following steps were taken to analyze the data:

1. Exploratory data analysis (EDA): EDA was performed to understand the data
distribution and identify any patterns or trends. This was done by creating
histograms, boxplots, and correlation matrices.
2. Feature selection: Features that were most relevant to the research objectives were
selected for further analysis.
3. Model building: A regression model was built to predict loan default. The following
steps were taken to build the model:

• The data was split into training and testing sets.
• A variety of regression models were trained on the training set.
• The best performing model was selected based on its performance on the testing set.
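
The model-building steps above can be sketched with scikit-learn as shown below. This is an assumption-laden illustration rather than the study's actual code: the report does not name the library or the candidate models apart from Random Forest, so logistic regression is included here only as a plausible second candidate, and the data is synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the report's data set): two features, binary target.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Step 1: split into training and testing sets (80/20, stratified on the target).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Step 2: train several candidate models on the training set.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Step 3: select the model with the best accuracy on the testing set.
best_name = max(scores, key=scores.get)
print(scores, "->", best_name)
```

Accuracy is used here because it is the metric quoted in the report; other metrics could be substituted in the same loop.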

The following results were obtained from the data analysis:

• The age distribution of customers is skewed towards the younger age groups.
• There is a positive correlation between customer age and loan default status, meaning
  that older customers are more likely to default on their loans.
• The regression model was able to predict loan default with an accuracy of 75%.

13
CHAPTER-4
DATA ANALYSIS

14
Visual analytics is a powerful tool that can be used to explore banking trends and identify
patterns in customer data. By using visual analytics tools, banks can gain insights into their
customers' needs and preferences and develop strategies to improve their products and
services.

One way to use visual analytics to explore banking trends is to create histograms and
boxplots. Histograms show the distribution of a variable, such as customer age or loan
amount. Boxplots show the median, quartiles, and outliers of a variable. By creating
histograms and boxplots, banks can identify trends in their customer data and identify areas
where they can improve.
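
As a small illustration of what these two charts summarize, the sketch below computes the numbers behind a histogram (counts per bin) and a boxplot (quartiles) with NumPy. The age values and the 10-year bins are illustrative assumptions, not figures from the data set.

```python
import numpy as np

# Illustrative ages only (not taken from the report's data set).
ages = np.array([22, 25, 27, 29, 31, 33, 35, 38, 41, 45, 52, 60])

# Histogram: number of customers in each 10-year age band.
counts, edges = np.histogram(ages, bins=[20, 30, 40, 50, 60, 70])

# Boxplot: the quartiles that define the box and the median line.
q1, median, q3 = np.percentile(ages, [25, 50, 75])

print(counts)          # customers per band
print(q1, median, q3)  # quartiles
```

With Matplotlib, `plt.hist(ages, bins=edges)` and `plt.boxplot(ages)` would draw the corresponding charts from the same values.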

Another way to use visual analytics to explore banking trends is to create correlation
matrices. Correlation matrices show the correlation between different variables. For example,
a correlation matrix could show the correlation between customer age and loan default status.
By creating correlation matrices, banks can identify relationships between different variables
and develop strategies to improve their products and services.
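
A correlation matrix of this kind can be produced with pandas in a single call. The sketch below is a minimal example under stated assumptions: the column names `age`, `balance`, and `default` and all values are made up for illustration, and the default flag is encoded as 0/1 so that it can take part in a Pearson correlation.

```python
import pandas as pd

# Toy data (illustrative only); default is encoded as 1 = defaulted, 0 = did not default.
toy = pd.DataFrame({
    "age":     [25, 32, 41, 47, 53, 60],
    "balance": [300, 800, 1200, 900, 2000, 2500],
    "default": [0, 0, 1, 0, 1, 1],
})

corr = toy.corr()  # pairwise Pearson correlation between all numeric columns
print(corr.round(2))
```

Passing the result to `seaborn.heatmap(corr, annot=True)` would display the same matrix as a heat map.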

Data columns

Histogram distribution

15
The histogram shows the distribution of customer age in a banking data set. The age
distribution is skewed to the right (its tail extends toward older ages), which means that
there are more younger customers than older customers. The most common age group is 20-29
years old, followed by the 30-39 years old age group.

The histogram also shows that there are some outliers, which are customers who are
significantly older or younger than the majority of customers. These outliers could be due to a
variety of factors, such as customers who have recently opened accounts or customers who
are closing their accounts.

Average Duration

The boxplot shows the distribution of customer duration by loan default status. The median
duration of customers who defaulted on their loans is higher than the median duration of
customers who did not default on their loans. This suggests that there is a positive
association between customer duration and loan default status, meaning that customers with
longer durations are more likely to default on their loans.

The boxplot also shows that there is more variability in the duration of customers who
defaulted on their loans than in the duration of customers who did not default on their loans.
This suggests that a wider range of factors contributes to loan default among these
customers.
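
The comparison read off the boxplot can also be checked numerically by grouping on the default flag. The following sketch assumes columns named `default` and `duration` and uses made-up values; it is an illustration, not the study's output.

```python
import pandas as pd

# Toy data (illustrative only).
toy = pd.DataFrame({
    "default":  ["no", "no", "no", "no", "yes", "yes", "yes"],
    "duration": [120, 180, 200, 260, 150, 320, 600],
})

# Median (box centre) and standard deviation (spread) of duration per default status.
summary = toy.groupby("default")["duration"].agg(["median", "std"])
print(summary)
```

In this toy example the "yes" group has both the higher median and the larger spread, which is the pattern described above.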

Marital status distribution

The bar graph reveals that the majority of bank customers in the dataset are married, with a
substantial count exceeding 25,000. Following closely, individuals classified as single
represent the second-largest group, while the divorced category shows a count surpassing
5,000. Consequently, it can be concluded that the predominant portion of the bank's customer
base consists of married individuals, based on the provided data.

17
The line graph shows the relationship between age and marital status. The percentage of
people who are married increases with age, while the percentage of people who are single
decreases with age. The percentage of people who are divorced remains relatively constant
with age.
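
Percentages of this kind can be derived by cross-tabulating age bands against marital status and normalizing each row. The sketch below is illustrative only: the column names `age` and `marital`, the 10-year bands, and the values are assumptions.

```python
import pandas as pd

# Toy data (illustrative only).
toy = pd.DataFrame({
    "age": [22, 24, 28, 31, 33, 36, 38, 42, 45, 48, 49],
    "marital": ["single", "single", "married",
                "single", "married", "married", "divorced",
                "married", "married", "married", "divorced"],
})

# Group ages into 10-year bands: [20, 30), [30, 40), [40, 50).
toy["age_band"] = pd.cut(toy["age"], bins=[20, 30, 40, 50], right=False)

# Share of each marital status within every age band, in percent (rows sum to 100).
share = pd.crosstab(toy["age_band"], toy["marital"], normalize="index") * 100
print(share.round(1))
```

Plotting one line per column of `share` against the age bands gives a trend chart of the type discussed above.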

18
RELATIONSHIP BETWEEN A CAMPAIGN AND DURATION AND AGE

The scatter plot shows that the people in the group are concentrated around 30 years of age.
This is evident from the fact that the median age is 30 years old and the 25th and 75th
percentiles are 25 and 35 years old, respectively, so half of the group is aged between 25
and 35.

The scatter plot also shows that there are a few people in the group who are significantly
older or younger than the median age. These people are likely outliers, and their ages may be
due to a variety of factors, such as retirement or having children at a young age.

19
Loan duration group by age

The scatter plot shows that longer-term loans are more likely to default than shorter-term
loans. This is evident from the fact that there is a weak positive correlation between the two
variables.

There are a few possible explanations for this correlation. One possibility is that people are
more likely to take out longer-term loans if they have a lot of debt or if they are struggling to
make ends meet. This could mean that they are more likely to default on their loans, as they
may have difficulty making the monthly payments.

Another possibility is that lenders are more likely to offer longer-term loans to people who
have poor credit scores. This is because lenders are less likely to be able to recoup their losses
if the borrower defaults on a longer-term loan.

20
Top 5 bank clients age and job boxplot

The plot shows that, among the top five client categories, the oldest customers are in
management, and the largest number of outliers is in the admin. and technician categories.

A heat map allows you to look at the distribution of a numerical feature across two
categories. We visualize the distribution of clients by family status and type of
employment.

21
Distribution of some numerical feature in two categories

The heatmap shows that there are a few strong relationships between different customer
characteristics in a banking data set.

One of the strongest relationships is between customer age and loan default status. Customers
who are older are more likely to default on their loans. This is likely due to a number of
factors, such as older customers being more likely to have health problems or being more
likely to have retired from their jobs.

Another strong relationship is between customer income and loan amount. Customers with
higher incomes are more likely to take out larger loans. This is likely due to customers with
higher incomes being able to afford to make higher monthly payments.

22
Distribution of customers by months

The boxplot shows that borrowers who take out larger loans are more likely to default on
those loans. This is evident from the fact that the median loan amount for customers who
defaulted on their loans is higher than the median loan amount for customers who did not
default on their loans.

There are a few possible explanations for this correlation. One possibility is that borrowers
who take out larger loans are more likely to be struggling financially in the first place. This
could mean that they may have difficulty making the monthly payments on their loans.

23
The dataset reveals that the majority of loan applicants, numbering 44,396, fall into the
"NO" category for loan default, while 815 applicants are classified as "YES" for default. In
terms of housing status, 25,130 applicants have a housing loan and 20,081 do not. Regarding
communication preferences, 29,285 applicants use cell phones, 13,020 have unknown
communication methods, and 2,906 prefer telephone communication.
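
As a quick arithmetic check on the counts quoted above, the short sketch below confirms that the three breakdowns add up to the same number of records and computes the share of defaulters. The dictionary keys are illustrative labels; only the counts come from the text.

```python
# Counts quoted in the paragraph above; the keys are illustrative labels.
default_counts = {"no": 44_396, "yes": 815}
housing_counts = {"with": 25_130, "without": 20_081}
contact_counts = {"cell_phone": 29_285, "unknown": 13_020, "telephone": 2_906}

total = sum(default_counts.values())
# All three breakdowns should describe the same set of records.
assert total == sum(housing_counts.values()) == sum(contact_counts.values())

default_rate = 100 * default_counts["yes"] / total
print(f"{total} records, {default_rate:.2f}% flagged as default")
# prints "45211 records, 1.80% flagged as default"
```

In pandas, counts of this kind are typically obtained with `value_counts()` on the relevant column.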

24
The line chart shows that borrowers who are taking out loans are becoming more indebted,
and that this may be increasing the risk of default. This is evident from the fact that the
average loan amount for customers who defaulted on their loans has been increasing over
time, while the average loan amount for customers who did not default on their loans has
remained relatively constant.

There are a few possible explanations for this trend. One possibility is that borrowers are
taking out larger loans to finance major purchases, such as a home or a college education.
These types of purchases can be expensive, and borrowers may need to take out larger loans
to afford them.

25
Job features by clients

Marital Feature

26
The bank was more interested in married and single people than in divorced people. The three
categories are presented in descending order. The number of samples is directly related to
the target column: the more "married" samples, the more subscribers.

Education Feature

More people with higher levels of education subscribed. The relationship is proportional:
more secondary-education profiles means more term deposits were sold.

Default Feature

A high proportion of non-defaulters corresponds to the total of term deposit takers. It
seems reasonable that people with credit do not want to subscribe to a new bank offer.

Loan

As with the housing loan, people without a personal loan were more willing to take a term deposit
(a higher proportion than for the housing loan). Only a few people with a personal loan decided to
subscribe. The relationship is directly proportional.
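
The per-category comparisons in the feature sections above rest on two numbers for each category: how many samples it contains and how many of them subscribed. A minimal sketch of that tabulation follows. The client records and the function name `subscription_summary` are hypothetical and invented for illustration, not taken from the report's data.

```python
from collections import Counter

# Hypothetical toy records: (marital_status, subscribed). Invented for illustration.
clients = [
    ("married", True), ("married", False), ("married", False), ("married", True),
    ("single", True), ("single", False), ("single", False),
    ("divorced", False), ("divorced", True),
]


def subscription_summary(rows):
    """Return {category: (sample_count, subscriber_count, subscription_rate)}."""
    totals = Counter(category for category, _ in rows)
    subscribers = Counter(category for category, subscribed in rows if subscribed)
    return {
        category: (n, subscribers[category], subscribers[category] / n)
        for category, n in totals.most_common()  # largest category first
    }


summary = subscription_summary(clients)
for category, (n, subs, rate) in summary.items():
    print(f"{category:10s} samples={n} subscribers={subs} rate={rate:.2f}")
```

Reporting the rate next to the raw counts helps separate "more samples, therefore more subscribers" from a genuinely higher tendency to subscribe within a category.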

Month

May had the highest number of bank customers, followed by July and then the remaining
months.

Age and balance distribution

Random Forest Model

Random Forest Model 2

CHAPTER -5
FINDINGS AND CONCLUSION
1. Bank customer age distribution skews towards younger age groups, with 20-29 years old
being the most common, followed by 30-39 years old.

2. Positive correlation between customer age and loan default status; older customers are more
likely to default on loans.

3. Regression model developed for loan default prediction with 75% accuracy.

4. Median duration of customers who defaulted on loans is higher, suggesting a positive
correlation between customer duration and loan default.

5. Majority of bank customers are married, followed by singles; divorced individuals
represent a smaller portion.

6. Percentage of married individuals increases with age, while percentage of singles
decreases; percentage of divorced individuals remains relatively constant.

7. Scatter plot suggests majority of people targeted by campaign are younger than 30 years
old.

8. Weak positive correlation between longer-term loans and loan default.

9. Management represents most senior customers among top 5 client categories; admin and
technician categories have higher number of outliers.

10. Strong relationships between customer age and loan default, as well as between customer
income and loan amount, identified through heat map analysis.

11. Borrowers taking out larger loans more likely to default.

12. Average loan amount for defaulters increasing over time, potentially increasing default
risk.

13. May has highest number of bank customers, followed by July, indicating potential
seasonal variations.

14. Bank shows interest in married and single individuals with proportional relationship.

15. People with higher education degrees more likely to subscribe to term deposits.

16. High proportion of non-defaulters corresponds to total of term deposit takers.

17. May has highest number of bank customers, followed by July.

18. Age and balance distribution provides insights into customer base.

Conclusion
The analysis of the bank's customer data points to several noteworthy trends. The customer
base is predominantly younger, with a focus on the 20-29 age group. However, outliers
indicate a varied age distribution.

A key finding is the positive correlation between customer age and loan default, supported by
a regression model with 75% accuracy. This suggests that older customers are more prone
to defaulting on their loans.
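
An accuracy figure for default prediction is easier to read next to a simple baseline. As a rough worked example, assume the model was evaluated on data with the same class balance as the counts reported earlier (44,396 non-defaulters and 815 defaulters); the report does not state this, so it is an assumption. The function name `majority_baseline_accuracy` is illustrative.

```python
# Class counts reported earlier in this report.
non_defaulters = 44396
defaulters = 815


def majority_baseline_accuracy(negatives, positives):
    """Accuracy of a rule that always predicts the larger class."""
    return max(negatives, positives) / (negatives + positives)


baseline = majority_baseline_accuracy(non_defaulters, defaulters)
print(f"majority-class baseline accuracy: {baseline:.1%}")
```

Under that assumption, always predicting "no default" would be right about 98.2% of the time, so measures focused on the "YES" class, such as recall and precision, may describe the model's usefulness better than accuracy alone.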

The duration of customer relationships also plays a role, with defaulters showing a longer
median duration. This implies a potential link between prolonged customer relationships and
a higher likelihood of loan default.

Marital status analysis reveals a majority of married customers, increasing with age. The
bank appears to target both married and single individuals, with divorced individuals forming
a smaller portion of the customer base.

The campaign seems geared toward a younger demographic, as indicated by the scatter plot.
Additionally, there's a weak positive correlation between longer-term loans and loan default,
emphasizing the importance of considering loan duration in risk assessment.
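
A correlation between a numeric variable such as loan term and a yes/no outcome such as default can be computed as an ordinary Pearson coefficient with the outcome coded as 0/1 (the point-biserial correlation). The sketch below shows the calculation on hypothetical toy data invented for illustration; the variable names and values are assumptions, and this is not necessarily the method used in the report.

```python
from math import sqrt

# Hypothetical toy data, invented for illustration: loan term in months and
# whether the loan defaulted (1) or not (0).
loan_term_months = [12, 24, 36, 48, 60, 12, 24, 36, 48, 60]
defaulted = [0, 0, 0, 0, 1, 0, 1, 0, 0, 0]


def pearson(xs, ys):
    """Pearson correlation; with a 0/1 variable this is the point-biserial correlation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(loan_term_months, defaulted)
print(f"correlation between loan term and default: {r:.2f}")
```

On this toy data the coefficient is about 0.18, which would count as a weak positive correlation: longer terms go with slightly more defaults, but the term alone explains little.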

Top client categories highlight that management represents the most senior customers, with
certain job categories showing financial variability.

Heat map analysis underscores strong relationships between customer age and loan default,
as well as between customer income and loan amount. Larger loans are associated with a
higher default likelihood, and the average loan amount for defaulters has been increasing over
time.

May consistently has the highest customer numbers, potentially indicating seasonal
variations. Furthermore, individuals with higher education levels are more likely to subscribe
to term deposits.

