
Enterprise Architecture and Information System

Tool: Tableau, Excel and SPSS


Sector: Banking

Division E
Team 8
E022 Pranav Gupta
E043 Saurabh Singh
E049 Preyal Prajapati
E059 Anmol Bhardwaj
E062 Tanika Goyal
Introduction:
Despite the significant spread of formal banking and financial institutions, a sizeable share
of the Indian population remains excluded from the services and financial benefits of the
formal banking system. According to Census 2011, only 58.6% of households in the country had
access to banking services.

Financial Inclusion is the process of ensuring universal access along with sufficient and timely
availability of banking and financial services, especially to the weaker sections of the society.

In 2013, CRISIL published a framework for measuring an index of financial inclusion based on 3
parameters, namely:
1. Penetration based on total accounts per population
2. Availability of Banking Branches
3. Usage of Banking Facilities based on deposit and credit penetration

In our project, we consolidate various data obtained from RBI's official data postings and
Official census data to derive the results.

Dataset Description:
To mine the best insights and gain a holistic view, we have created an integrated dataset from
multiple sources including RBI’s official website, EPWRF India Time Series, and other official
government sources. The synopsis of the dataset is as follows:
1. Gross and net NPAs, both amount and as % of advances, bank group-wise viz. Scheduled
Commercial Banks, Public Sector Banks, and Foreign Banks in India.
2. Total no. of offices and sub-categories based on region population group and the type of
bank.
3. Credit Availability as Gross Bank Credit to various industries/sectors and individuals.
4. NPAs are classified under the head – Standard Assets, Sub-Standard Assets, Doubtful
Assets, and Loss Assets which are further sub-classified based on the type of banks.

For the Financial Inclusion figures, data is retrieved from Basic Statistical Returns of Scheduled
Commercial Banks in India, RBI (2018) and Census of India, 2011.

Part I: FINANCIAL INCLUSION

Objective
To analyze and measure the index of financial inclusion in the Indian states and to determine the
key factors contributing to it.
Implementation
The entire project can be divided into 3 major phases that include: Data Cleaning and
Preprocessing, Data Analysis and Visualization.

Data Cleaning and Preprocessing

From the dataset, relevant data about the states and banking facilities were retrieved. It was
further ensured that all the data were based on the same time frame to maintain uniformity in the
analysis.

Data Analysis

● Dimension 1: State-wise Banking Penetration


Banking penetration measures the total number of bank accounts relative to the population of
the state.

Methodology:
1. To analyze penetration, we computed the ratio of bank accounts to the total
population in each of the states.

2. To compare the states, we normalized these ratios to build the banking penetration
dimension.

Here, since the highest value is that of Goa = 3.75878084 and the lowest value is that of
Nagaland = 0.595201824, the banking penetration dimension is calculated by min-max
normalization as:

Penetration Dimension = (state value - 0.595201824) / (3.75878084 - 0.595201824)

3. Based on these calculations, all the states are assigned a Banking Penetration
Dimension within the range of 0-1.
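
The ratio and normalization steps above can be sketched in a few lines of Python. This is only an illustration of min-max normalization, not the workflow actually used in the project; the state names and the account and population counts are hypothetical placeholders, not figures from the dataset.

```python
def penetration_ratio(accounts, population):
    """Bank accounts per person for each state."""
    return {state: accounts[state] / population[state] for state in accounts}


def min_max_normalise(values):
    """Rescale values so the lowest maps to 0 and the highest maps to 1."""
    lowest = min(values.values())
    highest = max(values.values())
    return {
        state: (value - lowest) / (highest - lowest)
        for state, value in values.items()
    }


# Hypothetical inputs for three made-up states (not real data).
accounts = {"State A": 300, "State B": 120, "State C": 60}
population = {"State A": 100, "State B": 100, "State C": 100}

ratios = penetration_ratio(accounts, population)
penetration_dimension = min_max_normalise(ratios)
```

With this scaling, the state with the highest ratio always receives 1 and the state with the lowest ratio always receives 0, which matches the 0-1 range described above.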

Data Visualisation

This data is then fed into Tableau for visual analysis. The states are given the geographic
role datatype and the Indian Map add-in is installed in Tableau. With the Banking Penetration
Dimension assigned as one of the measures, the colour scale is chosen, and total Bank accounts
and Population are added in the Details tab. The colour shading makes it easy to identify
the states ranking low in the Banking Penetration dimension.
To cluster the states on this index into High, Medium, and Low groups, we
move to the Analytics tab → Choose Clustering → Enter number of clusters required as 3.
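
Tableau's clustering feature is based on the k-means algorithm. To make the grouping into three clusters more concrete, here is a minimal one-dimensional k-means sketch in Python. It is not Tableau's implementation (initialization and scaling may differ), and the dimension scores used are hypothetical.

```python
def kmeans_1d(values, k=3, iterations=100):
    """Cluster 1-D values with Lloyd's k-means; returns (labels, centroids)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical penetration-dimension scores (not values from the dataset).
scores = [0.0, 0.05, 0.1, 0.45, 0.5, 0.55, 0.9, 1.0]
labels, centroids = kmeans_1d(scores, k=3)

# Name clusters by centroid order so the lowest centroid is "Low".
order = sorted(range(3), key=lambda c: centroids[c])
names = {order[0]: "Low", order[1]: "Medium", order[2]: "High"}
groups = [names[lab] for lab in labels]
```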

Output
Interpretation of the Analysis

As evident from the table and the associated bar, Goa has the highest index in terms of
banking penetration. However, this can partly be attributed to its relatively small
population. Among the north-eastern states, Nagaland has the lowest banking penetration
dimension, but the other states in the region also lag on this measure. Moreover, apart from
Delhi and Goa, all other states fall well short in terms of bank accounts, and this includes
states with high literacy rates such as Tamil Nadu and Kerala.
_____________________________________________________________________________

● Dimension 2: Availability of Banking Services



This dimension captures the number of bank offices available relative to the population. It
may happen that even though people have access to a bank account, the time and effort spent
in visiting a bank located about 50 km away defeats the very purpose of financial inclusion.
With this perspective in mind, we measure this dimension as the number of banking branches
available per 1000 people.
Methodology:
1. To analyze availability, we computed the total branches of the scheduled commercial banks
per 1000 population for each of the states.

2. As explained in the previous section, the values are again normalised to derive the bank
availability dimension.
Here, since the highest value is that of Goa = 0.459361898 and the lowest value is that of
Manipur = 0.053688351, the Banking Availability dimension is calculated by min-max
normalisation as:

Availability Dimension = (state value - 0.053688351) / (0.459361898 - 0.053688351)

3. Based on these calculations, all the states are assigned an Availability Dimension
within the range of 0-1.
4. Further steps regarding data visualization are the same as mentioned in the previous part.

Output
Interpretation of the Analysis

Thus, when we consider the dimension of banking services available, all the states except
Goa fall substantially short, with a score below 0.5. The all-India average stands at 0.266
and, as seen in the visual chart, even states like Maharashtra, Gujarat, and Andhra Pradesh,
which are developed and “well-off”, lack branches relative to their population size. The
northeast region again requires improvement in this parameter, along with other states such as
Bihar, Rajasthan, Andhra Pradesh, and even Uttarakhand.
_____________________________________________________________________________

● Dimension 3: Usage of Banking Services


The availability of a banking branch and a bank account is of little importance if people do not
feel secure enough to deposit their money or avail credit facilities. For calculating the usage
dimension, the 2 basic banking services, namely deposit and credit, are considered.

Methodology:
1. To analyze usage, we calculate the credit-deposit ratio for each of the states.

2. We repeat the normalization process and rank the states according to their credit-deposit
ratio to arrive at the usage dimension within the range of 0-1. Here Tamil Nadu has
the highest value = 1.1904 and Sikkim has the lowest recorded value = 0.2563895.

3. Further steps for data visualisation remain the same as mentioned in the first part.
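
As a rough illustration of the usage steps above, the sketch below computes a credit-deposit ratio, rescales it to 0-1, and ranks the states. The function names and the credit and deposit amounts are hypothetical and serve only to show the arithmetic.

```python
def credit_deposit_ratio(total_credit, total_deposits):
    """Credit extended divided by deposits held."""
    if total_deposits <= 0:
        raise ValueError("total_deposits must be positive")
    return total_credit / total_deposits


def usage_dimension(ratios):
    """Min-max rescale the credit-deposit ratios to the 0-1 range."""
    lowest, highest = min(ratios.values()), max(ratios.values())
    return {s: (r - lowest) / (highest - lowest) for s, r in ratios.items()}


def rank_states(dimension):
    """State names ordered from highest to lowest usage dimension."""
    return sorted(dimension, key=dimension.get, reverse=True)


# Hypothetical credit and deposit totals for three made-up states.
credit = {"State X": 120.0, "State Y": 50.0, "State Z": 30.0}
deposits = {"State X": 100.0, "State Y": 100.0, "State Z": 100.0}

cd_ratios = {s: credit_deposit_ratio(credit[s], deposits[s]) for s in credit}
usage = usage_dimension(cd_ratios)
ranking = rank_states(usage)
```

A ratio above 1, as reported for Tamil Nadu, simply means that credit extended in the state exceeds the deposits collected there.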
Output
Interpretation of the Analysis

When we compare the usage dimension with the others, we see a very different picture. Goa,
which was leading in the previous two dimensions, is among the lowest here, while states such
as Rajasthan and Andhra Pradesh, which were lagging in the Availability and Penetration
dimensions, score relatively higher on usage. This shows the stark difference between
distribution and usage patterns as far as banking services are concerned. A state like
Maharashtra, which might not have sufficient availability and has only moderate penetration,
ranks high on usage, indicating that the people who do have access to financial instruments
avail them, whereas in Goa's case, although the facilities are present, the usage of banking
services is relatively lower.
_____________________________________________________________________________

● Building The Index For Financial Inclusion


After analyzing all three dimensions, we used Principal Component Analysis (PCA) in SPSS to
determine the weights to be given to each of the dimensions. PCA is a
dimensionality-reduction technique that is frequently used to reduce the dimensionality of
large data sets. It retains most of the information in the data while using fewer variables.
Here, we use the technique to identify which components capture the most variance and derive
the weighted values accordingly.
Methodology:
1. First, the dataset is loaded in SPSS → Analyze → Dimension Reduction → Factor.
2. Choose the 3 dimensions → in the Extraction Tab → Method: Principal Component →
Covariance Matrix → Based on eigenvalues greater than 1.
3. Based on the component matrix, the two components with eigenvalues greater than 1,
namely Penetration and Availability, are chosen. According to the results, they
cumulatively explain 98% of the variance in the data.
The summary of the entire process is as follows:

4. Now, the extracted component loadings are multiplied by the eigenvalues in order to derive
weights. Here, 0.987*1.933 + 0.024*1.107 = 1.934873.
5. The respective weights of all three dimensions are then multiplied by the actual
dimension values, and the total is divided by 5.707855 to determine the index for financial
inclusion for each state.
For example, for Andhra Pradesh, the calculations would be as follows.

6. Finally, the IFI obtained is normalized to achieve the overall rankings of the states.
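
The weighting in steps 4 and 5 can be sketched as follows. The sketch assumes that the divisor 5.707855 is the sum of the three dimension weights, which the text does not state explicitly. The dimension names, the weights in the second example, and the dimension values are hypothetical. With the rounded loadings and eigenvalues quoted in step 4, the product comes to roughly 1.9344, slightly different from the reported 1.934873, presumably because the reported figure was computed from unrounded values.

```python
def component_weight(loadings, eigenvalues):
    """Weight of one dimension: sum of loading * eigenvalue over components."""
    return sum(l * e for l, e in zip(loadings, eigenvalues))


def financial_inclusion_index(dimensions, weights):
    """Weighted average of dimension values (assumes divisor = sum of weights)."""
    total_weight = sum(weights.values())
    return sum(weights[name] * dimensions[name] for name in weights) / total_weight


# Loadings and eigenvalues as quoted in step 4 (rounded figures).
example_weight = component_weight([0.987, 0.024], [1.933, 1.107])

# Hypothetical weights and dimension values, for illustration only.
weights = {"penetration": 2.0, "availability": 1.0, "usage": 1.0}
dimensions = {"penetration": 0.5, "availability": 1.0, "usage": 0.0}
ifi = financial_inclusion_index(dimensions, weights)
```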
Screenshots of the process:
Output
Below is the Tableau Dashboard covering all the three parameters:
Interpretation of the results

Finally, based on the consolidated results, we can group the states into 3 major categories
with respect to their financial inclusion index.
1. States with low financial inclusion: the ones having an index < 0.2
2. States with medium financial inclusion: the ones having an index between 0.2 and 0.4
3. States with high financial inclusion: the ones having an index > 0.4
Based on the study, we can further identify the exact dimensions on which a state is lagging.
For example, Rajasthan is not lagging in the Usage dimension but requires more penetration
and availability of banking facilities.
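
The three categories can be expressed as a small helper. This is a sketch; treating the boundary values 0.2 and 0.4 as "Medium" is an assumption based on the wording "between 0.2 and 0.4".

```python
def inclusion_category(index):
    """Map a financial inclusion index to Low, Medium, or High."""
    if index < 0.2:
        return "Low"
    if index <= 0.4:
        # Assumption: both boundary values fall in the medium band.
        return "Medium"
    return "High"
```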

Applications of this study

This study can be used by the government to look at factors of financial inclusion beyond
just the number of bank accounts, which is what various government schemes address.
In the banking sector, scheduled banks can use this data to identify potential segments where
they can open new branches and redesign schemes directed at the weaker sections of
society.
Part II: FORECASTING NPAs

Objective:

The main objective of this analysis is to come up with a methodology for forecasting
Non-Performing Assets (NPAs) which can thereby be used to forecast the NPA of an Indian bank
and gauge the crisis that the Indian banking system is facing.

Methodology:

Regression models are widely used for forecasting. Based on the trend of the independent
variables, they predict the trend of the dependent variable. Since we predict NPAs
based on external factors, regression models are a good fit for this situation. Simple
Linear Regression (SLR) could not be used because it gives the value of a dependent variable (in
our case NPA) in terms of only one independent variable. NPA depends on many
factors, so this method is unlikely to give satisfactory results.

● There are various forecasting methods that can be used to predict the variable of interest.
We used Multiple Linear Regression, which models the relationship between the
dependent variable and multiple independent variables.
● The independent variables we used to predict the dependent variable, NPA, were Repo rate,
Net Loans and Advances, Consumer price index (CPI) and Gross domestic product (GDP).
● We ran multiple linear regression on the data for a certain number of years, then
predicted the NPA for the next 2-3 years and compared those predicted values with the
actual values.
● We calculated error measures from the predicted and actual values to understand
the accuracy of the model. We also compared every possible
combination of independent variables to see which combination gives the most
accurate result, based on Mean Absolute Deviation (MAD),
Mean Squared Error (MSE), Mean Absolute Percentage Error (MAPE) and the
R-square value.
● Based on these error values, we selected the best combination of variables: the one
with the minimum errors gives the most accurate result. We performed
the same operations for different types of banks, such as Scheduled Commercial Banks,
Public Sector Banks, New Private Sector Banks, etc.
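
The procedure described above (fit a multiple linear regression, forecast, then score the forecast with MAD, MSE, and MAPE across variable combinations) can be sketched in plain Python. This is an illustrative ordinary-least-squares fit via the normal equations, not the tool-based workflow used in the project. All function names and the sample numbers are hypothetical.

```python
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[p] * xi[q] for xi in x) for q in range(k)] for p in range(k)]
    xty = [sum(xi[p] * yi for xi, yi in zip(x, y)) for p in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coef, row):
    """Predicted value for one row of independent variables."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def mad(actual, predicted):
    """Mean Absolute Deviation."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Squared Error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def variable_subsets(names):
    """Every non-empty combination of independent variables."""
    return [c for r in range(1, len(names) + 1) for c in combinations(names, r)]


# Hypothetical training data: two predictors and a target built as 2 + 3*x1 - x2.
train_rows = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
train_y = [3, 7, 7, 11, 11]
coefficients = fit_ols(train_rows, train_y)
```

With four independent variables there are 15 non-empty combinations to compare, so fitting each one and keeping the combination with the lowest errors on the held-out years is computationally cheap.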
Variables Used in Forecasting Technique

To build the relationship and forecast NPAs, we first determine the factors that affect
NPAs in the Indian banking system. The following four quantitative variables are used to
understand the relationship, which can later be used to predict NPAs in the banking sector:

1) Repo rate

Repo rate is the rate at which the central bank of a country lends money to commercial banks
when they face a shortage of funds. This rate indicates the prevailing interest rate in the
economy. The repo rate, as an indicator of the interest rate, together with inflation has a
combined effect on the economy and the borrower’s ability to pay back.

2) Gross domestic product

A country’s GDP is the total market value of all final goods and services produced in the country
in a given year, which is equal to the total consumer, investment, and government spending plus
net exports. GDP is a growth indicator of an economy. GDP growth directly impacts loans and
advances, which in turn affects NPAs.

3) Loans and advances

This is one of the most crucial factors to predict and understand NPA patterns. There is a direct
relationship between the number of loans and advances and the proportion of NPA.

4) Inflation rate

The inflation rate is calculated using the consumer price index. Inflation has a direct impact on
the cost of borrowing. A rise in inflation lowers the real cost of borrowing, thereby
leading to higher borrowing and subsequently higher NPAs.
Output:
Interpretation of the Results

Thus, based on the linear stepwise regression model that was used to forecast NPAs for the Indian
banking sector, we can analyze the key indicators that have been leading to this
problem. From our analysis, we conclude that GDP is one of the key factors responsible for
the percentage of non-performing assets. This is also intuitive, as it shows how a
macroeconomic problem can in turn lead to economic problems for a business that has
borrowed from banks and fails to pay back due to the faltering performance of the
overall economy.
Applications of this study

This study can be used by banking organizations to build a robust NPA prediction algorithm by
including more parameters such as company size and specific balance sheet parameters
such as the current ratio, inventory turnover ratio, etc. Moreover, the linear regression
technique can be combined with other advanced prediction techniques to improve the
accuracy of this model.

References

● Principal Components Analysis (PCA) using SPSS Statistics. (n.d.). Retrieved March 16,
2021, from
https://statistics.laerd.com/spss-tutorials/principal-components-analysis-pca-using-spss-statistics.php#procedure
● Vallabh, G., Singh, D., Prasoon, R. and Singh, A. (2016) Methodology to Predict NPA in
Indian Banking System. Theoretical Economics Letters, 6, 827-836.
http://dx.doi.org/10.4236/tel.2016.64087
