
Data Analysis for Research

Class 7 - Banking competition

Dario Maimone Ansaldo Patti

M.Sc. in Banking & Finance


King's Business School

01-02 Jun 2021


Outline
Banking Competition

Methodology
Equilibrium Testing
Disequilibrium approach
Panel Data

Fixed effects estimator
Random effects estimator
Estimation in Stata

The model
Pooled estimation
Fixed effects estimation
Random effects estimation
Test for the degree of competition
Disequilibrium approach
Banking Competition
Preliminaries

• A study by Matthews, Murinde and Zhao (2007) investigates competitive conditions in UK banking between 1980 and 2004 using the Panzar and Rosse (1982, 1987) approach.

• If the market is contestable, entry to and exit from it will be easy (even if the concentration of market share among firms is high), so that prices will be set equal to marginal costs. The technique used to examine this conjecture is to derive testable restrictions upon the bank's reduced form revenue equation.

Banking Competition
Methodology

• The empirical investigation consists of deriving an index (the Panzar-Rosse H-statistic) equal to the sum of the elasticities of revenues with respect to factor costs (input prices).

• If H lies between 0 and 1, we have monopolistic competition or a partially contestable equilibrium, whereas H ≤ 0 would imply a monopoly and H = 1 would imply perfect competition or perfect contestability.

• The key point is that, under perfect competition, an increase in input prices raises marginal costs and revenues by the same proportion, leaving each bank's output unchanged (H = 1), whereas under monopoly it reduces output and revenue (H ≤ 0).

• The model that Matthews et al. estimate is the following:

$$
\begin{aligned}
\log REV_{it} = {} & \alpha_0 + \alpha_1 \log PL_{it} + \alpha_2 \log PK_{it} + \alpha_3 \log PF_{it} \\
& + \beta_1 RISKASS_{it} + \beta_2 \log ASSET_{it} + \beta_3 \log BR_{it} \\
& + \gamma_1 GROWTH_t + \mu_i + v_{it}
\end{aligned}
\tag{1}
$$

where $REV_{it}$ is the ratio of bank revenues to total assets for bank i at time t; PL is the ratio of personnel expenses to employees (the unit price of labour); PK is the ratio of capital assets to fixed assets (the unit price of capital); and PF is the ratio of annual interest expenses to total loanable funds (the unit price of funds).

• The model also includes several variables that capture time-varying bank-specific effects on revenues and costs:
1. RISKASS: the ratio of provisions to total assets;
2. ASSET: bank size, measured by total assets;
3. BR: the ratio of the bank's number of branches to the total number of branches for all banks;
4. $GROWTH_t$: the rate of growth of GDP, which varies over time but is constant across banks at a given point in time;
5. $\mu_i$: bank-specific fixed effects;
6. $v_{it}$: an idiosyncratic disturbance term.

• The contestability parameter, H, is given by the following:

$$ H = \alpha_1 + \alpha_2 + \alpha_3 \tag{2} $$
• Notice that in order to evaluate the degree of competition in the banking sector, you test whether the contestability parameter is:

$$
H \;
\begin{cases}
= 0 \\
\in (0, 1) \\
= 1
\end{cases}
$$

• This implies that you should set up the following two null hypotheses:

$$ H_0^1 : \alpha_1 + \alpha_2 + \alpha_3 = 0 \tag{3} $$
$$ H_0^2 : \alpha_1 + \alpha_2 + \alpha_3 = 1 \tag{4} $$

• You can test the above null hypotheses using a Wald test.
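
As a sketch of what the Wald test computes in this setting: each hypothesis is a single linear restriction on the three input-price coefficients. Writing $\hat H = \hat\alpha_1 + \hat\alpha_2 + \hat\alpha_3$ and $q$ for the hypothesised value (both symbols are introduced here only for illustration), the statistic is:

```latex
W = \frac{(\hat H - q)^2}{\widehat{\operatorname{Var}}(\hat H)},
\qquad
\widehat{\operatorname{Var}}(\hat H) = \sum_{j=1}^{3} \sum_{k=1}^{3} \widehat{\operatorname{Cov}}(\hat\alpha_j, \hat\alpha_k),
\qquad q \in \{0, 1\}
```

Under the null, $W$ is asymptotically $\chi^2(1)$. With a single restriction, software often reports the equivalent $F(1, \text{df})$ form instead. Note that the variance of the sum needs the covariances between the three coefficients, not only their individual standard errors.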
• Notice that you may obtain the following results:

1. Rejection of $H_0^1$ and failure to reject $H_0^2$: the banking sector experiences perfect competition;

2. Failure to reject $H_0^1$ and rejection of $H_0^2$: the banking sector is a monopoly;

3. Rejection of both $H_0^1$ and $H_0^2$: the banking sector experiences monopolistic competition (provided the estimated H lies between 0 and 1).

Banking Competition
Equilibrium Testing

• Unfortunately, the Panzar-Rosse approach is only valid when applied to a banking market in long-run equilibrium.

• Hence the authors also conduct a test for this, which centres on the following regression:

$$
\begin{aligned}
\log ROA_{it} = {} & \alpha'_0 + \alpha'_1 \log PL_{it} + \alpha'_2 \log PK_{it} + \alpha'_3 \log PF_{it} \\
& + \beta'_1 RISKASS_{it} + \beta'_2 \log ASSET_{it} + \beta'_3 \log BR_{it} \\
& + \gamma'_1 GROWTH_t + \mu_i + \omega_{it}
\end{aligned}
\tag{5}
$$

• The explanatory variables for the equilibrium test regression are identical to those of the contestability regression, but the dependent variable is now the log of the return on assets (log ROA).
• Equilibrium is argued to exist in the market if:

$$ E = \alpha'_1 + \alpha'_2 + \alpha'_3 = 0 \tag{6} $$

• Notice that you can easily test the above hypothesis using a Wald test.

• Matthews et al. (2007) employ a fixed effects panel data model which allows for differing intercepts across the banks, but assumes that these effects are fixed over time.

• The fixed effects approach is appropriate given that the data analysed here contain an unusually large number of years (25) compared with the number of banks (12), resulting in a total of 219 bank-years (observations). This is fewer than 12 × 25 = 300, so the panel is unbalanced.

• The data employed in the study are obtained from banks' annual reports and the Annual Abstract of Banking Statistics from the British Bankers' Association. The analysis is conducted for the whole sample period, 1980-2004, and for two sub-samples, 1980-1991 and 1992-2004.

Banking Competition
Disequilibrium approach

• Suppose that the market is not in equilibrium ($E \neq 0$). In this case we can estimate:

$$
\begin{aligned}
\log REV_{it} = {} & \alpha_0 + (1 - \lambda) \log REV_{it-1} + \alpha_1 \log PL_{it} + \alpha_2 \log PK_{it} \\
& + \alpha_3 \log PF_{it} + \beta_1 RISKASS_{it} + \beta_2 \log ASSET_{it} \\
& + \beta_3 \log BR_{it} + \gamma_1 GROWTH_t + \mu_i + v_{it}
\end{aligned}
\tag{7}
$$

• where $REV_{it}$ is the ratio of bank revenues to total assets for bank i at time t; PL is the ratio of personnel expenses to employees (the unit price of labour); PK is the ratio of capital assets to fixed assets (the unit price of capital); and PF is the ratio of annual interest expenses to total loanable funds (the unit price of funds).
Empirical Analysis
Disequilibrium approach

• If $(1 - \lambda)$ is statistically significant, we have further evidence that revenues are not in equilibrium, as they follow a partial adjustment model:

$$ \log REV_{it} - \log REV_{it-1} = \lambda \left( \log REV^*_{it} - \log REV_{it-1} \right) $$

• where $\log REV^*_{it}$ is the equilibrium level. Rearranging the above equation, we have:

$$ \log REV_{it} = (1 - \lambda) \log REV_{it-1} + \lambda \log REV^*_{it} \tag{8} $$

• Notice that $\log REV^*_{it}$ (revenues in equilibrium) can be approximated by the same regressors as in equation (1). Substituting equation (1) into (8) for $\log REV^*_{it}$, we obtain equation (7).

• Usually, scholars estimate equation (7) and then use the estimated parameters ($\alpha_1 + \alpha_2 + \alpha_3$) to assess the degree of competition. [Notice that, with an abuse of notation, we use the same Greek letters to denote the parameters as in equation (1).]

• But is this correct? Notice that the above parameters are biased. In other words, the sum of the input-price coefficients estimated from equation (7), $H_t$, satisfies:

$$ H_t = \lambda H $$

• It follows that if we use the parameters from the estimation of equation (7) directly, we may obtain a distorted measure of the degree of competition.
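
The relation follows from substituting equation (1) into equation (8). A short derivation, writing $\alpha_k$ for the equilibrium coefficients of equation (1) and $\tilde\alpha_k$ (a symbol introduced here only for illustration) for the coefficients that equation (7) actually estimates:

```latex
\begin{aligned}
\log REV_{it} &= (1-\lambda)\log REV_{it-1} + \lambda \log REV^*_{it} \\
&= (1-\lambda)\log REV_{it-1} + \lambda\alpha_0 + \lambda\alpha_1 \log PL_{it}
   + \lambda\alpha_2 \log PK_{it} + \lambda\alpha_3 \log PF_{it} + \dots \\[4pt]
\Rightarrow\quad \tilde\alpha_k &= \lambda\,\alpha_k, \qquad
H_t = \tilde\alpha_1 + \tilde\alpha_2 + \tilde\alpha_3 = \lambda H, \qquad
H = \frac{H_t}{\lambda} = \frac{H_t}{1 - (1-\lambda)}
\end{aligned}
```

So the long-run H can be recovered by dividing the estimated sum by one minus the coefficient on lagged revenue. With $0 < \lambda < 1$, using $H_t$ directly understates the magnitude of H.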

Panel Data
Introduction
• Usually, a dataset has either a cross-section or a time-series dimension.
• However, we can combine both dimensions in order to increase the number of observations and gain efficiency in the estimation.
• Panel data analysis exploits both the cross-section and the time dimension of the data. Suppose we have N groups (groups could be countries, banks, firms and so on) and T years in our dataset. This means that for each group i = 1, ..., N we have T observations.
• Instead of estimating each group in isolation, we may carry out an analysis that takes into account the heterogeneity that comes from belonging to a specific group.
• In this case the total number of observations at our disposal is equal to N × T.

• Let us consider the following equation:

$$ y_{it} = \text{cons} + \beta_1 x_{1,it} + \beta_2 x_{2,it} + u_{it} \qquad \forall i = 1, \dots, N \text{ and } \forall t = 1, \dots, T \tag{9} $$

• The structure of equation (9) is similar to the one we already know, but we add a double subscript, it, which indicates that we are considering observations belonging to group i in time period t.
• An easy way to deal with equation (9) is to estimate a pooled regression, i.e. to stack all the cross-sections and time series together and apply standard OLS.

• While this is the easiest and most parsimonious way to estimate equation (9), we are implicitly assuming that the average values of the variables and the relationships between them are constant over time and across all the cross-sections.
• A different approach could be to estimate a separate time-series regression for each group in the data. However, this would not take into account the possible existence of a common structure in the series.
• Alternatively, we may estimate a separate cross-section model for each time period. In this case too, we would neglect the possible existence of common variation in the series over time.

• This is why we prefer to employ a panel data estimator when the data at our disposal have the structure described above.
• The panel data technique brings some advantages:
1. We can address a broader range of issues, including more complex ones, than if we estimate the individual time series and/or cross-sections available in the data;
2. It offers a "dynamic" view of the relationship between variables, which standard OLS on a single cross-section cannot. Also, combining cross-section and time-series data increases the number of degrees of freedom, which strengthens the power of the tests;
3. Finally, structuring the model in an appropriate way can remove certain forms of omitted variable bias.
Panel Data
Fixed effects estimator

• There exist two broad classes of panel estimators. The first is the fixed effects (FE) estimator, while the second is the random effects (RE) estimator.
• We start by analysing the FE estimator. To appreciate how the fixed effects model works, let us decompose the error term in equation (9) as follows:

$$ u_{it} = \eta_i + \varepsilon_{it} \tag{10} $$

• In the above decomposition, $\varepsilon_{it}$ is the standard disturbance term, which varies across time and cross-sections, while $\eta_i$ is a set of group-specific effects, one for each cross-section in the model.
• To keep things simple, let us suppose that the cross-sections in our data refer to countries.
• Therefore, $\eta_i$ is a set of country-specific effects, which encapsulate all of the variables that affect $y_{it}$ cross-sectionally but remain constant over time.
• Operationally, $\eta_i$ is captured by a set of dummy variables, each taking the value 1 if the observation belongs to country i and 0 otherwise.
• It follows that we can rewrite equation (9) as follows:

$$ y_{it} = \text{cons} + \beta_1 x_{1,it} + \beta_2 x_{2,it} + \eta_i + \varepsilon_{it} \qquad \forall i = 1, \dots, N \text{ and } \forall t = 1, \dots, T $$
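
Estimating N dummies explicitly is rarely necessary. An equivalent route, sketched here with the group time-averages $\bar y_i = T^{-1}\sum_t y_{it}$ (notation not used elsewhere in these slides), is the within transformation, which subtracts each group's mean and so removes both $\eta_i$ and the constant:

```latex
y_{it} - \bar y_i
  = \beta_1 \left( x_{1,it} - \bar x_{1,i} \right)
  + \beta_2 \left( x_{2,it} - \bar x_{2,i} \right)
  + \left( \varepsilon_{it} - \bar\varepsilon_i \right)
```

OLS on the demeaned data gives the same slope estimates as the dummy-variable regression. One consequence is that any regressor that is constant over time within a group is wiped out by the demeaning, so its coefficient cannot be estimated with fixed effects.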

• We can model our equation in a slightly different way if we have a time fixed effects model rather than a cross-sectional fixed effects one. In this case, a slightly different decomposition of the error term yields the required result:

$$ u_{it} = \eta_t + \varepsilon_{it} \tag{11} $$

• In this case, $\eta_t$ is a time-varying intercept which captures all the variables that affect $y_{it}$ and vary over time but remain constant cross-sectionally.

• Not surprisingly, we can model the error term $u_{it}$ in such a way that it contains both cross-section-specific effects and a time-varying intercept. In this case, equation (9) becomes:

$$ y_{it} = \text{cons} + \beta_1 x_{1,it} + \beta_2 x_{2,it} + \eta_i + \eta_t + \varepsilon_{it} \qquad \forall i = 1, \dots, N \text{ and } \forall t = 1, \dots, T \tag{12} $$

• In equation (12) we include both firm-specific and time-specific effects in order to capture any possible source of heterogeneity which might affect the relationship under investigation.

Panel Data
Random effects estimator

• There exists a different way to model the effects. Let us suppose the following decomposition of the error term in equation (9):

$$ u_{it} = \epsilon_i + \varepsilon_{it} \tag{13} $$

• In the above equation, $\varepsilon_{it}$ is the disturbance term defined a few slides ago, while $\epsilon_i$ is the firm-specific effect. The latter is now modelled as a random variable, which varies cross-sectionally but remains constant over time.

• A different way of modelling the error term consists of assuming that the random component $\epsilon$ is constant cross-sectionally and varies over time, i.e.:

$$ u_{it} = \epsilon_t + \varepsilon_{it} \tag{14} $$

• Clearly, as before, we may combine the two specific effects to estimate the following model:

$$ y_{it} = \text{cons} + \beta_1 x_{1,it} + \beta_2 x_{2,it} + \epsilon_i + \epsilon_t + \varepsilon_{it} \qquad \forall i = 1, \dots, N \text{ and } \forall t = 1, \dots, T \tag{15} $$

Estimation in Stata
Data Arrangement

• Before estimating equation (12), it is worth spending a few words on our dataset, since the data must be prepared in a specific way.
• Apart from our variables, the data need to contain some additional information on the time and cross-section dimensions. In particular, they must contain an id identifier and a year variable.
• The id identifies the group to which each observation belongs, while the year indicates the year in which the variable is observed.
• If the id is not available in your dataset, you can create it with the command egen id=group(name), where name is the variable containing, for instance, the names of the banks.
• We use the dataset named Data_Analaysis_Class_7.dta.
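
A minimal sketch of the preparation steps, assuming the class dataset is in the working directory, that the bank-name variable is called name (a placeholder, as above), and that the time variable is called time, as in the commands used later in these slides:

```stata
* Load the class dataset (file name as given in the slides)
use "Data_Analaysis_Class_7.dta", clear

* Build a numeric panel identifier from the bank name.
* Skip this line if the dataset already contains id;
* name is a placeholder for the variable holding the bank names.
egen id = group(name)

* Declare the panel structure: cross-section identifier first, then time
xtset id time

* Show the pattern of observations per bank (balanced or unbalanced panel)
xtdescribe
```

Here xtset plays the same role as the tsset id time command used later; both declare the data to be a panel.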
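Estimation in Stata
The model

• Using a set of French firms, we want to evaluate the degree of competition in the French banking system. More specifically, we want to estimate the following equation:

$$
\begin{aligned}
\mathit{lnir}_{it} = {} & \alpha_i + \beta_1 \mathit{lnlabour}_{it} + \beta_2 \mathit{lncapital}_{it} + \beta_3 \mathit{lndeposit}_{it} \\
& + \beta_4 \mathit{lnloans}_{it} + \beta_5 \mathit{lnequity}_{it} + \beta_6 \mathit{lnta}_{it} + \eta_i + \eta_t + \varepsilon_{it}
\end{aligned}
$$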

• In the above equation, lnir is the log of net interest revenue for bank i at time t; lnlabour is the log of the labour cost (the variable is named lnlabor in the Stata commands); lncapital is the log of the capital cost; lndeposit is the log of the price of deposits; lnloans is a proxy for the general level of credit risk faced by banks; lnequity is a proxy for the general level of risk; and lnta is the log of total assets, which proxies the size of the bank. Finally, $\eta_i$ and $\eta_t$ are sets of bank-specific and time-specific dummies. The data range from 1990 to 2008.

Estimation in Stata
Pooled estimation

• We start our analysis by estimating a pooled model, i.e. we stack all the observations and apply the OLS estimator, without taking into account the heterogeneity arising from the cross-section and time-series dimensions.
• To estimate the model, we type (the command below should go on a single line in the command window):

reg lnir lncapital lndeposit lnlabor lnta lnequity lnloans, r

• The option r requests heteroskedasticity-robust standard errors. The result of our estimation is reported in the next slide.


Figure 1: Pooled data estimation

• In Figure 1, the results can be interpreted in the standard way. We can notice:
1. Red box: $R^2$ indicates that our model explains about 94% of the total variability of the dependent variable.
2. Blue box: the F-test is a test of overall significance, telling us that the regressors are jointly significant.
3. Green box: the coefficients, whose meaning and significance can be interpreted as usual.

• We can now check the degree of banking competition. Remember that the degree of competition is calculated as the sum of the coefficients associated with labour, capital and deposits.
• Therefore, we set up the following hypotheses:

$$ H_0^A : \beta_1 + \beta_2 + \beta_3 = 0 \tag{16} $$
$$ H_0^B : \beta_1 + \beta_2 + \beta_3 = 1 \tag{17} $$

• Depending on the results of our tests, we can reach a conclusion about the degree of competition.

• We can consider the following cases:

1. If $H_0^A$ is not rejected: the market operates under a monopoly regime;

2. If $H_0^A$ is rejected, we test the hypothesis $H_0^B$. If we cannot reject the latter null hypothesis: the market operates under a perfect competition regime;

3. If we also reject $H_0^B$: the market operates under a monopolistic competition regime.

• To test the above hypotheses, we first type in the command window:

test lnlabor+lndeposit+lncapital=0

• If we reject that hypothesis, we then type:

test lnlabor+lndeposit+lncapital=1

• The results of the tests indicate that the banking sector under analysis displays monopolistic competition, since both null hypotheses are rejected.

• The results are reported in the next slide.
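
The two tests say whether the sum differs from 0 and from 1, but not what the sum is. As a small sketch, assuming the class dataset is loaded and taking the dependent variable to be lnir as in the model equation, Stata's lincom command reports the point estimate of H together with its standard error and confidence interval:

```stata
* Pooled OLS with heteroskedasticity-robust standard errors
regress lnir lncapital lndeposit lnlabor lnta lnequity lnloans, vce(robust)

* Point estimate, standard error and 95% confidence interval for
* H = sum of the three input-price elasticities
lincom lnlabor + lndeposit + lncapital

* Wald tests of the two nulls: H = 0 (monopoly), H = 1 (perfect competition)
test lnlabor + lndeposit + lncapital = 0
test lnlabor + lndeposit + lncapital = 1
```

Rejecting both nulls points to monopolistic competition most naturally when the confidence interval from lincom lies inside (0, 1).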



Figure 2: Banking competition tests

Estimation in Stata
Fixed effects estimation

• However, the estimation carried out so far does not take into account the possible heterogeneity arising from the cross-section and time dimensions.
• Therefore, we start with the estimation of a one-way fixed effects panel data model (i.e. we also include the set of group fixed effects $\eta_i$).
• First, however, we need to declare the panel structure of our dataset. We do this by typing tsset id time.
• We are now in a position to estimate a panel data model. We can type into the command window (on one line):

xtreg lnir lncapital lndeposit lnlabor lnta lnequity lnloans, fe

Figure 3: The fixed effects panel data estimation

• A few things should be noted:

1. Overall $R^2$: goodness of fit of our model. Same interpretation as before (red box);
2. F-statistic: test of overall significance (blue box);
3. F-test for the significance of the set of group dummies. Significance indicates that those effects are relevant and explain part of the variability of the dependent variable (as further indicated by the parameter ρ);
4. Coefficients are interpreted in the same way as before (yellow box).
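
For reference, the ρ reported at the bottom of the xtreg output is the share of the composite error variance attributed to the group effects. In the notation of equation (10), and writing $\sigma^2_\eta$ and $\sigma^2_\varepsilon$ for the two variances (symbols introduced here for illustration):

```latex
\rho = \frac{\sigma^2_\eta}{\sigma^2_\eta + \sigma^2_\varepsilon}
```

Stata labels the corresponding estimates sigma_u and sigma_e. A value of ρ close to 1 means that most of the unexplained variation is between banks rather than within them over time.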

• As we said earlier, we could also include a set of period dummies, $\eta_t$, to capture any temporal shock.
• To do this, we can simply re-estimate our previous model as follows:

xtreg lnir lncapital lndeposit lnlabor lnta lnequity lnloans i.time, fe

• The term i.time asks the software to generate the set of dummies for the period effects.
• The results of our estimation are reported in the next slide.


Figure 4: The two-way fixed effects panel data estimation

• Although we included the set of period dummies, we do not yet know whether they are jointly significant in explaining the variability of the dependent variable.
• We can perform a specific test by typing testparm i.time. The null hypothesis, $H_0$, is that the coefficients on the time dummies are jointly equal to 0.
• The test clearly rejects the null hypothesis, i.e. the set of time dummies is important in capturing period shocks.
• The result of the test is reported in the next slide.


Figure 5: Joint test for the significance of the time dummies


• Since the estimation is likely to be affected by heteroskedasticity (we mix banks with different characteristics within the same dataset), we need to re-estimate our model with robust standard errors.
• Robust standard errors are requested in the standard way, that is, by adding r at the end of the command line:

xtreg lnir lncapital lndeposit lnlabor lnta lnequity lnloans i.time, fe r
est store robust

• The first line is the estimation command, while the second one saves the estimation results under the name robust. Results are reported in the next slide.

Figure 6: Robust estimation

Estimation in Stata
Random effects estimation

• We now estimate a random effects model. To carry out this estimation, we type:

xtreg lnir lncapital lndeposit lnlabor lnta lnequity lnloans i.time, r

• Notice that, to estimate a random effects model, we do not need to specify an estimator after the comma: random effects is the default for xtreg. The r in the command above only requests robust standard errors.

• The results of the estimation are reported in the next slide (the estimated time dummies are not reported).

Figure 7: Random effects estimation

• The interpretation of the random effects results is similar to that of the previous estimation.

• However, the most important point is to understand whether we should estimate our model using the fixed or the random effects estimator.

• In order to choose, we can implement a Hausman test.

• Notice that in order to run the Hausman test, we should:

1. Re-estimate both the fixed and the random effects models without robust standard errors;
2. After each estimation, store the results by typing est store followed by a name for the estimation.

• The complete set of commands is reported below:

1. xtreg lnir lncapital lndeposit lnlabor lnta lnequity lnloans i.time, fe
2. est store fixed
3. xtreg lnir lncapital lndeposit lnlabor lnta lnequity lnloans i.time
4. est store random
5. hausman fixed random
• The result of the test is reported in the next slide.
• It should be noted that, under the null hypothesis, the difference between the fixed and random effects coefficients is not systematic, so the random effects model is preferable to the fixed effects one. Rejecting the null favours fixed effects.
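
For reference, the statistic behind the hausman command compares the two sets of slope estimates. Writing $\hat\beta_{FE}$ and $\hat\beta_{RE}$ for the coefficient vectors, $\hat V_{FE}$ and $\hat V_{RE}$ for their estimated covariance matrices, and $k$ for the number of coefficients compared (notation introduced here for illustration):

```latex
m = \left( \hat\beta_{FE} - \hat\beta_{RE} \right)'
    \left( \hat V_{FE} - \hat V_{RE} \right)^{-1}
    \left( \hat\beta_{FE} - \hat\beta_{RE} \right)
    \;\overset{a}{\sim}\; \chi^2(k)
```

This form relies on the random effects estimator being efficient under the null, which is why both models are re-estimated without robust standard errors before the test is run.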
• Our conclusion is that the fixed effects model is preferable to the random effects one.

Figure 8: Hausman test

Estimation in Stata
Test for the degree of competition

• Having established that the two-way fixed effects estimation is preferable, we can test the degree of competition in the banking sector.
• We first check whether the sector behaves as a monopoly by testing:

test lncapital+lndeposit+lnlabor=0

• Failing to reject the null hypothesis above implies that the sector acts as a monopoly.
• If we reject it, we check whether the sector is perfectly competitive:

test lncapital+lndeposit+lnlabor=1
• Failing to reject the second null hypothesis implies that the sector is perfectly competitive.
• If instead we also reject the second null hypothesis, the sector is characterised by monopolistic competition, since it is neither perfectly competitive nor a monopoly.
• Before running the tests, we need to recall the estimation we previously saved:

est restore robust

• Results are reported in the next slide.


Figure 9: Tests for the degree of competition

Estimation in Stata
Disequilibrium approach
• Suppose that you find that the market is not in equilibrium and you want to estimate a dynamic model, i.e.:

$$
\begin{aligned}
\mathit{lnir}_{it} = {} & \alpha_i + \gamma_1 \mathit{lnir}_{it-1} + \beta_1 \mathit{lnlabour}_{it} + \beta_2 \mathit{lncapital}_{it} + \beta_3 \mathit{lndeposit}_{it} \\
& + \beta_4 \mathit{lnloans}_{it} + \beta_5 \mathit{lnequity}_{it} + \beta_6 \mathit{lnta}_{it} + \eta_i + \eta_t + \varepsilon_{it}
\end{aligned}
$$
• Notice that the model above can be estimated using the FE estimator, but the latter can produce biased estimates, due to the presence of the lagged dependent variable ($\mathit{lnir}_{it-1}$). A better choice is a dynamic panel data estimator, specifically the difference GMM (Arellano and Bond) and/or the system GMM (Blundell and Bond) estimator.
• A good introduction to dynamic panel data models and the different ways they can be estimated in Stata can be found here
• Different routines exist in Stata to estimate dynamic panel data models.
• One of the best is xtabond2, a user-written command (you need Stata on your own machine to download and install it).
• Alternatively, you can use the built-in routine xtdpd. In the do-file uploaded with these notes, you can find an example of the estimation of a dynamic panel data model.
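
As a rough sketch of the idea, and not the example from the do-file: the block below uses xtabond, Stata's built-in Arellano-Bond difference GMM command, on the class variables. It then rescales the sum of the input-price coefficients by one minus the coefficient on lagged lnir, which undoes the λ scaling discussed in the disequilibrium slides. It assumes the class dataset is loaded with id and time available, and it leaves out the time dummies for brevity.

```stata
* Assumes the class dataset is in memory; declare the panel first
xtset id time

* Difference GMM with one lag of the dependent variable
* (time dummies omitted here to keep the sketch short)
xtabond lnir lnlabor lncapital lndeposit lnloans lnequity lnta, lags(1) vce(robust)

* Short-run sum of the input-price elasticities (lambda * H)
lincom lnlabor + lncapital + lndeposit

* Long-run H: divide by lambda = 1 - coefficient on lagged lnir
nlcom (_b[lnlabor] + _b[lncapital] + _b[lndeposit]) / (1 - _b[L.lnir])
```

The nlcom output gives a delta-method standard error for the long-run H, so the hypotheses H = 0 and H = 1 can be judged against its confidence interval.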
