Empirical Corporate and Structured Finance via Machine Learning Models*

Haokun Wei1
Johns Hopkins University and BAUM Tenpers Institute

June 2023

Abstract

This paper builds on relevant asset-pricing models and applies machine-learning models to address research
questions in corporate and structured finance. We focus on the US and EU markets and propose a new hybrid
model that resolves the problem of generating forward-looking insights on corporate decisions, particularly
forecasts of structured products credit spreads and the cost of equity capital. The model performs well in
analyzing the outlook for 1) the cost of equity capital and 2) the credit spreads of structured products. We
combine four deep learning models – LSTM, MLP, CNN, and HM – to develop the proposed hybrid model
that generates proactive insights which can aid capital structure decisions both in traditional corporate finance
and in structured finance. Overall, our hybrid model introduces machine learning to capital structure outlook
in corporate finance and lays a foundation for future work on outlook-driven corporate decision-making that
strengthens corporate governance and capital structure and ultimately enhances corporate performance.

Keywords: cost of equity capital, structured products, credit spreads, machine learning
JEL Classification: C51, C52, C53, C58, G1, G3, G15, G17, G32

1
Johns Hopkins University and BAUM Tenpers Institute (Finance Research Group). Email: hwei14@jhu.edu

*
I thank Daniel Ahelegbey, Yomi Anifowose, and seminar participants at the BAUM Tenpers Research Seminar for helpful
comments. I am indebted to Bouchra Errabany and Oyakhi Ibhagui for the modelling and guidance provided in this paper,
the one-on-one classes and exposition into modern machine learning techniques, and the provision of hard-to-find market
data on structured securitized financial products. I want to thank the academic management of BAUM Tenpers Institute for
giving me outstanding scholarship assistance which has made it possible for me to attend the pre-PhD training program,
take courses and conduct research in finance. The views in this paper are solely my responsibility and should not be
interpreted as reflecting the views of anyone or any institution other than myself. All remaining errors are my own.

Electronic copy available at: https://ssrn.com/abstract=4464555


1. Introduction

The outlook for the cost of capital, whether raised through traditional plain vanilla corporate finance
products such as equities and debt or through exotic structured finance products such as collateralized
and non-collateralized securitized products, has dominated boardroom debates for decades. It does so not
only because it guides the outlook for the future capital structure decisions of corporates but also
because it is a topical issue that has continued to generate differing views, both in the academic
literature and in the practice of corporate and structured finance.

This paper utilizes machine learning models to address corporate and structured finance research
questions, focusing on the US and EU. We propose a novel hybrid model that effectively generates
forward-looking insights regarding corporate decisions, resolving the problem of accurately predicting
cost of equity capital and credit spreads of structured products. Our hybrid model offers proactive
insights that can aid in capital structure decisions in traditional corporate and structured finance. The
model contributes to efficient corporate decision-making, strengthening corporate governance,
improving the outlook for capital structure, and enhancing overall corporate performance.

In the empirical finance literature, previous studies have often relied on standard econometric practices
as the holy grail for generating empirical insights on the optimal outlook for the cost of capital. Some
of these studies face several challenges, including improperly structured questions, misspecification of
the empirical strategy, and results that are inconsistent with the underlying processes of the corporate
data. These challenges are most acute in studies that examine corporate decision making using time series
corporate data analyzed in levels, ignoring the spuriousness that such a procedure introduces into the
results and into corporate policy implications, decisions, and implementation. The risk of spurious
associations and their implications for inferences about the outlook on future corporate decisions are
well documented in Smith and Watts (1992), Gaver and Gaver (1993), Ferson et al. (2003), Ioannidis et al.
(2003), Klapper and Love (2004), Cheung and Wei (2006), Coles et al. (2012), Huang and Kisgen (2013),
Deng et al. (2014), Griffin et al. (2018), and Mitton (2022), among many others.

More recent advances in the machine learning literature have revolutionized empirical finance and
ushered in a new way of tackling old questions, with techniques that offer renewed hope of uncovering
new insights and obtaining new answers. In this light, Lahmiri and Bekiros (2019) demonstrate how
machine learning approaches can help predict corporate bankruptcy by accurately estimating the
likelihood of such an event in the future. Li et al. (2021) propose a novel semi-supervised machine
learning approach for measuring corporate culture and its attendant effect in shaping the outlook of
firms. Jiang and Jones (2018) employ machine learning to predict corporate distress for listed firms in
China, while Wang et al. (2022) develop a machine learning algorithm that controls shareholder
characteristics and corporate default risk. Using supervised machine learning, Bianchini and Croce (2022)
develop a replicable technique that identifies cleantech firms and use it to examine the role of
environmental policies in promoting venture capital investments in cleantech companies, while Svanberg
et al. (2022) develop corporate governance performance ratings using machine learning techniques. Kim
et al. (2021) use state-of-the-art machine learning methods for forecasting corporate bond yield spreads.

For corporate risk, Avramov et al. (2021) develop a text-based downside risk measure from corporate
annual reports and assess its ability to forecast future corporate policies, drawing on machine learning
to predict corporate policies from downside risk. Mullainathan and Spiess (2017) conclude that machine
learning has earned its own place in empirical analysis and can be useful in working through challenging
datasets such as those published by corporations and corporate financial institutions. Kim et al. (2021)
employ machine learning with geometric-lag variables to predict corporate defaults. While all these
studies show the benefit of machine learning for formulating an outlook that enhances decision making in
corporate finance, none has examined the use of machine-learning techniques for forecasting the market
cost of equity capital or the risk premium for accessing capital through structured products, namely the
structured product credit spread.

In an influential paper, Gu et al. (2020) document the detailed application of machine learning in
empirical asset pricing, especially as regards correctly determining market risk premia. As motivated
above, obtaining a reliable and non-spurious way of formulating and analyzing the outlook for these
important corporate finance variables is important for corporate decision planning. It can be the added
ingredient that corporate managers require to optimize future decisions as they plan to source the
capital needed to finance future expansions, mergers, or acquisitions.

Thus, in this paper, we embark on a new and original study that develops and uses new hybrid
machine learning techniques to generate robust outlook analysis that is implemented for deriving the
outlook for the cost of equity capital in making capital structure decisions in corporate finance and the
premium to raise capital via structured products in the structured finance space, as represented by the
credit spreads of the structured products. For the equity market, we use the (inverse) market P/E ratio
and the total required market return to proxy the cost of equity capital. Because the market cost of
equity is not directly observed in market data, the rationale behind these proxies is as follows.

First, we look at the entire market portfolio as a single equity whose price, under no arbitrage, can be
determined using the Gordon growth variant of the dividend discount model. Based on this, we show that
the observable P/E ratio inversely proxies the market cost of equity capital, which is our variable of
focus for the corporate financing decision. Second, we assume the capital asset pricing model (CAPM) and
take the single security in the model to be the market portfolio, so that the cost of equity capital in
the model becomes the market cost of equity capital. The beta of the market against itself then reduces
to 1, so the market cost of equity capital equals the total return on the market. The market cost of
equity capital can therefore be proxied directly with the total return on the market, where the market is
taken as the index under consideration: the S&P 500 for the US and the Eurostoxx 600 for the EU. On this
basis, the market cost of equity capital is proxied by two variables: the P/E ratio, which is inversely
related to the market cost of equity capital, and the total required return on the market, which is
directly related to it.

For the structured products market, the credit spreads considered are those of collateralized loan
obligations (CLOs), auto loans (AL), commercial mortgage-backed securities (CMBS), residential
mortgage-backed securities (RMBS), credit card loans (CCL), and student loans (SL). As before, our focus
remains on the US and EU structured products markets.

Using data on the equity and structured products markets, we perform a comparative analysis of
machine learning methods for determining the cost of equity capital and structured product credit
spreads of publicly listed firms in the US and EU markets. The outlook for the cost of equity capital is
essential for guiding future capital structure decisions, while the outlook for structured product credit
spreads is critical for firms utilizing structured financial instruments. Motivated by this, we employ
four deep learning models, namely Long Short-Term Memory (LSTM), Multi-Layer Perceptron (MLP),
Convolutional Neural Networks (CNN), and Hybrid Models (HM), to forecast the cost of equity capital and
the premium to access capital through structured finance products, i.e., credit spreads. The paper
introduces a novel approach that leverages the benefits of hybrid models for proactive corporate
decision-making in capital structure decisions via traditional corporate finance and structured finance
products. Our results identify approaches that improve corporate decision-making, enhance corporate
performance, and provide a more nuanced view of future decisions that rely on the outlook for capital
costs.

We demonstrate large gains to firms that use these machine learning methods to forecast the evolution
of future capital costs and how this would shape their choice of optimal capital structure. We find that
hybrid models are the best performing methods. We attribute their superior predictive gains to synergies
that let them exploit the strengths of their individual components while minimizing their weaknesses,
thus allowing for advantageous predictor interactions that are often missed by individual methods or by
the traditional methods often employed in the literature, despite the inherent volatility in financial
data. In sum, improved forecasts of capital costs via machine learning simplify the burdensome task of
obtaining an outlook for future corporate decisions and are key to investigating the factors that will
shape the financing decisions of firms, thus highlighting the value of machine learning in corporate
finance decisions and in the structured finance markets.

To summarize, in this paper, we focus on the implementation of machine learning methods in corporate
finance and structured finance to derive the outlook for the cost of equity capital and credit spreads,
respectively. We build on previous research that has demonstrated the large economic gains of using
machine learning in finance, particularly in measuring equity risk premia. Our primary contribution is
to extend the application of machine learning to two important problems in finance: capital structure
decisions and structured finance.

For capital structure decisions, we use machine learning to predict the cost of equity capital, which is
a critical input in determining the optimal capital structure of a firm. We compare the predictive
accuracy of machine learning methods to traditional regression-based methods and demonstrate the economic
gains that can be achieved through machine learning forecasts. Our analysis identifies the best
performing methods, which include trees and neural networks, and traces their predictive gains to the
allowance of nonlinear predictor interactions that are missed by other methods. We also identify the
dominant predictive signals, which include variations in relevant variables. Our findings highlight the
value of machine learning in improving the accuracy of cost of equity capital prediction and facilitating
more reliable investigation into the economic mechanisms of capital structure decisions.

For structured finance, we use machine learning to predict credit spreads of structured products. Credit
spreads reflect the risk premium of structured products and are a critical input in pricing and risk
management. We compare the predictive accuracy of machine learning methods to traditional methods
and demonstrate the economic gains that can be achieved through machine learning forecasts. Our
analysis identifies the best performing methods, which include trees and neural networks, and traces
their predictive gains to the allowance of nonlinear predictor interactions often missed by other methods.
We also identify the dominant predictive signals, which include variations in default risk and loss
severity. Our findings highlight the value of machine learning in improving the accuracy of credit spread
prediction and facilitating more reliable pricing and risk management of structured products.

Overall, our paper demonstrates the potential of machine learning in improving accuracy and economic
value of financial predictions in corporate finance and structured finance. Our findings contribute to
the growing interest in machine learning for finance and highlight their potential for innovation.

2. Some Review of Machine Learning Applications in Finance

In their seminal paper, Gu, Kelly, and Xiu (2019) conducted a comparative analysis of stock pricing
using a series of machine learning methods, including linear regression, generalized linear models with
penalization, dimension reduction through principal components analysis (PCA), partial least squares,
regression trees (including boosted trees and random forests), and neural networks. They investigated
30,000 stocks covering a time span of 60 years and found strong evidence in support of machine learning
in finance. Bouchra (2022) introduced a new method for stock market prediction that combines three deep
learning (DL) models. The results show that the Convolutional Neural Network (CNN) performed better
than the other two models. Furthermore, a hybrid built on the three models is better at predicting
stock returns.

Firer (1993) analyzes the relationship between the price-to-earnings ratio (P/E ratio) and the cost of
equity capital. The main idea of the paper is to show that the cost of equity is not solely determined
by the earnings yield per share. Under the assumptions of a going concern and no expansion in company
size, he first derives, using the Gordon model, that the P/E ratio is the reciprocal of the cost of
equity. The P/E ratio is then found to be positively related to the net present value of future growth
opportunities and negatively related to the discount rate and the firm’s variability.

Naoaj and Hosen (2023) provide empirical evidence of a significantly negative relationship between the
cost of equity (COE) and banks’ capital maintenance level. The study also includes control variables.
The COE was implied from the Dividend Discount Model (DDM) and the Gordon Growth Model (GGM).
Financial leverage was used to avoid the drawbacks of risk-weighted capital ratios, combined with two
other measures: Tier 1 capital divided by risk-weighted assets (Tier 1 Capital) and the sum of Tier 1
and Tier 2 capital divided by risk-weighted assets (CAR). Their results indicate a significant negative
relationship, with the COE falling by 4.39% for a 10% increase in capital.

Majanga (2015) conducted an empirical analysis and found a positive direct relationship between a
firm’s dividend policy and its stock price, focusing on 13 firms publicly listed on the Malawi Stock
Exchange (MSE). The study used Pearson correlation analysis and found that the correlation between
dividends per share (DPS) and the stock price (SP) is 0.799, indicating a positive relationship between
stock prices and dividends for firms listed on the MSE. However, stock prices are determined by many
factors, not only dividends. Correlation analysis between two variables is most informative only after
the impact of the other determinants has been removed, since those other drivers of stock prices cannot
be ignored.

In the realm of asset pricing, some studies have utilized machine learning (ML) and deep learning (DL)
models to forecast stock returns and other financial variables. Rapach et al. (2013) employed DL with
LASSO regression to predict global stock returns, while Hutchinson et al. (1994) leveraged neural
networks for derivative price prediction. Khandani et al. (2010) and Butaru et al. (2016) used
regression tree models to forecast credit card default probabilities. Harvey et al. (2016) explored the
efficacy of multiple asset pricing factors using DL. Kelly and Pruitt (2013, 2015) achieved enhanced
prediction outcomes by using DL techniques for factor dimensionality and constructing a hybrid of
multiple prediction models for stock returns. In separate studies, Lu et al. (2022) utilized ML models
for forecasting oil futures volatility, while Gu et al. (2020) and Leippold et al. (2022) examined the
use of ML in Chinese equities.

These studies collectively underscore the distinct advantages of ML and DL models compared to
conventional linear forecasting models when it comes to prediction. Traditional time series forecasting
models often cannot accurately capture non-linear patterns present in financial data. DL models, on
the other hand, do well in this area. They excel at capturing non-linear patterns autonomously, thus
making them better suited for predictions in such scenarios without the need for human input.

3. Models for Cost of Equity Capital and Spreads of Structured Products

This section describes the methods we use to calculate the cost of equity capital for traditional financial
markets and reviews how corporates use structured financial products to raise finance. Under the
assumption of no arbitrage, we find that the cost of equity is proxied by the inverse of the
price-to-earnings ratio (P/E ratio). The cost of equity can also be proxied by the total return of the
whole stock market, based on the capital asset pricing model (CAPM) with the whole market treated as one
portfolio. We then give an overview of the machine learning models, focusing on three deep learning
models from the family of artificial neural networks (ANN): the multilayer perceptron (MLP), long
short-term memory (LSTM), and the convolutional neural network (CNN). In the last part of this section,
we introduce a hybrid neural network architecture built on the previous three models, which performs
better than a single model.

3.1 Cost of Equity Capital in Traditional Financial Market

This section describes how to proxy the cost of equity capital of the whole US and EU markets. We
introduce two methods: the first uses the inverse of the P/E ratio, and the second applies the total
return of the whole market portfolio.

3.1.1 Price to Earnings Ratio (P/E)


Raising capital is indispensable for financing essential business projects, expanding operations, and
pursuing growth opportunities. Predicting the cost of financing is therefore essential for corporations
to anticipate the potential cash outflows of future operations. Firms raise capital through traditional
financing methods such as issuing stocks and debt. The cost of debt is normally a fixed interest rate
(though some debt is floating), is comparatively transparent and observable in the market, and carries
less uncertainty. Unlike the cost of debt, the cost of equity capital is usually difficult to obtain and
cannot be observed directly. Researchers have therefore devoted considerable effort to finding
appropriate ways of representing the cost of equity capital. According to Firer (1993), there is a
negative relationship between a firm’s P/E ratio and its cost of equity, which suggests that the cost of
equity can be represented by the reciprocal of the firm’s P/E ratio.

In general, the cost of equity capital is the return that shareholders require as an incentive to hold
shares. Put differently, it reflects the estimated cost that firms would incur to raise equity capital
from their investors. The cost of equity capital is typically higher than the cost of debt capital, as
equity investors require a higher rate of return in exchange for taking on the risk of exposure to
equities.

One challenge often encountered in the corporate finance literature is that the cost of equity capital is
not directly observable, unlike the cost of debt capital, which is essentially the interest rate on debt.
In this paper, we use the reciprocal of the price-to-earnings ratio (P/E ratio) as a proxy for the cost
of equity capital. As we show below, the P/E ratio is inversely related to the cost of equity capital.
When the P/E ratio is high, market investors are willing to pay a high price for a firm’s earnings. This
likely means that they perceive the firm to be less risky, and a firm perceived to be less risky usually
has a lower cost of capital, in this case equity capital. It has therefore become cheaper to raise equity
capital, an outcome that can help expand a company’s capital base and fund investment projects that
generate future returns and free cash flows. This is the intuition behind using the P/E ratio, which is
more readily observable, to proxy the state of the cost of equity capital. In the propositions below, we
present this intuition more formally and describe some important results on the dynamics of equity
prices.

Proposition I: Suppose publicly listed, dividend-paying firms have a prolonged existence that allows
them to continue as a going concern, paying a per-share dividend that is constant across time. Then the
P/E ratio, the cost of equity capital, and the total risk of the firms are related, assuming no
arbitrage. A higher P/E ratio goes together with lower risk and hence a lower cost of equity capital,
ceteris paribus.

Proof: As a baseline, we will use the standard theory of the capital asset pricing model and the Gordon
Growth Model (GGM) variant of the dividend discount model.

To begin, we will derive the pricing model for a listed stock and argue that this price must equal the
market price of the firm’s equity for no arbitrage to hold.

To do this, we consider an infinitely lived representative agent at period $t$ who has an existing stock
endowment of $S_{t-1}$ shares held from the end of the last period. The stock pays a dividend per share
of $D_t$ at the end of the current period. The agent provides some exogenously determined quantity of
labor which allows her to earn an income of $Y_t$ by the end of the current period. The agent also
purchases additional shares of $S_t$ at the end of the current period from an infinite-life, publicly
listed firm at a price of $P_t$ per share. Finally, the agent derives utility from a consumption good
valued $C_t$, where the utility is $u(C_t)$ and the agent seeks to maximize the present value of its
lifetime utility.
The flow budget constraint of this agent is:
$$C_t + P_t S_t \le Y_t + D_t S_{t-1}$$
The objective function of the agent is:
$$\mathbb{E}_0 \sum_{t=0}^{\infty} \left( \prod_{\tau=0}^{t} \beta_\tau \right) u(C_t)$$
where $\beta_\tau$ is a time-varying discount factor and $\beta_0 = 1$. Unlike a constant discount
factor $\beta$, a time-varying discount factor differs across periods, reflecting the idea that the rate
at which lifetime utility is discounted may change over time due to variations in time preferences
driven by changing economic conditions, idiosyncratic conditions, or broader market realities (e.g.,
changes in interest rates). It captures more nuanced dynamics and evolving preferences than the constant
discount factor, and so allows for a more realistic representation in dynamic scenarios where discount
rates fluctuate.
Given the time-varying discount rate $r_\tau$, the time-varying discount factor $\beta_\tau$ can be
written as:
$$\beta_\tau = \frac{1}{1 + r_\tau}$$
and the set of all discount functions for each time period $t = 0, 1, 2, \ldots, \infty$ can be written as:
$$\left\{ 1,\ \beta_1,\ \beta_1 \beta_2,\ \ldots,\ \prod_{t=1}^{\infty} \beta_t \right\} = \left\{ 1,\ \frac{1}{1 + r_1},\ \frac{1}{(1 + r_1)(1 + r_2)},\ \ldots,\ \prod_{t=1}^{\infty} \frac{1}{1 + r_t} \right\}$$
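
As a minimal sketch of the discount structure above, the following Python snippet builds the cumulative discount factors from a finite sequence of per-period discount rates. The function names and the example rates are illustrative assumptions and are not taken from the paper's data or code.

```python
from itertools import accumulate
from operator import mul


def discount_factors(rates):
    """Per-period factors beta_tau = 1 / (1 + r_tau) for tau = 1, 2, ..."""
    return [1.0 / (1.0 + r) for r in rates]


def cumulative_discount_factors(rates):
    """Return [1, beta_1, beta_1*beta_2, ...], using beta_0 = 1 as in the text."""
    return list(accumulate([1.0] + discount_factors(rates), mul))


# Hypothetical discount rates r_1, r_2, r_3 (illustrative values only).
example_rates = [0.02, 0.03, 0.05]
example_path = cumulative_discount_factors(example_rates)

# With a constant rate the path collapses to the familiar beta**t.
constant_path = cumulative_discount_factors([0.04, 0.04, 0.04])
```

When every rate is the same, the cumulative product reduces to the constant-discount-factor case, which is the benchmark the text contrasts with.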

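The agent’s problem is therefore to solve the following optimization problem:

$$\max_{C_t,\, S_t}\ \mathbb{E}_0 \sum_{t=0}^{\infty} \left( \prod_{\tau=0}^{t} \beta_\tau \right) u(C_t)$$
subject to
$$C_t + P_t S_t \le Y_t + D_t S_{t-1}$$
The first order conditions for optimality are derived as follows:

$$L = \mathbb{E}_0 \sum_{t=0}^{\infty} \left( \prod_{\tau=0}^{t} \beta_\tau \right) \Big( u(C_t) + \lambda_t \left( Y_t + D_t S_{t-1} - C_t - P_t S_t \right) \Big)$$
$$\frac{\partial L}{\partial C_t} = u'(C_t) - \lambda_t = 0$$
$$\frac{\partial L}{\partial S_t} = -\left( \prod_{\tau=0}^{t} \beta_\tau \right) \lambda_t P_t + \mathbb{E}_t \left[ \left( \prod_{\tau=0}^{t+1} \beta_\tau \right) \lambda_{t+1} D_{t+1} \right] = 0$$
which, combining, gives the current price of the stock as
$$P_t = \mathbb{E}_t \left[ \beta_{t+1} \frac{u'(C_{t+1})}{u'(C_t)} D_{t+1} \right] \tag{1}$$
Now, let $P_t^{obs}$ be the observed price of the stock or index of stocks, then by the no arbitrage
assumption, it must be that $P_t = P_t^{obs}$.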
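
To make equation (1) concrete, the sketch below evaluates the pricing expression over a small set of discrete future states. It assumes CRRA utility, so that $u'(C) = C^{-\gamma}$; this functional form, the function names, and the two-state numbers are assumptions made purely for illustration, since the paper leaves $u(\cdot)$ general.

```python
def crra_marginal_utility(consumption, gamma):
    """u'(C) = C**(-gamma); CRRA is an assumed functional form, not the paper's."""
    return consumption ** (-gamma)


def price_from_euler(beta_next, c_now, states, gamma):
    """Equation (1): P_t = E_t[beta_{t+1} * u'(C_{t+1}) / u'(C_t) * D_{t+1}].

    `states` is a list of (probability, C_{t+1}, D_{t+1}) tuples.
    """
    total_prob = sum(prob for prob, _, _ in states)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("state probabilities must sum to 1")
    mu_now = crra_marginal_utility(c_now, gamma)
    return sum(
        prob * beta_next * crra_marginal_utility(c_next, gamma) / mu_now * d_next
        for prob, c_next, d_next in states
    )


# Hypothetical two-state world: dividends are high when consumption is high.
states = [(0.5, 1.05, 2.2), (0.5, 0.95, 1.8)]
price = price_from_euler(beta_next=0.96, c_now=1.0, states=states, gamma=2.0)

# gamma = 0 switches off the marginal-utility term (risk-neutral benchmark).
risk_neutral_price = price_from_euler(0.96, 1.0, states, gamma=0.0)
```

In this illustrative setup the risk-averse price falls below the risk-neutral one, because the dividend pays more in the state where marginal utility is low.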

As is standard in the corporate finance literature, the GGM variant of the dividend discount model (DDM)
is used to value infinite-life listed stocks whose dividends are constant or grow at a constant rate.
The GGM can thus be used to approximate the stock price given in equation (1) above since the pricing
formula follows from similar assumptions. Accordingly, the DDM approximates (1) as
$$P_t = \frac{D_t (1 + g)}{\mathit{COE}_t - g}, \qquad \mathit{COE}_t > g \tag{2}$$
By the no arbitrage condition, it must be true that:
$$P_t = \mathbb{E}_t \left[ \beta_{t+1} \frac{u'(C_{t+1})}{u'(C_t)} D_{t+1} \right] = \frac{D_t (1 + g)}{\mathit{COE}_t - g} = P_t^{obs} \tag{3}$$
where $g$ is the constant growth rate of the dividend and $\mathit{COE}_t$ is the cost of equity.
We will now show that a higher cost of equity gives rise to a lower P/E ratio. Dividing both sides of
(2) by the earnings per share ($E_t$) and solving for $\mathit{COE}_t$ gives
$$\mathit{COE}_t = \frac{D_t (1 + g)}{E_t \,(P_t / E_t)} + g \tag{4}$$
Consequently,
$$\frac{\partial \mathit{COE}_t}{\partial (P_t / E_t)} = -\left( \frac{D_t (1 + g)}{E_t \,(P_t / E_t)^2} \right) < 0, \qquad \frac{\partial (P_t / E_t)}{\partial \mathit{COE}_t} < 0$$
provided the company generates earnings per share of $E_t > 0$, where $D_t / E_t$ is the dividend payout ratio.

This shows that cost of equity is negatively connected with P/E ratio, which informs our decision of
using the more observable market P/E ratio to draw insights on cost of equity capital.
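
The inverse relationship in equation (4) can be checked numerically. The following is a minimal sketch under assumed inputs: the payout ratio, growth rate, and P/E values are hypothetical, and the function names are introduced here only for illustration.

```python
def cost_of_equity_from_pe(pe_ratio, payout_ratio, growth):
    """Equation (4): COE = (D/E) * (1 + g) / (P/E) + g."""
    if pe_ratio <= 0:
        raise ValueError("P/E must be positive (the text requires E_t > 0)")
    return payout_ratio * (1.0 + growth) / pe_ratio + growth


def coe_slope_wrt_pe(pe_ratio, payout_ratio, growth):
    """Analytic derivative of equation (4) with respect to P/E (always negative)."""
    return -payout_ratio * (1.0 + growth) / pe_ratio ** 2


# Hypothetical inputs: 40% payout ratio and 2% dividend growth.
payout, g = 0.4, 0.02
coe_low_pe = cost_of_equity_from_pe(15.0, payout, g)   # lower valuation multiple
coe_high_pe = cost_of_equity_from_pe(25.0, payout, g)  # higher valuation multiple

# Special case: full payout and no growth gives COE = 1 / (P/E).
coe_reciprocal = cost_of_equity_from_pe(20.0, 1.0, 0.0)
```

With full payout and zero growth, equation (4) collapses to the plain earnings yield, which is the sense in which the reciprocal of the P/E ratio serves as the proxy.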

Proposition II: When a company has listed equities and raises equity capital, the share price increases
when market participants exhibit a higher degree of patience, have a higher marginal rate of
substitution, and expect the stock to pay higher future dividends, but it decreases when the cost of
equity is high.

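Proof: The proof requires log-linearization of equation (3) around its steady state, for simplicity
and to express all variables on the same footing.

Taking logs of both sides of (3) gives:

$$\ln P_t = \ln \mathbb{E}_t \left[ \beta_{t+1} \frac{u'(C_{t+1})}{u'(C_t)} D_{t+1} \right] = \ln \frac{D_t (1 + g)}{\mathit{COE}_t - g}$$
$$\ln P_t \approx \mathbb{E}_t \left[ \ln \left( \beta_{t+1} \frac{u'(C_{t+1})}{u'(C_t)} D_{t+1} \right) \right] \approx \ln \frac{D_t (1 + g)}{\mathit{COE}_t - g}$$
$$\ln P_t \approx \mathbb{E}_t \left[ \ln \beta_{t+1} + \ln \left( \frac{u'(C_{t+1})}{u'(C_t)} \right) + \ln D_{t+1} \right] \approx \ln D_t + \ln(1 + g) - \ln(\mathit{COE}_t - g)$$
By Taylor expansion, we have:
$$\ln P^* + \frac{1}{P^*}(P_t - P^*) \approx \mathbb{E}_t \left[ \ln \beta^* + \frac{1}{\beta^*}(\beta_{t+1} - \beta^*) + \ln \left( \frac{u'(C^*)}{u'(C^*)} \right) + \frac{1}{u'(C^*)/u'(C^*)} \left( \frac{u'(C_{t+1})}{u'(C_t)} - \frac{u'(C^*)}{u'(C^*)} \right) + \ln D^* + \frac{1}{D^*}(D_{t+1} - D^*) \right]$$
$$\approx \ln D^* + \frac{1}{D^*}(D_t - D^*) + \ln(1 + g) - \ln(\mathit{COE}^* - g) - \frac{\mathit{COE}^*}{\mathit{COE}^* - g} \cdot \frac{\mathit{COE}_t - \mathit{COE}^*}{\mathit{COE}^*}$$
Subtracting the steady-state relation from each expression leaves the deviations:
$$\frac{1}{P^*}(P_t - P^*) \approx \mathbb{E}_t \left[ \frac{1}{\beta^*}(\beta_{t+1} - \beta^*) + \frac{1}{u'(C^*)/u'(C^*)} \left( \frac{u'(C_{t+1})}{u'(C_t)} - \frac{u'(C^*)}{u'(C^*)} \right) + \frac{1}{D^*}(D_{t+1} - D^*) \right]$$
$$\approx \frac{1}{D^*}(D_t - D^*) - \frac{\mathit{COE}^*}{\mathit{COE}^* - g} \cdot \frac{\mathit{COE}_t - \mathit{COE}^*}{\mathit{COE}^*}$$
$$\tilde{P}_t \approx \mathbb{E}_t \left[ \tilde{\beta}_{t+1} + \widetilde{\frac{u'(C_{t+1})}{u'(C_t)}} + \tilde{D}_{t+1} \right] \approx \tilde{D}_t - \frac{\mathit{COE}^*}{\mathit{COE}^* - g}\, \widetilde{\mathit{COE}}_t \tag{5}$$
where
$$\tilde{P}_t = \frac{1}{P^*}(P_t - P^*); \quad \tilde{\beta}_{t+1} = \frac{1}{\beta^*}(\beta_{t+1} - \beta^*); \quad \widetilde{\frac{u'(C_{t+1})}{u'(C_t)}} = \frac{u'(C_{t+1})}{u'(C_t)} - \frac{u'(C^*)}{u'(C^*)}; \quad \tilde{D}_t = \frac{1}{D^*}(D_t - D^*); \quad \widetilde{\mathit{COE}}_t = \frac{\mathit{COE}_t - \mathit{COE}^*}{\mathit{COE}^*}$$
and $\tilde{D}_{t+1}$ is defined analogously to $\tilde{D}_t$. Variables with a tilde are log-linearized
variables and those with * denote steady states. Thus, $\tilde{P}_t$ is the log-linearized stock price;
$\tilde{\beta}_{t+1}$ is the log-linearized discount factor, which measures the degree of patience of the
household to delay consumption into the future; $\widetilde{u'(C_{t+1})/u'(C_t)}$ is the log-linearized
marginal rate of substitution, which measures the amount of current consumption that must be given up in
order to enjoy future consumption; $\tilde{D}_t$ and $\tilde{D}_{t+1}$ are current and future (next
period) dividend payments; and $\widetilde{\mathit{COE}}_t$ is the log-linearized cost of equity, which
is the cost of raising equity capital for the firm (or the required return investors demand and expect
to get when they take a position in the company’s stock). Equation (5) represents the log-linearized
version of the model in (3), from which we can draw several conclusions to prove the above proposition.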

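Comparative statics show that:

$$\frac{\partial \tilde{P}_t}{\partial \tilde{\beta}_{t+1}} > 0; \quad \frac{\partial \tilde{P}_t}{\partial\, \widetilde{\frac{u'(C_{t+1})}{u'(C_t)}}} > 0; \quad \frac{\partial \tilde{P}_t}{\partial \tilde{D}_t} > 0; \quad \frac{\partial \tilde{P}_t}{\partial \tilde{D}_{t+1}} > 0; \quad \frac{\partial \tilde{P}_t}{\partial \widetilde{\mathit{COE}}_t} = -\frac{\mathit{COE}^*}{\mathit{COE}^* - g} < 0; \quad \mathit{COE}^* > g$$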

From the comparative statics, we see that when current dividend payment rises, current stock price
rises too because payment of dividends is a positive signal for (income) investors. This is favorable
news for the firm. In general, favorable news often improves stock prices. When future dividend is
expected to rise, current stock prices will rise as investors scramble to position themselves in the stock
to reap higher dividend benefits in the next period. When the marginal rate of substitution rises, it
means households have become more willing to give up some current consumption in favor of future
consumption. When some current consumption is given up, money is freed to purchase stock. This
demand raises the current price of the stock. When the household discount factor goes up, this means
that the household has a higher patience to delay consumption into the future, which again means the
household can channel that money into stock purchases as a way of saving, and the current stock price
will grow. Finally, when the current cost of equity is high, the firm incurs a high cost if it
chooses to raise equity capital, because investors currently require a high return from buying
the stock now and holding it into the future. For such a high return to be attainable, investors
must enter the stock at a low price now. Consequently, the current stock price drops, provided
of course that $COE^* > g$. The comparative statics together with this commentary prove
Proposition II.
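
The sign and size of the cost-of-equity effect can be checked numerically. The sketch below is illustrative only: it assumes the steady-state price takes the constant-growth form P = D / (COE - g), which is an assumption made here because it reproduces the coefficient -COE*/(COE* - g) in the comparative statics; the function names and the numbers are hypothetical.

```python
import math

def gordon_price(dividend, coe, g):
    """Constant-growth price P = D / (COE - g); requires COE > g (assumed form)."""
    if coe <= g:
        raise ValueError("the cost of equity must exceed the growth rate")
    return dividend / (coe - g)

def log_elasticity_wrt_coe(dividend, coe, g, h=1e-6):
    """Central-difference estimate of d ln(P) / d ln(COE)."""
    up = math.log(gordon_price(dividend, coe * (1 + h), g))
    down = math.log(gordon_price(dividend, coe * (1 - h), g))
    return (up - down) / (math.log(1 + h) - math.log(1 - h))

coe_star, growth = 0.08, 0.03            # hypothetical steady-state values
numeric = log_elasticity_wrt_coe(2.0, coe_star, growth)
analytic = -coe_star / (coe_star - growth)
print(numeric, analytic)
```

Under these assumed values both numbers are close to -1.6: a 1% rise in the cost of equity lowers the price by roughly 1.6%.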

3.1.2 Total Return of US and Europe Stock Markets


This part introduces another proxy for calculating firms' cost of equity. Under the capital
asset pricing model (CAPM), we take the single security to be the market portfolio. The total
return of the market portfolio is then treated as the cost of equity of the whole stock market.

CAPM estimates the expected return on an asset from its exposure to market (systematic) risk,
under several assumptions: 1) all investors seek to maximize their economic utility, 2) investors
are rational and risk-averse, 3) investors hold a range of investments to achieve full
diversification, 4) investors are price takers and cannot influence asset prices, 5) investors can
lend and borrow at the risk-free rate without limit, 6) there are no transaction fees, taxes, or
other transaction frictions, 7) all assets are infinitely divisible and sufficiently liquid,
8) investors have similar expectations about market conditions, and 9) there is no information
asymmetry, so all information is equally accessible.

The CAPM describes a linear relationship between the required rate of return on an asset or a
single portfolio and the market risk premium, shown in the following formula.

$$E(R_i) = R_f + \beta_{i,m}\,(E(R_m) - R_f)$$

In this formula, $E(R_i)$ denotes the required return on asset i, $E(R_m)$ denotes the
expected return on the market portfolio, $R_f$ is the risk-free rate, and $\beta_{i,m}$ is the
market-risk sensitivity of asset i. Taking the single asset to be the market portfolio (i = m),
we get the following equation:

$$E(R_m) = R_f + \beta_{m,m}\,(E(R_m) - R_f)$$

The whole stock market is treated as a single firm, so the required return on it is the cost of
equity of the whole market. Since $\beta_{m,m} = 1$, the expected rate of return on the market
portfolio is the cost of the whole market's equity capital. This is the second proxy we use for
the cost of equity: the total return of the whole stock market.

Indeed, put $E(R_i) = COE_t$, $R_f = R_t$, and $E(R_m) = R_t^m$. Then the required return on the
broader market index can be a proxy for the cost of equity capital at the broader market level.
From the capital asset pricing model, we know that the cost of equity capital $COE_t$ is related
to the risk-free rate $R_t$, the market risk sensitivity of the security $\beta$, and the required
return on the broader market $R_t^m$ as

$$COE_t = R_t + \beta\,(R_t^m - R_t)$$

If the security is the market, then the market is sensitive to itself. Accordingly, $\beta = 1$,
giving:

$$COE_t = R_t^m$$

This shows that when the security is the entire market, as we are analyzing in the paper, then the
cost of equity capital can also be measured as the required return on the market.
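
The CAPM relation and its market-level special case can be written as a short function. This is a minimal sketch; the function name and the input values are hypothetical and are not taken from the paper's data.

```python
def capm_required_return(risk_free, beta, market_return):
    """CAPM: COE = R + beta * (R_m - R)."""
    return risk_free + beta * (market_return - risk_free)

# With beta = 1 (the security is the market itself) the required
# return collapses to the market return, whatever the risk-free rate.
coe_market = capm_required_return(risk_free=0.03, beta=1.0, market_return=0.08)
coe_defensive = capm_required_return(risk_free=0.03, beta=0.5, market_return=0.08)
print(coe_market, coe_defensive)
```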

3.2 Structured Financial Products and their Credit Spreads

Structured financial products are another financing method available to firms. The cost of
structured financial products is significant, and predicting it is necessary for firms to draw up
appropriate financing plans and business schemes. Structured products are specialized financial
tools accessible to companies that require intricate financing solutions beyond what conventional
financing can provide. Conventional lenders typically do not provide structured financing options.
Structured financial instruments, such as collateralized debt obligations, are not easily
transferable. They are employed to make it easier to raise capital, and researchers have found
them useful in mitigating risk and fostering the growth of financial markets in complex emerging
economies.

In addition to traditional finance, structured products offer firms and financial institutions a
more involved way to raise capital and meet their more complicated financing needs. There are
different products in structured finance markets, such as Collateralized Loan Obligations (CLOs),
Residential Mortgage-Backed Securities (RMBS), Commercial Mortgage-Backed Securities (CMBS),
and structured products based on auto loans (AL), credit card loans (CCL), and student loans (SL).

The structuring process is also known as securitization. Generally, the process involves pooling
together loans and then creating a separate legal entity called a special purpose vehicle (SPV).
The SPV holds the loan portfolio and divides it into different tranches according to the level of
risk and return. The least risky tranche, the senior tranche, often known as the 'AAA' tranche,
has a lower rate of return but provides greater safety for investors' principal. This division
caters to investors' different risk preferences. The SPV then issues bonds backed by the cash
flows generated by the loan portfolio and sells these bonds to investors. By creating structured
products and reselling loans to various investors, firms obtain funds at a lower cost, because the
products are divided by risk level.
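
To illustrate why the senior tranche is the safest, the sketch below allocates pool losses to tranches in reverse order of seniority, so junior tranches absorb losses first. The sequential allocation rule, the tranche sizes, and the function name are assumptions made for illustration; actual deal structures differ.

```python
def allocate_losses(tranches, pool_loss):
    """Allocate a pool loss to tranches, most junior first.

    tranches: list of (name, size) pairs ordered from senior to junior.
    Returns a dict mapping each tranche name to the loss it absorbs.
    """
    remaining = pool_loss
    absorbed = {}
    for name, size in reversed(tranches):
        hit = min(size, remaining)   # a tranche cannot lose more than its size
        absorbed[name] = hit
        remaining -= hit
    return absorbed

# Hypothetical capital structure of a 100-unit pool.
structure = [("AAA", 70), ("BBB", 20), ("Unrated", 10)]
losses = allocate_losses(structure, pool_loss=15)
print(losses)
```

In this hypothetical structure a 15-unit loss wipes out the unrated tranche, takes 5 units from the BBB tranche, and leaves the AAA tranche untouched.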

Figure 3.2.1 [Figure: pooled loans (corporate loans, leveraged loans, commercial real estate loans, etc.) are transferred to a Special Purpose Vehicle (SPV), which issues tranches rated AAA/Aaa, AA/Aa, A/A, BBB/Baa, BB/Ba, B/B, and Unrated.]

For CLOs, firms pool together various loans like corporate loans, leveraged loans, or commercial real
estate loans. The loans are then held by an SPV, which issues bonds backed by the cash flows from
these loans. The bonds are sold to investors in different tranches, appealing to different risk preferences.
The proceeds from bond sales are used to purchase the pooled loans, while the loan cash flows pay
interest and principal on the bonds. A collateral manager oversees the loan portfolio within the SPV.

Similarly, RMBS and CMBS involve pooling residential or commercial mortgage loans, creating an
SPV, issuing bonds backed by the mortgage loan cash flows, and selling these bonds to investors in
multiple tranches. The proceeds from bond sales are used to purchase the mortgages, and the mortgage
cash flows pay interest and principal on the bonds. A collateral manager manages the mortgage
portfolio within the SPV.

For auto loan structured products, firms pool auto loans, create an SPV, issue bonds backed by the
cash flows from the auto loans, and sell these bonds to investors. The same process applies to credit
card loan structured products, where credit card loans are pooled, and bonds are issued based on their
cash flows. Student loan structured products follow a similar pattern, with pooled student loans, bonds
backed by their cash flows, and sales to investors.

Overall, these structured products allow firms to access capital by securitizing loan portfolios. By
diversifying funding sources and potentially obtaining a lower cost of capital, firms can raise funds for
various purposes. However, it is essential to manage risks carefully and conduct due diligence,
considering the potential credit events and defaults associated with these structured products.
Monitoring credit spreads across these asset classes is crucial for assessing risks and opportunities in
the structured products market. Below, we present a model for credit spreads determination.

3.3 Credit Spreads for Securitized Structured Products

In general, a credit spread is the difference between the yield on a debt instrument and the yield
on a safe-haven asset, usually a government-issued security. It is the premium above the yield on
the government-issued safe-haven security that is offered to investors as compensation for taking
on the credit risk associated with the financial instrument.

Because they are associated with the debt market, credit spreads are used to assess borrower or
issuer creditworthiness. When a credit spread widens, the market perceives higher credit risk for
the financial instrument and therefore demands a higher premium above the safe-haven bond yield as
an incentive to purchase the risky security. Conversely, a narrowing credit spread suggests
improving credit quality and reduced perceived risk, resulting in a lower premium above the
government bond yield. Credit spreads can be influenced by various factors, including the issuer's
financial health, changes in credit perception or ratings, broader market conditions, overall
economic outlook, and investor sentiment.
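
As a small worked example of the definition, the sketch below converts two yields quoted in percent into a spread in basis points. The function name and the yields are hypothetical.

```python
def credit_spread_bps(instrument_yield_pct, benchmark_yield_pct):
    """Credit spread in basis points from two yields quoted in percent."""
    return (instrument_yield_pct - benchmark_yield_pct) * 100.0

# Hypothetical yields: a structured product at 5.25% against a
# government benchmark at 3.75%.
spread_before = credit_spread_bps(5.25, 3.75)
# The same benchmark with the product repriced to 6.00%: the spread widens,
# which the text reads as higher perceived credit risk.
spread_after = credit_spread_bps(6.00, 3.75)
print(spread_before, spread_after)
```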

In a similar fashion, the credit spread of securitized or structured products refers to the additional yield
investors demand over a risk-free benchmark for investing in these types of securities. This is the
additional cost that firms using structured products would have to pay in order to get lenders to supply
them funds instead of purchasing safer government bonds. In fact, it measures how much more it would
cost a firm to raise funds in the markets through structured products than it would cost the government
to raise debt revenue by bond issuance. Securitized or structured products are financial instruments
created by pooling together various underlying assets, such as mortgages, auto loans, or credit card
receivables, and then issuing securities backed by these assets. The credit spread for securitized or
structured products reflects the compensation investors require for bearing the credit risk associated
with the underlying assets. Since these products are backed by a pool of assets, their creditworthiness
and risk profile depend on the quality and performance of the underlying assets.

The credit spread of securitized or structured products is typically wider than that of traditional fixed-
income securities, such as government bonds or investment-grade corporate bonds. This is because
securitized or structured products often involve a higher complexity, uncertainty, and potential credit
risk, but they can be a great source for firms to raise funds from investors who are attracted to the
greater returns they offer.

The quality of the underlying assets and the structure of the security are important factors
influencing credit spreads of securitized structured products. Higher (lower) quality assets with
lower (higher) default risk have narrower (wider) credit spreads. Structural features matter as
well: different tranches within a securitized product carry different levels of credit risk and
therefore command different spreads. Investors and market participants analyze the credit spread
of securitized or structured products to assess the associated credit risk and determine whether
the potential return justifies the additional risk. One of the most important pieces of
information contained in the credit spreads of structured products is the likelihood of default.
By accurately forecasting the direction and magnitude of future credit spreads, one is invariably
also determining the chances of default for the structured products. In the exposition that
follows, a model detailing the connection between credit spreads and the likelihood of default is
presented to further motivate our treatment of credit spreads.

3.3.1 Credit Spreads and Default Likelihood of Structured Products

In order to show how credit spreads can reflect default probability, we follow Manning (2004) and
Dionne et al. (2010) and derive a model which links credit spreads to default probability. For
simplicity, suppose the market consists of a default-free (riskless) bond and a defaultable (risky)
securitized financial instrument, both of which pay only at maturity, i.e., both are zero-coupon
instruments. If we assume that returns are continuously compounded, then the return on the
default-free bond can be written as

$$R^f_{0,T} = e^{r_{0,T}\,T}$$

where $r_{0,T}$ is the risk-free yield of the bond at time 0, when the bond is purchased, for a
bond that will be held until period $T$, where $T$ is the maturity of the bond multiplied by the
compounding period, and $R^f_{0,T}$ is the gross return, i.e., 1 plus the return that the bond has
earned over the interval $(0, T)$ by the end of period $T$.

The structured product is a similar type of security, but because it is risky, it must pay a credit
spread or premium above and beyond the yield offered by the riskless bond. Let this credit spread
be $c_{0,T}$, where $c_{0,T}$ is the credit spread when the structured product was first issued and
bought in the market at time 0. Then the risky yield of the structured product at the time of
purchase is $r_{0,T} + c_{0,T}$ and the gross return on this structured product is:

$$R^d_{0,T} = e^{(r_{0,T} + c_{0,T})\,T}$$

where $R^d_{0,T}$ is the gross return on the structured product, i.e., 1 plus the return earned
over the interval $(0, T)$ by the end of period $T$.

If 𝑟0,𝑇 + 𝑐0,𝑇 > 𝑟0,𝑇 as would be expected and there is arbitrage in the market, then investors are
certain to earn this higher return at no risk and receive all their cash flows at maturity with a probability
of 1, so that the likelihood or probability of default is zero. This is not reasonable because any financial
instrument that provides a return above the government bond’s risk-free rate cannot be equally or less
risky than government bond but would normally be riskier than the government bond. Hence if 𝑟0,𝑇 +
𝑐0,𝑇 > 𝑟0,𝑇 , then default probability must exist in the securitized product, and this probability is > 0.

Let $p_T^d$ represent the risk-neutral cumulative probability that the structured product will
default over its lifetime, so that $1 - p_T^d$ is the probability that the structured product will
not default. If the structured product defaults, it will recover a fraction $\theta$ of its gross
return $R^d_{0,T}$ at the end of the T-period investing horizon. So, it will recover
$\theta R^d_{0,T}$ if it defaults, which happens with probability $p_T^d$. However, if it does not
default, it will receive its full gross return $R^d_{0,T}$ with probability $1 - p_T^d$ at the end
of the T-period investing horizon. Thus, the expected gross return from the securitized product is

$$p_T^d\,\theta R^d_{0,T} + (1 - p_T^d)\,R^d_{0,T} = (1 - p_T^d + p_T^d\theta)\,R^d_{0,T}$$

This is the risk-neutral expected return on the structured product because, although the structured
product is risky, the investor is indifferent to this risk and is only interested in the expected return and
does not actively require compensation for this risk. Under the assumption of risk neutrality in
contingent claims framework, the risk-neutral expected return on the structured product must be equal
to the return on the risk-free bond.

$$(1 - p_T^d + p_T^d\theta)\,e^{(r_{0,T} + c_{0,T})\,T} = e^{r_{0,T}\,T} \tag{7}$$

Solving for the credit spread gives

$$c_{0,T} = -\frac{\ln(1 - p_T^d + p_T^d\theta)}{T} \tag{8}$$

which gives the relationship between credit spreads and default probability.
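
Equation (8) and its inverse can be evaluated directly. The sketch below is illustrative: the function names, the recovery fraction of 0.4, and the five-period maturity are assumptions, not values from the paper.

```python
import math

def spread_from_default(p_default, recovery, maturity):
    """Equation (8): c = -ln(1 - p + p * theta) / T."""
    if not (0.0 <= p_default < 1.0 and 0.0 <= recovery < 1.0 and maturity > 0.0):
        raise ValueError("need 0 <= p < 1, 0 <= theta < 1 and T > 0")
    expected_payoff_share = 1.0 - p_default + p_default * recovery
    return -math.log(expected_payoff_share) / maturity

def default_from_spread(spread, recovery, maturity):
    """Invert equation (8): p = (1 - exp(-c * T)) / (1 - theta)."""
    return (1.0 - math.exp(-spread * maturity)) / (1.0 - recovery)

# Hypothetical inputs: recovery fraction 0.4, maturity of 5 periods.
spread_low = spread_from_default(0.05, 0.4, 5.0)
spread_high = spread_from_default(0.20, 0.4, 5.0)
implied_p = default_from_spread(spread_high, 0.4, 5.0)
print(spread_low, spread_high, implied_p)
```

With these inputs the spread rises as the default probability rises, and inverting the higher spread recovers the default probability of 0.20 that produced it.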

Proposition III: When agents are risk neutral and there is no arbitrage, the credit spread of a
structured product increases with the cumulative probability of default, and thus a high credit
spread on a structured product reflects a high probability of default.

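Proof: Let us first log-linearize the expression in (7), as it is easier to work with. Dividing
both sides of (7) by $e^{r_{0,T}T}$ and taking logs, we have

$$\ln(1 - p_T^d + p_T^d\theta) + \ln e^{c_{0,T}T} = 0$$

By a first-order Taylor expansion around the steady state $(p_*^d, c^*)$, we have

$$\ln(1 - p_*^d + p_*^d\theta) + \frac{1}{1 - p_*^d + p_*^d\theta}\left(1 - p_T^d + p_T^d\theta - 1 + p_*^d - p_*^d\theta\right) \approx -\ln e^{Tc^*} - T(c_{0,T} - c^*)$$

Since (7) also holds at the steady state, the leading terms on each side cancel, leaving

$$\frac{(\theta - 1)\,p_*^d}{1 - p_*^d + p_*^d\theta}\cdot\frac{p_T^d - p_*^d}{p_*^d} \approx -Tc^*\,\frac{c_{0,T} - c^*}{c^*}$$

Writing $\tilde{p}_T^d = (p_T^d - p_*^d)/p_*^d$ and $\tilde{c}_{0,T} = (c_{0,T} - c^*)/c^*$, and noting
that $1 - p_*^d + p_*^d\theta = 1 - (1 - \theta)p_*^d$, this becomes

$$\frac{(1 - \theta)\,p_*^d}{T\left(1 - (1 - \theta)\,p_*^d\right)c^*}\;\tilde{p}_T^d \approx \tilde{c}_{0,T} \tag{9}$$

and comparative statics yield

$$\frac{\partial \tilde{c}_{0,T}}{\partial \tilde{p}_T^d} = \frac{(1 - \theta)\,p_*^d}{T\left(1 - (1 - \theta)\,p_*^d\right)c^*} > 0$$

for $0 \le \theta < 1$ and $0 < p_*^d < 1$. This shows that the credit spread in the structured
product market reflects the probability of default, and a higher probability of default means a
higher credit spread. This completes the proof. In the next section, we present the machine
learning models, which constitute the main empirical methodology, and subsequently implement the
models on our data.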

3.4 Overview of Deep Learning Models

3.4.1 Multilayer Perceptron Model (MLP)


The main machine learning prediction models we analyze are artificial neural networks (ANNs).
An ANN is a computational model inspired by the structure and function of the biological brain.
It consists of interconnected nodes called artificial neurons or perceptrons. ANNs are designed to
process and learn from data, enabling them to make predictions or perform tasks such as
classification, regression, and pattern recognition. They are composed of input and output layers,
as well as one or more hidden layers that contain the artificial neurons.

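Figure 3.3.1 ANN illustration [Figure: the input values X_1, ..., X_n and the bias unit I are multiplied by the weight coefficients w_0, w_1, ..., w_n and combined by the net input function Z; the result passes through the activation function a to give the output ŷ.]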

In Figure 3.3.1, the first layer is the input data, and Z represents one neuron. In each neuron,
every input is given a weight and the weighted inputs are added together. Here the neuron has n
inputs X. By adding up the products of weights and input values together with the bias term, we
obtain the following net input function:

$$Z = w_0 I + \sum_{i=1}^{n} w_i X_i$$

The result of the above equation is then passed through an activation function. Activation
functions are either linear (such as the identity function) or nonlinear. Commonly used nonlinear
activation functions include the rectified linear unit (ReLU), which is piecewise linear, the
sigmoid function, the softmax function, and the hyperbolic tangent (tanh) function.
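
The net input function and two of the activation functions named above can be sketched in a few lines. The weights and inputs are hypothetical, and the bias unit I is assumed to equal 1.

```python
import math

def net_input(weights, inputs, bias_weight, bias_unit=1.0):
    """Z = w_0 * I + sum_i w_i * X_i."""
    return bias_weight * bias_unit + sum(w * x for w, x in zip(weights, inputs))

def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)

def sigmoid(z):
    """Logistic sigmoid: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical neuron with three inputs.
z = net_input(weights=[0.5, -0.25, 0.1], inputs=[2.0, 4.0, 10.0], bias_weight=0.2)
print(z, relu(z), sigmoid(z))
```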

Multilayer Perceptron Model (MLP) is a type of artificial neural network (ANN). It is composed of
multiple layers of perceptrons, including an input layer, one or more hidden layers, and an output layer.
Each perceptron in the MLP is a computational unit that applies a weighted sum of its inputs, passes
the result through an activation function, and outputs the result to the next layer. MLPs are primarily
used for supervised learning tasks such as classification and regression.
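
A forward pass through an MLP with one hidden layer can be sketched as follows. This is a minimal illustration with hypothetical weights, a tanh hidden activation, and a linear output (a common choice for regression); it is not the architecture estimated in this paper.

```python
import math

def dense(inputs, weights, biases, activation):
    """One fully connected layer; weights[j] holds the weights of neuron j."""
    return [activation(b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]

def mlp_forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> tanh hidden layer -> linear output layer."""
    hidden = dense(inputs, hidden_w, hidden_b, math.tanh)
    return dense(hidden, output_w, output_b, lambda z: z)

# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output.
prediction = mlp_forward(
    inputs=[1.0, 1.0],
    hidden_w=[[0.5, -0.5], [0.25, 0.25]],
    hidden_b=[0.0, 0.0],
    output_w=[[1.0, 2.0]],
    output_b=[0.1],
)
print(prediction)
```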

Figure 3.3.2 MLP illustration [Figure: input layer, hidden layer, output layer.]

3.4.2 Long Short-Term Memory (LSTM)


Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN). An RNN is a type of
artificial neural network designed to process sequential data, such as time series, text, or
speech. Unlike feedforward neural networks, which process each input independently of previous
inputs, RNNs have feedback connections that allow information to be preserved and propagated
across time steps.

The main difference between an LSTM and a traditional RNN lies in the ability to capture and
retain long-term dependencies in sequential data. LSTMs are specifically designed to mitigate the
vanishing gradient problem of traditional RNNs, which makes them effective for tasks involving
sequential data, such as natural language processing, speech recognition, and time series
analysis.

In an LSTM cell, there are three main components: the input gate, the forget gate, and the output
gate. These gates control the flow of information and determine what information should be stored
or discarded in the memory cell. The forget gate allows the LSTM cell to selectively forget or
retain information from the previous time step. The input gate regulates the flow of new
information into the memory cell. The output gate controls the output of the LSTM cell based on
the current input and the information stored in the memory cell. From Figure 3.3.3, we can see
that the current cell takes into account the current input $X_t$ as well as the previous hidden
state $h_{t-1}$, which carries information from earlier inputs.
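
The gate mechanics can be sketched for a single scalar LSTM cell. This follows the standard LSTM update equations; the parameter layout and the zero-valued example parameters are hypothetical and chosen so that the result is easy to verify by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps "forget", "input", "output", "candidate" to (w_x, w_h, b).
    Returns the new hidden state h_t and the new cell state c_t.
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x_t + w_h * h_prev + b

    f_t = sigmoid(pre_activation("forget"))       # share of old memory to keep
    i_t = sigmoid(pre_activation("input"))        # share of new candidate to store
    o_t = sigmoid(pre_activation("output"))       # share of memory to expose
    c_candidate = math.tanh(pre_activation("candidate"))
    c_t = f_t * c_prev + i_t * c_candidate
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t

# With all parameters at zero every gate equals 0.5 and the candidate is 0.
zero_params = {name: (0.0, 0.0, 0.0)
               for name in ("forget", "input", "output", "candidate")}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.3, c_prev=2.0, params=zero_params)
print(h_t, c_t)
```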

Figure 3.3.3 LSTM illustration

3.4.3 Convolutional Neural Network (CNN)


A Convolutional Neural Network (CNN) is a specialized type of neural network commonly used for
analyzing visual data such as images. CNNs are particularly effective at automatically learning
and extracting features from images by utilizing convolutional layers, pooling layers, and fully
connected layers. The convolutional layers apply filters to input images to extract local
features, and the pooling layers downsample the feature maps to reduce dimensionality. CNNs have
achieved significant success in computer vision tasks such as image classification, object
detection, and image segmentation.
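
The same filter-then-pool idea carries over to one-dimensional sequences such as time series. The sketch below is an assumption-laden illustration: it uses a one-dimensional "valid" convolution (implemented, as in most deep learning libraries, without flipping the kernel) followed by max pooling, with a hypothetical series and kernel.

```python
def conv1d_valid(series, kernel):
    """Slide the kernel over the series without padding or kernel flipping."""
    k = len(kernel)
    return [sum(kernel[j] * series[i + j] for j in range(k))
            for i in range(len(series) - k + 1)]

def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

# The kernel [-1, 1] extracts first differences as a local feature.
features = conv1d_valid([1.0, 2.0, 4.0, 7.0, 11.0], kernel=[-1.0, 1.0])
pooled = max_pool1d(features, size=2)
print(features, pooled)
```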

3.4.4 Hybrid Model (HM)


This paper examines whether machine learning methods perform better in predicting the financing
costs of firms and financial institutions. One of the main tasks is to fit the hybrid model
introduced by Bouchra (2022), which has been shown to perform much better in stock price
forecasting. To construct the hybrid deep learning model, we first estimate three neural networks
separately: MLP, LSTM, and CNN. Then, based on their performance, evaluated by mean squared error
(MSE), a set of weights is obtained:

$$
\begin{cases}
sum = MSE_{MLP} + MSE_{LSTM} + MSE_{CNN} \\[4pt]
w_{MLP} = \dfrac{sum - MSE_{MLP}}{sum} \\[8pt]
w_{LSTM} = \dfrac{sum - MSE_{LSTM}}{sum} \\[8pt]
w_{CNN} = \dfrac{sum - MSE_{CNN}}{sum}
\end{cases}
$$

where $sum$ is the total of the three models' MSEs, $MSE_{MLP}$ is the mean squared error obtained
after fitting the MLP model, $MSE_{LSTM}$ is the mean squared error of the LSTM model, and
$MSE_{CNN}$ is the mean squared error of the CNN model. The larger a model's MSE, the lower its
weight. Therefore, a higher weight is given to a better-performing model and a lower weight to a
poorer-performing model.
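
The weighting rule can be sketched directly from the formulas above. The MSE values are hypothetical. One property worth noting: with three models the weights defined this way sum to 2 rather than 1, so they serve as starting values for the hybrid model and are not a convex combination.

```python
def hybrid_weights(mse_by_model):
    """w_k = (sum - MSE_k) / sum, where sum is the total MSE across models."""
    total = sum(mse_by_model.values())
    return {name: (total - mse) / total for name, mse in mse_by_model.items()}

# Hypothetical validation errors for the three base models.
weights = hybrid_weights({"MLP": 0.04, "LSTM": 0.01, "CNN": 0.05})
print(weights)                  # the model with the smallest MSE gets the largest weight
print(sum(weights.values()))    # equals (number of models - 1)
```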

Then, the estimated results of the three models are used as new inputs to the hybrid model.
Feeding the results generated by the three separate models into the hybrid model gives the new
model:

$$Z = w_{MLP}X_{MLP} + w_{LSTM}X_{LSTM} + w_{CNN}X_{CNN} + b$$

where $X_{MLP}$ is the prediction from the MLP model, $X_{LSTM}$ is the prediction from the LSTM
model, and $X_{CNN}$ is the prediction from the CNN model. The main aim of building this new model
is to obtain a better approximation of the vector Y of cost-of-financing target variables.
Following Bouchra et al. (2022), we use the gradient descent algorithm to solve the optimization
problem:

$$\min_{Z}\; MSE = \frac{1}{n}\sum_{i=1}^{n}(Z_i - Y_i)^2$$
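
The optimization step can be sketched as batch gradient descent on the hybrid weights and bias. Everything below is illustrative: the base-model predictions, the targets, the learning rate, and the number of epochs are hypothetical, and the initial weights are simply example values of the MSE-based form.

```python
def hybrid_prediction(weights, bias, model_preds):
    """Z = w_MLP * X_MLP + w_LSTM * X_LSTM + w_CNN * X_CNN + b."""
    return bias + sum(w * x for w, x in zip(weights, model_preds))

def mean_squared_error(weights, bias, preds, targets):
    return sum((hybrid_prediction(weights, bias, x) - y) ** 2
               for x, y in zip(preds, targets)) / len(targets)

def fit_hybrid(preds, targets, init_weights, lr=0.01, epochs=500):
    """Batch gradient descent on the MSE with respect to the weights and bias."""
    weights, bias, n = list(init_weights), 0.0, len(targets)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(preds, targets):
            err = hybrid_prediction(weights, bias, x) - y
            grad_b += 2.0 * err / n
            for k, x_k in enumerate(x):
                grad_w[k] += 2.0 * err * x_k / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias

# Hypothetical base-model predictions (MLP, LSTM, CNN) and targets.
preds = [(1.1, 0.9, 1.2), (1.9, 2.1, 2.2), (3.2, 2.9, 2.7), (3.8, 4.1, 4.3)]
targets = [1.0, 2.0, 3.0, 4.0]
start = [0.6, 0.9, 0.5]
initial_mse = mean_squared_error(start, 0.0, preds, targets)
fitted_w, fitted_b = fit_hybrid(preds, targets, start)
final_mse = mean_squared_error(fitted_w, fitted_b, preds, targets)
print(initial_mse, final_mse)
```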

4 Implementation and Empirical Analysis

4.1 Data set description

The financial markets in the United States and Europe have a long history of development; today
they are very mature and have a significant impact on global economic and financial activity. Our
goal is to examine the performance of machine learning models in predicting corporate financing
costs in these two mature financial markets. Firstly, for the traditional financial market, we
obtained the observable weekly P/E ratios of the US and EU stock markets from Bloomberg. We
obtained S&P 500 data to represent the total return of the US market portfolio and Eurostoxx 600
data to represent the total return of the EU market portfolio. Secondly, for the structured
finance market, we obtained weekly data on both US and EU structured financial products from the
JP Morgan Asset Backed Securities Data Bank.

Our data contains four structured financial products for the EU market: European collateralized
loan obligations (EU CLO), European auto-loan-backed securities (EU Auto), European residential
mortgage-backed securities (EU RMBS), and European commercial mortgage-backed securities
(EU CMBS). Our data also contains four structured financial products for the US market: US
collateralized loan obligations (US CLO), US auto-loan-backed securities (US Auto), US commercial
mortgage-backed securities (US CMBS), and Federal Family Education Loan Program securities
(FFELP). We describe the specific details of our dataset in Appendix A. Our dataset covers the
period from 2006 to 2023, a total of 17 years and around 900 weeks.

Table 4.1.1 : Statistics summary of data set


Dataset1: Input Variables
Variables USIPI US10YY US2YY US3MY EUIPI EU10YY EU3MY DERI VIX OIL
count 279 4755 4957 5807 190 638 350 4335 5250 6001
mean 96.85 2.85 1.64 1.59 86.20 6.64 2.17 103.46 18.69 64.16
std 4.94 1.12 1.52 1.80 14.91 3.85 2.22 11.27 9.26 30.84
min 84.60 0.52 0.00 0.00 57.19 -0.09 -0.58 85.47 0.00 0.00
25% 92.67 1.98 0.38 0.09 71.18 3.83 0.03 93.14 13.22 41.00
50% 98.20 2.69 0.96 0.96 88.94 7.01 2.12 100.84 16.70 62.11
75% 101.06 3.75 2.59 2.40 98.82 9.70 4.05 113.99 22.10 84.73
max 104.12 5.26 5.29 6.42 109.19 15.44 7.58 128.32 82.69 143.95
Correlation Matrix
USIPI 1.00 -0.34 0.37 -0.02 0.45 0.14 -0.45 -0.54 -0.41 -0.13
US10YY -0.34 1.00 0.81 0.47 -0.83 -0.70 0.51 -0.61 -0.07 -0.49
US2YY 0.37 0.81 1.00 0.15 0.19 -0.66 -0.56 -0.29 -0.27 -0.42
US3MY -0.02 0.47 0.15 1.00 0.77 0.92 0.54 -0.67 0.34 -0.29
EUIPI 0.45 -0.83 0.19 0.77 1.00 0.79 -0.70 -0.76 -0.58 0.22
EU10YY 0.14 -0.70 -0.66 0.92 0.79 1.00 0.20 0.87 0.48 0.38
EU3MY -0.45 0.51 -0.56 0.54 -0.70 0.20 1.00 0.65 0.57 0.09
DERI -0.54 -0.61 -0.29 -0.67 -0.76 0.87 0.65 1.00 -0.27 0.41
VIX -0.41 -0.07 -0.27 0.34 -0.58 0.48 0.57 -0.27 1.00 -0.05
OIL -0.13 -0.49 -0.42 -0.29 0.22 0.38 0.09 0.41 -0.05 1.00
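Dataset2: Target Variables
Groups: Cost of equity (USPE, EUPE); Total Return (USTR, EUTR); Structured Financial Products (EU RMBS, EU CMBS, EU Auto, EU CLO, US CLO, US Auto, US CMBS, FFELP AAA)
Variables  USPE  EUPE  USTR  EUTR  EU RMBS  EU CMBS  EU Auto  EU CLO  US CLO  US Auto  US CMBS  FFELP AAA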
count 1683 1098 1840 1162 904 902 831 902 904 904 600 904
mean 19.99 19.38 0.00 0.00 245.12 256.81 189.84 168.12 146.86 43.98 101.58 57.61
std 4.54 12.92 0.02 0.03 193.25 238.51 206.66 127.24 119.82 76.38 28.21 49.88
min 10.02 7.10 -0.18 -0.23 10.50 16.00 24.00 23.00 23.00 0.00 59.31 -1.00
25% 16.74 13.76 -0.01 -0.01 126.75 106.00 85.00 97.50 80.00 17.00 82.85 33.00
50% 19.02 16.42 0.00 0.00 178.20 175.00 112.00 130.00 121.50 25.00 95.08 44.00
75% 22.03 18.82 0.01 0.02 335.40 293.75 200.00 210.00 175.00 35.25 115.19 70.00
max 34.51 130.57 0.12 0.15 882.60 1000.00 1150.00 800.00 800.00 550.00 260.34 350.00
Correlation Matrix
USPE 1.00 0.11 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
EUPE 0.11 1.00 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
USTR NaN NaN 1.00 0.00 NaN NaN NaN NaN NaN NaN NaN NaN
EUTR NaN NaN 0.00 1.00 NaN NaN NaN NaN NaN NaN NaN NaN
EU RMBS NaN NaN NaN NaN 1.00 0.63 0.30 0.56 0.46 0.20 0.42 0.16
EU CMBS NaN NaN NaN NaN 0.63 1.00 0.78 0.92 0.84 0.42 0.63 0.32
EU Auto NaN NaN NaN NaN 0.30 0.78 1.00 0.90 0.93 0.70 0.66 0.59
EU CLO NaN NaN NaN NaN 0.56 0.92 0.90 1.00 0.95 0.60 0.71 0.55
US CLO NaN NaN NaN NaN 0.46 0.84 0.93 0.95 1.00 0.69 0.72 0.64
US Auto NaN NaN NaN NaN 0.20 0.42 0.70 0.60 0.69 1.00 0.63 0.87
US CMBS NaN NaN NaN NaN 0.42 0.63 0.66 0.71 0.72 0.63 1.00 0.54
FFELP AAA NaN NaN NaN NaN 0.16 0.32 0.59 0.55 0.64 0.87 0.54 1.00

4.1.1 Stationarity
Stationarity refers to the property of a time series where the statistical properties, such as the mean and
variance, remain constant over time. In a stationary series, the patterns in the data do not change
systematically over time. The Augmented Dickey-Fuller (ADF) test and the Kwiatkowski-Phillips-
Schmidt-Shin (KPSS) test will be used as statistical tests to assess the stationarity of a time series.

Augmented Dickey-Fuller (ADF) Test: The ADF test estimates an autoregressive model and checks
whether the coefficient of the lagged dependent variable is significantly different from 1. If the
coefficient is significantly less than 1, it provides evidence in favor of stationarity. The test also
considers higher-order lags to capture the serial correlation in the data. The ADF test is used to
determine whether a time series has a unit root, which indicates the presence of a stochastic trend. The
null hypothesis of the ADF test is that the time series is non-stationary (i.e., it has a unit root). If the
test rejects the null hypothesis, it suggests that the time series is stationary.

Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test: The KPSS test is used to assess the null hypothesis
of (trend) stationarity in a time series. It decomposes the series into a deterministic component,
a random walk, and a stationary error, and checks whether the variance of the random-walk
component is significantly different from zero. If that variance is significantly greater than
zero, it indicates the presence of non-stationarity. Unlike the ADF test, the KPSS test takes
stationarity as the null hypothesis. If the test rejects the null hypothesis, it suggests the
presence of a unit root or non-stationarity.

Table 4.1.2: ADF and KPSS Test Results


Stationarity Test Results
                                                   ADF Test                                 KPSS Test
Group                           Variable    Statistic  P-value  5% Critical Value   Statistic  P-value  5% Critical Value
Cost of Equity                  USPE          -2.27     0.18         -2.86             2.71      0.01         0.46
Cost of Equity                  EUPE          -3.56     0.01         -2.86             1.05      0.01         0.46
Total Return                    USTR         -18.26     0.00         -2.86             0.12      0.10         0.46
Total Return                    EUTR          -6.94     0.00         -2.86             0.03      0.10         0.46
Structured Financial Products   EU RMBS       -1.84     0.36         -2.86             0.95      0.01         0.46
Structured Financial Products   EU CMBS       -2.36     0.15         -2.86             1.17      0.01         0.46
Structured Financial Products   EU Auto       -3.66     0.00         -2.86             1.17      0.01         0.46
Structured Financial Products   EU CLO        -1.98     0.30         -2.86             0.60      0.02         0.46
Structured Financial Products   US CLO        -2.74     0.07         -2.86             0.51      0.04         0.46
Structured Financial Products   US Auto       -3.50     0.01         -2.86             0.37      0.09         0.46
Structured Financial Products   US CMBS       -4.24     0.00         -2.86             0.47      0.05         0.46
Structured Financial Products   FFELP AAA     -3.05     0.03         -2.86             0.25      0.10         0.46

From the results of the ADF tests, it can be concluded that some of our target variables are
stationary and some are non-stationary. Firstly, for the traditional financing cost data, the
United States P/E ratio has a large p-value of 0.18, so we fail to reject the null hypothesis and
treat the series as non-stationary. The European P/E ratio, the United States total return, and
the European total return all have small p-values, which indicates that the null should be
rejected and that these time series are stationary. Secondly, for the structured finance data, the
ADF test results show that four variables (European Auto, United States Auto, United States
Commercial MBS, and the Federal Family Education Loan Program) have very small p-values, so the
null hypothesis is rejected and they show stationarity. The other variables are non-stationary,
with higher p-values indicating that the null hypothesis cannot be rejected. The KPSS test leads
to the same conclusion for most series; the exceptions in Table 4.1.2 are the European P/E ratio
and European Auto, for which the KPSS test rejects stationarity at the 5% level, and United States
Commercial MBS, whose KPSS statistic (0.47) lies marginally above the critical value (0.46).
Additional evidence on stationarity can be drawn from the visualizations of our target variables
in Appendix B.
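
Because the two tests have opposite null hypotheses, their results can be combined into a single verdict. The helper below is a hypothetical sketch of such a decision rule at the 5% level; it works on reported p-values only, and the example inputs are the rounded p-values from Table 4.1.2.

```python
def stationarity_verdict(adf_pvalue, kpss_pvalue, alpha=0.05):
    """Combine ADF (null: unit root) and KPSS (null: stationarity) p-values."""
    adf_rejects_unit_root = adf_pvalue < alpha
    kpss_rejects_stationarity = kpss_pvalue < alpha
    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non-stationary"
    if adf_rejects_unit_root and kpss_rejects_stationarity:
        return "conflicting"
    return "inconclusive"

# (ADF p-value, KPSS p-value) as reported in Table 4.1.2.
table_pvalues = {"USPE": (0.18, 0.01), "EUPE": (0.01, 0.01), "USTR": (0.00, 0.10)}
verdicts = {name: stationarity_verdict(*p) for name, p in table_pvalues.items()}
print(verdicts)
```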

Figure 4.1.1 US P/E Ratio Visualization Figure 4.1.2 EU P/E Ratio Visualization

4.1.2 Serial correlation

Serial correlation describes a relationship between the historical data and future data. If there exists
autocorrelation, it is possible to use historical data to predict future data. The Autocorrelation Function
(ACF) measures the correlation between a given time series and its lagged values, from which we
identify the relationship between the current observation and previous observations at different time
lags. The Partial Autocorrelation Function (PACF) measures the correlation between a time series and
its lagged values after removing the correlation contribution of the intermediate lags. It helps to identify
the direct relationship between current observation and previous observations at specific time lags.

By computing the Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF),
shown as graphs in the Appendix, we can determine whether our time series data are autocorrelated.
The ACF plot shows the correlation coefficient at each weekly lag, with the lag on the x-axis and the
correlation coefficient on the y-axis. A significant autocorrelation at a particular lag suggests that
the observation at that lag has an impact on the current observation. The PACF plot shows the partial
correlation coefficient at each lag, with the lag on the x-axis and the partial correlation
coefficient on the y-axis. A significant partial autocorrelation at a particular lag suggests a direct
relationship between the current observation and the observation at that weekly lag. In our graphs,
the light blue region marks the 5% significance band; points that lie beyond this region are
statistically significant. On the whole, the coefficients for the first 50 to 100 lags of all series
are positive and significantly high, which indicates that current values tend to move in the same
direction as past values.
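The ACF calculation described above can be sketched in a few lines. The example below computes the sample autocorrelation of a simulated persistent series and flags lags that fall outside an approximate 5% band of plus or minus 1.96 divided by the square root of the sample size. The simulated AR(1) data, the band formula, and the function name are assumptions made for illustration; the paper's plots may use a different band, and the PACF is not computed here.

```python
import math
import random


def sample_acf(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Hypothetical persistent weekly series (AR(1), coefficient 0.9); not the paper's data.
random.seed(0)
series = [0.0]
for _ in range(499):
    series.append(0.9 * series[-1] + random.gauss(0.0, 1.0))

acf = sample_acf(series, max_lag=10)
band = 1.96 / math.sqrt(len(series))  # approximate 5% band under white noise
significant_lags = [k for k in range(1, 11) if abs(acf[k]) > band]
print(significant_lags)
```

A slowly decaying, positive ACF of this kind is what one would expect from the highly persistent series described in the text.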

Figure 4.1.3 EU ACF Figure 4.1.4 EU PACF

4.1.3 Heteroskedasticity
Heteroskedasticity refers to a situation where the variance of a variable of interest is not constant
across the range of values of another variable, so that the spread or dispersion of the data points is
not consistent. It violates one of the assumptions of linear regression needed for proper inference,
namely that the variance of the residuals is constant across all levels of the independent variables.
When heteroskedasticity is present, it can affect the statistical inference and interpretation of models.

The presence of heteroskedasticity can lead to several issues. First, the standard errors of the
regression coefficients may be biased, which affects the accuracy of hypothesis tests and confidence
intervals. Second, it can reduce the efficiency of the estimators, making them less precise. Finally,
it can lead to incorrect inferences about the significance of independent variables and can affect the
overall fit and predictive power of the model. Thus, it is important to check for heteroskedasticity.
There are several diagnostic tests for detecting it. In our analysis, we employ the Breusch-Pagan test:
we first estimate a regression model for each target variable and then apply the Breusch-Pagan test to
the regression residuals.

Table 4.1.3: BP-Test Results

Breusch-Pagan test results

Group                           Variable   Test statistic  p-value    F-value   F-test p-value
Cost of Equity                  USPE       118.76          5.75E-24   27.17     1.20E-25
                                EUPE       62.98           6.86E-13   16.83     2.58E-13
Total Return                    USTR       227.17          4.31E-47   60.33     3.21E-54
                                EUTR       244.16          1.18E-51   83.23     4.02E-60
Structured Financial Products   EU RMBS    126.97          1.73E-26   36.73     1.82E-28
                                EU CMBS    327.19          1.47E-69   127.49    3.17E-86
                                EU Auto    315.79          4.25E-67   126.39    3.23E-84
                                EU CLO     231.56          6.09E-49   77.45     1.89E-56
                                US CLO     291.82          5.74E-61   85.61     1.29E-73
                                US Auto    92.86           1.69E-18   20.56     1.79E-19
                                US CMBS    20.85           8.66E-04   4.28      7.90E-04
                                FFELP AAA  66.36           5.86E-13   14.23     2.03E-13

For every series, the extremely small p-values provide strong evidence against the null hypothesis of
no heteroskedasticity, so heteroskedasticity exists in our datasets. When heteroskedasticity is
detected, there are various approaches to address it, including transforming the variables, using
weighted least squares regression, employing heteroskedasticity-consistent standard errors, or
utilizing other prediction models as supplements to statistical methods. Because heteroskedasticity is
detected in our datasets, we consider it inappropriate to use autoregression models to predict the
future path.
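To make the mechanics of the test concrete, the sketch below implements the Breusch-Pagan LM statistic in its n times R-squared form for the simplest case of a single regressor, using only the standard library. The simulated data, the function names, and the restriction to one regressor are assumptions for illustration; the paper's regressions use several features, for which libraries such as statsmodels offer a multi-regressor implementation.

```python
import math
import random


def ols_fit(x, y):
    """OLS of y on a constant and one regressor; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(x, y):
    """R-squared of the one-regressor OLS fit of y on x."""
    a, b = ols_fit(x, y)
    my = sum(y) / len(y)
    ss_res = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan_single(x, y):
    """LM statistic (n * R^2 of squared residuals on x) and its chi-square(1) p-value."""
    a, b = ols_fit(x, y)
    sq_resid = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    lm = max(len(x) * r_squared(x, sq_resid), 0.0)
    p_value = math.erfc(math.sqrt(lm / 2.0))  # survival function of chi-square with 1 df
    return lm, p_value


# Hypothetical data whose error spread grows with x (heteroskedastic by construction).
random.seed(1)
x = [random.uniform(1.0, 5.0) for _ in range(500)]
y = [1.0 + 2.0 * xi + xi * random.gauss(0.0, 1.0) for xi in x]

lm_stat, p_value = breusch_pagan_single(x, y)
print(f"LM = {lm_stat:.2f}, p-value = {p_value:.3g}")
```

A small p-value leads to rejection of the null of constant residual variance, which is the reading applied to Table 4.1.3.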

4.2 Model Building and Estimation

In building the machine learning models, we select features according to the correlation matrix
above. First, for the EU target variables, we drop the US features US10YY, US2YY, US3MY, and USIPI,
and keep OIL, VIX, DERI, EUIPI, EU3MY, and EU10YY. Second, according to the correlation matrix, we
drop features that are highly correlated with other features in order to avoid problems such as
multicollinearity and overfitting. Thus, in the end, we use EU10YY, EUIPI, VIX, and OIL as the
features for the EU target variables. The same method is used to select features for the US target
variables. First, we drop the EU features EU10YY, EU3MY, and EUIPI, and keep OIL, VIX, DERI, USIPI,
US3MY, US2YY, and US10YY. Second, according to the correlation matrix, we drop features that are
highly correlated with others, again to avoid overfitting and multicollinearity. Thus, US10YY, USIPI,
DERI, VIX, and OIL are used as features for the US target variables. With the features for the US and
EU target variables selected, we proceed to the analysis, evaluation, and comparison of the models.
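The second selection step, dropping features that are highly correlated with others, can be sketched as a greedy filter. The 0.8 threshold, the greedy ordering, the helper names, and the toy numbers below are all assumptions for illustration; the paper does not state the cutoff it used, and the values are not the paper's data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(features, threshold):
    """Keep a feature only if its |correlation| with every kept feature is below threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Made-up toy values: the 2-year yield moves almost one-for-one with the 10-year yield.
features = {
    "US10YY": [1.0, 1.2, 1.5, 1.9, 2.4, 2.2, 2.8, 3.1],
    "US2YY": [0.5, 0.6, 0.8, 1.0, 1.2, 1.1, 1.4, 1.6],
    "VIX": [20.0, 15.0, 30.0, 12.0, 25.0, 18.0, 14.0, 28.0],
}

selected = drop_correlated(features, threshold=0.8)
print(selected)
```

Because the filter is greedy, the order in which features are listed determines which member of a correlated pair survives; here the 10-year yield is listed first and is therefore kept.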

Figure 4.1.5 Features Selection for Target Variables:US Figure 4.1.6 Features Selection for Target Variables: EU

Electronic copy available at: https://ssrn.com/abstract=4464555


4.3 Model Evaluation and Comparison

For specificity, the model evaluation employed for the comparative analysis in this paper is based on
the mean squared error (MSE). The MSE has several advantages as an evaluation metric; the two most
important are that it provides a quadratic loss function and that it amply measures uncertainty in
forecasting.
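As a brief illustration of the quadratic loss, the sketch below computes the MSE for two sets of hypothetical forecasts and shows that doubling every forecast error quadruples the MSE. The function name and the numbers are illustrative assumptions, not values from the paper.

```python
def mse(actual, predicted):
    """Mean squared error between observed values and forecasts."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


actual = [1.0, 2.0, 3.0]
small_miss = [1.1, 2.1, 3.1]  # every forecast off by 0.1
large_miss = [1.2, 2.2, 3.2]  # every forecast off by 0.2

ratio = mse(actual, large_miss) / mse(actual, small_miss)
print(mse(actual, small_miss), mse(actual, large_miss), ratio)
```

The quadratic penalty means that a few large misses weigh more heavily than many small ones, which matters when comparing models on volatile spread series.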

According to the MSE obtained after estimating each of the three machine learning models, the CNN
model has the lowest MSE for ten of the twelve target variables; the exceptions are EU RMBS and EU
CMBS, where the MLP is marginally lower. Comparing the other two models, the MLP has an MSE lower than
or equal to that of the LSTM for every target variable. Overall, the Convolutional Neural Network
(CNN) model performs better than the other two deep learning models in predicting financial
indicators in the US and the Eurozone.

Table 4.3.1: Performance evaluation

Performance evaluation of different ML models (MSE matrix):

Model  USPE      EUPE      USTR      EUTR      EU RMBS   EU CMBS   EU Auto   EU CLO    US CLO    US Auto   US CMBS   FFELP AAA
MLP    0.0101    0.0119    0.0063    0.0043    0.0141    0.0057    0.0056    0.0051    0.0055    0.0057    0.0107    0.0099
LSTM   0.0274    0.021     0.0066    0.0043    0.0404    0.0624    0.0159    0.0303    0.0186    0.0297    0.0176    0.0231
CNN    0.0061    0.009     0.0057    0.0037    0.0143    0.0071    0.0043    0.0049    0.0016    0.0031    0.0065    0.0046
HM     0.0010    0.000     0.0009    0.0001    0.0048    0.0000    0.0012    0.0005    0.0000    0.0002    0.0037    0.0003

Finally, taking the hybrid model into consideration, we find that the hybrid model (HM) has the lowest
MSE for every target variable; its errors are very small and smaller than those of the other three
models. Thus, the hybrid model has the best performance among all four deep learning models. In
addition, the graphs below reveal that, although the strength of performance is heterogeneous across
models, the models tend to perform better when used to analyze the cost of equity capital than the
spreads of structured products, a pattern that is clearest for the LSTM and the hybrid model.
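The ranking of the models can be reproduced directly from the entries of Table 4.3.1. The sketch below finds, for each target, the model with the lowest MSE overall and the best of the three standalone models; the variable and function names are illustrative.

```python
TARGETS = ["USPE", "EUPE", "USTR", "EUTR", "EU RMBS", "EU CMBS",
           "EU Auto", "EU CLO", "US CLO", "US Auto", "US CMBS", "FFELP AAA"]

# MSE values copied from Table 4.3.1, in the order of TARGETS
MSE = {
    "MLP": [0.0101, 0.0119, 0.0063, 0.0043, 0.0141, 0.0057,
            0.0056, 0.0051, 0.0055, 0.0057, 0.0107, 0.0099],
    "LSTM": [0.0274, 0.021, 0.0066, 0.0043, 0.0404, 0.0624,
             0.0159, 0.0303, 0.0186, 0.0297, 0.0176, 0.0231],
    "CNN": [0.0061, 0.009, 0.0057, 0.0037, 0.0143, 0.0071,
            0.0043, 0.0049, 0.0016, 0.0031, 0.0065, 0.0046],
    "HM": [0.0010, 0.000, 0.0009, 0.0001, 0.0048, 0.0000,
           0.0012, 0.0005, 0.0000, 0.0002, 0.0037, 0.0003],
}


def best_model(models, index):
    """Model with the lowest MSE for the target at position `index`."""
    return min(models, key=lambda m: MSE[m][index])


best_overall = {t: best_model(list(MSE), i) for i, t in enumerate(TARGETS)}
best_standalone = {t: best_model(["MLP", "LSTM", "CNN"], i)
                   for i, t in enumerate(TARGETS)}
cnn_wins = sum(1 for m in best_standalone.values() if m == "CNN")

print(best_overall)
print(best_standalone, cnn_wins)
```

Running the comparison on the tabulated values shows the hybrid model ahead for every target, with the CNN leading the standalone models for ten targets and the MLP for EU RMBS and EU CMBS.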

Figure 4.1.6 MSE for Cost of Equity Capital: MSE of 4 machine-learning models (MLP, LSTM, CNN, HM)
applied to the cost of equity capital (USPE, EUPE, USTR, EUTR).

Figure 4.1.7 MSE for Spreads of Structured Products: MSE of 4 machine-learning models (MLP, LSTM, CNN,
HM) applied to the spreads of structured products (EU RMBS, EU CMBS, EU Auto, EU CLO, US CLO, US Auto,
US CMBS, FFELP AAA).



4.4 Comparison of the HM Output Versus Original Data

The output from the HM is compared with the original data for the predicted variables. Our prediction
is based on the input features of the model: the industrial production index, the Brent oil price, the
VIX index, the dollar exchange rate index, and Treasury yields of different maturities (3 months, 2
years, and 10 years). In the graphs below, the predictions are presented and compared with the
original data. The comparison between the model outputs and the observed values is made on test sets
selected from the dataset covering the past 17 years.

Taken together, these predictions provide the predicted evolution of the cost of equity capital and
the credit spreads of structured finance products, as shown in the graphs below. The predicted
evolution of the variables is then compared with the actual observed values. From the graphs of the
Hybrid Model (HM) predictions against the observed values, we conclude that the HM tracks the observed
values closely and performs very well in predicting both the cost of equity capital and the credit
spreads of structured financial products. The future cost of equity capital is predicted with high
accuracy by the HM. Therefore, once the values of the input features are known, we can feed them into
the HM to predict the cost of equity capital several periods ahead.

Figure 4.1.8: Comparison of HM Predictions and Real Data



The main goal of this paper is to examine the performance of deep learning models in predicting the
cost of capital and credit spreads in corporate finance. We aim to provide a new and better method for
forecasting financing costs. These costs are influenced by economic conditions, which are represented
by the features selected in this paper (EUIPI, USIPI, Treasury yields, oil prices). That is also the
reason we chose these features as inputs for model training. Since optimal capital structure involves
the ideal mix of financing sources to maximize firm value and minimize capital costs, the major
implication is that, whenever the cost of equity capital is predicted to be higher while the credit
spreads of structured finance products are predicted to be lower, capital structure decisions can be
planned in advance to include a smaller equity component because of its higher financing cost.
Achieving an optimal capital structure involves finding the right balance that lowers costs and suits
the specific characteristics of a firm, its industry, and its financial goals. Our models have proven
helpful in predicting the cost of equity capital, which is a necessary step in analyzing capital
structure and making forward-looking deductions about the optimal capital structure.
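The implication stated above can be written as a simple rule of thumb. The sketch below is purely illustrative: the function name, the inputs, and the returned labels are hypothetical, and it encodes only the one case described in the text, using the P/E ratio as an inverse proxy for the cost of equity capital as noted in the Appendix.

```python
def financing_signal(pe_now, pe_pred, spread_now, spread_pred):
    """Qualitative signal from predicted P/E (inverse proxy) and predicted credit spread."""
    equity_cost_rising = pe_pred < pe_now      # a lower P/E signals a higher cost of equity
    spread_falling = spread_pred < spread_now  # structured financing becomes cheaper
    if equity_cost_rising and spread_falling:
        return "plan a lower equity component"
    return "no signal from this rule"


# Hypothetical forecasts: P/E expected to fall from 20 to 18, spread from 150 to 120.
print(financing_signal(20.0, 18.0, 150.0, 120.0))
```

In practice such a signal would be one input among many, since the optimal mix also depends on firm and industry characteristics.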

5. Conclusion

In this paper, we conduct a comparative analysis of three deep learning models and a new hybrid model,
focusing on the US and European markets. We compare machine learning methods based solely on their
ability to resolve a long-standing issue in empirical corporate and structured finance: measuring,
well in advance, the future cost of equity capital and credit spreads of structured products. We show
that significant gains emerge from using machine learning models to design forecasts, especially
forecasts based on hybrid machine learning models. In some cases, these hybrid models more than double
the performance of any one leading machine-learning model in the literature. After identifying the
best-performing methods (hybrid models), we trace their superiority to combining the best
characteristics of each machine learning model while shrinking their disadvantages to nearly nothing,
allowing for gains that are often missed by any single method. Among the single machine-learning
models, we find that the CNN model outperforms the MLP and LSTM models in forecasting the indicators
for the cost of equity capital and credit spreads of securitized products, both in the US and in the
EU. Furthermore, the three models, together with the hybrid model, all point to a common and dominant
set of predictive signals, which include 1) expectations of future economic performance, as contained
in the long-term interest rate; 2) prevailing economic performance, as measured by the industrial
production index; 3) the degree of global risk sentiment, measured by the VIX volatility index; 4) oil
prices; and 5) the broad dollar exchange rate.

Our main contributions to the corporate finance literature are not only empirical but also theoretical.
First, we derive and document well-known asset pricing models and provide arguments that enable us to
use these models to develop proxies for the cost of equity capital and credit spreads of structured
products. Second, we derive new versions of these models by log-linearizing the model variables around
their non-zero steady-state values. This enables us to perform comparative analysis more easily and to
study the dynamics of the models in ways alien to existing studies in the literature, resulting in a
list of fact-based propositions that we prove using the linearized representation of the models.

Third, and more importantly, for the US and EU markets, the two most liquid markets in the world, we
demonstrate significant gains from using hybrid machine learning models to forecast the cost of equity
capital and the credit spreads of structured products. Generating insights on the future cost of
equity capital and credit spreads of structured products with a hybrid machine-learning strategy gives
rise to a superior performance that almost always exceeds the performance of the CNN, often documented
in the literature as the best standalone machine learning model, both in the US and especially in the
EU, and across all proxies for the cost of equity capital and credit spreads of structured products.

In summary, based on the mean squared error calculated from the estimation of the three machine
learning models, the CNN model has the lowest error for most target variables. Of the other two
models, the MLP model exhibits mean squared errors no higher than those of the LSTM model across all
target variables. When predicting financial indicators for the US and the Eurozone, the Convolutional
Neural Network model outperforms the other two deep learning models. The hybrid model achieves the
lowest mean squared errors, which are considerably smaller than those of the other three models. The
proven success of the machine learning models offers firms and investors a forecasting option with
higher accuracy and convenience. Machine learning methods can thus supplement traditional regression
models in future forecasts. In conclusion, our findings help confirm the growing significance of
machine learning in the prediction of US and EU corporate and structured finance.



References

Adrian Cheung, W.K., Wei, K.C.J. (2006): “Insider ownership and corporate performance: Evidence
from the adjustment cost approach”, Journal of Corporate Finance, 12, 906-925.

Avramov, D., Li, M., Wang, H. (2021): “Predicting corporate policies using downside risk: A
machine learning approach”, Journal of Empirical Finance, 63, 1-26.

Bianchini, R., Croce, A. (2022): “The Role of Environmental Policies in Promoting Venture Capital
Investments in Cleantech Companies”, Review of Corporate Finance, 2, 587-616.

Bouchra, E R, Stock Market Prediction Using Deep Learning Model (March 10, 2023). Available at
SSRN: https://ssrn.com/abstract=

Clifford W. Smith, Ross L. Watts (1992): “The investment opportunity set and corporate financing,
dividend, and compensation policies”, Journal of Financial Economics, 32, 263-292.

Deng, A. (2013): “Understanding Spurious Regression in Financial Economics”, Journal of Financial


Econometrics, 12, 122-150.

Dionne, G., Gauthier, G., Hammami, K., Maurice, M. (2010): “Default risk in corporate yield
spreads.”, Financial Management, 39, 707–731.

Di Wang Icon, Zhanchi Wu, Bangzhu Zhu (2022): “Controlling Shareholder Characteristics and
Corporate Debt Default Risk: Evidence Based on Machine Learning”, Emerging Markets Finance
and Trade, 58, 3324-3339.

Ferson, W.E., Sarkissian, S., Simin, T.T. (2003): “Spurious Regressions in Financial Economics?”,
The Journal of Finance, 58, 1393-1413.

Gaver, J.J. and Gaver, K.M. (1993) Additional Evidence on the Association between the Investment
Opportunity Set and Corporate Financing, Dividend, and Compensation Policies. Journal of
Accounting and Economics, 16, 125-160.

Griffin, P.A., Hong, H.A., Ryou, J.W. (2018): “Corporate innovative efficiency: Evidence of effects on
credit ratings”, Journal of Corporate Finance, 51, 352-373.

Gu, S., Kelly, B., Xiu, D. (2020): “Empirical Asset Pricing via Machine Learning”, The Review of
Financial Studies, 33, 2223-2273.

Huang, J., Kisgen, D.J. (2013): “Gender and corporate finance: Are male executives overconfident
relative to female executives?”, Journal of Financial Economics, 108, 822-839.

Hyeong jun Kima, Hoon Chob and Doojin Ryuc (2022): “Predicting corporate defaults using
machine learning with geometric-lag variables”, Investment Analysts Journal, 50, 3.

Electronic copy available at: https://ssrn.com/abstract=4464555


Ioannidis, C., Peel, D.A., Peel, M.J. (2003): “The Time Series Properties of Financial Ratios: Lev
Revisited”, Journal of Business Finance and Accounting, 30, 699-714.

Jan Svanberg, Tohid Ardeshiri, Isak Samsten, Peter Öhman, Presha E. Neidermeyer, Tarek Rana,
Natalia Semenova, Mats Danielson (2022): “Corporate governance performance ratings with
machine learning”, Intelligent Systems in Accounting, Finance and Management, 29, 50-68.

Jeffrey L. Coles, Michael L. Lemmon, J. Felix Meschke (2012): “Structural models and endogeneity
in corporate finance: The link between managerial ownership and corporate performance”, Journal of
Financial Economics, 103, 149-168.

Jong-Min Kim, Dong H. Kim, Hojin Jung (2021): “Applications of machine learning for corporate
bond yield spread forecasting”, The North American Journal of Economics and Finance, 58, 101540.

Klapper, L.F., Love, I. (2004): “Corporate governance, investor protection, and performance in
emerging markets”, Journal of Corporate Finance, 10, 703-728.

Salim Lahmiri & Stelios Bekiros (2019) Can machine learning approaches predict corporate
bankruptcy? Evidence from a qualitative experimental design, Quantitative Finance, 19:9

Li, Kai, Mai, Feng, Shen, Rui, Yan, Xinyan (2021): “Measuring Corporate Culture Using Machine
Learning”, The Review of Financial Studies, 34, 3265–3315.

Manning, M. J. (2004): “Exploring the relationship between credit spreads and default probabilities”,
Working Paper No. 225, Bank of England.

Mitton, Todd (2021): “Methodological Variation in Empirical Corporate Finance”, The Review of
Financial Studies, 35, 527-575.

Sendhil Mullainathan and Jann Spiess (2017): “Machine Learning: An Applied Econometric
Approach”, Journal of Economic Perspectives, 31, 87-106.

Yi Jiang, Stewart Jones (2018): “Corporate distress prediction in China: a machine learning
approach”, Accounting & Finance, 58, 1063-1109.

Electronic copy available at: https://ssrn.com/abstract=4464555


Appendix

A. Input Features and Target Variables Information:


Table 1: Input Information

Input name                      Short for  Information
Brent Oil Price                 OIL        Crude Oil Prices: Brent - Europe, Dollars per Barrel, Daily,
                                           Not Seasonally Adjusted
VIX Index                       VIX        CBOE Volatility Index: VIX, Index, Daily, Not Seasonally
                                           Adjusted
Dollar Exchange Rate Index      DERI       Nominal Broad U.S. Dollar Index, Index Jan 2006=100, Daily,
                                           Not Seasonally Adjusted
US Industrial Production Index  USIPI      Industrial Production: Total Index, Index 2017=100, Monthly,
                                           Seasonally Adjusted
US 10-year Treasury Yield       US10YY     Market Yield on U.S. Treasury Securities at 10-Year Constant
                                           Maturity, Quoted on an Investment Basis, Percent, Daily, Not
                                           Seasonally Adjusted
US 2-year Treasury Yield        US2YY      Market Yield on U.S. Treasury Securities at 2-Year Constant
                                           Maturity, Quoted on an Investment Basis, Percent, Daily, Not
                                           Seasonally Adjusted
US 3-month Treasury Yield       US3MY      Market Yield on U.S. Treasury Securities at 3-Month Constant
                                           Maturity, Quoted on an Investment Basis, Percent, Daily, Not
                                           Seasonally Adjusted
EU Industrial Production Index  EUIPI      Production: Industry: Total Industry: Total Industry Excluding
                                           Construction for the Euro Area (19 Countries), Index 2015=100,
                                           Quarterly, Seasonally Adjusted
EU 3-month Treasury Yield       EU3MY      Interest Rates: 3-Month or 90-Day Rates and Yields: Interbank
                                           Rates: Total for the Euro Area (19 Countries), Percent,
                                           Monthly, Not Seasonally Adjusted
EU 10-year Treasury Yield       EU10YY     Interest Rates: Long-Term Government Bond Yields: 10-Year: Main
                                           (Including Benchmark) for the Euro Area (19 Countries),
                                           Percent, Monthly, Not Seasonally Adjusted

Information of Inputs (Features): There are 10 features in total in our datasets, shown in Table 1
above.

Table 2: Target Information

Group                          Target (short name)     Information
Cost of Equity                 US PE Ratio (USPE)      The P/E ratio of the whole US market, used as a proxy
                                                       of US Cost of Equity Capital. There is a negative
                                                       relationship between them. Weekly.
                               EU PE Ratio (EUPE)      The P/E ratio of the whole European market, used as a
                                                       proxy of EU Cost of Equity Capital. There is a
                                                       negative relationship between them. Weekly.
Total Returns                  US Total Return (USTR)  US Total Return. Weekly.
                               EU Total Return (EUTR)  EU Total Return. Weekly.
Structured Financial Products  EU RMBS                 European residential mortgage-backed security. Weekly.
                               EU CMBS                 European commercial mortgage-backed security. Weekly.
                               EU Auto                 European auto-loan-backed security. Weekly.
                               EU CLO                  European collateralized loan obligations. Weekly.
                               US CLO                  US collateralized loan obligations. Weekly.
                               US Auto                 US auto-loan-backed security. Weekly.
                               US CMBS                 US commercial mortgage-backed security. Weekly.
                               FFELP AAA               Federal Family Education Loan Program. Credit grading:
                                                       triple A, the best credit grade. Weekly.

Information of Target variables: Our research questions fall into 3 parts, and there are 12 target
variables in total, used to investigate the performance of machine learning models in prediction for
the US and European financial markets.

B. Data Visualizations



C. Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF):

Appendix B
Appendix – Details of Proofs
Recursive proof of equation (1).

$$\max_{C_t,\, S_t} \; \mathbb{E}_0 \sum_{t=0}^{\infty} \left( \prod_{\tau=0}^{t} \beta_\tau \right) u(C_t)$$

subject to

$$C_t + P_t S_t \le Y_t + D_t S_{t-1}$$

The value function of the Bellman equation can be written as

$$V(S_{t-1}) = \max_{C_t,\, S_t} \; u(C_t) + \beta_{t+1} \mathbb{E}_t V(S_t)$$

where $\beta_{t+1}$ is the discount factor between period $t$ and $t+1$, and $\mathbb{E}_t V(S_t)$ is
the value expected to be derived at the end of period $t+1$ from the quantity of stocks $S_t$
purchased at the very end of period $t$. Also

$$\mathbb{E}_t V(S_t) = \int_{S} V(S_t)\, h(S_t)\, dS_t$$

where $h(S_t)$ is the distribution of $S_t$. The expectation operator can be thought of as integrating
over the distribution of $S_t$, taken as a continuous random variable.

Derivative of the value function with respect to $S_t$ yields:

$$P_t\, u'(C_t) = \beta_{t+1} \mathbb{E}_t V'(S_t)$$

Derivative of the value function with respect to $S_{t-1}$ yields:

$$\mathbb{E}_t V'(S_t) = \mathbb{E}_t\, u'(C_{t+1}) D_{t+1}$$

Combining both derivatives, we get:

$$P_t = \mathbb{E}_t \left[ \beta_{t+1} \frac{u'(C_{t+1})}{u'(C_t)} D_{t+1} \right]$$
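As a numerical illustration of the pricing equation, the sketch below evaluates the expectation over two equally likely states under the assumption of power (CRRA) utility, for which $u'(C) = C^{-\gamma}$. The utility specification, the parameter values, and the function name are assumptions introduced for this example and are not taken from the paper.

```python
def euler_price(beta, gamma, c_now, states):
    """Price from P_t = E_t[beta * u'(C_{t+1}) / u'(C_t) * D_{t+1}] with u'(C) = C**(-gamma).

    `states` is a list of (probability, next-period consumption, next-period payoff).
    """
    return sum(
        prob * beta * (c_next / c_now) ** (-gamma) * payoff
        for prob, c_next, payoff in states
    )


# Hypothetical two-state economy: the payoff is high when consumption is high.
states = [(0.5, 1.1, 1.2), (0.5, 0.9, 0.8)]

price_risk_neutral = euler_price(beta=0.95, gamma=0.0, c_now=1.0, states=states)
price_risk_averse = euler_price(beta=0.95, gamma=2.0, c_now=1.0, states=states)
print(price_risk_neutral, price_risk_averse)
```

With $\gamma = 0$ the price reduces to the discounted expected payoff; with $\gamma > 0$ a payoff that is high in good states is worth less, so the price falls.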
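Log-linearization procedure employed.
In expressing all terms in the same unit of percentage terms, log-linearization has been employed.
Suppose we have a function that has many arguments, i.e., $g(\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n)$.
The procedure used to log-linearize is as follows.

Let $g: \mathbb{R}^n \to \mathbb{R}$ be defined as $Y = g(\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n)$,
where $(\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n) \in \mathbb{R}^n$ and
$Y = g(\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n) \in \mathbb{R}$.

Then $\ln Y = \ln g(\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n)$. Let $Y^*$ be the steady state of $Y$
and $(\sigma_1^*, \sigma_2^*, \sigma_3^*, \ldots, \sigma_n^*)$ be the steady state of
$(\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n)$.

Taylor expansion of both sides of $\ln Y = \ln g(\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n)$ around
their steady state gives:

$$\ln Y^* + \frac{1}{Y^*}(Y - Y^*) = \ln g(\sigma_1^*, \sigma_2^*, \sigma_3^*, \ldots, \sigma_n^*) + \frac{1}{g(\sigma_1^*, \sigma_2^*, \sigma_3^*, \ldots, \sigma_n^*)} \sum_{i=1}^{n} \sigma_i^* \, g_{\sigma_i}(\sigma_1^*, \sigma_2^*, \sigma_3^*, \ldots, \sigma_n^*) \, \frac{1}{\sigma_i^*} (\sigma_i - \sigma_i^*)$$

where $g_{\sigma_i}(\sigma_1^*, \sigma_2^*, \sigma_3^*, \ldots, \sigma_n^*) = \frac{\partial g(\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n)}{\partial \sigma_i}$
evaluated at the steady state $(\sigma_1^*, \sigma_2^*, \sigma_3^*, \ldots, \sigma_n^*)$.

Recalling that $\ln Y = \ln g(\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n)$, so that
$\ln Y^* = \ln g(\sigma_1^*, \sigma_2^*, \sigma_3^*, \ldots, \sigma_n^*)$, we have

$$\frac{1}{Y^*}(Y - Y^*) = \frac{1}{g(\sigma_1^*, \sigma_2^*, \sigma_3^*, \ldots, \sigma_n^*)} \sum_{i=1}^{n} \sigma_i^* \, g_{\sigma_i}(\sigma_1^*, \sigma_2^*, \sigma_3^*, \ldots, \sigma_n^*) \, \frac{1}{\sigma_i^*} (\sigma_i - \sigma_i^*)$$

If we denote deviations (log or percentage deviations) from steady state values by a tilde, then:

$$\tilde{Y} = \frac{1}{g(\sigma_1^*, \sigma_2^*, \sigma_3^*, \ldots, \sigma_n^*)} \sum_{i=1}^{n} g_{\sigma_i}(\sigma_1^*, \sigma_2^*, \sigma_3^*, \ldots, \sigma_n^*) \, \sigma_i^* \, \tilde{\sigma}_i$$

where

$$\frac{1}{Y^*}(Y - Y^*) = \tilde{Y} \quad \text{and} \quad \frac{1}{\sigma_i^*}(\sigma_i - \sigma_i^*) = \tilde{\sigma}_i$$

This is the procedure used to compute the log deviations from steady state presented in this paper.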
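The procedure can be checked numerically on a simple example. The sketch below applies the final formula to a hypothetical function $g(\sigma_1, \sigma_2) = \sigma_1^{0.3} \sigma_2^{0.7}$, for which the formula reduces to $\tilde{Y} = 0.3\,\tilde{\sigma}_1 + 0.7\,\tilde{\sigma}_2$, and compares the approximation with the exact percentage deviation. The choice of function, the steady-state values, the use of central differences for the partial derivatives, and the helper name are all assumptions made for illustration.

```python
def loglinear_deviation(g, steady, actual, step=1e-6):
    """Approximate deviation of Y = g(...) from steady state using the log-linear formula."""
    g_star = g(*steady)
    total = 0.0
    for i, (s_star, s) in enumerate(zip(steady, actual)):
        up, down = list(steady), list(steady)
        up[i] = s_star + step
        down[i] = s_star - step
        partial = (g(*up) - g(*down)) / (2.0 * step)  # g_sigma_i at the steady state
        sigma_tilde = (s - s_star) / s_star           # percentage deviation of argument i
        total += partial * s_star * sigma_tilde
    return total / g_star


def g(s1, s2):
    # Hypothetical example function, not one of the paper's model equations.
    return s1 ** 0.3 * s2 ** 0.7


steady = (4.0, 2.0)
actual = (4.08, 2.01)  # 2% and 0.5% above steady state

approx = loglinear_deviation(g, steady, actual)
exact = (g(*actual) - g(*steady)) / g(*steady)
print(approx, exact)
```

For small deviations the two numbers are close, which is the sense in which the log-linear representation is a first-order approximation around the steady state.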
