The scientific method refers to a set of generally accepted principles that should guide
any scientific project. Wikipedia defines the scientific method as follows:
The scientific method is an empirical method of acquiring knowledge that has
characterized the development of science since at least the 17th century. It involves
careful observation, applying rigorous skepticism about what is observed, given that
cognitive assumptions can distort how one interprets the observation. It involves
formulating hypotheses, via induction, based on such observations; experimental and
measurement-based testing of deductions drawn from the hypotheses; and refinement
(or elimination) of the hypotheses based on the experimental findings. These are
principles of the scientific method, as distinguished from a definitive series of steps
applicable to all scientific enterprises.
Given this definition, normative finance, as discussed in Chapter 3, is in stark
contrast to the scientific method. Normative financial theories mostly rely on
assumptions and axioms in combination with deduction as the major analytical
method to arrive at their central results.
• Expected utility theory (EUT) assumes that agents have the same utility function
no matter what state of the world unfolds and that they maximize expected utility
under conditions of uncertainty.
• Mean-variance portfolio (MVP) theory describes how investors should invest
under conditions of uncertainty assuming that only the expected return and the
expected volatility of a portfolio over one period count.
• The capital asset pricing model (CAPM) assumes that only the nondiversifiable
market risk explains the expected return and the expected volatility of a stock
over one period.
• Arbitrage pricing theory (APT) assumes that a number of identifiable risk factors
explain the expected return and the expected volatility of a stock over time;
admittedly, compared to the other theories, the formulation of APT is rather
broad and allows for wide-ranging interpretations.
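To make the MVP assumption concrete, the following minimal sketch computes the two quantities that the theory treats as the only relevant ones: the expected return and the expected volatility of a portfolio over one period. The two-asset inputs (`mu`, `cov`, `w`) and the helper `portfolio_stats` are hypothetical and chosen purely for illustration; they do not come from the text.

```python
import numpy as np

# Hypothetical inputs for two assets (illustrative numbers only)
mu = np.array([0.08, 0.12])        # expected one-period returns
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])     # covariance matrix of the returns
w = np.array([0.6, 0.4])           # portfolio weights, summing to 1


def portfolio_stats(weights, exp_returns, cov_matrix):
    """Return (expected return, expected volatility) of a portfolio."""
    exp_ret = weights @ exp_returns                    # weighted average return
    exp_vol = np.sqrt(weights @ cov_matrix @ weights)  # sqrt of portfolio variance
    return exp_ret, exp_vol


port_ret, port_vol = portfolio_stats(w, mu, cov)
print(f'expected return:     {port_ret:.4f}')
print(f'expected volatility: {port_vol:.4f}')
```

Under the MVP assumption, two portfolios with the same pair of values would be considered equivalent by an investor, regardless of any other property of their return distributions.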
What characterizes the aforementioned normative financial theories is that they were
originally derived under certain assumptions and axioms using “pen and paper” only,
without any recourse to real-world data or observations. From a historical point of
view, many of these theories were rigorously tested against real-world data only long
after their publication dates. This can be explained primarily by better data
availability and increased computational capabilities over time. After all, data and
computation are the main ingredients for the application of statistical methods in practice.
The discipline at the intersection of mathematics, statistics, and finance that applies
such methods to financial market data is typically called financial econometrics, the
topic of the next section.
$$f: \mathbb{R} \to \mathbb{R}_+, \quad x \mapsto 2 + \frac{1}{2} x$$
Given multiple values of $x_i$, $i = 1, 2, \dots, n$, one can derive function values for $f$ by
applying the above definition:
$$y_i = f(x_i), \quad i = 1, 2, \dots, n$$
The following Python code illustrates this based on a simple numerical example:
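In [1]: import numpy as np
In [2]: def f(x):
            return 2 + 1 / 2 * x  # f as defined above
In [3]: x = np.arange(-4, 5)  # integers from -4 to 4
        x
Out[3]: array([-4, -3, -2, -1,  0,  1,  2,  3,  4])
In [4]: y = f(x)  # function values for all x at once
        y
Out[4]: array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. ])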
Second is the approach taken in statistical learning. Whereas in the preceding
example, the function comes first and then the data is derived, this sequence is reversed in
statistical learning. Here, the data is generally given and a functional relationship is
to be found. In this context, $x$ is often called the independent variable and $y$ the
dependent variable. Consequently, consider the following data:
$$(x_i, y_i), \quad i = 1, 2, \dots, n$$
In the case of simple OLS regression, as described previously, the optimal solutions
are known in closed form and are as follows:
$$\beta = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)}$$
$$\alpha = \bar{y} - \beta \bar{x}$$
Here, Cov stands for the covariance, Var for the variance, and $\bar{x}$, $\bar{y}$ for the mean
values of $x$, $y$.
Returning to the preceding numerical example, these insights can be used to derive
optimal parameters $\alpha$, $\beta$ and, in this particular case, to recover the original definition
of $f(x)$:
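In [6]: y
Out[6]: array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. ])
In [7]: beta = np.cov(x, y, ddof=0)[0, 1] / x.var()  # Cov(x, y) / Var(x)
In [8]: alpha = y.mean() - beta * x.mean()  # mean of y minus beta times mean of x
In [9]: y_ = alpha + beta * x  # values implied by the fitted parameters
In [10]: np.allclose(y_, y)
Out[10]: True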
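As a cross-check, the closed-form solution can be compared with a generic least-squares routine. The sketch below is self-contained and rests on a few assumptions: the helper name `ols_closed_form`, the noise level, and the random seed are hypothetical choices for illustration, and `np.polyfit` is used only as an independent reference, not as the method of the text.

```python
import numpy as np


def ols_closed_form(x, y):
    """Return (alpha, beta) for the simple regression y = alpha + beta * x."""
    beta = np.cov(x, y, ddof=0)[0, 1] / np.var(x)  # Cov(x, y) / Var(x)
    alpha = np.mean(y) - beta * np.mean(x)         # mean(y) - beta * mean(x)
    return alpha, beta


x = np.arange(-4, 5)
y = 2 + 1 / 2 * x  # same data as in the numerical example

# Exact data: the closed form recovers intercept 2 and slope 1/2
alpha, beta = ols_closed_form(x, y)
print(np.allclose([alpha, beta], [2.0, 0.5]))

# Hypothetical noisy observations (seed and noise level chosen arbitrarily)
rng = np.random.default_rng(100)
y_noisy = y + rng.normal(0, 0.25, len(x))
alpha_n, beta_n = ols_closed_form(x, y_noisy)

# np.polyfit returns coefficients with the highest degree first
beta_p, alpha_p = np.polyfit(x, y_noisy, deg=1)
print(np.allclose([alpha_n, beta_n], [alpha_p, beta_p]))
```

With noisy data, the estimated parameters no longer match the original function exactly, but the closed form and the generic routine should still agree with each other.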
Data Availability
Financial econometrics is driven by statistical methods, such as regression, and the
availability of financial data. From the 1950s to the 1990s, and even into the early
2000s, theoretical and empirical financial research relied mainly on data sets that
were small by today's standards and consisted mostly of end-of-day (EOD) data.
Data availability has changed dramatically over the last decade or so, with more
and more types of financial and other data available in ever-increasing granularity,
quantity, and velocity.
In [11]: import eikon as ek
         import configparser
In [12]: c = configparser.ConfigParser()
         c.read('../aiif.cfg')  # configuration file with the credentials
         ek.set_app_key(c['eikon']['app_id'])
In [16]: data.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 254 entries, 2019-07-01 to 2020-07-01
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   AAPL.O  254 non-null    float64
 1   MSFT.O  254 non-null    float64
 2   NFLX.O  254 non-null    float64
 3   AMZN.O  254 non-null    float64
dtypes: float64(4)
memory usage: 9.9 KB
In [17]: data.tail()
Out[17]: CLOSE       AAPL.O  MSFT.O  NFLX.O   AMZN.O
         Date
         2020-06-25  364.84  200.34  465.91  2754.58
         2020-06-26  353.63  196.33  443.40  2692.87
         2020-06-29  361.78  198.44  447.24  2680.38
         2020-06-30  364.80  203.51  455.04  2758.82
         2020-07-01  364.11  204.70  485.64  2878.70
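For readers without access to the data API, the five rows shown above are enough to illustrate one typical first step with EOD data. The following sketch rebuilds those closing prices as a `DataFrame` by hand and derives daily log returns; computing log returns is an assumption about a plausible next step, not something the text itself prescribes at this point.

```python
import numpy as np
import pandas as pd

# Closing prices copied from the data.tail() output above
data = pd.DataFrame(
    {'AAPL.O': [364.84, 353.63, 361.78, 364.80, 364.11],
     'MSFT.O': [200.34, 196.33, 198.44, 203.51, 204.70],
     'NFLX.O': [465.91, 443.40, 447.24, 455.04, 485.64],
     'AMZN.O': [2754.58, 2692.87, 2680.38, 2758.82, 2878.70]},
    index=pd.DatetimeIndex(['2020-06-25', '2020-06-26', '2020-06-29',
                            '2020-06-30', '2020-07-01'], name='Date'))

# Daily log returns; the first row has no predecessor and is dropped
rets = np.log(data / data.shift(1)).dropna()
print(rets.round(4))
```

Five closing prices per symbol yield four daily returns per symbol, which shows how little information a short EOD sample carries compared with data of higher granularity.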