Professional Documents
Culture Documents
Statistics For Quants 1696336178
Statistics For Quants 1696336178
Contents
1 Introduction to Statistics 2
2 Descriptive Statistics 3
2.1 Mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 Median . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.3 Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.4 Variance and Standard Deviation . . . . . . . . . . . . . . . . . . . . . . 3
3 Probability Distributions 4
3.1 Normal Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
3.2 Binomial Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
4 Statistical Inference 4
4.1 Hypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
4.1.1 Type I Error (False Positive) . . . . . . . . . . . . . . . . . . . . . 5
4.1.2 Type II Error (False Negative) . . . . . . . . . . . . . . . . . . . . 5
4.2 Confidence Intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
4.3 Sample Size and Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
5 Regression Analysis 6
5.1 Simple Linear Regression . . . . . . . . . . . . . . . . . . . . . . . . . . 6
5.2 Multiple Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
5.3 Coefficient of Determination (R2 ) . . . . . . . . . . . . . . . . . . . . . . 6
5.4 Applications in Finance . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1
8 Time Series Models 8
8.1 Autoregressive Model (AR) . . . . . . . . . . . . . . . . . . . . . . . . . 8
8.2 Moving Average Model (MA) . . . . . . . . . . . . . . . . . . . . . . . . 8
8.3 Autoregressive Moving Average Model (ARMA) . . . . . . . . . . . . . . 8
8.4 Autoregressive Integrated Moving Average (ARIMA) . . . . . . . . . . . 9
8.5 Seasonal ARIMA (SARIMA) . . . . . . . . . . . . . . . . . . . . . . . . . 9
8.6 Seasonal ARIMA with Exogenous Variables (SARIMAX) . . . . . . . . . . 9
8.7 Generalized Autoregressive Conditional Heteroskedasticity (GARCH) . . 9
8.8 Variants of GARCH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
8.8.1 EGARCH (Exponential GARCH) . . . . . . . . . . . . . . . . . . . 9
8.8.2 IGARCH (Integrated GARCH) . . . . . . . . . . . . . . . . . . . . 9
8.8.3 TGARCH (Threshold GARCH) . . . . . . . . . . . . . . . . . . . . 10
8.8.4 PGARCH (Power GARCH) . . . . . . . . . . . . . . . . . . . . . . 10
8.8.5 VGARCH (Variance GARCH) . . . . . . . . . . . . . . . . . . . . . 10
8.8.6 DCC GARCH (Dynamic Conditional Correlation GARCH) . . . . . 10
9 Tests in Statistics 10
9.1 t-test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
9.2 ANOVA (Analysis of Variance) . . . . . . . . . . . . . . . . . . . . . . . . 11
9.3 Chi-Squared Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
9.4 Shapiro-Wilk Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
9.5 Durbin-Watson Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
9.6 Mann-Whitney U Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
9.7 Variance Inflation Factor (VIF) . . . . . . . . . . . . . . . . . . . . . . . . 12
9.8 Breusch-Pagan Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
9.9 Ljung-Box Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
9.10 Augmented Dickey-Fuller (ADF) Test . . . . . . . . . . . . . . . . . . . . 12
9.11 Engle’s ARCH Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
9.12 Bollerslev’s Multivariate GARCH . . . . . . . . . . . . . . . . . . . . . . . 12
1 Introduction to Statistics
Statistics is like a magnifying glass for data. It lets us zoom in on important details, spot
patterns, and make educated guesses about future events. In the financial world, this
is akin to analyzing market trends, evaluating risks, and predicting stock performance.
Every day, countless financial transactions take place around the world, from buy-
ing a cup of coffee to purchasing billion-dollar companies. Each of these transactions
generates data. Over time, this massive collection of data can tell us stories about eco-
nomic health, consumer behavior, and market dynamics. But to unlock these stories,
we need the tools of statistics.
Example: Consider the housing market in a bustling city. Over a year, the prices of
homes might fluctuate due to various factors like interest rates, employment levels, or
even seasonal changes. If a real estate investor merely glances at this data, they might
feel lost amidst the monthly ups and downs. But with statistics, they can calculate the
average house price for the year, understand the degree of price variation, and even
2
predict future price trends based on historical data. By doing this, our investor can
make informed decisions about when to buy or sell properties.
Imagine a bustling stock market. Without statistics, an individual trader might feel
overwhelmed by the sheer volume of numbers flashing on the screen. How do you
differentiate between a momentary price dip and a significant downward trend? How
do you gauge the general sentiment of the market? This is where statistics come in. By
computing averages, measuring variability, and testing hypotheses, traders can discern
patterns, evaluate risks, and predict potential market movements.
Furthermore, statistics isn’t just about crunching numbers. It’s about understanding
uncertainty and making the best possible decisions with the information at hand. In the
world of finance, where the stakes are high and uncertainty is a given, statistical analy-
sis becomes a crucial tool. Whether it’s an investment bank assessing the risk of a new
financial product, an insurance company setting premiums, or a retail investor deciding
which stocks to buy, statistics provides the framework for making these decisions.
In essence, while financial markets might seem chaotic and unpredictable, beneath
the surface, there are patterns, trends, and relationships waiting to be discovered. And
the key to unveiling them? Statistics.
2 Descriptive Statistics
Descriptive statistics is like giving a quick snapshot of data. Instead of sifting through
pages of numbers or trying to interpret rows of data, you receive a distilled overview
in the form of a few key measures that summarize everything. These measures help in
understanding the main characteristics of the data at a glance, providing insights into
its general behavior and key trends.
Imagine you’re trying to understand the performance of a stock over a year. Looking
at the daily closing prices might be overwhelming; that’s where descriptive statistics
come in. They offer an organized summary, making it easier to see the overall trend,
the typical price, and the variability in price.
2.1 Mean
The mean, often termed the average, tells us the central value around which all other
data points gravitate.
Example: Suppose you have the monthly returns of a stock over a year. The mean
would give you an idea of the stock’s typical monthly return. If the mean return is
positive, it indicates that, on average, the stock had a gain every month.
2.2 Median
The median is the middle value when all data points are arranged in ascending or
descending order. If there are an odd number of data points, the median is the one
right in the center. For an even number, it’s the average of the two central numbers.
Example: If you’re evaluating the annual salaries of employees in a firm, and the
salaries vary widely due to a few top executives with hefty compensation, the median
salary would provide a more representative figure of what the typical employee earns,
unaffected by the extremely high values.
3
2.3 Mode
The mode identifies the most frequently occurring value in a dataset. If a particular
number appears more often than any other, that number is the mode.
Example: In analyzing the sizes of shoes sold in a store, if size 8 is the most fre-
quently sold, then that’s the mode. This information could be useful for inventory
management.
3 Probability Distributions
A probability distribution is like a playbook for randomness. Instead of leaving us in
the dark about what might happen next, it illuminates the various possible outcomes
and tells us the odds of each. In the realm of finance, these outcomes could be anything
from stock prices and interest rates to market returns and even the success of a new
product launch.
Imagine throwing a dice. The outcome could be any number between 1 and 6. Each
of these outcomes has a known probability (in this case, 16 for each number). But what
if we’re dealing with something more complex, like the future price of a stock? There
are countless factors at play, and the possible outcomes might range in the hundreds,
thousands, or even more. A probability distribution helps map out these possibilities,
giving a clearer picture of potential futures.
4
3.2 Binomial Distribution
The binomial distribution deals with binary outcomes – situations where there are only
two possible results, like a coin toss yielding heads or tails. In finance, this might be
used to predict the likelihood of success or failure of a particular event based on a
known probability.
Example: Consider a trader who, based on their analysis, believes that there’s a
70% chance a particular stock will rise tomorrow. If they were to trade based on this
belief over 10 days, the binomial distribution could be used to calculate scenarios like
the probability that the stock will rise on exactly 7 out of those 10 days.
Ultimately, probability distributions offer a structured approach to uncertainty. They
don’t predict the future but rather provide a framework to understand the various pos-
sibilities and their likelihoods. For financial professionals, this knowledge is invaluable.
It helps them gauge risks, set expectations, and devise strategies that align with their
goals and risk tolerance.
4 Statistical Inference
Statistical inference is like detective work for data. Instead of investigating crime
scenes, we’re probing datasets. We gather clues from a sample, a smaller group se-
lected from a larger population, and make educated guesses or inferences about the
characteristics of the entire population.
Imagine you’re a financial analyst trying to understand the spending habits of con-
sumers in a large city. It’s impractical to ask every single person about their spending.
Instead, you might survey a smaller group and use their responses to make inferences
about the entire city’s habits.
5
the null) when it’s actually not, they’ve made a Type I error. This could lead to the
adoption of an ineffective strategy.
5 Regression Analysis
Regression analysis is akin to finding the best path through a forest of data points.
Imagine you’re looking at a scatter plot of points, and you want to draw a single line
that best captures the overall trend of those points. That’s what regression analysis
does. It’s a statistical tool that lets you predict a dependent variable based on one
or more independent variables. In the realm of finance, this can be incredibly useful,
as it allows professionals to make educated predictions about future events based on
historical data.
6
5.1 Simple Linear Regression
Simple linear regression is the most basic form of regression. It involves one indepen-
dent variable and one dependent variable. The goal is to find a straight line that best
fits the data.
Example: Suppose an analyst wants to understand the relationship between a com-
pany’s advertising spend and its sales revenue. By plotting past data and fitting a line,
the analyst can predict how much revenue increase can be expected for a given increase
in advertising spend.
7
6 Time Series Analysis
Time series analysis is like watching a movie of data points. Instead of seeing a snapshot
or a single frame, you’re observing how data changes over time. In finance, time series
analysis helps in tracking and forecasting the movement of stocks, commodities, and
other financial metrics over periods ranging from days to years.
1. Trend: The overall direction in which the data is moving over the long term.
4. Irregular or Noise: The random fluctuations that can’t be attributed to the above
components.
6.2 Forecasting
A major application of time series analysis in finance is forecasting future values based
on past and present data. Predicting stock prices, interest rates, and economic indica-
tors are all areas where time series forecasting plays a pivotal role.
Example: An investment firm might use time series models to predict the future
performance of a stock based on its historical prices. This prediction can guide decisions
about buying, selling, or holding the stock.
8
7.1 Machine Learning in Finance
Machine learning, a subset of artificial intelligence, involves training algorithms on vast
amounts of data to make predictions or decisions without being explicitly programmed.
Example: Hedge funds might employ machine learning models to predict stock
movements based on a vast array of variables, from company fundamentals to global
economic indicators.
9
8.3 Autoregressive Moving Average Model (ARMA)
ARMA combines both AR and MA components. It’s suitable for univariate time series
data that exhibits both trend and seasonality.
Example: The daily trading volume of a stock might be influenced by its past vol-
umes (AR part) and also by recent random events like news or macroeconomic factors
(MA part).
10
8.8.2 IGARCH (Integrated GARCH)
The ”I” in IGARCH stands for ”integrated”. In this model, the sum of the ARCH and
GARCH coefficients is constrained to be exactly one. This implies that shocks to the
volatility persist indefinitely into the future. In simpler terms, once there’s a shock, its
effects never completely fade away.
Example: Consider a financial market that has experienced a major crash. Even
after many years, the memories of that crash might still affect the market’s volatility.
IGARCH captures this never-ending impact of past shocks.
11
In simpler terms, while basic GARCH models tell us about the volatility of a single
asset, DCC GARCH reveals how different assets move together over time. Do they tend
to rise and fall in tandem? Or does one go up when the other goes down?
Example: Consider an investment portfolio containing stocks, bonds, and com-
modities. The DCC GARCH model can help in understanding how these assets interact
with each other. For instance, when stock prices drop, do bond prices usually rise (indi-
cating a negative correlation), or do they also fall (positive correlation)? Understand-
ing these dynamic correlations can guide investment strategies, especially in terms of
diversification and hedging.
In financial markets, where large changes (jumps or crashes) can occur, models like
EGARCH, which can capture the asymmetric response of volatility to shocks, might be
preferred.
9 Tests in Statistics
Statistical tests provide a structured way to make decisions based on data. Whether
you’re trying to determine if a new investment strategy outperforms an old one, or if
a certain stock’s returns are normally distributed, statistical tests offer a way to draw
conclusions with a known level of confidence.
9.1 t-test
The t-test is used to determine if there’s a significant difference between the means of
two groups. It can be a two-sample t-test (for comparing two groups) or a paired t-test
(for comparing two measurements taken on the same group).
Example: Imagine two investment strategies, A and B. To determine which strategy
yields higher returns, a financial analyst could use the two-sample t-test. If Strategy A
is applied to a portfolio in one year and Strategy B in the next, a paired t-test might be
more appropriate to account for market variations.
12
9.4 Shapiro-Wilk Test
This test checks whether a variable follows a normal distribution, which is a common
assumption in many financial models.
Example: Before applying certain stock pricing models, an analyst might use the
Shapiro-Wilk test to check if the stock returns are normally distributed.
13
Example: After modeling a stock’s daily returns, an analyst might use the Ljung-
Box test to check that the model captures all significant temporal patterns, ensuring no
autocorrelation remains in the residuals.
Thank You
14