
FINANCIAL DATA ANALYSIS

CHAPTER IV
SIMPLE REGRESSION
Lesson Learning Outcomes

• Understand the goals of regression techniques in financial data analysis

• Be able to produce a simple linear regression analysis with financial data

• Be able to interpret the results of a simple regression analysis


SIMPLE REGRESSION
Definition: Simple regression (simple linear regression) is a statistical method used in finance,
investing, and other disciplines that attempts to determine the strength and character of the
relationship between one dependent variable (usually denoted by Y) and one independent variable
(usually denoted by X).
The typical simple regression model is:

Yi = β0 + β1·Xi + ui

where β0 is the intercept, β1 is the slope and ui is the error term.
Note: the linear regression model is always only an approximation of the true relationship. The
Y value calculated using the regression model is only the expected value of Y, and this value may
differ from the true value.
SIMPLE REGRESSION
Regression line: linear regression establishes the linear relationship between two variables
based on a line of best fit. It is thus depicted graphically as a straight line whose slope
defines how a change in one variable affects the other.
SIMPLE REGRESSION: Sum of squared residuals
Sum of squared residuals (SSR): a statistical measure of the amount of variance in a data set that
is not explained by the regression model itself, that is, the variance left in the residuals, or
error term.

SSR = Σ ui² = Σ (Yi − Ŷi)², where Ŷi is the value of Yi predicted by the model.

The “best fit line”, or the optimal regression line/model of two variables, is the one with the
smallest SSR; this is usually referred to as the least-squares criterion.
Financial analysts use the SSR to assess the validity of their econometric models.
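The least-squares idea can be illustrated with a short Python sketch. The data are hypothetical (not taken from the course files) and the helper names `ssr` and `least_squares` are introduced here only for illustration. The sketch computes the closed-form least-squares line for one regressor and checks that shifting the intercept or the slope away from it makes the SSR larger.

```python
# Hypothetical sample data (assumption: not from the course files).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

def ssr(b0, b1, xs, ys):
    """Sum of squared residuals for the line y = b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(xs, ys))

def least_squares(xs, ys):
    """Closed-form least-squares estimates (intercept, slope) for one regressor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = least_squares(x, y)
best = ssr(b0, b1, x, y)

# Moving the intercept or the slope away from the fitted values increases SSR.
shifted_intercept = ssr(b0 + 0.5, b1, x, y)
shifted_slope = ssr(b0, b1 + 0.1, x, y)
print(f"intercept={b0:.2f}, slope={b1:.2f}")
print(f"SSR at the fit={best:.4f}, shifted intercept={shifted_intercept:.4f}, "
      f"shifted slope={shifted_slope:.4f}")
```

For these numbers the fitted line has intercept 0.09 and slope 1.97.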
SIMPLE REGRESSION: Produce Simple Regression
Step 1: In Excel, Data -> Data Analysis -> Regression
Step 2: Input Y range -> choose the data range of the dependent variable
Step 3: Input X range -> choose the data range of the independent variable
Step 4: Output range -> choose a convenient empty area of the worksheet -> Enter
SIMPLE REGRESSION: Explained and unexplained parts
SIMPLE REGRESSION: Significance F and P-value: the validity

Significance F and the P-value reveal the validity of the regression model.

If Significance F and the P-value are < 0.05 -> the H0 hypothesis is rejected -> the regression model is valid
• H0: β1 = 0 -> the dependent variable (Y) and the independent variable (X) have no relationship
-> rejecting H0 means there is a relationship between X and Y, and that relationship can be modelled
by a regression
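The test of H0: β1 = 0 can be sketched by hand in Python. The data are hypothetical, and instead of computing the P-value (which needs a statistics library) the sketch compares the F statistic with the tabulated 5% critical value of F(1, 8), about 5.32. That critical value is valid only for this sample size of n = 10; exceeding it is equivalent to Significance F < 0.05.

```python
import math

# Hypothetical data with n = 10 observations (assumption for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.3, 3.1, 4.8, 5.2, 7.1, 7.9, 9.4, 10.2, 12.1, 12.8]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx           # slope estimate
b0 = y_bar - b1 * x_bar  # intercept estimate

fitted = [b0 + b1 * xi for xi in x]
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained part
ess = sum((fi - y_bar) ** 2 for fi in fitted)           # explained part

s2 = ssr / (n - 2)             # residual variance, n - 2 degrees of freedom
se_b1 = math.sqrt(s2 / sxx)    # standard error of the slope
t_stat = b1 / se_b1            # t statistic for H0: beta1 = 0
f_stat = (ess / 1) / s2        # F statistic with (1, n - 2) degrees of freedom

F_CRIT_5PCT = 5.32             # tabulated 5% critical value of F(1, 8)
reject_h0 = f_stat > F_CRIT_5PCT

print(f"t = {t_stat:.2f}, F = {f_stat:.2f}, reject H0: {reject_h0}")
```

In a simple regression the F statistic equals the square of the slope's t statistic, which is why Significance F and the slope's P-value lead to the same conclusion.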
SIMPLE REGRESSION: R2: measuring the fit of a regression model

R-squared (R2) is a statistical measure that represents the proportion of the variance of a
dependent variable that is explained by the independent variable or variables in a regression model.
Ex: R2 = 0.28 means that 28% of the variation in Y can be explained by X.
R-squared only works as intended in a simple linear regression model with one explanatory variable.
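As a small worked example, R2 can be computed from the explained and unexplained parts of the variation in Y. The data below are hypothetical and chosen so the arithmetic is easy to follow; TSS (total sum of squares) and ESS (explained sum of squares) are names introduced here for illustration.

```python
# Hypothetical data chosen so the arithmetic is easy to check by hand.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 1.0, 4.0, 2.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

tss = sum((yi - y_bar) ** 2 for yi in y)                # total variation in Y
ess = sum((fi - y_bar) ** 2 for fi in fitted)           # explained part
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained part

r_squared = 1 - ssr / tss
print(f"TSS={tss}, ESS={ess}, SSR={ssr}, R2={r_squared}")
```

Here TSS = 10, ESS = 2.5 and SSR = 7.5, so R2 = 0.25: a quarter of the variation in Y is explained by X, and the explained and unexplained parts add up to the total.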
SIMPLE REGRESSION: Residual examination and outliers

The difference between Yi and the expected value of Yi calculated using the regression model, Ŷi,
is the residual ui.
Observations with an unusually large ui (in absolute value) are outliers.
The existence of outliers may affect the validity of the regression model, so they are filtered out
in some cases.
To examine the residuals and find outliers, we use the “residual plot” option of the Regression tool
in Data Analysis in Excel.
[Figure: “X Variable 1 Line Fit Plot” (Y and Predicted Y against X Variable 1) and “X Variable 1 Residual Plot” (Residuals against X Variable 1)]
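The residual check can also be sketched numerically. The data are hypothetical, with one observation deliberately set far from the others, and the cut-off of two residual standard errors is a common rule of thumb assumed here for illustration, not a rule stated in this chapter.

```python
import math

# Hypothetical data; the fifth observation (index 4) is deliberately unusual.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.0, 8.1, 20.0, 12.1, 13.9, 16.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Residual u_i = actual Y_i minus the Y_i predicted by the regression line.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Residual standard error with n - 2 degrees of freedom.
s = math.sqrt(sum(u ** 2 for u in residuals) / (n - 2))

# Assumed rule of thumb: flag residuals larger than 2 standard errors.
THRESHOLD = 2.0
outliers = [i for i, u in enumerate(residuals) if abs(u) > THRESHOLD * s]

for i in outliers:
    print(f"observation {i}: x={x[i]}, y={y[i]}, residual={residuals[i]:.2f}")
```

Only the observation at index 4 is flagged. Whether a flagged point should actually be removed is a judgment call that depends on why it is unusual.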
SIMPLE REGRESSION: Exercise 4.1

1. Produce the regression model of X (number of deaths from Covid) and Y (stock price of a vaccine
manufacturer).
2. Draw the scatter graph with the regression line.
3. Produce the regression analysis.
4. Interpret the results.
5. Do the same with the other 2 companies.
6. Identify the company whose stock price is impacted the most by X.
The capital asset pricing model (CAPM)

The Capital Asset Pricing Model (CAPM) describes the
relationship between systematic risk, or the general risk of
investing, and expected return for financial assets, particularly
stocks.
The risk-free rate in the CAPM formula accounts for the time
value of money.
The beta of a potential investment is a measure of how much
risk the investment will add to a portfolio that looks like the
market.
The goal of the CAPM formula is to evaluate whether a stock is fairly valued when its risk and
the time value of money are compared with its expected return.
In other words, by knowing the individual parts of the CAPM, it is possible to gauge whether the
current price of a stock is consistent with its likely return.
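The parts described above combine in the standard CAPM formula, expected return = risk-free rate + beta × (expected market return − risk-free rate). A minimal sketch follows; all numbers are hypothetical and the function names are introduced only for illustration. Because this chapter is about regression, the sketch also estimates beta as the slope of a simple regression of a stock's excess returns on the market's excess returns, which is one common way to obtain it.

```python
def capm_expected_return(risk_free, beta, market_return):
    """Standard CAPM: risk-free rate plus beta times the market risk premium."""
    return risk_free + beta * (market_return - risk_free)

def estimate_beta(market_excess, stock_excess):
    """Slope of a simple regression of stock excess returns on market excess returns."""
    n = len(market_excess)
    m_bar = sum(market_excess) / n
    s_bar = sum(stock_excess) / n
    sxy = sum((m - m_bar) * (s - s_bar)
              for m, s in zip(market_excess, stock_excess))
    sxx = sum((m - m_bar) ** 2 for m in market_excess)
    return sxy / sxx

# Hypothetical inputs: 3% risk-free rate, beta of 1.3, 8% expected market return.
expected = capm_expected_return(risk_free=0.03, beta=1.3, market_return=0.08)
print(f"CAPM expected return: {expected:.3f}")

# Hypothetical periodic excess returns (return minus the risk-free rate).
market_excess = [0.010, -0.020, 0.030, 0.015, -0.005]
stock_excess = [0.012, -0.027, 0.040, 0.020, -0.004]
beta_hat = estimate_beta(market_excess, stock_excess)
print(f"Estimated beta: {beta_hat:.2f}")
```

With the first set of inputs the expected return is 0.03 + 1.3 × 0.05 = 0.095, or 9.5%. If the return an analyst forecasts for the stock is below this figure, the stock may be overvalued for its level of risk under the model's assumptions.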
Non-Linear regression

In many cases, the scatter graph shows that the relationship between two variables does not
appear to be linear.
The two most common cases are those in which the scatter graph suggests that the two variables
have a quadratic or a logarithmic relationship.
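One common way to handle such cases, sketched here as an illustration rather than as the method prescribed by this chapter, is to transform X and then run an ordinary simple regression on the transformed variable. The data below are constructed to follow Y = 1 + 2·ln(X) exactly, so regressing Y on ln(X) should recover those coefficients, while regressing Y on X itself fits less well. The helper name `fit` is hypothetical.

```python
import math

def fit(xs, ys):
    """Return (intercept, slope, r_squared) for a simple least-squares regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(xs, ys))
    tss = sum((yi - y_bar) ** 2 for yi in ys)
    return b0, b1, 1 - ssr / tss

# Constructed data (assumption): an exactly logarithmic relationship.
x = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
y = [1.0 + 2.0 * math.log(xi) for xi in x]

# Regression on the raw X: a straight line fitted to a curved relationship.
lin_b0, lin_b1, lin_r2 = fit(x, y)

# Regression on ln(X): the relationship is linear in the transformed variable.
log_x = [math.log(xi) for xi in x]
log_b0, log_b1, log_r2 = fit(log_x, y)

print(f"Y on X:     R2 = {lin_r2:.3f}")
print(f"Y on ln(X): intercept = {log_b0:.3f}, slope = {log_b1:.3f}, R2 = {log_r2:.3f}")
```

For a quadratic relationship the same idea applies with X squared in place of ln(X).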
Non-Linear regression

Exercise 4.2: Using the data in EXECUTIVE.XLS examine different XY-plots involving
the variables X, Y, W and Z (Y = executive compensation, X = profits, W = percentage
change in sales and Z = percentage change in debt). Does there seem to be a nonlinear
relationship between any pair of variables?
Exercise 4.6
Data set EX46.XLS contains two variables, labeled Y and X.
(a) Make an XY-plot of these two variables. Does the relationship between Y and X
appear to be linear?
(b) Calculate the square root of variable X. Note the Excel function for square root is
SQRT.
(c) Make an XY-plot of the square root of X against Y. Does this relationship appear to
be linear?
