*probability distribution:
-lists the possible values of a variable and the probability that each will occur; the probabilities must sum to 1
*Expected Value/Mean
*Variance
*Standard Deviation
*cumulative probability distribution / cumulative distribution function / CDF
-probability that a random variable is less than or equal to a particular value.
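A minimal sketch of the quantities listed above for a discrete random variable. The values and probabilities are made up for illustration; they are not from these notes.

```python
import math

# Hypothetical distribution (illustration only).
values = [0, 1, 2, 3]
probs = [0.1, 0.2, 0.3, 0.4]

# A valid probability distribution sums to 1.
assert math.isclose(sum(probs), 1.0)

# Expected value: probability-weighted average of the values.
mean = sum(v * p for v, p in zip(values, probs))

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

def cdf(x):
    """P(X <= x): add the probabilities of all values not exceeding x."""
    return sum(p for v, p in zip(values, probs) if v <= x)

print(mean, variance, std_dev, cdf(1))
```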
Probability Density Functions (for continuous random variable) (PDF, density function)
-probability that a continuous random variable equals one exact number is zero (0)
-the area under the pdf curve is the probability; to get it, take the integral of the pdf between two numbers
Linear Function
Covariance
-extent to which two random variables move together
-zero when random variables are independent
Correlation
-alternative measure of dependence between X and Y
-between -1 and 1
covariance formula: σ_XY = E(XY) - E(X)E(Y)
Uniform Distributions
f(x) = 1/(b-a) -has a constant f(x)
-continuous
distribution: x
-to calculate the probability that a sample is between c and d, calculate the shaded area.
Area of a rectangle = (L x W) or (B x H)
Base is the difference between d and c, while height is equal to f(x)
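The base-times-height rule above can be sketched as follows. The function name and the example numbers are illustrative assumptions.

```python
def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X uniform on [a, b], computed as base * height."""
    if b <= a:
        raise ValueError("need a < b")
    lo, hi = max(c, a), min(d, b)   # clip [c, d] to the support [a, b]
    base = max(hi - lo, 0.0)        # width of the shaded rectangle
    height = 1.0 / (b - a)          # constant density f(x) = 1/(b-a)
    return base * height

# Hypothetical example: X uniform on [0, 10], probability of landing in [2, 5].
p = uniform_prob(2, 5, 0, 10)
print(p)
```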
Exponential Distribution
-the density starts at the y-intercept (x = 0), where it takes its maximum value, lambda
-continuous
-the probability that a sample is less than a (in graph, shaded blue) is the area to the left of a
Area to the left = cumulative distribution function
= P(X<x) or P(X<a)
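A short sketch of the area to the left for an exponential variable, using the standard closed form P(X < a) = 1 - e^(-lambda*a). The rate 0.5 and the point a = 2 are made-up numbers.

```python
import math

def exponential_pdf(x, lam):
    """Density f(x) = lam * exp(-lam * x) for x >= 0; equals lam at x = 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def exponential_cdf(a, lam):
    """Area to the left of a: P(X < a) = 1 - exp(-lam * a)."""
    return 1.0 - math.exp(-lam * a) if a >= 0 else 0.0

lam = 0.5  # hypothetical rate
print(exponential_pdf(0, lam))   # height at the y-intercept
print(exponential_cdf(2, lam))   # shaded area to the left of a = 2
```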
> when asked what the distribution of the sample mean is:
> when asked to find the XXth percentile of the sample mean:
e.g. 80th
to get the answer, that is the area to the left of Xbar, which
Bernoulli Distribution
- for discrete random variables
-outcomes are binary (0 or 1)
*Expected Value/Mean
E (G) = 1 * p + 0 * (1-p)
=p
*Variance
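A small sketch applying the same weighting used for E(G) to the variance. It gives the standard result Var(G) = p(1-p). The value p = 0.3 is made up.

```python
def bernoulli_mean(p):
    """E(G) = 1 * p + 0 * (1 - p) = p."""
    return 1 * p + 0 * (1 - p)

def bernoulli_variance(p):
    """Var(G) = (1 - mean)^2 * p + (0 - mean)^2 * (1 - p) = p * (1 - p)."""
    mean = bernoulli_mean(p)
    return (1 - mean) ** 2 * p + (0 - mean) ** 2 * (1 - p)

p = 0.3  # hypothetical success probability
print(bernoulli_mean(p), bernoulli_variance(p))
```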
*kurtosis
-heaviness / thickness of the tails
-measures how much variance of X arises from extreme values/outliers
-the greater the kurtosis, the more likely are outliers
Conditional Distribution
-distribution of random variable Y conditional on random variable X taking a specific value.
function: Pr(Y=y|X=x)
*Conditional Expectation/Mean
-average of the values of Y, weighted by the conditional probabilities Pr(Y=y|X=x)
*Conditional Variance
Normal Distribution
-continuous random variable
X~ N (u,o^2)
-to look up the probability of a normally distributed variable, we standardize (z) the variable:
z = (x-u)/o -standardized version of X
-after getting the value of z, look it up in the z table, which gives the area to the left of the z value
e.g. z = -1: =NORM.DIST and =NORMSDIST both return 0.158655
X~ N (0,1)
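The z-table lookup can be sketched with the error function from the standard library. The inputs x = 9, u = 10, o = 1 are made up so that z = -1, matching the 0.158655 value in these notes.

```python
import math

def standardize(x, mu, sigma):
    """z = (x - u) / o."""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """Area to the left of z under N(0, 1), i.e. what the z table reports."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z = standardize(9, 10, 1)          # hypothetical inputs giving z = -1
print(z, standard_normal_cdf(z))   # area to the left of z
```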
Chi-Squared Distribution
-for testing hypothesis
- sum of m squared independent standard normal random variables.
m = degrees of freedom of chi-squared distributions
F Distribution
Poisson Distribution
-discrete occurrences over a continuous domain
-mean = variance
-mean and variance = λ
Probability: P(x) = (λ^x * e^(-λ)) / x!
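A minimal sketch of the Poisson probability, P(x) = λ^x e^(-λ) / x!, with a numeric check that mean and variance both equal λ. The rate λ = 3 is a made-up number.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with rate lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

lam = 3.0  # hypothetical rate
support = range(60)  # far enough into the tail for lam = 3

mean = sum(x * poisson_pmf(x, lam) for x in support)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in support)
print(poisson_pmf(0, lam), mean, variance)
```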
Binomial Distribution
-discrete occurrences among discrete trials
P(x) = (n x) * p^x * q^(n-x)
for (n x) => n! / (x! * (n-x)!)
where:
P(x) probability of x
p probability of successful event per trial
q probability that event fails per trial
n no of trials
x successful events
*to get the probability that x is < or > some value, get the probability of each value in the range and sum
e.g. P(X<5) = P(0) + P(1) + P(2) + P(3) + P(4)
mean = np
SD = sqrt(npq)
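A short sketch of the binomial probability and of summing individual probabilities for P(X < k). The trial count n = 10 and p = 0.5 are made-up numbers.

```python
import math

def binomial_pmf(x, n, p):
    """P(x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return math.comb(n, x) * p ** x * q ** (n - x)

def binomial_prob_less_than(k, n, p):
    """P(X < k) = P(0) + P(1) + ... + P(k - 1)."""
    return sum(binomial_pmf(x, n, p) for x in range(k))

n, p = 10, 0.5  # hypothetical trials and success probability
mean = n * p
sd = math.sqrt(n * p * (1 - p))
print(binomial_prob_less_than(5, n, p), mean, sd)
```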
t X<a in a normal distribution
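chi-square goodness-of-fit example:
o (observed)  23    16   14   19    28
e (expected)  20    20   20   20    20
(o-e)^2/e     0.45  0.8  1.8  0.05  3.2   sum = 6.3
compare the sum with the chi-square critical value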
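The worksheet numbers above can be reproduced as follows: each cell is (o-e)^2/e and the test statistic is their sum. The critical value is not computed here; it would come from a table or from =CHISQ.INV.RT.

```python
observed = [23, 16, 14, 19, 28]
expected = [20, 20, 20, 20, 20]

# Contribution of each category: (o - e)^2 / e
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# Chi-square test statistic: sum of the contributions.
chi_square = sum(contributions)

# Degrees of freedom for a goodness-of-fit test: categories - 1.
degrees_of_freedom = len(observed) - 1

print(contributions, chi_square, degrees_of_freedom)
```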
T distribution / value
EBM (error bound for the mean):
to estimate the population mean given the sample mean and sample standard deviation T value Excel formula:
2.093024 -1.833113
-2.093024 1.833113
Confidence Interval
Excel formula:
Student t Distribution
=CHISQ.INV.RT(prob,DF)
=CHISQ.INV(prob,DF)
F-Distribution =F.DIST(x,df1,df2,true)
-use alpha 2.063899
-use alpha/2
n-1
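A hedged sketch of the error bound for the mean, EBM = t * s / sqrt(n), and the resulting interval. The t value is passed in rather than computed. The sample mean, standard deviation and n are made up, and pairing 2.093024 with n = 20 (n-1 = 19 degrees of freedom, alpha/2 = 0.025) is an assumption about where that worksheet value came from.

```python
import math

def error_bound_for_mean(t_critical, sample_sd, n):
    """EBM = t * s / sqrt(n)."""
    return t_critical * sample_sd / math.sqrt(n)

def confidence_interval(sample_mean, t_critical, sample_sd, n):
    """Interval for the population mean: sample mean -/+ EBM."""
    ebm = error_bound_for_mean(t_critical, sample_sd, n)
    return sample_mean - ebm, sample_mean + ebm

# Hypothetical sample; t value assumed to be for df = 19, two-tailed alpha = 0.05.
low, high = confidence_interval(sample_mean=50.0, t_critical=2.093024,
                                sample_sd=4.0, n=20)
print(low, high)
```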
* Test of Significance
Measures of Variation
TSS = SSR+ESS
r^2 is the percentage of the variation in Y that is explained by X.
A test of significance is still needed.
get the F-ratio
2.866081
compare the F value with the F critical value (from table): if F > Fcritical, reject the null hypothesis
or
compare the p-value (Significance F in the ANOVA table) with alpha
since the p-value is less than alpha (0.05), reject the null hypothesis
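A minimal sketch of the sums of squares, r^2 and the F-ratio for a regression with one X. The data are made up. To avoid the SSR/SSE/ESS naming clash, the code uses explained_ss and residual_ss; which abbreviation maps to which is left as an assumption. The F critical value still has to come from a table or =F.INV.RT.

```python
# Hypothetical data (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
        sum((xi - x_bar) ** 2 for xi in x)
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]

# Measures of variation: total = explained + residual.
total_ss = sum((yi - y_bar) ** 2 for yi in y)
explained_ss = sum((fi - y_bar) ** 2 for fi in fitted)
residual_ss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

r_squared = explained_ss / total_ss

# F-ratio with 1 regressor: (explained / 1) / (residual / (n - 2)).
f_ratio = (explained_ss / 1) / (residual_ss / (n - 2))

print(r_squared, f_ratio)
```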
Multiple Regression:
4.56
20.7936
* High SSE, low R^2
* Low SSE, high R^2
*r^2 provides fit of the model
*r^2 is the coefficient of determination
*One tailed vs. Two tailed Test
> In a two-tailed test there are two rejection regions (one in each tail) and an acceptance region between them.
The shaded areas (rejection regions) are bordered by z values, which are also called critical values.
> confidence level and significance level (alpha) are complementary (they sum to 1)
In two-tailed tests, alpha is divided by 2 for each rejection region
-the area of the shaded region is computed and compared to alpha
-use the z value to determine the area
-the area you get from the z table is the area to the left of the positive z value
-calculate the area on the right (1 - table area)
-the p-value is the total of the two shaded sides
-compare the p-value to .05 (alpha)
-if it is less than alpha, reject Ho
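The steps above can be sketched with the standard library's NormalDist in place of the z table. The z value 1.25 is a made-up example.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """p-value = total area of both shaded tails for a z statistic."""
    area_left = NormalDist().cdf(abs(z))   # what the z table gives
    area_right = 1.0 - area_left           # one shaded tail
    return 2.0 * area_right                # both tails

def reject_null(z, alpha=0.05):
    """Reject Ho when the p-value is less than alpha."""
    return two_tailed_p_value(z) < alpha

z = 1.25  # hypothetical test statistic
print(two_tailed_p_value(z), reject_null(z))
```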
The t - Statistic
where:
*for large n (under central limit theorem), distribution of t approaches N(0,1)
T Test or Z Test?
T Test
-if we know only the sample standard deviation (population standard deviation unknown) and the sample size is less than 30.
Z Test
-if the sample size is more than 30
Hypothesis Test for the Population Mean (u)
z test
z statistic
-if the null hypothesis is true, then the z statistic has a standard normal distribution
*P-value Approach
-measures strength of the evidence against null hypothesis
-standard normal distribution
-measures area
-smaller p value = the greater evidence against null hypothesis
-reject Ho if p-value < or = alpha
0.10565 0.2113
confidence interval
two sided confidence interval
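-there is a direct relationship between confidence intervals and hypothesis testing; it holds when the confidence level and alpha are complementary
-do not reject Ho if the hypothesized value falls inside the interval
-compare the hypothesized value vs the interval
-the conclusion should be the same as with the p-value approach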
critical value of z:
-1.959964
-when the population standard deviation is not known, we can use sample standard deviation/sqrt(n), only if n>30
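A hedged sketch of a two-sided z confidence interval and its link to the hypothesis test. The sample mean, standard deviation and n are made up. The critical value for alpha = 0.05 comes out as the -1.959964 / +1.959964 pair noted above.

```python
import math
from statistics import NormalDist

def critical_z(alpha):
    """Positive critical value: alpha/2 is left in each tail."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def z_confidence_interval(sample_mean, sd, n, confidence=0.95):
    """sample mean -/+ z * sd / sqrt(n); confidence and alpha sum to 1."""
    alpha = 1.0 - confidence
    margin = critical_z(alpha) * sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

def reject_null(hypothesized_mean, interval):
    """Reject Ho when the hypothesized value lies outside the interval."""
    low, high = interval
    return not (low <= hypothesized_mean <= high)

interval = z_confidence_interval(100.0, 15.0, 36)  # hypothetical sample
print(critical_z(0.05), interval, reject_null(103.0, interval))
```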
Central Limit Theorem
- if n is large enough( n>=30)
-if we collect samples of size n from a population, calculate the mean of each of those samples and plot those means, the graph will approximate a normal distribution.
-if sample size is large enough, the sample distribution will approximate a normal distribution.
-graph will have a shape of normal distribution, even if we don't know the distribution of population
-use z table for probability calculations
-to get the probability that a sample is somewhere between a and b, use the z table to get the area under the curve.
for population: z = (x-u)/o
where: x random variable
u population mean
o population standard deviation
>when n is large, we can use the population mean as the mean of the sample means (u sub x bar)
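A small simulation sketch of the central limit theorem, assuming a clearly non-normal population (exponential with mean 1 and standard deviation 1). The sample size 36, the number of samples and the seed are arbitrary choices.

```python
import random
import statistics

random.seed(0)      # fixed seed so the run is repeatable
n = 36              # size of each sample (n >= 30)
num_samples = 5000  # how many samples to draw

# Draw many samples from the exponential population, keep each sample's mean.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

# The sample means centre on the population mean (1.0) ...
mean_of_means = statistics.fmean(sample_means)
# ... and their spread is close to o / sqrt(n) = 1 / 6.
sd_of_means = statistics.stdev(sample_means)

print(mean_of_means, sd_of_means)
```

A histogram of sample_means would look roughly bell-shaped even though the population itself is skewed.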
1. To compute covariance
Period X Y X-Xbar
1 1.96 1.46 -0.308
2 1.59 1.68 -0.678
3 0.36 1.07 -1.908
4 2.63 3.99 0.362
5 2.97 2.35 0.702
6 3.25 2.27 0.982
7 3.98 2.51 1.712
8 1.24 1.47 -1.028
9 1.24 1.68 -1.028
10 3.46 2.17 1.192
Mean 2.268 2.065
Covariance 0.59316666667
Correlation 0.620437923086795
*correlation = covariance / (standard dev of X * standard dev of Y)
*positive correlation - values move together
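The table above can be reproduced with a short script. It uses the sample (n-1) versions of covariance and standard deviation, which is what matches the worksheet figures.

```python
import math

# X and Y columns from the table above (periods 1 to 10).
x = [1.96, 1.59, 0.36, 2.63, 2.97, 3.25, 3.98, 1.24, 1.24, 3.46]
y = [1.46, 1.68, 1.07, 3.99, 2.35, 2.27, 2.51, 1.47, 1.68, 2.17]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance: sum of (X - Xbar)(Y - Ybar), divided by n - 1.
covariance = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations.
sd_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
sd_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

# Correlation: covariance scaled by both standard deviations.
correlation = covariance / (sd_x * sd_y)

print(x_bar, y_bar, covariance, correlation)
```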
[Chart: Value (axis from 90 to 115) plotted against 0 to 99]
[Chart: Value (axis from -6 to 10) plotted against 0 to 99]
Prob  Cum. from  Cum. to  Daily Rate
0.15  0          0.15     -1%
0.2   0.15       0.35     0%
0.3   0.35       0.65     1%
0.35  0.65       1        2%
Day    Random no.  Daily Rate  Price
start                          51.40
1      0.83        0.02        52.428
2      0.11        -0.01       51.90372
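A hedged sketch of the simulation these numbers appear to come from: a random number is mapped to a daily rate through the cumulative probabilities, and the price is compounded day by day. The row-to-rate pairing and the starting price 51.40 are inferred from the arithmetic (51.40 * 1.02 = 52.428, then * 0.99 = 51.90372), not stated in the notes.

```python
from bisect import bisect_right

rates = [-0.01, 0.00, 0.01, 0.02]   # daily rates
probs = [0.15, 0.20, 0.30, 0.35]    # probability of each rate

# Upper cumulative bounds: about 0.15, 0.35, 0.65, 1.0
upper_bounds = []
running = 0.0
for p in probs:
    running += p
    upper_bounds.append(running)

def rate_for(u):
    """Map a random number u in [0, 1) to the rate whose interval contains it."""
    index = min(bisect_right(upper_bounds, u), len(rates) - 1)
    return rates[index]

def simulate_prices(start_price, random_numbers):
    """Compound the price once per day using the looked-up daily rate."""
    prices = []
    price = start_price
    for u in random_numbers:
        price *= 1.0 + rate_for(u)
        prices.append(price)
    return prices

# Random numbers from the worksheet rows for day 1 and day 2.
prices = simulate_prices(51.40, [0.83, 0.11])
print(prices)
```

For a full simulation, the worksheet random numbers would be replaced with draws from random.random().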
4.Pricing of Options
-if the stock is less than 50, you will not buy the stock/not exercise the option
-if the stock is = 50, you will buy the stock/exercise the option
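A minimal sketch of the exercise rule as a payoff, assuming this is a call option with strike 50 and the usual payoff max(stock price - strike, 0). The payoff formula is not written out in these notes, and the prices fed in are examples.

```python
def call_payoff(stock_price, strike=50.0):
    """Payoff at expiry: exercise only when it is not a loss."""
    return max(stock_price - strike, 0.0)

def exercise(stock_price, strike=50.0):
    """True when the stock is at or above the strike."""
    return stock_price >= strike

for price in (48.0, 50.0, 51.90372):  # example prices
    print(price, exercise(price), call_payoff(price))
```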
Solver
Variables    X1     X2      X3      X4    X5
Values       0      0       7.2     0     4.8    Return
Net return   0.026  0.0509  0.0864  0.07  0.078  0.99648

Constraints  X1    X2    X3     X4    X5     LHS     Sign  RHS
1            1     1     1      1     1      12.00         12
2            0.4   0.4   0.4    -0.6  -0.6   (0.00)        0
3            0.5   0.5   -0.5                (3.60)        0
4            0.06  0.03  -0.01  0.01  -0.02  (0.17)        0
2+X3+X4+X5)
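A short check of the Solver output: it recomputes the Return cell and each constraint's LHS from the values row. It does not run an optimizer. Treating the blank cells in constraint 3 as zero coefficients on X4 and X5 is an assumption, and the constraint signs are not recoverable from these notes, so only the LHS values are verified.

```python
values = [0, 0, 7.2, 0, 4.8]                      # X1..X5 from Solver
net_return = [0.026, 0.0509, 0.0864, 0.07, 0.078]

constraints = [
    [1, 1, 1, 1, 1],
    [0.4, 0.4, 0.4, -0.6, -0.6],
    [0.5, 0.5, -0.5, 0, 0],                       # blanks assumed to be 0
    [0.06, 0.03, -0.01, 0.01, -0.02],
]

def dot(coefficients, xs):
    """Sum of coefficient * value, like SUMPRODUCT in the worksheet."""
    return sum(c * x for c, x in zip(coefficients, xs))

total_return = dot(net_return, values)
lhs = [dot(row, values) for row in constraints]

print(total_return, lhs)
```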