
LEC 2-4 BASICS, DATA TYPES

Array with scalar: element-wise arithmetic operation

LEC 7 PANDAS LEC 9 CONFIDENCE INTERVALS


Assignment Operators
x += y → x = x + y (similar for -=, *=, /=, **=)
Floor Division: x //= y → x = x // y, e.g. 5 // 2 = 2
Modulus: x %= y → x = x % y, e.g. 5 % 2 = 1

Array with scalar, e.g. + 3 or * 2

FOR POPULATION MEANS
When σ is known:
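A sketch of the standard textbook intervals for a population mean, covering both the known-σ and unknown-σ cases. The notation is an assumption, since it is not defined on this sheet: x̄ is the sample mean, s the sample standard deviation, n the sample size and 1-α the confidence level. The sheet's own formulas may be written differently.

```latex
% sigma known: z-interval
\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}

% sigma unknown: t-interval with n-1 degrees of freedom
\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
```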

When σ is unknown:

POPULATION PROPORTIONS
Find worst-case MOE using p̂ = 0.5
We have (1-α) confidence that the true pop. proportion is within a ±MOE% interval around the estimate

Array with array:
Find E(X) & Var/Std from arrays of P(X) & X values

Construct dataframes with pd.DataFrame(dict)
.index for row index
.columns for column index
.loc['Ann'] for label-based
.iloc[1] for integer position
Label-based slicing includes the stop index item

String Methods: Case Conversion
string.upper(), .lower() → convert all letters
.capitalize() → convert 1st letter to upper
.swapcase() → swap upper with lower cases
.title() → capitalize 1st letter of each word
.find(__) → returns index of input's first char if input found, else -1

(1) String, (2) List, (3) Tuple, (4) Dictionary
All are iterable using in; (1) & (3) are not mutable; all except (4) are indexed & sliced using integers and have operators + and *; all have methods, though (3) has only .count() and .index()
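A minimal sketch of the string methods and the mutability rules for the four data types, using made-up values. It only illustrates the behaviour noted on this sheet.

```python
# Case-conversion methods return new strings; the original is unchanged.
s = "hello World"
print(s.upper())        # HELLO WORLD
print(s.lower())        # hello world
print(s.capitalize())   # Hello world
print(s.swapcase())     # HELLO wORLD
print(s.title())        # Hello World
print(s.find("World"))  # 6
print(s.find("xyz"))    # -1

# Strings and tuples are immutable: item assignment raises TypeError.
for immutable in ("abc", (1, 2, 3)):
    try:
        immutable[0] = "z"
    except TypeError as err:
        print(type(immutable).__name__, "is immutable:", err)

# Lists are mutable; + concatenates and * repeats (also true for str and tuple).
nums = [1, 2] + [3]
nums[0] = 10
print(nums, nums * 2)   # [10, 2, 3] [10, 2, 3, 10, 2, 3]

# Dictionaries are looked up by key, not by integer position or slice.
ages = {"Ann": 21, "Bob": 25}
print("Ann" in ages, ages["Ann"])  # True 21
```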
NULL H0 & ALTERNATIVE Ha HYPOTHESIS TESTING
Reject H0 if P-val < significance level
Assuming H0 is true, sample mean is standardised as:
Right-tail test e.g.
Assuming H0, sample proportion standardised as:

Output due to Broadcasting:
Numpy Math Functions: log, exp, square, power

Rows are accessed via integer pos or
MUST put [1:2], and NOT [1]

List – Assignment implies Aliasing
When list_2 = list_1, list_2 is list_1, hence modifying one list will change the other too
** Create new object using .copy() or [0:]

List methods
RETURNS NONE: .append(_), .extend(accepts only iterable input), .insert(index of item, item), .remove(item to remove)
RETURNS ITEM REMOVED: .pop(index of item removed) → remove last item if no input
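The list aliasing rule and the return values of the list methods can be checked with a short sketch. The variable names follow the sheet where it gives them (list_1, list_2); the rest are illustrative.

```python
list_1 = [1, 2, 3]
list_2 = list_1            # alias: both names refer to the same list object
list_2.append(4)
print(list_1)              # [1, 2, 3, 4]
print(list_2 is list_1)    # True

list_3 = list_1.copy()     # new object (list_1[0:] also works)
list_3.append(5)
print(list_1)              # [1, 2, 3, 4]
print(list_3 is list_1)    # False

# Mutating methods change the list in place and return None.
items = ["a", "b"]
result = items.append("c")
print(result)              # None
items.extend("de")         # any iterable is accepted; a string adds each character
items.insert(1, "x")       # insert(index, item)
items.remove("b")          # removes the first matching item
print(items)               # ['a', 'x', 'c', 'd', 'e']

# pop returns the removed item; with no argument it removes the last one.
last = items.pop()
first = items.pop(0)
print(last, first, items)  # e a ['x', 'c', 'd']
```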

LEC 5 FUNCTIONS LEC 10 SIMPLE LINEAR REGRESSION


Namespaces: Local, then global, then built-in
All positional arguments must be specified before keyword arguments, otherwise error

Measures: .mean(), median, std, var, min, max
.value_counts() gives actual count of each category → Input normalize=True to get proportion of each
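A small pandas sketch of the measures, .value_counts() and the .loc / .iloc access noted on this sheet. The DataFrame contents and the column names grade and score are hypothetical.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame(
    {"grade": ["A", "B", "A", "C", "A"], "score": [90, 75, 88, 60, 95]},
    index=["Ann", "Bob", "Cat", "Dan", "Eve"],
)

print(df["score"].mean(), df["score"].median(), df["score"].max())

counts = df["grade"].value_counts()               # actual count per category
props = df["grade"].value_counts(normalize=True)  # proportion per category
print(counts)
print(props)

print(df.loc["Ann"])          # row by label
print(df.iloc[1])             # row by integer position (Bob)
print(df.loc["Ann":"Cat"])    # label-based slice includes the stop label
print(df.iloc[0:2])           # integer slice excludes the stop position
```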
.describe() gives rows of count, mean, std, min, 25%, 50%, 75% and max
Plotting Double Bar Chart
Plotting Scatter / Histogram

LEC 8 Vectorised string operations

u due to disturbances
Residual
Ordinary Least Squares Method - Minimise SSR

Module Importing
import statistics → use statistics.mean(list) or statistics.stdev(list)
from statistics import * → use mean (avoid using * for better namespace management)
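The two import styles can be compared side by side. This sketch imports explicit names instead of *, which keeps the namespace clean; the data list is made up.

```python
import statistics                   # names stay behind the module prefix
from statistics import mean, stdev  # explicit names instead of *

data = [2, 4, 4, 4, 5, 5, 7, 9]

print(statistics.mean(data))    # 5
print(statistics.stdev(data))   # sample standard deviation (n - 1 denominator)
print(mean(data) == statistics.mean(data))  # True
```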

LEC 6 NUMPY ARRAYS


Change datatype:
shape = (axis0, axis1, …)
index = [index0, index1, …]

NORMAL

BINOMIAL DISTR
from scipy.stats import binom
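A hedged sketch of what the binom import is typically used for. The values n = 10 and p = 0.3 are hypothetical, and the closed-form check is added here only as a sanity test.

```python
from math import comb
from scipy.stats import binom

n, p = 10, 0.3   # hypothetical number of trials and success probability

pmf_3 = binom.pmf(3, n, p)   # P(X = 3)
cdf_3 = binom.cdf(3, n, p)   # P(X <= 3)
sf_3 = binom.sf(3, n, p)     # P(X > 3) = 1 - cdf
print(pmf_3, cdf_3, sf_3)
print(binom.mean(n, p), binom.var(n, p))  # n*p and n*p*(1-p)

# Cross-check the pmf against the closed-form binomial formula.
manual = comb(n, 3) * p**3 * (1 - p) ** (n - 3)
print(abs(pmf_3 - manual) < 1e-12)
```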

Plot vertical lines


Sample to Population
σ is pop. standard dev

Residual Plot (SLR)

Assumption 2: sample data is randomly generated
3: sample outcomes on x are different values
4: Conditional zero mean: E(u | x) = 0
- If A1-4 are violated, sample estimators are biased
5: Homoscedasticity: Var(u | x) same for all x
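A minimal sketch of ordinary least squares for simple linear regression, minimising SSR with the closed-form estimates. It uses only NumPy, which is an assumption: the sheet's results.predict() suggests a model-fitting library is used in the lectures. The data are hypothetical.

```python
import numpy as np

# Hypothetical sample, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed-form OLS estimates that minimise SSR = sum((y - b0 - b1*x)**2).
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x        # fitted values
residuals = y - y_hat      # sample counterpart of the disturbance u
ssr = np.sum(residuals ** 2)

print(b0, b1, ssr)
print(residuals.sum())     # approximately 0 when an intercept is included
```

Plotting residuals against x gives the residual plot used to eyeball the zero-mean and homoscedasticity assumptions.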
LEC 11 MULTIPLE LINEAR REGRESSION

A1:
2: Sample randomly generated from model in A1
3: NO perfect collinearity
(No independent variable may be an exact linear function of the others; strong but imperfect correlation is allowed)
4:
If A1-4 are satisfied, sampling distributions of the estimators are centred about the true pop. parameter vals (unbiased)
5 & 6:
Including irrelevant x variable → no effect on unbiasedness but increases estimators' variances due to collinearity
Omitting a relevant x variable → biased estimators

Monte Carlo Simulation (with demand poisson distributed) to find optimal profit
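A sketch of a Monte Carlo simulation with Poisson-distributed demand, searching for the order quantity with the highest average profit. The price, unit cost, mean demand and candidate order sizes are all hypothetical, and the profit rule (sell the smaller of order and demand, pay for the whole order) is an assumption about the setup.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# All figures below are hypothetical.
price, cost = 5.0, 3.0        # selling price and unit cost
mean_demand = 100             # Poisson mean
n_sims = 10_000

demand = rng.poisson(lam=mean_demand, size=n_sims)   # simulated demand scenarios
order_sizes = np.arange(80, 121)                     # candidate order quantities

# Average profit for each candidate: sell min(order, demand), pay for the whole order.
avg_profit = np.array([
    np.mean(price * np.minimum(q, demand) - cost * q)
    for q in order_sizes
])

best = order_sizes[np.argmax(avg_profit)]
print(best, avg_profit.max())
```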

If A1-6 satisfied, can obtain standardised term
Confidence Interval:

Zip tuple

Dataframe rp for daily return rate
df.std() and .mean() default to axis=0, giving a series of stds & means for each stock
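A sketch of the axis=0 default on a returns DataFrame. The prices and tickers AAA and BBB are hypothetical, and computing the daily return rate with pct_change() is an assumption about how rp was built.

```python
import pandas as pd

# Hypothetical closing prices for two stocks.
prices = pd.DataFrame({
    "AAA": [100.0, 102.0, 101.0, 105.0],
    "BBB": [50.0, 49.0, 51.0, 52.0],
})

rp = prices.pct_change().dropna()   # daily return rate: (p_t - p_{t-1}) / p_{t-1}

print(rp.mean())          # axis=0 by default: one mean per column (stock)
print(rp.std())           # sample std (ddof=1) per stock
print(rp.mean(axis=1))    # axis=1 instead averages across stocks for each day
```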

LEC 12 NON-LINEARITY

DEFO NONLINEAR:

can use y_hat = results.predict() against x values to plot fitted line


But if import numpy.random as rd, rd.randint(x, y) EXCLUDES y (unlike random.randint(x, y), which includes y)
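The difference in the stop value can be seen by drawing many values from each function. This is a small sketch; the range 1 to 6 and the seeds are arbitrary.

```python
import random
import numpy.random as rd

random.seed(0)
rd.seed(0)

py_draws = [random.randint(1, 6) for _ in range(10_000)]  # stop value 6 is included
np_draws = rd.randint(1, 6, size=10_000)                  # stop value 6 is excluded

print(min(py_draws), max(py_draws))
print(np_draws.min(), np_draws.max())
```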

Interaction terms enable different slope parameters

Daily Return Rate & Var of stocks

Portfolio diversification

np.argmax(numpy array) gives index of the max value in numpy array
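A short sketch of np.argmax on made-up arrays, including the 2-D case where the default result is an index into the flattened array.

```python
import numpy as np

profits = np.array([12, 45, 7, 45, 30])
i = np.argmax(profits)
print(i, profits[i])   # 1 45  (first occurrence when the max is tied)

# For a 2-D array, argmax returns an index into the flattened array by default.
grid = np.array([[1, 9, 3],
                 [4, 2, 8]])
flat = np.argmax(grid)
print(flat)                                 # 1
print(np.unravel_index(flat, grid.shape))   # row 0, column 1
print(np.argmax(grid, axis=0))              # [1 0 1]
```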

Tut 8: Example on Right-Tailed Test for Sample Proportion
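The tutorial's own numbers are not reproduced here. As a stand-in, this is a sketch of a right-tailed z-test for a proportion with hypothetical values (p0 = 0.50, n = 400, 220 successes, α = 0.05), standardising the sample proportion under H0 and rejecting when the p-value is below the significance level.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers: H0: p = 0.50 versus Ha: p > 0.50.
p0 = 0.50
n = 400
successes = 220
alpha = 0.05

p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standardised under H0
p_value = 1 - NormalDist().cdf(z)            # right-tail area

print(z, p_value)
print("Reject H0" if p_value < alpha else "Do not reject H0")
```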


Ex 8: Example on Two-Tailed Test for Sample Mean
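The exercise's own numbers are not reproduced here. As a stand-in, this is a sketch of a two-tailed test for a mean with hypothetical values, assuming σ is known so that the normal distribution applies. With σ unknown, the sample standard deviation and the t distribution with n - 1 degrees of freedom would be used instead.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers: H0: mu = 50 versus Ha: mu != 50, with sigma assumed known.
mu0 = 50.0
sigma = 8.0
n = 64
x_bar = 52.5
alpha = 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))         # standardised under H0
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # area in both tails

print(z, p_value)
print("Reject H0" if p_value < alpha else "Do not reject H0")
```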
