Uploaded by mesopotamicrelations

These are some statistics notes I took in class

© All Rights Reserved

“statistics”

You can have one, two, or multiple samples (all drawn from a population)

In each of these, you can have continuous random variables (what we will work with) or discrete

random variables

You can have large or small samples (each of which contains n persons/measurements)

o Standard error is essentially the standard deviation of an estimator (read up, but

basically, it’s sigma divided by root n (i.e. the more measurements you have, the smaller

your standard error gets))

o A point estimate on its own is of limited use because that particular collection of measurements might be off from the “real” value; therefore you need a confidence interval

o In a standard normal distribution, the central area between -2.58 and 2.58 covers 99%, -1.96 to 1.96 covers 95%, and -1.645 to 1.645 covers 90%. Useful numbers to know.
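A minimal sketch of where those cutoffs come from and how they turn a standard error into a confidence interval. It assumes sigma is known (so z values apply); the function names and the example numbers (x bar = 120, sigma = 15, n = 25) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(central_area):
    """Return z such that (-z, z) holds `central_area` of the standard normal."""
    return NormalDist().inv_cdf(0.5 + central_area / 2)


def z_confidence_interval(x_bar, sigma, n, central_area=0.95):
    """CI for the mean when sigma is treated as known (assumption)."""
    standard_error = sigma / sqrt(n)  # sigma divided by root n
    half_width = z_critical(central_area) * standard_error
    return x_bar - half_width, x_bar + half_width


# Cutoffs for the three central areas mentioned in the notes
for area in (0.90, 0.95, 0.99):
    print(area, round(z_critical(area), 3))

# Hypothetical sample summary
low, high = z_confidence_interval(x_bar=120.0, sigma=15.0, n=25)
print(round(low, 2), round(high, 2))
```

Quadrupling n halves the standard error, so the interval gets half as wide.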

CH 7:

H0 is the null hypothesis (there is no statistically significant relationship between two variables);

H1 or HA is the alternative or research hypothesis. Remember, just because you cannot reject the

null hypothesis DOES NOT MEAN THERE IS NO CORRELATION

o Absence of evidence is NOT evidence of absence!

Formatting:

In one sample problems, we can reject either hypothesis (if one is true, you must reject the

other)

o Accept correct H0 hypothesis (accept H0 when H0 is true), accepting correct H1

hypothesis, accepting incorrect H0 hypothesis, accepting incorrect H1 hypothesis

True negative, True positive, False negative, False positive

True results are great, but others lead to Type I and Type II errors

o Type I: Pr{rejecting H0 | H0 is true} lower left box in 2x2 table (alpha [α])

o Type II: Pr{failing to reject H0 | H1 is true} upper right box in 2x2 table (beta [β])

In a courtroom example, a guilty verdict when the person is innocent is type I,

whereas an innocent verdict when the person is guilty is type II (given the null

hypothesis is innocence and the alternative hypothesis is guiltiness)

POWER!

o The power of a test is defined as 1 – β = 1 – Pr(type II error) = Pr(reject H0 | H1 is true)

o Aim of hypothesis testing is to use statistical tests that make α and β as small as

possible.

One-sample Hypothesis Testing for Mean of Normal Distribution

o X bar is often used because tests based on sample means have the highest power

among all tests with a given type I error α.

Recall E(X bar) = µ

It is an unbiased estimator, and for normal data it has the smallest variance of any unbiased estimator of µ (so great)

o One sided test – values of parameter being studied can be either greater than or less

than values under H0, but cannot be both

o V IMPORTANT SLIDE!!

Tips and tricks – write out every value you are given and the corresponding variable, then plug it in as necessary. Find your t value (value of test statistic) and the corresponding critical value for your sample and test (t_{n-1, α}); for a lower one-sided test (H1: µ < µ0), if your t value is less than the critical value, you reject the null hypothesis.
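A rough sketch of that recipe for a lower one-sided test, using only the standard library. The sample values and µ0 are invented, and the critical value is typed in from a t table (t_{7, 0.05} = -1.895) rather than computed.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """t = (x bar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    return (mean(sample) - mu0) / standard_error


# Hypothetical measurements; H0: mu = 5.0 versus H1: mu < 5.0
sample = [4.8, 5.1, 4.6, 4.9, 4.7, 5.0, 4.5, 4.8]
t_value = one_sample_t(sample, mu0=5.0)

# Lower-tail critical value for df = 7 and alpha = 0.05, read from a t table
critical_value = -1.895

reject_null = t_value < critical_value
print(round(t_value, 3), reject_null)
```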

P value method for hypothesis testing

o P value is the α level at which we would be indifferent between accepting or rejecting

the null hypothesis; borderline between acceptance and rejection regions.

o Defined (for the lower one-sided test) as: p = Pr(t_{n-1} ≤ t)

o Area under the t distribution to the left of your observed test statistic t

o If null hypothesis is true, what is the likelihood we will see a value at least this extreme?

Small usually indicates you should reject the null hypothesis, large indicates you should fail to reject the null (which is not the same as accepting it)

Recall:

o Inference is made up of estimation and hypothesis testing

Estimation (e.g. µ):

Point estimate (x bar)

Confidence interval

Confidence level (1 – α)

Hypothesis Testing (e.g. µ):

Test statistic

Critical value or p value

Significance level (alpha [α])

P value can be thought of as the probability of observing such an extreme value as that

observed under the condition H0 is true.

o You need to remember context when applying p values (e.g. you could have highly

significant results, but if the results do not suggest a necessary change or problem, they

are not of scientific importance)

Two-Sided Alternatives (H1: µ ≠ µ0)

o Tests for alternatives on either side of the null mean; rejects H0 if the parameter is

greater than or less than the values under the null hypothesis

o Two-sided may be more conservative (you do not need to know which way the mean may differ from the null hypothesis); one sided may have higher power.

Decision to do one or two-sided analyses must be made BEFORE data collection.

o In some applications, if variance is known and sample size is large (n > 200), you can approximate with a normal distribution and translate to z values, which is advantageous as no degrees of freedom are needed.

If the hypothesized value µ0 does not fall within the confidence interval, REJECT THE NULL. By definition, the confidence interval does not contain any value of µ0 for which H0 would be rejected (at the matching two-sided α).
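A small sketch showing that the two-sided p value and the confidence interval give the same decision. It assumes a z test with known sigma; the function name and the numbers are made up.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Return the z statistic, two-sided p value, and (1 - alpha) CI."""
    standard_error = sigma / sqrt(n)
    z = (x_bar - mu0) / standard_error
    # Two-sided: count the area in both tails beyond |z|
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (x_bar - z_crit * standard_error, x_bar + z_crit * standard_error)
    return z, p_value, ci


# Hypothetical study: H0: mu = 100 versus H1: mu != 100
mu0 = 100.0
z, p_value, ci = two_sided_z_test(x_bar=103.0, mu0=mu0, sigma=15.0, n=225)

reject_by_p = p_value < 0.05
reject_by_ci = not (ci[0] <= mu0 <= ci[1])
print(round(z, 2), round(p_value, 4), reject_by_p, reject_by_ci)
```

Both checks agree because the interval is built from the same standard error and the same z cutoff as the test.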

Power of a test: Power = 1 – β = Pr{rejecting null hypothesis | null hypothesis is false}

o Usually power is calculated before a study is started (80% is good), sometimes there is a

pilot study.

o We assume standard deviation is known (can be projected w/o any data) and base

power calculations on one-sample z test.
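One way to sketch that power calculation, assuming a one-sided one-sample z test with known sigma. The formula used is power = Φ(|µ1 – µ0|·√n/σ – z_{1-α}); the function name and the planning numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a one-sided one-sample z test when the true mean is mu1."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # How many standard errors separate the null mean from the true mean
    shift = abs(mu1 - mu0) * sqrt(n) / sigma
    return NormalDist().cdf(shift - z_alpha)


# Hypothetical planning values: detect a 5-unit shift with sigma = 15
for n in (25, 50, 100):
    print(n, round(power_one_sided_z(mu0=100, mu1=105, sigma=15, n=n), 3))
```

Power rises with n and with the distance between µ0 and µ1, which is why it is worked out before the study to pick a sample size.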

Effect Measure (e.g. RR, OR)

o Null Hypothesis is always RR or OR = 1 (Pr(event | A) = Pr(event | B))

o The further the true effect is from the null, the easier it is to detect (higher power)
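A short sketch of the two effect measures from a 2x2 table. The cell labels (a, b, c, d) and the counts are assumptions made up for illustration.

```python
def risk_ratio(a, b, c, d):
    """RR = Pr(event | A) / Pr(event | B).

    a = group A with event, b = group A without event,
    c = group B with event, d = group B without event.
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """OR = (odds of event in A) / (odds of event in B) = ad / bc."""
    return (a * d) / (b * c)


# Hypothetical counts: 20 of 100 in group A and 10 of 100 in group B have the event
print(risk_ratio(20, 80, 10, 90), odds_ratio(20, 80, 10, 90))

# Equal risks in both groups give RR = OR = 1, the null value
print(risk_ratio(10, 90, 10, 90), odds_ratio(10, 90, 10, 90))
```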

At a fixed set of parameters, two-sided will always require a larger sample size than one-sided

Sample size also has a bearing on CI width, denoted as L

n = (z_{1-α/2}^2 × s^2) / d^2; L = 2d

Margin of error: (d)

o The more precision you want, the more people you need and the smaller your d will be
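A quick sketch of the sample size formula n = z_{1-α/2}^2 × s^2 / d^2, rounded up to a whole person. The function name and the example values of s and d are assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_margin(s, d, alpha=0.05):
    """Smallest n giving a (1 - alpha) CI with margin of error d (width L = 2d)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z ** 2) * (s ** 2) / (d ** 2))


# Hypothetical planning values: s = 15
print(sample_size_for_margin(s=15, d=2))  # interval width L = 4
print(sample_size_for_margin(s=15, d=1))  # interval width L = 2
```

Halving d roughly quadruples n, because d is squared in the denominator.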

Inferences about σ^2 can be important

o Because the sampling distribution of s^2 is skewed, you need to use a skewed distribution (chi square)

I like this example

Look at the summary; make sure you know things

Each of the two populations has its own true mean (µ1 and µ2), which are usually estimated by x bar 1 and x bar 2

Depending on the study design, different tests are available (e.g. you can have a longitudinal, or

follow-up study (uses paired sample design), or a cross-sectional study (uses independent

sample design))

o Paired samples – each point in the first sample is matched to a single point in the second

o Independent – data points in one sample are completely unrelated to those in the

second

Paired uses paired t-test:

o So essentially you combine the two into one, you really only care about the differences
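A minimal sketch of the "combine the two into one" idea: take the within-pair differences and run a one-sample t test on them against 0. The before/after values are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t statistic on the differences (second - first), with df = n - 1."""
    diffs = [s - f for f, s in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical follow-up study: same five people measured twice
before = [120, 130, 125, 140, 135]
after = [115, 128, 120, 138, 130]

t_value, df = paired_t(before, after)
print(round(t_value, 3), df)
```

The t value is then compared with a t table at n – 1 degrees of freedom, where n is the number of pairs.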

MISSING 2 LECTURES. FILL IN YOUR NOTES AFTER MIDTERMS AND COMPS ISIS LUNSKY

The denominator is the standard error of… x1 – x2

You need to know what estimator you’re using (if you know that with the utmost certainty, hard

to get confused. Otherwise, easy)

Your parameter is your unknown (numerator); the standard error of the estimator is the

denominator

Make a chart of parameters, corresponding estimators, and how to calculate standard error of

these estimators

s^2 is sample variance, s is sample standard deviation (these estimate σ^2 and σ in the population)

So now we have 2 variances in 2 different populations – how do we take a weighted average?

o s_pooled^2 = ((n1 – 1)*s1^2 + (n2 – 1)*s2^2) / (n1 + n2 – 2)
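A sketch of the pooled variance and the equal-variance two-sample t statistic that uses it. The standard error of x bar 1 – x bar 2 is taken as sqrt(s_pooled^2 × (1/n1 + 1/n2)); the summary numbers are hypothetical.

```python
from math import sqrt


def pooled_variance(n1, s1_sq, n2, s2_sq):
    """Weighted average of the two sample variances, weights n1 - 1 and n2 - 1."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


def two_sample_t_equal_var(x_bar1, s1_sq, n1, x_bar2, s2_sq, n2):
    """t statistic for H0: mu1 = mu2, assuming equal population variances."""
    sp_sq = pooled_variance(n1, s1_sq, n2, s2_sq)
    standard_error = sqrt(sp_sq * (1 / n1 + 1 / n2))
    return (x_bar1 - x_bar2) / standard_error, n1 + n2 - 2


# Hypothetical summaries for two independent samples
sp_sq = pooled_variance(n1=10, s1_sq=4.0, n2=15, s2_sq=9.0)
t_value, df = two_sample_t_equal_var(12.0, 4.0, 10, 10.0, 9.0, 15)
print(round(sp_sq, 4), round(t_value, 3), df)
```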

Note above – method of calculating p value changes depending on where your t value lies.

Variance of phat = pq/n; SE(phat) = sqrt(pq/n)

Degrees of freedom = n1 + n2 – 2

If 0, the value Δµ takes under the null hypothesis, is within your confidence interval, there is no reason to reject the null hypothesis

ALWAYS START WITH what is unknown? What do you need to make an inference about?

Kind of silly note: the lower your alpha (significance level in hypothesis testing, level of type I

error one is willing to make), the bigger your sample size needs to be to have a p value below

your alpha

Alpha in terms of CI’s: with α = 0.05, 95% (that is, 1 – α) of confidence intervals should include the true value

Looking at different distributions:

o Discrete:

B – n, p

P – λ, µ

o Continuous:

Z – (x-µ)/σ, X~N(µ, σ^2)

T – df

χ^2 – df

F – 2 different df’s (one for each sample)

Looking at the F table at the back of the textbook, across the top is numerator df, down the side is denominator df

Think

o What is your hypothesis?

o What are you testing (i.e. what test statistic are you using?)

o Ok, calculating test statistic and/or p value (tbh, they’re basically the same, one is the

probability of the other)

Note: the null hypothesis ALWAYS is the one that says there is no difference – not the one

against your own hypothesis

Df assuming variances are the same = n1+n2-2

Now it’s that complex box thing. The double prime comes from rounding d’ down to the nearest

integer
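The "complex box thing" is not reproduced in these notes. Assuming it is the usual Satterthwaite approximation for the degrees of freedom when variances are unequal, a sketch would look like this, with d' the raw value and d'' its rounded-down version. The function name and the example numbers are made up.

```python
from math import floor


def satterthwaite_df(s1_sq, n1, s2_sq, n2):
    """Return (d_prime, d_double_prime) for the unequal-variance t test.

    Assumed formula (Satterthwaite approximation):
    d' = (s1^2/n1 + s2^2/n2)^2 / [(s1^2/n1)^2/(n1-1) + (s2^2/n2)^2/(n2-1)]
    d'' = d' rounded down to the nearest integer.
    """
    a = s1_sq / n1
    b = s2_sq / n2
    d_prime = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return d_prime, floor(d_prime)


# Hypothetical samples with clearly different variances
d_prime, d_double_prime = satterthwaite_df(s1_sq=4.0, n1=10, s2_sq=16.0, n2=8)
print(round(d_prime, 3), d_double_prime)
```

Here d'' comes out well below n1 + n2 – 2 = 16, the df that would be used if the variances were assumed equal.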

Other side of the coin – CI’s for 2 sample w/ unequal variances

If 0 is within the confidence interval (representing no difference) you have no reason to reject

the null

There is a lecture missing – fill it in, Ice.

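Zcorr is not correlation, but continuity correction, Ice (that part that looks like 1/2n) – it corrects for approximating a discrete (binomial) count with a continuous normal curve, which matters most when n is small.

o Not the same thing as the finite population correction (FPC), which adjusts the standard error when sampling without replacement from a small population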
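A sketch of what the 1/(2n) term does in a one-sample test for a proportion. It assumes the common form z_corr = (|p hat – p0| – 1/(2n)) / sqrt(p0·q0/n); the lecture's exact formula is not in these notes, and the counts below are invented.

```python
from math import sqrt
from statistics import NormalDist


def z_proportion(x, n, p0, continuity_correction=True):
    """z statistic for H0: p = p0, optionally with the 1/(2n) correction."""
    p_hat = x / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    distance = abs(p_hat - p0)
    if continuity_correction:
        # Shrink the distance by half a count on the proportion scale
        distance = max(distance - 1 / (2 * n), 0.0)
    return distance / standard_error


# Hypothetical data: 60 events in 100 trials, H0: p = 0.5
z_plain = z_proportion(60, 100, 0.5, continuity_correction=False)
z_corr = z_proportion(60, 100, 0.5)
p_two_sided = 2 * (1 - NormalDist().cdf(z_corr))
print(round(z_plain, 2), round(z_corr, 2), round(p_two_sided, 4))
```

The corrected statistic is always a little smaller, so the corrected test is slightly more conservative.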

Three steps to hypothesis testing: 1) state your null and alternative hypotheses, 2) run it

through your calculation to get a z/t/f/etc value, 3) calculate your critical value or p value to

decide to reject null or not to
