
3.1 Paired Samples (Dependent, Matched Pairs) t-Test


Used when comparing two dependent groups: the same group or person is measured before and after treatment with the X-variable. Like an independent two-sample t-test, the two
populations represent two values of the X (explanatory) variable, but these two values come from a single set of subjects. To compare two paired variables, we must reduce the "two
sample" situation to a single-sample situation. This is done with Difference Scores (Before-treatment data − After-treatment data), calculated for each subject/pair. This gives us a
single mean to use to make inferences.
Steps for Paired T-test: 1) State null and alternative hypotheses using the difference mean, denoted µ(d): H(0): µ(d) = 0; H(A): µ(d) < 0, µ(d) > 0, or µ(d) ≠ 0. 2) Check conditions and calculate the
t-statistic. 3) Find the P-value. 4) State a conclusion.
Conditions for Paired T-test: 1) The sample of differences is random. 2) It falls under one of three situations: differences vary normally with a small sample size; differences vary normally with a large
sample size; or differences do not vary normally but the sample size is large.
Why is an independent samples t-test not appropriate? The independent t-test compares the means of two separate populations. It cannot be used for a paired sample, since both sets of scores come
from the same subjects; instead, the difference scores create a single mean from which the t-statistic is calculated and inferences are made.
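The paired procedure above (reduce to difference scores, then test one mean) can be sketched in Python. The before/after data below are made up purely for illustration:

```python
import math
from statistics import mean, stdev

# Hypothetical before/after scores for 8 subjects (illustrative data only).
before = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 10.0, 12.0]
after = [14.0, 15.0, 13.0, 17.0, 14.0, 18.0, 11.0, 15.0]

# Reduce the "two sample" situation to a single sample of differences,
# Before - After, per the notes above.
diffs = [b - a for b, a in zip(before, after)]

d_bar = mean(diffs)               # the single mean used for inference
s_d = stdev(diffs)                # SD of the differences
se = s_d / math.sqrt(len(diffs))  # standard error of the mean difference

# t-statistic for H0: mu_d = 0
t_paired = d_bar / se
```

A p-value would then come from a t-distribution with n − 1 = 7 degrees of freedom; the sketch stops at the test statistic since the notes cover the p-value step separately.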

Model Comparison Framework (Paired Samples)


Model Statement: a within-groups design can reduce error variation.
Between-Groups design: Score = mean + effect of group + error
Within-Groups design: Score = mean + condition + person + error
There are two approaches for simulating a paired-samples comparison. Approach 1: Assume the restricted (null)
model — if µ(diff) = 0, what is the likelihood of getting a sample mean as extreme as yours? Approach 2: Assume the complete model — if µ(diff) = my observed x̄, what is the likelihood
of getting a sample mean as extreme as the null?
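Approach 1 (assume the restricted model) can be sketched with a small resampling simulation. The difference scores below are made up for illustration, and centering-then-resampling is one common way to impose the null:

```python
import random
from statistics import mean

random.seed(1)

# Hypothetical observed difference scores (illustrative only).
diffs = [-2.0, 0.0, -2.0, -3.0, -1.0, -2.0, -1.0, -3.0]
obs_mean = mean(diffs)

# Restricted model: force mu_diff = 0 by centering the differences,
# then resample to see how extreme our observed mean is under the null.
centered = [d - obs_mean for d in diffs]
null_means = []
for _ in range(5000):
    sample = [random.choice(centered) for _ in diffs]
    null_means.append(mean(sample))

# Two-sided p-value: proportion of simulated null means at least as
# extreme as the observed mean difference.
p_value = sum(abs(m) >= abs(obs_mean) for m in null_means) / len(null_means)
```

Approach 2 is the mirror image: center the simulation at the observed x̄ instead, and ask how often a mean as extreme as the null value of 0 appears.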

Statistical/Practical Significance & Effect Size: Statistical significance is concerned with whether a research result is due to chance or sampling variability (shows the
mathematical probability that a relationship between two or more variables exists); practical significance is concerned with whether the result is useful/applicable in the real
world.
Effect size measures the magnitude of the change in the dependent variable due to the independent variable. It emphasizes the size of the difference rather than
confounding it with sample size. It measures either the magnitude of the difference between what you observed and the null (e.g. Cohen's d), or how well your model explains
variation (e.g. eta-squared, R-squared). Note: the effect size is just the standardized mean difference between the two groups. Where statistical significance
tells you whether something differs from the status quo, effect size measures the practical importance of that difference.
Cohen's d (Standardized Mean Difference): d = (x̄1 − x̄2)/SD. Effect size guideline → less than .20 (negligible), .20 to .50 (small), .50 to .80 (moderate), greater than .80
(large). Eta² = SS Model/SS Total (proportion of variance explained by the model, for a categorical → quantitative (C→Q) analysis). R² = proportion of variability explained by the model, for a quantitative → quantitative (Q→Q) analysis.
^^^Both "live in" the sample distribution: they are calculations that give us a statistic (a sample statistic from the sample distribution).
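Both effect sizes above can be computed directly. The two groups below are made-up illustrative data, and using the pooled SD as the denominator of Cohen's d is one common convention:

```python
import math
from statistics import mean, stdev

# Hypothetical scores for two independent groups (illustrative only).
group1 = [5.0, 7.0, 6.0, 8.0, 9.0]
group2 = [4.0, 5.0, 5.0, 6.0, 5.0]

# Cohen's d: standardized mean difference, (xbar1 - xbar2) / SD,
# here using the pooled sample SD.
n1, n2 = len(group1), len(group2)
s1, s2 = stdev(group1), stdev(group2)
pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
cohens_d = (mean(group1) - mean(group2)) / pooled_sd

# Eta-squared: SS Model / SS Total (proportion of variance explained, C->Q).
scores = group1 + group2
grand = mean(scores)
ss_total = sum((y - grand) ** 2 for y in scores)
ss_model = n1 * (mean(group1) - grand) ** 2 + n2 * (mean(group2) - grand) ** 2
eta_sq = ss_model / ss_total
```

Note how neither quantity involves the sample size in the way a test statistic does: doubling n (with the same means and SDs) changes t but leaves d and eta² essentially alone.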
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
3.1 Comparing Groups
Four approaches for comparing two independent groups: In all approaches, ask: are the differences between groups due to natural sampling variation? If no, reject the null.
1) Independent T-Test: Use observed group means and standard deviations to calculate the standard error of the
sampling distribution of differences of means, centered on the null hypothesis. Standard Error: conceptually, this is the
standard deviation of the sampling distribution you would get by recording the difference between two independent
sample means many times.
Steps for Independent T-Test: 1) State null [H(0): µ1 − µ2 = 0] and alternative [H(A): µ1 − µ2 < 0, > 0, or ≠ 0]. 2) Check
t-test conditions and summarize the data. 3) Find the P-value. 4) State a conclusion using the Test Statistic = the measure of evidence stored in
the data against the null. It measures, in SEs, how different our data is from the null.
Conditions for Independent T-Test: 1) Samples are independent. 2) Both populations are normal and both samples are
random; or the population is known to be non-normal, but the random sample's size is large (>30).
t = [(mean difference under the complete model) − (mean difference under the null model)] / Standard Error of the mean difference.
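The steps above can be sketched directly. The data are hypothetical, and the unpooled (per-group variances) standard error is one common way to estimate SE of the difference:

```python
import math
from statistics import mean, stdev

# Hypothetical samples from two independent groups (illustrative only).
group1 = [5.0, 7.0, 6.0, 8.0, 9.0]
group2 = [4.0, 5.0, 5.0, 6.0, 5.0]

n1, n2 = len(group1), len(group2)
mean_diff = mean(group1) - mean(group2)

# Unpooled standard error of the difference of two independent means.
se_diff = math.sqrt(stdev(group1) ** 2 / n1 + stdev(group2) ** 2 / n2)

# t = (observed mean difference - null difference of 0) / SE of difference:
# how many SEs our data sits from the null.
t_ind = (mean_diff - 0.0) / se_diff
```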
2) Simulation: a way to hypothesis test — generate a sampling distribution to show how much
sampling error is due to variation in our parameter estimates (each sample). Use the
"pooled" standard deviation to create a single population model, then simulate many
pairs of samples to create a null sampling distribution of differences.
Steps for testing two groups using simulation (hypothesis testing):
Simulation using the restricted model: 1) Model the sampling distribution of the restricted
model (the null hypothesis). 2) Get a test statistic from your data that shows how different
the groups are in terms of standard error units. 3) Look at where that test statistic falls on
the restricted-model sampling distribution. 4) Determine if it falls within a critical region
and make your decision.
Simulation using the complete model: 1) Model the sampling distribution of the complete
model (the alternative hypothesis, which uses your sample statistics). 2) Get a test statistic
from your data that shows how different the groups are in terms of standard error units.
3) Look at where that test statistic falls on that complete-model sampling distribution.
4) Determine if it falls within a critical region and make your decision.
Assuming the restricted model: as shown above, make a sampling distribution of the
difference of two means (centered at the null). Find t(ind), examine where it falls on the
model, and determine significance based on its location relative to t(crit).
Assuming the complete model: as shown above, make a sampling distribution of the
difference of two means (centered at the observed difference). Find t(ind) and do the same
as before.
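The restricted-model simulation can be sketched as follows. The data are made up for illustration; the single population model uses the grand mean and pooled SD with an assumed normal shape, per the description above:

```python
import math
import random
from statistics import mean, stdev

random.seed(2)

# Hypothetical samples from two independent groups (illustrative only).
group1 = [5.0, 7.0, 6.0, 8.0, 9.0]
group2 = [4.0, 5.0, 5.0, 6.0, 5.0]
obs_diff = mean(group1) - mean(group2)

# "Pooled" SD -> a single population model for the restricted (null) case.
n1, n2 = len(group1), len(group2)
pooled_sd = math.sqrt(
    ((n1 - 1) * stdev(group1) ** 2 + (n2 - 1) * stdev(group2) ** 2)
    / (n1 + n2 - 2)
)
grand = mean(group1 + group2)

# Simulate many pairs of samples from that one population to build a
# null sampling distribution of mean differences.
null_diffs = []
for _ in range(5000):
    s1 = [random.gauss(grand, pooled_sd) for _ in range(n1)]
    s2 = [random.gauss(grand, pooled_sd) for _ in range(n2)]
    null_diffs.append(mean(s1) - mean(s2))

# Where does the observed difference fall on that null distribution?
p_value = sum(abs(d) >= abs(obs_diff) for d in null_diffs) / len(null_diffs)
```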
Bias vs Error: Error can be defined as any difference between the average values that were obtained through a study and the true average values of the population being targeted. Whereas error makes up all flaws in a study's results, bias refers only to error
that is systematic in nature. Statistical Models: Models attempt to explain total variation (explained, unexplained, random variation), can be applied to DGP.
Model Comparison Framework (Independent Samples) To compare models we estimate parameters from data, then construct a sampling distribution and use it to see if we
can reject the restricted model (using confidence interval or null hypothesis test).
Crum & Langer/Simulation Models [Note: µ (the population parameter) is estimated by the sample mean difference] [Note: Y(i) is the i-th score on the dependent variable]
1) Complete model has higher ETA^2
2)Complete model always accounts for more variation.
3)Complete model is not as simple/easy to work with as the Restricted model.
4)Any variation from restricted model is due to sampling variation (random
chance).
NOTE: Restricted model in Simulation: Yi=Grand Mean + Error (no effect of group)
Complete Model in Simulation: Yi=Grand Mean +Effect of Group + Error
3) Randomization: Tests how much of the difference between groups is due to natural sampling variation. This approach may produce a mean difference between two
independent groups achieved just from random sampling variation or "noise" in the data. Thus, we don't use an assumed population; we use our data and collect many resamples that
give us a sampling distribution of the mean difference (from sampling alone) → this distribution serves as our null sampling distribution.
NOTE: THE LAST THREE APPROACHES create sampling distributions for putting the MEAN DIFF between groups in context. You can also calculate a confidence
interval for each approach: identify t(crit) for each, and determine whether the null value falls outside the interval's boundaries.
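The randomization approach can be sketched by shuffling the observed scores between the two group labels. The data below are made up for illustration:

```python
import random
from statistics import mean

random.seed(3)

# Hypothetical samples from two independent groups (illustrative only).
group1 = [5.0, 7.0, 6.0, 8.0, 9.0]
group2 = [4.0, 5.0, 5.0, 6.0, 5.0]
obs_diff = mean(group1) - mean(group2)

# No assumed population: shuffle the observed scores between the two
# groups many times to get a null sampling distribution of differences.
combined = group1 + group2
n1 = len(group1)
null_diffs = []
for _ in range(5000):
    random.shuffle(combined)
    null_diffs.append(mean(combined[:n1]) - mean(combined[n1:]))

# Proportion of shuffled differences as extreme as the observed one.
p_value = sum(abs(d) >= abs(obs_diff) for d in null_diffs) / len(null_diffs)
```

The design choice here is the contrast with the simulation approach above: randomization reuses the actual scores rather than drawing from an assumed population model.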
4) ANOVA: One-Way Analysis of Variance: Uses a RATIO OF VARIANCES to calculate the F-Statistic/Ratio, which is used to compare more than two groups. The F-value
measures the extent to which the difference among sampled group means dominates the usual "null" variation within sample groups. To calculate the F-value, variance must be
partitioned (the test statistic is built from variances rather than from a difference of means).
ANOVA Partitions Variance into three components. These are used to construct an ANOVA table.
1) Between-groups SS (SS Model = sum, over groups, of [group mean − grand mean]², weighted by group size). High value preferred.
2) Within-groups SS (SS Error = sum of [each individual score − its group mean]²). Low value preferred.
3) Total SS (SS Total = sum of [each individual score − grand mean]²). Total variability of the dependent variable.
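The three sums of squares above can be computed directly; the identity SS Total = SS Between + SS Within falls out of the partition. The three groups below are made-up illustrative data:

```python
from statistics import mean

# Hypothetical scores in three groups (illustrative only).
groups = {
    "A": [3.0, 4.0, 5.0],
    "B": [6.0, 7.0, 8.0],
    "C": [6.0, 5.0, 7.0],
}

all_scores = [y for g in groups.values() for y in g]
grand = mean(all_scores)

# Between-groups SS: each group's mean vs the grand mean, weighted by n.
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
# Within-groups SS: each score vs its own group mean.
ss_within = sum((y - mean(g)) ** 2 for g in groups.values() for y in g)
# Total SS: each score vs the grand mean.
ss_total = sum((y - grand) ** 2 for y in all_scores)
```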

F-Statistic Formula: Recall: for ANOVA, we estimate two components of variance, corresponding to the two parts of each score's deviation (from its group mean, and from
the group mean to the grand mean). These two components are represented by Mean Squares (MS). Thus, unlike other approaches, ANOVA has TWO degrees of freedom: (# of groups − 1)
and (sample size − # of groups). To calculate a variance (s²), divide the Sum of Squares (SS) by its Degrees of Freedom. Note: n = total number of scores, j = number of groups.
Variance among sample means: MS Between (treatment) = SS Between/(j − 1). Variance within groups: MS Error = SS Error/(n − j). F = MS Between/MS Error.
F-Ratio/Statistic interpretation: if MS Between is high, F will be high; if MS Error is high, F will be lower. In terms of P-value: the higher the F, the lower the P-value.
What values does F ratio take when null is true/not true? F-statistic is a ratio of two quantities that are expected to be roughly equal under the null hypothesis, which
produces an F-statistic of approximately 1. If the null is not true your F-statistic will be higher indicating a group effect, but to determine if the effect is significant and not just
sampling variation we need to conduct an F-Test to determine the P-value!
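The mean squares and F-ratio follow directly from the SS partition; with the same made-up three-group data as above:

```python
from statistics import mean

# Hypothetical groups (same illustrative data as the SS example above).
groups = [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0], [6.0, 5.0, 7.0]]
j = len(groups)                    # number of groups
n = sum(len(g) for g in groups)    # total number of scores

grand = mean([y for g in groups for y in g])
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ss_error = sum((y - mean(g)) ** 2 for g in groups for y in g)

# Mean squares: each SS divided by its own degrees of freedom.
ms_between = ss_between / (j - 1)  # df = j - 1
ms_error = ss_error / (n - j)      # df = n - j

# F-ratio: between-groups variance over within-groups variance.
f_stat = ms_between / ms_error
```

Under the null both mean squares estimate the same quantity, so F hovers around 1; here the group effect pushes F well above 1, which is what the F-test then converts to a p-value.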
F-test/F-Distribution: Construct a sampling distribution of the F-statistic under the restricted model. This can be done with simulation, randomization, or a probability
model → which ends up creating the F-distribution. The F-distribution depends on how many groups there are and how large each sample is. The F-value differs for each individual
sample collected (this is what allows us to create the sampling distribution). We then determine how extreme the observed sample F-ratio/statistic is by calculating the
probability (P-value) of getting an F-ratio as extreme or more extreme than ours IF THE RESTRICTED MODEL IS TRUE.
Shape of the F distribution in comparison to the Z/T distributions: All three are probability density functions. The t-distribution uses degrees of freedom (use it in general). The z-distribution
doesn't work well with small sample sizes (only use it when n > 30), and we never really know the SD of the population (only use it when the population SD is known). The t-distribution is based
on using the sample standard deviation as an estimate of the population standard deviation. Approximating the population standard deviation with a sample standard deviation
means the sampling distribution is going to have more spread and is going to be affected by the sample size. Hence for any test statistic value (t = ±2 vs z = ±2) there will be a
higher proportion outside those endpoints for the t-distribution. However, as the sample size gets larger, the t-distribution approximates the z-distribution. The F-distribution,
unlike the z and t distributions, is always non-negative: it takes values from 0 upward. The shape of the F distribution changes as a function
of the degrees of freedom in the numerator and denominator of the F-ratio.
How is the F statistic related to the T statistic? When comparing two groups, F is the t-statistic squared. Also, both are standardized values that measure how far the sample result is from the null.
How to use ANOVA (F-Test) for multiple groups (Steps): 1) State the null hypothesis [H(0): µ1 = µ2 = µ3 = µ4] and the alternative [H(A): not all means are equal — e.g. (µ1 ≠ µ2 ≠ µ3 ≠ µ4),
(µ1 = µ2 = µ3 ≠ µ4), (µ1 = µ2 ≠ µ3 = µ4), (µ1 = µ2 ≠ µ3 ≠ µ4)]. Since "µ" denotes a population mean, each sample mean is denoted Ȳ1, Ȳ2, Ȳ3, Ȳ4. 2) Check F-test
conditions and find the F-value. 3) Find the P-value. 4) Make conclusions.
F-Test Conditions: 1) The samples drawn from each population are independent. 2) The response variable varies normally within each population being compared (a large sample size is not
needed). 3) The populations all have the same standard deviation. Check using the sample standard deviations: the ratio [largest SD/smallest SD] should be < 2.
ANOVA Table: Displays how variance is partitioned. It tells you everything you need to know to calculate F-stat and if it is statistically significant using the P-value.
Measures of Spread: Range → distance between min and max (Max − Min). IQR → contains the middle 50% of the data.

________________________________________________________________________________________________________________________________________
2.1 Z-score (standard normal distribution): the number of SDs a data point is from the mean. It is a probability model that serves as a measure of distance, based on "any value minus the mean". It is used to
calculate the probability of a score occurring within our normal distribution and enables us to compare two scores that come from different normal distributions. You divide the deviation by the SD to quantify how many SDs
above/below the mean the value falls. The resulting unit is: SDs from the mean. For a precise specific value, the probability of getting that exact value is zero! The probability of a score falling within a certain
interval is not zero, and the area under the normal curve represents the probability of a given region. STANDARD DEV: distance of an observation from the mean. Use the mean/SD as measures of center and spread if the distribution is symmetric without outliers.
Approximate percentage of scores falling within 1, 2, and 3 standard deviations of the mean: 68%, 95%, and 99.7%. Sum of Squares → not a good estimation model; it varies with sample size!
When a distribution can/should be modeled as normal → N is large enough, randomly sampled, no measurement error, no skewness.
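The "compare two scores from different normal distributions" use of z-scores can be sketched quickly; the two classes below are made-up illustrative data:

```python
from statistics import mean, pstdev

# Hypothetical exam scores from two different classes (illustrative only).
class_a = [70.0, 75.0, 80.0, 85.0, 90.0]
class_b = [60.0, 65.0, 70.0, 75.0, 80.0]


def z_score(x, data):
    # Deviation from the mean divided by the SD -> "SDs from the mean".
    return (x - mean(data)) / pstdev(data)


# The same raw score of 85 has a different relative standing in each class.
z_a = z_score(85.0, class_a)
z_b = z_score(85.0, class_b)
```

Here `pstdev` treats each class as its own distribution; with sample data you might prefer `stdev`, which divides by n − 1.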
2.2 Sampling Distributions: the distribution of the values of a certain statistic across all the samples taken. It serves as a model of sampling variation. It is used for statistical inference, mainly because it allows analytical
considerations to be based on the sampling distribution of a statistic, rather than on the joint probability distribution of all the individual sample values. Three methods used to model a sampling distribution
are → Simulation, Mathematical Model, and Bootstrapping. Note: any sample statistic will have a sampling distribution, which can often be modeled by a specific mathematical distribution.
SD of a Sampling Distribution vs Sampling Distribution of an SD → the former is the standard deviation of the statistic (e.g. the means) across all the samples taken. The latter is the distribution of the SDs computed from each sample.
Central Limit Theorem: 1. As sample size increases, shape of sampling distribution of the means approaches normal. 2. Mean of the sample of means in a theoretical sampling distribution is equal to the population
mean. 3. As sample size increases the standard deviation of the sampling distribution of means decreases. (Standard deviation of the sampling distribution of means, also called the Standard Error, is equal to the
population standard deviation divided by the square-root of the sample size.)
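The three CLT claims above can be checked by simulation. The population parameters below (µ = 50, σ = 10) are made up; the point is that the SD of the simulated sample means should land near σ/√n:

```python
import math
import random
from statistics import mean, pstdev

random.seed(4)

# Assumed population model: normal with illustrative mu and sigma.
mu, sigma, n = 50.0, 10.0, 25

# Build a sampling distribution of the mean from many samples of size n.
sample_means = [
    mean(random.gauss(mu, sigma) for _ in range(n)) for _ in range(4000)
]

# CLT claim 2: the mean of the sample means is close to mu.
center = mean(sample_means)
# CLT claim 3: the SD of the sampling distribution (the Standard Error)
# is close to sigma / sqrt(n) = 2.0 here.
se_observed = pstdev(sample_means)
se_theory = sigma / math.sqrt(n)
```

Rerunning with a larger n shrinks `se_observed`, which is claim 3 in action.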
Which model to use in a given situation → Sample Distribution: one random sample taken; gives parameter estimates. Sampling Distribution: many samples taken; gives the probability of getting a certain sample statistic; can
be used to create confidence intervals and run hypothesis tests. Population Distribution: no mention of samples — all men, all women, etc. Modeling Variation: In the population → x = mean (µ) + error (σ). In a parameter estimate (note
that you can switch the parameter to a proportion so everything becomes p̂) → Sample Mean (x̄) = Mean of the sampling distribution of means (µ of x̄) + error (σ of x̄). In a parameter estimate, swap SD with SE!
Standard Error Calculation: error around the mean of the population is modeled by the SD. Error around an estimate of a parameter is modeled by the Standard Error: the standard deviation of the sampling distribution of the
means/proportions. For a mean, SE = σ/√n; for a proportion, SE = √(p(1 − p)/n).

2.3 Estimation/Confidence Intervals: Can be estimated for the difference between two independent groups using simulation, randomization, and t-statistic.
Point Estimate: a single value calculated from sample data, given as an estimate of a population parameter. For the population mean, the sample mean is the best point estimator. For categorical variables, the sample
proportion (p̂) is the best point estimator of the population proportion.
Confidence Interval: provides a range of values likely to contain the population parameter of interest and allows us to attach a level of confidence. Confidence intervals are constructed at a confidence level, such as
95%. This means that if the same population were sampled on numerous occasions and an interval estimate made on each occasion, the resulting intervals would bracket the true population parameter (mean or
proportion) in approximately 95% of the cases. Ex → 95% confident that the true population proportion is within the interval. A 95% confidence interval is roughly ±2 SE from the sample statistic because the value 1.96 is based on
the fact that 95% of the area of a normal distribution lies within 1.96 standard deviations of the mean; σ/√n is the standard error of the mean and, due to the CLT, is used for the construction of approximate 95%
confidence intervals. Note: in order to estimate the population proportion, p, with a 95% confidence interval with a margin of error of m, we need a sample size of (at least) 1/m².
Talking about Confidence Intervals → 95% confident that the true population proportion is within the interval. There is a 5% chance that it is not in the interval: 2.5% above and 2.5% below.
Confidence Interval Changes → the confidence level decreases as the precision (a smaller interval) with which the parameter is estimated increases.

CI for a mean: x̄ ± t*·(s/√n). CI for a proportion: p̂ ± z*·√(p̂(1 − p̂)/n). How can we mitigate the tradeoff between level of confidence and the precision of our interval? By increasing the sample size, which results in smaller
standard errors (decreased margin of error → the margin of error is half the length of a given interval; in the equations above, the part after the ± is the margin of error) and, therefore, in sampling distributions that are more tightly clustered around the population
mean. A more tightly clustered sampling distribution means our confidence intervals will be narrower and more precise. Note: to find the n that maintains a certain margin of error at ~95% confidence → n = ((2·SD)/margin of error)².
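An approximate 95% interval for a mean, and the sample-size formula from the note above, can be sketched as follows. The sample and the assumed SD are made-up illustrative values, and ±2 SE is used as the rounded stand-in for 1.96:

```python
import math
from statistics import mean, stdev

# Hypothetical sample (illustrative only).
sample = [48.0, 52.0, 51.0, 47.0, 53.0, 50.0, 49.0, 54.0, 46.0, 50.0]

xbar = mean(sample)
se = stdev(sample) / math.sqrt(len(sample))

# Approximate 95% CI: point estimate +/- 2 SE (2 is the rounded 1.96);
# the +/- part is the margin of error.
margin = 2 * se
ci = (xbar - margin, xbar + margin)

# Sample size needed to hold a target margin of error m at ~95% confidence:
# n = ((2 * SD) / m) ** 2, per the note above (assumed SD of 10, m of 1).
sd_assumed = 10.0
m_target = 1.0
n_needed = ((2 * sd_assumed) / m_target) ** 2
```

Note the quadratic cost: halving the margin of error quadruples the required sample size.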
____________________________________________________________________________________________________________________
Random Sample vs Random Assignment: Random sampling refers to how sample members (study participants) are selected from the population for inclusion in the study. Random assignment is an aspect of experimental design in which study participants are
assigned to the treatment or control group using a random procedure. Best way to link correlation to causation: a randomized, controlled, double-blind experiment. Experiments with more than one explanatory variable must have one treatment group for every combination of
categories of the explanatory variables. Example: 3 software × 2 classes = 6 groups. FINAL NOTE: Eta² is used for C→Q; R² is used for Q→Q.
Standard Deviation vs Z-score → the standard deviation is simply the square root of the variance, which brings it back to the original unit of measure. The Z-score, by contrast, is the number of standard deviations a given data point lies from the mean.
