
Psych 110 – Chapter 1: Introduction to Statistics

Statistics
- Makes sense of variability
- Takes into account individual differences
- Not only about math; also about decision-making and making inferences to solve
certain problems
- Values the scientific method

Karl Pearson
- “Statistics is the grammar of science”
- Biographer of Francis Galton (the one who introduced correlation)
- Used the letter “r” (for regression) as a tribute to Galton

Francis Galton
- Eugenics
o Philosophy in which people are selectively mated together to produce talented,
“fit”, offspring
- Created his own intelligence tests

Population
- Entire group of individuals of interest in research
- N

Sample
- Selected to represent the population in a research study
- Results are generalized to population
- n

CAUTION!
Based on the APA Publication Manual (7th edition)
- use N to designate the number of members in the total sample
- use n to designate the number of members in a limited portion or subsample of the total
sample

Types of Statistical Methods


1. Descriptive
- Organize and summarize data
- What are the characteristics of the set of participants?
- Ex. tables or graphs; descriptive values (mean and SD) used to summarize data
2. Inferential
- Use sample data to make general conclusions (inferences) about populations
- What are the characteristics of the population? / What can I conclude about the
population?
- Hypothesis testing; statistical tests

Describing Data
1. Parameter – a value, usually numerical, that describes a population
2. Statistic – a value, usually numerical, that describes a sample

Variables
- Characteristic or condition that can change or take on different values
- Research objective/question
o Relationship between variables for a specific group of individuals

Types of Variables
1. Discrete
- Separate, indivisible categories (class size, gender)
2. Continuous
- Infinitely divisible units (such as time)
o Real Limits
 Boundaries set for each measurement category or interval for continuous
variables
 Located exactly half-way (or half-unit) between adjacent categories
 Upper real limit
 Category + ½ unit
 Lower real limit
 Category – ½ unit
Ex. Person A scores 75.5 and Person B scores 76.5 (adjacent categories one unit apart); the real limit between them is 76, exactly half-way between the two
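A minimal Python sketch of the real-limit rule above (the function name and the 0.1-unit case are illustrative):

```python
# Real limits of a continuous measurement category:
# lower = category - 1/2 unit, upper = category + 1/2 unit
def real_limits(score, unit=1.0):
    half = unit / 2
    return score - half, score + half  # (lower real limit, upper real limit)

print(real_limits(76))       # (75.5, 76.5), matching the example above
print(real_limits(76, 0.1))  # (75.95, 76.05) when measuring to the nearest 0.1
```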
Measuring Variables
1. Nominal Scale
a. Unordered set of categories
b. Identified only by name
c. Determines similarities or differences among individuals (sorts them)
d. Categorical
e. No direction and magnitude
2. Ordinal Scale
a. Ordered set of categories (ranked)
b. Direction of difference between individuals (greater than or less than)
c. Only has direction
3. Interval Scale
a. Ordered series of equal-sized categories
b. Identify the direction and magnitude of a difference
c. Arbitrary location of the zero point
4. Ratio Scale
a. Is similar to an interval scale
b. Value of zero indicates the absence of the variable (absolute zero)
c. Identify the direction and magnitude of differences
d. Allow ratio comparisons of measurements

Research Methods
1. Experimental
2. Non-experimental
a. Qualitative research designs (ex. phenomenology, case studies)
b. Surveys and interviews (including FGDs)
c. Correlational designs
d. Quasi-experimental designs

Correlational Studies
- To determine presence of relationship between two variables
- To describe the relationship
- Simply observes the two variables as they exist naturally
- Cannot define causal relationship
Experiments
- Demonstrate a cause-and-effect relationship between two variables/allow to draw causal
inferences about behaviour
- Show that changing the value of one variable (independent variable) causes changes to
occur in an observed variable (dependent variable)
Four Basic Elements in an Experiment

Manipulation – researcher manipulates one variable (IV) by changing its value to create a
set of two or more treatment conditions (levels of IV)
Measurement – a second variable (DV) is measured to obtain a set of scores in each
treatment condition
Compare – scores in one treatment condition are compared with scores in another
treatment condition
Control – all other variables are controlled to ensure that they do not influence the two
variables being examined (EVs)

Independent Variable (IV)


- Variable that the experimenter intentionally manipulates or selects for inclusion in
the study
- Created/selected by the experimenter and not affected by anything else that happens
in the experiment
- An IV must have at least 2 levels in order to create at least 2 treatment conditions
- Subject variables (characteristics of the participants that occur naturally, such as
personality traits, and are not manipulated or created by the researcher) are also
treated as independent variables

Dependent Variable (DV)


- Outcome the experimenter tries to explain
- Particular behaviour we expect to change because of the experimental treatments
- The values/scores of the DV depend on the values/levels of the IV

**be specific when describing the IVs and DVs (say which phobia and its intensity if applicable)

Hypothetical Constructs/Concepts
- Unseen processes postulated to explain behaviour
- Can’t be observed directly (hence the need to operationally define them)
- Ex. stress, honesty, memory

Conceptual Definition
- “dictionary definition”

Operational Definitions
- Specifies the precise meaning of a variable within an experiment
- Defines a variable in terms of observable operation, procedures and measurements
- Describes the operations involved in manipulating or measuring the variables in an
experiment (procedures/instructions on how to carry out the experiment)
- Varies from one experiment to another
- Experimental OD (for an IV) or Measured OD (for a DV)

Extraneous Variable
- Any variable in a research study other than the specific variables being studied
- Factors that are not the focus of the experiment but can influence the findings
- Increases variability in the scores, making it more difficult to detect a significant
difference/treatment effect/interaction
- Ex. equipment failures, noises, inconsistent instructions

Confounding Variables
- When an EV changes systematically across the different treatment conditions of an
experiment
- When an EV varies in a way that is similar to the variable intentionally being studied
- Sabotages the experiment (the experiment is no longer internally valid; its
interpretation is in doubt)

The EV threatens the experiment’s internal validity as it changes systematically along with the
IV. The CV makes the experiment not internally valid.

Experiments
General variables to control
- Participant variable (ex. age, sex, IQ)
- Environmental variable (ex. noises, time of day, weather)
Control techniques
- Random assignments of participants
- Matching participants or environment through assignment
- Holding variables constant

Quasi-Experimental Designs
- Similar to experiments but lacks one or more of its essential elements
o No manipulation of variable to differentiate the groups
o No random assignment to treatment conditions
- Cannot demonstrate cause-and-effect relationships; simply demonstrate and describe
relationships, similar to correlational research
- Use quasi-independent variables (considered independent variables) to differentiate the
groups
o Pre-existing participant or environmental variables
o Time lapse (pre-post)
- Ex post facto studies and Non-equivalent groups
o Pre-existing participant or environmental variables differentiates the groups
o Cannot control assignment of participants to groups and cannot assure group
equivalence
- Pre-post Study
o Time passage used to differentiate groups
o Cannot control variables related to time
- Developmental research designs
o Examine changes in behaviour related to age
o cross-sectional studies and longitudinal studies
 Research on cross-sectional vs. longitudinal studies

Quasi-Experimental Designs (Myers & Hansen, 2012)


Ex Post Facto Studies
- Researcher systematically examines the effects of pre-existing participant or
environmental variables
- Ex. comparing extroverts and introverts on intelligence

Nonequivalent Groups Designs


- Researcher compares the effects of different treatment conditions on pre-existing groups
of participants

Sampling Error
- The discrepancy between a sample statistic and its population parameter
o Though samples are generally considered to be representative of the entire
population, a sample is not expected to give a perfectly accurate picture

Chapter 2: Frequency Distributions


Frequency Distributions
- Descriptive statistical technique
- Organize and simplify data
- Presents a general picture of the results/entire set of scores
- Shows where each individual is located relative to others
Frequency Distribution Tables
- Consists of at least two columns
- X column
o Values listed from highest to lowest
o Do not skip any value for ordinal, interval or ratio scales
- Frequency column
Regular Frequency Distribution
- A frequency distribution table that lists all of the individual categories (X values)
Grouped Frequency Distribution
- Too many X values = difficulty presenting simple, organized data
- Use a grouped frequency distribution table
o Guidelines
 Table should have around 10 class intervals
 Too many = cumbersome
 Too few = little info
 Intervals have the same width and cover range of scores, no gaps, no
overlaps
 Usually a simple number such as 2, 5, 10, or 20
 Easier to understand how range of scores was divided
 Bottom score in each interval should be a multiple of the interval width
Constructing
- Determine the width/range of scores
o Range = highest – lowest +1
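A sketch of the guidelines above in Python: pick a simple interval width that yields about 10 intervals, with each bottom score a multiple of the width. The candidate widths and sample scores are illustrative:

```python
# Choose grouped-frequency intervals following the guidelines above.
def choose_intervals(scores):
    rng = max(scores) - min(scores) + 1          # range = highest - lowest + 1
    width = next(w for w in (2, 5, 10, 20, 50) if rng / w <= 10)  # simple width
    bottom = (min(scores) // width) * width      # bottom is a multiple of width
    intervals = []
    while bottom <= max(scores):
        intervals.append((bottom, bottom + width - 1))
        bottom += width
    return intervals

scores = [53, 57, 61, 64, 68, 70, 71, 75, 82, 84, 91, 93]
print(choose_intervals(scores))  # [(50, 54), (55, 59), ..., (90, 94)]
```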

Cumulative Frequency and Cumulative Percentage


- Cumulative Frequency (cf)
- Cumulative Percentage (c%)
Interpolation
- A mathematical process based on the assumption that the scores and the percentages
change in a regular, linear fashion as you move through an interval
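A minimal sketch of linear interpolation as described above; the cumulative-percentage numbers are illustrative:

```python
# Assume values change in a straight line between two known points.
def interpolate(x, x0, y0, x1, y1):
    fraction = (x - x0) / (x1 - x0)   # how far x lies into the interval
    return y0 + fraction * (y1 - y0)

# Ex. if 25% of scores fall below 3.5 and 50% fall below 6.5, the score at
# the 40th percentile is estimated as:
print(interpolate(40, 25, 3.5, 50, 6.5))  # 5.3
```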

Frequency Distribution Graphs


- X-axis = score categories (X values)
- Y-axis = frequency
- Useful since they show the entire set of scores; one can determine the highest/lowest
score and where the scores are centered at a glance, and see whether the scores are
clustered together or scattered over a wide range
- Interval or ratio scale score categories use a histogram or a polygon
Histogram
- A bar is centered above each score or class interval
- The height of the bar corresponds to the frequency
- Bar width extends to the real limits
Polygons
- A dot is centered above each score, with its height corresponding to the frequency;
the dots are connected by straight lines
Bar Graphs
- Nominal or ordinal scale
- Unlike a histogram, gaps or spaces are left between adjacent bars
Relative Frequency
- Used for large populations where it is impossible to know the exact number of
individuals (frequency) for any specific category
- Exact number of frequency not shown in graph
Smooth Curve
- For interval or ratio
- If population scores are measured on an interval or ratio scale, present the distribution as
a smooth curve
- Emphasizes the fact that the distribution is not showing the exact frequency
Shapes of Frequency Distribution
- A graph shows the shape of the distribution
o Symmetrical distribution
 Look similar on both sides
o Skewed distribution
 Scores pile up toward one end, with the tail tapering toward the other
Stem-and-Leaf Displays
- Gives an organized picture of the entire distribution
- Individual leaves identify the individual scores
- List the stems in a column
- Go through the list of scores, recording each leaf beside its stem
Central Tendency
- A statistical measure that determines a single value that
o Accurately describes the center of the distribution
o Represents the entire distribution of scores
- Mean
o Most commonly used measure of central tendency
o Represented by mu for population
o For samples, M or x-bar
o Requires scores that are numerical values measured on an interval or ratio scale
o Conceptually, can also be defined as
 The amount that each individual receives when the total is divided equally
among all N individuals
 The balance point of the distribution because the sum of the distances
below the mean is exactly equal to the sum of the distances above the
mean
o Weighted (Overall) Mean
 Refer to HS notes
 Sample problem: You take three 100-point exams in your statistics class
and score 80, 80 and 95. The last exam is much easier than the first two,
so your professor has given it less weight. The weights for the three exams
are:
 Exam 1: 40 % of your grade. (Note: 40% as a decimal is .4.)
 Exam 2: 40 % of your grade.
 Exam 3: 20 % of your grade.
 What is your final weighted average for the class?
 Multiply the numbers in your data set by the weights:
 .4(80) = 32
 .4(80) = 32
 .2(95) = 19
 Add the numbers up. 32 + 32 + 19 = 83.
 The percent weight given to each exam is called a weighting factor.
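The same weighted-mean arithmetic in Python:

```python
# Weighted mean: multiply each score by its weighting factor, then add.
scores  = [80, 80, 95]
weights = [0.4, 0.4, 0.2]   # weighting factors; must sum to 1.0

weighted_mean = sum(w * x for w, x in zip(weights, scores))
print(weighted_mean)  # 83.0, matching 32 + 32 + 19 above
```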

o Changing the mean


 Changing the value of any score will immediately change the value of the
mean
o When the mean won’t work
 The mean won’t work when the distribution contains a few extreme scores
or is very skewed
 The mean will be pulled towards the extremes
 Mean will not provide a “central” value
 Impossible to compute a mean for nominal scale data
 Inappropriate to compute a mean when data are measured on an ordinal
scale (ranks)
- Median
o The midpoint of a distribution: scores are listed in order from smallest to
largest, and the median is the midpoint of the list
o The point on the measurement scale below which 50% of the scores in the
distribution are located
o Relatively less sensitive to extreme scores
 Tends to stay in the center of the distribution even when there are a few
extreme scores or when the distribution is very skewed
 Serves as a good alternative to the mean in these situations
o Can be used for open-ended distributions where upper or lower limits are not
specified (ex. 5+)
o Can be used for undetermined/unknown scores (missing data)
- Mode
o Most frequently occurring category or score
o Only measure of central tendency that can be used for data measured on a
nominal scale
o Often used as a supplemental measure of central tendency reported along
with the mean or the median
o Bimodal Distribution
 Possible to have two modes (note that a distribution can have only one
mean and only one median)
 The term mode is also used to describe a peak in a distribution that is not
really the highest point
 Major mode at the highest peak
 Minor mode at a second peak in a different location
Central Tendency and Normal Distribution
Skewed distribution
- The mode will be located at the peak on one side
- The mean usually will be displaced toward the tail on the other side
- The median is usually located between the mean and the mode

Chapter 4: Variability

Variability
- A quantitative measure
- To describe the distribution
o How spread out or clustered the scores are in a distribution
o Distance of score or group of scores from each other
- Know how representative a score or group of scores is/are of the entire population
o Smaller distances (clustered) – the score is a better representation of the
population
o Bigger distances (spread out) – more difficult to find a score that is a good
representation of the population
o Variability seen as an error – sampling error

Central Tendency and Variability


- A measure of variability usually accompanies a measure of central tendency as basic
descriptive statistics for a set of scores
- Central tendency – similarities
- Variability – differences
- Central tendency and variability are the two primary values that are used to describe a
distribution of scores
Measuring Variability
- Variability can be measured with
o Range
 Highest – lowest +1 (Xmax – Xmin + 1)
 Easily distorted by extremely large or small scores in the distribution even
if other scores are clustered
o Standard deviation/variance
 Measures the standard distance between a score and the mean
 Mean is the reference point/standard
 How near or far a score is from the mean
 Provides a measure of the average distance from the mean
 Describes whether the scores are clustered closely around the mean or are
widely scattered
 Compute the deviation (distance from the mean) for each score
 Deviation = x – mu
 Square all the deviations (without squaring, the deviations would
always sum to zero)
 Compute the mean of the squared deviations
 For a population
o Summing the squared deviations (sum of squares, SS)
o Dividing by N
 A sample’s variance (s^2) is computed as
o SS / (n – 1)
o n – 1 is the degrees of freedom (df); see the sketch after this list

- In each case, variability is determined by measuring distance


- Requires interval or ratio scale data
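A minimal Python sketch of the steps above (the data values are illustrative):

```python
# Definitional formulas: deviations, SS, variance, and standard deviation.
import math

def variability(scores, is_sample=False):
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)          # sum of squared deviations
    variance = ss / (n - 1) if is_sample else ss / n   # n - 1 = df for a sample
    return ss, variance, math.sqrt(variance)           # (SS, variance, SD)

data = [1, 9, 5, 8, 7]
print(variability(data))                   # population: divide SS by N
print(variability(data, is_sample=True))   # sample: divide SS by n - 1
```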
Properties of Standard Deviation
- If each score is multiplied by a constant, the standard deviation will be multiplied by the
same constant
The Mean and Standard Deviation as Descriptive Statistics
- If you are given numerical values for the mean and the standard deviation, you should be
able to construct a visual image or a sketch of the distribution of scores
- As a general rule
o Around 70% of scores will be within one standard deviation of the mean
o Around 95% of the scores will be within two standard deviations of the mean

Chapter 5: z-scores
Z-score
- Value of the z-score tells the exact location of a score relative to all the other scores in the
distribution
- Standardizing the entire distribution allows us to compare scores even if they are from
different tests – two (or more) different distributions can be made comparable
- Mean of a z-score distribution is always zero
- Standard deviation is 1
- If the original distribution is changed into a z-distribution, the shape won’t change
- Changing an x-value into a z-score involves creating a signed number that:
o Specifies the precise location of each x-value in a distribution
o The (+ or -) sign identifies the location, either above (+) or below (-) the mean
o The numerical value of the z-score corresponds to the number of standard
deviations between x and the mean
o (for a population) z = (X – mu) / sigma; equivalently, X = mu + z(sigma)
o (for a sample) z = (X – M) / s
o As descriptive statistics, z-scores describe exactly where each individual is
located
o As inferential statistics, z-scores determine whether a specific sample is
representative of its population, or whether it is extreme and unrepresentative
 If z-score near 0 -> fairly typical/representative individual
 If z-score at the extreme tails -> “noticeably different” from the others
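A minimal sketch of the two population formulas above; the IQ-style values (mu = 100, sigma = 15) are illustrative:

```python
# z = (X - mu) / sigma locates a score; X = mu + z * sigma recovers it.
def z_score(x, mu, sigma):
    return (x - mu) / sigma

def raw_score(z, mu, sigma):
    return mu + z * sigma

print(z_score(130, mu=100, sigma=15))     # 2.0 -> two SDs above the mean
print(raw_score(-1.0, mu=100, sigma=15))  # 85.0 -> one SD below the mean
```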
Properties of z-score
1. The mean of a distribution of z-scores is always 0
2. The standard deviation of a distribution of z-scores is always 1
3. The sum of the squared z-scores is always N
4. Transforming the original distribution to a distribution of z-scores does not change the
shape of the original distribution and does not change the location of any individual score
relative to others in the distribution
Other standardized distributions based on z-scores
- Although transforming x values into z-scores creates a standardized distribution, many
people find z-scores burdensome because they consist of many decimal values and
negative numbers.
- More convenient to standardize a distribution into numerical values that are simpler than
z-scores
What are the applications of z-scores?
- Helps identify the location of a score relative to other scores
Probability
- Method for measuring and quantifying the likelihood of obtaining a specific sample from
a specific population
- A fraction or a proportion
- Determined by a ratio comparing the frequency of occurrence for a specific outcome
relative to the total number of possible outcomes
- When a population of scores is represented by a frequency distribution, probabilities can
be defined by proportions of the distribution
- In graphs, probability can be defined as a proportion of area under the curve
- Whenever the scores in a population are variable, it is impossible to predict with perfect
accuracy exactly which score or scores will be obtained when you take a sample from the
population
Probability and Inferential Statistics
- If the sample has a high probability of being obtained from a specific population, then the
researcher can conclude that the sample is likely to have come from that population
- If the sample has a low probability of being obtained from a specific population, then the
specific population is probably not the source of the sample
- Those in the extreme tails of the distribution are probably not from the population
Random Sampling
- Ensures that all members of the population have an equal chance of being selected
- Process of sampling with replacement is utilized
- Why use a random sample?
o In order to apply the laws of probability to the sample
o Results in a sample that should be representative of the population
- Sampling with replacement
o Each member of the population selected for the sample is returned to the
population before the next member is selected
o Thus, the probability for one individual to be selected must stay constant from one
selection to the next (if more than one individual is selected)
- Sampling without replacement
o Members of the sample are not returned to the population before subsequent
members are selected
Probability and the Normal Distribution
- The unit normal table lists several different proportions corresponding to each z-score
location
Probability and the Binomial Distribution
- Binomial distributions
o Are formed by a series of observations (for example, 100 coin tosses) for which
there are exactly two possible outcomes
o The two outcomes are identified as A and B
o The probabilities are p = p(A) and q = p(B), where p + q = 1.00
- When pn and qn are both at least 10
o The binomial distribution is closely approximated by a normal distribution
 With a mean of mu = pn
 Standard deviation of sigma = sqrt(npq)
o A z-score can be computed for each value of X and the unit normal table can be
used to determine probabilities for specific outcomes
- Binomial distributions are actually discrete – use the appropriate upper and lower real
limits in the computation of z-scores
- How to:
o Graph the distribution and identify area of interest
o Find out the limit equivalents of the area of interest
 Ex. 15 or more -> use the lower real limit of 15, i.e., 14.5
 Ex. more than 15 -> use the upper real limit of 15, i.e., 15.5
 Ex. 15 or less -> use the upper real limit of 15, i.e., 15.5
 Ex. less than 15 -> use the lower real limit of 15, i.e., 14.5
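A sketch of the normal approximation with the real-limit correction above; the coin-toss numbers are illustrative, and the normal CDF is built from math.erf:

```python
import math

def normal_cdf(z):  # proportion of the normal curve below z
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p = 100, 0.5
q = 1 - p                     # pn = qn = 50, both at least 10
mu = p * n                    # mean of the approximating normal
sigma = math.sqrt(n * p * q)  # SD of the approximating normal

# p(X is 55 or more) in 100 tosses: "55 or more" -> lower real limit 54.5
z = (54.5 - mu) / sigma
print(1 - normal_cdf(z))  # about 0.1841, as the unit normal table would give
```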
For First Long Exam (said to be the easiest)
- Bring own calculator, 2 pencils, eraser, black/blue ballpen, and correction fluid/tape
- No need to bring bluebook and unit normal table
- Chapters 1-6; SPSS basics
- Interpolation
- Central Tendency and Variability
- Formulas to memorize
o SS (definition and/or computational)
o Variance and SD (Population and Sample)
o Z-score (how to get z-score and value of X)

Population distribution
- A collection of all population scores
Sampling distribution
- A collection of statistics drawn from all possible samples of a specific size from a
population
o An example of a sampling distribution is the distribution of sample means, which
is a collection of sample means of all possible random samples of a particular
sample size that can be obtained from the population
Sample distribution
- A collection of all sample scores

Characteristics of the Distribution of Sample Means


1. Sample means should pile up around the population mean.
2. The pile of sample means tends to form a normal-shaped distribution.
3. Generally, the larger the sample size, the closer the sample means should be to the
population mean.
The Central Limit Theorem
1. The mean of the distribution of sample means is called the expected value of M and is
always equal to the population mean mu.
2. The standard deviation of the distribution of sample means is called the Standard Error of
M.
3. If the distribution of the population from which the samples are obtained is normal, the
distribution of sample means will also be normally distributed, regardless of sample size
(n).
4. As the sample size increases, the distribution of sample means approaches a normal
distribution. When the sample size is sufficiently large (i.e., n>=30), the distribution of
sample means will be approximately normal, regardless of the shape of the population
distribution.
According to accounts, the first person to postulate the CLT was Abraham de
Moivre. Galton illustrated the CLT using a quincunx/Galton board (which illustrates a
binomial distribution)

Central Limit Theorem: Shape of the DSM


- The DSM approaches a normal distribution (will look more like a normal distribution) as
the sample size gets larger – no matter what the shape of the population distribution
Central Limit Theorem: Mean of the DSM
- Average of all sample means = expected value of M
- A particular sample mean is expected to be near its population mean
- Sigma_M = SD of the sampling distribution = standard error of M
Central Limit Theorem: SD of the DSM
- Its magnitude is determined by two factors
1. Law of large numbers: as the sample size increases, the standard error decreases
2. Population SD
a. When a sample consists of n = 1 score, then its DSM is identical to the
population distribution.
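The standard formula for the standard error, sigma_M = sigma / sqrt(n), illustrates both factors above; the numbers are illustrative:

```python
import math

def standard_error(sigma, n):
    return sigma / math.sqrt(n)   # shrinks as n grows; grows with sigma

for n in (1, 4, 25, 100):
    print(n, standard_error(10, n))  # n = 1 reproduces the population SD (10)
```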

Hypothesis
- Thesis/main idea of an experiment/study consisting of a statement that predicts the
relationship between at least two variables
Hypothesis Testing
- The general goal is to rule out chance (sampling error) as a plausible explanation for the
results from a research study
- Technique to help determine whether a specific treatment has a significant effect on the
individuals in a population
- “significant” – result is very unlikely to occur by chance alone
- Purpose is to decide between two possible explanations for an observed difference:
1. It is simply sampling error
2. It is too large to be explained by sampling error
Four Steps
1. State the hypothesis
a. Null hypothesis (Ho)
i. Always states that the treatment has no effect
ii. Population means before and after are the same
b. Alternative hypothesis (H1/Ha)
i. States that the treatment has an effect (there is a change, a difference or a
relationship for the general population)
ii. The population means before and after treatment are different, such that
their difference is too large to be explained by chance/sampling error
2. Set the criteria for a decision/locate the critical region
3. Collect data and compute sample statistic
4. Make a decision about the Ho and state the conclusion
Example 1
Step 1:
Ho: mu = 22.70
- Jokoy’s performance has no significant effect on the joviality of the audience.
Ha: mu ≠ 22.70
- Jokoy’s performance has a significant effect on the joviality of the audience.

Step 2:
The distribution of sample means can be divided into two sections
Alpha Level / Level of Significance (α)
- To define the boundaries that separate the high-probability samples from the low-
probability samples
- Common alpha level values: α = .05, α = .01, α = .001
- If α = 0.05, then we separate the most unlikely 5% of the sample means (i.e., sample
means located in the extreme tails) from the most likely 95% of the sample means (i.e.,
sample means located in the center)
Critical Region
- Consists of outcomes that are very unlikely to occur if the null hypothesis is true (aka low
probability samples)
- Defined by sample means that are almost impossible to obtain if the treatment has no
effect
o The phrase “almost impossible” means that these samples have a probability that
is less than the alpha level
- Note that the critical region itself is distinct from the critical region boundaries (the
values that mark it off)
- Treatment has no effect if the value of the sample mean is located in the middle 95% of
the distribution (assuming α is .05)
- Treatment has an effect if the value of the sample mean is located in the critical region
(i.e., less than z = -1.96 or greater than z = +1.96)
Step 3:
The test statistic (in this case, a z-score) forms a ratio comparing the obtained mean
difference with the difference expected by chance (the standard error)
Step 4: Make a decision
A large value for the test statistic shows that
- The obtained mean difference is greater than would be expected if there were no
treatment effect
- If it is large enough to fall in the critical region:
o Conclude that the difference is significant or that the treatment has a significant
effect
o Reject the null hypothesis
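A minimal sketch of the four steps applied to the example Ho: mu = 22.70; the sample values (M, sigma, n) are hypothetical, since the notes do not give them:

```python
import math

mu, sigma = 22.70, 4.0   # Step 1: Ho value; sigma assumed known for a z-test
M, n = 25.10, 36         # hypothetical sample mean and sample size

critical_z = 1.96        # Step 2: two-tailed critical region at alpha = .05

# Step 3: z = (M - mu) / sigma_M, where sigma_M = sigma / sqrt(n)
z = (M - mu) / (sigma / math.sqrt(n))

# Step 4: reject Ho only if z falls in the critical region
print(z, "reject Ho" if abs(z) > critical_z else "fail to reject Ho")
```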
Directional Tests
- When a research study predicts a specific direction for the treatment effect
- Possible to incorporate the directional prediction into the hypothesis test
Factors Affecting a Hypothesis Test
- Difference between sample mean and hypothesized population mean
- Standard error
o Variability of the scores (high variability decreases the chance of detecting a
treatment effect)
o Sample size (a larger sample size increases the chance of detecting a treatment effect)

Assumptions for a Hypothesis Test (using z-scores)


- Random sampling
- Independent observations (the occurrence of the first event has no effect on the
probability of the second event)
- Treatment effect is equal to adding/subtracting a constant to every score in the population
(and thus sigma is still the same after giving treatment to all members of the population)
- DSM is normal
Error in Hypothesis Testing
- A sample mean (following treatment) may differ from the original population mean
- Hypothesis test relies on sample data which are not completely reliable
- Type I error
o Occurs when the sample data appear to show a treatment effect when, in fact,
there is none
o Researcher will reject the null hypothesis and falsely conclude that the treatment
has an effect
o Caused by extreme (unusual), unrepresentative samples (the researcher selects an
extreme sample by chance, with the result that the sample falls in the critical
region even though the treatment has no effect)
o The probability of a Type I error is equal to the alpha level – the probability of
obtaining a sample mean in the critical region even when the null hypothesis is true
- Type II error
o Occurs when the sample does not appear to have been affected by the treatment
when the treatment does have an effect
o Researcher will fail to reject the null hypothesis and falsely conclude that the
treatment has no effect
o Type II errors are commonly the result of a very small treatment effect: although
the treatment does have an effect, it is not large enough to move the sample mean
into the critical region
Power of a Hypothesis Test
- The probability that the test will correctly reject a false null hypothesis
- The probability that the test will identify a treatment effect if one really exists
- Beta (the probability of failing to reject Ho when there is a real effect) + power
(1 – beta, the probability of rejecting Ho when there is a real effect) = 1.00
Measuring Effect Size
- Hypothesis testing evaluates the statistical significance of the results from a research
study
o Test determines whether or not it is likely that the obtained sample mean occurred
without any contribution from a treatment effect
- A significant treatment effect doesn’t necessarily mean a substantial/large treatment
effect
- Even a very small effect can be significant if it is observed in a very large sample
Reporting Results Using APA Format
- When presenting p values, use p>.05, p<.05, and others if the exact probability is not
available
- Use a zero before the decimal point in numbers that are less than 1 when the statistic can
exceed 1 (d=0.50, z=0.50)
- Do not use a zero before a decimal fraction when the statistic cannot be greater than 1
(e.g., correlations, proportions, level of statistical significance)
T statistic
- Allows researchers to use sample data to test hypotheses about an unknown population
mean
- Can be used to test hypotheses about a completely unknown population; that is, both mu
and sigma are unknown, and the only available information about the population comes
from the sample
- Can be used when the population standard deviation is not available or is not needed
Two general scenarios:
1. To determine whether or not a treatment causes a change in a population mean
- Must know/have the value of mu for the original, untreated population
2. A hypothesized value for an unknown population mean is derived from a theory or other
prediction
- Compare actual sample mean with the hypothesized population mean
- Significant difference indicates that the hypothesized value for mu should be rejected
Two alternatives
1. Discrepancy due to chance
2. Highly unlikely to be due to chance
Assumptions of the t-test
- Values in the sample consist of independent observations
- Population sampled must be normal
Hypothesis Tests with the t- statistic
- A critical step for the t statistic hypothesis test is to calculate exactly how much
difference between M and mu is reasonable to expect
o Problem: the population standard deviation is unknown, so it is impossible to
compute the standard error of M
o Solution: use the sample variance, s^2, in place of the unknown population
variance, or the sample standard deviation, s, in place of the unknown population
standard deviation, sigma
- The t statistic (like a z-score) forms a ratio
o Top of the ratio: obtained difference between the sample mean and the
hypothesized population mean
o Bottom of the ratio: standard error which measures how much difference is
expected by chance
The t distributions
- Comparing t statistic to z-score
o We use the sample variance to estimate the unknown population variance
o With a large sample, the t statistic will be very similar to a z-score (very good
estimation)
o With small samples, the t statistic will provide a relatively poor estimate of z
- The value of degrees of freedom, df = n-1
o Is used to describe how well the t statistic represents a z-score
o Determine how well the distribution of t approximates a normal distribution
o For large value of df, the t distribution will be nearly normal
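A minimal sketch of the one-sample t statistic described above; the scores and mu are illustrative:

```python
import math

def one_sample_t(scores, mu):
    n = len(scores)
    M = sum(scores) / n
    ss = sum((x - M) ** 2 for x in scores)
    s = math.sqrt(ss / (n - 1))    # sample SD, using df = n - 1
    s_M = s / math.sqrt(n)         # estimated standard error of M
    return (M - mu) / s_M, n - 1   # (t statistic, df)

t, df = one_sample_t([12, 14, 10, 13, 15, 14, 13, 11, 12, 16], mu=10)
print(t, df)  # compare t with the critical value from a t table at this df
```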

Measuring Effect Size with the t statistic


- Measuring effect size
o Estimated Cohen’s d
 Measures the size of the treatment effect in terms of the standard deviation
 Use the sample standard deviation instead of the population value (which
is unknown)
- Percentage of variance accounted for by the treatment
o Based on the idea that the treatment causes the scores to change, which
contributes to the observed variability in the data
o By measuring the amount of variability that can be attributed to the treatment, we
obtain a measure of the size of the treatment effect
- Confidence interval
o Range of values that estimates the unknown population mean
o Based on the observation that a sample mean tends to provide a reasonably
accurate estimate of the unknown population mean
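Sketches of the first two measures above for a one-sample t test. Estimated Cohen's d and r^2 = t^2 / (t^2 + df) are the standard formulas; the numbers are illustrative:

```python
import math

M, mu, s, n = 13.0, 10.0, 1.83, 10   # illustrative sample summary values

d = (M - mu) / s                     # estimated Cohen's d (SD units)
t = (M - mu) / (s / math.sqrt(n))
df = n - 1
r_squared = t ** 2 / (t ** 2 + df)   # percentage of variance accounted for

print(round(d, 2), round(r_squared, 2))
```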
Things to consider in your study
- Online or onsite data collection
- Characteristics of participants
o Psych 101 students (need to give them credit stubs)
o Convenience and snowball sampling
o Psych 110 students (not from your class)
- Sample size (based on power analysis)
- Procedures; operational definitions of variables of interest
- Ethics (consent forms, debriefing, positive mood induction?, need to contact PsychServ?)
- When reporting, refer to how to refer using APA results
o Mention a statement about your descriptive stats (of the sample)

How Psychiatric Labels Affect How We See People


- Based on the study of Rosenhan (1973), in which experimenters became pseudopatients
and feigned hallucinations to gain admission to psychiatric hospitals
- It is not difficult to be misdiagnosed as mentally ill, but it is very difficult to get rid
of that diagnosis, and the meanings it carries about you, once it has been made
- A label of a psychological disorder does not mean that the person cannot make
significant enough progress to be considered free of the problem
Statistical Test
- Choosing a statistical test is dependent on the following factors
1. Same or different set of participants in each treatment condition
2. Level of measurement of the dependent variable
3. Number of treatment conditions
4. Number of IVs
- Parametric tests
o Make assumptions about the parameters (defining properties) of the population
distribution(s) from which one’s data are drawn
 Normality
 Homogeneity of variance
 Independence/independent observations
 Interval or ratio data
- Nonparametric tests
o Do not rely on the restrictive assumptions of parametric tests

Independent-Measures Designs
- Allows researchers to evaluate the mean difference between two populations using the
data from two separate samples
- The identifying characteristic of the design is the existence of two separate or
independent samples
- Can be used to test for mean differences between
o Two distinct populations (such as men versus women)
o Two different treatment conditions (such as drug versus no drug)
- Used where a researcher has no prior knowledge about either of the two populations (or
treatments) being compared
o The population means and variances/SDs are all unknown, and their values must be
estimated from the sample data
- The general purpose of the independent-measures t test is to determine whether the
sample mean difference obtained in a research study reflects
o A real mean difference between the two populations, or
o Chance/sampling error
Steps in Hypothesis Testing with the Independent-Measures t statistic
1. State the hypothesis
a. For the independent-measures t test, Ho states that there is no difference between
the two population means
2. Locate the critical region
The Homogeneity of Variance Assumption
- If the assumption is violated, then the t statistic contains two questionable values:
1. The value for the population mean difference which comes from the null hypothesis,
and
2. The value for the pooled variance
- Cannot determine which of these two values is responsible for a t statistic that falls in the
critical region
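A minimal sketch of the independent-measures t using the standard pooled-variance formulas, s_p^2 = (SS1 + SS2) / (df1 + df2); the two groups' scores are illustrative:

```python
import math

def pooled_t(group1, group2):
    def mean_and_ss(g):
        m = sum(g) / len(g)
        return m, sum((x - m) ** 2 for x in g)
    m1, ss1 = mean_and_ss(group1)
    m2, ss2 = mean_and_ss(group2)
    df1, df2 = len(group1) - 1, len(group2) - 1
    sp2 = (ss1 + ss2) / (df1 + df2)                        # pooled variance
    se = math.sqrt(sp2 / len(group1) + sp2 / len(group2))  # estimated SE
    return (m1 - m2) / se, df1 + df2                       # Ho: mu1 - mu2 = 0

t, df = pooled_t([8, 10, 12, 10], [4, 6, 8, 6])
print(t, df)
```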

Hypothesis Tests with the Related-Samples t


- In a repeated measures design, a single group of individuals is obtained and each
individual is measured in both of the treatment conditions being compared
o Data consist of two scores for each individual
- In a matched-subjects design, there will be different participants in each group, but the
participants have been closely matched before assignment to treatment conditions
o Form pairs of participants based on similarity of matching variable (that is, a
characteristic that is likely to affect the DV)
o Assign one member of each pair to a treatment condition and the other one in the
other condition
- Data consist of pairs of scores with each pair corresponding to a matched set of two
“identical” participants
- A difference score is computed for each matched pair of individuals

The null hypothesis says that


- There is no consistent or systematic difference between the two treatment conditions
- Only chance/sampling error is expected
o Small differences between the mean of the sample of difference scores and the
mean of the population of difference scores
The alternative hypothesis states that

- There is a systematic difference between the two treatment conditions (the mean of the
population of difference scores is not zero)
Assumptions in Related-Samples t-test


1. Observations within each treatment condition are independent
2. The population distribution of difference scores is normal
3. Interval/ratio data
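A minimal sketch of the related-samples t: one difference score D per pair, then a one-sample t on the D values against mu_D = 0. The paired scores are illustrative:

```python
import math

before = [55, 60, 52, 58, 61, 57]
after  = [58, 66, 56, 59, 68, 61]

D = [a - b for a, b in zip(after, before)]   # difference score for each pair
n = len(D)
M_D = sum(D) / n
ss = sum((d - M_D) ** 2 for d in D)
s = math.sqrt(ss / (n - 1))                  # SD of the difference scores
t = M_D / (s / math.sqrt(n))                 # tests Ho: mu_D = 0
print(t, n - 1)                              # (t statistic, df)
```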
Advantages of Between-Subjects Design
1. Each individual score is independent from other scores
a. Not affected by progressive errors – order effects (ex. practice, fatigue) and
carryover effects
2. Used for a wide variety of research questions
Disadvantages of Between-Subject Design
1. Need relatively large number of participants (concern if research involves special
populations)
2. Each score is obtained from a unique individual who differs from all of the other
participants (individual differences)
a. ID can become confounding variables
b. ID can produce high variability in the scores (making it difficult to detect a
treatment effect)
Advantages of Within-Subject Design
1. Need relatively few participants
2. Useful in situations where it is difficult to locate participants
3. Generates an on-going record of participant responses/behaviours over time / well suited
to examine changes that occur over time, such as learning or development
4. Increase power
a. Eliminate concerns with individual differences
i. No individual differences between treatment
ii. Measure and remove variance caused by ID within treatments
5. Possible to measure differences between treatments without involving any individual
differences
6. When individual differences are consistent across treatments, they can be measured and
removed from the rest of the variance in the data
Disadvantages of Within-Subjects Design
1. Practical limitations
a. Participants may need to spend more time in experiment
b. Tedious/inconveniences for participants
c. Participant attrition/subject mortality
2. Progressive errors
Carryover Effects
- Changes in behaviour/performance that are caused by the lingering aftereffects of an
earlier treatment condition
o Order effects: changes in behaviour/performance that are related to general
experience in a research study, but not related to a specific treatment/s
- Effect of some treatment will persist after the treatments are removed
- Ex: smell and taste perception, emotion
