
Appendix

Going by the Numbers: Statistics in Psychology
Experimental vs. Non-Experimental

• Non-Experimental:
• Correlation: allows us to determine association
• Cannot determine causality!

• Experimental:
• Variables are manipulated so that we can determine
the effect of one variable on another (e.g. drug trials)
Experimental Terminology

• Experimental Group:
• The group of people who are given a treatment (a
variable is manipulated for this sample)
• Control Group:
• The group in an experiment who are not given a
treatment
Experimental Terminology

• Independent Variable:
• The variable that is manipulated for the experimental
group
• Dependent Variable:
• The variable that is being measured (the variable that
is affected by changes in the other)
More Basic Concepts

• Variable: A condition or characteristic that can have different values
• Value: A possible number or category that a variable can have
• Score: A particular person’s value on a variable
Some Basic Terminology

Term      Definition                              Examples
Variable  Condition or characteristic that can    Stress level, age, gender,
          have different values                   religion
Value     Number or category                      0, 1, 2, 3, 4, 25, 85,
                                                  female, Catholic
Score     A particular person’s value on a        0, 1, 2, 3, 4, 25, 85,
          variable                                female, Catholic
Module 56:
Descriptive Statistics
Statistics

• Statistics: the branch of mathematics that focuses on the organization,
  analysis, and interpretation of a group of numbers (i.e., data)
• A method of distinguishing patterns from randomness/chance
  • E.g., lottery, coin flip, etc.
  • I.e., is the difference between groups an effect or just random chance?
What does Statistics help us to
do?

• Allows us to determine the relationship between variables with some
  degree of certainty
  • Variable: a condition or characteristic that can have different values;
    something of interest
• Research begins as a question about the relationship between two variables
  • E.g., alcohol consumption → car crashes
The Two Branches of
Statistical Methods

• Descriptive statistics
  • Summarize/organize a group of numbers from a research study
• Inferential statistics
  • Draw conclusions/make inferences that go beyond the numbers from a
    research study
  • Use a sample to make general conclusions
More on Inferential Statistics

• Goal is to draw conclusions about a population of interest by collecting
  data on a sample
• Population: the entire group of individuals of interest
  • University students
  • Turkish people
  • Men who brush their teeth with their non-dominant hand
• Sample: the particular participants selected to be studied from the
  population
Validity

• External validity: how well the sample represents the population
• Generalizability
Ideal Research Design

1. Participants in the experimental and control groups are identical
2. Both groups are exposed to identical situations (except for manipulation
   of the independent variable)
   • If not, there may be a confounding variable
   • Coke vs Pepsi
3. Sample studied represents the intended population
4. Measurement of the dependent variable is accurate and appropriate for
   what it’s supposed to be measuring
Types of Measurement

• Numeric (quantitative) variables:
1. Equal-interval variables
   • e.g., GPA, age, class size
   • Discrete
   • Continuous
2. Ratio scale → absolute zero
3. Ordinal (rank-order) variables
   • e.g., position finished in a race
Types of Measurement

• Nominal (qualitative) variables
4. Categorical
   • e.g., gender, religion
Types of Measurement
Type            Definition                                   Example
Equal-interval  Numeric variable in which differences        Stress level, age
                between values correspond to differences
                in the underlying thing being measured
Ordinal         Numeric variable in which values             Class standing, position
                correspond to the relative position of       finished in a race
                things measured
Nominal         Variable in which the values are categories  Gender, religion
Frequency Distributions

• After collecting data, the first task:
  • Organize and simplify the data so that it is possible to get a general
    overview of the results
• This is the goal of descriptive statistics
• One method for simplifying and organizing data:
  • Frequency distribution
Organizing Data: Frequency
Tables

• Provide a listing of the number of individuals having each of the
  different values for a particular variable
• e.g., stress ratings of 151 students:
4,7,7,7,8,8,7,8,9,4,7,3,6,9,10,5,7,10,6,8,7,8,7,8,7,4,5,10,10,
0,9,8,3,7,9,7,9,5,8,5,0,4,6,6,7,5,3,2,8,5,10,9,10,6,4,8,8,8,4,
8,7,3,8,8,8,8,7,9,7,5,6,3,4,8,7,5,7,3,3,6,5,7,5,7,8,8,7,10,5,
4,3,7,6,3,9,7,8,5,7,9,9,3,1,8,6,6,4,8,5,10,4,8,10,5,5,4,9,4,7,
7,7,6,6,4,4,4,9,7,10,4,7,5,10,7,9,2,7,5,9,10,3,7,2,5,9,8,10,
10,6,8,3
Organizing Data: Frequency Tables

• Frequency: Number of scores with a particular value
• Frequency Distribution: The pattern of frequencies over different values
Organizing Data:
Steps for Making a Frequency
Table
1. Make a list down the page of each possible
value, from highest to lowest
2. Go one by one through the scores, making a
mark for each next to its value on the list
3. Make a table showing how many times each
value on your list is used
4. Figure the percentage of scores for each value
Organizing Data: A Frequency Table

Stress Rating  Frequency  Percent
10             14          9.3
9              15          9.9
8              26         17.2
7              31         20.5
6              13          8.6
5              18         11.9
4              16         10.6
3              12          7.9
2               3          2.0
1               1          0.7
0               2          1.3
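
The four steps for making a frequency table can be sketched in Python. This is only an illustration: the function name `frequency_table` is made up here, and the raw scores are rebuilt from the frequencies in the table above rather than typed in one by one.

```python
from collections import Counter

# Frequencies copied from the stress-rating table (rating -> count).
table_counts = {10: 14, 9: 15, 8: 26, 7: 31, 6: 13, 5: 18,
                4: 16, 3: 12, 2: 3, 1: 1, 0: 2}

# Expand the counts back into a flat list of scores, as if reading raw data.
scores = [rating for rating, count in table_counts.items() for _ in range(count)]


def frequency_table(scores):
    """Return (value, frequency, percent) rows, highest value first."""
    counts = Counter(scores)                 # step 2: tally each score
    n = len(scores)
    rows = []
    # step 1: list every possible value from highest to lowest
    for value in range(max(scores), min(scores) - 1, -1):
        freq = counts.get(value, 0)          # step 3: times the value is used
        percent = round(100 * freq / n, 1)   # step 4: percentage of scores
        rows.append((value, freq, percent))
    return rows


rows = frequency_table(scores)
for value, freq, percent in rows:
    print(f"{value:>6} {freq:>9} {percent:>7.1f}")
```

The printed rows should reproduce the Frequency and Percent columns of the table, since each percentage is simply the frequency divided by N = 151.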
Steps for Making a
Histogram

1. Make a frequency table
2. Put the values along the bottom of the page, from left to right, from
   lowest to highest
3. Make a scale of frequencies along the left edge of the page that goes
   from 0 at the bottom to the highest frequency for any value
4. Make a bar above each value with a height for the frequency of that value
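
As a rough sketch of these steps, the snippet below draws a text-only histogram from the stress-rating frequencies. The function name `text_histogram` and the `##` bar style are choices made for this illustration; a plotting library would normally be used instead.

```python
# Frequencies from the stress-rating frequency table (rating -> count).
frequencies = {0: 2, 1: 1, 2: 3, 3: 12, 4: 16, 5: 18,
               6: 13, 7: 31, 8: 26, 9: 15, 10: 14}


def text_histogram(frequencies):
    """Return the lines of a vertical text histogram, top row first."""
    values = sorted(frequencies)             # step 2: values, lowest to highest
    tallest = max(frequencies.values())      # step 3: top of the frequency scale
    lines = []
    for level in range(tallest, 0, -1):      # one text row per frequency level
        # step 4: a bar reaches this row if its frequency is at least `level`
        cells = ["##" if frequencies[v] >= level else "  " for v in values]
        lines.append(f"{level:>3} | " + " ".join(cells))
    lines.append("    +-" + "-" * (3 * len(values) - 1))
    lines.append("      " + " ".join(f"{v:>2}" for v in values))
    return lines


histogram_lines = text_histogram(frequencies)
print("\n".join(histogram_lines))
```

The tallest bar sits above the value 7, which is the high point of the distribution.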
Organizing Data: Frequency Graphs

Histogram
Shapes of Frequency
Distributions
• Unimodal, Bimodal, and
Rectangular
Shapes of Frequency
Distributions

• Symmetrical and Skewed Distributions


Skewness

• Skewed to the left = negatively skewed; tail (side with fewer scores) to
  the left
  • Ceiling effect
• Skewed to the right = positively skewed; tail (side with fewer scores) to
  the right
  • Floor effect
Normal Distributions

Normal curve: a bell-shaped curve that is unimodal and symmetrical; many
distributions approximate it
Summary

• We talked about how to begin to make sense of a group of scores
  • Frequency tables and graphs
• What are the main statistical techniques for describing a group of scores
  with numbers?
How can we do this?

1. Describe the group of scores in terms of a typical/average/most
   representative value
2. Describe how spread out the numbers are in a group of scores
Central Tendency

• Measures of central tendency are those which tell us the typical value in
  a group of scores
  • e.g., What is the typical height of a Bilkent undergraduate?
• Mean
• Median
• Mode
Central Tendency: Benefits
Capture a great deal of information in a single score

Example 2:
• 45°F = average high temperature in December (Ankara, Turkey)
• 83°F = average high temperature in December (Kona, HI)
Central Tendency: Mean
• Sum of all the scores divided by the number of scores

$$M = \frac{\sum X}{N}$$

• Σ = the sum of
• X = each individual score
• ΣX = the sum of all scores
• N = number of scores
Central Tendency: Mean - Example

• Scores (# of dreams per week): 7,8,8,7,3,1,6,9,3,8
• ΣX = 7+8+8+7+3+1+6+9+3+8 = 60
• N = 10
• Mean = 60/10 = 6
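
The same calculation can be checked with a few lines of Python. The helper name `mean` is chosen for this sketch; the scores are the dreams-per-week values from the example.

```python
dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]  # dreams per week, from the example


def mean(scores):
    """M = (sum of all scores) / (number of scores)."""
    return sum(scores) / len(scores)


print(sum(dreams), len(dreams), mean(dreams))  # 60 10 6.0
```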
Central Tendency: Mode

• Most common single number in a distribution
• Bimodal distribution: a distribution with two modes
• Mode of 7,8,8,7,3,1,6,9,3,8 = 8
• What type of variable makes the mode a good choice?
  • Nominal variables
  • Why?
Mode: Common Error
• The most frequent score
• Single score that is most common

X    f
4    4
3    10
2    14   ← Mode
1    6
0    6

ERROR:
Mistakenly report the frequency of the most common score, rather than the
score itself (here the mode is 2, not 14)
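
One way to keep the score and its frequency apart is to ask for both explicitly. The sketch below rebuilds the scores from the X / f table (score 2 occurring 14 times, and so on) and uses `collections.Counter` from the Python standard library; the variable names are illustrative.

```python
from collections import Counter

# Score -> frequency, taken from the X / f table on this slide.
freq = {4: 4, 3: 10, 2: 14, 1: 6, 0: 6}
scores = [x for x, f in freq.items() for _ in range(f)]

# most_common(1) returns a list holding one (score, frequency) pair.
score, count = Counter(scores).most_common(1)[0]
print("mode:", score)        # the score itself: 2
print("frequency:", count)   # how often it occurs: 14 (this is NOT the mode)
```

Note that `most_common(1)` reports only one score even when several are tied; for bimodal data, `statistics.multimode` returns all of the modes.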
Central Tendency: Mode

The mode as the high point in a distribution’s histogram, using the # of
dreams/week example
The Mode

n = 50
Cat = 17
Dog = 28   ← Mode
Fish = 3
Other = 2

What is the mode for this distribution?
The answer isn’t a number
Central Tendency: Median

• The middle score when all scores are arranged from lowest to highest
• Median of 1, 8, 3, 4, 7
  • Ordered: 1, 3, 4, 7, 8
  • Median = 4
• Median of 7,8,8,7,3,1,6,9,3,8
  • Ordered: 1, 3, 3, 6, 7, 7, 8, 8, 8, 9
  • The median is the average (mean) of the 5th and 6th scores (7 and 7),
    so the median is 7
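
Both cases (odd and even N) can be handled by one small function. This is a minimal sketch with an illustrative name, `median`; Python's built-in `statistics.median` follows the same rule.

```python
def median(scores):
    """Middle score of the ordered list; with an even number of scores,
    the mean of the two middle scores."""
    ordered = sorted(scores)            # arrange from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                      # odd N: a single middle score
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even N: average the two


print(median([1, 8, 3, 4, 7]))                   # 4
print(median([7, 8, 8, 7, 3, 1, 6, 9, 3, 8]))    # 7.0
```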
The Median

• Midpoint of the distribution, not the midpoint of the scale (e.g., 0-100)
• Divides the set of scores into two equal-sized groups
• Not halfway between the lowest and highest scores
When To Use Each Measure

• Mean is most commonly used in research
  • Vulnerable to distortion when the sample is small or has extreme values
• Mode is used for nominal variables
  • Not that vulnerable to extreme scores
• Median is preferable to the mean when extreme scores are in the data set
• All three coincide when the distribution is symmetrical and unimodal
  • Normal distribution
Consider the Following…

1st group of #s
1, 2, 3
2nd group of #s
-3, 0, 9
3rd group of #s
-3, -2, 0, 1, 4, 4, 9

Rank order the 3 measures of central tendency (CT) for a distribution of
scores involving a floor effect
Summary
• Mode: Most frequent score
• Always corresponds to an actual score
• Can have multiple modes
• Nominal data
• Median: Middle score
• Not influenced by extreme scores
• Ordinal data
• Mean: Average score
• Represents every score in the distribution
• Distorted by extreme scores
• CT + variability = foundation for inferential statistics
Variability or Dispersion

• In addition to being able to describe the typical score, we also want to
  be able to describe how much the scores differ or vary from each other
• How much do the scores in our sample vary from the measure of central
  tendency on average?
• What’s our next step?


Range

• Range is the simplest measure of variability
  • Difference between the highest and lowest score
  • Age at Bilkent
  • Based entirely on extreme scores
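
A short sketch shows why depending only on the extremes matters. It reuses the dreams-per-week scores from the mean example; the added score of 40 is a made-up outlier for illustration, and `score_range` is an illustrative name.

```python
dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]  # dreams per week, from the mean example


def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)


print(score_range(dreams))          # 8  (9 - 1)
print(score_range(dreams + [40]))   # 39 (one hypothetical extreme score added)
```

A single extreme score changes the range from 8 to 39 even though the other ten scores are untouched.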
Standard Deviation
• SD: average deviation of a set of scores from
the center of the distribution
Sum of Deviation Scores
• Deviation score: the extent to which each individual’s score deviates
  from the mean

$$X - M$$

We want to know the average, or standard, deviation score:

$$\frac{\sum (X - M)}{N}\ ?$$
Average Deviation Scores?

$$\frac{\sum (X - M)}{N} = \frac{0}{N} = 0$$

• The sum of all deviation scores will ALWAYS be 0!
• We need to square each deviation score and obtain the Sum of Squares (SS)
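
This can be verified on the dreams-per-week scores from the mean example (M = 6). The snippet is a small check, not part of the original slides.

```python
dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]
m = sum(dreams) / len(dreams)               # M = 6.0

deviations = [x - m for x in dreams]        # X - M for each score
squared = [d ** 2 for d in deviations]      # (X - M)^2 for each score

print(deviations)        # [1.0, 2.0, 2.0, 1.0, -3.0, -5.0, 0.0, 3.0, -3.0, 2.0]
print(sum(deviations))   # 0.0  -> positive and negative deviations cancel
print(sum(squared))      # 66.0 -> the sum of squares, SS
```

Squaring removes the signs, so the deviations no longer cancel. With non-integer data, floating-point rounding can make the first sum a tiny number rather than exactly 0.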
Sum of Squared Deviation Scores

• Sum of all squared deviations from the mean:

$$SS = \sum (X - M)^2$$
Variance: Average Sum of Squared
Deviation Scores

• Formula for the variance:

$$SD^2 = \frac{\sum (X - M)^2}{N} = \frac{SS}{N}$$
Variance: Measure of Spread

• The average of each score’s squared difference from the mean
• Steps for computing the variance:
1. Subtract the mean from each score
2. Square each of these deviation scores
3. Add up the squared deviation scores
4. Divide the sum of squared deviation scores by the number of scores
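
The four steps translate directly into code. The sketch below uses the dreams-per-week scores from the mean example and an illustrative function name, `variance`; it divides by N, as the slides do.

```python
dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]


def variance(scores):
    """SD^2 = SS / N, following the four steps on the slide."""
    m = sum(scores) / len(scores)
    deviations = [x - m for x in scores]      # 1. subtract the mean
    squared = [d ** 2 for d in deviations]    # 2. square each deviation
    ss = sum(squared)                         # 3. add up: sum of squares (SS)
    return ss / len(scores)                   # 4. divide by the number of scores


print(variance(dreams))   # 6.6  (SS = 66, N = 10)
```

In Python's standard library this divide-by-N version corresponds to `statistics.pvariance`; `statistics.variance` divides by N - 1 instead and would give a different number.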
Standard Deviation: Measure of Spread

• Most common way of describing the spread of a group of scores
• Approximately the average amount that scores deviate from the mean
• Steps for computing the standard deviation:
1. Figure the variance
2. Take the square root
Measures of Spread
The Standard Deviation
• Formula for the standard deviation:

$$SD = \sqrt{SD^2} = \sqrt{\frac{\sum (X - M)^2}{N}} = \sqrt{\frac{SS}{N}}$$
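
As a brief illustration, the standard deviation of the dreams-per-week scores can be computed by taking the square root of the variance. The function name `standard_deviation` is chosen for this sketch.

```python
import math

dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]


def standard_deviation(scores):
    """SD = square root of (SS / N)."""
    n = len(scores)
    m = sum(scores) / n
    ss = sum((x - m) ** 2 for x in scores)   # sum of squared deviations
    return math.sqrt(ss / n)                 # square root of the variance


sd = standard_deviation(dreams)
print(round(sd, 2))   # 2.57
```

So a typical score in this group lies roughly 2.6 dreams away from the mean of 6.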
Correlations

• The association (or co-relation) between scores on two or more variables
• For example:
  • Optimistic people are healthier
  • Babies look longer at more beautiful women
  • More attractive people earn more money
  • Students with higher attendance get higher grades
Scatter Diagrams (Scatterplots)

• Draw the axes and decide which variable goes on which axis
  • The predictor or “causal” variable goes on the x-axis
• Mark the values on each axis
  • Choose the range of values that covers all of the possible scores
• Mark a dot for each person’s pair of scores


Conservative/Liberal  Government Services
2 5
5 6
4 7
6 4
5 4
3 2
6 7
4 5
5 4
6 5
4 5
5 6
4 3
5 2
3 2
6 4
7 6
5 6
6 6
6 6
4 6
4 4
4 5
[Scatterplot: Government Services (vertical axis, 0 to 8) plotted against
Conservative/Liberal Scale (horizontal axis, 0 to 8)]
Conservative/Liberal  Social Responsibility
4 5
5 6
4 7
6 4
5 4
3 2
6 7
4 5
5 4
6 5
2 5
5 6
4 3
5 2
3 2
6 4
7 6
5 6
6 6
6 6
4 6
4 4
4 5

[Scatterplot: Government Services (vertical axis, 0 to 8) plotted against
Conservative/Liberal Scale (horizontal axis, 0 to 8)]
Conservative/Liberal  Standard of Living
4 5
5 3
4 3
6 4
5 4
3 3
6 2
4 5
5 4
6 3
2 4
5 5
4 3
5 4
3 4
6 3
7 2
5 4
6 1
6 1
4 3
4 5
4 5
[Scatterplot: Standard of Living (vertical axis, 0 to 8) plotted against
Conservative/Liberal Scale (horizontal axis, 0 to 8)]
Conservative/Liberal  Gun Control
4 7
5 3
4 6
6 6
5 4
3 6
6 7
4 7
5 5
6 5
2 4
5 6
4 4
5 4
3 5
6 6
7 4
5 5
6 6
6 6
4 4
4 6
4 6

[Scatterplot: Gun Control (vertical axis, 0 to 8) plotted against
Conservative/Liberal Scale (horizontal axis, 0 to 8)]
Bivariate Correlation

• Correlation coefficient: a statistic that indicates the degree to which
  two variables are related to one another

How is bivariate correlation
measured?
• Pearson correlation coefficient (r): -1.00 ≤ r ≤ +1.00
1. Valence (sign, direction): tells us the direction of the relationship
(+): When V1 increases, V2 increases, OR when V1 decreases, V2 decreases
(-): When V1 increases, V2 decreases, OR when V1 decreases, V2 increases
2. Magnitude (size, strength): the numerical value ignoring the sign, i.e.,
the strength of the relationship
Cohen (1988, 1992):
Large if |r| ≥ .50, moderate if .30 ≤ |r| < .50, small if |r| < .30
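
The two pieces of information carried by r, sign and size, can be read off with a small helper. This is a sketch under assumptions: the function name `describe_correlation` is invented here, and treating a value that falls exactly on a cutoff (.30 or .50) as belonging to the higher category is a choice made for this example.

```python
def describe_correlation(r):
    """Return (direction, strength) for a Pearson r using the slide's cutoffs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.00 and +1.00")

    # 1. Valence: the sign gives the direction of the relationship.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "zero"

    # 2. Magnitude: ignore the sign and compare with the cutoffs.
    size = abs(r)
    if size >= 0.50:          # assumption: exactly .50 counts as large
        strength = "large"
    elif size >= 0.30:        # assumption: exactly .30 counts as moderate
        strength = "moderate"
    else:
        strength = "small"
    return direction, strength


print(describe_correlation(0.53))    # ('positive', 'large')
print(describe_correlation(-0.27))   # ('negative', 'small')
```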

Strong Positive Association
r = .53

• There is a positive association between the reading and writing scores
  of students.
• As students’ writing scores increase, their reading scores also increase.

Strong Negative Association

• There is a negative association between age and weekly internet usage in
  hours.
• Younger people use the internet more each week, whereas older people use
  it less.

Zero Association or Zero
Correlation
• There is zero association between the grades students get on their
  PSYC 103 assignments and the number of friends they have

[Scatterplot axes: grades on PSYC 103 assignment; number of close friends]

Correlation Coefficients
• r: Pearson’s correlation coefficient
• Direction of the correlation
  -1 ≤ r < 0: negative linear correlation
  0 < r ≤ +1: positive linear correlation
• Degree of the correlation
  The further the r value is from 0, the stronger the correlation
• Which correlation is stronger?
  r = 0.32 or r = -0.46
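
The slides do not give a formula for r, so the sketch below uses the standard definition: the sum of the products of the two deviation scores, divided by the square root of the product of the two sums of squares. The function name `pearson_r` is illustrative, and the data are the Conservative/Liberal and Government Services pairs from the scatter diagram table earlier in the deck.

```python
import math


def pearson_r(xs, ys):
    """Pearson r = sum((X - Mx)(Y - My)) / sqrt(SSx * SSy)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    dx = [x - mx for x in xs]                 # deviation scores on X
    dy = [y - my for y in ys]                 # deviation scores on Y
    ss_x = sum(d ** 2 for d in dx)
    ss_y = sum(d ** 2 for d in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(ss_x * ss_y)


# (Conservative/Liberal, Government Services) pairs from the earlier table.
pairs = [(2, 5), (5, 6), (4, 7), (6, 4), (5, 4), (3, 2), (6, 7), (4, 5),
         (5, 4), (6, 5), (4, 5), (5, 6), (4, 3), (5, 2), (3, 2), (6, 4),
         (7, 6), (5, 6), (6, 6), (6, 6), (4, 6), (4, 4), (4, 5)]
r_gov = pearson_r([x for x, _ in pairs], [y for _, y in pairs])
print(round(r_gov, 2))

# Strength depends on distance from 0, not on the sign.
stronger = max([0.32, -0.46], key=abs)
print(stronger)   # -0.46
```

Because |-0.46| is larger than |0.32|, the negative correlation is the stronger of the two.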
Interpreting Correlations

• Changes in one variable relate to changes in the other
• Correlation does NOT mean causation!
• Three possible directions of causality:
  • X causes Y
  • Y causes X
  • A third factor causes X and Y
• Example: Prejudice and between-group contact
Prejudice and Contact

• Measures for prejudice and between-group contact
• Calculate the correlation (e.g., r = -0.27).
• Explanations?
  • Prejudice causes people to limit their between-group contact
  • People who have little between-group contact become prejudiced
  • Fear of the unfamiliar causes both prejudice and the level of contact
Correlation does not mean
Causation
