Psyc 103 (Stats)

Appendix
Going by the Numbers: Statistics in Psychology
Experimental vs. Non-Experimental
• Non-Experimental:
• Correlation: allows us to determine association
• Cannot determine causality!
• Experimental:
• Variables are manipulated so that we can determine
the effect of one variable on another (e.g. drug trials)
Experimental Terminology
• Experimental Group:
• The group of people who are given a treatment (a
variable is manipulated for this sample)
• Control Group:
• The group in an experiment who are not given a
treatment
Experimental Terminology
• Independent Variable:
• The variable that is manipulated for the experimental
group
• Dependent Variable:
• The variable that is being measured (the variable that
is affected by changes in the other)
More Basic Concepts
▪ Variable: A condition or characteristic that can have different values
▪ Value: A possible number or category that a variable can have
▪ Score: A particular person’s value on a variable

Some Basic Terminology
Term       Definition                                                   Examples
Variable   Condition or characteristic that can have different values   Stress level, age, gender, religion
Value      Number or category                                           0, 1, 2, 3, 4, 25, 85, female, Catholic
Score      A particular person’s value on a variable                    0, 1, 2, 3, 4, 25, 85, female, Catholic
Module 56:
Descriptive Statistics
Statistics
Statistics: branch of mathematics that focuses
on the organization, analysis, and interpretation of
a group of numbers (i.e., data)
Method of distinguishing patterns from
randomness/chance
E.g., lottery, coin flip, etc.
I.e., is the difference between groups a real effect
or just random chance?
What does Statistics help us to
do?
• Allows us to determine the relationship between
variables with some degree of certainty
• Variable: a condition or characteristic that can have
different values; something of interest
• Research begins as a question about the relationship
between two variables
• E.g., alcohol consumption → car crashes
The Two Branches of
Statistical Methods
▪ Descriptive statistics
• Summarize/organize a group of numbers from a
research study
▪ Inferential statistics
• Draw conclusions/make inferences that go
beyond the numbers from a research study
• Use sample to make general conclusions
More on Inferential Statistics
• Goal is to draw conclusions about a population of
interest by collecting data on a sample
• Population: Entire group of individuals of interest
• University Students
• Turkish people
• Men who brush their teeth with their non-dominant
hand
• Sample: the particular participants selected to be
studied from the population
Validity
External Validity: how well the sample represents
the population (generalizability)
Ideal Research Design
1. Participants in the experimental and control groups
are identical
2. Both groups are exposed to identical situations
(except for manipulation of the independent variable)
• If not, there may be a confounding variable
• Coke vs Pepsi
3. Sample studied represents the intended population
4. Measurement of dependent variable is accurate and
appropriate for what it’s supposed to be measuring
Types of Measurement
▪ Numeric (quantitative) variables:
1. Equal-interval variables
• e.g., GPA, age, class size
• Discrete
• Continuous
2. Ratio scale → absolute zero
3. Ordinal (rank order) variables
• e.g., position finished in a race
▪ Nominal (qualitative) variables:
4. Categorical
• e.g., gender, religion
Type             Definition                                               Example
Equal-interval   Numeric variable in which differences between values     Stress level, age
                 correspond to differences in the underlying thing
                 being measured
Ordinal          Numeric variable in which values correspond to the       Class standing, position
                 relative position of things measured                     finished in a race
Nominal          Variable in which the values are categories              Gender, religion
Frequency Distributions
• After collecting data, first task:
• Organize and simplify the data so that it is possible
to get a general overview of the results
• This is the goal of descriptive statistics
• One method for simplifying and organizing data:
• Frequency distribution
Organizing Data: Frequency
Tables
▪ Provide a listing of individuals having each of the
different values for a particular variable
▪ e.g., stress ratings of 151 students:
4, 7, 7, 7, 8, 8, 7, 8, 9, 4, 7, 3, 6, 9, 10, 5, 7, 10, 6, 8, 7, 8, 7, 8, 7, 4, 5, 10, 10,
0, 9, 8, 3, 7, 9, 7, 9, 5, 8, 5, 0, 4, 6, 6, 7, 5, 3, 2, 8, 5, 10, 9, 10, 6, 4, 8, 8, 8, 4,
8, 7, 3, 8, 8, 8, 8, 7, 9, 7, 5, 6, 3, 4, 8, 7, 5, 7, 3, 3, 6, 5, 7, 5, 7, 8, 8, 7, 10, 5,
4, 3, 7, 6, 3, 9, 7, 8, 5, 7, 9, 9, 3, 1, 8, 6, 6, 4, 8, 5, 10, 4, 8, 10, 5, 5, 4, 9, 4, 7,
7, 7, 6, 6, 4, 4, 4, 9, 7, 10, 4, 7, 5, 10, 7, 9, 2, 7, 5, 9, 10, 3, 7, 2, 5, 9, 8, 10, 10,
6, 8, 3
Organizing Data: Frequency Tables
• Frequency: Number of scores with a particular
value
• Frequency Distribution: The pattern of frequencies
over different values
Organizing Data:
Steps for Making a Frequency
Table
1. Make a list down the page of each possible
value, from highest to lowest
2. Go one by one through the scores, making a
mark for each next to its value on the list
3. Make a table showing how many times each
value on your list is used
4. Figure the percentage of scores for each value
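The four steps can be sketched in Python. This is only an illustration: it uses the first ten stress ratings from the list above, and it assumes the rating scale runs from 0 to 10 (the function name and its parameters are made up for this sketch).

```python
from collections import Counter

# First ten stress ratings from the list above (a small subset, for brevity).
scores = [4, 7, 7, 7, 8, 8, 7, 8, 9, 4]

def frequency_table(scores, lowest=0, highest=10):
    """Return rows of (value, frequency, percent), highest value first.

    The 0-10 scale limits are an assumption made for this sketch.
    """
    counts = Counter(scores)  # step 2: tally each score next to its value
    n = len(scores)
    rows = []
    # Step 1: list every possible value from highest to lowest.
    for value in range(highest, lowest - 1, -1):
        freq = counts[value]                # step 3: times each value is used
        percent = round(100 * freq / n, 1)  # step 4: percentage of scores
        rows.append((value, freq, percent))
    return rows

table = frequency_table(scores)
print("Value  Frequency  Percent")
for value, freq, percent in table:
    print(f"{value:>5}  {freq:>9}  {percent:>7}")
```

Values that nobody chose still get a row with a frequency of 0, which keeps gaps in the distribution visible.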
Organizing Data: A Frequency Table
Stress Rating   Frequency   Percent
10              14          9.3
9               15          9.9
8               26          17.2
7               31          20.5
6               13          8.6
5               18          11.9
4               16          10.6
3               12          7.9
2               3           2.0
1               1           0.7
0               2           1.3
Steps for Making a
Histogram
1. Make a frequency table
2. Put the values along the bottom of the page, from
left to right, from lowest to highest
3. Make a scale of frequencies along the left edge of
the page that goes from 0 at the bottom to the
highest frequency for any value
4. Make a bar above each value with a height for the
frequency of that value
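As a rough sketch of these steps, the frequencies from the stress-rating table can be turned into a text histogram. Note the difference from the steps above: for simplicity the bars here run sideways (values down the page, bar length = frequency) instead of standing on a horizontal axis. The function name is made up for this illustration.

```python
# Frequencies copied from the stress-rating frequency table (rating: frequency).
frequencies = {0: 2, 1: 1, 2: 3, 3: 12, 4: 16, 5: 18,
               6: 13, 7: 31, 8: 26, 9: 15, 10: 14}

def text_histogram(frequencies):
    """Return one line per value, lowest to highest, with one '#' per score."""
    lines = []
    for value in sorted(frequencies):
        bar = "#" * frequencies[value]
        lines.append(f"{value:>2} | {bar} ({frequencies[value]})")
    return lines

histogram = text_histogram(frequencies)
print("\n".join(histogram))
```

The longest bar (rating 7) marks the high point of the distribution.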
Organizing Data: Frequency Graphs
Histogram
Shapes of Frequency
Distributions
• Unimodal, Bimodal, and
Rectangular
Shapes of Frequency
Distributions
• Symmetrical and Skewed Distributions

Skewness
• Skewed to the left = negatively skewed;
tail (side with fewer scores) to the left
• Ceiling effect
• Skewed to the right = positively skewed;
tail (side with fewer scores) to the right
• Floor effect
Normal Distributions
Normal Curve: normal distributions often approximate a
bell-shaped curve that is unimodal and symmetrical
Summary
• We talked about how to begin to make sense of
a group of scores
• Frequency Tables and Graphs
• What are the main statistical techniques for
describing a group of scores with numbers?
How can we do this?
1. Describe group of scores in terms of a
typical/average/most representative/etc. value
2. Describe how spread out the numbers are in
a group of scores
Central Tendency
• Measures of Central Tendency are those which
tell us the typical value in a group of scores
• e.g., What is the typical height of a Bilkent
undergraduate?
• Mean
• Median
• Mode
Central Tendency: Benefits
Capture a great deal of
information in a single score
Example 2:
45°F = Average high temperature in
December (Ankara, Turkey)
83°F = Average high temperature in
December (Kona, HI)
Central Tendency: Mean
• Sum of all the scores divided by the number of scores
$$M = \frac{\sum X}{N}$$
• Σ = the sum of
• X = each individual score
• ΣX = the sum of all scores
• N = number of scores
Central Tendency: Mean - Example
• Mean # of dreams per week: 7,8,8,7,3,1,6,9,3,8

• ΣX = 7+8+8+7+3+1+6+9+3+8 = 60
• N = 10
• Mean = 60/10 = 6
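The same calculation as a minimal Python sketch (the function name is chosen for this illustration only):

```python
def mean(scores):
    """M = (sum of all scores) / (number of scores)."""
    return sum(scores) / len(scores)

# Number of dreams per week, from the example above.
dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]

print(sum(dreams))   # 60
print(len(dreams))   # 10
print(mean(dreams))  # 6.0
```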
Central Tendency: Mode
• Most common single number in a distribution

• Bimodal distribution
• Mode of 7,8,8,7,3,1,6,9,3,8 = 8
• What type of variable makes the mode a good
choice?
• Nominal variables
• Why?
Mode: Common Error
• The most frequent score
• Single score that is most common
X    f
4    4
3    10
2    14   ← Mode
1    6
0    6
ERROR:
Mistakenly reporting the frequency of
the most common score (14), rather
than the score itself (2)
Central Tendency: Mode
[Figure: the mode as the high point in a distribution’s histogram, using the number of dreams per week example]
The Mode
n = 50
Cat = 17
Dog = 28   ← Mode
Fish = 3
Other = 2
What is the mode for this distribution?
The answer isn’t a number
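A short Python sketch of finding the mode, using the pet counts above and the dreams-per-week scores. The helper name `modes` is made up here; it returns a list so that bimodal distributions are handled too.

```python
from collections import Counter

# Category counts from the slide (n = 50).
pets = Counter({"Cat": 17, "Dog": 28, "Fish": 3, "Other": 2})

def modes(counts):
    """Return every value tied for the highest frequency."""
    top = max(counts.values())
    return [value for value, freq in counts.items() if freq == top]

print(modes(pets))    # ['Dog']: the category itself, not its frequency (28)

# Number of dreams per week, from the earlier example.
dreams = Counter([7, 8, 8, 7, 3, 1, 6, 9, 3, 8])
print(modes(dreams))  # [8]
```

Because the function reports the value rather than its count, it avoids the common error of giving the frequency as the mode.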
Central Tendency: Median
The middle score when all scores are arranged from
lowest to highest
Median of 1, 8, 3, 4, 7
→ 1, 3, 4, 7, 8
→ Median = 4
Median of 7,8,8,7,3,1,6,9,3,8
→ 1, 3, 3, 6, 7, 7, 8, 8, 8, 9
With an even number of scores, the median is the
average (mean) of the two middle scores. Here these are
the 5th and 6th scores (7 and 7), so the median is 7
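Both cases (odd and even number of scores) can be sketched in Python; the function name is chosen for this illustration only.

```python
def median(scores):
    """Middle score after sorting; mean of the two middle scores if N is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([1, 8, 3, 4, 7]))                 # 4
print(median([7, 8, 8, 7, 3, 1, 6, 9, 3, 8]))  # 7.0
```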
The Median
• Midpoint of the distribution, not the midpoint of the
scale (e.g., 0-100)
• Divides the set of scores into two equal-sized groups
• Not halfway between the lowest and highest scores
When To Use Each Measure
• Mean is most commonly used in research
• Vulnerable to distortion when the sample is small or
the data have extreme values
• Mode is used for nominal variables
• Not that vulnerable to extreme scores
• Median is preferable to mean when extreme scores
are in data set
• All three converge for large samples from a symmetrical,
unimodal distribution
• e.g., a normal distribution
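A small sketch of why the median is preferred when extreme scores are present. It reuses the dreams-per-week scores; the extreme score of 90 is a hypothetical value added only for this illustration.

```python
import statistics

# Number of dreams per week, from the earlier example.
dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]

# Hypothetical variation: one person reports 90 dreams instead of 9.
with_extreme = [90 if x == 9 else x for x in dreams]

print(statistics.mean(dreams), statistics.median(dreams))
print(statistics.mean(with_extreme), statistics.median(with_extreme))
```

With the single extreme score, the mean moves from 6 to 14.1 while the median stays at 7.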
Consider the Following…
1st group of #s
1, 2, 3
2nd group of #s
-3, 0, 9
3rd group of #s
-3, -2, 0, 1, 4, 4, 9
Rank order the 3 measures of central tendency (CT) for a
distribution of scores involving a floor effect
Summary
• Mode: Most frequent score
• Always corresponds to an actual score
• Can have multiple modes
• Nominal data
• Median: Middle score
• Not influenced by extreme scores
• Ordinal data
• Mean: Average score
• Represents every score in the distribution
• Distorted by extreme scores
• CT + variability = foundation for inferential statistics
Variability or Dispersion
• In addition to being able to describe the typical
score, we also want to be able to describe how
much the scores differ or vary from each other
• How much do the scores in our sample vary from
the measure of central tendency on average?
▪ What’s our next step?

Range
• Range is the simplest measure of variability
• Difference between highest and lowest score
• e.g., age at Bilkent
• Based entirely on extreme scores
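A minimal sketch of the range, using the dreams-per-week scores from earlier. The second data set with a score of 90 is hypothetical and only shows how a single extreme score changes the range.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]
print(score_range(dreams))  # 8

# Hypothetical: the highest score is 90 instead of 9.
with_extreme = [7, 8, 8, 7, 3, 1, 6, 90, 3, 8]
print(score_range(with_extreme))  # 89
```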
Standard Deviation
• SD: average deviation of a set of scores from
the center of the distribution
Sum of Deviation Scores
• Deviation Score is the extent to which each individual’s score deviates
from the mean
$$X - M$$
We want to know the average or the standard deviation score
$$\frac{\sum (X - M)}{N}$$ ?
Average Deviation Scores?
$$\frac{\sum (X - M)}{N} = \frac{0}{N} = 0$$
• The sum of all deviation scores will ALWAYS be 0!!!
• We need to square each deviation score and obtain
Sum of Squares (SS)
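Why the deviation scores always sum to 0 follows from the definition of the mean, M = ΣX / N. A short derivation, using the same symbols as above:

```latex
\begin{aligned}
\sum (X - M) &= \sum X - N M \\
             &= \sum X - N \cdot \frac{\sum X}{N} \\
             &= \sum X - \sum X = 0
\end{aligned}
```

For the dreams-per-week scores (M = 6), the deviation scores are 1, 2, 2, 1, -3, -5, 0, 3, -3, 2, which add up to 0.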
Sum of Squared Deviation Scores
• Sum of all squared deviations from the mean:
$$\sum (X - M)^2 = SS$$
Variance: Average Sum of Squared
Deviation Scores
• Formula for the Variance
$$SD^2 = \frac{\sum (X - M)^2}{N} = \frac{SS}{N}$$
Variance: Measure of Spread
• The average of each score’s squared
difference from the mean
• Steps for computing the variance:
1. Subtract the mean from each score
2. Square each of these deviation scores
3. Add up the squared deviation scores
4. Divide the sum of squared deviation scores by
the number of scores
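The four steps, sketched in Python with the dreams-per-week scores from earlier (the function name is chosen for this illustration only):

```python
def variance(scores):
    """SD^2 = SS / N, following the four steps above."""
    n = len(scores)
    m = sum(scores) / n
    deviations = [x - m for x in scores]    # step 1: subtract the mean
    squared = [d ** 2 for d in deviations]  # step 2: square each deviation
    ss = sum(squared)                       # step 3: sum of squares (SS)
    return ss / n                           # step 4: divide by N

dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]
print(variance(dreams))  # SS = 66, N = 10, so the variance is 6.6
```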
Standard Deviation: Measure of Spread
• Most common way of describing the
spread of a group of scores
• Approximately the average amount that
scores deviate from the mean
• Steps for computing the standard
deviation:
1. Figure the variance
2. Take the square root
Measures of Spread
The Standard Deviation
• Formula for the standard deviation:
$$SD = \sqrt{SD^2} = \sqrt{\frac{\sum (X - M)^2}{N}} = \sqrt{\frac{SS}{N}}$$
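A sketch of the formula in Python, again with the dreams-per-week scores. The cross-check uses `statistics.pstdev` from the standard library, which also divides by N; `statistics.stdev` divides by N - 1 and would give a slightly larger value than the formula on this slide.

```python
import math
import statistics

def standard_deviation(scores):
    """SD = square root of (SS / N)."""
    n = len(scores)
    m = sum(scores) / n
    ss = sum((x - m) ** 2 for x in scores)  # sum of squared deviations
    return math.sqrt(ss / n)

dreams = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]
sd = standard_deviation(dreams)
print(round(sd, 2))  # 2.57

# Cross-check against the library's divide-by-N version.
assert math.isclose(sd, statistics.pstdev(dreams))
```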
Correlations
• The association (or co-relation) between scores on two
or more variables
• For example:
• Optimistic people are healthier
• Babies look longer at more beautiful women
• More attractive people earn more money
• Students with higher attendance get higher grades
Scatter Diagrams (Scatterplots)
• Draw the axes and decide which variable goes on which axis
• The predictor or “causal” variable goes on the x-axis
• Mark the values on each axis

• Choose the range of values that covers all of the possible scores
• Mark a dot for each person’s pair of scores

Conservative/Liberal   Government Services
2 5
5 6
4 7
6 4
5 4
3 2
6 7
4 5
5 4
6 5
4 5
5 6
4 3
5 2
3 2
6 4
7 6
5 6
6 6
6 6
4 6
4 4
4 5
[Figure: scatter diagram of the scores above. x-axis: Conservative/Liberal Scale (0 to 8); y-axis: Government Services (0 to 8)]
Conservative/Liberal   Social Responsibility
4 5
5 6
4 7
6 4
5 4
3 2
6 7
4 5
5 4
6 5
2 5
5 6
4 3
5 2
3 2
6 4
7 6
5 6
6 6
4 6
4 4
4 5
[Figure: scatter diagram of the scores above. x-axis: 0 to 8; y-axis (labeled "Government Services" in the source): 0 to 8]
Conservative/Liberal   Standard of Living
4 5
5 3
4 3
6 4
5 4
3 3
6 2
4 5
5 4
6 3
2 4
5 5
4 3
5 4
3 4
6 3
7 2
5 4
6 1
6 1
4 3
4 5
4 5
[Figure: scatter diagram of the scores above. x-axis: Conservative/Liberal (0 to 8); y-axis: Standard of Living (0 to 8)]
Conservative/Liberal   Gun Control
4 7
5 3
4 6
6 6
5 4
3 6
6 7
4 7
5 5
6 5
2 4
5 6
4 4
5 4
3 5
6 6
7 4
5 5
6 6
4 4
4 6
4 6
[Figure: scatter diagram of the scores above. x-axis: Conservative/Liberal (0 to 8); y-axis: Gun Control (0 to 8)]
Bivariate Correlation
• Correlation Coefficient: a statistic that indicates the
degree to which two variables are related to one
another
How is bivariate correlation
measured?
• Pearson Correlation Coefficient (r): −1.00 ≤ r ≤ +1.00
1. Valence (sign, direction)
Tells us the direction of the relationship.
(+): When V1 increases, V2 increases OR when V1 decreases, V2 decreases
(−): When V1 increases, V2 decreases OR when V1 decreases, V2 increases
2. Magnitude (size, strength): The numerical value ignoring the sign;
tells us the strength of the relationship
Cohen (1988, 1992):
Large if |r| > .50, Moderate if .30 < |r| < .50, Small if |r| < .30
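The slides do not give the formula for r, so the sketch below assumes the standard definition: the sum of the products of the two deviation scores, divided by the square root of the product of the two sums of squares. The attendance and grade numbers are hypothetical, the function names are made up, and treating values exactly at .30 or .50 as belonging to the higher category is also an assumption, since the cut-offs above leave those boundaries open.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from deviation scores (standard definition, assumed here)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sum_products = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sum_products / math.sqrt(ss_x * ss_y)

def describe(r):
    """Return (direction, strength) using the Cohen cut-offs listed above."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "zero"
    size = abs(r)  # magnitude ignores the sign
    if size >= 0.50:
        strength = "large"
    elif size >= 0.30:
        strength = "moderate"
    else:
        strength = "small"
    return direction, strength

# Hypothetical data: classes attended and grade for five students.
attendance = [2, 4, 6, 8, 10]
grades = [50, 60, 65, 80, 85]

r = pearson_r(attendance, grades)
print(round(r, 2), describe(r))
```

Because strength depends only on the absolute value, r = -0.46 counts as a stronger correlation than r = 0.32.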
Strong Positive Association
r = .53
• There is a positive
association between
reading and writing
scores of students.
• As students’ writing
scores increase,
their reading scores
also tend to increase.
Strong Negative Association
• There is a negative
association between
age and weekly internet
usage in hours
• Younger people use
the internet more each week,
whereas older people
use it less
Zero Association or Zero
Correlation
• There is zero
association between the
grades students get
on their Psyc 103
assignments and the
number of friends
they have
[Figure: scatter diagram. x-axis: Number of close friends; y-axis: Grades on Psyc 103 assignment]
Correlation Coefficients
• r: Pearson’s correlation coefficient
• Direction of the correlation
−1 ≤ r < 0: negative linear correlation
0 < r ≤ +1: positive linear correlation
• Degree of the correlation
The further the r value is from 0 the stronger the correlation
• Which correlation is stronger?
r = 0.32 or r = -0.46
Interpreting Correlations
• Changes in one variable relate to changes in the other

• Correlation does NOT mean causation!
• Three possible directions of causality:

• X causes Y
• Y causes X
• A third factor causes X and Y
• Example: Prejudice and Between Group Contact
Prejudice and Contact
• Measures for prejudice and between-group contact
• Calculate the correlation (e.g., r = -0.27).
• Explanations?
• Prejudice causes people to limit their between-group contact
• People who have low between-group contact become prejudiced
• Fear of the unfamiliar causes both prejudice and low contact
Correlation does not mean
Causation
