
Elementary Statistics for AP Psychology

by Laura Lincoln Maitland
A large amount of data can be collected in research studies. Psychologists
need to make sense of the data. Qualitative data are frequently changed to
numerical data for ease of handling. Quantitative data are already numerical.
Numbers that are used simply to name something are said to be on a nominal
scale and can be used to count the number of cases. For example, for a survey,
girls can be designated as "1," whereas boys can be designated as "2." These
numbers have no intrinsic meaning. Numbers that can be ranked are said to be on
an ordinal scale, and can be put in order. For example, the highest scorer can be
designated as "1," the second highest as "2," the third highest as "3," etc. These
numbers cannot be averaged. Number 1 could have scored 50 points higher than 2.
Number 2 may have scored 4 points higher than 3. If there is a meaningful
difference between each of the numbers, the numbers are said to be on an interval
scale. For example, the difference between 32° Fahrenheit (F) and 42°F is 10°F.
The difference between 64°F and 74°F is also 10°F. However, 64°F is not twice as
hot as 32°F. When a meaningful ratio can be made with two numbers, the numbers
are said to be on a ratio scale. The key difference between an interval scale and a
ratio scale is that the ratio scale has a real or absolute zero point. For quantities of
weight, volume, and distance, zero is a meaningful concept, whereas the meaning
of 0°F is arbitrary.
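The interval-versus-ratio distinction can be checked with a little arithmetic. The sketch below uses the temperatures from the paragraph above; the Kelvin conversion and the weight comparison are illustrative additions, not part of the original discussion.

```python
# Interval vs. ratio scales, using the temperatures from the text.
# The Kelvin conversion and the weight example are illustrative additions.

def fahrenheit_to_kelvin(deg_f):
    """Convert degrees Fahrenheit to kelvins (a scale with an absolute zero)."""
    return (deg_f - 32) * 5 / 9 + 273.15

POUNDS_TO_KILOGRAMS = 0.45359237  # exact definition of the pound

# Interval scale: differences are meaningful and equal...
diff_low = 42 - 32
diff_high = 74 - 64

# ...but the ratio 64/32 = 2 depends on where 0 degrees F happens to sit.
ratio_fahrenheit = 64 / 32
ratio_kelvin = fahrenheit_to_kelvin(64) / fahrenheit_to_kelvin(32)

# Ratio scale: 64 lb is twice 32 lb no matter which unit is used.
ratio_pounds = 64 / 32
ratio_kilograms = (64 * POUNDS_TO_KILOGRAMS) / (32 * POUNDS_TO_KILOGRAMS)

print("differences:", diff_low, diff_high)
print("temperature ratios:", ratio_fahrenheit, round(ratio_kelvin, 3))
print("weight ratios:", ratio_pounds, round(ratio_kilograms, 3))
```

The temperature ratio changes when the zero point changes, while the weight ratio stays at 2 in either unit, which is what it means for only the second scale to have an absolute zero.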
Statistics is a field that involves the analysis of numerical data about
representative samples of populations.
Descriptive Statistics
Numbers that summarize a set of research data obtained from a sample are
called descriptive statistics. In general, descriptive statistics describe sets of
interval or ratio data. After collecting data, psychologists organize the data to
create a frequency distribution, an orderly arrangement of scores indicating the
frequency of each score or group of scores. The data can be pictured as a
histogram (a bar graph from the frequency distribution) or as a frequency
polygon (a line graph that replaces the bars with single points and connects the
points with a line). With a very large number of data points, the frequency polygon
approaches a smooth curve. Frequency polygons are shown in Figure 6.1.
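As a rough illustration of a frequency distribution, the sketch below tallies a small set of quiz scores (5, 6, 7, 7, 7, 8, 8, 9, 9, 10) and prints a text-only histogram. The helper name text_histogram is made up for this example.

```python
from collections import Counter

# A small set of quiz scores used for illustration.
scores = [5, 6, 7, 7, 7, 8, 8, 9, 9, 10]

# Frequency distribution: each score paired with how often it occurs.
frequency = Counter(scores)

def text_histogram(freq):
    """Return one line per score, with one '#' per occurrence."""
    return [f"{score:>2} | {'#' * freq[score]}" for score in sorted(freq)]

for line in text_histogram(frequency):
    print(line)
```

Connecting the ends of the rows of '#' marks with a line would give the corresponding frequency polygon.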
Measures of Central Tendency
Measures of central tendency describe the average or most typical scores
for a set of research data or distribution. Measures of central tendency include the
mode, median, and mean. The mode is the most frequently occurring score in a set
of research data. If two scores appear most frequently, the distribution is bimodal;
if three or more scores appear most frequently, the distribution is multimodal. The
median is the middle score when the set of data is ordered by size. For an odd
number of scores, the median is the middle one. For an even number of scores, the
median lies halfway between the two middle scores. The mean is the arithmetic
average of the set of scores. The mean is determined by adding up all of the
scores, then dividing by the number of scores. For the set of quiz scores 5, 6, 7, 7,
7, 8, 8, 9, 9, 10: the mode is 7; the median is 7.5; the mean is 7.6. The mode is the
least used measure of central tendency, but can be useful to provide a "quick and
dirty" measure of central tendency, especially when the set of data has not been
ordered. The mean is generally the preferred measure of central tendency because
it takes into account the information in all of the data points; however, it is very
sensitive to extremes. The mean is pulled in the direction of extreme data points.
The advantage of the median is that it is less sensitive to extremes, but it doesn't
take into account all of the information in the data points. The mean, mode, and
median turn out to be the same score in symmetrical distributions. The two sides
of the frequency polygon are mirror images, as shown in Figure 6.1a. The normal
distribution or normal curve is a symmetric, bell-shaped curve that represents
data about how many human characteristics are dispersed in the population.
Distributions where most of the scores are squeezed into one end are skewed. A
few of the scores stretch out away from the group like a tail. The skew is named
for the direction of the tail. Figure 6.1b pictures a negatively skewed distribution,
and Figure 6.1c shows a positively skewed distribution. The mean is pulled in the
direction of the tails, so the mean is lower than the median in a negatively skewed
distribution, and higher than the median in a positively skewed distribution. In
very skewed distributions, the median is a better measure of central tendency than
the mean.
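The three measures can be checked with Python's standard statistics module. This is a minimal sketch using the quiz scores from the paragraph above; the skewed list, in which the lowest score is replaced by 0, is a hypothetical variation added to show the mean being pulled toward a tail.

```python
import statistics

# Quiz scores from the text.
scores = [5, 6, 7, 7, 7, 8, 8, 9, 9, 10]

mode = statistics.mode(scores)         # most frequently occurring score
modes = statistics.multimode(scores)   # every most-frequent score (more than one if bimodal)
median = statistics.median(scores)     # halfway between the two middle scores
mean = statistics.mean(scores)         # sum of the scores divided by their number
print(mode, modes, median, mean)

# Hypothetical variation: one very low score stretches a tail toward the
# low end (a negative skew). The median stays put; the mean drops.
skewed = [0, 6, 7, 7, 7, 8, 8, 9, 9, 10]
skewed_median = statistics.median(skewed)
skewed_mean = statistics.mean(skewed)
print(skewed_median, skewed_mean)
```

In the skewed list the mean falls below the median, matching the pattern described for negatively skewed distributions.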
Measures of Variability
Variability describes the spread or dispersion of scores for a set of research
data or distribution. Measures of variability include the range, variance, and
standard deviation. The range is the largest score minus the smallest score. It is a
rough measure of dispersion. For the same set of quiz scores (5, 6, 7, 7, 7, 8, 8, 9,
9, 10), the range is 5. Variance and standard deviation (SD) indicate the degree
to which scores differ from each other and vary around the mean value for the set.
Variance and standard deviation indicate both how much scores group together
and how dispersed they are. Variance is determined by computing the difference
between each value and the mean, squaring each of those differences
(to eliminate negative signs), summing the squared differences, then
dividing that sum by the number of scores. The standard deviation of the
distribution is the square root of the variance. For a different set of quiz scores (6,
7, 8, 8, 8, 8, 8, 8, 9, 10), the variance is 1 and the SD is 1. Standard deviation must
fall between 0 and half the value of the range. If the standard deviation approaches
0, scores are very similar to each other and very close to the mean. If the standard
deviation approaches half the value of the range, scores vary greatly from the
mean. Frequency polygons with the same mean and the same range, but a different
standard deviation, that are plotted on the same axes show a difference in
variability by their shapes. The taller and narrower frequency polygon shows less
variability and has a lower standard deviation than the short and wider one.
Since you don't bring a calculator to the exam, you won't be required to figure out
variance or standard deviation.
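The steps for variance and standard deviation can be traced in a few lines of code. This is a sketch of the calculation as described above, dividing by the number of scores; be aware that many statistics packages divide by one less than the number of scores when estimating from a sample, which gives a slightly larger value. The function names are made up for this example.

```python
import math

def score_range(values):
    """Largest score minus smallest score."""
    return max(values) - min(values)

def population_variance(values):
    """Average squared difference from the mean, following the steps in the text."""
    mean = sum(values) / len(values)
    squared_diffs = [(v - mean) ** 2 for v in values]  # squaring removes negative signs
    return sum(squared_diffs) / len(values)

def population_sd(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(population_variance(values))

first_set = [5, 6, 7, 7, 7, 8, 8, 9, 9, 10]
second_set = [6, 7, 8, 8, 8, 8, 8, 8, 9, 10]

print("range of first set:", score_range(first_set))
print("variance of second set:", population_variance(second_set))
print("SD of second set:", population_sd(second_set))

# The SD computed this way cannot exceed half the range.
print(population_sd(second_set) <= score_range(second_set) / 2)
```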
Correlation
Scores can be reported in different ways. One example is the
standard score or z score. Standard scores enable psychologists to compare
scores that are initially on different scales. For example, a z score of 1 for
an IQ test might equal 115, while a z score of 1 for the SAT I might equal
600. The mean score of a distribution has a standard score of zero. A score
that is one standard deviation above the mean has a z score of 1. A standard
score is computed by subtracting the mean raw score of the distribution
from the raw score of interest, then dividing the difference by the standard
deviation of the distribution of raw scores. Another type of score, the
percentile score, indicates the percentage of scores at or below a particular
score. Thus, if you score at the 90th percentile, 90% of the scores are the
same or below yours. Percentile scores vary from 1 to 99.
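Both kinds of score are simple to compute. In the sketch below, the scale parameters are assumptions chosen to match the example above (an IQ test with mean 100 and SD 15, an SAT section with mean 500 and SD 100); the text itself gives only the resulting scores of 115 and 600. The quiz scores are a small illustrative data set.

```python
def z_score(raw, mean, sd):
    """Standard score: (raw score - mean) / standard deviation."""
    return (raw - mean) / sd

def percentile_score(score, all_scores):
    """Percentage of scores at or below the given score."""
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100 * at_or_below / len(all_scores)

# Assumed scale parameters (not stated in the text).
print(z_score(115, mean=100, sd=15))    # IQ score one SD above the mean
print(z_score(600, mean=500, sd=100))   # SAT score one SD above the mean
print(z_score(100, mean=100, sd=15))    # the mean itself

quiz = [5, 6, 7, 7, 7, 8, 8, 9, 9, 10]
print(percentile_score(9, quiz))
```

Note that this simple "at or below" calculation returns 100 for the top score in a small data set, whereas reported percentile scores are conventionally limited to the 1 to 99 range.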
A statistical measure of the degree of relatedness or association between
two sets of data, X and Y, is called the correlation coefficient. The correlation
coefficient (r) varies from -1 to +1. One indicates a perfect relationship between
the two sets of data. If the correlation coefficient is -1, that perfect relationship is
inverse; as one variable increases, the other variable decreases. If the correlation
coefficient (r) is +1, that perfect relationship is direct; as one variable increases, the
other variable increases, and as one variable decreases, the other variable
decreases. A correlation coefficient (r) of 0 indicates no relationship at all between
the two variables. As the correlation coefficient approaches -1 or +1, the
relationship between variables gets stronger. Correlation coefficients are useful
because they enable psychologists to make predictions about Y when they know
the value of X and the correlation coefficient. For example, if r = .9 for scores of
students in an AP Biology class and for the same students in AP Psychology class,
a student who earns an A in biology probably earns an A in psychology, whereas a
student who earns a D in biology probably earns a D in psychology. If r = .1 for
scores of students in an English class and scores of the same students in AP
Calculus class, knowing the English grade doesn't help predict the AP Calculus
grade.
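The text does not give a formula for r. One common way to compute it is the Pearson product-moment formula, sketched below under that assumption. The three small data sets are hypothetical and were chosen so that r comes out at the extremes and at zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired scores (assumed formula)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much X and Y vary together, relative to how much each varies alone.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(spread_x * spread_y)

# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
direct = [2, 4, 6, 8, 10]      # rises in step with x
inverse = [10, 8, 6, 4, 2]     # falls in step with x
unrelated = [3, 1, 4, 1, 3]    # no consistent rise or fall with x

for label, y in [("direct", direct), ("inverse", inverse), ("unrelated", unrelated)]:
    print(label, round(pearson_r(x, y), 3))
```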
Correlation does not imply causation. Correlation indicates only that there is a
relationship between variables, not how the relationship came about.
The strength and direction of correlations can be illustrated graphically in
scattergrams or scatterplots in which paired X and Y scores for each subject are
plotted as single points on a graph. The direction in which the best-fitting line
slopes shows the direction of the relationship between the two variables, and how
tightly the points cluster around that line suggests its degree. For a perfect
positive correlation (r = +1), every point falls on an upward-sloping line, as in
Figure 6.2a. For a perfect negative correlation (r = -1), every point falls on a
downward-sloping line, as in Figure 6.2b. Where dots are scattered all over the
plot and no appropriate line can be drawn, r = 0 as in Figure 6.2c, which indicates
no relationship between the two sets of data.
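A short calculation can make the link between the best-fitting line and r more concrete. The sketch below assumes a least-squares line and the Pearson formula for r, neither of which is specified in the text, and uses hypothetical data in which every point falls exactly on a line.

```python
import math

def slope_and_r(xs, ys):
    """Return (least-squares slope, Pearson r) for paired scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

# Hypothetical data: every point lies exactly on a straight line.
x = [1, 2, 3, 4, 5]
steep = [10, 20, 30, 40, 50]
shallow = [1.5, 2.0, 2.5, 3.0, 3.5]
falling = [50, 40, 30, 20, 10]

for label, y in [("steep", steep), ("shallow", shallow), ("falling", falling)]:
    slope, r = slope_and_r(x, y)
    print(label, "slope:", slope, "r:", r)
```

The steep and shallow lines have different slopes but the same r, so r tracks the direction of the line and how closely the points follow it rather than how steep the line is.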
Inferential Statistics
Inferential statistics are used to interpret data and draw conclusions. They
tell psychologists whether or not they can generalize from the chosen sample to
the whole population, if the sample actually represents the population. Inferential
statistics use rules to evaluate the probability that a correlation or a difference
between groups reflects a real relationship and not just the operation of chance
factors on the particular sample that was chosen for study. Statistical significance
(p) is a measure of the likelihood that the difference between groups results from a
real difference between the two groups rather than from chance alone. Results are
likely to be statistically significant when there is a large difference between the
means of the two frequency distributions, when their standard deviations (SD) are
small, and when the samples are large. Some psychologists consider that results
are significantly different only if the results have less than a 1 in 20 probability of
being caused by chance (p < .05). Others consider that results are significantly
different only if the results have less than a 1 in 100 probability of being caused by
chance (p < .01). The lower the p value, the less likely the results were due to
chance. Results of research that are statistically significant may be practically
important or trivial. Statistical significance does not imply that findings are really
important. Meta-analysis provides a way of statistically combining the results of
individual research studies to reach an overall conclusion. Scientific conclusions
are always tentative and open to change should better data come along. Good
psychological research gives us an opportunity to learn the truth.
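The text does not name a specific significance test. One way to see where a p value can come from is an exact permutation test, sketched below as an illustration rather than as the method psychologists necessarily use. It asks: if group labels were assigned purely by chance, how often would the difference between group means be at least as large as the one observed? The two pairs of groups are hypothetical quiz scores.

```python
from fractions import Fraction
from itertools import combinations

def mean(values):
    """Exact mean, kept as a fraction to avoid rounding issues."""
    return Fraction(sum(values), len(values))

def permutation_p_value(group_a, group_b):
    """Share of all possible regroupings whose mean difference is
    at least as large as the observed one (two-sided)."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen_set]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mean(a) - mean(b)) >= observed:
            extreme += 1
    return Fraction(extreme, total)

# Hypothetical scores: large mean difference, small spread within each group.
clearly_different = permutation_p_value([8, 9, 9, 10, 10], [5, 6, 6, 7, 7])

# Hypothetical scores: small mean difference, heavily overlapping groups.
overlapping = permutation_p_value([6, 7, 8, 9, 10], [5, 7, 8, 9, 9])

print(float(clearly_different), clearly_different < Fraction(1, 100))
print(float(overlapping), overlapping < Fraction(1, 20))
```

The first comparison falls below the 1 in 100 threshold, while the second does not even reach the 1 in 20 threshold, in line with the point that large differences between means and small standard deviations favor statistical significance.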
© Copyright 2006-2012 Education.com All Rights Reserved.
http://www.education.com/study-help/article/elementary-statistics/