Objectives
1. To introduce and use a scientific method.
2. To introduce and practice using simple statistics.
3. To learn how to write scientific reports.
03_pfl61305_Lab A_001-014.indd 1
10/29/14 9:53 AM
Figure 1
A scientific method that incorporates the hypothetico-deductive approach and falsificationist procedure.
[Flowchart: Start; Observations (patterns in space or time); Models (explanations or theories); Hypothesis H1 (prediction based on model); Null Hypothesis H0 (logical opposite to H1); Experiment (critical test of H0); Retain H0 (refute hypothesis and model) or Reject H0 (support hypothesis and model); Interpretation.]
Figure 2
[Flowchart of the scientific method applied to a UV light and skin cancer example. Box labels: Start; Observations; Model; Hypothesis (H1); Null Hypothesis (H0); Experiment; Retain H0; Reject H0; Interpretation. Box text: "Excessive exposure to UV light causes skin cancer." "Excessive exposure to UV light may cause skin cancer." "There is no difference in incidence of skin cancer between mice exposed to UV light and mice that are not." "Five mice for control; five mice for experimental group." "Experimental mice exposed to UV light 20 minutes each day for one month. Control mice received no exposure." "After one month, biopsy of the skin of both groups of mice."]
Statistics
As stated previously, it is almost never feasible to make
all of the possible measurements that might prove a
hypothesis. In addition, in natural populations, there
often is considerable variation (consider the human
species). So, it often is possible for a hypothesis to be
true for most, say, >95% of the population, although
it is not true for a few individuals. Consequently,
we can rarely say categorically that a hypothesis is
true, although sufficient supporting evidence can be
amassed that a hypothesis very likely is true. But how
likely is very likely?
Quantitatively, the likelihood that a hypothesis is
true is calculated as the probability that the hypothesis
represents accurately all the possible data to which it is
applicable. The probability is calculated using statistics.
Statistics are commonly divided into two
types: descriptive and inferential. Descriptive
statistics (e.g., mean and standard deviation) describe
the pattern (i.e., distribution) of measurements and
might be used to see whether observed groups of
measurements (i.e., samples) are the same as expected.
Inferential statistics (e.g., the t-test), in contrast, are used
to assess whether two samples come from the
same population. Brief descriptions are provided below
to help you to understand these statistics. However,
for LS23L, you are not required to remember the
equations.
Definitions
Several definitions will help you to understand how
statistics are calculated, how they relate to your
measurements, and what they really mean.
Population: the entire collection of measurements about
which the researcher intends to draw conclusions, e.g.,
adult weight of the human population in South America, or
height of eucalyptus trees in Los Angeles County.
Sample: the set of measurements (X1, X2, X3, ..., Xn)
actually made (e.g., sampling daily dietary calories of
one thousand individuals from each capital of a South
American country; or sampling height of fifty eucalyptus
trees in each LA neighborhood).
Descriptive Statistics
A few terms are commonly used in statistics to
describe the characteristics of a set of
measurements. These terms, called parameters,
describe either the central tendency or the
dispersion of the measurements. However, because
it is usually impossible to obtain every measurement
of a particular variable, the parameter itself is usually
not available. Instead, an estimate of the parameter
is calculated to serve as a description of these
measurements. An estimate of a parameter is called
a statistic. The following explains three statistics
that measure the central tendency and one statistic
that describes the level of dispersion of a set of
measurements. We are going to incorporate these
statistics into the lab report.
1. Mean
One of the statistics that measures the central tendency
of a variable is the mean, more commonly known
as the arithmetic average. The mean of a sample
is calculated as the sum of all measurements in the
sample divided by the sample size (n).
Mean = $\bar{X} = (X_1 + X_2 + X_3 + \dots + X_n)/n = \sum X_i / n$
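The calculation of the mean can be sketched in a few lines of Python. This is only an illustration: the function name `sample_mean` and the sample values are hypothetical and are not part of the lab materials.

```python
def sample_mean(measurements):
    """Return the arithmetic mean: the sum of all measurements divided by n."""
    n = len(measurements)
    if n == 0:
        raise ValueError("the sample must contain at least one measurement")
    return sum(measurements) / n


# Hypothetical sample (illustrative values, not lab data).
heights = [12.0, 15.0, 11.0, 14.0]
print(sample_mean(heights))  # 52.0 / 4 = 13.0
```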
2. Median
The second statistic that measures the central tendency
is the median. The median is the measurement located at
the middle of the ordered set of data. In other words,
there are just as many observations larger than the
median as there are smaller. If the sample size is odd,
the median is the middle measurement of the ordered
series. If the sample size is even, the median is the
average of the two middle measurements. For
example, the median of 3, 5, and 8 is 5, and the median
of 3, 5, 8, and 10 is (5 + 8)/2 = 6.5.
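The odd and even cases of the median can be sketched in Python as follows. The function name `sample_median` and the values are hypothetical, chosen only for illustration.

```python
def sample_median(measurements):
    """Return the middle value of the ordered sample.

    For an even sample size, return the average of the two middle values.
    """
    ordered = sorted(measurements)
    n = len(ordered)
    if n == 0:
        raise ValueError("the sample must contain at least one measurement")
    mid = n // 2
    if n % 2 == 1:
        # Odd sample size: a single middle measurement exists.
        return ordered[mid]
    # Even sample size: average the two middle measurements.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(sample_median([8, 3, 5]))      # ordered: 3, 5, 8 -> 5
print(sample_median([10, 3, 8, 5]))  # ordered: 3, 5, 8, 10 -> 6.5
```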
3. Mode
4. Standard Deviation
The standard deviation is a measure of variation
around the mean. Any measurement that is not
equal to the mean deviates from the mean. The
size of the deviation is calculated as $(X_i - \bar{X})$. The
standard deviation (s) is calculated using the sum of all
squared deviations (see the equation that follows).
If all measurements are the same as the mean, the
standard deviation of the sample is zero. However,
measurements usually are variable and therefore the
standard deviation is greater than zero.
$$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$
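A minimal Python sketch of the sample standard deviation, assuming the usual form with n - 1 in the denominator. The function name `sample_std_dev` and the values are hypothetical.

```python
import math


def sample_std_dev(measurements):
    """Return the sample standard deviation (n - 1 in the denominator)."""
    n = len(measurements)
    if n < 2:
        raise ValueError("at least two measurements are needed")
    mean = sum(measurements) / n
    # Sum of squared deviations of each measurement from the mean.
    sum_squared_deviations = sum((x - mean) ** 2 for x in measurements)
    return math.sqrt(sum_squared_deviations / (n - 1))


# Identical measurements have no variation around the mean.
print(sample_std_dev([5, 5, 5]))  # 0.0

# Hypothetical variable sample (illustrative values, not lab data).
print(sample_std_dev([2, 4, 4, 4, 5, 5, 7, 9]))
```

Python's standard library offers the same calculation as `statistics.stdev`, which can be used to check results computed by hand.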
Inferential Statistics
So far we have only discussed a few statistics that
describe a group of data. However, the essence of a
statistical analysis is to answer a question objectively
by conducting a statistical test. A statistical test
compares two or more samples in order
to determine, for example, whether they are from the same
population. In this lab we are going to explore only
one of the commonly used statistical tests. You are
not expected to become an expert on statistics, since
it takes much more than one course to master this
discipline. The purpose of this lab is to introduce you
to the objective methods modern scientists use to
answer their questions.
t-Test
Quite often a scientific study relies on a comparison
between two or more sample groups. In order to talk
about differences (or lack of differences) between
these groups in a meaningful way, it is necessary to
have a measurement that all scientists recognize and
understand; this is where statistical tests come in
handy. Many statistical tests have been developed to
allow scientists to calculate the significance of the
differences they see in their data. In this experiment,
we will be using the t-test, a very useful tool that
assesses the difference between two samples by
comparing their means while taking into account
their variances. (You may also see it referred to as
the Student's t-test. This does not signify a t-test
with training wheels. It is the real t-test, published
originally by an author who used the pseudonym
"Student.") In order to determine the t-value, it is
necessary first to calculate several of the descriptive
$$t = \frac{|\bar{X}_1 - \bar{X}_2|}{s_{\bar{X}_1 - \bar{X}_2}}$$
less than 5%, you will conclude that the two samples
are significantly different.
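A t-value of this kind can be sketched in Python as the absolute difference between two sample means divided by a standard error of that difference. The particular standard error used here, the square root of s1²/n1 + s2²/n2, is an assumption made for illustration; the Web site used in this lab may calculate it differently. The function name `t_value` and the sample values are hypothetical.

```python
import math
import statistics


def t_value(sample_1, sample_2):
    """Return |mean_1 - mean_2| divided by a standard error of the difference.

    Assumption: the standard error is sqrt(s1^2/n1 + s2^2/n2), where s^2 is
    the sample variance (n - 1 in the denominator) of each sample.
    """
    mean_1 = statistics.mean(sample_1)
    mean_2 = statistics.mean(sample_2)
    var_1 = statistics.variance(sample_1)
    var_2 = statistics.variance(sample_2)
    standard_error = math.sqrt(var_1 / len(sample_1) + var_2 / len(sample_2))
    return abs(mean_1 - mean_2) / standard_error


# Hypothetical scores for two groups (illustrative values, not lab data).
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
print(round(t_value(group_a, group_b), 3))
```

A larger t-value means the difference between the means is large relative to the variation within the samples.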
Laboratory Exercise
In your laboratory exercise today, you will have a
chance to apply some of the scientific and statistical
concepts you have just read about while participating
in an actual ongoing research project. You will
be expected to follow the hypothetico-deductive
approach by formulating your own hypothesis and
null hypothesis. Using the t-test, you will then be able
to determine whether or not your sample groups are
significantly different from each other. Today you are
participating in real research and contributing actual
data for possible future publication.
The current project proposes to assess cognitive
functioning of undergraduate students through
sophisticated computerized measures developed
by a neuropsychologist. The Memory Interference
Test (MIT) is a computer program that uses either
visual or auditory cues to test the subject's memory.
In addition, a demographic survey asks questions
about the subject's mental and physical states at the
time of the test, along with information about his or
her age, education level, and background. Subjects
can choose not to answer any questions that make
them uncomfortable, and all data remain completely
anonymous. Responses will be sent automatically and
electronically to an aggregated database; specific
scores and background data will not be available
to anyone. For research purposes, demographic
information about a subgroup will be accessible only
if that group is larger than 50. This restriction protects
students' anonymity, while ensuring good research
design with an adequate group size.
The MIT has several cognitive measures. The
picture memory tests (pictures, faces, designs, and
Kanji) flash images onto the screen, while the word
memory test flashes written words. In the auditory
test, the subject wears headphones and listens to lists
of words with no visual cues. Each version of the MIT
consists of four memory tests and a reaction time
test. Tests 1, 2, and 3 are identical. Each presents a
target list of twenty items and then a recognition
list of fifty items. The recognition list consists of the
twenty target items randomly interspersed among
thirty additional items (referred to as distracters). The
subject identifies which items he or she recognizes from the
previously presented target list. Test 4 presents an
additional recognition list of sixty items, consisting of
ten items from each of the target lists of Tests 1, 2, and
3, together with thirty distracters. The subject is asked
to identify which of the items in the recognition list
appeared in the three previously presented target lists.
Test 5 is a test of reaction times only, independent of
any memory effects. It presents a group of fifty items,
MIT Manual
This manual will guide you through using the Web interface
to perform a t-test on the aggregated
database. First, we want to define the data one can
retrieve from the aggregated database. In terms of
test performance, one can compare two different
parameters: (1) number of correct responses, which is
a measure of how accurately a subject remembered
the items, and (2) average response time, a measure of
how fast on average a subject responded to the correct
items. You can choose two different parameters on the
Web site and the Web site will give you the statistics
and calculate the t-test. Once you have the t-value,
you will obtain the probability value (p-value) from the
table provided at the end of this section.
Figure 4
Figure 5
The results of comparing all data from the
picture test (PMIT) versus all data from the word test
(WMIT) are shown in Figure 5 (note that the database
is constantly growing as data are collected, and the
numbers shown here are not current). The Web
interface will give you two graphs: on the left are the
statistics for the # of Correct Responses, and on the
right are the statistics for the Average Response
Time. The respective t-values are displayed in the
graph on top of the individual statistics. Remember,
these are NOT your p-values. You will need to follow
the instructions given later in order to determine your
p-values. However, the chart below the graphs does
give an indication of whether the results are significant.
The degrees of freedom (abbreviated here as dof)
have been calculated (the calculation will be explained
later on), and a range of t-values is shown. As you
can see, in order for the results to be significant at the
5% level (highlighted pink), the t-value must be at least
1.962. Unlike p-values, t-values are more significant as
they get larger. Keeping that in mind, do you think that
the results shown in Figure 5 are significant? Would you
retain or reject your null hypothesis?
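Reading a p-value from a table of critical t-values can be sketched in Python. The two rows below are copied from Table 3 (100 and 1000 degrees of freedom); the function name `p_value_bound` and the example t-values are hypothetical. The sketch reports only the smallest tabulated p-value whose critical t-value is reached, not an exact p-value.

```python
# Column headings of Table 3 (p-values) and two of its rows
# (degrees of freedom -> critical t-values).
P_LEVELS = [0.4, 0.2, 0.1, 0.08, 0.06, 0.05, 0.04, 0.02, 0.01, 0.001]
CRITICAL_T = {
    100: [0.845, 1.290, 1.660, 1.769, 1.902, 1.984, 2.081, 2.364, 2.626, 3.390],
    1000: [0.842, 1.282, 1.646, 1.752, 1.883, 1.962, 2.056, 2.330, 2.581, 3.300],
}


def p_value_bound(t, dof):
    """Return the smallest tabulated p-value whose critical t is <= |t|.

    Returns None when |t| is below every critical value in the row,
    meaning the p-value is larger than the largest tabulated level.
    """
    bound = None
    for p_level, critical in zip(P_LEVELS, CRITICAL_T[dof]):
        # Critical values grow as p shrinks, so the last match is the smallest p.
        if abs(t) >= critical:
            bound = p_level
    return bound


def is_significant(t, dof, level=0.05):
    """True when the tabulated bound on the p-value is at or below the level."""
    bound = p_value_bound(t, dof)
    return bound is not None and bound <= level


print(p_value_bound(2.0, 1000))   # 0.05, because 1.962 <= 2.0 < 2.056
print(is_significant(2.0, 1000))  # True
print(is_significant(1.5, 1000))  # False
```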
Figure 6
Figure 7
Figure 8
Table 1
Table 2
Short-Key Legend

Demographic List #    Short Label    Question
                      date
                      durtn
                      handset
                      start
                      wait
                      sex            Gender
                      age
                      race           Race
                      ethn           Ethnic group
                      ed             Education COMPLETED
                      country        Country of birth
                      lang_1         First language
                      lang_2
                      lang_n
                      lang_m
                      lang_f
                      local
                      localy
                      area
                      uspart
                      firstMIT       Is this your first time performing the URI-UCLA Memory Interference Test?
                      trial
                      lastMIT
                      ed_sped
                      loc
                      loc_dur
                      loc_inc
                      handness       Dominant hand
                      hand_hx
                      handuse
                      cafe_frq
                      cafe_vol
                      tea_frq
                      tea_vol
                      soda_frq
                      soda_vol
                      tobc_frq
                      tobc_vol
                      etoh_frq
                      etoh_vol
                      cafe_hrs
                      tea_hrs
                      soda_hrs
                      tobc_hrs
                      etoh_hrs
                      eat_hrs
                      wake_hrs
                      sleep_hrs
                      state
                      pain
                      mental
                      physcl
                      emotnl
                      love
                      spirit
Table 3

p-value in %          40%     20%     10%     8%      6%      5%      4%      2%      1%      0.1%
p-value               0.4     0.2     0.1     0.08    0.06    0.05    0.04    0.02    0.01    0.001
degrees of freedom
90                    0.846   1.291   1.662   1.771   1.905   1.987   2.084   2.368   2.632   3.402
100                   0.845   1.290   1.660   1.769   1.902   1.984   2.081   2.364   2.626   3.390
110                   0.845   1.290   1.659   1.767   1.900   1.982   2.078   2.361   2.621   3.380
120                   0.845   1.289   1.658   1.766   1.899   1.980   2.076   2.358   2.617   3.373
130                   0.845   1.289   1.657   1.764   1.897   1.978   2.074   2.355   2.614   3.367
140                   0.844   1.288   1.656   1.763   1.896   1.977   2.073   2.353   2.611   3.362
150                   0.844   1.288   1.655   1.763   1.895   1.976   2.072   2.351   2.608   3.357
160                   0.844   1.287   1.654   1.762   1.894   1.975   2.071   2.350   2.606   3.353
170                   0.844   1.287   1.653   1.762   1.893   1.974   2.070   2.348   2.604   3.349
180                   0.844   1.286   1.653   1.761   1.893   1.973   2.069   2.347   2.603   3.345
190                   0.843   1.286   1.653   1.761   1.892   1.973   2.068   2.346   2.602   3.342
200                   0.843   1.286   1.653   1.760   1.892   1.972   2.067   2.345   2.601   3.340
300                   0.842   1.285   1.650   1.757   1.889   1.969   2.064   2.339   2.592   3.325
400                   0.842   1.284   1.649   1.755   1.887   1.967   2.061   2.336   2.588   3.315
500                   0.842   1.283   1.648   1.754   1.885   1.965   2.059   2.334   2.586   3.310
1000                  0.842   1.282   1.646   1.752   1.883   1.962   2.056   2.330   2.581   3.300
∞                     0.842   1.282   1.645   1.751   1.881   1.960   2.054   2.326   2.576   3.291
Title
Short, concise, and relevant.
Introduction
First, briefly explain the rationale of the study. As
part of your background information, please find an
interesting original research paper about memory or
the variable you are looking at and properly cite it. As
UCLA students, you have campus-wide access to the
PubMed database (http://www.pubmedcentral.nih.gov/),
which makes this type of search fairly simple. For
more information on how to incorporate citations in
your report, see the scientific writing lecture posted on
the LS23L CCLE site. You should also attach a copy of
the abstract of the cited article to the hard copy of your
report only. Then state what your null hypothesis is for
Results
Describe the mean, median, and standard deviation
of each group. Attach the screen capture of your data
from the database and refer to the figure in your text.
Make sure to give your calculated p-values. Do not
include any discussion of the results in this section; just
report the data in paragraph form.
Discussion
Was the null hypothesis supported or refuted?
Therefore, was the alternative (experimental)
hypothesis refuted or supported? This conclusion
must be related to the statistical test. What real-world
conclusions can you draw from your results? If
applicable, discuss possible sources of error and what
you could do to strengthen your experiments. What
type of further research would be useful/interesting?