
CLRES 2020, Lab 3

Tuesday 2pm-5pm July 26 2004


GSCC 126
Instructors:
Joyce Chang, PhD
Maria Mor, PhD
Doris Rubio, PhD
Teaching Assistants:
Fiona Callaghan MS
Bill Clark
David Corcoran
Vinay Mehta

Goals for Lab 3


1. More One-sample t-tests

We are using the dataset brain.dta

Whenever you see a check-mark that means that you are required to perform
some action. Whenever some words are in this font it means that these
are commands that you should type in the command window of STATA. And
whenever you see a > it refers to going through a series of drop-down menus, as in
“All Programs>Mathematics>STATA”. There are generally two ways to do most
things in STATA: using commands that you type in the command window, or
using drop-down menus, as in SPSS. Whenever possible, we will give you both
ways of doing things in STATA, but you are only required to do it the way you
feel most comfortable. On the back of this handout is some space for you to
answer questions about the lab material.

The questions that you have to answer to get credit for this lab are enclosed in a
box like this.

You will answer these questions as you go through the lab and hand them in at
the end for credit, so remember to write your name on them! If you experience
trouble at any time, just raise your hand to let a TA or an instructor know that
you need help. Let’s get started!

Getting Started
First we will log on to the computer. To do this you will need your University of
Pittsburgh user id and your password.
 You should see a space on the screen to enter your user id. Type it in
and press return.
 Now enter your password and press return. You should now be logged on
to the computer.

We will open a folder in which to save our work, and then we will open STATA
and enter our data sets into STATA.
 Right-click somewhere on the desktop and select “New Directory”. Name
your folder “Lab3”. We will save all our work in this folder.
 Go to the web page: http://www.pitt.edu/~changj/CLRES2020/main.html
 Scroll down to find the data sets and right-click on “brain.dta” and select
“Save Link As…”.
 We want to save the file in “/scratch/username/Desktop/Lab3”. The
“username” is your University of Pittsburgh email id (the part of your
University of Pittsburgh email address that comes before the “@” e.g.
“fmc2” is the id from the email address fmc2@pitt.edu), so on my
computer I would save it in “/scratch/fmc2/Desktop/Lab3”. To do this,
double click on “Desktop” and then “Lab3” in the main window (you should
only have to do this once; the computer will remember where you are
saving your files later on). Click “Save”.
 Your data sets should now be in your “Lab3” folder on the Desktop. Open
up your “Lab3” folder to check that it is there, by double clicking on the
“Lab3” icon on your desktop. If things do not look right, contact a TA.
Now we will open STATA.
 To open STATA, click on the icon in the bottom left of your screen (this is
the “Start Applications” menu) and go up to “Mathematics” and then move
the mouse right onto “STATA” to highlight it. Click on STATA and it
should open.
 We wish to tell STATA to save anything we do from now on in our
“Lab3” folder. To do this, in the command window type the following,
replacing “username” with your own id and using straight double quotes:
cd "/scratch/username/Desktop/Lab3"
 Now open the log file. Type log using log3.log or you could go to
File>Log>Begin… . You will have to give the log file a name, so type in
“log3”. Next we have to make sure that STATA saves it as a “.log” file and
not a “.smcl” file; go to the drop down menu next to “Save as type: Stata
SMCL Document (*.smcl)” and select “Stata log (*.log)”. Then save in
your Lab3 folder (you may have to double click on Desktop to find the
Lab3 folder).
 Type use brain in the command window of STATA, and press return.
You can also enter your data using a drop down window. Go to
“File>Open…” and select the brain.dta data set and click “Open”. Your
data set should now be in STATA.
 You should see some words in the “Variables” window -- fsiq viq piq
weight height mri_count gender -- Click on the Data Editor button
(or type edit in the command window). You should see 7 columns of
numbers and some labels at the top of those columns. Click on the red
button with the white cross at the top right of the screen to get rid of the
Data Editor window. If your data does not look right, ask a TA for help.
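
The setup steps above can also be collected in a do-file so that they are easy to rerun. This is only a minimal sketch, assuming brain.dta was saved in the Lab3 folder as described; replace "username" with your own id. The text and replace options on log using, the clear option on use, and the describe command are additions that the handout itself does not use.

```stata
* Sketch of the setup steps as a do-file (replace "username" with your own id)
cd "/scratch/username/Desktop/Lab3"

* Open a plain-text log; -replace- overwrites an existing log3.log
log using log3.log, text replace

* Load the data; -clear- drops any data already in memory
use brain, clear

* List the variables and the number of observations (40 are expected)
describe
```

Running the lines one at a time in the command window has the same effect as running the do-file.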

About the Data


Datafile Name: Brain size

Datafile Subjects: Medical
Story Names: Brain Size and Intelligence
Reference: Willerman, L., Schultz, R., Rutledge, J. N., and Bigler, E. (1991), "In
Vivo Brain Size and Intelligence," Intelligence, 15, 223-228.
Authorization: Contact authors
Description: Willerman et al. (1991) collected a sample of 40 right-handed Anglo
introductory psychology students at a large southwestern university. Subjects
took four subtests (Vocabulary, Similarities, Block Design, and Picture
Completion) of the Wechsler (1981) Adult Intelligence Scale-Revised. The
researchers used Magnetic Resonance Imaging (MRI) to determine the brain
size of the subjects. Information about gender and body size (height and weight)
is also included. The researchers withheld the weights of two subjects and the
height of one subject for reasons of confidentiality.
Number of cases: 40
Variable Names:
1. Gender: Male (=1) or Female (=2)
2. FSIQ: Full Scale IQ scores based on the four Wechsler (1981) subtests
3. VIQ: Verbal IQ scores based on the four Wechsler (1981) subtests
4. PIQ: Performance IQ scores based on the four Wechsler (1981) subtests
5. Weight: body weight in pounds
6. Height: height in inches
7. MRI_Count: total pixel Count from the 18 MRI scans

For this lab we will only use the variables “fsiq”, “gender” and “mri_count”. Our
basic research questions for this lab are:

a) Is the mean brain size (mri_count) of these students significantly greater
than that of the general population? Is it significantly less than that of
the general population? Is it significantly different from that of the
general population?
b) Is the mean IQ (fsiq) of these students significantly greater than that of
the general population? Is it significantly less than that of the general
population? Is it significantly different from that of the general
population?

We will use a one-sample t-test to answer these questions. I will outline how to
do these tests for the brain size (mri_count) and the questions for this lab will ask
you to test the IQ.

Note that in “real life” you would only choose ONE of “greater”, “less” or “different
to” depending on your research question, but here you will calculate all three
tests as an exercise.

Summary Statistics

Before we get into analyzing the data, we should always do some summary
statistics and find out some basic facts about the data. The following commands
will help you answer the questions below.

 Type summarize and press enter

Question 1: What are the mean and standard deviation for “fsiq” and “mri_count”?
Question 2: Do you have any missing values? If so, which observations and
which variables?
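
One possible way to look for missing values is sketched below. It assumes that missing entries are stored as Stata's missing value (shown as a dot in the Data Editor) and uses the variable names listed in the Variables window; it is a starting point for Question 2, not the required method.

```stata
* Summary statistics for the two variables used in this lab
summarize fsiq mri_count

* Count observations with a missing value in each body-size variable
count if missing(weight)
count if missing(height)

* Show the rows that have a missing weight or height
list gender weight height if missing(weight) | missing(height)
```

The "Obs" column of the plain summarize output gives the same information in a different form: a variable with fewer than 40 observations has missing values.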

 Type graph box fsiq


 Type graph box fsiq, by(gender)
 Type graph box mri_count
 Type graph box mri_count, by(gender)

Question 3: Do you have any outliers, or other strange values for fsiq or
mri_count?

 Type histogram fsiq, normal


 Type histogram fsiq, normal by(gender)
 Type histogram mri_count, normal
 Type histogram mri_count, normal by(gender)

Question 4: Which variables are continuous and which are discrete (out of fsiq,
mri_count and gender)?
Question 5: For the continuous variables, are the data normally distributed? If
not, are they skewed left or right, or are they non-normal for some other reason?
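
Besides the histograms, two further checks may help with Question 5. They are optional additions to the lab, shown as a hedged sketch: the detail option of summarize reports skewness and kurtosis, and qnorm draws a normal quantile plot.

```stata
* Detailed summary: includes skewness and kurtosis along with percentiles
summarize fsiq mri_count, detail

* Normal quantile plots: points close to the reference line suggest
* that the variable is approximately normally distributed
qnorm fsiq
qnorm mri_count
```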

Whatever your answers to the above questions, we will proceed to do t-tests as if
the data are normally distributed and we found no problems. This is because we
are in a class and we need to practice t-tests as an exercise. In “real life” we
would not proceed (or we would consult a statistician about what to do!) if, for
example, the data were not normally distributed or we had too many missing
values or extreme values.

More One-sample t-tests

This is the decision rule that you must remember:

If our p-value is less than α, then we say that we “reject” the null hypothesis.
If our p-value is larger than (or equal to) the α level, then we “fail to reject” the
null hypothesis.
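
The decision rule can be written out in STATA as a small sketch. The values of alpha and p below are hypothetical, chosen only to illustrate the rule; the block is meant to be run from a do-file.

```stata
* Hypothetical values for illustration only; substitute your own
local alpha = 0.05
local p = 0.03

* Decision rule: reject Ho only when p is strictly less than alpha
if `p' < `alpha' {
    display "p is less than alpha: reject Ho"
}
else {
    display "p is larger than or equal to alpha: fail to reject Ho"
}
```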

In the following examples we will do one-sample t-tests comparing the brain size
to some benchmark figure. Suppose someone tells us that 890,000 pixels is
considered a “normal sized” brain. We wish to test whether our sample of students
has unusually large or small brains compared to the general population. There
are 3 different ways of formulating this question, which give the 3 different kinds
of Ha.

Example 1
Is the average brain size of the students greater than 890,000 pixels? (Use an α
= 0.10).

 From the question we know Ha: μ > 890,000. The null hypothesis could
be Ho: μ = 890,000 or Ho: μ ≤ 890,000, depending on what we, as
clinicians, think is more reasonable. The form of Ho does not alter our
calculations, just the interpretation. In general, choose the Ho that
seems the most conservative. Let’s say I don’t think that it is possible that
the average could be less than 890,000, so I choose Ho: μ = 890,000.
 We do not know the population standard deviation σ, so if we were
calculating the test by hand we would use the formula:

t = (Sample mean – μo )/(s/√n)

 We know: Sample mean = 908755, μo = 890000, s = 72282.05, n = 40.


We “convert” our sample mean into a “t-score” (kind of like we did with the
normal distribution): t = (908755-890000)/(72282.05/√40) = 18755 /
11428.8 = 1.641
 We need to find a p-value. Because of the way our Ha is worded, we
need to find P(t > 1.641) where the t distribution has n-1 = 40-1 = 39
degrees of freedom. We can do this by looking it up in a table or typing
 display ttail(39, 1.641)
 You should find that p = 0.0544.
 Or we could do the whole thing in STATA:
 ttest mri_count == 890000
You should get the following output:

. ttest mri_count == 890000

One-sample t test

------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
mri_co~t |      40      908755      11428.8    72282.05   885638.1    931871.9
------------------------------------------------------------------------------
Degrees of freedom: 39

                    Ho: mean(mri_count) = 890000

 Ha: mean < 890000         Ha: mean != 890000          Ha: mean > 890000
   t =   1.6410                t =   1.6410              t =   1.6410
 P < t =   0.9456          P > |t| =   0.1088          P > t =   0.0544

To summarize:

One-tailed test
Ho: μ = 890,000 , Ha: μ > 890,000
t = 1.641, α = 0.10
P(t > 1.641) = 0.0544 = p
p < 0.10 so we reject Ho.
Conclusion: There is evidence that the mean brain size of the students is larger
than that of the general population.
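
The hand calculation in Example 1 can be checked in STATA itself. The sketch below uses display as a calculator, then ttesti, the immediate form of ttest that takes summary figures instead of data, and finally the results that ttest stores in r(). Only the numbers quoted in the example are used.

```stata
* t statistic from the summary figures in Example 1
display (908755 - 890000) / (72282.05 / sqrt(40))

* Upper-tail p-value, P(t > 1.641), with 39 degrees of freedom
display ttail(39, 1.641)

* Immediate form: number of observations, mean, sd, hypothesized mean
ttesti 40 908755 72282.05 890000

* After ttest on the data, the statistic and p-value are stored in r()
ttest mri_count == 890000
display r(t)
display r(p_u)
```

Here r(p_u) is the upper one-sided p-value, the one that matches Ha: μ > 890,000.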

Example 2
Is the average brain size of the students less than 890,000 pixels? (Use an α =
0.10).

 From the question we know Ha: μ < 890,000. The null hypothesis could
be Ho: μ = 890,000 or Ho: μ ≥ 890,000, depending on what we, as
clinicians, think is more reasonable. The form of Ho does not alter our
calculations, just the interpretation. In general, choose the Ho that
seems the most conservative. Let’s say I don’t think that it is possible that
the average could be more than 890,000, so I choose Ho: μ = 890,000.
 We do not know the population standard deviation σ, so if we were
calculating the test by hand we would use the formula:

t = (Sample mean – μo )/(s/√n)

 We know: Sample mean = 908755, μo = 890000, s = 72282.05, n = 40.


We “convert” our sample mean into a “t-score” (kind of like we did with the
normal distribution): t = (908755-890000)/(72282.05/√40) = 18755 /
11428.8 = 1.641
 We need to find a p-value. Because of the way our Ha is worded, we
need to find P(t < 1.641) where the t distribution has n-1 = 40-1 = 39
degrees of freedom. We can do this by looking it up in a table or typing
 display 1-ttail(39, 1.641)
 You should find that p = 1-0.0544 = 0.9456.

 Or we could do the whole thing in STATA:
 ttest mri_count == 890000
You should get the following output:

. ttest mri_count == 890000

One-sample t test

------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
mri_co~t |      40      908755      11428.8    72282.05   885638.1    931871.9
------------------------------------------------------------------------------
Degrees of freedom: 39

                    Ho: mean(mri_count) = 890000

 Ha: mean < 890000         Ha: mean != 890000          Ha: mean > 890000
   t =   1.6410                t =   1.6410              t =   1.6410
 P < t =   0.9456          P > |t| =   0.1088          P > t =   0.0544

To summarize:

One-tailed test
Ho: μ = 890,000 , Ha: μ < 890,000
t = 1.641, α = 0.10
P(t < 1.641) = 0.9456 = p
p > 0.10 so we fail to reject Ho.
Conclusion: We keep our assumption that the mean brain size is the same as that
of the general population.
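
The lower-tail p-value in Example 2 can also be read from the results stored by ttest, as in this short sketch. Because the sample mean (908755) lies above 890,000, the t statistic is positive, so the lower-tail p-value must be larger than 0.5.

```stata
ttest mri_count == 890000

* Lower-tail p-value, P(t < 1.641)
display r(p_l)

* The two one-sided p-values always add up to 1
display r(p_l) + r(p_u)
```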

Example 3
Is the average brain size of the students different to 890,000 pixels? (Use an α
= 0.10).

 From the question we know Ha: μ ≠ 890,000 (another way of saying this
is Ha: μ < 890,000 or μ > 890,000) . The null hypothesis is Ho: μ =
890,000 (there is no choice for this one).

 We do not know the population standard deviation σ, so if we were
calculating the test by hand we would use the formula:

t = (Sample mean – μo )/(s/√n)

 We know: Sample mean = 908755, μo = 890000, s = 72282.05, n = 40.


We “convert” our sample mean into a “t-score” (kind of like we did with the
normal distribution): t = (908755-890000)/(72282.05/√40) = 18755 /
11428.8 = 1.641
 We need to find a p-value. Because of the way our Ha is worded, we
need to find P(t < -1.641 or t > 1.641) where the t distribution has n-1 =
40-1 = 39 degrees of freedom. We can do this by looking it up in a table
or typing
 display 2*ttail(39, 1.641)
 You should find that p = 2×0.0544 = 0.1088.
 Or we could do the whole thing in STATA:
 ttest mri_count == 890000
You should get the following output:

. ttest mri_count == 890000

One-sample t test

------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
mri_co~t |      40      908755      11428.8    72282.05   885638.1    931871.9
------------------------------------------------------------------------------
Degrees of freedom: 39

                    Ho: mean(mri_count) = 890000

 Ha: mean < 890000         Ha: mean != 890000          Ha: mean > 890000
   t =   1.6410                t =   1.6410              t =   1.6410
 P < t =   0.9456          P > |t| =   0.1088          P > t =   0.0544

To summarize:
Two-tailed test
Ho: μ = 890,000 , Ha: μ ≠ 890,000
t = 1.641, α = 0.10
P(t < -1.641 or t > 1.641) = 0.1088 = p
p > 0.10 so we fail to reject Ho.
Conclusion: We keep our assumption that the mean brain size is the same as for
the general population.
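
A two-tailed test at α = 0.10 gives the same decision as checking whether 890,000 falls inside a 90% confidence interval for the mean: we fail to reject Ho exactly when the hypothesized value lies inside the interval. The sketch below shows both routes, using only the figures from the examples; the level(90) option is an addition that the handout does not otherwise use.

```stata
* Two-sided p-value from the t statistic
display 2 * ttail(39, abs(1.641))

* Same test, but reporting a 90% confidence interval instead of 95%
ttest mri_count == 890000, level(90)

* The same interval by hand: mean +/- critical value * standard error
display 908755 - invttail(39, 0.05) * 11428.8
display 908755 + invttail(39, 0.05) * 11428.8
```

Here invttail(39, 0.05) is the t value that leaves 5% in the upper tail, so that 10% in total lies outside the interval.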

Now you will do a similar set of 3 tests for IQ (fsiq). Using one-sample t-tests
with an α = 0.05, answer the following research questions with STATA:

Question 6: Is the mean IQ (using “fsiq”) significantly greater than 100? State
your Ho, Ha, μo, t-value, p-value, α, whether you reject or fail to reject Ho, and
your research conclusion. Choose an Ho that makes sense to you.
Question 7: Is the mean IQ significantly less than 100? State your Ho, Ha, μo, t-
value, p-value, α, whether you reject or fail to reject Ho, and your research
conclusion. Choose an Ho that makes sense to you.
Question 8: Is the mean IQ significantly different to 100? State your Ho, Ha, μo,
t-value, p-value, α, whether you reject or fail to reject Ho, and your research
conclusion.

The End.

Saving the Lab
At the end of the session, use the following procedure so that you can save
any files you may want to review later on (e.g. your log file). These are the
instructions if you are saving your files onto a floppy disk. If you have a zip disk,
just do the same steps but with the "Zip" folder on the Desktop rather than the
"Floppy" folder.
 Type log close and your log file is automatically saved and closed. You
can also go to File>Log>Close.
 Insert floppy disk (or zip disk).
 Right click on the "Floppy" icon on the Desktop and select "Mount". We
can now save files onto this disk. If you do not “Mount” the disk, then your
files may not save properly.
 Close your "Lab3" folder if it is open. Click on the "Lab3" icon on the
Desktop and drag the whole folder to the floppy disk icon on your Desktop.
You should get a small menu giving you a choice to "Move" or "Copy" the
documents. Click on "Copy". Your files should now be on your floppy
disk.
 Double click on the floppy disk icon to check that there is now a "Lab3"
folder on your floppy disk.
 Now close the floppy disk window, and right click on the floppy disk icon
and select "Unmount". You must do this in order to take your disk out of
these machines and still have your files saved.
 Now press the button on your computer to eject the floppy disk.
It is very important to save a backup on the university computer in case
something happens to the disk.
 Click on the “Lab3” folder icon and drag the whole folder to the “AFS”
folder on your desktop. You should get a small menu giving you a choice
to "Move" or "Copy" the documents. Click on "Copy". Your files are now
stored on the University of Pittsburgh computer system and can be
accessed from any computer with an internet connection. See the
instructions below on how to access these documents from your home
computer.
 You have finished -- see you for the next lab!

Accessing the files from home from the University of Pittsburgh
computer system
Here are some instructions FYI to help you access your backup copy in case
there is some problem with your floppy disk or zip disk after you leave the lab.
To access your backup copies from your home or office computer do the following
steps:
 Open Netscape Navigator or Internet Explorer. Type
ftp://username@unixs.cis.pitt.edu and go to this destination.
(eg. Using my username, I would type
ftp://fmc2@unixs.cis.pitt.edu ).
 After a few seconds, your browser will ask you for your username and
password. Enter these and press return.
 After the screen has loaded, you should see a list of files and one of them
should be your “Lab3” folder. Just click and drag that folder to wherever
you want to put it on your home computer. Close your browser.

Answer Sheet – Lab3 CLRES 2020 Summer 04.

NAME and DATE:

Question 1:
            fsiq            mri_count
mean

sd

Question 2:

Question 3:

Question 4:
Variable Name Discrete or Continuous?

Question 5:
Variable Normal? (Yes/No) Skewed Left/Right/Other Reason?

Question 6:
Ho

Ha

μo =

t=

p=

α=

Reject Ho?

Conclusion

Question 7:
Ho

Ha

μo =

t=

p=

α=

Reject Ho?

Conclusion

Question 8:
Ho

Ha

μo =

t=

p=

α=
Reject Ho?

Conclusion
