
1. Survival analysis, Kaplan-Meier survival curves

Database: SPSS_Rat_Tumor(King, 1979)_survival.sav

The database contains the results of a real experiment. Rats treated with a chemical carcinogen
received different diets, and their tumour-free survival time (TumorFreeTime) was recorded. The
following three groups were studied: Group 1: low-fat diet; Group 2: high-fat diet (containing
saturated fats); Group 3: high-fat diet (containing unsaturated fats). The group codes are in the
FatLevel variable. The „Censored” variable contains 0 if the animal developed a tumour and 1 in all other
cases.

The data of Group 3 are not needed for the comparison. Recode the FatLevel variable into a new variable.
The new variable should keep Groups 1 and 2, but Group 3 has to be recoded to „system
missing”.
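The recoding rule above can be sketched outside SPSS as follows. This is only an illustration in Python: the function name `recode` and the six sample codes are hypothetical, and `None` stands in for the SPSS system-missing value.

```python
# Hypothetical FatLevel codes for six animals (not the real data).
fat_level = [1, 2, 3, 1, 3, 2]

def recode(values, mapping):
    """Map each value through `mapping`; unmapped values become None (missing)."""
    return [mapping.get(v) for v in values]

# Keep Groups 1 and 2, turn Group 3 into a missing value.
fat_level_12 = recode(fat_level, {1: 1, 2: 2})
print(fat_level_12)  # [1, 2, None, 1, None, 2]
```

The same helper covers the dichotomisations in the later exercises, for example `recode(values, {1: 0, 2: 0, 3: 0, 5: 1, 6: 1, 7: 1})`.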

Compare the tumour-free survival of Groups 1 and 2 with an appropriate statistical test and
illustrate the survival of the two groups with Kaplan-Meier survival curves!

Group 1. median survival: ………………… 191,1    average survival: ………………… 153,3

Group 2. median survival: ………………… 107    average survival: ………………… 121,06

The result of the applied statistical test: p: ………………… 0,029

Is the difference statistically significant? ………………… yes
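To make the quantities asked for above more concrete, here is a minimal Python sketch of the Kaplan-Meier estimate and of the log-rank test, which is one common choice for comparing two survival curves. The sample times are hypothetical, not the rat data. The coding follows the worksheet (0 = tumour developed, 1 = censored), and the median is taken here as the first time at which the estimated survival drops to 0.5 or below, which is an assumption about the convention used.

```python
import math

def kaplan_meier(times, censored):
    """Return [(t, S(t))] at each event time; censored: 0 = event, 1 = censored."""
    curve, surv = [], 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        events = sum(1 for x, c in zip(times, censored) if x == t and c == 0)
        if events:
            surv *= 1 - events / at_risk   # product-limit step
            curve.append((t, surv))
    return curve

def median_survival(curve):
    """First event time with S(t) <= 0.5, or None if the curve stays above 0.5."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

def logrank(times1, cens1, times2, cens2):
    """Two-group log-rank test; returns (chi-square, p-value with 1 df)."""
    all_times, all_cens = times1 + times2, cens1 + cens2
    event_times = sorted({t for t, c in zip(all_times, all_cens) if c == 0})
    o_minus_e, var = 0.0, 0.0
    for t in event_times:
        n1 = sum(1 for x in times1 if x >= t)          # at risk in group 1
        n2 = sum(1 for x in times2 if x >= t)          # at risk in group 2
        d1 = sum(1 for x, c in zip(times1, cens1) if x == t and c == 0)
        d2 = sum(1 for x, c in zip(times2, cens2) if x == t and c == 0)
        n, d = n1 + n2, d1 + d2
        o_minus_e += d1 - d * n1 / n                   # observed minus expected
        if n > 1:
            var += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    chi2 = o_minus_e ** 2 / var
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical tumour-free times (days) for two small groups.
low_fat_t, low_fat_c = [5, 8, 12, 12, 20], [0, 0, 1, 0, 1]
high_fat_t, high_fat_c = [3, 4, 6, 9, 11], [0, 0, 0, 1, 0]

print(median_survival(kaplan_meier(low_fat_t, low_fat_c)))
print(logrank(low_fat_t, low_fat_c, high_fat_t, high_fat_c))
```

A p-value below 0.05 from the log-rank test would be read as a statistically significant difference between the two curves.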


2. Dichotomization, 2x2 table, Odds ratio

Database: Merged_cut.sav

How did political preference of the participants influence their final vote in the presidential election?

Output variable: „VOTE OBAMA OR MCCAIN”. Codes: 1 – Obama, 2 – McCain.

We are interested in the Obama-McCain comparison, so recode the output variable into a new variable:
1 → 1 (Obama), 2 → 2 (McCain), all other values: „System missing value”.

Independent variable (exposure): „THINK OF SELF AS LIBERAL OR CONSERVATIVE”.

Dichotomize the independent variable into a new variable: 1-2-3 → 0 (liberal), 5-6-7 → 1 (conservative), all
other values: „System missing value”.

(You can find both the original and the dichotomised variables in the database, so dichotomisation
can be skipped.)

Create a 2x2 table and answer the following questions:

How many liberals voted for Obama? ……………………………897

How many conservative voters were questioned in total? ……………………………1172

Odds ratio: …………………………… 28,132

95% confidence interval: …………………………… 21,777 – 36,341

The proportion of Obama voters was higher among: ……………………………liberals

Is the correlation statistically significant? ……………………………yes
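For orientation, the odds ratio of a 2x2 table and its 95% confidence interval can be sketched as below. This assumes the usual logarithmic (Woolf) method for the interval. The cell labels a, b, c, d and the counts are hypothetical, because the full table is not given in this worksheet.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for the 2x2 table

                 outcome 1   outcome 2
        row 1        a           b
        row 2        c           d
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of ln(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts: rows = liberal / conservative, columns = Obama / McCain.
print(odds_ratio_ci(90, 10, 30, 70))
```

Because the interval is symmetric on the log scale, the odds ratio is the geometric mean of its two limits. If the interval does not contain 1, the association is significant at the 5% level.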


3. Logistic regression

Database: Merged_cut.sav

This database contains data from a survey conducted in the USA. The following questions are based on
this database.

How do the age of the participants and the amount of time they spend watching TV a day influence
their health status?

Dependent variable: „health” (CONDITION OF HEALTH)

Recode the variable into a new variable: 1-2 → 0 (Good), 3-4 → 1 (Bad), all other values: „System
missing value”.

(You can find both the original and the dichotomised variables in the database, so dichotomisation
can be skipped.)

Independent variables: „age” (AGE OF RESPONDENT), „tvhours” (HOURS PER DAY WATCHING TV).

Use multivariate logistic regression to determine whether the studied variables affect the health
status of the participants!

„AGE OF RESPONDENT”:

Odds ratio: …………………………… 1,014

95% confidence interval: ……………………. 1,007 – 1,021

„HOURS PER DAY WATCHING TV”:

Odds ratio: ……………………………1,154

95% confidence interval: ……………………. 1,105 – 1,205

Which correlation is statistically significant? ………………………………………………. Both are significant.
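In logistic regression the odds ratio of a predictor is exp(B), and its 95% confidence interval is exp(B ± 1.96·SE). The sketch below works backwards from the tvhours result reported above to an approximate coefficient and standard error, and shows how the odds ratio scales over several units. The function names are hypothetical, and the recovered B and SE are only illustrative reconstructions, not SPSS output.

```python
import math

def coef_from_or(odds_ratio, lower, upper, z=1.96):
    """Recover the coefficient B and its standard error from an OR and its CI."""
    b = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return b, se

def or_for_change(or_per_unit, units):
    """Odds ratio for a change of `units` in the predictor."""
    return or_per_unit ** units

# tvhours result from the worksheet: OR 1.154, 95% CI 1.105 to 1.205.
b, se = coef_from_or(1.154, 1.105, 1.205)
wald_z = b / se                                   # Wald statistic
p_value = math.erfc(abs(wald_z) / math.sqrt(2))   # two-sided p-value
print(round(b, 3), round(se, 3), p_value < 0.05)
print(or_for_change(1.154, 3))                    # OR for three extra hours per day
```

The odds ratio applies per one-unit increase, so an interval lying entirely above 1 means that each additional unit is associated with higher odds of bad health.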


4. 2x2 table, Odds ratio

Database: Merged_cut.sav

This database contains data from a survey conducted in the USA. The following questions are based on
this database.

How does the marital status influence the happiness of participants?

Independent variable (exposure): “marital_status” (MARITAL STATUS)

Recode the “marital_status” variable into a new variable: 1 → 0 (married), 2,3,4,5 → 1 (not married), all other
values: “System missing value”.

Output variable: “general_happyness” (GENERAL HAPPINESS)

Recode the output variable into a new variable: 1,2 → 0 (Happy), 3 → 1 (Unhappy), all other values:
“System missing value”.

(You can find both the original and the dichotomised variables in the database, so dichotomisation
can be skipped.)

Create a 2x2 table using the recoded variables and answer the following questions:

How many happy married people participated in this study? 2094

How many people considered themselves unhappy? 746

Odds ratio: 2,742

95% confidence interval: 2,308-3,257

Is it true that married people are happier than unmarried people? Yes

Is this correlation statistically significant? Yes
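Statistical significance for a 2x2 table is usually judged with a chi-square test. The following is a minimal sketch of the Pearson chi-square test without continuity correction. The counts are hypothetical, and other variants of the test, for example with continuity correction, may give slightly different values.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with 1 df
    return chi2, p_value

# Hypothetical counts: rows = married / not married, columns = happy / unhappy.
chi2, p = chi_square_2x2(50, 10, 30, 30)
print(chi2, p)
```

A p-value below 0.05 agrees with a 95% confidence interval of the odds ratio that does not contain 1.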


5. Stratified analysis

Database: Merged_cut.sav

This database contains data from a survey conducted in the USA. The following questions are based on
this database.

How can the marital status influence the happiness of participants (stratified analysis)?

Independent variable (exposure): “marital_status” (MARITAL STATUS)

Recode the “marital_status” variable into a new variable: 1 → 0 (married), 2,3,4,5 → 1 (not married),
all other values: “System missing value”.

Output variable: “general_happiness” (GENERAL HAPPINESS)

Recode the output variable into a new variable: 1,2 → 0 (Happy), 3 → 1 (Unhappy), all other values:
“System missing value”.

Analyse the effect of a third factor on the correlation described above. This new variable,
“income_differences” (SHOULD GOVT REDUCE INCOME DIFFERENCES), shows how important the participants
consider the reduction of income differences. Answers are coded on an ordinal scale (1-7, based on
the importance of reducing the income differences).

Dichotomize “income_differences” into a new variable: 1-3 → 0 (important), 4-7 → 1 (not important).

(You can find both the original and the dichotomised variables in the database, so dichotomisation
can be skipped.)

Perform the stratified analysis of the marriage-happiness correlation where the layers are defined by
the dichotomised “income_differences” variable!

Layer specific results:

Layer 1 (the reduction of income differences is important):


Odds ratio: 3,575
95% confidence interval: 2,562-4,988
How many married people were in this layer? 527

Layer 2 (the reduction of income differences is not important):


Odds ratio: 2,892
95% confidence interval: 2,176-3,843
How many happy non-married people were in this layer? 751

Crude odds ratio: 3,279
95% confidence interval: 2,646-4,064

Mantel-Haenszel OR: 3,178
95% confidence interval: 2,56-3,95

The effect of the 3rd factor on the marriage-happiness correlation is: confounder / effect modifier /
both / none: effect modifier
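The crude and the Mantel-Haenszel odds ratios can be compared with a short sketch like the one below. The layer tables are hypothetical, and each tuple (a, b, c, d) stands for one 2x2 table. The last line applies the relative-change comparison to the crude and adjusted values reported above. Treating a change of about 10% or more as a sign of confounding is only a common rule of thumb, assumed here for illustration.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over a list of (a, b, c, d) tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

def crude_or(strata):
    """Odds ratio of the single table obtained by adding the layers together."""
    a, b, c, d = (sum(cell) for cell in zip(*strata))
    return (a * d) / (b * c)

def relative_change(crude, adjusted):
    """Relative difference between the crude and the adjusted odds ratio."""
    return abs(crude - adjusted) / adjusted

# Two hypothetical layers, each with a layer-specific odds ratio of 6.
layers = [(40, 10, 20, 30), (30, 20, 10, 40)]
print(mantel_haenszel_or(layers), crude_or(layers))

# Values reported in the worksheet: crude 3.279, Mantel-Haenszel 3.178.
print(round(relative_change(3.279, 3.178), 3))  # 0.032
```

A small relative change argues against confounding, while clearly different layer-specific odds ratios point towards effect modification.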
