
Lecture 2

Research methods Part 2

Acknowledgement: lecture based on the lecture developed by Molin (Delft University of Technology)
Phases in research process
This lecture

1. problem formulation
2. conceptual model
3. research design (survey, experiment)
4. measurement instrument
5. data collection
6. data analysis
Causality

Many of the pitfalls in data analysis have to do with the ability to interpret relationships in terms of causality
What is causality?
• ‘X causes Y’
• if we change X, Y will change too
• X = cause, independent variable, predictor
• Y = effect, dependent, explained or predicted variable

• why important?
• understand → scientific
• predict consequences of policy measures
What is cause, what effect?
• education - income

• illness – work pressure

• gender – means of transport

• aggressive people – watching violent movies

condition for causality: X precedes Y in time


Spurious relationships
• a correlation exists between X and Y
• this disappears after controlling for Z (3rd variable)
(diagram: Z → X and Z → Y)
• X causes Y?
• ice cream sales → drownings in city swimming pools?
• coffee consumption → risk of heart disease?

Condition for causality: no third variable exists that can explain away the correlation
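The idea of "controlling for Z" can be illustrated with a small simulation. The sketch below is only an illustration: the data are simulated, and the choice of temperature as the third variable behind the ice cream example is an assumption, not something stated on the slide. It computes the plain correlation between X and Y and the partial correlation after controlling for Z.

```python
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def partial_corr(x, y, z):
    """Correlation between x and y after controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000

# Simulated data (assumption): Z drives both X and Y; X has no effect on Y.
z = [rng.gauss(0, 1) for _ in range(n)]   # Z, e.g. temperature
x = [zi + rng.gauss(0, 1) for zi in z]    # X, e.g. ice cream sales
y = [zi + rng.gauss(0, 1) for zi in z]    # Y, e.g. drownings

r_raw = pearson(x, y)
r_partial = partial_corr(x, y, z)
print(f"correlation X-Y:                   {r_raw:.2f}")
print(f"correlation X-Y controlling for Z: {r_partial:.2f}")
```

In this setup the plain correlation is clearly positive, while the partial correlation is close to zero: the relationship between X and Y is spurious.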
Conditions for causality

1. statistical correlation
• can be measured

2. cause before effect


• time dependence can be a problem

3. no spurious relationships
• measure all potential alternative causes and test if correlation still exists

4. theory: causal mechanism


• how does the cause generate the effect?
3. Research design

Apply the best strategy to answer the research questions, while minimizing the probability of alternative explanations
Two important ‘ideal’ types
• experiment
• typically: one group gets ‘treatment’, another group does not
• much control by researcher: best design for ruling out alternative explanations; statistical analysis is often simple
• used for measuring effects of treatments and evaluative research (testing a hypothesis)

• survey
• typically: large group of respondents completes a questionnaire
• less control by researcher: ruling out alternative explanations requires advanced statistics
• often used for explorative and descriptive research
Examples of an experiment
• new medicine
• experimental group: new medicine
• control group: placebo

• living-lab experiment – indoor climate
• measuring effect of sound mask on degree of comfort of employees
Experimental design

Randomization

The design allows finding the effect of X, taking into account autonomous change
effect due to cause (experimental group) = Y2 – Y1
autonomous change (control group) = Y4 – Y3

net effect = (Y2 – Y1) – (Y4 – Y3)
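The net-effect formula can be written out as a small calculation. This is a minimal sketch: it assumes that Y1 and Y2 are the before and after measurements in the experimental group and Y3 and Y4 those in the control group (the design figure is not reproduced here), and the scores are hypothetical.

```python
def net_effect(y1, y2, y3, y4):
    """Net effect of the treatment, corrected for autonomous change.

    y1, y2: before / after measurement in the experimental group (assumed)
    y3, y4: before / after measurement in the control group (assumed)
    """
    change_experimental = y2 - y1   # effect due to cause + autonomous change
    autonomous_change = y4 - y3     # change that happens without treatment
    return change_experimental - autonomous_change

# Hypothetical mean comfort scores on a 0-100 scale.
effect = net_effect(y1=60, y2=75, y3=61, y4=66)
print(effect)  # 15 - 5 = 10
```

Without the control group, the full change of 15 points would have been attributed to the treatment.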


Experiment: main characteristics
• control on independent variable
• researcher decides which case gets treatment
• time order control

• randomization
• cases are randomly assigned to control group and experimental group

• before and after measurement
→ how much change?

• control group
• group that does not get treatment
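Random assignment can be sketched in a few lines. The function name, the participant labels and the equal split into two groups are illustrative assumptions, not part of the lecture.

```python
import random

def randomize(cases, seed=None):
    """Randomly split cases into an experimental and a control group."""
    rng = random.Random(seed)
    shuffled = list(cases)      # copy, so the original order is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participants P01 ... P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
experimental, control = randomize(participants, seed=42)
print("experimental:", sorted(experimental))
print("control:     ", sorted(control))
```

Because chance alone decides who gets the treatment, other characteristics of the cases are, on average, spread equally over both groups.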
Characteristics of survey
• systematic interviewing or observation
• all cases get same questions and same response possibilities

• measurement at one moment in time


• no before – after measurement
• problems with time control

• large numbers
• many cases → reliable measurements
• many variables → testing for possible alternative causes (spurious relationships)
When to apply the survey?

• predictors that cannot be controlled


• age, gender, etc.

• non-observable variables
• motives, attitudes, opinions, preferences, perceptions,
wishes, needs, plans
• reasons for behavior
• behavior in past
But, problems with identification of causality
• no time-control
• due to one measurement moment & no control on predictor
• does cause come before effect?
→ theory necessary

• spurious correlations
• in an experiment potential alternative explanations are ruled out
• this does not apply to a survey
→ explicitly test for spurious correlations
Examples – correlations we often find in studies

distance to train station: positive correlation (+) with market value of the dwelling

Do people want to live far away from a train station?

green neighborhood: negative correlation (-) with use of active transport mode

Do people dislike walking or cycling in an environment with much greenery?
Validity
Do we measure what we intend to measure?

• internal validity
• are causal interpretations in research valid?
• are all alternative explanations ruled out?

• external validity
• are results generalisable to different places, times, groups and circumstances?
Comparison validity
• experiment
  – internal validity: low / high?
  – external validity: low / high?

• survey
  – internal validity: low / high?
  – external validity: low / high?


Comparison validity
• experiment
  – high internal validity
    • causality no problem
    • due to control on time order & predictor
  – low external validity
    • often artificial environment
    • often few cases & specific groups

• survey
  – low internal validity
    • causality often problem
    • no control on time order, predictor variable & spurious effects
  – high external validity
    • measurement in many places & different groups
    • often many cases
Levels of measurement
Level of measurement
• measuring
• assigning numbers to empirical phenomena
• numbers are easier to deal with in analysis than text

• level of measurement
• how to interpret the numbers
• determines which analysis techniques are allowed

• very, very important!


Nominal level: just distinction

• 1 ≠ 2 ≠ 3:
• numbers just indicate the different categories
• so, there is no ordering
• e.g.: color, gender, means of transport

• numbering is fully arbitrary
• means of transport: 1 = car, 2 = bike, 3 = train is equivalent to 2 = car, 3 = bike, 1 = train

• dichotomous
• variable of nominal level with only 2 categories
• gender: 1=female, 2=male
Ordinal level: rank order

• 1 < 2 < 3 or 1 > 2 > 3


• there is an order between categories
• no equal differences between consecutive categories
• example: level of education

• numbering is arbitrary as long as order between categories stays the same
• educational level:
1 = primary school, 2 = secondary school, 3 = bachelor, 4 = master
OR
2 = primary school, 5 = secondary school, 6 = bachelor, 7 = master
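The consequence of this arbitrariness can be checked with the two codings above. The sketch below uses five hypothetical respondents and shows that the median category is the same under both codings, while the mean depends on the numbers chosen.

```python
from statistics import mean, median

# The two order-preserving codings from the example above.
coding_a = {"primary school": 1, "secondary school": 2, "bachelor": 3, "master": 4}
coding_b = {"primary school": 2, "secondary school": 5, "bachelor": 6, "master": 7}

# Hypothetical respondents (an odd number, so the median is an observed value).
respondents = ["primary school", "secondary school", "secondary school",
               "bachelor", "master"]

def summarize(coding):
    """Return (median category, mean score) under the given coding."""
    scores = [coding[r] for r in respondents]
    label_of = {number: label for label, number in coding.items()}
    return label_of[median(scores)], mean(scores)

median_a, mean_a = summarize(coding_a)
median_b, mean_b = summarize(coding_b)
print(median_a, mean_a)
print(median_b, mean_b)
```

Under the first coding the mean lies between "secondary school" and "bachelor"; under the second it coincides with "secondary school". The median category does not change, which is why order-based statistics suit ordinal data and means do not.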
Interval level: equal differences

• 2 - 1 = 4 - 3; but 2 ≠ 2 x 1
• order with equal differences between categories
• no absolute zero value
• e.g.: preferences on rating scale; intelligence; °F; °C

• example: temperature in Celsius or Fahrenheit
• right: difference 20° - 10° = 2 x (15° - 10°)
• wrong: if temperature decreases from 20 °C to 10 °C, it becomes twice as cold (NOT TRUE)
Ratio level: equal proportions

• 2 = 2 x 1
• order & equal intervals & absolute zero value
• weight, distance, age, temperature in Kelvin

• example equal proportions
• 20 kilometers is twice as far as 10 kilometers
• a weight of 40 kilos is twice as heavy as 20 kilos
Interval versus ratio
• ratio
• absolute scale
• 2x higher score, indicates 2x more of the characteristic

• interval
• relative scale
• 2x higher score does not represent 2x more of the characteristic
• e.g. the ratio between two temperatures is different in Celsius and Fahrenheit
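The Celsius and Fahrenheit example can be verified numerically. The sketch below uses the standard conversion formulas; the temperatures 20, 15 and 10 °C are taken from the interval-level example earlier in the lecture.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

def c_to_k(celsius):
    """Convert degrees Celsius to Kelvin (a ratio scale)."""
    return celsius + 273.15

warm, mid, cold = 20.0, 15.0, 10.0   # temperatures in degrees Celsius

# Ratios of scores: these depend on the (arbitrary) zero point of the scale.
ratio_c = warm / cold
ratio_f = c_to_f(warm) / c_to_f(cold)
ratio_k = c_to_k(warm) / c_to_k(cold)

# Ratios of differences: these are the same on both interval scales.
diff_ratio_c = (warm - cold) / (mid - cold)
diff_ratio_f = (c_to_f(warm) - c_to_f(cold)) / (c_to_f(mid) - c_to_f(cold))

print(f"score ratio:      C {ratio_c:.2f}, F {ratio_f:.2f}, K {ratio_k:.3f}")
print(f"difference ratio: C {diff_ratio_c:.2f}, F {diff_ratio_f:.2f}")
```

The score ratio is 2 in Celsius but 1.36 in Fahrenheit, so "twice as warm" has no meaning on an interval scale. Only the Kelvin ratio, about 1.035, describes a real proportion, and the ratio of differences is 2 on both interval scales.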
Overview levels of measurement
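(overview figure: levels ordered from lowest to highest; the lower levels are non-metric, discrete, ‘qualitative’; the higher levels are metric, continuous, ‘quantitative’)

higher level: more analysis techniques possible & more power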


Which level of measurement?
– weight
– military rank
– years of education
– temperature in Kelvin
– political parties
– salary
– opinion on 0-10 point rating scale
– household type
– choice (yes, no)
– preference order of policy measures
Is this correct?
• the satisfaction with a particular service is measured on a 10-point rating scale

• John scores 8 and Peter scores 4 on this scale

• John is twice as satisfied with the service as Peter?
