# Empirical Research Methods Poster

Define research question
Level of control
Can be pretty sure your conclusions are valid ... But are you getting at the correct issues?

How facts become theories
Event
Relationships between constructs are identified
Confidence in theory is increased
Laws are formed
Prediction is confirmed
Theory is rejected

Common research process
Purpose
Empirical research is a research approach in which empirical observations (data) are collected to answer a research question. The goal of the theory has to be defined here. Based on the research question, an empirical strategy has to be chosen. A literature review is used to find out what is already known about a question before trying to answer it. A good literature review is an important part of any research. The goal of a literature review is to demonstrate familiarity with a body of knowledge and to establish credibility. Additionally, it shows the path of prior research and how the current research is related to it.

Select research method
Experiment
Establishes causal relationships, confirms theories.

Analyze quantitative data
Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data. Statistical methods can be used to summarize or describe a collection of data; this is called descriptive statistics. In addition, patterns in the data may be modeled in a way that accounts for randomness and uncertainty in the observations, and then used to draw inferences about the process or population being studied; this is called inferential statistics. Both descriptive and inferential statistics comprise applied statistics.

Can be pretty sure you are getting at the correct issues … But can you draw valid conclusions?

Event
Relationships between constructs are identified
Event
Laws are formed

Level of access

Prediction is NOT confirmed
Confidence in theory is reduced

Theory is modified

Review Literature

Types of research questions are: What?, Where?, Who?, When?, How Much?, How many?, Why?, How?

When appropriate

Explanatory research: Why, How
Descriptive research: How many, How much, When, Who, Where
Exploratory research: What

Start empirical research

Research is a systematic process for answering questions to solve problems and create new knowledge. A research question (RQ) is what you are trying to find out by undertaking the research process. A clear and precise RQ guides theory development, research design, data collection and data analysis.

Case study
Investigate a typical »case« in realistic representative conditions.

Survey
Investigate information collected from a group of people, projects, organizations or literature.

Event Theory is formed that explains laws

Predictions from theory can be drawn, which form hypotheses

Research is performed
Theory is NOT modified

Define research question

Control

Requires high control. Control over who is using which technology, when, where, and under which conditions is possible. Used to investigate self-standing tasks from which results can be obtained immediately. Can establish causal relationships. Can confirm theories.

Requires medium control. The change to be assessed (e.g., new technology) is wide-ranging throughout the development process. Assessment in a typical situation is required.

Requires low control. Technology change is implemented across a large number of projects. A description of results, influence factors, differences and commonalities is needed.

Descriptive statistics
Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple graphics analysis, they form the basis of virtually every quantitative analysis of data.

Measures of central tendency
A measure of central tendency is a single number that is used to represent the average score in the distribution.
Mode – the most common score in a frequency distribution.
Median – the middlemost score in a distribution.
Mean – the common average.

Measures of variability
A single number that describes how much the data vary in the distribution.
Range – the difference between the highest and lowest score in a distribution.
Variance – the average of the squared deviations from the mean.
Standard deviation – the square root of the variance, a measure of variability in the same units as the scores being described.

Correlation and regression
Determine associations between two variables.
Correlation – the strength of the relationship between two variables.
Regression – predicting the value of one variable from another based on the correlation.
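The measures listed above can be computed with Python's standard library. The following is a minimal sketch; the scores and the team size/productivity pairs are hypothetical values invented for illustration, and the variance is the population variance (the average of squared deviations, as defined above).

```python
import math
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 4, 4, 5, 7, 8]

# Measures of central tendency.
central = {
    "mode": statistics.mode(scores),
    "median": statistics.median(scores),
    "mean": statistics.fmean(scores),
}

# Measures of variability (population variance: average squared deviation).
spread = {
    "range": max(scores) - min(scores),
    "variance": statistics.pvariance(scores),
    "std_dev": statistics.pstdev(scores),
}


def pearson_and_regression(xs, ys):
    """Return (r, slope, intercept) for a least-squares line of ys on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)   # strength of the linear relationship
    slope = sxy / sxx                # change in y per unit of x
    intercept = my - slope * mx
    return r, slope, intercept


# Hypothetical paired observations (e.g., team size vs. a productivity score).
team_size = [1, 2, 3, 4, 5]
productivity = [2, 4, 5, 4, 5]
r, slope, intercept = pearson_and_regression(team_size, productivity)

print(central)
print(spread)
print(f"r={r:.3f}, slope={slope:.2f}, intercept={intercept:.2f}")
```

The regression line can then be used for prediction, for example `intercept + slope * 6` for a team of six, with the usual caution about extrapolating beyond the observed range.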

Research question examples:
What are the key success factors of object-oriented frameworks?
Does the proposed software improvement increase the efficiency of its users?
How do software development methodology and team size influence developer productivity?

Create theoretical model
Research question: »How do software development methodology and team size influence developer productivity?«


Consider threats to the research
Threats to the research are related to operationalization and measurement issues:
Operationalization issues – the validity of the operationalization.
Measurement issues – reliability, validity, sensitivity (see below).

Design research
The theoretical model is based on the research question and represents a set of concepts and the relationships between them!
Select research method

Other research methods
Practitioner oriented methods: Delphi method, Action research
Laboratory oriented methods: Mathematical modeling, Computer simulation, Laboratory experiment
Technology oriented methods: Proof of technical concept
Literature based methods: Literature review, Conceptual study

...

Levels
(observed variables)

Independent variables
(latent variables)

Dependent variables
(latent variables)

Measures
(observed variables)

Pros

A theoretical model is used to conceptualize the problem stated in the research question. It is commonly represented as a causal model.

Pros (case study):
Can be incorporated in normal development activities.
Already scaled up to life size if performed on real projects.
Can determine whether expected effects apply in the studied context.
Easy to plan.
Helps answer why and how questions.
Can provide qualitative insights.
Cons (case study):
With little or no replication they may give inaccurate results.
Difficult to interpret and generalize (e.g., due to confounding factors).
Statistical analysis usually not possible.
Few agreed standards on procedures for undertaking case studies.

Pros (survey):
Can use existing experience.
Can confirm that an effect generalizes to many projects/organizations.
Allows the use of standard statistical techniques.
Enables research in the large.
Applicable to real-world projects in practice.
Generalization usually easier.
Good for early exploratory analysis.
Cons (survey):
May rely on different projects/organizations keeping comparable data.
No control over variables.
Can at most confirm association but not causality.
Can be biased due to differences between respondents and nonrespondents.
Questionnaire design may be tricky (validity, reliability).
Data collection: Questionnaires. Interviews. Project measurement. Literature survey.
Analysis types: Comparing different populations among respondents, association and trend analysis, consistency of scores.

OSSD, RUP, XP
Reliability threats
[Figure: panels labeled »Valid but not reliable«, »Valid and reliable«, »Reliable but not valid«]
Number of developers
Software development methodology
H1

Application in an industrial context requires compromises.

Inferential statistics
Inferential statistics or statistical induction comprises the use of statistics to make inferences concerning some unknown aspect of a population.

Developer productivity

Cons

Lines of code (LOC) per developer per day

Design experiment

Design case study

Design survey

Reliability – refers to the question of whether the research can be repeated with the same results.
Stability reliability – Does the measurement vary over time?
Representative reliability – Does the measurement give the same answer when applied to all groups?
Equivalence reliability – When there are many measures of the same construct, do they all give the same answer?

Development team size

H2
Measurement relationship – associates latent variables with their measures.
Causal relationships (H1, H2) – define cause-effect relationships between latent variables (theoretical propositions). They can be tested only by evaluating relationships between observed variables (hypotheses)!

Perform research on defined sample
Generalizable population – the population to which you ultimately want to generalize results.
Accessible population – the population to which you can actually gain access.

Data collection

Perform research

The objective of this activity is to run the study according to the study plan.

Sampling distribution – the distribution of means of samples from a population. The sampling distribution has three important properties:
It has the same mean as the population distribution.
It has a smaller standard deviation than the population distribution.
As the sample size becomes larger, the shape of the distribution approaches a normal distribution, regardless of the shape of the population from which the samples are drawn.

Hypothesis testing – the use of statistics to decide whether the observed data provide enough evidence to reject a given hypothesis. The usual process of hypothesis testing consists of four steps:
1. Formulate the null hypothesis H0 (the hypothesis that is of no scientific interest) and the alternative hypothesis Ha (statistical term for the research hypothesis).
2. Identify a test statistic that can be used to assess the truth of the null hypothesis.
3. Compute the p-value, which is the probability that a test statistic at least as extreme as the one observed would be obtained assuming that the null hypothesis were true. The smaller the p-value, the stronger the evidence against the null hypothesis.
4. Compare the p-value to an acceptable significance value alpha (sometimes called an alpha value). If p <= alpha, the observed effect is statistically significant and the null hypothesis is rejected in favor of the alternative hypothesis.

Statistical errors

| | Accept H0 | Reject H0 |
|---|---|---|
| H0 is TRUE | Correct decision | Wrong decision – Type I error |
| H0 is FALSE | Wrong decision – Type II error | Correct decision |
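The properties of the sampling distribution and the four hypothesis-testing steps can be illustrated with a short simulation. This is a sketch under stated assumptions: the exponential population, the sample size of 30, and the numbers passed to the z-test (sample mean 106, hypothesized mean 100, known standard deviation 15, n = 25) are all hypothetical, and a one-sided z-test is used as one possible test statistic, not the only choice.

```python
import math
import random
import statistics
from statistics import NormalDist


def sample_means(population, n, draws, rng):
    """Draw `draws` samples of size n (with replacement) and return their means."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(draws)]


def one_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Steps 2-4: compute the z statistic, its upper-tail p-value, and the decision."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / standard_error
    p_value = 1.0 - NormalDist().cdf(z)   # P(Z >= z) if H0 is true
    return z, p_value, p_value <= alpha   # True means "reject H0"


# Part 1: sampling distribution of the mean for a skewed (exponential) population.
rng = random.Random(42)                       # fixed seed for repeatability
population = [rng.expovariate(1.0) for _ in range(20_000)]
means = sample_means(population, n=30, draws=2_000, rng=rng)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)
means_mean = statistics.fmean(means)          # close to pop_mean
means_sd = statistics.pstdev(means)           # close to pop_sd / sqrt(30)

# Part 2: hypothesis test with hypothetical numbers.
# Step 1: H0: mu = 100, Ha: mu > 100.
z, p_value, reject_h0 = one_sided_z_test(sample_mean=106.0, mu0=100.0, sigma=15.0, n=25)

print(f"population mean={pop_mean:.3f}, mean of sample means={means_mean:.3f}")
print(f"population sd={pop_sd:.3f}, sd of sample means={means_sd:.3f}")
print(f"z={z:.2f}, p={p_value:.4f}, reject H0: {reject_h0}")
```

Rejecting H0 when it is in fact true would be a Type I error. Failing to reject it when it is false would be a Type II error.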

Process and product measurement. Questionnaires.

Process and product measurement. Questionnaires. Interviews

Validity threats
Face validity – Research community »good feel«.
Content validity – Are all aspects of the conceptual variable included in the measurement?
Criterion validity – Validity is measured against some other standard or measure for the conceptual variable.
Predictive validity – The measure is known to predict future behavior that is related to the conceptual variable.
Construct validity – A measure is found to give correct predictions in multiple unrelated research processes. This confirms both the theories and the construct validity of the measure.
Conclusion validity – Is concerned with the relationship between the treatment and the outcome of the research (choice of sample size, choice of statistical tests).
Experimental validity – (see reliability)

Collect data

Variable types
Dependent – represent the »effect«.

Actual sample – the sample actually used in research.

Hypothesis testing
Hypotheses are tested by comparing predictions with observed data. Observations that confirm a prediction do not establish the truth of a hypothesis. Deductive testing of hypotheses looks for disconfirming evidence to falsify hypotheses.

Latent: Developer efficiency, Software reliability, Requirements change, Development team size

Observed: LOC, Mean time between failure, {OSSD, RUP, XP}, Number of developers

Analysis types

All other variables which are not the focus of research are irrelevant variables.

Data is collected with a research instrument, for example a questionnaire.

Parametric and nonparametric statistics, compare central tendencies of treatments, groups.

Compare case study results to a representative comparison baseline: sister project, company baseline, project subset with no change.
Internal validity, Construct validity, External validity, Experimental validity or reliability

Analyze data
Choose data analysis: Qualitative data / Quantitative data

Major threats

Conclusion validity Internal validity Construct validity External validity

Represent the »cause«

Independent

Internal validity Experimental validity or reliability Construct validity External validity

The null hypothesis (H0)

Reject H0

Measurement issues
Reliability – Does the measurement give the same results under the same conditions (consistency)?
Validity – Does the measurement method actually provide information about the conceptual variable?
Sensitivity – How much does the measurement change with changes in the conceptual variable?

Sources of invalidity
Internal – Is concerned with the validity within the given environment and the reliability of results. It relates to the validity of the research process design, controls and measures.
External – Is the question of how general the findings are. Can you carry over the research results into the actual environment?

Use qualitative data analysis

Depends on data and the goal of the study.

Use quantitative data analysis
Discriminant or Logistic Regression

Select statistical test
Nominal Level of measurement of DV?
Int

Statistical significance – the probability of obtaining a result at least as extreme as the one observed by chance alone, assuming the hypothesis tested is true.

Repeated ANOVA
LoM for DV? Int

Here is the distribution of values of Z when the hypothesis tested is true (mean Z = 0). Alpha is the probability of rejecting the hypothesis tested when that hypothesis is true. Here we have set alpha = 0.05.

Describe abstract theoretical concepts. They cannot be directly measured.

Sensitivity
How much does the measurement change with the change of the conceptual variable?

Define ways of measuring latent variables. Each latent variable may have multiple empirical indicators.

The objective of this activity is to analyze the collected data in order to answer the operationalized study goal (research question).

Nom Level of measurement for IV

ANOVA

Interval

Linear regression

Draw conclusions

Consider threats

Consider reliability, validity and sensitivity! Consider sources of invalidity (internal, external)

3+

Positivist research model
Positivism is a philosophy that states that the only authentic knowledge is scientific knowledge, and that such knowledge can only come from positive affirmation of theories through strict scientific method.
[Figure labels: World of theory, World of propositions, Operationalization]

Qualitative vs. Quantitative analysis
Qualitative (»Judgments«)
Tends to be the poor relation.
Problems of opinion and perception when making the judgment.
The data collected is more likely to create differences of opinion over interpretation.
Not easily measurable.
As the benefits are longer term, they can be outweighed by shorter term costs.
Can lead to inconsistent assessments of performance between places, over time and between project elements.
Subjective opinions tend to be given less status than quantitative ones.

Quantitative (»Hard numbers«)
Easier to implement and collect data. Tick boxes.
Easier to make comparisons over time and between places.
Can be a quick fix when organizations need performance data to justify project investment.
Easier to process through a computer.
Easier for other stakeholders to examine and comprehend.
Trends and patterns easier to identify.
Can distort the evaluation process as we measure what is easy to measure.
Can lead to simplistic judgments in which the wider, more complex picture is ignored.

Conclusions can be drawn statistically or analytically.

# variables ?

2

Mind

Pearson correlation Linear regression

Chi-squared Goodness of Fit

Disseminate results

Spearman correlation Linear regression

Interval

Chi-squared cross tabulation

Abstract

Support / Falsify

Test »Toy world« Laboratory

One sample t-test

Reality

Real world Simplify

Don't be afraid to talk over ideas with others!

The objective of this activity is to report the study and its results so that external parties are able to understand the results in their contexts as well as replicate the study in a different context.

1 Nominal Level of measurement ?

The IV is the variable that defines conditions.
[Decision tree nodes: What levels of measurement? Are conditions independent or correlated? Branch labels: Correlated; Int + Int; Ord + Ord or Ord + Int]

How many levels does the IV have? (2 / 3+)

Ord: Friedman
Paired (related) t-test
Wilcoxon Matched Pairs
One way ANOVA

LoM for DV? (Int / Ord)

[Figure axis: values of Z]

START

The Z score for an item indicates how far, and in what direction, that item deviates from its distribution's mean, expressed in units of its distribution's standard deviation.
Power is the probability of rejecting the hypothesis tested when the alternative hypothesis is true. Power = 0.26

Int LoM for IV?

Here is the distribution of values of Z when a particular alternative hypothesis is true (mean Z = 1). Beta is the probability of accepting the hypothesis tested when the alternative hypothesis is true. Beta = 0.74

Indep.

Nom

Ord: Kruskal Wallis
Int: Independent t-test
Ord: Mann Whitney U

End empirical research

We got an answer to the stated research question.

IV = Independent Variable DV = Dependent Variable

How many levels does the IV have? (2 / 3+)
LoM for IV?
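One branch of the test-selection tree, the case where a single IV defines the conditions, can be written as a small lookup. This is a sketch only. The poster's exact branch layout is not fully recoverable, so the mapping below follows the conventional pairing of the tests named on the poster with the DV's level of measurement, the number of IV levels, and whether conditions are independent or correlated. The function name `select_test` and its parameters are hypothetical.

```python
def select_test(dv_level: str, iv_levels: int, correlated: bool) -> str:
    """Suggest a test for one IV that defines conditions.

    dv_level   -- level of measurement of the DV: "Int" or "Ord"
    iv_levels  -- number of levels (conditions) of the IV, 2 or more
    correlated -- True for correlated (related) conditions, False for independent
    """
    if dv_level not in ("Int", "Ord"):
        raise ValueError("dv_level must be 'Int' or 'Ord'")
    if iv_levels < 2:
        raise ValueError("the IV needs at least 2 levels")

    three_plus = iv_levels >= 3
    # Keys: (DV level, correlated conditions?, 3+ levels?)
    table = {
        ("Int", False, False): "Independent t-test",
        ("Ord", False, False): "Mann Whitney U",
        ("Int", False, True): "One way ANOVA",
        ("Ord", False, True): "Kruskal Wallis",
        ("Int", True, False): "Paired (related) t-test",
        ("Ord", True, False): "Wilcoxon Matched Pairs",
        ("Int", True, True): "Repeated ANOVA",
        ("Ord", True, True): "Friedman",
    }
    return table[(dv_level, correlated, three_plus)]


# Example: three methodologies (OSSD, RUP, XP) compared on an interval-scaled DV,
# with different teams in each condition (independent conditions).
print(select_test("Int", iv_levels=3, correlated=False))
```

A lookup table keeps each branch visible in one place, which makes it easy to check against the original flowchart.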

[Figure axis: values of Z]

The critical value of Z = 1.65
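The values quoted in the figure (alpha = 0.05, critical Z of about 1.65, Beta = 0.74, Power = 0.26) can be reproduced with `statistics.NormalDist`. The sketch assumes a one-tailed test with unit standard deviation under both hypotheses, which is what the figure's numbers imply. The exact critical value is about 1.645, which the figure rounds to 1.65.

```python
from statistics import NormalDist

alpha = 0.05
null = NormalDist(mu=0.0, sigma=1.0)         # distribution of Z if the tested hypothesis is true
alternative = NormalDist(mu=1.0, sigma=1.0)  # distribution of Z under the alternative (mean Z = 1)

# Critical value: reject the tested hypothesis when Z exceeds it (one-tailed).
z_critical = null.inv_cdf(1.0 - alpha)

# Beta: probability of accepting the tested hypothesis although the alternative is true.
beta = alternative.cdf(z_critical)

# Power: probability of rejecting the tested hypothesis when the alternative is true.
power = 1.0 - beta

print(f"critical Z = {z_critical:.3f}, beta = {beta:.2f}, power = {power:.2f}")
```

With these settings, roughly three out of four studies would miss an effect of this size, which is why sample size matters for conclusion validity.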

Design Experiment
Analyze <Object(s) of study> (what is studied/observed?) for the purpose of <Purpose> (what is the intention?) with respect to their <Quality focus> (which effect is studied?) from the point of view of the <Perspective> (whose view?) in the context of <Context> (where is the study conducted?).

Variables selection:
Independent and dependent variables
Observed variables
Measurement scales (nominal, ordinal, interval, ratio)

Selection of subjects:
Profile description
Quantity
Separation criteria

Context selection:
Online vs. offline
Student vs. professional
Specific vs. general

Design case study
Case method facts:
Does not explicitly control or manipulate variables.
Studies a phenomenon in its natural context.
Makes use of qualitative tools and techniques for data collection and analysis.
Case study research can be used in a number of different ways.
Can be used for description, discovery and theory testing.

Sample – a smaller set of cases a researcher selects from a larger pool and generalizes to the population.

Varieties of case study research:
Case studies can be carried out by taking a positivist or interpretivist approach.
Can be deductive and inductive.
Can use qualitative or quantitative methods.
Can investigate one or multiple cases.

Design Survey
A survey is a study conducted by asking (a group of) people from a population about their opinion on a specific issue, with the intention of identifying relationships and outcomes on this issue.

Survey process:
Study definition – determining the goal of a survey.
Design – operationalizing the study goals into a set of questions (see theoretical model).
Implementation – operationalization of the design so that the survey will be executable.
Execution – the actual data collection and data processing.
Analysis – interpretation of the data.
Packaging – reporting about the survey results.

Data analysis:
Coding scheme (for open questions)
Data entry
Checking
Resolve incomplete data
Statistical testing of results

Literature used
Freimut, B., Punter, T., Biffl, S., & Ciolkowski, M. 2002, State-of-the-Art in Empirical Studies, Virtuelles Software Engineering Kompetenzzentrum.
Johnston, R. & Shanks, G. 2003, Research Methods in Information Systems.
Neuman, W. L. 2005, Social research methods: qualitative and quantitative approaches, 5th edn.
Tellis, W. 1997, "Introduction to Case Study", The Qualitative Report, vol. 3, no. 2.
www.wikipedia.org

Logic of sampling
[Figure: population, sample frame, and sample, linked by the sampling process; results from the sample are generalized to the population.]

Case research design:
Single case – investigates a phenomenon in depth, gets close to the phenomenon, provides a rich description and reveals its deep structure.

Multiple case

Sample frame – a list of cases in a population or the best approximation of it.

Random sample: a sample in which a researcher uses a random sampling process so that each sampling element in the population will have an equal probability of being selected.
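Drawing a simple random sample from a sample frame can be sketched with the standard library. The frame below is a hypothetical list of 200 project identifiers, and the helper name `simple_random_sample` is an assumption made for illustration. `random.Random.sample` selects without replacement, so each element of the frame has an equal probability of being selected.

```python
import random


def simple_random_sample(sample_frame, n, seed=None):
    """Return n elements drawn without replacement, each with equal probability."""
    rng = random.Random(seed)  # a seed makes the draw repeatable for documentation
    return rng.sample(sample_frame, n)


# Hypothetical sample frame: the best available list of cases in the population.
sample_frame = [f"project-{i:03d}" for i in range(1, 201)]

sample = simple_random_sample(sample_frame, n=20, seed=7)
print(sample)
```

Recording the seed alongside the study plan lets others reproduce exactly which cases were selected.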

Experiment design:
Define the set of tests (treatments).
Decide how many tests are needed (to make effects visible).
Link the design to the hypothesis, measurement scales and statistics.
Randomize, block (a construct that probably has an effect on response) and balance (equal number of subjects).

| Experiment design | Random assignment | Pretest | Posttest | Control group | Experimental group |
|---|---|---|---|---|---|
| Classical | Yes | Yes | Yes | Yes | Yes |
| One shot case study | No | No | Yes | No | Yes |
| One group pretest posttest | No | Yes | Yes | No | Yes |
| Static group comparison | No | No | Yes | Yes | Yes |
| Two group posttest only | Yes | No | Yes | Yes | Yes |
| Time series design | No | Yes | Yes | No | Yes |

[Figure: design notation diagram for each experiment design]
Experiment design notation:
X = Treatment (represents a value of the independent variable)
O = Observation (of dependent variables)
R = Random assignment

Hypothesis formulation:
Hypotheses statements
H0: Null hypothesis (different treatments produce equal results)
Ha: One or two tailed alternative hypotheses (different treatments produce different results)
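The "randomize" and "balance" steps of experiment design can be sketched as a balanced random assignment of subjects to treatments. The subject names are hypothetical, the three treatments reuse the methodology levels from the theoretical model (OSSD, RUP, XP), and the function name `assign_balanced` is an assumption. Blocking is not covered here.

```python
import random


def assign_balanced(subjects, treatments, seed=None):
    """Randomly assign subjects to treatments with equal group sizes."""
    if len(subjects) % len(treatments) != 0:
        raise ValueError("number of subjects must be a multiple of the number of treatments")
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)       # randomize the order of subjects
    groups = {treatment: [] for treatment in treatments}
    # Deal the shuffled subjects round-robin so every group gets the same count.
    for index, subject in enumerate(shuffled):
        groups[treatments[index % len(treatments)]].append(subject)
    return groups


# Hypothetical subjects and treatments.
subjects = [f"developer-{i}" for i in range(1, 13)]
treatments = ["OSSD", "RUP", "XP"]

groups = assign_balanced(subjects, treatments, seed=1)
for treatment, members in groups.items():
    print(treatment, members)
```

Shuffling first and dealing round-robin gives each subject the same chance of landing in any group while keeping the design balanced.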

Multiple cases enable the analysis of data across cases, which enables the researcher to verify that findings are not the result of idiosyncrasies of the research setting. Cross-case comparison allows the researcher to use literal or theoretical replication.

In exploratory case studies, fieldwork, and data collection may be undertaken prior to definition of the research questions and hypotheses. Explanatory cases are suitable for doing causal studies. In very complex and multivariate cases, the analysis can make use of pattern-matching techniques. Descriptive cases require that the investigator begin with a descriptive theory, or face the possibility that problems will occur during the project.

Types of survey

Descriptive surveys are frequently conducted to enable descriptive assertions about some population, i.e., discovering the distribution of certain features or attributes. The concern is not about why the observed distribution exists, but instead what that distribution is.

Explanatory surveys aim at making explanatory claims about the population. For example, when studying how developers use a certain inspection technique, we might want to explain why some developers prefer one technique while others prefer another. By examining the relationships between different candidate techniques and several explanatory variables, we may try to explain why developers choose one of the techniques.

Explorative surveys are used as a pre-study to a more thorough investigation to ensure that important issues are not overlooked. This could be done by creating a loosely structured questionnaire and letting a sample from the population answer it. The information is gathered and analyzed, and the results are used to improve the full investigation. In other words, the explorative survey does not answer the basic research question, but it may provide new possibilities that could be analyzed and should therefore be followed up in the more focused or thorough survey.