
CHAPTER 1
Frameworks for Generating and Applying Evidence
Chapter Overview
Objectives
After completing this chapter, the learner will be able to:
1. Describe the purpose of clinical research.
2. Describe how quantitative and qualitative research differ and how they complement each
other.
3. Describe the steps of the research process.
4. Discuss the role of evidence in clinical decision making.
5. Discuss the components of the International Classification of Functioning, Disability and
Health and how they influence research questions.
6. Explain the role of interprofessional research.
7. Describe the purpose of explanatory, exploratory, and descriptive research.

Key Terms

Qualitative research
Quantitative research
Scientific method
Translational research
Efficacy
Effectiveness
International Classification of Functioning, Disability and Health (ICF)
Body structures
Body functions
Activities
Participation
Environmental factors
Personal factors
Intraprofessional
Multiprofessional
Interprofessional
Transprofessional
Basic research
Applied research
Systematic review
Meta-analysis
Scoping review
Explanatory research
Experimental designs
Randomized controlled trial (RCT)
Pragmatic clinical trial (PCT)
Quasi-experimental designs
Single subject designs (SSD)
N-of-1 trial
Exploratory research
Epidemiology
Cohort studies
Case-control studies
Correlational/predictive studies
Methodological research
Descriptive research
Developmental research
Normative research
Case report/case series
Historical research
Mixed methods research

Chapter Summary

• Clinical research is a structured process of investigating facts and theories and exploring connections, with the purpose of improving individual and public health.

• NIH defines three clinical research purposes:

– To conduct patient-oriented research to understand mechanisms of disease and interventions.

– To conduct epidemiologic observational studies to describe patterns of disease and identify risk factors.

– To conduct outcomes research and health services research to determine the impact of research on population health and evidence-based interventions.

• Qualitative research strives to capture naturally occurring phenomena, following a tradition of social constructivism. This philosophy is focused on the belief that all reality is fundamentally social, and therefore the only way to understand it is through an individual's experience.

• Quantitative research is based on a philosophy of logical positivism, in which human experience is assumed to be based on logical and controlled relationships among defined variables. It involves measurement of outcomes using numerical data under standardized conditions.

• The scientific method has been defined as a systematic, empirical, and controlled critical examination of hypothetical propositions about the associations among natural phenomena.

• The process of clinical research involves sequential steps that guide thinking, planning, and analysis, beginning with identification of the research question, moving to designing and implementing the study, analyzing data, and disseminating findings.

• Evidence-based practice (EBP) involves critical evaluation of literature and consideration of clinical expertise, patient values, and clinical circumstances to inform clinical decision-making.

• Translational research is the application of basic scientific findings to clinically relevant issues and, simultaneously, the generation of scientific questions based on clinical dilemmas.

• Efficacy is the benefit of an intervention as compared to a control, placebo, or standard program, tested in a carefully controlled environment, with the intent of establishing cause-and-effect relationships. Effectiveness refers to the benefits and use of procedures under "real world" conditions, where circumstances cannot be controlled within an experimental setting.

• The International Classification of Functioning, Disability and Health (ICF) is a model that shows the relationship among body structures and functions, activities, and participation in life roles. The effect of a health condition on these behaviors can be mitigated by environmental and personal factors.

• Interprofessional research is an important focus to assure that questions are answered using varied perspectives and expertise.

• Basic research is directed toward the acquisition of new knowledge. Applied research advances the development of new diagnostic tests, drugs, therapies, and prevention strategies, answering questions with direct clinical application.

• Explanatory research uses experimental designs to compare two or more interventions. These include randomized controlled trials (RCTs), pragmatic clinical trials (PCTs), and single-subject designs (SSDs).

• Exploratory research uses observational designs to explore relationships and predict risk factors. These include cohort studies, case-control studies, and predictive studies for prognosis and diagnostic accuracy.

• Methodological research is used to study reliability and validity of measurements.

• Descriptive research is used to describe groups or populations to document their characteristics. Approaches include developmental and normative research, case reports, historical research, and qualitative research.
CHAPTER 2
On the Road to Translational Research
Chapter Overview

Objectives
After completing this chapter, the learner will be able to:
1. Define translational research.
2. Discuss the difference between efficacy and effectiveness.
3. Distinguish characteristics of randomized and pragmatic trials.
4. Describe the translational continuum in research in relation to Phase I-IV trials.
5. Discuss the advantages and limitations of comparative effectiveness research in providing
evidence for practice.
6. Discuss the relevance of considering patient-reported outcome measures and patient-oriented
evidence that matters.
7. Discuss the purpose of implementation studies.

Key Terms

Translational research
Randomized controlled trial (RCT)
Effectiveness trials
Pragmatic clinical trial (PCT)
Basic research
Phase I trials
Phase II trials
Phase III trials
Phase IV trials
Practice based evidence
Outcomes research
Patient-oriented evidence that matters (POEM)
Disease oriented evidence (DOE)
Patient-reported outcome measure (PROM)
Patient-centered outcomes research (PCOR)
Primary outcome
Secondary outcome
Implementation science

Chapter Summary

• Translational research refers to the direct application of scientific discoveries into clinical practice, often referred to as taking knowledge from "bench to bedside."

• Many scientific breakthroughs can take up to 25 years to reach publication and implementation, creating a "translation gap."

• Efficacy is the benefit of a new therapy established using a randomized controlled trial (RCT), which incorporates random allocation, blinding, and designs to reduce sources of bias.

• Effectiveness trials look at the effect of interventions under real-world conditions using pragmatic trials.

• Four translation blocks include Phase I trials to determine if treatments are safe in humans, Phase II and III trials to determine efficacy with specific controls and defined patient samples, and Phase IV trials that look at implementation in community settings.

• Comparative effectiveness research (CER) is designed to study real-world application of research evidence by comparing two treatments.

• Pragmatic clinical trials (PCTs) study interventions in diverse settings to better reflect practical circumstances of patient care.

• Practice-based evidence (PBE) refers to the type of evidence that is derived from real patient care problems, identifying gaps between recommended and actual practice.

• Outcomes research is an umbrella term that describes the impact of health care practices and interventions.

• Patient-reported outcome measures (PROMs) are outcomes that reflect what is most important to patients, such as quality of life and function.

• Patient-centered outcomes research (PCOR) has the goal of engaging patients and other stakeholders in the development of research questions and outcome measures.

• Studies can designate a primary outcome as the one that will be used to assess therapeutic benefit, as well as secondary outcomes that are other endpoint measures.

• Implementation science is the study of methods to promote the integration of research findings and evidence into healthcare policy and practice.
CHAPTER 3
Defining the Research Question

Chapter Overview
Chapter Overview

Objectives

After completing this chapter, the learner will be able to:

1. Describe the process of developing a research question.
2. Discuss the sources of research questions.
3. Describe how a theoretical rationale forms the framework for a research question.
4. Define independent and dependent variables.
5. Describe the purpose of operational definitions.
6. Describe the characteristics of good research hypotheses.
7. Express the purpose of research in terms of the research problem, the research question,
specific aims, and research hypotheses.

Key Terms

Systematic reviews
Meta-analysis
Explanatory research
Exploratory research
Descriptive research
Methodological research
PICO
Variable
Factor
Independent variable
Dependent variable
Levels
Operational definition
Research hypothesis
Statistical hypothesis
Nondirectional hypothesis
Directional hypothesis
Simple hypothesis
Complex hypothesis
Guiding questions
Specific aims
Power
Chapter Summary

• The research process begins by identifying a topic of interest and clarifying the problem to generate a specific, testable question.

• Questions may arise from clinical experience, from clinical theory, from gaps or conflicts in the literature, or from the need to better understand patterns or characteristics of populations.

• Research questions must be built on a rationale that justifies the need for the study and the foundation for testing it. This process includes a review of literature to understand prior research and the state of knowledge. Systematic reviews and meta-analyses are useful for summarizing the conclusions and quality of past research.

• Research questions can focus on four types of objectives: explaining cause and effect through comparison of interventions, looking for relationships to determine how clinical phenomena interact, describing existing conditions or phenomena in a particular population, or studying measurement methods to investigate reliability and validity.

• Research questions should be framed around four essential components using the acronym PICO: the Population being studied, the Intervention of interest (which may be a diagnostic test or risk factor), a Comparison group (if relevant), and Outcomes (a worked example follows this summary).

• Studies incorporate independent variables that represent conditions or interventions that predict outcomes, and dependent variables that represent responses or measured outcomes.

• Operational definitions define variables according to their unique meaning within a study design, providing detail on how treatments are applied or outcomes are measured.

• Research hypotheses are declarative statements that predict the relationship between independent and dependent variables, and are used to guide the interpretation of outcomes. The purpose of a study is to test the hypothesis and, ultimately, to provide evidence so that the researcher can accept or reject it. Hypotheses can be expressed with direction (which treatment will be better) or without direction (indicating only that they will be different).

• Descriptive studies will not have hypotheses, but will use guiding questions or specific aims to frame the study's purpose.

• A good research question will focus on important problems that have an impact on patient care. It will direct a study that adheres to ethical standards and that is feasible in terms of time, expertise, and resources.
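As an illustration (a hypothetical question, not drawn from the chapter), a PICO question for an intervention study might read: "In adults with chronic low back pain (P), does aquatic exercise (I), compared with land-based exercise (C), reduce pain and improve function (O)?" Each element then maps directly to the sample, the study arms, and the outcome measures.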
CHAPTER 4
The Role of Theory in Research and Practice
Chapter Overview
Objectives
After completing this chapter, the learner will be able to:
1. Discuss the role of theory in clinical practice and research.
2. Identify four purposes of theories in clinical research.
3. Illustrate deductive and inductive logic and differentiate inductive and deductive theories.
4. Define and illustrate concepts and constructs.
5. Describe propositions and models in relation to clinical theory.
6. Illustrate how hypotheses are derived from theories and are used to test theories.

Key Terms

Theory
Concepts
Variables
Constructs
Propositions
Models
Deductive reasoning
Inductive reasoning
Inductive theories
Middle-range theory
Grand theory
Meta-theory
Law

Chapter Summary

• A theory is a set of interrelated concepts that specify relationships among variables, representing a reasonable explanation of those relationships.

• Theories are used to summarize knowledge to explain observable events, to predict what should occur under specific conditions, to stimulate development of new knowledge, and to provide a basis for asking an applied research question.

• Theories are built on concepts. Constructs are abstract concepts that are not observable but are inferred by measuring relevant or correlated behaviors, such as function or pain.

• A proposition is a generalized statement that asserts the theoretical linkages between concepts.

• A model is a simplified approximation of a process or structure, including conceptual, physical, and statistical models.

• Deductive reasoning is the acceptance of a general proposition and the subsequent inferences that can be drawn from it. Deductive theories are intuitive, providing insight that can then be tested.

• Inductive reasoning is logic that develops generalizations from specific observations. Inductive theories evolve through a process that begins with empirical observations that are studied for patterns to form generalizations.

• Theories are important foundations for asking good clinical questions, providing the rationale for interpretation of outcomes.

• Middle-range theories form a bridge between theory and empirical observations, providing opportunities for hypothesis testing under clinical conditions. Grand theories are more comprehensive, trying to explain phenomena at the societal level. Meta-theories are used to reconcile several theoretical perspectives in the explanation of sociological, psychological, or physiological phenomena. Laws are derived when theories reach the level of absolute consistency, mostly observed in the physical sciences.
CHAPTER 5
Understanding Evidence-Based Practice
Chapter Overview
Objectives
After completing this chapter, the learner will be able to:
1. Discuss the factors that have influenced the need for EBP in your profession.
2. Describe sources of knowledge and how they relate to the use of evidence in practice.
3. Define EBP and how the model contributes to clinical decision-making.
4. Describe the five steps in the EBP process.
5. Develop clinical questions using the PICO format for studies of interventions, diagnosis and
prognosis.
6. Give examples of background and foreground questions related to a patient case.
7. Describe the general questions used to critically appraise a study.
8. Describe the levels of evidence used to distinguish the strength of studies for quantitative and
qualitative studies.
9. Define the importance of implementation studies and knowledge translation to the EBP process.

Key Terms

Evidence-based practice (EBP)
Background question
Foreground question
PICO
Systematic reviews
Meta-analysis
Clinical practice guidelines
Scoping reviews
Critical appraisal
Levels of evidence
Knowledge translation (KT)

Chapter Summary

• Evidence-based practice (EBP) is the use of best research evidence, in conjunction with clinical expertise, patient values, and clinical circumstances, to inform clinical decisions.

• EBP is important because of the gaps in clinical knowledge, and the overuse, underuse, or misuse of available procedures.

• EBP contributes to decision-making, in contrast to the typical sources of knowledge, which include reliance on tradition, authority, or experience.

• EBP does not mean that everyone has to use a "cookbook" approach, but is part of a larger process that includes experience and judgment within the framework of patient needs and preferences.

• The five steps of EBP include 1) asking a clinical question, 2) acquiring relevant literature, 3) appraising the literature, 4) applying findings to a clinical decision, and 5) assessing the success of the process. These steps are sometimes referred to as the five "A's."

• A background question is related to etiology or general knowledge about a patient's condition, referring to the cause of a disease or condition, its natural history, signs and symptoms, or the anatomic or physiological mechanisms that relate to pathophysiology.

• A foreground question focuses on specific knowledge to inform decisions about patient management. It is phrased using the PICO format: Population, Intervention, Comparison, and Outcomes. These questions can relate to diagnosis, measurement, prognosis, intervention, or patient experiences (an example follows this summary).

• Systematic reviews and meta-analyses are syntheses of literature that provide summaries of evidence to aid in decision-making. They assess the availability of evidence and appraise the quality of the research studies, providing summary conclusions.

• Critical appraisal of literature should address three main questions: Is the study valid? Are the results meaningful? Are the results relevant to my patient?

• Research studies have been classified according to levels of evidence that represent the expected rigor of the study design, indicating the level of confidence that can be placed in the findings. For studies of diagnostic accuracy or prognosis, observational studies will provide the strongest evidence. For studies of interventions, screening, or assessment of harm, RCTs are considered most effective. Systematic reviews should provide the highest level of evidence because they include multiple studies and quality assessment.

• Qualitative research does not fit neatly into the classification system because of its intent to understand human experience. Evidence can be judged according to generalizability of the study based on theoretical premise and ability to extend findings into broader contexts.

• Many barriers exist to the implementation of EBP within clinical settings, including a lack of skill in critical appraisal of literature, the ability to access literature, and the lack of time and resources that are typical in busy clinical environments. Many strategies can be used to establish an evidence-based culture.

• Knowledge translation (KT) is related to the long-standing problem of underutilization of evidence. Although research may demonstrate treatment effectiveness or the presence of risk factors, KT involves the adaptation of quality research into relevant priorities, including the creation and application of knowledge.
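For illustration (a hypothetical case, not drawn from the chapter): for a patient recovering from stroke, a background question might be "What is the typical course of motor recovery after stroke?", while a foreground question would be phrased in PICO form: "In adults after stroke (P), does task-specific training (I), compared with usual care (C), improve upper-extremity function (O)?"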
CHAPTER 6
Searching the Literature
Chapter Overview
Objectives
After completing this chapter, the learner will be able to:
1. Apply strategies for using search engines and databases for locating research
literature.
2. Develop search strategies using Boolean logic and Medical Subject Headings.
3. Describe several methods for refining or broadening a search.
4. Distinguish between primary and secondary sources of information.

Key Terms

Peer review
Grey literature
Publication bias
Primary source
Secondary source
Database
MEDLINE
PubMed
Digital object identifier (DOI)
Cumulative Index to Nursing and Allied Health Literature (CINAHL)
Cochrane Database of Systematic Reviews
Search engines
Keywords
Truncation
Wildcards
Boolean logic
Nesting
Medical subject headings (MeSH)
Exploding
Focusing
Limits
Filters
Clinical queries
Sensitivity
Specificity
Abstracts
Interlibrary loan
Open access
My NCBI
Citation management applications
Review of literature
Chapter Summary

• Scientific literature includes all scholarly products, including original research articles, editorials, position papers, reviews and meta-analyses, books, dissertations, conference proceedings, and website materials.

• Many journals are peer reviewed, which means that manuscripts are scrutinized by experts before they are accepted for publication.

• Grey literature is material that is not produced by commercial publishers, including government documents, reports of all types, fact sheets, practice guidelines, conference proceedings, and other non-journal products. These can provide important information for evidence-based practice (EBP).

• Primary sources are reports provided directly by the investigator, such as journal articles. Secondary sources include reviews of studies presented by someone other than the original author. Caution must be exercised when using secondary sources, as they may not accurately reflect the material in the original source.

• Systematic reviews and meta-analyses include critical analysis of published works. Although technically a secondary source, these are a form of research, generating new knowledge based on rigorous and documented analysis of previous research.

• Many databases and search engines can be used to access citations and full text of literature. MEDLINE is the most comprehensive database of health-related research and can be freely accessed through PubMed. Other important databases include CINAHL and the Cochrane Database of Systematic Reviews.

• The search process begins by identifying keywords. These terms can be identified within a PICO question. Boolean logic is used to create combinations of keywords that can help to fine-tune a search using the terms AND, OR, and NOT (a sample query appears at the end of this summary).

• Medical Subject Headings (MeSH) are subject headings developed by the National Library of Medicine to sort through keywords.

• Several search strategies can be used to refine a search by truncating keywords, filtering for dates or other characteristics, and accessing clinical queries through PubMed to target types of research. Other strategies include finding references in other articles and looking for related articles within a database.

• Reviewing abstracts helps to screen for articles that are relevant to your clinical question.

• Full text of articles may be available for free through PubMed and other databases. When free articles are not available, access can be obtained through interlibrary loan, author requests, and public libraries. Many journals are open access.

• PubMed offers many options that allow you to save searches, build collections of related citations, and get e-mail alerts when relevant articles are published.

• Citation management applications allow you to save references and organize your citations when writing your papers. The most popular are EndNote and RefWorks, both of which require subscription or access through your institution. Several free reference tools are available.

• When completing a review of literature, it is helpful to start with a systematic review that can give you an idea of the scope of prior research.
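As a hypothetical illustration of these strategies (the topic and terms are invented for this example, not taken from the chapter), a PubMed search might combine MeSH terms, Boolean operators, nesting, and truncation:

    ("Low Back Pain"[MeSH] OR "back pain"[tiab])
    AND ("Exercise Therapy"[MeSH] OR exercis*[tiab])
    NOT "case reports"[Publication Type]

Here OR gathers synonyms for a single concept, AND intersects the two concepts, NOT excludes a publication type, the parentheses nest each concept, and the asterisk truncates exercis* to match exercise, exercises, or exercising.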
CHAPTER 7
Ethical Issues in Clinical Research
Chapter Overview

Objectives
After completing this chapter, the learner will be able to:
1. Describe the major principles in the Common Rule and recent changes.
2. Explain the role of HIPAA in clinical research.
3. Define the principles of beneficence, justice, and autonomy as they relate to research ethics.
4. Discuss ethical issues related to use of control groups in human studies research.
5. Discuss historical foundations of research ethics.
6. Describe the role of the institutional review board in clinical research.
7. Describe the elements of informed consent.
8. Describe the process of developing and submitting a proposal to an IRB.
9. Define the three main types of misconduct in research.

Key Terms

Nuremberg Code
Declaration of Helsinki
National Research Act
Institutional Review Board (IRB)
Belmont Report
The Common Rule
Respect for persons
Autonomy
Beneficence
Justice
Risk-benefit ratio
Expedited review
Exempt review
Informed consent
Risk
Equipoise
Benefits
Fabrication
Falsification
Plagiarism
Conflict of interest (COI)
Peer review
Retractions
Replication studies
Chapter Summary

• Several documents have become part of ethical standards for human research. The Belmont Report codifies three basic principles that guide clinical research.

– Respect for persons involves attention to human dignity, and an individual's right to autonomy or self-determination. This means that individuals have the right to make decisions about their lives, including whether to participate in research studies.

– Beneficence refers to the obligation to attend to the well-being of individuals, including physical and psychological risks.

– Justice refers to fairness in the research process, or the equitable distribution of the benefits and burdens. This relates to selection of participants, with the intent that subjects represent diverse elements of the population and that the burden of research does not fall only on disadvantaged groups. It also requires attention to the risk-benefit ratio, to ensure that potential benefits outweigh potential risks.

• Federal regulations require the review of research proposals by an Institutional Review Board (IRB) to ensure the rights and protections of all participants. Proposals may be reviewed fully or may receive expedited review depending on the nature of the study and potential risks to subjects.

• Informed consent must be obtained from all participants in research. Informed consent forms document the full scope of a study, potential risks and benefits, and protections such as anonymity and confidentiality. These forms must be signed to document fully voluntary participation. Most institutions have specific requirements for drafting consent forms, which will be scrutinized by the IRB. Informed consent forms must include certain elements, which are codified in the Common Rule (see the Chapter 7 Supplement).

• Research integrity on the part of the researcher is an important concern in human studies, with many unfortunate examples of violations of ethics. The three forms of misconduct include fabrication of results, plagiarism, and falsification of data. When studies are found to violate these ethical standards, they may be retracted so that readers will know that the findings should not be considered valid.

• Conflicts of interest (COI) occur when a researcher's professional judgment may be influenced by secondary personal interests, such as financial gain. Potential conflicts must be disclosed as part of the planning and reporting of research.
CHAPTER 8
Principles of Measurement
Chapter Overview

Objectives
After completing this chapter, the learner will be able to:
1. Distinguish among continuous, discrete, and dichotomous variables.
2. Discuss the challenge of measuring constructs.
3. Define and provide examples of the four scales of measurement.
4. Discuss the relevance of identifying measurement scales for statistical analysis.

Key Terms

Dichotomous variable
Polytomous variable
Continuous variable
Discrete variable
Precision
Constructs
Levels of measurement
Nominal scale
Ordinal scale
Interval scale
Ratio scale
Parametric tests
Nonparametric tests

Chapter Summary

• Measurements are taken to describe quantity, set criteria for performance, make comparisons, evaluate patient conditions, discriminate between individuals with different characteristics, and predict outcomes or relationships.

• Measured variables can be dichotomous, having only two possible values, or polytomous, having more than two values.

• A continuous variable can take on any value along a continuum, whereas a discrete variable can be described only in whole integer units.

• A construct is an abstract variable that is not observable and is defined by the measurement used to assess it. It is also considered a latent trait because it reflects a property within a person and is not externally observable. Examples are intelligence, health, pain, mobility, and depression.

• Four levels of measurement include:

– Nominal scale: classifies objects or people into categories with no quantitative order. Example: blood type.

– Ordinal scale: a rank-ordered measure where intervals between values are unknown and likely unequal. Example: numerical rating scale for pain.

– Interval scale: possesses rank order and has known and equal intervals between consecutive values, but no true zero. Example: temperature.

– Ratio scale: the highest level of measurement, an interval scale with an absolute zero point. Examples: height and weight.

• Nominal values can only be counted. Ordinal measures can be ranked and expressed as counts or percentages. Technically these values should not be subjected to arithmetic operations, but they often are. Mathematical calculations can be used with interval and ratio data.

• Parametric statistics, which are designed to estimate population values, require interval or ratio data. Nonparametric statistics are used with ordinal or nominal measures. There are conditions when ordinal values may also be used with parametric statistics (see the sketch following this summary).
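Because the choice between parametric and nonparametric tests follows directly from the level of measurement, the decision rule can be sketched in code. A minimal sketch in Python (the names and mapping are illustrative, not from the text):

    from enum import Enum

    class Level(Enum):
        NOMINAL = 1   # categories only (e.g., blood type)
        ORDINAL = 2   # ranked, intervals unknown (e.g., pain rating)
        INTERVAL = 3  # equal intervals, no true zero (e.g., temperature)
        RATIO = 4     # equal intervals with an absolute zero (e.g., height)

    def default_test_family(level: Level) -> str:
        # Parametric tests assume at least interval-level data;
        # nominal and ordinal data default to nonparametric tests
        if level in (Level.INTERVAL, Level.RATIO):
            return "parametric"
        return "nonparametric"

    print(default_test_family(Level.ORDINAL))  # nonparametric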
CHAPTER 9
Concepts of Measurement Reliability
Chapter Overview

Objectives
After completing this chapter, the learner will be able to:
1. Discuss the importance of reliability in clinical measurement.
2. Define reliability in terms of measurement error.
3. Distinguish between random and systematic error.
4. Describe typical sources of measurement error.
5. Describe the effect of regression toward the mean in repeated measurement.
6. Discuss how concepts of agreement and correlation relate to reliability.
7. Define and provide examples of test-retest, rater, and alternate forms reliability.
8. Discuss how generalizability theory influences the interpretation of reliability.
9. Discuss how reliability is related to the concept of minimal detectable change.

Key Terms

Reliability
Classical measurement theory
True score
Systematic error
Random error
Reliability coefficient
Relative reliability
Absolute reliability
Intraclass correlation coefficient (ICC)
Standard error of measurement (SEM)
Generalizability theory
Test-retest reliability
Carryover
Testing effects
Intra-rater reliability
Inter-rater reliability
Alternate forms reliability
Internal consistency
Split-half reliability
Change score
Regression toward the mean (RTM)
Minimal detectable change (MDC)
Methodological research

Chapter Summary

• In classical measurement theory, an observed score consists of two components: a true score, which is a fixed value, and unknown error.

• Systematic error is predictable, occurring as a consistent overestimate or underestimate of a measure. Random error refers to errors that have no systematic bias and can occur in any direction or amount.

• Errors of measurement can occur because: 1) the individual taking the measurement (the rater) does not perform the test properly; 2) the measuring instrument itself does not perform in the same way each time; or 3) the variable being measured is not consistent over time.

• Reliability is an indicator of the degree to which a measurement contains error. Relative reliability is expressed as a coefficient, which reflects how much of the total variance in a set of scores is true variance, and not error. Reliability coefficients can range from 0.0 to ±1.00. Consideration of what coefficient value is considered acceptable reliability must be made within the context of the measurement and the degree of precision needed for making judgments about the measurement.

• Absolute reliability indicates the degree of error variance in the actual units of a measurement, reflecting the measured amount of error that can be expected.

• Reliability exists to some extent in every measuring instrument and is not an all-or-none trait.

• Generalizability theory suggests that not all error is the same, and error can be partitioned to understand specific sources, such as rater or timing.

• Test-retest reliability is an assessment of how well an instrument will perform from one trial to another when the actual measurement has not really changed.

• Intra-rater reliability is a measure of the stability of data recorded by one tester across two or more trials. Inter-rater reliability concerns variation between two or more raters who are measuring the same property.

• Alternate forms reliability is used to assess whether alternative versions of a test can provide reliable scores. This may be in the form of a test or questionnaire, or it may be comparison of different models of an instrument.

• Internal consistency reflects the extent to which items on a multi-item inventory are measuring the same characteristics.

• Change scores reflect the difference in performance from one session to another, often a pretest and posttest. If measures do not have strong reliability, change scores may primarily be a reflection of error.

• The phenomenon of regression toward the mean indicates that extreme scores on an initial test are expected to move closer to the group average on a second test, even when there is no true change in performance. This phenomenon is reduced if a measurement has strong reliability.

• Minimal detectable change (MDC) is the amount of change in a variable that must be achieved beyond the minimal error in a measurement. It is a threshold above which we can be confident that a change reflects true change and not just error (a worked computation follows this summary).

• Methodological research involves the development and testing of measuring instruments to establish reliability.

• Methods for improving reliability include standardizing measurement protocols, training raters, calibrating instruments, and taking multiple measurements.
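The link between reliability and detectable change can be made concrete with two standard formulas: the standard error of measurement, SEM = SD x sqrt(1 - r), where r is a reliability coefficient such as the ICC, and MDC95 = 1.96 x sqrt(2) x SEM. A minimal sketch in Python (the example numbers are invented):

    import math

    def sem(sd: float, reliability: float) -> float:
        # Standard error of measurement: SEM = SD * sqrt(1 - r)
        return sd * math.sqrt(1.0 - reliability)

    def mdc95(sd: float, reliability: float) -> float:
        # MDC95 = 1.96 * sqrt(2) * SEM; sqrt(2) reflects error in both test scores
        return 1.96 * math.sqrt(2.0) * sem(sd, reliability)

    # Example: SD = 8 points, ICC = 0.90 -> SEM ~ 2.5 and MDC95 ~ 7.0 points,
    # so a change smaller than about 7 points may be error, not true change.
    print(round(sem(8, 0.90), 1), round(mdc95(8, 0.90), 1))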
CHAPTER 10
Concepts of Measurement Validity
Chapter Overview

Objectives
After completing this chapter, the learner will be able to:
1. Discuss the importance of validity in clinical measurement.
2. Define and provide examples of face, content, criterion-related, and construct validity.
3. Discuss issues affecting validity of measuring change.
4. Define minimal clinically important difference.
5. Distinguish between criterion and norm referencing.

Key Terms

Validity
Content validity
Criterion-related validity
Concurrent validity
Predictive validity
Construct validity
Convergent validity
Discriminant validity
Face validity
Gold standard
Sensitivity
Specificity
Known groups method
Multitrait-multimethod matrix (MTMM)
Factor analysis
Exploratory factor analysis (EFA)
Confirmatory factor analysis (CFA)
Criterion-referenced test
Norm-referenced test
Responsiveness
Minimal detectable change (MDC)
Minimal clinically important difference (MCID)
Floor effect
Ceiling effect
Chapter Summary

• Validity concerns the meaning that we give to a measurement, reflecting whether a measure is capable of discriminating among individuals with and without certain traits, whether it can evaluate the magnitude or degree of change in a measurement, and whether we can use it to make predictions about future status based on a measurement.

• Validity is affected by systematic error.

• Validity of physical measures is straightforward, such as measurement of length using a ruler. Measuring constructs presents a challenge for assessing validity.

• Validity is not an all-or-none property and will be present to some degree in most measures. Therefore, rather than asking if an instrument is valid, it is more appropriate to ask if it is valid for a given purpose.

• Measuring validity is not as straightforward as reliability. It must be evaluated by constructing an "evidentiary chain" that links the method of measurement to the intended application. Three main types of evidence can support validity.

• Content validity establishes that the content of a test adequately samples the universe of content that defines the construct being measured. This is typically established through consensus of experts.

• Criterion-related validity establishes the correspondence between a target test and a reference standard to determine that the target test is measuring the variable of interest. Concurrent validity establishes this correspondence by taking both measurements at relatively the same time. Predictive validity reflects the extent to which the target test can predict a future reference standard.

• The standard used to establish criterion validity is ideally a gold standard that is already established as a valid measure of the variable of interest. When a gold standard is not available, as is true for many clinical variables, a reference standard must be chosen that is assumed to measure the variable of interest. This is the process used in the assessment of diagnostic accuracy and screening tests.

• Sensitivity indicates the extent to which the target test accurately identifies those with the condition (true positives), and specificity measures the ability of the target test to identify those without the condition (true negatives) (a worked example follows this summary).

• Construct validity establishes the ability of an instrument to measure the dimensions and theoretical foundations of an abstract construct.

• Methods of establishing construct validity include the known groups method, whereby individuals known to be different on a variable are tested to determine that the instrument can identify their group membership. Factor analysis is a statistical procedure that examines multiple dimensions of a scale to determine if the theoretical components of the measurement are adequately represented.

• Convergent validity measures the extent to which a test correlates with other tests of similar constructs, and discriminant validity is the extent to which a test is not correlated with tests of different constructs.

• Face validity is not considered a true measure of validity but does reflect the degree to which an instrument appears to measure what it is intended to measure.

• Criterion referencing refers to comparison of an individual's performance on a test to a fixed standard that represents an acceptable level of behavior, such as a passing grade on an exam or measurement of a normal heart rate. Norm-referencing is used to compare and rank individuals within a defined population, indicating how their performance ranks within the group, such as a curved grade on an exam.

• The minimal clinically important difference (MCID) is the smallest difference in a measured variable that signifies a meaningful difference in performance. It is based on a criterion that establishes sufficient change to be considered "important." This may be a subjective determination by a patient ("I feel better"), or it may reflect a clinician's or family member's view of improvement.

• A ceiling effect occurs when a scale is not precise enough at the high end to distinguish small changes in those with strong performance. A floor effect occurs when the scale cannot sufficiently distinguish small changes at the bottom of the scale.

• Methodological research for validity incorporates testing both reliability and validity in several ways over time. Strategies for improving validity include fully understanding the construct of interest and its clinical context, using several different approaches for gathering evidence of validity, and cross-validating outcomes on different groups to show consistency in outcomes.
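Sensitivity and specificity come straight from a 2x2 table of test results against the reference standard. A minimal worked sketch in Python (the counts are invented):

    def sensitivity(tp: int, fn: int) -> float:
        # Proportion of those WITH the condition that the test detects (true positive rate)
        return tp / (tp + fn)

    def specificity(tn: int, fp: int) -> float:
        # Proportion of those WITHOUT the condition that the test clears (true negative rate)
        return tn / (tn + fp)

    # 90 true positives, 10 false negatives, 80 true negatives, 20 false positives
    print(sensitivity(90, 10))  # 0.9
    print(specificity(80, 20))  # 0.8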
CHAPTER 11
Designing Surveys and Questionnaires
Chapter Overview
Objectives
After completing this chapter, the learner will be able to:
1. Describe the role of surveys in clinical research.
2. Describe the basic structure of survey instruments.
3. Describe the process of designing a survey.
4. Discuss the characteristics of good survey questions.
5. Develop a cover letter that will engage potential participants in a survey.
6. Describe the processes of Q sort and Delphi surveys.

Key Terms

Survey
Questionnaire
Interviews
Structured interview
Semi-structured interview
Self-report
Recall bias
Open-ended question
Closed-ended question
Likert scale
Visual analog scale (VAS)
Branching
Population
Sample
Probability sample
Nonprobability sample
Sampling error
Response rate
Cross-tabulations

Chapter Summary

• Surveys can be used for descriptive purposes or to generate data to test specific hypotheses.

• Surveys are generally administered as questionnaires or through interviews. Interviews can be structured, semi-structured, or unstructured, depending on the format of questions.

• Because surveys are generally based on self-report, recall bias can be an issue if respondents do not accurately remember behaviors.

• Stage 1 in planning a survey involves delineation of a research question, the target population, and hypotheses. Stage 2 includes designing the survey instrument, creating and testing items, and finalizing the format. Stage 3 includes IRB approval, selection of a sample, and distribution of the survey.

• Like any other form of data collection, surveys are part of a larger research question that specifies the variables of interest.

• Developing content for a survey involves reviewing existing instruments to determine if they are applicable for your study. If items must be developed, they must go through several rounds of review, usually by experts in the field. Through pilot testing and revisions, the content is refined until it attains an acceptable level of validity.

• Survey questions can be open-ended, where respondents are asked to answer in their own words, or they may be closed-ended, where respondents are asked to choose from a fixed set of choices for their answer.

• There are many types of closed-ended questions, including dichotomous choices, check all that apply, multiple choice, measures of intensity, checklists, or scales.

• Writing good questions requires using clear language, giving applicable answer choices, and considering if respondents can answer easily and honestly.

• Questions should be worded clearly, avoiding bias in wording, and using language that can be easily understood by all respondents.

• Samples for surveys generally need to be large. Probability samples are preferable but may not be feasible.

• Coverage error occurs when not all members of a population are eligible, potentially biasing results.

• Response rates for surveys are the major drawback of this method. The needed sample size can be determined based on estimates of an acceptable margin of error (a sample-size sketch follows this summary).

• The cover letter is an important part of the survey, giving respondents the motivation to respond by letting them know the importance of the project and why they were chosen. Confidentiality should be protected. Follow-up is often necessary to encourage respondents to return the survey.

• Questions and responses should be coded so that data can be easily recorded once surveys are returned.

• Data are often subjected to descriptive statistics or cross-tabulations to determine if answers are related.

• Returning the survey is considered informed consent.
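The margin-of-error approach to sample size uses the standard formula for estimating a proportion, n = z^2 * p * (1 - p) / e^2. A minimal sketch in Python (the example values are illustrative):

    import math

    def survey_sample_size(margin_of_error: float, p: float = 0.5, z: float = 1.96) -> int:
        # p = 0.5 is the most conservative (largest-n) assumption;
        # z = 1.96 corresponds to 95% confidence
        n = (z ** 2) * p * (1 - p) / margin_of_error ** 2
        return math.ceil(n)

    print(survey_sample_size(0.05))  # 385 respondents for a +/-5% margin of error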
CHAPTER 12
Understanding Health Measurement Scales
Chapter Overview
Objectives

After completing this chapter, the learner will be able to:

1. Describe how measurement scales are used to evaluate constructs.
2. Distinguish between summative and cumulative scales.
3. Describe the characteristics of a Likert scale.
4. Discuss the application and limitations of visual analog scales in clinical practice.
5. Describe the role of Rasch analysis for understanding cumulative scales.

Key Terms

Scale
Subscales
Summative scale
Cutoff scores
Likert scale
Visual analog scale (VAS)
Anchors
Numerical rating scale
Faces rating scale
Cumulative scale
Classical measurement theory
Item response theory
Rasch analysis
Person-item map
Logit
Thresholds
Item distribution
Item separation
Person separation
Infit
Outfit
Person fit
Differential item functioning (DIF)
Computer adaptive testing (CAT)

Chapter Summary

• Many measurement scales, usually designed as questionnaires, are composed of several items that are used to reflect an underlying latent trait and the dimensions that define it.

• A scale is an ordered system based on a series of questions or items, resulting in a score that represents the degree to which a respondent possesses a particular attitude, value, or characteristic. In classical measurement theory, the trait is assessed by getting a total numerical score.

• For a scale to be meaningful, it should be unidimensional, representing a singular overall construct. Subscales can be created if different dimensions exist within the overall trait.

• Scales may be generic or they may be geared toward specific conditions, age ranges, or care settings.

• A summative scale is one that presents a total score by adding values across a set of items. The score is intended to indicate the "amount" of the latent trait that is present in an individual. The summative score is based on the assumption that all items contribute equally to the total.

• The interpretation of summative scales can be problematic because the same score can be achieved by two people whose actual trait intensity differs if they have a different combination of responses.

• Many scales use items with scores on an ordinal scale. These can present interpretation difficulties because the intervals between graded scores are not equal or known. When scales are used as screening tools, a cutoff score can be determined to demarcate a positive or negative test.

• A Likert scale is a summative scale in which individuals choose among answer options that range from one extreme to another to reflect attitudes, beliefs, perceptions, or values. They typically include five categories, such as: Strongly Disagree (SD), Disagree (D), Neutral (N), Agree (A), Strongly Agree (SA). Four categories can be used, eliminating the neutral choice, to force respondents to declare their opinion.

• The Multidimensional Health Locus of Control (MHLC) scale is an example of a Likert scale that includes three subscales indicating if the individual believes controlling health is primarily internal, a matter of chance, or under the control of powerful others.

• A visual analog scale (VAS) is composed of a line, usually fixed at 100 mm in length, with word anchors on either end that represent extremes of a characteristic, such as "no pain" and "worst pain imaginable" or "no fatigue" and "worst possible fatigue." The individual is asked to place a mark along the continuum that represents the level of the trait they experience. This scale is considered a quick self-report measure. It can only measure a trait in one dimension, although multiple VASs can be used to assess multiple dimensions.

• Numerical rating scales ask a respondent to indicate the extent of their pain (or other trait), usually from 0 to 10, with anchors defined at each extreme. A faces rating scale uses faces showing increments of distress and is often used with children or elderly individuals.

• VAS scores, although recorded in millimeters, are ordinal, as it is not possible to know if one individual's score of 6 is equal to another's score at the same level.

• A cumulative scale presents a set of statements that reflect increasing severity of the characteristic being measured. It is based on the assumption that there is only one unique combination of responses that can achieve a particular score, as each item is required to be present for the next item to be selected.

• In Rasch analysis, items are examined to indicate which ones are more or less difficult, and persons' responses are examined to reflect their level of ability. Based on item response theory (IRT), this model will indicate a good-fitting scale if those individuals with greater disability can "pass" only the easier items, and those with greater ability can "pass" the more difficult items (see the sketch following this summary).

• A good scale will have items that reflect the full range of difficulty as well as the full range of ability, with items distributed across that continuum. Fit statistics are used to indicate how well the scale fits this model. Because the relationship between difficulty and skill level should be constant, a given score should represent the same degree of a trait for all persons with that score.

• Differential item functioning (DIF) refers to potential item bias in the fit of the data when subgroups within a sample respond differently to individual items, despite their having similar levels of the latent trait.

• Computer adaptive testing (CAT) involves adjusting levels of difficulty of items on a test based on how individuals respond to questions. Using a complex algorithm examining correct and incorrect responses, the number and difficulty of succeeding items can be altered so easier items can be omitted for those with higher performance.
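The model behind Rasch analysis can be written out directly: the probability that a person with ability theta "passes" an item with difficulty b, both expressed in logits. A minimal sketch in Python (the example values are illustrative):

    import math

    def rasch_prob(theta: float, b: float) -> float:
        # Dichotomous Rasch model: P(pass) = exp(theta - b) / (1 + exp(theta - b))
        return 1.0 / (1.0 + math.exp(-(theta - b)))

    print(rasch_prob(0.0, 0.0))   # 0.5 when ability equals difficulty
    print(rasch_prob(0.0, -1.0))  # ~0.73 for an easier item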
CHAPTER 13
Choosing a Sample

Chapter Overview

Objectives
After completing this chapter, the learner will be able to:
1. Distinguish between populations and samples.
2. Define the concepts of sampling bias and sampling error.
3. Distinguish between target and accessible populations.
4. Describe the purpose of inclusion and exclusion criteria in sampling for research studies.
5. Contrast different types of probability and nonprobability sampling procedures.
6. Discuss issues in recruiting an adequate sample size.

Key Terms

Population
Sample
Target population
Accessible population
Inclusion criteria
Exclusion criteria
Power
Probability sample
Random sampling (selection)
Nonprobability sample
Statistic
Parameter
Sampling error
Sampling bias
Random sample
Simple random sample
Systematic sample
Stratified random sample
Cluster sampling
Multi-stage sampling
Disproportional sample
Convenience sample
Consecutive sample
Quota sampling
Purposive sampling
Snowball sampling
Chapter Summary

• A population is the aggregate of persons or objects that meet a specified set of criteria and to whom results of a study will be generalized.

• A sample is a subgroup of the population chosen for study to serve as the reference group for drawing conclusions about the population.

• The target population is the overall group to which results will be generalized. The accessible population is the group to which the researcher has access and from which the actual sample will be drawn.

• When selecting a sample, the researcher must specify inclusion criteria that identify the primary traits that would make someone from the population eligible to be a sample subject, such as demographics, education, or health conditions. Exclusion criteria are those factors that would preclude someone from being a subject, factors that could confound results such as comorbidities or cognitive ability.

• A recruitment strategy has to be developed that will indicate how individuals will be identified from the accessible population and how they will be invited to participate.

• The final sample will be composed of those who agree to participate. The sample may be further reduced by the end of the study because of attrition. Reasons for attrition should be documented to determine if they are related to the study, such as participants who drop out because of inconvenience, or if reasons are random.

• A flow diagram should be included that details the number of participants who are invited, who agree to participate, and who are part of the final analysis, with reasons for attrition.

• The power of a study relates to the ability to find statistically significant effects when they exist. Factors that affect power include sample size, variability in the data, and the magnitude of the effect being studied. Therefore, sample size should be determined during planning of a study to be sure that it is sufficient to identify outcomes.

• Samples can be identified through a process of probability sampling, in which participants are randomly chosen from the accessible population, meaning that each member has an equal chance of being selected, eliminating potential bias. In a nonprobability sample, participants are chosen by nonrandom methods. This technique is used more often in clinical research because of the availability of individuals to serve as subjects.

• Sampling error refers to the difference between a sample's averaged data and the averages that would be obtained from the entire population. Although we can't know this difference, there are ways to estimate it statistically.

• Sampling bias occurs when the individuals selected for a sample overrepresent or underrepresent certain population attributes that are related to the phenomenon under study. Such bias should be minimized when a random sample is chosen.

• Probability sampling methods include several randomized approaches (a brief sketch follows this summary).

– Simple random sampling involves choosing subjects at random from the accessible population, usually based on directories or census lists.

– Systematic sampling draws participants from lists by choosing individuals based on a sampling interval, such as every 10th person.

– Stratified random sampling is used when the population is divided into subsets based on a characteristic that is relevant to the outcome, such as age or gender. Subjects are chosen randomly from each stratum, usually in proportion to the number of individuals in the population who fall into that subgroup.

– Cluster sampling is used when accessing a large or dispersed population. Individuals or groups are clustered according to some characteristic and randomly chosen within each cluster. For example, a sample may start by randomly choosing 10 states, then 10 cities within each state, and then residents within certain areas in those cities, a technique called multi-stage sampling.

– Disproportional sampling is used when subgroups in the population are of unequal size, creating a situation where some groups may provide insufficient samples for comparison. Smaller groups may be purposefully oversampled to give them adequate representation. Analysis procedures are adjusted to account for this disproportional representation.

• Nonprobability samples are created when samples are chosen on some basis other than random selection.

– In convenience sampling, subjects are chosen based on availability, including the use of volunteers.

– Consecutive sampling involves recruiting participants as they become available.

– Quota sampling incorporates elements of stratification so that participants are chosen nonrandomly from subgroups.

– In purposive sampling, subjects are chosen based on particular criteria and asked to participate, in order to create an informative sample for a specific purpose.

– Snowball sampling is used when individuals are difficult to identify, such as in the study of sensitive topics or rare characteristics. A few subjects who meet the criteria are identified and asked to identify others they know who also fit the requirements. This "snowballing" is continued until an adequate sample is obtained.
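The probability sampling methods above differ only in how chance enters the selection, which is easy to see in code. A minimal sketch in Python of three of them (the data structures are illustrative):

    import random

    def simple_random_sample(population: list, n: int) -> list:
        # Every member has an equal chance of selection
        return random.sample(population, n)

    def systematic_sample(population: list, interval: int) -> list:
        # Every k-th person after a random start (e.g., every 10th)
        start = random.randrange(interval)
        return population[start::interval]

    def stratified_sample(strata: dict, fraction: float) -> list:
        # Randomly sample the same proportion from each (non-empty) stratum
        chosen = []
        for members in strata.values():
            k = max(1, round(len(members) * fraction))
            chosen.extend(random.sample(members, k))
        return chosen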
CHAPTER 14
Principles of Clinical Trials

Chapter Overview

Objectives
After completing this chapter, the learner will be able to:
1. Describe the role of clinical trials in healthcare research.
2. Describe the characteristics of a randomized controlled trial.
3. Define different random assignment strategies.
4. Describe different types of control groups.
5. Discuss the importance of blinding in research protocols.
6. Discuss the differences between randomized controlled trials and pragmatic clinical trials.
7. Define phases of clinical trials.
8. Discuss the difference between superiority and non-inferiority trials.

Key Terms

Clinical trial
Therapeutic trial
Diagnostic trial
Preventive trial
Clinical trials registry
Randomized controlled trial (RCT)
Treatment arms
Parallel group
PICO
Active variable
Attribute variable
Random assignment (allocation)
Simple random assignment
Block random assignment
Stratified random assignment
Cluster random assignment
Random consent design
Assignment by patient preference
Run-in period
Allocation concealment
Control group
Placebo
Sham
Attention control group
Wait list control group
Active controls
Equipoise
Blinding
Double-blind study
Single-blind study
Open-label trial
Pragmatic clinical trial (PCT)
Pragmatic-explanatory continuum
Preclinical research (basic research)
Phase I trial
Phase II trial
Phase III trial
Phase IV trial
Superiority trial
Non-inferiority trial
Non-inferiority margin
Minimal clinically important difference (MCID)
Equivalence trial

Chapter Summary

• Clinical trials can focus on therapeutic effects of interventions, establishing accuracy of diagnostic tests, or prevention strategies.

• The randomized controlled trial (RCT) is considered the gold standard for experimental research based on the rigor of its structure in controlling confounding effects.

• The basic RCT design incorporates two randomly assigned groups. Each group is also called a treatment arm.

• Experimental designs require the ability to define the levels of an independent variable, the use of random assignment to groups, and the use of a control group for comparison.

• An independent variable can be an active variable when subjects can be assigned to groups or conditions. An attribute variable is a factor that cannot be assigned because it is an inherent characteristic of participants.

• Random assignment is a foundational element of an RCT, meaning that each subject has an equal chance of being assigned to any group, creating groups that should be balanced.

• There are several forms of random assignment that can be used.

– In simple random assignment, every member of the sample has an equal chance of being assigned to either group.

– In block random assignment, subjects are randomly divided into even-numbered subgroups (blocks) and are randomly assigned to treatment arms within each block (see the sketch after this list).

– Stratified random assignment involves subjects being divided into strata based on a relevant group characteristic, and then randomly assigned to treatment groups within each stratum.

– Cluster random assignment involves assigning participants to groups based on the sites in which they are available, randomly assigning treatments to sites, with all individuals at a site getting the same treatment.

– The random consent design involves assigning subjects randomly to groups before seeking consent to participate. Only those assigned to the experimental group are approached for consent, and others receive standard care.

– Assignment by patient preference allows subjects to express a preference for one treatment condition or the other. Only those who express no preference are randomly assigned to groups.

– A run-in period is a way of assuring that subjects will adhere to the protocol. All subjects are given a placebo prior to assignment, and those who comply are then randomly assigned to be part of the study.
 In simple random assignment, every • Allocation concealment involves assign-


member of the sample has an equal ment of subjects to groups without the
chance of being assigned to either group. knowledge of those involved in the
experimental process. A common method
 In block random assignment, subjects are is use of sealed envelopes that contain a
randomly divided into even-numbered

34
participant’s assignment. External • Randomized trials are defined in sequential
agencies can generate random sequences to phases related to the types of information
assure that there is no bias. collected, usually related to drug trials. In
Phase I trials, small groups of healthy
• Using a control group is an essential design
volunteers are tested to demonstrate safety
strategy in an RCT providing a comparison
for a treatment group. The control can be a of a new therapy. In Phase II trials, larger
standard treatment, a placebo or sham, or groups of patients are evaluated to
no intervention at all. determine the efficacy of the treatment at
different dosages. In Phase III trials, very
• Blinding, also called masking, assures that large samples are tested over long periods
those involved in the study are unaware of to compare the new treatment with standard
the participant’s group assignment. In a care, monitoring effects of dosages and side
double-blind study, neither subjects nor effects. Phase IV trials are done after new
investigators are aware of treatment groups. treatments are approved. Subgroups may
In a single-blind study, either subjects or be tested to explore the effects of the
investigators may be blinded, but not both. treatment further.
• In open label trials, both researchers and • When studies are done specifically to show
participants know which treatment is being that one treatment is superior to another, the
administered, which is often necessitated trial is considered a superiority trial. When
by the nature of the treatment. a new treatment is being compared to an
• RCTs represent the gold standard for accepted treatment to show that it is equally
experimental research, providing the effective, the trial is considered a non-
strongest level of control. However, the inferiority trial, with the intent of showing
need to define samples narrowly and define that the new treatment is a reasonable
exact protocols can make generalization to alternative, which may be cheaper or have
the real world difficult. A pragmatic clinical fewer risks. The new treatment is then
trial (PCT) is designed like an RCT but considered “no worse” than the standard
incorporates a diverse patient population approach. The difference between the
recruited directly from practice settings. treatments must be within a non-inferiority
Treatment proceeds as it would in typical margin that is the biggest difference
clinical situations and data collection between the responses of the two groups
focuses on important outcomes such as that would be acceptable to consider the
patient satisfaction and quality of life. new treatment an acceptable substitute.
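As an illustration of block random assignment, the following minimal Python sketch fills balanced blocks and shuffles within each; the function name, block size, and arm labels are hypothetical.

```python
import random

def block_randomize(n_subjects, block_size=4, arms=("A", "B"), seed=1):
    """Block random assignment: within every block, each arm appears
    equally often, keeping group sizes balanced throughout enrollment.
    block_size is assumed to be a multiple of the number of arms."""
    rng = random.Random(seed)
    per_block = block_size // len(arms)
    assignments = []
    while len(assignments) < n_subjects:
        block = list(arms) * per_block
        rng.shuffle(block)  # random order within the block
        assignments.extend(block)
    return assignments[:n_subjects]

print(block_randomize(10))  # balanced A/B sequence for 10 enrollees
```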
CHAPTER 15
Design Validity
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Describe threats to internal, external, construct and statistical conclusion validity related
to quantitative research.
2. Describe several strategies to control for subject variability in experimental design.
3. Describe the purpose of blinding.
4. Describe strategies for controlling intersubject differences in research design.
5. Discuss various reasons for missing data and different types of “missingness.”
6. Describe how a flow diagram can explain how a sample is finalized.
7. Explain the purpose of intention to treat analysis.
8. Describe various methods for handling missing data in data analysis including data
imputation.
 Key Terms
Statistical conclusion validity
Internal validity
Internal threats
History
Maturation
Attrition
Testing effect
Reactive effect
Instrumentation effect
Selection
Quasi-experiment
Social threats
Diffusion or imitation of treatments
Compensatory equalization
Operational definitions
Multiple treatment interaction
Experimental bias
Experimenter effect
External validity
Adherence
Hawthorne effect
Ecological validity
Homogeneous sample
Blocking variable
Matching
Propensity score
Independent factor
Repeated factor (measure)
Analysis of covariance (ANCOVA)
Covariate
Per-protocol analysis
Intention to treat (ITT)
Missing completely at random (MCAR)
Missing at random (MAR)
Missing not at random (MNAR)
Completer analysis
Imputation
Last observation carried forward (LOCF)
Multiple imputation
 Chapter Summary
• Whether using explanatory or observational designs, four types of validity are important to understanding the strength of evidence: statistical conclusion validity, internal validity, construct validity, and external validity.
• Statistical conclusion validity concerns the appropriate use of statistical procedures for analyzing data, leading to unbiased conclusions about the relationship between independent and dependent variables. Threats include a lack of statistical power, violated assumptions of statistical tests, poor reliability of the data, and failure to use intention to treat analysis.
• Internal validity has a focus on cause-and-effect relationships, requiring three components in a design.
 Temporal precedence means that the cause precedes the effect, that a change in outcome is only observed following the application of treatment.
 Covariation of cause and effect is based on documentation that the outcome only occurs in the presence of the intervention and therefore is not related to other causes.
 Finally, we must demonstrate that no plausible alternative explanations are possible and that confounding variables are not responsible for the outcome rather than the intervention.
• Specific threats to internal validity include:
 History refers to the confounding effect of specific events that occur after the introduction of treatment but before the final test.
 Maturation concerns the effect of the passage of time on changes in outcomes.
 Attrition is defined as loss of subjects over time, which is a concern if it is too large or if the attrition is related to the experimental situation.
 Testing effects occur when the actual testing of the outcome variable is responsible for changes in performance.
 Instrumentation effects occur when the measuring instruments are unreliable.
 Regression to the mean is also associated with a lack of reliability, occurring when there are extreme scores on the pretest, which often regress toward the mean score on the posttest.
 Selection effects occur when subjects are assigned to groups in a biased way, thereby influencing how groups respond.
• Social threats to validity include diffusion of effects because of communication among subjects, investigators giving preference to one group, or subjects feeling more or less invested because of receiving the control condition instead of the experimental treatment.
• Construct validity concerns how the independent and dependent variables are defined and is related to the nature of conclusions that can be drawn based on those operational definitions. This can be affected by the time frame of the study, interactions of multiple treatments, or experimental bias that occurs when experimenters or subjects try harder because they are in a study.
• External validity concerns the extent to which results can be generalized beyond the study sample and the experimental setting. It can be influenced by how subjects are selected.
• Strategies to control threats to validity include:
 Using random assignment to groups.
 Using a homogeneous sample, restricting subjects on certain variables to avoid confounding.
 Building in a blocking variable as another independent variable to account for it in the analysis.
 Matching subjects across groups to assure there are comparable subjects.
 Using repeated measures, where subjects act as their own control, which avoids having to match subjects across groups.
 Using an analysis of covariance (ANCOVA) as an analysis technique to control for a confounding variable in the analysis of data.
• It is also important to control for the effect of missing values in data, as they can influence analysis, especially if they are not random events. Data may be missing because subjects drop out or are noncompliant. A flow diagram should be included in studies to show how subjects proceed through the study, including reasons for subjects who do not complete the study as expected.
• There are many approaches to data analysis to account for missing data. These include per-protocol analysis, in which data are only included for subjects who completed the study and complied with the protocol. This can overestimate the effect of a treatment. A preferred method is intention to treat analysis (ITT), which means that data are analyzed according to original random assignment, regardless of the treatment subjects actually received. This approach guards against the bias of dropouts.
• When many data points are missing, it can affect data analysis procedures that require all variables to be available. Values can be estimated for missing data using a process of data imputation, which statistically projects values for missing data based on averages or correlations among variables within a sample.
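To make these missing-data strategies concrete, here is a small pandas sketch contrasting completer analysis, last observation carried forward, and single mean imputation; the subjects, visits, and scores are all hypothetical.

```python
import pandas as pd

# Hypothetical outcome scores for 4 subjects over 3 visits; NaN = missing.
scores = pd.DataFrame(
    {"visit1": [52.0, 47.0, 60.0, 55.0],
     "visit2": [54.0, None, 63.0, 58.0],
     "visit3": [57.0, None, None, 61.0]},
    index=["s1", "s2", "s3", "s4"],
)

# Completer analysis: keep only subjects with no missing data.
completers = scores.dropna()

# Last observation carried forward (LOCF): fill each subject's missing
# visits with that subject's most recent observed score.
locf = scores.ffill(axis=1)

# Single mean imputation: replace missing values with the visit mean
# across subjects; multiple imputation would instead repeat imputation
# with random draws and pool the analyses.
mean_imputed = scores.fillna(scores.mean())
```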
CHAPTER 16
Experimental Designs
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Describe the structure of basic experimental designs for independent groups.
2. Discuss how experimental designs control for threats to internal validity.
3. Discuss the role of main effects and interaction effects in a factorial design.
4. Describe the structure of repeated measures designs.
5. Discuss the advantages and disadvantages of using repeated measures designs.
6. Describe the process of sequential clinical trials.
 Key Terms
Completely randomized designs
Between-subjects designs
Parallel groups designs
Within-subjects design
Repeated measures design
Single factor design
One-way design
Multi-factor design
Pretest-posttest control group design
Placebo
Posttest-only control group design
Factorial designs
Main effect
Interaction effect
Randomized block design
Blocking variable
Practice effects
Carryover effects
Order effects
Latin square
Crossover design
Washout period
Two-way design
Mixed design
Sequential clinical trials
 Chapter Summary
• Experimental designs can take on a variety of configurations, all designed to offer control of internal validity.
• A true experiment requires that participants can be randomly assigned to at least two comparison groups. In a one-way design, a study includes one independent variable. A two-way design includes two independent variables.
• Designs can be chosen based on the number of independent variables, the number of levels within each independent variable, the number of groups being tested, how subjects are assigned to groups, how many observations are made, and the temporal sequence of interventions and measurements.
• The pretest-posttest control group design is the basic structure of a randomized controlled trial (RCT) comparing two groups, also called a parallel group study. Because groups are assigned at random, theoretically they differ solely on the basis of the provided treatment, allowing for conclusions regarding the effect of the treatment.
 Control groups may receive a placebo, a sham treatment, or a different treatment, typically based on standard care.
 This design can be extended to include more than two groups to compare several treatment conditions.
• The posttest-only control group design is identical to the pretest-posttest design, except that no pretest is administered. This approach can be used when a pretest is not relevant, impractical, contraindicated, or potentially reactive. It is still considered to have strong control if the groups are randomly allocated.
• When more than one independent variable is incorporated, a factorial design is used. The design is described by the number of independent variables and their levels. Therefore, a 3 × 2 design has two independent variables, one with 3 levels, the other with 2 levels.
 This design allows the researcher to study the effect of each independent variable separately, called main effects, and the interaction between the two variables.
 In a randomized block design, an attribute variable is built into the design as a second independent variable, allowing its effect to be studied at each level of the treatment.
• In a repeated measures design, subjects are tested at each level of the independent variable, serving as their own controls. This is also called a within-subjects design.
 A repeated measures study may use a one-way design with only one independent variable or a two-way design when two independent variables are included.
 Repeated measures designs must be able to account for the potential carryover or practice effects that can occur when subjects repeat a measurement over several trials, which can be controlled by randomizing the order of conditions.
 In a crossover design, half the subjects are assigned treatment one first, with the other half getting treatment two. After a washout period, the subjects receive the alternative treatment.
 A two-way design that includes a repeated measure and an independent measure is called a mixed design.
• In a sequential clinical trial, two treatment conditions A and B are compared by analyzing the preferences of pairs of subjects, who are recruited sequentially. The results of the comparison within each pair are charted on specially constructed charts that provide stopping rules for deciding if treatment A is better than treatment B or if there is no significant difference between them. This design often results in the need for fewer subjects than a fixed sample design.
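The logic of main effects and interactions in a factorial design can be illustrated numerically. The 2 × 2 cell means below are hypothetical, and a real analysis would use ANOVA rather than raw differences; this sketch only shows what the terms refer to.

```python
import numpy as np

# Hypothetical cell means for a 2 x 2 factorial design:
# rows = levels of factor A (treatment), columns = levels of factor B.
cell_means = np.array([[60.0, 70.0],
                       [50.0, 40.0]])

a_means = cell_means.mean(axis=1)  # marginal means for factor A: [65, 45]
b_means = cell_means.mean(axis=0)  # marginal means for factor B: [55, 55]

main_effect_A = a_means[0] - a_means[1]  # 20: A matters on average
main_effect_B = b_means[0] - b_means[1]  # 0: no average effect of B

# Interaction: the effect of A differs across levels of B
# (nonparallel lines on a plot of cell means).
simple_effects = cell_means[0] - cell_means[1]      # [10, 30]
interaction = simple_effects[0] - simple_effects[1]  # -20, so A x B interact
```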
CHAPTER 17
Quasi-Experimental Designs
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Discuss the reasons for implementing quasi-experimental designs.
2. Describe the basic elements of time-series designs.
3. Describe design variations of time-series designs, including the role of a withdrawal period.
4. Describe nonequivalent group designs.
5. Discuss the threats to validity of quasi-experimental designs.
 Key Terms
Quasi-experimental designs
One-group pretest-posttest design
Interrupted time series design (ITS)
Nonequivalent group designs
Nonequivalent pretest-posttest control group design
Intact groups
Subject preferences
Historical controls
Nonequivalent posttest-only control group design
 Chapter Summary
• Quasi-experimental designs are similar to experimental designs but lack either random assignment, comparison groups, or both.
• Quasi-experimental designs present threats to internal validity because of the lack of these controls.
• Time series designs focus on assessment of responses over time, where time serves as the independent variable that is measured across several time intervals.
• In the one-group pretest-posttest design, one group of subjects is tested before and after an intervention. The design offers little control of threats to internal validity. However, the design can be defended if previous research has shown the intervention is better than no treatment, and where the time interval from pretest to posttest is relatively short, limiting the potential effect of other variables on the outcome.
• Time series designs are considered repeated measures, since the same subjects are being tested at each time interval.
• The interrupted time series design is an extension of the one-group pretest-posttest design, with several measures before and after an intervention. Data patterns are observed to demonstrate that there is a difference in response following the intervention.
 Design variations include the use of a continuous intervention that is withdrawn at some point to demonstrate that behavior improves with treatment and reverts back to baseline when the treatment is withdrawn.
• The nonequivalent group design can be configured like pretest-posttest experimental designs but without a randomized control.
 This design can be used with intact groups or when subjects self-select their group assignment.
 Design validity is threatened by the lack of randomization. Control can be imposed by checking the balance of baseline scores between groups, by stratifying subjects to control for important characteristics, or by matching.
 Historical controls involve the use of subjects from a prior study to serve as controls, compared to subjects receiving the experimental intervention.
 The nonequivalent posttest-only design includes a control group but is subject to confounding because there is no way to assure equivalence on important characteristics at baseline. This design can be used most effectively to look at relationships among variables for those who do and do not get the intervention, but cause and effect cannot be interpreted.
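As a rough illustration of how an interrupted time series is summarized, the sketch below compares the level of hypothetical scores before and after an intervention; a full analysis would also model the trend (slope) in each segment and account for autocorrelation, for example with segmented regression.

```python
# Hypothetical weekly scores in an interrupted time series (one group,
# several repeated measures before and after the intervention).
pre = [42, 44, 43, 45, 44, 43]
post = [51, 53, 54, 53, 55, 56]

# Change in level: difference between post- and pre-intervention means.
level_change = sum(post) / len(post) - sum(pre) / len(pre)
print(f"Change in level after the intervention: {level_change:.1f}")
```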
CHAPTER 18
Single-Subject Designs
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Describe the basic structure of single-subject designs.
2. Describe methods of measuring target behaviors.
3. Discuss the limitations of A-B designs.
4. Describe the structure of withdrawal designs, multiple baseline designs, alternating treatment
designs, and multiple treatment designs.
5. Describe the structure and use of N-of-1 trials.
6. Explain how baselines and withdrawal phases support the internal validity of single-subject
designs.
7. Describe the process of visual analysis in single-subject designs.
8. Apply methods of quantitative analysis for single-subject designs.
9. Describe the process of generalization in single-subject research.
 Key Terms
Single-subject designs
Target behavior
Baseline phase
Intervention phase
A-B design
Stability
Slope
Trend
Interval recording
Replication of effects
Withdrawal designs
A-B-A design
A-B-A-B design
Multiple baseline design
Alternating treatment design
Multiple treatment design
Changing criterion design
N-of-1 trial
Visual analysis
Level
Split-middle line
Celeration line
Binomial test
Serial dependency
Two standard deviation band method
Statistical process control (SPC)
Control charts
Effect size
Nonoverlapping scores
Generalization
Direct replication
Systematic replication
Clinical replication
 Chapter Summary
• Single-subject designs (SSD) involve serial observations of individual behaviors before, during, and after interventions to document the trend in responses.
• SSDs have the advantage of being able to examine the responses of each individual rather than looking only at aggregated data across groups, contributing to an understanding of patient differences and the benefits of individualized care.
• SSDs are a form of time-series study with a collection of repeated measurements and are therefore considered a within-subjects design.
• The response to treatment is considered the target behavior, which can be measured by the frequency, duration, or magnitude of the response.
• Data are collected across at least two phases: A is the baseline phase, during which the target behavior is monitored but no treatment is provided; B is the intervention phase, during which the treatment is provided while measurements continue.
 A design with one baseline and one intervention phase is called an A-B design.
 Results are plotted on a chart showing the responses across all phases, with the target behavior along the Y-axis and time along the X-axis, usually in hours, days, or weeks.
• Baseline data are an essential element of the SSD, serving as the control condition for comparison with the intervention phase. Baselines may be level, show erratic values, or may be ascending or descending.
 It is important to demonstrate that the trend in behavior is different in the baseline and intervention phases to document the treatment's effect.
• A-B designs are especially susceptible to threats to internal validity because there is no control condition.
 Control must be demonstrated through replication of effects, which can involve withdrawal and reinstatement of treatment, replication across more than one subject, or comparison of two or more interventions.
• In an A-B-A withdrawal design, the treatment is withdrawn at some point following the intervention. If responses improve with intervention and then revert back to baseline, it is evidence that the treatment had an effect.
 Withdrawal designs can also include a second intervention phase, A-B-A-B, to further demonstrate the treatment's effect.
• In a multiple baseline design, effects can be replicated across more than one subject, across multiple settings, or across multiple behaviors.
 The multiple baseline design can use the A-B configuration because effects can be replicated, controlling potential confounding.
 Each replication incorporates a different length of the baseline period. By staggering baselines, consistent effects of treatment can be documented.
• In an alternating treatment design, two or more treatment conditions are implemented at each treatment session, usually in random order, to determine if the treatments engender different levels of response.
• In a multiple treatment design, more than one intervention is tested, usually after withdrawal of the initial intervention, using an A-B-A-C configuration.
 Other variations of this design can also include combinations of interventions, such as A-B-A-BC, where B and C interventions are provided alone and in combination to see how their effects differ.
• In a changing criterion design, the treatment goal is incrementally increased to demonstrate the effectiveness of intervention as the patient progresses.
• An N-of-1 trial uses the structure of an RCT to demonstrate the effectiveness of an intervention, or to compare two interventions, in a single patient.
 It incorporates a crossover design in which the patient experiences one intervention, followed by a washout period, and then experiences the second intervention.
 The patient and clinician keep records of responses under both conditions to determine which is preferable. In this way, the design becomes a decision-making tool.
 Several N-of-1 trials can be combined in a form of meta-analysis to show the effectiveness of an intervention.
• SSDs can be evaluated using visual techniques, documenting a change in the trend (slope) and level (magnitude) of the response over time.
• The split-middle line is a line drawn through the median points in the baseline phase and extended into the intervention phase.
 Also called a celeration line, it can be used as a way of showing that the trend in response during the baseline phase does not continue into the intervention phase, thereby documenting a meaningful change.
 A change from A to B is evaluated using the binomial test, which determines if there is a significant difference in the proportion of points that fall above and below the extended line.
• The two-standard-deviation band method involves calculating the mean and standard deviation of points in the baseline phase.
 Lines are extended from the baseline into the intervention phase at two standard deviations above and below the mean.
 If at least two consecutive data points in the intervention phase fall outside the two-standard-deviation band, changes from baseline to intervention are considered significant.
• Statistical process control (SPC) is a method of examining variability in data across baseline and intervention phases.
 Upper and lower control limits are plotted at 3 standard deviations above and below the baseline mean. Data that fall outside these limits are considered too variable to be considered similar to baseline.
• Effect size can also be used to evaluate changes from baseline to intervention phases.
 A commonly used method is evaluation of nonoverlapping scores from baseline to intervention. The percent of nonoverlap is a reflection of the treatment effect.
 When all intervention points overlap with baseline points, there is 0% nonoverlap, and the distribution of scores in both phases is conceptually superimposed, indicating no treatment effect.
• Generalization of single-subject outcomes is important as a measure of external validity.
 Generalization can be provided by direct replication, repeating single-subject experiments across several subjects.
 Systematic replication demonstrates consistency of findings under conditions that are different from the initial study.
 Clinical replication involves replication of effects in typical clinical situations where treatment may be used in various combinations to fit the patient's needs.
 Social validity has been used to indicate the application of single-subject study findings in real world settings.
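The two-standard-deviation band method lends itself to a short computational sketch. The function name and data below are hypothetical illustrations of the decision rule described above, not a packaged procedure.

```python
import statistics

def two_sd_band_test(baseline, intervention):
    """Flag a significant change when at least two consecutive
    intervention points fall outside the band mean +/- 2 SD
    computed from the baseline phase."""
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    lower, upper = mean - 2 * sd, mean + 2 * sd
    outside = [not (lower <= y <= upper) for y in intervention]
    two_consecutive = any(a and b for a, b in zip(outside, outside[1:]))
    return two_consecutive, (lower, upper)

# Hypothetical pain scores across baseline and intervention sessions
significant, band = two_sd_band_test(
    baseline=[7, 6, 7, 8, 7, 6], intervention=[6, 5, 4, 3, 3, 2])
```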
CHAPTER 19
Exploratory Research: Observational Designs
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Distinguish between observational and experimental research.
2. Describe criteria for evaluating causality in observational research.
3. Distinguish between longitudinal and cross-sectional study designs.
4. Distinguish between prospective and retrospective study designs.
5. Describe the structure and design elements of a cohort study.
6. Describe the structure and design elements of a case-control study.
7. Identify strategies for minimizing bias or confounding in cohort and case-control studies.
 Key Terms
Observational studies
Exploratory research
Descriptive research
Analytic research
Exposure
Risk factor
Dose-response relationship
Longitudinal studies
Prospective studies
Retrospective studies
Cross-sectional studies
Reverse causation
Cohort studies
Secondary analysis
Misclassification
Case-control study
Population-based study
Hospital-based study
Selection bias
Observation bias
Interviewer bias
Recall bias
Confounding
Matching
Propensity score matching
 Chapter Summary
• Observational research can be classified as descriptive or analytic. Descriptive research characterizes populations by examining the distribution of health-related factors. Analytic research focuses on examining group differences to demonstrate how exposures can help explain observed differences.
• Observational research differs from experimental research in that there is no manipulation of variables and no assignment of subjects to groups. Existing groups are identified by their shared history or health status. Observational designs are needed when it is not feasible to use experimental designs for demonstrating treatment effectiveness.
• A common use of observational designs is to draw causal inferences about the effects of a hypothesized risk factor. Risk factors can refer to factors related to the occurrence or prevention of a disorder, which may include personal characteristics, behaviors, or environmental exposures.
• To support a causal hypothesis, observational designs must be able to demonstrate several principles:
 Temporal sequence establishes that a causative exposure precedes the outcome.
 There should be a substantial association between the exposure and outcome, based on a statistical measure of risk.
 The relationship should be based on a plausible biological mechanism.
 Findings should demonstrate consistency across many studies using different samples under different conditions.
 A dose-response relationship indicates that the outcome is related to increasing levels of exposure. Nonlinear relationships are also possible.
• Observational studies may be longitudinal, where researchers follow subjects over time, or cross-sectional, when groups of subjects are evaluated at one point in time. Longitudinal designs may be prospective, involving direct recording of data to define exposed and unexposed groups, or retrospective, utilizing exposure data that were collected in the past.
 A major challenge for cross-sectional studies is establishing the correct temporal sequence of cause and effect. Reverse causation can occur when the specified "outcome" actually causes the "exposure."
 A disadvantage of the retrospective study is the lack of investigator control over how past data were collected.
• A cohort study is a longitudinal investigation in which the researcher identifies a group of subjects who do not yet have the outcome. Exposed and unexposed cohort members are monitored for a period of time to ascertain and compare the incidence of new outcomes.
 An inception cohort is a group that is assembled early in the development of a disorder. Exposed and unexposed persons are then followed to determine the longitudinal association of exposure with progression or remission of the disorder.
 Cohort studies are inappropriate for studying rare or slowly developing outcomes because large numbers of subjects would need to be followed over long periods to detect a sufficient number of cases for analysis.
 Subjects for cohort studies can be recruited from the general population or from special populations in which exposure to a risk factor is common. Subjects who are exposed should be as similar as possible to the unexposed group in all factors related to the outcome. No subjects should be immune to the outcome.
 Major challenges for cohort studies include the potential for attrition and the possibility of misclassifying a subject's exposure status.
• A case-control study includes groups of individuals who are purposely selected on the basis of whether they have the health condition under study. Cases are those who have the target condition, while controls do not. The investigator then determines if these two groups differ with respect to their exposure histories.
 A case definition refers to the diagnostic and clinical criteria used to identify someone as a case.
 A population-based study obtains cases and controls from the general population. In a hospital-based study, subjects are patients in a medical institution.
 Cases and controls must be chosen regardless of their exposure history, protecting the study against selection bias.
 Observation bias occurs when there is a systematic difference in the way information about the exposure is obtained from the study groups. Recall bias occurs when cases who have experienced a particular disorder remember their exposure history differently than do controls.
 Confounding occurs when extraneous variables that alter the risk of developing the disorder are unequally distributed across exposed and unexposed groups.
CHAPTER 20
Descriptive Research
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Describe the purpose of descriptive research and how descriptive data can be used to generate
hypotheses.
2. Describe the purposes of developmental and normative research.
3. Discuss the advantages and disadvantages of longitudinal and cross-sectional approaches for
collecting descriptive data.
4. Discuss the structure and purpose of case reports.
5. Describe the impact of epidemiologic case reports.
6. Describe the function of historical research and the type of data used to synthesize historical data.
 Key Terms
Period effects
Descriptive research
Normative research
Developmental research
Case reports
Longitudinal studies
Case series
Natural history
Historical research
Cross-sectional studies
External criticism
Cohort effects
Internal criticism
 Chapter Summary
• Descriptive research is an observational approach designed to document traits, behaviors, and conditions of individuals, groups, and populations.
• Descriptive studies can use quantitative or qualitative methods.
• Descriptive research is generally structured around guiding questions rather than hypotheses, but descriptive data will often lead to generation of hypotheses or theoretical propositions that can be tested using exploratory or explanatory techniques.
• Developmental research involves the description of developmental change and the sequencing of behaviors over time. Important developmental studies have provided the foundation for understanding human development in children and adults.
 Longitudinal developmental studies involve collecting data over an extended period of time to document behaviors as they naturally change.
 A cross-sectional study involves collection of data at one point in time to describe differences among members of a group, differentiating those who are at different developmental levels. This is often called a "snapshot" approach and is considered a more pragmatic approach for collecting data on large samples.
 Potential disadvantages of the cross-sectional approach include cohort effects, whereby individuals can vary in their developmental trajectory because of when they were born or what specific events occurred at certain points in time.
• Normative studies describe typical or standard values for characteristics of a given population. They can be directed toward a specific age group, gender, occupation, culture, or disability. Norms are usually expressed as averages within a range of acceptable values.
 Normal values can be generated for physiological measures, such as laboratory tests, or they can be generated as standardized scores that allow interpretation of responses relative to a normed value for a given group, such as IQ scores.
 Studies that seek to establish normal values must be large to represent the full range of responses and may need to be established for subgroups.
 Normal values are important references for determining when an individual needs intervention and when the person has reached an acceptable performance standard.
• Descriptive surveys are used to provide an overall picture of a group's characteristics, attitudes, or behaviors.
 Survey data may be used to establish risk factors for disease or dysfunction, to provide data for development of hypotheses, and to establish population characteristics that can be used for policy and resource decisions.
• Case reports provide a detailed account of an individual's condition or response to treatment but may also focus on a group, institution, or other social unit, such as a school or community.
 The purpose of a case report is to reflect on practice, including diagnostic methods, patient management, ethical issues, innovative interventions, or the natural history of a condition.
 Case reports are not considered a true form of research because they are not based on a rigorous or systematic methodology. However, they can be important contributions to evidence-based practice as they present new information that can be considered by others.
 Case reports can be prospective, following a case as it progresses, or retrospective, reporting the results of case management after it is completed.
 A case series is an expansion of a case report involving observation of several cases that have similarities.
 Many journals publish case reports and will require specific formats.
 Clinicians must take precautions to protect patient privacy in case reports and should obtain informed consent from patients.
 Epidemiologic case reports describe one or more individuals, documenting unique or unusual health concerns. These types of reports are important contributions that will often lead to investigation of specific exposures that may cause the observed conditions.
• Historical research involves the critical review of events, documents, literature, and other sources of data to reconstruct the past in an effort to understand how and why past events occurred and how that understanding can influence current practice or outcomes.
 Historical studies are intended to incorporate judgments, analyses, and inferences on the part of the researcher through interpretation of historic facts and observed relationships.
 Historical data are not just a collection of past events but a synthesis of data from the past that can be applied to understanding current conditions.
CHAPTER 21
Qualitative Research Methods
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Understand key differences between qualitative and quantitative methods.
2. Differentiate among common types of qualitative research approaches.
3. Understand how and why qualitative sampling differs from quantitative sampling.
4. Differentiate among the three common types of qualitative interviews.
5. Differentiate among common modes of qualitative analysis.
6. Understand how researchers maintain rigor/trustworthiness in qualitative research.
7. Read a qualitative article and evaluate the trustworthiness of the findings.
8. Understand how mixed methods research can be used to combine both qualitative and
quantitative methods.
9. Choose a research article that uses a quantitative methodology and discuss how a qualitative study could complement that study's purpose.
 Key Terms
Qualitative research
Logical positivism
Naturalistic inquiry
Ethnography
Informants
Grounded theory
Constant comparison
Saturation
Phenomenology
Relationality
Temporality
Case studies
Field notes
Participant observation
In-depth interview
Unstructured interview
Interview guide
Structured interview
Semi-structured interview
Focus group interviews
Theoretical sampling
Coding
Content analysis
Trustworthiness
Credibility
Triangulation
Negative case analysis
Member checking
Transferability
Thick description
Dependability
Audit trail
Confirmability
Reflexivity
Mixed methods
Multimethod studies
Convergent designs
Sequential designs
Embedded designs
Multiphase designs
 Chapter Summary
• Qualitative research is based on the belief that all interactions are inherently social phenomena, requiring a deep understanding of personal experience to explore behaviors and attitudes. To gain these insights, qualitative researchers conduct inquiry in natural settings.
• Qualitative data involve the collection of narrative information. This research is generally considered an inductive process, with conclusions being drawn directly from the data.
• In the systematic study of social phenomena, qualitative research serves three important purposes: describing groups of people and social-cultural contexts, generating hypotheses that can be tested by further research, and developing or expanding theory to explain observed phenomena.
• Qualitative research can be complementary to quantitative study, allowing one to inform the other regarding relevant variables and questions.
• Qualitative research addresses essential components of evidence-based practice with regard to sources of information, including patient preferences and providers' clinical judgments. These factors are influenced by understanding how patients and care providers view interactions with each other, family members, and colleagues within their environment.
• Qualitative research questions usually start out broad, seeking to understand the dynamics of how an experience influences subsequent behaviors or decisions, including why something happens. Questions can be added to the inquiry as data are collected that uncover new information. Qualitative studies do not usually specify conventional hypotheses.
• Although the concept of causation is not directly applicable to qualitative study, findings can offer interpretations about possible explanatory factors based on understanding an individual's personal responses.
• The most commonly applied traditions in qualitative study include ethnography, grounded theory, and phenomenology.
• Ethnography has its foundations in anthropology and has been defined by fieldwork where investigators immerse themselves in the settings and activities studied. Study participants become informants because they inform and teach the researcher about their lives and communities.
• Grounded theory is a formalized process of simultaneous data collection and theory development. Through a process of constant comparison, each new piece of information is compared to data already collected to determine agreements or conflicts. This iterative process continues until additional data yield no further insights, called saturation. The researcher puts pieces of information in context, eventually leading to a theoretical premise. The derived theory is then formulated, based on factors that interact to explain behaviors.
• Phenomenology is the study of life experiences and the meanings individuals attribute to those experiences. Observation and interviews are used to reveal participants' ideas and thoughts related to their experiences within their lived environment (spatiality), regarding the influence of interactions and emotions (relationality), and perception of time (temporality).
• Case study research is used to understand an individual's behaviors, an institutional culture, or systems within natural settings. Through in-depth interviews, questionnaires, and review of documents, qualitative researchers draw conclusions that help to explain interactions and choices.
• Qualitative data collection includes several strategies, including observation, interviews, and focus groups.
 Observation provides data about how social processes occur in real life situations, without external influences. Field notes are used to record descriptions of what is seen and heard. The observer steps back and monitors behaviors with no interaction.
 In participant observation, often used in ethnographic research, the researcher becomes part of the community being studied. The intent is that data will be more informed because the researcher's experiences are the same as those of the participants.
 Interviews are usually conducted face to face. The purpose of an in-depth interview is to probe ideas and obtain the most detailed information possible. Unstructured interviews are open ended, based on a list of general questions to stimulate discussion. In a structured interview, questions and response choices are predetermined, similar to using a questionnaire. A semi-structured interview uses a combination of fixed responses and open-ended questions, allowing some consistency in the interview content but also allowing participants to contribute their own insights.
• Qualitative researchers use different approaches to non-probability sampling, including convenience, purposive, and snowball sampling. It is important to choose participants who represent a range of characteristics and who can be a rich source of information.
• Determining an adequate sample size for qualitative study is a matter of judgment, often based on evaluation of the quality of information collected and the purpose of the research. Although often small, qualitative studies can incorporate larger samples depending on the environment and the scope of the question.
• Theoretical sampling is a process whereby participants are recruited and interviewed, and data are analyzed to identify initial concepts. New participants are brought in who are likely to expand on those concepts, until components of theory emerge and data have reached the point of saturation.
• In the analysis of qualitative data, researchers will usually develop a coding scheme to identify themes or concepts that emerge from the data. Through a process of content analysis, inferences are drawn by interpreting the textual material.
• The quality of a qualitative study is described as its trustworthiness, based on four criteria. Credibility refers to the degree of confidence in the truth of the findings. Dependability refers to how stable the data are over a span of time relevant to the study and the degree to which the study could be repeated with similar results. Confirmability refers to ensuring that findings are due to the experiences and ideas of participants, rather than characteristics or preferences of the researcher. Transferability refers to the ability to apply findings to other people in similar situations, or an assessment of generalizability.
• Several methods are used to establish credibility of findings, including triangulation, which involves comparing various sources of data to confirm findings; negative case analysis, which involves discussing elements of data that do not support explanations emerging from the data; and member checking, which involves bringing participants or other stakeholders into the process to validate findings and offer feedback.
• Mixed methods research involves the combination of qualitative and quantitative methods. Designs incorporate elements of data collection from both traditions and integrate the findings to explain the phenomenon under study.
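Coding and content analysis are interpretive, but tallying coded themes is largely mechanical. Here is a minimal sketch with hypothetical codes, meant only to show the bookkeeping side of content analysis, not the qualitative judgment behind the codes.

```python
from collections import Counter

# Hypothetical coded interview excerpts: each transcript segment has been
# assigned one or more theme codes by the researcher.
coded_segments = [
    {"id": "p1-07", "codes": ["fear_of_falling", "family_support"]},
    {"id": "p1-12", "codes": ["fear_of_falling"]},
    {"id": "p2-03", "codes": ["family_support", "loss_of_independence"]},
    {"id": "p3-09", "codes": ["loss_of_independence", "fear_of_falling"]},
]

# Count how often each theme recurs across the data set; interpretation
# of what the recurring themes mean remains a qualitative task.
theme_counts = Counter(code for seg in coded_segments for code in seg["codes"])
print(theme_counts.most_common())
```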
CHAPTER 22
Descriptive Statistics
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Describe methods for graphic presentation of distributions.
2. Calculate measures of central tendency and discuss their appropriate applications.
3. Define measures of variability including range, percentiles, variance, and standard
deviation.
4. Discuss the concept of sum of squares and its meaning in defining variance.
5. Explain the properties of the normal curve.
6. Explain the purpose of standardized scores in relation to the normal curve and standard
deviation.
 Key Terms
Distribution
Frequency distribution
Grouped frequency distribution
Histogram
Line plot (frequency polygon)
Stem-and-leaf plot
Exploratory data analysis (EDA)
Normal distribution
Skewed distribution
Central tendency
Mode
Median
Mean
Range
Percentiles
Quartiles
Interquartile range
Box plot
Deviation scores
Sum of squares (SS)
Mean square (MS)
Variance (s²)
Skewness
Standard deviation (s)
Coefficient of variation (CV)
Standardized scores (z-scores)
 Chapter Summary
• Descriptive statistics are used to characterize the shape, central tendency, and variability within a set of data, called a distribution.
• Frequency distributions provide an ordered list of scores within a distribution and the number of times each score occurs. When continuous data are used, frequency distributions can contain data grouped into meaningful intervals.
• Histograms, line graphs, and stem-and-leaf plots can be used to graphically display distributions.
• The shape of data can demonstrate how scores are distributed. The normal curve is a symmetrical bell-shaped curve. A skewed distribution is asymmetrical, with fewer scores at one end forming a "tail." A positively skewed distribution has a longer tail on the right, and a negatively skewed distribution has a longer tail on the left.
• Three measures of central tendency, or averages, can be used to characterize data. The mode is the score that occurs most frequently in a distribution. The median is the value above which there are as many scores as below it, dividing a rank-ordered distribution into two equal halves. The mean of the distribution is the sum of a set of scores divided by the number of scores, the value most people call the "average."
• Variability can be expressed in several ways. The range is the difference between the highest and lowest scores. Percentiles divide data into 100 equal portions. If a score is at the 20th percentile, that score is higher than 20% of the scores in that distribution. Quartiles divide a distribution into four equal parts; Q1, Q2, and Q3 correspond to 25%, 50% (the median), and 75% of the distribution. The interquartile range is the range of scores within the middle 50% of the distribution, between Q1 and Q3. These values are often graphed as box plots, which show the median, the interquartile range, and the minimum and maximum scores.
• Variability is a measure of the spread of scores within a distribution. As variability increases, scores are more spread out around the mean. It is assessed by taking deviation scores, the difference of each score from the mean. These values are squared to ignore minus signs, and their sum, the sum of squares (SS), indicates the degree of variability within the distribution.
• Dividing the sum of squares by n – 1 yields a mean square, the variance. Because the deviations were squared, the square root of the variance is used as the measure of variability in the original units of measurement, called the standard deviation (SD).
• In a normal distribution, proportions of the curve are standardized by converting standard deviation units to a z score. Approximately 68% of the curve represents the distance from –1 to +1 standard deviation units, or z = ±1.0. Approximately 95% of the curve falls between ±2 SD, and approximately 99% falls within ±3 SD.
• Using z scores, we can determine the area under the normal curve that is bounded by any two values.
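These descriptive measures can be verified with a few lines of Python; the scores below are hypothetical, and statistics.stdev uses the sample (n − 1) formula.

```python
import statistics

scores = [72, 85, 78, 90, 66, 81, 74, 88, 79, 83]  # hypothetical test scores

mean = statistics.mean(scores)             # central tendency
ss = sum((x - mean) ** 2 for x in scores)  # sum of squares
variance = ss / (len(scores) - 1)          # mean square
sd = statistics.stdev(scores)              # equals variance ** 0.5

# A z-score locates one raw score relative to the distribution:
z = (90 - mean) / sd
```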
CHAPTER 23
Foundations of Statistical Inference
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Discuss the concept of probability in relation to observed outcomes.
2. Describe the concept of sampling error.
3. Define and interpret a confidence interval for the mean.
4. Distinguish between research and null hypotheses.
5. Explain the difference between the null hypothesis and the alternative hypothesis.
6. Define Type I and Type II errors.
7. Discuss the purpose of a priori and post hoc statistical power analysis.
8. Discuss the determinants of statistical power.
9. Explain the difference between a one-tailed and two-tailed test.
10. Explain the conceptual difference between parametric and nonparametric statistics.
 Key Terms
Inferential statistics
Probability (p)
Sampling error of the mean
Standard error of the mean (sX̄)
Point estimate
Interval estimate
Confidence interval (CI)
Null hypothesis (H0)
Alternative hypothesis (H1)
Superiority trial
Non-inferiority trial
Not significantly different
Type I error
Type II error
Level of significance
Alpha (α)
Beta (β)
Power
Effect size (ES)
Effect size index
Conventional effect sizes
Statistical significance
Clinical significance
Publication bias
z-ratio
Critical region
Critical value
Two-tailed test
One-tailed test
Parametric statistics
Homogeneity of variance
Nonparametric tests
Kolmogorov-Smirnov test of normality
Shapiro-Wilk test of normality

 Chapter Summary

hypothesis of no difference, which will be


• Inferential statistics involve a decision-
tested by statistical procedures. The
making process that allows us to estimate
alternative hypothesis (H 1 ) is based on the
unknown population characteristics from
research hypothesis, predicts an effect, and
sample data.
can be expressed in a directional or non-
• Assumptions for inferential statistics are directional format.
based on two concepts of statistical
 The purpose of statistical testing is to try
reasoning: probability and sampling error.
to disprove H 0 . Because we cannot prove
 Probability is the likelihood that any one a negative, we can only legitimately reject
event will occur, given all possible H 0 if an effect is demonstrated or not reject
outcomes. A lowercase p is used to it if we cannot demonstrate an effect.
represent probability as a decimal, from 0
 Because we deal with sample data, when
(no probability) to 1.00 (absolute
we make a decision to reject or not reject
probability). If p = .04, it means there is a H 0 , we do so knowing we may be making
4% probability that an event will occur.
a correct or incorrect decision. A Type I
 Sampling error refers to the difference error occurs when we conclude that there
between true population values and sample is a significant effect when there is none,
values. Because we rarely know that is, we incorrectly reject H 0 when it is
population values, we make estimates true. A Type II error occurs when we find
based on sample data using standard error. no effect when there really is one, that is,
For means, the standard error of the mean we incorrectly reject H 0 when it is false.
is used to estimate the population mean,
 When we specify a null hypothesis for a
based on the ratio of the sample standard
non-inferiority trial, we predict that the
deviation and the √𝑛𝑛. As n increases, the experimental treatment will be worse than
standard error will decrease, indicating that the standard treatment by a certain
the estimated values should be closer to the amount, called the non-inferiority margin.
population mean. The alternative hypothesis will be that the
• Any single group statistic, such as a mean, is experimental treatment will not be worse
considered a point estimate. However, to than the standard treatment.
better estimate population values, it is more • We specify a level of significance, denoted as
helpful to determine an interval that is likely alpha (α), that indicates a threshold we will
to contain the population value. This is called use for committing a Type I error. Therefore,
a confidence interval (CI). If we calculate a
if α is set at .05, it means we will reject H 0
95% CI, it means that we are 95% sure that
only if we find a significant difference with
the true population value will fall within that
≤5% chance of being incorrect. Although this
range.
is an arbitrary threshold, it has become
• The null hypothesis (H 0 ) is a statistical accepted as standard.

61
 When a statistical test is performed to determine if a true effect has occurred, such as a difference between group means, the test will result in a p value that indicates the likelihood of making a Type I error if we decide the groups are different. Using .05 as the standard threshold, for example, p = .04 would be considered significant.
• The probability of making a Type II error is beta (β), which is the probability of failing to reject a false null hypothesis. If β = .20, then there is a 20% chance that we will not reject H0 when it is false.
• The complement of β error, 1 – β, is considered the power of a test. A more powerful test is more likely to find a significant difference if it exists.
 Power involves four interdependent concepts (PANE): power (1 – β), alpha level of significance (α), sample size (n), and effect size. With a larger sample, a test will be more powerful.
 The effect size (ES) is a measure of the degree to which H0 is false, or the size of the effect of the independent variable. An effect size index is a unitless standardized value that allows comparisons of effect size across samples and studies. There are several types of ES indices, each with conventional effect sizes that are considered small, medium, and large.
 A priori power analysis can be used to estimate sample size during planning of a study by estimating the expected effect size, desired power, and levels of α and β (a minimal example appears at the end of this summary). Post hoc analysis can be used to estimate achieved power when a study is completed and a significant effect is not found.
• In all statistical testing, it is important to consider the difference between statistical significance based on probability and clinical significance based on expertise, experience, and an understanding of theoretical mechanisms that puts statistical results in context. All research results should be considered in terms of whether outcomes are meaningful. A statistically significant effect doesn't mean it is important, and vice versa.
• In statistical testing, the size of the effect is based on a standardized score, the z score. Based on proportions of the normal curve, we can determine if a z score falls within tails of the curve that represent 5%. Anything that falls within the upper or lower 5% is considered an unlikely event and therefore would represent a significant effect.
 A two-tailed test is used when the alternative hypothesis is expressed without direction, allowing for the possibility that the effect could be positive or negative. A one-tailed test is used when the hypothesis is directional, indicating that a difference is expected only in one direction.
• Parametric statistics are used for statistical testing based on three assumptions.
 Samples are randomly drawn from a parent population with a normal distribution.
 Variances in the samples being compared are roughly equal, or homogeneous.
 Data are measured on the interval or ratio scales.
• Nonparametric statistics are used when these assumptions cannot be made. These statistical procedures are based on ranked data and are useful with ordinal or nominal data.
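As an illustration of the a priori approach described above, the sample size for a two-group comparison can be estimated in Python with the statsmodels package; the effect size, α, and power targets here are illustrative assumptions, not values from the text:

from statsmodels.stats.power import TTestIndPower

# n per group for a two-sample t-test: medium effect (d = .50), alpha = .05, power = .80
n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(round(n_per_group))  # roughly 64 subjects per group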
CHAPTER 24 Comparison of Two Means: t-Test
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Describe the components of the statistical ratio for examining the difference between two
means.
2. Discuss the assumptions for using statistical tests.
3. Discuss the application of the t-test for paired and unpaired research designs.
4. Describe the use of confidence intervals in statistical testing.
5. Interpret computer output for the paired and unpaired t-test.
6. Determine power and sample size estimates for studies comparing two means.
7. Explain why the use of multiple t-tests is inappropriate for statistical decision making.
 Key Terms
t-test
Error variance
Statistical hypothesis (H0)
Unpaired t-test (independent)
Homogeneity of variance
Standard error of the difference between means
Critical value
Degrees of freedom (df)
Levene's test for equality of variances
Two-tailed test
One-tailed test
Confidence intervals
Paired t-test
Standard error of the difference scores
Standardized mean difference (SMD)
Effect size
 Chapter Summary
• The t-test is used to compare two means, either from two independent groups or from two repeated measures.
• The statistic is based on the ratio of variance between groups divided by the variance within groups. The denominator of the t-test is called the standard error of the difference between means.
• An assumption for the t-test, like other parametric statistics, is that there is equality of variances among the two groups, or homogeneity of variance. When variances are significantly different, the t-test has to be adjusted. Levene's test is used to compare the variances of the two groups.
• Each group has n – 1 degrees of freedom.
• The calculated value of t is compared with a critical value that is determined by degrees of freedom, level of significance, and one- or two-tailed tests.
• Confidence intervals can be determined for the difference between means. These values present important information to judge the true nature of the difference between means.
• The independent or unpaired t-test is used when two independent groups are compared, usually formed by random assignment. The computer output for the unpaired t-test automatically generates two lines of data, one for equal variances assumed and another for equal variances not assumed. The researcher must decide which data to use based on the test for homogeneity of variance.
• The paired t-test is used when the two levels of the independent variable represent a repeated measure. Because the number of measures must be equal across the two treatment conditions, it is unnecessary to test for homogeneity of variance with the paired test.
• Power of the t-test is estimated using the effect size index d, which expresses the difference between two sample means in standard deviation units. Conventional effect sizes are small: d = .20, medium: d = .50, large: d = .80.
• Statistical reports for the t-test should include the means (±SD), the mean difference, confidence intervals, one- or two-tailed test, the calculated value of t, degrees of freedom, value of p, and effect size.
• It is inappropriate to use the t-test for comparing more than two means, as this inflates the Type I error associated with the comparisons. The analysis of variance is recommended when more than two means are compared.
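These procedures can be illustrated in Python with SciPy's standard routines; a minimal sketch with hypothetical scores (not data from the text):

import numpy as np
from scipy import stats

group1 = np.array([23, 27, 31, 25, 29, 30])  # hypothetical values
group2 = np.array([20, 22, 27, 21, 24, 26])

print(stats.levene(group1, group2))                     # homogeneity of variance
print(stats.ttest_ind(group1, group2))                  # equal variances assumed
print(stats.ttest_ind(group1, group2, equal_var=False)) # equal variances not assumed

pre = np.array([10, 12, 9, 14, 11])
post = np.array([13, 15, 10, 17, 14])
print(stats.ttest_rel(pre, post))                       # paired t-test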
CHAPTER 25 Comparison of More than Two Means: Analysis of Variance
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Discuss the application of the analysis of variance for one-way and two-way designs.
2. Discuss the difference between main effects and interaction effects in a two-way design.
3. Discuss the difference between between-subjects and within-subjects analyses.
4. Interpret computer output for analysis of variance.
5. Interpret power and effect size for analysis of variance.
 Key Terms
Analysis of variance (ANOVA)
One-way ANOVA
Partitioning the variance
Between-groups variance
Within-groups (error) variance
Homogeneity of variance
Sum of squares (SS)
Between-groups sum of squares (SSb)
Error sum of squares (within groups) (SSe)
Total sum of squares (SSt)
Degrees of freedom
Mean square (MS)
Between-groups mean square (MSb)
Error mean square (within groups) (MSe)
F-ratio
Multiple comparison
Eta squared (η2)
Cohen's f
Omega squared (ω2)
Two-way ANOVA
Factorial design
Main effects
Marginal means
Interaction effects
Repeated measures ANOVA (within-subjects design)
Sphericity
Mauchly's test of sphericity
Greenhouse-Geisser correction
Huynh-Feldt correction
Mixed analysis of variance
 Chapter Summary
• The analysis of variance (ANOVA) is used to compare means for more than two groups or repeated measures.
• The ANOVA is a parametric test based on assumptions of homogeneity of variance, normal distribution, and interval/ratio data.
• The one-way ANOVA is used with one independent variable or factor, with three or more levels.
• Total variance in a set of data reflects two sources of variance: treatment effects (between groups) and unexplained error (within groups).
• Variance is calculated as the sum of squares (SS). A separate SS will be calculated for between groups, within groups, and total variance.
• Levene's test is used to determine homogeneity of variance. If the test is not significant, variances are assumed to be equal.
• The format for a table reporting the results of an ANOVA includes sum of squares, mean square (MS), and degrees of freedom (df) for each source of variance, and the calculated F ratio and probability for each comparison. For between-groups variance df = k – 1, for within-groups variance df = N – k, and total df = N – 1.
• The F ratio is calculated as MSb/MSe. Subscripts b = between groups, e = error term.
• The ANOVA can include values for effect size and power. Effect size indices for the ANOVA are eta squared (η2) or partial eta squared (ηp2) and Cohen's f. These are calculated using values for the SSb and SSe. Conventional effect sizes have been proposed for these indices.
• The two-way ANOVA is used with a factorial design, when two independent variables are used.
• Variance in a two-way design includes main effects for each independent variable as well as the interaction of the two variables. When interaction effects are significant, main effects are usually ignored.
• The repeated measures ANOVA is used when all subjects are tested under all levels of the independent variable.
• Homogeneity of variance for repeated measures is based on the assumption of sphericity. This is tested using Mauchly's test of sphericity.
• Repeated measures can be used in a two-way design, or in a mixed design with one repeated measure and one independent measure.
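A minimal sketch of a one-way ANOVA with eta squared, using SciPy and the sum-of-squares definitions above (the data are hypothetical):

import numpy as np
from scipy import stats

g1 = np.array([4.0, 5.1, 6.2, 5.5])  # hypothetical group scores
g2 = np.array([6.8, 7.7, 7.1, 8.0])
g3 = np.array([9.2, 8.5, 10.1, 9.0])

f, p = stats.f_oneway(g1, g2, g3)

# eta squared = SSb / SSt, computed from the definitions above
allscores = np.concatenate([g1, g2, g3])
grand = allscores.mean()
ss_total = ((allscores - grand) ** 2).sum()
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in (g1, g2, g3))
eta_sq = ss_between / ss_total
print(f"F = {f:.2f}, p = {p:.4f}, eta squared = {eta_sq:.2f}")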
CHAPTER 26 Multiple Comparisons
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Explain the difference between per comparison and familywise error rate.
2. Define the minimal significant difference (MSD).
3. Calculate the MSD for several multiple comparison tests.
4. Discuss the purpose of trend analysis.
5. Interpret computer output for multiple comparisons tests.
 Key Terms
Tukey’s honestly significant difference (HSD)
Multiple comparisons tests
Studentized range (q)
Per comparison error rate (αPC )
Student-Newman-Keuls test (SNK)
Familywise error rate (α FW ) Scheffé comparison
Type I error rate Simple effects
Bonferroni correction Planned comparisons
Post hoc multiple comparisons Complex contrasts
Minimum significant difference (MSD) Trend analysis
 Chapter Summary
• Multiple comparisons are used to explore differences between means following an analysis of variance.
• Multiple comparisons provide adjustments for the level of significance to account for the use of several comparisons. More liberal procedures will have greater power but also a greater chance of committing a Type I error. More conservative approaches will have lower power but decrease the probability of Type I error.
• The per comparison error rate (αPC) is the probability associated with a single comparison. The familywise error rate (αFW) represents the cumulative probability of making at least one Type I error in a set of statistical comparisons.
• The Bonferroni correction is an adjustment determined by dividing the level of significance (e.g., .05) by the number of comparisons, correcting the probability that is needed to achieve significance.
• Post hoc multiple comparisons are performed after an ANOVA and may involve comparison of all pairwise differences.
• The most commonly used post hoc tests include Tukey's honestly significant difference (HSD), the Student-Newman-Keuls (SNK) comparison, and Scheffé's comparison. The SNK procedure is the most liberal of the three, Scheffé is the most conservative, and Tukey provides a good balance between Type I and Type II error.
• The post hoc tests are based on comparison of a difference between means with a minimum significant difference (MSD). The mean difference must be equal to or greater than the MSD to be considered significant. The Tukey test uses one MSD for all pairwise comparisons. The SNK procedure uses a different MSD depending on how far apart the means are. The Scheffé test is based on the F statistic to determine the MSD.
• Post hoc tests can be run for one-way or two-way designs and for repeated measures. In two-way designs, comparisons may include marginal means for main effects or means for interaction.
• A priori multiple comparisons are planned comparisons, decided before a study is run, focusing on specific comparisons rather than all possible pairwise comparisons.
• Trend analyses are used to look at patterns of responses over a continuous variable such as age or time. Trends are generally described as linear or nonlinear. The most frequently encountered nonlinear trend is quadratic, which means scores vary in direction or rate of change. Trends can be tested using an ANOVA that can partition variance into trend components.
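As an illustration, the Bonferroni correction can be applied by hand, and Tukey's HSD has a built-in SciPy routine (SciPy 1.8 or later); the data are hypothetical:

import numpy as np
from scipy import stats

g1 = np.array([4.0, 5.1, 6.2, 5.5])  # hypothetical group scores
g2 = np.array([6.8, 7.7, 7.1, 8.0])
g3 = np.array([9.2, 8.5, 10.1, 9.0])

# Bonferroni: divide the familywise alpha by the number of comparisons
alpha_fw, n_comparisons = 0.05, 3
alpha_pc = alpha_fw / n_comparisons  # .0167 per comparison

for a, b, label in [(g1, g2, "1 vs 2"), (g1, g3, "1 vs 3"), (g2, g3, "2 vs 3")]:
    t, p = stats.ttest_ind(a, b)
    print(label, f"p = {p:.4f}", "significant" if p <= alpha_pc else "ns")

# Tukey's HSD for all pairwise comparisons
print(stats.tukey_hsd(g1, g2, g3))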
CHAPTER 27 Nonparametric Statistics for Group Comparisons
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Describe the role of nonparametric statistics and reasons for their use.
2. Compare parametric and nonparametric tests for specific applications.
3. Explain the procedure for ranking a distribution of scores with and without ties.
4. Calculate and interpret results of the Mann-Whitney U test, the Kruskal-Wallis ANOVA,
the sign test, the Wilcoxon signed-ranks test, and the Friedman ANOVA.
5. Interpret multiple comparison tests for the Kruskal-Wallis and Friedman analyses of
variance.
 Key Terms
Nonparametric statistics
Mann-Whitney U test
Kruskal-Wallis analysis of variance by ranks (H)
Sign test
Binomial test (x)
Wilcoxon signed-ranks test (T)
Friedman two-way analysis of variance by ranks (χ2r)
 Chapter Summary
 Nonparametric tests are statistical procedures that do not make the same assumptions of data as parametric statistics. They are often referred to as distribution-free tests.
 Data used in nonparametric statistics are reduced to ranks, and comparisons are based on median distributions, not means.
 The two major criteria for choosing a nonparametric test are that data are at the ordinal or nominal level of measurement or that assumptions of normality and homogeneity of variance cannot be satisfied.
 The Kolmogorov-Smirnov and Shapiro-Wilk tests can be used to determine if data follow a normal distribution.
 Nonparametric tests can be compared to parametric procedures for power efficiency, which is a test's relative ability to identify significant differences for a given sample size. A rule of thumb to determine sample size is to add 15% to the required sample size if the data were used with a parametric procedure.
 To rank scores in a distribution, data are listed from smallest to largest, taking into account negative signs. If two or more scores are tied, each is given the same rank, which is the average of the ranks they occupy.
 The Mann-Whitney U test is used to test the difference between independent samples and is analogous to the unpaired t-test.
 Scores are ranked for the total sample, and the sums of ranks within each group are then determined. Under the null hypothesis, we would expect the groups to be equally distributed with regard to high and low ranks, and the mean rank for each group would be the same.
 The effect size index for U is a correlation coefficient based on z.
 The Kruskal-Wallis One-Way Analysis of Variance by Ranks is the nonparametric analog of the one-way analysis of variance (ANOVA), used with more than two independent groups.
 Scores are ranked for the total sample, and the sums of ranks within each group are then determined. Under the null hypothesis, ranks would be equally distributed across all groups.
 The test statistic is H, which is based on the chi-square distribution (see Chapter 28).
 Dunn's multiple comparison is a nonparametric test for pairwise comparisons for the Kruskal-Wallis test.
 The effect size index for H is eta squared, η2.
 The sign test is used to compare paired scores, analogous to the paired t-test. It is based on the binomial test, which considers dichotomous data.
 The test looks at each pair of scores and determines the direction of difference. Under the null hypothesis, the distribution of + or – scores under each condition will not be different.
 The test statistic for the binomial test is x.
 The Wilcoxon signed-ranks test is a more powerful comparison for two related groups, also analogous to the paired t-test.
 Difference scores for each pair are ranked. Under the null hypothesis, there would be an equal representation of + or – scores among higher and lower ranks.
 The test statistic is T.
 The effect size index for the Wilcoxon test is based on a correlation coefficient, similar to U.
 The Friedman two-way analysis of variance by ranks is the nonparametric equivalent of the one-way repeated measures ANOVA, used with three or more repeated conditions.
 Scores are ranked within each subject across the repeated conditions, and the sums of ranks for each condition are determined. Under the null hypothesis, we would expect high and low ranks to be evenly distributed across all conditions.
 The test statistic is χ2r, which is based on the chi-square distribution.
 Dunn's multiple comparison is used as a nonparametric multiple comparison procedure.
 The effect size index for the Friedman test is the Kendall coefficient of concordance, W.
See the Chapter 27 Supplement on Estimating Power and Sample Size for a description of using G*Power for the Mann-Whitney U test and the Wilcoxon signed-ranks test.
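Each of these tests has a standard SciPy routine; a minimal sketch with hypothetical scores:

import numpy as np
from scipy import stats

g1 = np.array([12, 15, 11, 18, 14])  # hypothetical values
g2 = np.array([22, 19, 25, 17, 21])
g3 = np.array([30, 27, 24, 29, 26])

print(stats.mannwhitneyu(g1, g2))   # two independent groups
print(stats.kruskal(g1, g2, g3))    # three or more independent groups

pre = np.array([10, 12, 9, 14, 11])
post = np.array([13, 15, 10, 17, 14])
print(stats.wilcoxon(pre, post))    # two related measures

c3 = np.array([15, 16, 12, 18, 16])
print(stats.friedmanchisquare(pre, post, c3))  # three repeated conditions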
CHAPTER 28 Measuring Association for Categorical Variables: Chi-Square
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Describe the use of chi-square in research.
2. Discuss the purpose of a goodness-of-fit model.
3. Discuss how standardized residuals help to explain significant chi-square values.
4. Calculate chi-square for 2 × 2 tables and interpret results.
5. Discuss issues related to small sample sizes for interpretation of chi-square.
6. Describe other measures of association for nominal variables.
 Key Terms
Chi-square statistic, χ2 Expected frequencies
Goodness of fit Yates' continuity correction
Uniform distribution Fisher’s exact test
Known distribution Phi coefficient
Standardized residuals Odds ratio
Contingency table McNemar test
 Chapter Summary
• Many research questions involve variables that are measured in categories, whether nominal or ordinal. Analysis of these variables involves proportions or frequencies within each category.
• Proportions are analyzed in terms of how they meet or depart from chance expectations. For example, a coin toss has two possible outcomes (heads or tails). If we toss a coin several times, assuming the coin is fair, we should see a 50:50 outcome in the long run. We can count the number of observed heads or tails to determine if it varies significantly from this expected distribution.
• Chi-square, χ2, tests the difference between observed and expected frequencies.
 The test is based on the assumption that frequencies represent individual counts, and categories are exhaustive and mutually exclusive. No one individual is represented in more than one category.
• Chi-square can be used to test goodness of fit of data with uniform or known distributions. The null hypothesis states that there is no difference between observed and expected counts.
 A uniform distribution is one in which we would expect an equal proportion of individuals within each category.
 In a known distribution, we can test observed counts against the known distribution of individuals within certain categories, often based on population data.
 Critical values of χ2 are based on k – 1 degrees of freedom, where k is the number of categories (see Appendix Table A-5). Calculated values must be equal to or greater than the critical value to be significant.
 A significant test indicates that there is a difference between the observed frequencies and expected chance frequencies for each category. It does not mean that the numbers of counts within categories are different from each other.
 Standardized residuals reflect the difference between expected and observed frequencies. They provide a basis for distinguishing which categories depart most significantly from the expected values.
• Chi-square is used most often as a measure of association between two categorical variables, with data organized within a contingency table or crosstabulation that crosses the two variables (with R rows and C columns).
 The null hypothesis states that there is no association between the two variables, and that the expected frequencies across all categories will represent the same proportions as the observed frequencies.
 Tests of significance for contingency tables are tested with (R – 1)(C – 1) degrees of freedom.
 Yates' continuity correction can be used to adjust values when any cells within the contingency table have expected frequencies less than 5.0. This test is controversial, however, and is considered overly conservative.
 Fisher's exact test is used instead of χ2 when expected frequencies are less than 1 in some cells.
• The effect size index for χ2 is given the notation w and is equal to √(χ2/N).
• Other measures of effect are often calculated with χ2.
 The phi coefficient, φ, is interpreted as a correlation coefficient for 2 × 2 tables and is equivalent to w.
 The contingency coefficient, C, is used when tables are larger than 2 × 2.
 Cramer's V is used when tables are asymmetrical.
 The odds ratio (OR) can also be used as an effect size measure with 2 × 2 tables (see Chapter 34).
• The McNemar test is a form of χ2 used with 2 × 2 tables involving correlated samples, where subjects are their own controls. This is especially useful with pretest-posttest designs when the dependent variable is categorical.
 The odds ratio can be used as an effect size index with the McNemar test, but it is adjusted to account for the matched values.
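A minimal sketch of both uses of chi-square in Python with SciPy (the counts are hypothetical):

import numpy as np
from scipy import stats

# goodness of fit against a uniform distribution (expected defaults to equal counts)
observed = np.array([18, 30, 12])
print(stats.chisquare(observed))

# 2 x 2 contingency table: rows = groups, columns = outcome categories
table = np.array([[20, 30],
                  [10, 40]])
chi2, p, dof, expected = stats.chi2_contingency(table)  # applies Yates' correction for 2 x 2
w = np.sqrt(chi2 / table.sum())   # effect size index w
print(chi2, p, w)
print(stats.fisher_exact(table))  # alternative for small expected frequencies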
CHAPTER 29 Correlation
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Describe the role of correlation for data analysis.
2. Describe the use of a scatterplot in correlation analysis.
3. Discuss the importance of distinguishing between linear and curvilinear relationships in
correlation analysis.
4. Discuss the meaning of significance for a correlation coefficient.
5. Describe how variations of the Pearson r can be used to study relationships with
continuous and dichotomous variables.
6. Discuss the importance of distinguishing between causation and correlation.
7. Discuss factors that can influence generalization of a correlation coefficient.
8. Describe the meaning of partial correlation for understanding the relationships among
several variables.
 Key Terms
Correlation
Scatter plot
Correlation coefficient
Pearson product-moment coefficient of correlation, r
Spearman rank correlation coefficient, rs
Kendall's tau-b
Phi coefficient, φ
Point biserial correlation coefficient, rpb
Rank biserial correlation coefficient, rrb
Partial correlation
Zero order correlation
First-order partial correlation
 Chapter Summary
• Correlation refers to the association between two variables, X and Y, reflecting the degree to which high scores in X are associated with higher scores in Y, and vice versa.
• A scatter plot provides a visual demonstration of that relationship. The closer points fall to a straight line, the higher the degree of correlation.
• The relationship between two variables can be positive, when high values on X are associated with high values on Y, or negative, when high values on X are associated with lower values on Y.
• Correlation can be described quantitatively by a correlation coefficient, symbolized by r. Coefficients can usually range from -1.0 (a negative relationship) to +1.0 (a positive relationship).
 The magnitude of the correlation coefficient indicates the strength of the association between X and Y. The closer to ±1.00, the stronger the relationship. A negative sign indicates the direction of the relationship only.
• Meaningful interpretation of correlation depends on certain assumptions:
 Subjects' scores represent an underlying normal population.
 Each subject contributes an X and a Y score.
 X and Y are independent (no one score is a component of the other).
 The relationship between X and Y is linear.
 Values of X and Y are observed, not controlled or manipulated.
• There is no consensus on what value of a correlation coefficient is considered strong or weak. Interpretations must be made within the context of what is being measured and how the correlation will be applied.
• The Pearson product-moment coefficient of correlation is the most commonly applied correlation statistic for parametric data. The Pearson correlation is symbolized by r, but the Greek letter rho, ρ, is used to represent the population parameter.
 The null hypothesis states that there is no relationship between two variables: H0: ρ = 0.
 Alternative hypotheses may be expressed with or without direction.
• The correlation coefficient can be tested for significance against a critical value with n – 2 df. The calculated value must be equal to or greater than the critical value to be significant.
 Confidence intervals can also be constructed around the correlation coefficient.
• The effect size for correlation is r, which can be used to determine power and sample size.
• Two commonly used nonparametric correlation statistics are the Spearman rho, rs, and the Kendall tau-b, τ. These statistics look at the common rankings of X and Y.
• Several statistics can be used to correlate dichotomies.
 The phi coefficient, φ, is used when both X and Y are dichotomous.
 The point biserial correlation coefficient, rpb, is used to correlate one dichotomous variable with a continuous variable. It is equivalent to the independent t-test.
 The rank biserial correlation, rrb, is used to correlate one dichotomous and one ranked variable. It is equivalent to the Mann-Whitney U test.
• An important caveat in the interpretation of correlation is that it must be interpreted relative to its clinical significance, not simply its statistical significance.
• Correlation is a measure of covariance between two variables. It cannot be used to compare two variables or to establish a significant difference between two distributions.
• Correlation does not imply causation. The presence of an association between two variables may be due to a third variable that is correlated with both of them. Partial correlation refers to the correlation of X and Y with the effect of a third variable removed. For instance, if X and Y are correlated, it is possible that a third variable, Z, is responsible for their relationship. If we look at the correlation of both X and Y with Z, then we can partition out the variance that is only explained by the relationship between X and Y. This would be the partial correlation with the effect of Z removed.
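These coefficients can be computed with SciPy; a minimal sketch with hypothetical X and Y values:

import numpy as np
from scipy import stats

x = np.array([1.0, 2.1, 2.9, 4.2, 5.1, 6.0])  # hypothetical data
y = np.array([2.3, 3.9, 6.2, 7.8, 10.1, 12.2])

print(stats.pearsonr(x, y))    # Pearson r and its p value
print(stats.spearmanr(x, y))   # Spearman rho, based on ranks
print(stats.kendalltau(x, y))  # Kendall tau-b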
CHAPTER 30 Regression
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Discuss the purpose of linear regression.
2. Explain the meaning of the line of best fit.
3. Explain how r2 and standard error of the estimate help the researcher to determine the
accuracy of the fit of a regression equation.
4. Describe the components of a multiple regression equation.
5. Describe the application of stepwise regression analysis.
6. Discuss how nonlinear regression can be used to better reflect some distributions.
7. Discuss the use of logistic regression for predicting dichotomous outcomes.
8. Discuss the application and interpretation of analysis of covariance.
9. Interpret computer output for regression procedures.
 Key Terms
Coefficient of determination (r2) Multicollinearity
Simple linear regression Partial correlation
Independent variable (X) Tolerance level
Dependent variable (Y) Variance inflation factor (VIF)
Regression line Stepwise multiple regression
Regression equation R2 change
Regression constant Forward inclusion
Regression coefficient Backward deletion
Line of best fit Hierarchical regression
Residual Dummy variables
Adjusted R2 Coding
Standard error of the estimate (SEE) Nonlinear regression
Analysis of variance of regression Quadratic curve
Standardized residuals Polynomial regression
Multiple regression Logistic regression
Beta weight Maximum likelihood estimation
Odds ratio (OR) Propensity scores
Indicator variable Analysis of covariance (ANCOVA)
Cox & Snell R square Covariate
Nagelkerke R square Adjusted means
Hosmer-Lemeshow test Homogeneity of slopes
Adjusted odds ratio
 Chapter Summary
• Regression is an extension of correlation specifically designed to analyze shared variance to establish how well one variable, or a set of variables, can predict an outcome.
• The square of the correlation coefficient, r2, called the coefficient of determination, is a measure of accuracy of prediction. It indicates the proportion of variance in Y that can be explained by knowing the variance in X.
• Simple linear regression involves the development of an equation to calculate predicted values of Y from X. The variable designated X is the independent variable, and the variable designated Y is the dependent variable. Both X and Y variables are expected to be continuous.
• A scatterplot is the best way to visually understand the degree of relationship between X and Y. For the purposes of prediction, a line is drawn, called the line of best fit or the regression line, which "best" describes the orientation of data points.
• The algebraic representation of the regression line is given by Ŷ = a + bX, where Ŷ (Y-hat) is the predicted value of Y.
 The term a is called a regression constant. It is the Y-intercept, representing the value of Y when X = 0. This can be a positive or negative value.
 The term b is the regression coefficient. It is the slope of the line, which is the rate of change in Y for each one-unit change in X.
 The regression equation can be used to predict a value of Y by knowing the value of X. However, unless prediction is perfect, and all points fall directly on the regression line, this prediction may not be accurate for every person. The deviation of an individual score from the regression line (the predicted score) is called the residual, indicating the degree of error in the regression line.
• Prediction accuracy can be analyzed in several ways.
 The value of R2 indicates the proportion of variance in the dependent variable that is explained by the independent variable.
 The adjusted R2 is slightly smaller than R2, representing a chance-corrected value.
 The standard error of the estimate (SEE) reflects the variance of errors on either side of the regression line, or the residuals.
• Regression analysis tests the null hypothesis H0: b = 0 and is analogous to testing the significance of the correlation between X and Y.
 This hypothesis can be tested using an analysis of variance of regression, which tests how much variance in the scores is explained by the regression line.
 Correlation coefficients can be used to test the relationship between X and Y.
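A minimal sketch of simple linear regression in Python with the statsmodels package (a library choice made here for illustration, not named in the text); the printed values correspond to a, b, R2, and the SEE described above, using hypothetical data:

import numpy as np
import statsmodels.api as sm

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # hypothetical data
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8, 12.1])

X = sm.add_constant(x)            # adds the intercept term a
model = sm.OLS(y, X).fit()
print(model.params)               # [a, b] for Y-hat = a + bX
print(model.rsquared)             # proportion of variance explained
print(np.sqrt(model.mse_resid))   # standard error of the estimate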
 The slope of the regression line can be tested using a t-test, indicating if the slope is significantly different from zero.
• There is an assumption that for any given value of X, a normal distribution of Y scores exists in the population. The observed value of Y at a particular value of X is one random value from this distribution.
• Multiple regression allows for prediction of Y using a set of several independent X variables. Outcome variables are often better explained by a series of variables, rather than just one.
 The multiple regression equation accommodates several predictor variables: Ŷ = a + b1X1 + b2X2 + ... + bkXk.
 The regression coefficients (b) are essentially weights that indicate the contribution of each variable to the predicted outcome.
 Standardized regression coefficients, called beta weights, are essentially z scores, allowing comparison of regression coefficients.
• Collinearity occurs when independent variables in a regression are correlated with each other and therefore provide redundant information.
 Partial correlation can show how one variable is related to Y, accounting for the contributions of other independent variables.
 Tolerance level and variance inflation factor (VIF) reflect the degree to which collinearity occurs. The lower the tolerance and the higher the VIF, the more collinearity is present.
• Stepwise multiple regression is a procedure whereby variables are entered into a regression equation one at a time based on the degree of explained variance. The process stops when no further variables add information to explain Y.
• Hierarchical regression involves entering sets of variables in sequential stages to examine relationships of subsets of variables.
• Although dependent and independent variables are expected to be continuous measures for use in regression, many variables are measured in categories. Dummy variables involve coding so that categorical variables can be included in regression.
• Nonlinear regression, or polynomial regression, is used when the relationship between X and Y is not linear. Other than linear relationships, the most common trend is based on one turn, or a quadratic curve.
• Logistic regression is another form of prediction, used when the dependent variable is categorical, predicting group membership.
 Variables are coded 0 and 1, with 1 typically representing the more adverse condition.
 Odds ratios (OR) can be generated as part of logistic regression, which indicate the odds of an outcome based on input variables. Adjusted odds ratios account for the influence of other variables in the equation in projecting an odds ratio for each independent variable.
• The effect size index for regression is R2, which can also be converted to f2 for use with G*Power.
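A minimal sketch of logistic regression with statsmodels, exponentiating the fitted coefficients to obtain odds ratios; the data and threshold behavior are hypothetical:

import numpy as np
import statsmodels.api as sm

# hypothetical data: 1 = adverse outcome, 0 = none
x = np.array([1.2, 2.3, 1.8, 3.1, 2.6, 3.5, 0.9, 2.9, 2.0, 1.5])
y = np.array([0,   1,   0,   1,   0,   1,   0,   1,   1,   0])

X = sm.add_constant(x)
result = sm.Logit(y, X).fit(disp=False)
print(np.exp(result.params))   # odds ratios for the intercept and predictor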
• Propensity scores are outcome measures from logistic regression that reflect probabilities associated with the set of independent variables. These propensity scores can be used as a way of matching subjects in studies when matching has to account for several variables.
• The analysis of covariance (ANCOVA) is a form of ANOVA that is combined with regression to account for the influence of covariates in studying the relationship between independent and dependent variables. When groups differ on a relevant variable at baseline, the ANCOVA essentially adjusts scores so that the groups start off looking alike on those variables, reducing confounding that can be introduced when baseline characteristics affect group responses.
CHAPTER 31 Multivariate Statistics
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Explain the application and purposes of exploratory and confirmatory factor analysis.
2. Describe the purpose of structural equation modeling and its relationship to theory.
3. Describe the purpose of cluster analysis for describing subgroups in a set of individuals.
4. Describe the application of multivariate analysis of variance and discriminant analysis in the
study of multiple dependent variables.
5. Describe the purpose of survival analysis and how survival rates and hazard ratios can be
interpreted for different groups.
 Key Terms
Exploratory factor analysis (EFA)
Principal components analysis (PCA)
Confirmatory factor analysis (CFA)
Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy
Bartlett's test of sphericity
Communality
Extraction
Principal axis factoring
Eigenvalue
Factor matrix
Rotated factor matrix
Varimax rotation
Orthogonal rotation
Oblique rotation
Factor loadings
Factor scores
Structural equation modeling (SEM)
Path diagram
Exogenous variable
Endogenous variable
Path coefficient
Direct effect
Indirect effect
Total effect
Goodness of fit
Cluster analysis
Multivariate analysis of variance (MANOVA)
Vector
Group centroid
Discriminant analysis
Canonical correlation
Wilks' lambda (λ)
Survival analysis
Terminal event
Censored observations
Kaplan-Meier estimates
Survival curve
Median survival time
Cox proportional hazards model
Hazard ratio (HR)
 Chapter Summary
• Multivariate analysis refers to a set of statistical procedures that are distinguished by the ability to examine several response variables within a single analysis to account for interrelationships.
• Exploratory factor analysis (EFA) is a procedure that is used to explore the underlying structure of a set of variables related to a latent trait to reduce data to a smaller set of correlated values, or factors.
 Factors are sets of variables that have high correlations with each other and that are not correlated with variables in other factors.
 Factors are derived through a process of extraction that defines the number of factors that account for the majority of variance in the data, based on a variance measure called an eigenvalue.
 A factor matrix is developed that lists factor loadings, which are the correlations of each variable with each factor. Loadings above .40 are considered meaningful to assign a variable to a particular factor.
 A factor matrix may need to be rotated to assure a clean orientation of variables with individual factors. A rotated factor matrix will usually present more interpretable data.
 Factors are "named" by the researcher, based on which variables are assigned to that factor.
• Principal components analysis (PCA) is a different exploratory technique that identifies components rather than factors, accounting for the total variance in a dataset.
• Confirmatory factor analysis (CFA) is used to confirm hypotheses about underlying constructs by proposing a model and then verifying it through a factor analysis.
• Structural equation modeling (SEM) is another confirmatory technique that analyzes a proposed model in terms of causal relationships that fit a given theory.
 Data are presented in a path diagram that shows the causal and outcome relationships among latent traits.
 Exogenous variables are those that have causes outside the model and are not included in the theory being tested. Endogenous variables are those that are caused by the exogenous variables.
 Direct effects are those that show a direct causal connection. Indirect effects are identified by a chain of two or more variables that include a mediating variable, which essentially serves as both a cause and an effect.
• Cluster analysis is an exploratory approach that looks at common characteristics among subgroups of individuals.
• Multivariate analysis of variance (MANOVA) is a form of ANOVA that incorporates more than one dependent variable. Its purpose is to account for shared variance that is present when dependent variables are related to each other.
 Rather than looking at group means, the MANOVA looks at a set of means for each group, called a vector. The overall mean of the vectors is the group centroid.
 The result of the MANOVA will indicate if groups are different based on these combined group scores.
 Follow-up tests for the MANOVA may include individual analyses of variance, or a discriminant analysis, which looks at linear combinations of variables to determine if they can predict group membership.
• Survival analysis looks at the time to reach a terminal event for subgroups to determine if the risk of reaching the terminal event is different. Often used to assess risk of mortality, it can also be used to look at other outcomes.
 Because not all participants will reach the terminal event within the time frame of the study, censored observations represent those who have not yet reached the terminal event by the time the study ends.
 Kaplan-Meier estimates are used to determine the probability that someone will reach a terminal event within a given time period.
 The Cox proportional hazards method is used to derive a hazard ratio (HR), which is interpreted like an odds ratio. A value of 1.0 indicates no increased risk of reaching the terminal event, the null value. Values below 1.0 indicate decreased risk and above 1.0 indicate increased risk. Confidence intervals can be constructed.
 The median survival time is the point at which 50% of the participants are expected to survive, or that point at which there is a 50% chance of surviving.
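Kaplan-Meier and Cox models can be sketched with the third-party lifelines package (a library assumed here for illustration, not a tool named in the text); the follow-up data are hypothetical:

import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter

# hypothetical follow-up: time in months, event = 1 if terminal event observed, 0 = censored
df = pd.DataFrame({
    "time":  [6, 13, 24, 25, 31, 14, 22, 7, 18, 30],
    "event": [1,  1,  0,  1,  0,  1,  0, 1,  1,  0],
    "group": [0,  0,  0,  0,  0,  1,  1, 1,  1,  1],
})

kmf = KaplanMeierFitter().fit(df["time"], event_observed=df["event"])
print(kmf.median_survival_time_)   # median survival estimate

cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
print(cph.hazard_ratios_)          # hazard ratio for the group covariate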
CHAPTER 32 Measurement Revisited: Reliability and Validity Statistics
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Discuss how measurement and reliability theory are related to the concept of variance.
2. Explain the purposes of the six models of the intraclass correlation coefficient (ICC).
3. Interpret computer output for ICC values.
4. Discuss how percent agreement, kappa, and weighted kappa are used to assess
agreement.
5. Discuss the purpose of Cronbach’s alpha in testing for internal consistency.
6. Discuss the purpose of the standard error of measurement (SEM) for understanding the
reliability of clinical scores.
7. Discuss the use of Bland-Altman plots and other methods of reliability testing for parallel
forms and response stability.
8. Discuss the interpretation of the minimal detectable change (MDC) and minimal
clinically important difference (MCID) in the measurement of change.
9. Describe several methods for measuring change using effect size indices.
 Key Terms
Intraclass correlation coefficient (ICC) Item-to-total correlation
Random effect Limits of agreement (LOA)
Fixed effect Bland-Altman plot
Standard error of measurement (SEM) Responsiveness
Percent agreement Minimal detectable change (MDC)
Crosstabulation Minimal clinically important difference (MCID)
Kappa statistic, κ
Weighted kappa, κw
Internal consistency
Chronbach’s alpha
Distribution-based measures
Effect size
Standardized response mean (SRM)
Guyatt’s responsiveness index (GRI)
Anchor-based measures
Global rating of change
MDC proportion
MCID proportion
 Chapter Summary
• Reliability and validity are fundamental concepts for evidence-based practice, establishing consistency and meaning of measurements. Several statistical procedures are used to assess reliability and validity based on classical measurement theory.
 The intraclass correlation coefficient (ICC) is primarily used to assess test-retest or rater reliability. It is a measure of relative reliability. As a unitless index, this measure allows comparison of testing methods.
 ICC values can range from 0.00 to 1.00, with higher values indicating greater reliability.
 The ICC has the advantage of being able to evaluate reliability across two or more sets of scores representing multiple assessments of the same test or multiple raters.
 The ICC is calculated using variance estimates from a repeated measures analysis of variance (ANOVA).
 There are two forms of the ICC, depending on whether the assigned scores are derived from single ratings or from mean ratings across several trials. Form 1 involves single ratings and form k a mean of several ratings. The value of k is the number of scores used to derive the mean.
 There are three models of the ICC, depending on whether the results of the analysis are intended to be generalized beyond the particular testing situation of the study, or if the reliability of those involved in the study are the only raters of interest. Only two models are typically used.
 Model 2 is used most often, where each subject is assessed by the same set of raters who are considered "random" representatives of others who would also use the measurement. Model 3 is used when the raters are the only ones of interest, and their results are not intended to be generalized to others. ICCs are classified by model and form, such as ICC (2,1) indicating that scores are single values and will be generalized beyond the study. ICC (3,k) indicates that scores are means of k scores and will not be generalized beyond the study.
 Researchers should decide which model fits their study's purpose at the outset. The different forms and models can generate different values, with model 3 generally yielding larger values than model 2, and form k being larger than form 1.
• There are no standards for strong reliability, although measures used for clinical decision-making should be close to .90 or higher.
• The standard error of measurement (SEM) is a measure of absolute reliability, providing information about expected consistency in a measurement over several trials. It is expressed in the original unit of measurement.
 The SEM represents the range of scores that can be expected when differences are due to random error. It is calculated using relative reliability coefficients or variance measures from an ANOVA.
 The SEM is used to determine a confidence interval. For instance, the 95% CI would indicate that we were 95% confident that an individual's true score falls within a certain range, with variability due to random error. Therefore, if an individual scores 60 pounds on a test of grip strength, and we know the SEM is 4.38, we determine the 95% CI as X ± z(SEM), where z = 1.96. Therefore, we are 95% confident that the individual's true score would fall somewhere between 51.42 and 68.58.
 The SEM allows a clinician to determine how variable we can expect an individual's performance to be.
• Percent agreement is a measure of how frequently raters agree on categorical scores. It equals the number of observed agreements out of the total number of possible agreements.
 Because chance can affect agreement, we use a chance-corrected statistic called kappa, κ. It represents a ratio of the number of observed agreements relative to the number of agreements that would be expected if only chance were operating.
 Kappa will generally be smaller than percent agreement. Excellent agreement is considered above .80.
 An alternative method called weighted kappa, κw, can be used when there is a difference in the importance of disagreements in different directions.
• Internal consistency is a measure of reliability that reflects the degree of consistency within a set of items on a scale, indicating the correlation among items.
 Cronbach's alpha is the test statistic. A high value of alpha, above .70, indicates that the items on the scale relate to a single dimension. Low values of alpha may indicate that items are not measuring dimensions of the same construct.
• When we want to compare the results of different forms of a measurement, we can use a procedure that looks at limits of agreement (LoA). For example, we could compare blood pressure measurements taken with manual versus automated instruments. If these are reliable alternatives, we expect their measurements to be close.
 A Bland-Altman plot can be used to show the relationship between the two measurements, plotting the mean of the two measurements for each individual against the difference between the two measurements. A band at 1.96 standard deviations above and below the mean difference indicates the limits of agreement within which we would consider the instruments interchangeable.
• Responsiveness is the ability of a measuring instrument to register change in a score whenever an actual change has occurred. It is a ratio of true change and random error.
 The minimal detectable change (MDC), a measure of reliability, is the minimal amount of measured change required before we can eliminate the possibility that measurement error is solely responsible. If a measurement exceeds the MDC, we can be confident that some portion of the change is due to real change in the measured variable.
 The MDC = z × SEM × √2. The value of z indicates the level of confidence attached to the MDC, usually 95% (z = 1.96) or 90% (z = 1.645).
 Beyond simply looking at random error, the minimal clinically important difference (MCID) is the smallest change that is perceived as beneficial by the patient and that would lead to a change in management. The MCID reflects validity, and is a judgment about the importance of different degrees of change.
 The MCID is usually determined by using a global measure of importance, often on an ordinal scale, such as 0 to ±5, with 0 indicating no change, 5 indicating a great deal better, and –5 indicating a great deal worse. Using a cutoff such as "somewhat better" or "somewhat worse," a value can be determined that matches those levels, which would be considered the MCID.
 Distribution measures can be used to assess MCID based on effect size indices. The three most commonly used are (1) the effect size index (ES), a ratio of the mean change score in a group of stable subjects divided by the standard deviation of baseline scores; a measure with high variability will have a smaller effect size; (2) the standardized response mean (SRM), a ratio of the change from pretest to posttest divided by the standard deviation of the change scores within a distribution; a distribution with high variability in the degree of change will have a small SRM; and (3) Guyatt's responsiveness index (GRI), which uses an anchor-based MCID for the measure, indicating the smallest difference between baseline and posttest that would represent meaningful change.
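These quantities can be computed directly from their definitions; a minimal Python sketch with hypothetical values. The line obtaining the SEM from a reliability coefficient, SEM = SD × √(1 − ICC), is one common formulation of the relative-coefficient approach mentioned above:

import numpy as np

# SEM from a reliability coefficient: SEM = SD * sqrt(1 - ICC)
sd_baseline, icc = 9.8, 0.80                 # hypothetical values
sem = sd_baseline * np.sqrt(1 - icc)

# minimal detectable change at 95% confidence: MDC = z * SEM * sqrt(2)
mdc95 = 1.96 * sem * np.sqrt(2)
print(f"SEM = {sem:.2f}, MDC95 = {mdc95:.2f}")

# limits of agreement for two instruments (Bland-Altman)
manual = np.array([120, 132, 118, 140, 127, 135])  # hypothetical readings
auto = np.array([124, 130, 121, 143, 125, 138])
diff = manual - auto
loa = (diff.mean() - 1.96 * diff.std(ddof=1),
       diff.mean() + 1.96 * diff.std(ddof=1))
print(f"bias = {diff.mean():.1f}, LoA = {loa[0]:.1f} to {loa[1]:.1f}")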
CHAPTER 33 Diagnostic Accuracy
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Apply measures of sensitivity, specificity, predictive values, and likelihood ratios to the
application of clinical tests.
2. Discuss the relationship among measures of diagnostic accuracy.
3. Determine pretest and posttest probabilities for a given case.
4. Discuss the application of receiver operating characteristic (ROC) curves in the
interpretation of test scores.
5. Describe the use of clinical prediction rules.
 Key Terms
Index test SnNout
Reference standard Pretest probability
Sensitivity Posttest probability
Specificity Bayes’ theorem
True positive Likelihood ratio (LR)
True negative Nomogram
False positive Receiver operating characteristic (ROC)
False negative curve
False-negative rate Cutoff score
False-positive rate Area under the curve
Predictive value Youden index
Prevalence Clinical prediction rule
SpPin
 Chapter Summary
• The purpose of a study of diagnostic accuracy is to determine if a new test, the index test, is accurate based on results of a gold standard that is known to be an accurate indication of the patient's true status, either the presence or absence of the condition.
• The validity of a diagnostic test is dependent on the accuracy of the gold standard, or a reference standard that is assumed to be an accurate criterion.
• Diagnostic tests must be evaluated by using samples that represent the full spectrum of the condition.
• Results of a diagnostic study are obtained using a 2 × 2 table, with columns indicating the true diagnostic result based on the gold standard (present/absent), and rows indicating those who test positive or negative. Cells in the table use the standard a, b, c, d designations. Diagnostic accuracy is the proportion of all correct tests in the sample.
• True positives are those who are accurately diagnosed with the condition (cell a), and true negatives are those correctly identified as not having the condition (cell d). False positives are those who are incorrectly assigned a positive test result (cell b). False negatives are individuals who are incorrectly assigned a negative test (cell c).
• Sensitivity (Sn) is the test's ability to obtain a positive test when the target condition is really present, or the true positive rate (a/(a + c)). Specificity (Sp) is the test's ability to obtain a negative test when the condition is absent, or the true negative rate (d/(b + d)). These proportions are expressed as percentages.
• The complement of sensitivity is the false-negative rate (1 – sensitivity). The complement of specificity is the false-positive rate (1 – specificity). These represent the probability of incorrect tests.
• A positive predictive value (PV+) is the proportion of all those who tested positive who were true positives. A negative predictive value (PV–) is the proportion of all those who tested negative who were true negatives. Predictive value provides information regarding how useful the test will be.
• Prevalence is the number of cases of a condition existing in a given population at a point in time. This value can influence sensitivity, specificity, or predictive value if the sample does not reflect prevalence in the overall population.
• When a test has high sensitivity, a negative test rules out the diagnosis. When a test has high specificity, a positive test rules in the diagnosis. The mnemonics SpPin and SnNout are used to remember these relationships.
• Diagnostic tests provide information about a patient's true condition, which will lead to treatment decisions. When we evaluate a patient initially we may hypothesize about a likely diagnosis. This hypothesis can be translated into a probability or level of confidence, termed the pretest probability, or prior probability—what we think the diagnosis will be before we perform the test. This can be estimated from clinical information or literature, or it can be estimated as the prevalence of the condition in the population or the study sample.
• Once the test is done, we can consider how much more confident we are about the diagnosis, the posttest probability, or posterior probability.
• When a pretest probability is high, it may not be necessary to perform a test. When it is low, it may warrant considering a different diagnosis. When there is uncertainty, the test should be done. Once it is performed, we can determine if the posttest probability is low (consider a different diagnosis), high (diagnosis is confirmed), or if there is still uncertainty, and other tests should be done.
• Likelihood ratios reflect the "confirming power" of a test, indicating how much the test can increase certainty about a positive diagnosis. A positive likelihood ratio (LR+) tells us how many times more likely a positive test will be seen in those with the disorder than in those without the disorder [Sn/(1 – Sp)]. A negative likelihood ratio (LR–) tells us how many times more likely a negative test will be seen in those with the disorder than in those without the disorder [(1 – Sn)/Sp]. A discriminating test will have a high LR+ and a low LR–.
 An LR+ over 5 and an LR– lower than 0.2 represent relatively important effects. Likelihood ratios between 0.2 and 0.5 and between 2 and 5 may be important. Values close to 1.0 represent unimportant effects.
• A nomogram can be used to determine posttest probability by knowing the pretest probability and the LR+ and LR–.
• When the outcome of a diagnostic test is continuous, rather than dichotomous, a cutoff score must be set to determine who does and does not have the condition. This cutoff should provide the best balance between sensitivity and specificity.
 This balance can be represented graphically by a receiver operating characteristic (ROC) curve, which examines the ratio of "signal" to "noise" in the data. The curve is based on the balance between Sn and 1 – Sp (between true positives and false positives) at varied cutoff points. The area under the curve (AUC) can be used to determine the relative accuracy of the test. An AUC of .50 indicates a result no better than chance. The closer to 1.0, the stronger the test. The best cutoff point will typically be at the point where the curve turns.
• Clinical prediction rules (CPR) are shortcut tools to identify the likely diagnosis, prognosis, or response to treatment. A set of findings will increase the probability that a condition is present.
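All of these indices follow directly from the 2 × 2 cell counts; a minimal Python sketch with hypothetical values:

# hypothetical 2 x 2 results: a = TP, b = FP, c = FN, d = TN
a, b, c, d = 45, 10, 5, 40

sn = a / (a + c)          # sensitivity (true positive rate)
sp = d / (b + d)          # specificity (true negative rate)
ppv = a / (a + b)         # positive predictive value
npv = d / (c + d)         # negative predictive value
lr_pos = sn / (1 - sp)    # LR+
lr_neg = (1 - sn) / sp    # LR-
print(f"Sn={sn:.2f} Sp={sp:.2f} PV+={ppv:.2f} PV-={npv:.2f} "
      f"LR+={lr_pos:.1f} LR-={lr_neg:.2f}")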
CHAPTER 34 Epidemiology: Measuring Risk
Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Discuss the purpose of descriptive epidemiology in relation to person, place, and time.
2. Explain the difference between incidence and prevalence.
3. Calculate and interpret relative risk and odds ratios.
4. Discuss the concepts of confounding and effect modification in the interpretation of risk.
5. Explain the concepts of relative and absolute risk reduction.
6. Apply number needed to treat (NNT) and number needed to harm (NNH) to clinical data to
interpret the relative effectiveness or harm of interventions.
 Key Terms
Epidemiology Relative risk (RR)
Descriptive epidemiology Odds ratio (OR)
Prevalence Effect modification
Incidence Confounding
Cumulative incidence Simpson’s Paradox
Incidence rate (IR) Analytic epidemiology
Person-time Experimental event rate (EER)
Vital statistics Control event rate (CER)
Mortality rate Relative risk reduction (RRR)
Crude mortality rate Absolute risk reduction (ARR)
Cause-specific rate Number needed to treat (NNT)
Case-fatality rate Absolute risk increase (ARI)
Age-specific rate Number needed to harm (NNH)
 Chapter Summary
• Epidemiology is that branch of statistics and research that focuses on the distribution and determinants of health outcomes in different populations. It can involve description to identify who, when, and where disorders occur.
period. It can also assess prevalence, which  If the absolute risk is the same for those
quantifies the number of cases existing in the exposed and unexposed, RR = 1.0. A RR
population within a specified time period. above 1.0 indicates an increased risk, and
a value less than 1.0 indicates that the
• Vital statistics indicate the rate of a condition
exposure reduces the risk of developing
within a population, such as birth rate or
the disorder (is preventive).
mortality rate. A crude rate is based on the
total population. Rates can be adjusted to  Confidence intervals can also be
age, causes, or other categories that can calculated for RR to provide a better
differentiate outcomes. estimate of the risk within the population.
• Analytic epidemiology is concerned with  In case-control studies, we cannot
testing hypotheses, typically to determine the compute RR because the number of cases
relationship between specific exposures and and exposed individuals are chosen by the
disease. Exposures may be lifestyle behav- researcher. Therefore, we use an estimate
iors, occupational hazards, or environmental of RR called the odds ratio (OR), which is
influences that present excess or decreased equal to ad/bc. The odds ratio is inter-
risk of acquiring a disorder. preted in the same way as RR.
• To assess risk, data are typically displayed in • Effect modification refers to a difference in
a 2 × 2 contingency table, with the columns the magnitude of an effect measure across
representing the presence or absence of the levels of another variable. An effect
disorder and rows representing the presence modifier will change the relative risk
or absence of the exposure. associated with an exposure for different
subgroups in the population, a form of
 Cells are designated a, b, c, d according to interaction.
typical contingency table format.
• A confounding variable is a nuisance
 The incidence of a health outcome
variable, a third variable that is associated
represents its absolute risk (AR). For those with the exposure, is a risk factor for the
exposed (E), the AR E is the number of
disorder independent of the exposure, and is
cases among the total exposed sample not part of the causal link between exposure
(a/(a+b). For those who are not exposed, and disease.
AR 0 is the number of cases among the
total unexposed sample (c/(c+d). These • Analytic epidemiology can also be applied to
estimates reflect the probability of the measures of intervention effect in the format
health outcome for the exposed and of an RCT, with an experimental and control
unexposed groups. group. By setting a threshold for a successful
outcome of treatment, it is possible to
• Relative risk (RR) is the ratio of these two determine how much more likely this
probabilities, indicating the likelihood that success is seen above “failed” outcomes.
someone who has been exposed to the risk
factor will develop the disorder, as compared  The experimental event rate (EER) is the
to those who are unexposed. proportion of people in the treatment
group who experience an adverse
 RR = AR E/AR0 = [a/(a+b)]/[c/(c+d)]. outcome (an adverse event is considered
 Relative risk can be used with cohort a failure of treatment). The control event
studies and with experimental studies rate (CER) is the proportion of people in
where the exposure is the treatment the control group who experienced an
condition. adverse outcome.
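To make the 2 × 2 arithmetic above concrete, here is a minimal Python sketch; the cell counts a, b, c, d are invented.

```python
# 2 x 2 table: rows = exposed/unexposed, columns = disorder present/absent.
a, b = 30, 70   # exposed:   a = cases, b = non-cases
c, d = 10, 90   # unexposed: c = cases, d = non-cases

ar_e = a / (a + b)   # absolute risk in the exposed, AR_E
ar_0 = c / (c + d)   # absolute risk in the unexposed, AR_0

rr = ar_e / ar_0          # relative risk (cohort or experimental designs)
or_ = (a * d) / (b * c)   # odds ratio, ad/bc (case-control estimate of RR)

print(f"AR_E = {ar_e:.2f}, AR_0 = {ar_0:.2f}")   # 0.30 and 0.10
print(f"RR = {rr:.2f}, OR = {or_:.2f}")          # 3.00 and 3.86
```

Note that the OR overstates the RR here; the two converge when the disorder is rare.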
 The ratio of these two values is the RR associated with the intervention = EER/CER. These values are essentially the same as the measure of absolute risk.
 The relative risk reduction (RRR) indicates how much the risk is reduced in the treatment group as compared to the control group = (CER – EER)/CER.
 Absolute risk reduction (ARR) indicates the actual difference in risk = CER – EER.
• The degree of risk reduction should help us decide whether a treatment is worth pursuing based on the likelihood of a successful outcome, or avoidance of an adverse event if the purpose of treatment is prevention. As a useful clinical value, the number needed to treat (NNT) is the number of patients that would need to be treated to prevent one adverse outcome or to achieve one beneficial outcome in a given time period (see the sketch below).
 NNT is the reciprocal of the ARR (1/ARR).
 An NNT of 1.0 means that we would need to treat 1 patient to avoid one adverse outcome; that is, every patient will benefit from treatment. An NNT of 10 means that we would need to treat 10 patients to prevent one adverse outcome, or 1 out of 10 will be successful. The lower the value of NNT, the more effective the treatment.
• Where NNT reflects the prevention of an adverse outcome, we must also be concerned about potential harms.
 The absolute risk increase (ARI) is the amount of increased risk associated with an intervention, or EER – CER.
 The number needed to harm (NNH) is the reciprocal of ARI (1/ARI). It is the number of patients that would need to be treated to cause one adverse outcome.
 The larger the NNH, the less likely a patient is to experience an adverse outcome. An NNH of 100 would mean that we would need to treat 100 patients before we would see 1 adverse outcome. An NNH of 1.0 would mean every patient would have an adverse outcome.
• Several factors must be considered in comparing values of NNT across studies:
 The NNT must be interpreted in terms of the time period for treatment and follow-up.
 The interpretation of NNT will depend on baseline risk.
 To compare NNTs across studies, outcomes must be the same.
 The design validity of the trial must be considered.
 There are no standard limits to indicate how NNT and NNH should be used for decision-making. Decisions need to consider severity of the disease and its consequences, availability of treatments, side effects, cost, and patient preferences.
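The event-rate measures above reduce to a few lines of arithmetic. A minimal sketch with invented rates; an “event” here is the adverse outcome.

```python
eer = 0.10   # experimental event rate
cer = 0.25   # control event rate

rrr = (cer - eer) / cer   # relative risk reduction
arr = cer - eer           # absolute risk reduction
nnt = 1 / arr             # number needed to treat

print(f"RRR = {rrr:.0%}, ARR = {arr:.2f}, NNT = {nnt:.1f}")
# -> RRR = 60%, ARR = 0.15, NNT = 6.7 (treat about 7 to prevent one event)

# For a harm that the treatment increases (hypothetical side-effect rates):
ari = 0.08 - 0.03   # ARI = EER - CER for the harm
nnh = 1 / ari       # number needed to harm
print(f"ARI = {ari:.2f}, NNH = {nnh:.0f}")   # ARI = 0.05, NNH = 20
```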
CHAPTER 35
Writing a Research Proposal

Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Explain the purposes of a research proposal.
2. Describe the components of a research proposal.
3. Describe the important issues for administrative support of a research proposal.
4. Describe how research personnel may be involved in the development of a research proposal.
5. Develop a timeline for a research study.
 Key Terms
Abstract
Specific aims
Objectives
Hypotheses
Principal investigator
Timeline
Budget
Direct costs
Indirect costs
 Chapter Summary
• The initial stages of the research process include development of the research question and delineation of methods, data collection, and data analysis procedures. To be sure that these plans are carried out, a research proposal describes the purpose of the study, the importance of the research question, the research protocol, and feasibility of the project.
 The research proposal contains a synthesis of the scientific literature that supports the research question and provides a theoretical rationale for the study and its methods.
 Proposals may serve as the body of a grant application.
 Proposals will be reviewed by IRBs to assure that all procedures are ethical, that the project personnel are qualified, and the project is feasible.
 Proposals serve as a blueprint for all involved in a study to be sure they understand their roles and how procedures should be followed to assure consistency and accuracy throughout the project.
• The first part of a proposal is the research plan.
 The proposal begins with its title, which should reflect the study fully, and an abstract that summarizes the intent of the study. These are often written after the study is completed.
 The opening section of a proposal identifies the research question, why it is important, how it is supported by literature or theory, and what contributions the study will make.
 The purpose of a study is usually stated in terms of hypotheses, guiding questions, or specific aims that reflect the expected outcomes of the study.
 The methods section should include a description of the study design, which may be experimental, observational, descriptive, or qualitative.
 It will specify who participants will be (inclusion and exclusion criteria), how and from where they will be selected, how informed consent will be obtained, and how sample size was determined. The informed consent form will be included in the proposal for IRB review.
 Procedures are fully explained as they will be carried out for data collection, including details on devices used and full operational definitions.
 A timeline flow sheet is useful to summarize the expected sequence of events for the planning and implementation of a study.
 Data management is described in terms of analysis procedures, safety, and confidentiality of data.
 Appendices may follow the main section, including sample surveys, equipment details, letters of support from outside agencies, or other information that is necessary to support the project’s implementation.
• The second part of the proposal is the plan for administrative support.
 This includes a full description of the investigators, their qualifications, and other personnel involved in the study.
 Information must be provided that demonstrates how existing resources are available to support the project or what additional resources will be necessary and how they will be obtained.
 A full budget for direct costs must be detailed for funding agencies but may be required for academic proposals as well. This should include personnel salaries and the proportion of time the individuals will devote to the study, equipment costs, facilities, and supplies. It will also include indirect costs that relate to overhead, usually at a fixed rate for funding agencies (a simple budget sketch follows this summary).
• The format of a proposal is stipulated by the agency to which it will be submitted and should be followed exactly.
• Proposals should be written clearly, with concise language, and using an organizational structure that will make it easy for reviewers to follow.
• Agencies will have specific requirements for submitting proposals, including deadlines that must be followed.
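Direct and indirect costs combine in a simple way. This sketch uses invented line items and an assumed 40% overhead rate; actual indirect rates are set by negotiation with the funding agency.

```python
# Hypothetical direct-cost line items for a small study.
direct_costs = {
    "personnel (20% effort, project coordinator)": 18_000,
    "equipment": 6_500,
    "supplies": 1_200,
}
INDIRECT_RATE = 0.40  # assumed fixed overhead rate, for illustration only

direct_total = sum(direct_costs.values())
indirect_total = direct_total * INDIRECT_RATE  # overhead applied to direct costs
print(f"Direct: ${direct_total:,}  Indirect: ${indirect_total:,.0f}  "
      f"Total: ${direct_total + indirect_total:,.0f}")
# -> Direct: $25,700  Indirect: $10,280  Total: $35,980
```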
CHAPTER 36
Critical Appraisal: Evaluating Research Reports

Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Discuss the role of levels of evidence in critiquing research.
2. Describe the core questions that can be used to assess all types of research studies.
3. Discuss the components of a research report and the basic information that should be included
to determine if a study is valid.
4. Discuss the information that is necessary to determine if results of a research study are
meaningful.
5. Describe specific questions that are relevant to critically appraise studies of interventions,
diagnosis, prognosis, or qualitative research.
6. Discuss the purpose and format of a critically appraised topic (CAT), and create a CAT for a
given research report.
 Key Terms
Evidence-based practice
Critical appraisal
Levels of evidence
Critically appraised topic (CAT)
Journal club
 Chapter Summary
• The purpose of critical appraisal is to determine the scientific merit of a research report and its applicability to evidence-based practice.
• The classification of levels of evidence can be used as an initial criterion to judge the rigor of a study’s design. Use of a strong design, however, does not guarantee the validity of a study’s findings.
• The appraisal process begins with a search to identify a relevant study to answer a clinical question. The title and abstract will be the first pass at determining if the study is appropriate.
• Studies must be evaluated based on the background presented to justify the research question, the validity of the methods used, how the results are presented, and how findings are put in context of prior research and clinical theory.
• Several core questions can be used for any type of study to answer the questions, “Is the study valid?” “Are the results meaningful?” and “Are the results relevant to my patient?”
• Studies must present information to establish the study’s purpose and rationale, who the participants are and how they were selected, what study design was used, how data were collected and analyzed, what results were obtained, and how findings can be interpreted.
• Studies should present findings in terms of effect size so that the importance of the findings can be determined.
• Studies of intervention should include information about how treatment groups were formed, operational definitions of intervention and measurement outcomes, procedures to eliminate bias such as blinding and random assignment, and use of appropriate analysis procedures, including intention to treat analysis.
• Studies of diagnostic accuracy should include information on the gold standard used to assess the index test, procedures to eliminate bias such as blinding, and how measures of diagnostic accuracy were applied.
• Prognostic studies should include information regarding the retrospective or prospective nature of the design, how cases and controls were defined and recruited, how data were obtained, the relevant time period, and how prognostic estimates were derived.
• Qualitative studies should include information on the sampling strategy and the qualitative approach used, such as ethnography, phenomenology, or grounded theory. Data collection and analysis procedures should be described in detail, including how trustworthiness was assured.
• A critically appraised topic (CAT) is a brief summary of one or more research reports that focus on a specific patient question. Its sections include a title that reflects the question, the author and date, a clinical scenario describing the patient case, the clinical question using the PICO format, a clinical bottom line that summarizes the applicability of findings, the search history and citations, a summary of the study’s design and the results, and other comments that can address internal and external validity (a skeletal example follows this summary).
• Journal clubs can provide a useful venue for teaching practitioners to search for articles and to develop critical appraisal skills.
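Purely as an illustration of the CAT sections listed above, here is a skeletal outline expressed as a Python dictionary; every field value is hypothetical.

```python
cat = {
    "title": "Does early mobilization reduce length of stay after hip surgery?",
    "author_date": "A. Reviewer, 2024-05-01",
    "clinical_scenario": "A 67-year-old patient, day 1 after total hip arthroplasty...",
    "clinical_question_PICO": {
        "Population": "adults after total hip arthroplasty",
        "Intervention": "mobilization within 24 hours",
        "Comparison": "standard mobilization at 48+ hours",
        "Outcome": "hospital length of stay",
    },
    "clinical_bottom_line": "Early mobilization appears to shorten stay...",
    "search_history_and_citations": ["PubMed 2024-04-15: early mobilization AND THA"],
    "study_summary_and_results": "RCT, n = 120, mean difference in stay...",
    "comments": "Unblinded outcome assessment may threaten internal validity.",
}
```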
CHAPTER 37
Synthesizing Literature: Systematic Reviews and Meta-Analyses

Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Describe the purposes and formats of systematic reviews, meta-analyses, and scoping reviews.
2. Discuss how publication bias and grey literature can influence outcomes of a systematic
review or meta-analysis.
3. Discuss how selection criteria are applied in the development of a systematic review.
4. Describe the major components in a search strategy to develop a systematic review.
5. Apply scales for assessing methodologic quality of research studies.
6. Discuss how effect size is used to compare studies for meta-analysis.
7. Interpret data from a forest plot and funnel plot.
8. Describe the purpose of sensitivity analysis.
9. Discuss the criteria for appraising systematic reviews and meta-analyses.
10. Describe the purpose of scoping reviews and how they are distinguished from systematic
reviews.
 Key Terms
Systematic review
Meta-analysis
Scoping review
Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)
Publication bias
Grey literature
Data extraction
Template for Intervention Description and Replication (TIDieR)
Methodological quality
Enhancing the QUAlity and Transparency OF health Research (EQUATOR)
Physiotherapy Evidence Database (PEDro) scale
Cochrane risk of bias tool
Selection bias
Performance bias
Attrition bias
Detection bias
Quality Assessment of Diagnostic Accuracy Studies (QUADAS)
Quality in Prognosis Studies (QUIPS)
COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN)
Cochrane Handbook
Effect size
Standardized mean difference (SMD)
Forest plot
Heterogeneity
Cochran’s Q
I² statistic
Sensitivity analysis
Funnel plot
Grade of Recommendation, Assessment, Development and Evaluation (GRADE)
A MeaSurement Tool to Assess systematic Reviews (AMSTAR)
PRISMA Extension for Scoping Reviews (PRISMA-ScR)
 Chapter Summary
• Systematic reviews are a form of research that uses a rigorous process of searching, appraising, and summarizing existing literature on a selected topic. When effect size data are available in research reports, a meta-analysis can be done by pooling effect size estimates, creating a more robust estimate of an intervention effect.
• Systematic reviews commonly focus on interventions but can also be applied to studies of prognosis, diagnosis, and qualitative studies.
• Guidelines for reporting systematic reviews are included in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist.
• A systematic review is built on a specific research question that should include the elements of PICO: population, intervention and comparison, and outcome measures.
• The process of developing a systematic review is similar to the typical research process, starting with an introduction with background literature and a research rationale.
• The methods section details the criteria for choosing studies to review, describes the process of searching for references, and explains how the studies were selected and evaluated. In a systematic review, the “subjects” of the study are the articles that are included. Therefore, inclusion and exclusion criteria indicate the design, participants, and definitions of interventions and outcomes that must be present in selected studies.
• The search strategy is important to assure that selected studies are comprehensive and represent the most recent research on the topic. Reviewers may restrict articles to randomized trials but should also explore studies with other designs that may still provide useful information.
• Grey literature is composed of nonpublished information, often on websites, in reports, conference proceedings, theses, or dissertations. These sources can be valuable to a full understanding of the scope of knowledge on the topic.
• Reviewers want to avoid the impact of publication bias, which is the tendency for studies with “no significant” findings not to be published.
• Once articles are chosen, they must be reviewed to determine if they fit the set criteria. Data extraction is a process whereby information is analyzed and recorded as part of the review.
• The PRISMA flow diagram demonstrates how articles were accessed and which were rejected for review. All systematic reviews should be completed by at least two colleagues who review studies independently. Reliability of their evaluations should be assessed.
• An important aspect of systematic reviews is the assessment of methodological quality, to determine if there is adequate control of sources of bias that affect the value of the results for answering the research question. Several scales have been developed for quality assessment of specific types of designs.
 For RCTs, the Physiotherapy Evidence Database (PEDro) scale and the Cochrane Risk of Bias Tool are often used.
 For observational studies, the Newcastle-Ottawa Scale (NOS) Quality Assessment is commonly used.
 For diagnostic studies, the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) is used.
 The Quality in Prognosis Studies (QUIPS) scale is used for prognosis studies.
 The COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) is a checklist for use in methodological studies.
• Discussion and conclusions in a systematic review describe the completeness of the evidence, overall quality, and consistency of findings with other studies or reviews.
• Meta-analysis is an extension of a systematic review that incorporates statistical pooling of data across several studies (see the sketch following this summary).
 Pooling is based on effect sizes, such as the standardized mean difference for interventions or relative risk for observational studies.
 Effect sizes are weighted based on sample size.
• A forest plot is a graphic display of weighted effect sizes of individual studies with confidence intervals, and a cumulative summary effect size.
• Measures of heterogeneity are applied to data to reflect consistency of findings across studies.
 Cochran’s Q tests for heterogeneity, and the I² statistic represents the percentage of variance across studies that is due to true differences in treatment effect.
• Sensitivity analysis involves looking at subsets of data to see if results agree with the overall data.
• Funnel plots are scatterplots of treatment effect sizes against a measure of variability within the data for each study. If studies are not equally distributed around the total effect, it may indicate publication bias.
• The Grade of Recommendation, Assessment, Development and Evaluation (GRADE) system can be used to assess the quality of the body of evidence, rating quality from high to very low.
• A MeaSurement Tool to Assess Systematic Reviews (AMSTAR) is a checklist for appraisal of systematic reviews.
• Scoping reviews are a different form of systematic review that does not include the same rigor of assessment. Their intent is to address specific clinical questions to explore the nature of evidence.
 The PRISMA Extension for Scoping Reviews (PRISMA-ScR) includes a checklist that defines standard reporting items for scoping reviews.
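The pooling, weighting, and heterogeneity statistics summarized above can be sketched in a few lines. The per-study effect sizes (SMDs) and standard errors below are invented, and a fixed-effect inverse-variance model is assumed for simplicity.

```python
import math

smd = [0.42, 0.35, 0.60, 0.18]   # standardized mean differences per study
se  = [0.20, 0.15, 0.25, 0.10]   # standard errors (smaller = larger sample)

w = [1 / s**2 for s in se]                      # inverse-variance weights
pooled = sum(wi * d for wi, d in zip(w, smd)) / sum(w)
se_pooled = math.sqrt(1 / sum(w))
ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)

q = sum(wi * (d - pooled) ** 2 for wi, d in zip(w, smd))  # Cochran's Q
df = len(smd) - 1
i2 = max(0.0, (q - df) / q) * 100               # I² as a percentage

print(f"Pooled SMD = {pooled:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"Q = {q:.2f} on {df} df, I² = {i2:.0f}%")
```

A forest plot is essentially a graphic rendering of these per-study estimates, their confidence intervals, and the pooled value.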
CHAPTER 38
Disseminating Research

Chapter Overview
 Objectives
After completing this chapter, the learner will be able to:
1. Describe the components of a research article and their associated content.
2. Discuss the relevant information that should be included in each section of a research
report.
3. Discuss the use of tables and graphs in a research report.
4. Assess examples of good writing style.
5. Compare and select options for dissemination.
6. Describe the elements of good design for a research poster.
7. Describe the important considerations in developing an oral presentation.
 Key Terms
EQUATOR (Enhancing the QUAlity and Transparency Of health Research)
CONSORT (CONsolidated Standards of Reporting Trials)
Journal impact factor
Open access journals
Think. Check. Submit.
Peer review
Primary author
Corresponding author
Fabrication
Falsification
Plagiarism
Systematic reviews
Meta-analysis
Perspectives/Commentaries
Short reports
Letters to the editor
Trial protocols
Book reviews
Editorials
Oral presentations
Poster presentation
e-poster
 Chapter Summary
• For research results to inform evidence-based practice, they must be disseminated in the scholarly literature. The most traditional vehicle for dissemination is a journal article.
• To select which journal to target for your manuscript, consider its aims and scope, impact factor, and alternative metrics of citation.
 The impact factor measures how frequently a journal’s articles have been cited elsewhere in the previous 2 years (a worked example follows this summary).
• Open access journals charge authors for publication and make articles available to readers for free.
 Green open access means an article is published in a traditional journal and is made available in a separate repository.
 Gold open access refers to journals that make all articles freely available.
 Hybrid gold access refers to traditional subscription journals that provide an option for authors to publish with free access.
 Some journals, considered predatory journals, take advantage of authors by charging for publication but offer no rigorous peer review or editorial services.
• The International Committee of Medical Journal Editors (ICMJE) has established four criteria for authorship. All authors on a manuscript should meet all four:
 Substantial contribution to conception and design of the work or the acquisition, analysis, or interpretation of data.
 Drafting the work or revising it critically for important intellectual content.
 Final approval of the published version.
 Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
• Most journals require all authors to sign a copyright release that includes a statement of their contributions to the work, including:
 Conception and design
 Participation in data collection or analysis
 Writing the article
 Project management or obtaining funding
 Reviewing the final paper
• The primary author takes responsibility for coordination of the team involved in the publication. This person is usually also the corresponding author, who takes responsibility for communication during submission and revision.
• Guidelines for reporting research results provide a checklist for what to include in a manuscript. Guidelines have been developed for many kinds of research designs, and they can be accessed via the EQUATOR network. Most guidelines also have Elaboration and Explanation Papers that clarify each item on the checklist.
• Journal articles follow the IMRAD format: introduction, methods, results, and discussion. Many journals also allow online supplementary material.
• Using clear language with active verbs and concrete nouns helps readers grasp the importance of your research. Avoid abbreviations and jargon that may cloud meaning.
• Other vehicles for presenting research results include case reports, short reports, letters to the editor, and oral and poster presentations.
• With the profusion of publications, authors need to promote their work to have it reach their intended audience. Social media can help spread the word about articles and provide additional measures of their impact.
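The conventional two-year impact factor is a simple ratio; the citation and article counts in this sketch are invented.

```python
# Two-year impact factor: citations received this year to items published in
# the previous two years, divided by the citable items published in those years.
citations_to_prior_two_years = 1_500   # e.g., 2024 citations to 2022-2023 items
citable_items_prior_two_years = 400    # articles published in 2022-2023

impact_factor = citations_to_prior_two_years / citable_items_prior_two_years
print(f"Impact factor: {impact_factor:.2f}")   # -> 3.75
```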