
MedBridge Statistics Booster

Chad Cook, PT, PhD, MBA, FAAOMPT

KEY TERMS:
Clinical Significance: The practical importance of a treatment effect—whether it has a real, genuine,
palpable, noticeable effect on daily life. It was originally anchored to the patient’s perception but has
since expanded beyond this boundary.

Clinical Trial: Any research study that prospectively assigns human participants or groups of humans
to one or more health-related interventions to evaluate the effects on health outcomes. Clinical
trials are divided into four phases which are designed to keep patients safe and to answer dedicated
questions about the efficacy or effectiveness of an intervention.

Confidence Intervals: A range of values so defined that there is a specified probability that the value of
a parameter lies within it.

Data: Recorded factual material commonly retained by and accepted in the scientific community as
necessary to validate research findings.

Effect Size: This is the magnitude of an intervention reflected by an index value. It can be calculated
from the data in a clinical trial and is mostly independent of sample size. Most interventions have small
to moderate effect sizes.

Effectiveness: The performance of an intervention under “real-world” circumstances.

Efficacy: The performance of an intervention under ideal and controlled circumstances.

False Negative: A test result which incorrectly indicates that a particular condition or attribute is
absent.

False Positive: A test result which incorrectly indicates that a particular condition is present.

Fidelity: This is described in two ways: 1) the extent to which delivery of an intervention adheres to
the protocol or program model originally developed, and 2) how closely the intervention reflects the
appropriateness of the care that should be provided.

Implementation Science: The science of putting (executing) a project or a research finding into effect.

Methodology: Within the research domain, this reflects the specific procedures or techniques used to
identify, select, process, and analyze information about a research topic.

Minimally Clinically Important Difference: The smallest difference in score in the domain of interest
which patients perceive as beneficial and which would mandate, in the absence of troublesome side
effects and excessive cost, a change in the patient’s management.


Outcomes Research: A broad umbrella term without a consistent definition. However, it tends to
describe research that is concerned with the effectiveness of public-health interventions and health
services.

P value: The probability, under an assumption of no difference between groups, of obtaining a result
equal to or more extreme than what was actually observed. The threshold for significance is usually set
at 5% (0.05).

Personalized Medicine: Within research, this involves the study of tailoring of medical treatment to the
individual characteristics of each patient.

Precision Medicine: A form of medicine that uses information about a person’s genes, proteins, and
environment to prevent, diagnose, and treat disease.

Reliability: This is measured in several ways. It is the degree to which the result of a measurement,
calculation, or specification can be depended on to be precise.

Statistical Assumptions: Characteristics about the data that need to be present before performing
selected types of inferential statistics.

Statistical Significance: Refers to the claim that a result from data generated by testing or
experimentation is not likely to occur randomly or by chance, but is instead likely to be attributable to a
specific cause.

Statistics: The practice or science of collecting and analyzing numerical data in large quantities,
especially for the purpose of inferring proportions in a whole from those in a representative sample.

True Negative: A test result that accurately indicates a condition is absent.

True Positive: A test result that accurately indicates a condition is present.

Variable: A variable, or data item, is any characteristic, number, or quantity that can be measured or
counted.

Validity: The extent that the instrument measures what it was designed to measure. There are multiple
types of validity, each representing a different construct.

DATA:
Data are always plural. Thus, “the data are” or “the data were” is the appropriate way of
referring to data.

There are four classifications of data: 1) nominal, 2) ordinal, 3) interval, and 4) ratio.

Nominal: Named categories with no inherent order, such as “yes or no,” boy or girl

Ordinal: Has order, but the intervals between categories are not equal. Examples include strongly agree, agree, disagree, and strongly disagree


Interval: Has rank and order, with equal intervals between values but no true zero. Examples include temperature in degrees Celsius or grouped ranges such as 1-4, 5-8, 9-12.

Ratio: Has rank, order, and equal intervals, with a true zero, so values are countable. Examples include weight, height, or age.

Data classifications will dictate the type of statistical analyses that are used in studies.

ANALYSIS TYPES:
Parametric versus Nonparametric Tests: Parametric tests are used when data are normally
distributed. Nonparametric tests are also called distribution-free tests because they do not assume that
your data follow a specific distribution. They can also be used with smaller sample sizes, and when
you want to be more conservative with your analyses. Parametric analyses test group means, whereas
nonparametric analyses test group medians.

Tests of Differences: Tests of differences are designed to measure differences in sample group means
(and variance), sample group medians, or sample proportions. The type of analysis used depends on
the assumptions of the data: characteristics about the data that need to be present before performing
selected types of inferential statistics. Parametric analyses meet statistical assumptions of 1) a
normal distribution, 2) data from multiple groups having the same variance, and 3) data having a linear
relationship. There are countless tests of differences, and these are beyond the scope of this primer.
The following table gives perspective on commonly used tests and when to use them.

Type of Test                      Parametric Version              Nonparametric Version

2-sample                          t-test                          Mann-Whitney U test

Paired 2-sample                   Paired t-test                   Wilcoxon signed-rank test

Distribution                      Chi-square (or Fisher exact)    Kolmogorov-Smirnov

> 2 samples                       ANOVA (analysis of variance)    Kruskal-Wallis

> 2 samples, several              MANOVA
dependent variables
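The first row of the table can be sketched in plain Python (standard library only). The sample values below are invented for illustration, and p values are omitted because they require distribution tables; the point is simply that the t statistic works on means while the U statistic works on rank comparisons.

```python
# Sketch of the two "2-sample" tests: the t statistic (parametric,
# compares means) and the Mann-Whitney U statistic (nonparametric,
# based on pairwise rank comparisons). All values are invented.
from statistics import mean, variance

group_a = [22, 25, 27, 30, 31, 33, 35, 38]
group_b = [18, 20, 21, 24, 26, 28, 29, 30]

# Independent 2-sample t statistic with a pooled variance.
n_a, n_b = len(group_a), len(group_b)
pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
t = (mean(group_a) - mean(group_b)) / (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5

# Mann-Whitney U: count pairs where a value from group A exceeds a
# value from group B (ties count one half).
u = sum(
    1.0 if a > b else 0.5 if a == b else 0.0
    for a in group_a for b in group_b
)

print(f"t = {t:.2f}, U = {u:.1f}")
```

In practice these statistics would be compared against their reference distributions to obtain p values.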

Tests of Association: Tests of association are used to discover if there is a relationship between two or
more variables. The type of analysis also depends on the assumptions of the data. Parametric analyses
meet statistical assumptions of 1) a normal distribution, 2) data from multiple groups have the same
variance, and 3) data have a linear relationship.


Type of Test                      Parametric Version                   Nonparametric Version

Two groups                        Pearson product-moment               Kendall tau

More than two groups              Pearson product-moment               Kendall tau (correlation matrix)
                                  (correlation matrix)

One dependent variable            Linear regression                    Logistic regression
(multiple independent variables)

Multiple dependent variables      Ologit regression (categories        Discriminant analysis (categories not
                                  are ordered)                         ordered) or multinomial regression

Tests and Metrics for Diagnostic Accuracy: Diagnostic accuracy is a form of statistical analysis
that relates to the ability of a test to discriminate between the target condition and health. Diagnostic
accuracy statistics are all derived from a 2 X 2 table that categorizes true positives (TP), true negatives
(TN), false positives (FP), and false negatives (FN).

Different measures of diagnostic accuracy relate to the different aspects of diagnostic procedure: while
some measures are used to assess the internal discriminative property of the tests, others are used to
assess the test’s ability to influence post-test decision making.

Sensitivity (internal): The proportion, by percentage, of patients who have the disease of interest who
register a positive test finding.

Specificity (internal): The proportion, by percentage, of patients who do not have the disease of
interest who register a negative test finding.

Positive Predictive Value (internal): The probability that subjects with a positive screening test truly
have the disease.

Negative Predictive Value (internal): The probability that subjects with a negative screening test truly
do not have the disease.

Positive Likelihood Ratio (post-test decision making): A positive likelihood ratio (LR+) reflects the
probability of a patient with the disease and a positive test divided by the probability of a patient
without the disease and a positive test. It is commonly used to rule in a condition.

Negative Likelihood Ratio (post-test decision making): A negative likelihood ratio (LR-) is the
probability of a person who has the disease testing negative divided by the probability of a person who
does not have the disease testing negative. It is commonly used to rule out a condition.

Accuracy (internal): The proportion of true positives and true negatives among all evaluated cases:
Accuracy = (TP + TN) / (TP + TN + FP + FN).
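All of these metrics fall out of the same 2 X 2 table. A minimal sketch, with all counts invented for illustration:

```python
# Diagnostic accuracy metrics from a hypothetical 2x2 table.
tp, fp = 45, 10   # test positive: disease present / disease absent
fn, tn = 5, 40    # test negative: disease present / disease absent

sensitivity = tp / (tp + fn)                   # positives among the diseased
specificity = tn / (tn + fp)                   # negatives among the non-diseased
ppv = tp / (tp + fp)                           # positive predictive value
npv = tn / (tn + fn)                           # negative predictive value
lr_positive = sensitivity / (1 - specificity)  # used to rule in a condition
lr_negative = (1 - sensitivity) / specificity  # used to rule out a condition
accuracy = (tp + tn) / (tp + tn + fp + fn)

print(f"Sn = {sensitivity:.2f}, Sp = {specificity:.2f}, "
      f"LR+ = {lr_positive:.2f}, LR- = {lr_negative:.3f}, Acc = {accuracy:.2f}")
```

Note that sensitivity, specificity, and the likelihood ratios depend only on the test itself, while the predictive values also shift with how common the disease is in the sample.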

STATISTICAL SIGNIFICANCE AND CLINICAL SIGNIFICANCE:


Statistical significance refers to the claim that a result from data generated by testing or
experimentation is not likely to occur randomly or by chance, but is instead likely to be attributable to a
specific cause. Clinical significance involves the practical importance of a treatment effect—whether it
has a real, genuine, palpable, noticeable effect on daily life. Clinical significance was originally anchored
to the patient’s perception but has since expanded beyond this boundary.

Statistical significance allows one to interpret the results of a study using a common metric. Until
the 1990s, only statistical significance was used to determine the utility of results. Now it is common
for discussions to include both clinical and statistical significance when evaluating the influence of the
findings. It is important also to recognize that results can vary depending on the finding. Results can
be statistically significant and clinically important; this reflects an important, meaningful difference
between the groups that the statistics also support. Results can be not statistically significant but
clinically important; this may occur if the study is underpowered, meaning the sample size is too small
to detect a real difference between groups. Results may also be statistically significant but not clinically
important; this happens in large sample sizes (as it is easier to find statistical significance) when the
differences among groups are trivial.
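The last scenario can be sketched numerically. The numbers below are invented: a between-group difference of 1 point on a 0-100 outcome scale (SD 15) would rarely matter to a patient, yet its p value shrinks below 0.05 as the sample grows, using a simple z approximation for a two-group mean difference.

```python
# How a fixed, clinically trivial difference becomes "statistically
# significant" as the per-group sample size grows (z approximation).
from math import erf, sqrt

def two_sided_p(diff, sd, n_per_group):
    """Two-sided p value for a 2-group mean difference (z approximation)."""
    se = sd * sqrt(2 / n_per_group)   # standard error of the difference
    z = diff / se
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))

for n in (20, 200, 2000):
    print(f"n per group = {n:4d}, p = {two_sided_p(1.0, 15.0, n):.3f}")
```

The difference itself never changes; only the p value does, which is why clinical significance must be judged separately.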

EFFECT SIZE:
Effect size is a name given to a family of indices (currently there are over 40 types) that measure the
strength of a treatment effect. Effect size quantifies the true magnitude of the measured intervention,
by providing a dedicated value (a numeric score) when comparing two (or more) groups. Effect
size measures allow greater precision in determining the true magnitude of the intervention when
results are not large or obvious (or statistically significant). Unlike significance tests, effect sizes are
independent of sample size and are useful singular measures when evaluating under- and overpowered
studies.


Effect size measures depend on the type of effect size used. For example, odds ratios are effect
sizes that represent a measure of association between an exposure and an outcome. The odds ratio
represents the odds that an outcome will occur given a particular exposure, compared to the odds of
the outcome occurring in the absence of that exposure. Values greater than 1.0 indicate greater odds
of the outcome with the exposure, and values below 1.0 indicate that the outcome is less likely with
the exposure. For randomized trials (comparative studies), effect size magnitudes are reported verbally
as 1) trivial, 2) small, 3) moderate, or 4) large. Most rehabilitation-based interventions provide only
small to moderate effects.
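Two common indices from this family can be sketched in plain Python; all counts and scores below are invented for illustration.

```python
# Two effect size indices: an odds ratio from a 2x2 exposure/outcome
# table, and Cohen's d for a difference between two group means.
from statistics import mean, variance

# Odds ratio: odds of the outcome with exposure vs. without exposure.
exposed_yes, exposed_no = 30, 70       # outcome present / absent, exposed
unexposed_yes, unexposed_no = 15, 85   # outcome present / absent, unexposed
odds_ratio = (exposed_yes / exposed_no) / (unexposed_yes / unexposed_no)

# Cohen's d: mean difference divided by the pooled standard deviation.
treatment = [30, 32, 35, 36, 38, 40]
control = [28, 30, 31, 33, 35, 36]
n_t, n_c = len(treatment), len(control)
pooled_var = ((n_t - 1) * variance(treatment)
              + (n_c - 1) * variance(control)) / (n_t + n_c - 2)
cohens_d = (mean(treatment) - mean(control)) / pooled_var ** 0.5

print(f"OR = {odds_ratio:.2f}, d = {cohens_d:.2f}")
```

By Cohen's conventional benchmarks, d values near 0.2, 0.5, and 0.8 are read as small, moderate, and large, which is one way the verbal magnitudes above are assigned.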

JOURNAL METRICS:
Journals are evaluated based on their influence in society. Although there is no perfect way to measure
journal influence, there are a number of measures that are used.

Journal Impact Factor reflects the number of citations made in the current year to articles in the
previous two years, divided by the total number of citable articles from the previous two years. The
5-Year Journal Impact Factor involves citations to articles from the most recent five full years, divided
by the total number of articles from the most recent five full years. “How much is this journal being
cited during the most recent five full years?”

An “h5 index” is commonly assigned to both an author and a journal. This metric is based on the
articles published by a journal over five calendar years: h is the largest number such that h articles
have each been cited at least h times. A journal with an h5-index of 43 has published, within a 5-year
period, 43 articles that each have 43 or more citations.
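The h-index rule can be sketched in a few lines of Python; the citation counts below are invented.

```python
# h-index: h is the largest number such that h articles have each been
# cited at least h times.
def h_index(citation_counts):
    ranked = sorted(citation_counts, reverse=True)
    # With counts in descending order, the condition "count >= rank"
    # holds for exactly the first h ranks.
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)

citations = [25, 8, 5, 3, 6, 2, 1, 5]  # invented per-article citation counts
print(h_index(citations))  # 5 articles with at least 5 citations each
```

Restricting the citation list to articles from the most recent five calendar years gives the h5 variant described above.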

Although better papers that are more meaningful are more common in journals that have higher impact
factors and h5 indices, the individual credibility of a paper should be evaluated for its risk of bias. Risk
of bias scales are plentiful and are specific to the study design.

BEST PRACTICE FOR ASSIMILATING RESEARCH ARTICLES TO CLINICAL PRACTICE


There is no single, simplistic method to best incorporate research articles into clinical practice. The
following bullets are designed as suggestions when consuming research findings.

1. What was the stage of the clinical trial?

2. Was/were the environment, patients, and interventions similar to those in your setting?

3. Were the findings clinically and statistically significant (see above)?

4. Was the study risk of bias low?

5. Was the study published in a reputable journal (see above)?

6. Were the findings consistent with clinical intuition/clinical sensibility?


There are four phases to clinical trials and only after the later phases can we assume with confidence
the findings are transferable. Phase I studies assess the safety of an intervention. Phase II studies test
the efficacy of the intervention in a tightly controlled environment. Phase III studies involve randomized
and blinded testing in a real world environment. Phase IV evaluates the impact of the intervention for
costs, overall long-term care, etc.

Unfortunately, similar diagnostic labels do not mean that patients are the same across studies. Patient
disease severity, co-morbidities, longevity of the condition, environment, payer source, and access
to care can markedly influence outcomes. One should look closely at the study demographics to
determine whether the study population is similar to the environment in which one practices.

Risk of bias is a systematic error, or deviation from the truth, in results or inferences. Biases can
operate in either direction: different biases can lead to underestimation or overestimation of the true
intervention effect. Differences in risks of bias can help explain variation in the results of the studies
included in a systematic review. There are several tools which can help determine the risk of bias in a
study.

Clinical sensibility is analogous to the “eye test” that is sometimes used when evaluating sports teams.
Does the finding make sense? Is the finding believable? Is the magnitude realistic? It is well known that
major intervention breakthroughs rarely have long-term continuous impact (known as the Proteus
phenomenon). Usually, when something demonstrates a very large effect, it is too good to be true.

Interested in learning more about transferring research into practice? Consider taking the Rehabilitation
Research Boot Camp through MedBridge:

https://www.medbridgeeducation.com/certificate_programs/14972-rehabilitation-research-boot-camp
