Biostatistics
1. Research Study Designs
2. Bias and Study Errors
3. Risk Quantification
4. Diagnostic Tests
5. Statistical Distributions
6. Statistical Testing
REVIEW OUTLINE
Biostatistics: Research Study Designs
1. Intro to Research Design
A. Overview
B. Features of Research Design
C. Types of Research Studies
2. Descriptive Studies - Observational I
A. Overview
B. Case Report/Case Series
C. Cross Sectional Studies
D. Ecological Studies
3. Longitudinal Studies - Observational II
A. Overview
B. Case Control Studies
C. Cohort Studies
4. Heritability Studies - Observational III
A. Overview
B. Twin Concordance Studies
C. Adoption Studies
5. Experimental Studies
A. Clinical Trials
B. Crossover Studies
C. Drug Trial Phases
6. Evaluating Research Studies
A. Overview
B. Bradford Hill Criteria
7. Practice Question
(§) Bootcamp.com
Biostatistics: Research Study Designs Bootcamp.com
Adoption Studies
• Adopted children compared to biologic family and adopted family
o Biologic family —> Similar genes, different environment
o Adopted family —> Different genes, similar environment
Experimental Studies
Overview
• Researcher manipulates the independent variable
• Can assess causality
Clinical Trials
• Experimental studies on humans
• Compare benefits of 2 or more treatments, or a treatment and a placebo
• Randomized: Participants assigned to experimental groups randomly
• Controlled: Other potentially relevant variables are accounted for in randomization
• Blinded: Participants involved in the study are unaware of group assignment
o Double-Blinded: Participants and researchers conducting the study
o Triple-Blinded: Participants, researchers conducting the study, and data analysts
Crossover Studies
• Patients serve as their own controls
• Can ↓ confounding bias
[Figure: crossover study diagram; labels: Cause, Effect, Washout]
Test Your Knowledge (Question ID: 1001)
A research group is studying whether or not living close to the highway increases risk of children developing asthma. They
identify 500 children from ages 2 to 8 living in housing located within 1000 feet of a major highway, and 500 children from ages
2 to 8 living in housing located at least 1 mile from a major highway. 5 years later, the researchers check the children’s clinical
records and determine how many in each group have been diagnosed with asthma. What kind of study design best describes
this study?
A. Case-control study
B. Cross-sectional study
C. Prospective cohort study
D. Randomized clinical trial
E. Case series
F. Retrospective cohort study
G. Crossover study
H. Ecological study
Answer: C. Prospective cohort study
How each of the other designs would look for this research question:
A. Case-control study - Select 500 children with asthma and 500 without, and see if they grew up next to the highway
B. Cross-sectional study - Select 1000 children living in a specific city and assess asthma diagnosis and proximity to highway for each child
D. Randomized clinical trial - Select 1000 children and force half of them to live near the highway, then assess asthma diagnosis in five years
E. Case series - Write a detailed report about ten children diagnosed with asthma
F. Retrospective cohort study - Select 500 teenagers who lived near the highway as children and 500 who did not, and assess their asthma diagnosis
G. Crossover study - Not feasible since two treatments are not being compared
H. Ecological study - Measure average proximity to highway and rates of asthma diagnosis in an entire population of children in two cities
REVIEW OUTLINE
Bias and Study Errors
C. Random vs. Systematic Error
2. Recruitment Bias
A. Overview
B. Sampling Bias
C. Attrition Bias
3. Performing Bias
A. Overview
B. Recall Bias
C. Measurement Bias
D. Procedure Bias
E. Observer Bias
4. Interpretation Bias
A. Overview
B. Confounding Bias
C. Lead Time Bias
D. Length Time Bias
5. Practice Question
Biostatistics: Bias and Study Errors Bootcamp.com
• Error: Difference between the true value and measured result
o Error ≠ mistake
Accuracy vs. Precision:
• Accuracy: Proximity of a measurement to the true value
o Also referred to as validity
• Precision: Variation between the measurement values
o Also referred to as reproducibility or reliability
Random vs. Systematic Error:
• Random Error: ↓ precision leads to measured values that differ from the truth
o Unpredictable
• Systematic Error: ↓ accuracy leads to measured values that differ from the truth
o Commonly referred to as bias
o Predictable
• True Value = Measured Value + Random Error + Systematic Error
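The difference between random and systematic error can be sketched with a small simulation. This is only an illustration: the true value, error sizes, and function name are hypothetical, and the sign convention simply rearranges the identity above as Measured = True - Random - Systematic.

```python
import random
import statistics

def simulate_measurements(true_value, systematic_error, random_sd, n, seed=0):
    """Return n simulated readings using Measured = True - Random - Systematic."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        true_value - rng.gauss(0.0, random_sd) - systematic_error
        for _ in range(n)
    ]

TRUE_VALUE = 100.0  # hypothetical true value

# Imprecise but unbiased: large random error, no systematic error.
noisy = simulate_measurements(TRUE_VALUE, systematic_error=0.0, random_sd=5.0, n=10_000)
# Precise but biased: small random error, constant systematic error of 3 units.
biased = simulate_measurements(TRUE_VALUE, systematic_error=3.0, random_sd=0.5, n=10_000)

for name, readings in (("noisy", noisy), ("biased", biased)):
    offset = statistics.mean(readings) - TRUE_VALUE  # accuracy: distance of the average from the truth
    spread = statistics.stdev(readings)              # precision: variation between readings
    print(f"{name}: mean offset = {offset:+.2f}, spread = {spread:.2f}")
```

Random error averages out over many readings but widens the spread, whereas systematic error shifts every reading in the same predictable direction.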
■ If this were confounding, risk of blood clots: OCP + smoking = smoking alone > OCP alone
Lead Time Bias:
• Earlier detection incorrectly interpreted as longer survival
• Occurs when survival time is chosen as an endpoint
o Example: A new screening test detects colon cancer earlier, but disease course has not changed
Length Time Bias:
• More aggressive diseases more likely to be diagnosed based on symptoms
• Less aggressive diseases more likely to be diagnosed based on screening tests
o Example: A new screening test disproportionately detects less aggressive breast cancer
Test Your Knowledge (Question ID: 1002)
A research group has developed a new treatment for pancreatic cancer. They are interested in seeing if their new treatment is
more effective than currently available options. They recruit 100 participants who were recently diagnosed with pancreatic
cancer at one of five outpatient clinics and randomly assign half to receive their new treatment, and half to receive the current
standard of care. Some of the demographic information of the patient groups is shown in Table 1. Neither the patients nor their
care teams are aware of the group assignments. The researchers then compare 5-year survival rates between the treatment
groups. What kind of bias is most likely affecting this study?
A. Recall bias
B. Observer bias
C. Lead-time bias
D. Confounding bias
E. Berkson’s bias
F. Misclassification bias
Table 1:
                   Experiment Group   Control Group
Mean Age           60.3               62.5
Tobacco Use        37                 34
Tumor Location:
  Head             23                 20
  Tail             27                 30
Answer: D. Confounding bias
Risk Quantification
C. Incidence
2. Interpreting Prevalence and Incidence
A. Overview
B. Interpreting Population Changes
C. Interpreting Disease State
3. Morbidity Frequency Measures - Practice Question
4. Mortality Frequency Measures
A. Overview
B. Mortality Rate
C. Case Fatality Rate
5. Relative Risk
A. Overview
B. Calculating Relative Risk
C. Interpreting Relative Risk
6. Odds Ratio
A. Overview
B. Calculating Odds Ratio
C. Interpreting Odds Ratio
7. Odds Ratio and Relative Risk - Practice Question 1
Relative Risk
A. Overview
B. Relative Risk Reduction
C. Absolute Risk Reduction
D. Attributable Risk
E. Number Needed to Treat
F. Number Needed to Harm
10. Additional Calculations with Relative Risk - Practice Question
Biostatistics: Risk Quantification Bootcamp.com
A local public health agency is studying atherosclerosis in their town’s adult population. They discover that in the year 2020,
8,500 of the 35,000 adults living in the town are currently diagnosed with atherosclerosis. In 2021, they find that 9,100 adults
living in the town now have a diagnosis of atherosclerosis. They decide to run a series of public service announcements on
healthy diet changes to prevent the development of atherosclerosis.
A. 2.3%
B. 26%
C. 13%
D. 24%
E. 7.1%
Answer: A. 2.3%
Test Your Knowledge (Question ID: 1003, Item 2 of 2)
A local public health agency is studying atherosclerosis in their town’s adult population. They discover that in the year 2020,
8,500 of the 35,000 adults living in the town are currently diagnosed with atherosclerosis. In 2021, they find that 9,100 adults
living in the town now have a diagnosis of atherosclerosis. They decide to run a series of public service announcements on
healthy diet changes to prevent the development of atherosclerosis.
What are the most likely changes to the prevalence and incidence of atherosclerosis in the town in 2022, assuming the public
health agency's interventions are effective?
Mortality Rate:
• Cumulative incidence of death in entire population
• Number of deaths in a population in a given time interval / Total number of people in population
• Example: In 2000, 10 people in a town of 1000 die. The mortality rate is 1%.
Relative Risk
Overview:
• Longitudinal studies assess association between exposure + outcome
• Need to quantify strength of association
• Risk: Probability of an event occurring (outcome of interest / all possible outcomes in population at risk)
o Example: 5,000 people who take aspirin and 5,000 people who do not are recruited for a study
■ 300 people have a first-time heart attack during the 10 years they are followed
■ Overall risk = 300/10,000 = 3%
• Risk (and RR) only calculated from longitudinal cohort studies (prospective or retrospective) HIGH YIELD
o Need to know the # people at risk of developing the outcome
o In a case-control study, participants are selected by outcome status, so the # people at risk is unknown
Calculating Relative Risk:
• Compares the risk of an event occurring based on exposure
• RR = Risk(Exposed) / Risk(Not Exposed)
o Risk(Exposed) = (# Exposed people w/ outcome) / (# People w/ exposure)
■ Example: 120 heart attacks among 5,000 people taking aspirin → Risk = 0.024
o Risk(Not Exposed) = (# Unexposed people w/ outcome) / (# People w/out exposure)
■ Example: 180 heart attacks among 5,000 people not taking aspirin → Risk = 0.036
o Example: RR = 0.024 / 0.036 = 0.667
• RR = [a/(a+b)] / [c/(c+d)] HIGH YIELD
o a, b = exposed people w/ and w/out the outcome; c, d = unexposed people w/ and w/out the outcome
Interpreting Relative Risk:
• RR = 1 → No difference in risk based on exposure
• RR > 1 → Exposure ↑ risk
• RR < 1 → Exposure ↓ risk
o Example: People taking aspirin are 0.667 times as likely to have a heart attack
■ In other words, people not taking aspirin are 1.5x as likely to have a heart attack
• RR ≠ 1 does not imply causation
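The aspirin example can be checked with a short sketch. The function name and the a/b/c/d argument layout are illustrative assumptions that follow the usual 2x2 convention (a, b = exposed with and without the outcome; c, d = unexposed with and without the outcome).

```python
def relative_risk(a, b, c, d):
    """RR = [a/(a+b)] / [c/(c+d)] for a 2x2 table from a cohort study."""
    risk_exposed = a / (a + b)        # risk among people with the exposure
    risk_not_exposed = c / (c + d)    # risk among people without the exposure
    return risk_exposed / risk_not_exposed

# Aspirin example: 120 of 5,000 exposed and 180 of 5,000 unexposed have a heart attack.
rr = relative_risk(a=120, b=5000 - 120, c=180, d=5000 - 180)
print(f"RR = {rr:.3f}")                   # aspirin users relative to non-users
print(f"Inverse RR = {1 / rr:.1f}")       # non-users relative to aspirin users
```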
A research group recruits 2,000 participants with skin cancer and 2,000 participants without skin cancer. Each participant fills
out a survey about their time spent in the sun, their use of sunscreen, and their use of UV tanning beds. They find an
association between skin cancer and use of UV tanning beds. These results are shown below.
Which of the following measures of association are they most likely to report in their published work?
A. Relative risk
B. Number needed to harm
C. Incidence rate
D. Pearson correlation coefficient
E. Odds ratio
Skin Cancer No Skin Cancer
Answer: E. Odds ratio
Test Your Knowledge (Question ID: 1004)
A research group recruits 2,000 participants with skin cancer and 2,000 participants without skin cancer. Each participant fills
out a survey about their time spent in the sun as well as their use of sunscreen and UV tanning beds. They find an association
between skin cancer and use of UV tanning beds. These results are shown below.
Which of the following calculations represents the odds ratio for this study?
A. (576/(576+133))/(1424/(1424+1867))
B. 1/((576/(576+133))/(1424/(1424+1867)))
C. (576+1424)/(576+133+1424+1867)
D. (576/1424)/(133/1867)
E. (576 × 1424)/(133 × 1867)
Skin Cancer No Skin Cancer
Answer: D. (576/1424)/(133/1867)
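The odds ratio for the tanning bed question can be checked numerically. This is a minimal sketch that assumes the counts in the answer options map onto the 2x2 table as follows: 576 of the 2,000 participants with skin cancer and 133 of the 2,000 without reported tanning bed use. The function and argument names are illustrative.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls

# Assumed table: cases = skin cancer, controls = no skin cancer, exposure = tanning bed use.
tanning_or = odds_ratio(
    exposed_cases=576, unexposed_cases=2000 - 576,
    exposed_controls=133, unexposed_controls=2000 - 133,
)
# The same value written as a cross-product (ad / bc).
cross_product = (576 * 1867) / (133 * 1424)

print(f"OR = {tanning_or:.2f}")
```

An odds ratio is the appropriate measure here because participants were recruited by outcome status, so risk cannot be calculated directly.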
The contingency table from a cohort study on the association between eating a high fat diet and developing gallstones is
shown below. Which of the following is the most accurate interpretation of the results from this study?
A. Eating a high fat diet is associated with increased risk of developing gallstones, because the relative risk is greater than 0
B. Eating a high fat diet is associated with decreased risk of developing gallstones, because the relative risk is less than 1
C. Eating a high fat diet is associated with increased risk of developing gallstones, because the odds ratio is greater than 0
D. Eating a high fat diet is associated with decreased risk of developing gallstones, because the odds ratio is greater than 1
Gallstones
+ - Sum
Answer: B. Eating a high fat diet is associated with decreased risk of developing gallstones, because the relative risk is less than 1
300 patients with rectal cancer undergoing treatment with chemotherapy are recruited into a clinical study assessing the
effectiveness of a new drug designed to reduce chemotherapy-induced side effects. Half of the patients are assigned to
receive the new drug, and half are assigned to receive a placebo. Among the patients in the new drug group, 68 participants
report experiencing side effects following chemotherapy. Among the patients in the placebo group, 103 patients report
experiencing side effects.
What is the relative risk reduction for side effects among patients receiving the new drug?
A. 0.782
B. 0.660
C. 0.204
D. 0.340
Answer: D. 0.340
Test Your Knowledge (Question ID: 1006)
300 patients with rectal cancer undergoing treatment with chemotherapy are recruited into a clinical study assessing the
effectiveness of a new drug designed to reduce chemotherapy-induced side effects. Half of the patients are assigned to
receive the new drug, and half are assigned to receive a placebo. Among the patients in the new drug group, 68 participants
report experiencing side effects following chemotherapy. Among the patients in the placebo group, 103 patients report
experiencing side effects.
What is the number of patients needed to treat for one patient to have their side effects improved by the new drug?
A. 2
B. 7
C. 5
D. 13
Answer: C. 5
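Both answers to the chemotherapy side effect question follow from the two group risks. The sketch below uses the standard definitions RRR = 1 - RR, ARR = Risk(control) - Risk(treated), and NNT = 1 / ARR rounded up; these formulas are assumed here because the corresponding slide is not part of this text, and the function name is illustrative.

```python
import math

def risk_reduction_summary(events_treated, n_treated, events_control, n_control):
    """Return RR, RRR, ARR, and NNT for a two-arm trial."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    rr = risk_treated / risk_control
    arr = risk_control - risk_treated
    return {
        "RR": rr,
        "RRR": 1 - rr,
        "ARR": arr,
        "NNT": math.ceil(1 / arr),  # round up to a whole number of patients
    }

# 300 patients split evenly: 68 of 150 on the new drug and 103 of 150 on placebo report side effects.
summary = risk_reduction_summary(68, 150, 103, 150)
for name, value in summary.items():
    print(name, round(value, 3))
```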
REVIEW OUTLINE
Biostatistics: Diagnostic Tests
1. Sensitivity and Specificity
A. Overview
B. Sensitivity
C. Specificity
2. Positive and Negative Predictive Values
A. Overview
B. Positive Predictive Values
C. Negative Predictive Values
3. Diagnostic Test Thresholds
A. Overview
B. Widening the Inclusion Criteria
C. Narrowing the Inclusion Criteria
D. Receiver Operating Characteristic Curves
4. Sensitivity, Specificity, PPV, and NPV - Practice Question
5. Likelihood Ratio
A. Overview
B. LR+
C. LR-
6. Likelihood Ratio - Practice Question
Biostatistics: Diagnostic Tests Bootcamp.com
Sensitivity:
• True-positive rate
• Probability that a person who has the condition will test positive
• Sensitivity = TP/(TP+FN) HIGH YIELD
• 1 - Sensitivity = False-negative rate
• ↑ Sensitivity = ↓ FN → Important for screening
• Mnemonic: Sn-N-Out
• Example: Of 16 pregnant people, 13 test positive. Sensitivity = 0.81.
Specificity:
• True-negative rate
• Probability that person who does not have the condition will test negative
• Specificity = TN/(TN+FP) HIGH YIELD
• 1 - Specificity = False-positive rate
• ↑ Specificity = ↓ FP → Important for diagnostic confirmation
• Mnemonic: Sp-P-In
• Example: Of 16 non-pregnant people, 15 test negative. Specificity = 0.94.
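The two pregnancy test examples can be reproduced with a short sketch. The function names are illustrative; the counts come from the examples above (13 true positives and 3 false negatives among 16 pregnant people, 15 true negatives and 1 false positive among 16 non-pregnant people).

```python
def sensitivity(tp, fn):
    """True-positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True-negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)

sens = sensitivity(tp=13, fn=16 - 13)
spec = specificity(tn=15, fp=16 - 15)

print(f"Sensitivity = {sens:.4f}, false-negative rate = {1 - sens:.4f}")
print(f"Specificity = {spec:.4f}, false-positive rate = {1 - spec:.4f}")
```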
Researchers are studying the use of prostate-specific antigen as a screening test for prostate cancer. Study participants with
and without a prostate cancer diagnosis as confirmed by transrectal ultrasound-guided biopsy undergo prostate-specific
antigen screening. A prostate-specific antigen concentration of 4 ng/mL qualifies as a positive test result. The results are
shown in the table below.
Which of the following statements correctly describes the effects of changing the definition of a positive test result to 3 ng/mL?
A. The negative predictive value will increase.
B. The number of false positives will decrease.
C. The sensitivity will decrease.
D. The positive predictive value will increase.
E. The specificity will increase.
             PSA > 4 ng/mL   PSA < 4 ng/mL
Biopsy +     118             379
Biopsy -     47              456
Answer: A. The negative predictive value will increase.
Likelihood Ratio
Overview:
• Pre-test probability: Estimated probability of patient condition before test result is known
o Pre-test odds: Pre-test probability / (1 - Pre-test probability)
• Post-test probability: Estimated probability of patient condition after test result is known
o Post-test odds: Post-test probability / (1 - Post-test probability)
• Likelihood ratio is useful for assessing the clinical utility of a test
• Compares likelihood that a person with a specific test result has or does not have the condition
o Calculated individually for both positive and negative test results
• Intrinsic to the test (not affected by prevalence)
• Pre-test odds x Likelihood ratio = Post-test odds
LR+:
• For a person that tests +, compares likelihood that the person does or does not have the condition
• LR+ = True-positive rate / False-positive rate
o LR+ > 1 → Person who has condition more likely to test + than person who does not
o LR+ > 10 → Test is highly specific
• Example: The TP rate of a pregnancy test is 0.81 and the FP rate is 0.06. LR+ = 13.5
o Person who is pregnant is 13.5x as likely to have a + test compared to person who is not
LR-:
• For a person that tests -, compares likelihood that the person does or does not have the condition
• LR- = False-negative rate / True-negative rate
o LR- < 1 → Person who does not have condition more likely to test - than someone who does
o LR- < 0.1 → Test is highly sensitive
• Example: The FN rate of a pregnancy test is 0.19 and the TN rate is 0.94. LR- = 0.2
o Person who is pregnant is 0.2x as likely to have a - test compared to person who is not
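Both likelihood ratios follow directly from sensitivity and specificity, since the false-positive rate is 1 - specificity and the false-negative rate is 1 - sensitivity. A minimal sketch using the pregnancy test figures above (the function name is illustrative):

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)   # TP rate / FP rate
    lr_neg = (1 - sensitivity) / specificity   # FN rate / TN rate
    return lr_pos, lr_neg

# Pregnancy test example: TP rate 0.81, TN rate 0.94.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.81, specificity=0.94)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.1f}")
```

Because only sensitivity and specificity enter the calculation, the result does not change with disease prevalence.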
Test Your Knowledge (Question ID: 1008)
A 14-year-old male presents to his primary care physician in January with fevers, myalgias, and rhinorrhea for the past two
days. Several of his teammates on the baseball team have had similar symptoms in the past week. The physician estimates
that this patient’s pre-test odds of influenza are 2.33. He then orders a rapid antigen influenza test known to have a specificity of
0.90 and a sensitivity of 0.78. The test comes back positive for influenza. What are the patient’s post-test odds of having
influenza?
A. 90.00
B. 18.17
C. 10.13
D. 2.02
E. 7.80
Answer: B. 18.17
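The influenza question applies the relationship Pre-test odds x Likelihood ratio = Post-test odds with LR+ for a positive result. A short worked sketch follows; the helper names are illustrative, and the conversion from odds back to probability uses the standard identity probability = odds / (1 + odds).

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = true-positive rate / false-positive rate."""
    return sensitivity / (1 - specificity)

def odds_to_probability(odds):
    """Convert odds to probability: p = odds / (1 + odds)."""
    return odds / (1 + odds)

pre_test_odds = 2.33
lr_pos = positive_likelihood_ratio(sensitivity=0.78, specificity=0.90)
post_test_odds = pre_test_odds * lr_pos

print(f"LR+ = {lr_pos:.2f}")
print(f"Post-test odds = {post_test_odds:.2f}")
print(f"Post-test probability = {odds_to_probability(post_test_odds):.3f}")
```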
REVIEW OUTLINE
Biostatistics: Statistical Distributions
1. Measures of Central Tendency
A. Overview
B. Mean
C. Median
D. Mode
2. Measures of Dispersion
A. Overview
B. Range
C. Standard Deviation
D. Variance
E. Standard Error of the Mean
3. Normal and Non-normal Distributions
A. Overview
B. Normal Curve
C. Non-normal Distributions
4. Statistical Distributions - Practice Question 1
5. Statistical Distributions - Practice Question 2
Biostatistics: Statistical Distributions Bootcamp.com
Mean:
• Advantages:
o Accounts for all values in a dataset
• Disadvantages:
o Most affected by outliers
Median:
• Middle value of a dataset sorted from least to greatest
o Average of the two middle values if dataset has even # of values
• Advantages:
o Minimally affected by outliers
• Disadvantages:
o Ignores all but the middle value of a dataset
Mode:
• Most common value
• Advantages:
o Least affected by outliers
o Can be used for categorical data
• Disadvantages:
o Multiple or no mode values may occur
o May not be representative
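The effect of an outlier on each measure can be seen with Python's standard `statistics` module. The dataset below is hypothetical and chosen only to include one repeated value and one extreme value.

```python
import statistics

values = [2, 3, 3, 5, 40]  # hypothetical dataset; 40 is an outlier

mean_value = statistics.mean(values)      # uses every value, so the outlier pulls it upward
median_value = statistics.median(values)  # middle value of the sorted data
modes = statistics.multimode(values)      # list of the most common value(s)

print(f"mean = {mean_value}, median = {median_value}, modes = {modes}")

# Dropping the outlier changes the mean substantially but leaves median and mode unchanged.
trimmed = values[:-1]
print(f"without outlier: mean = {statistics.mean(trimmed)}, median = {statistics.median(trimmed)}")
```

`multimode` returns a list, which covers the case noted above where more than one mode occurs.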
Measures of Dispersion
Overview:
• Describe how much values in a dataset differ from the average
Range:
• Range = Largest value - Smallest value
• Sensitive to outliers
• Example: The final exam scores in a class are 83%, 90%, 78%, 94%, and 74%.
o The range is 94 - 74 = 20
Standard Deviation:
• Measures the typical difference between each value and the mean (square root of the variance)
• Denoted by σ or SD
• σ = √[(Σ[x - μ]²) / n]
• ↑ σ = More variability in a dataset
• For a dataset that follows a normal distribution: HIGH YIELD
o Mean +/- 1 σ → 68% of the sample
o Mean +/- 2 σ → 95% of the sample
o Mean +/- 3 σ → 99.7% of the sample
• Example: The mean is 83.8 and the SD is 7.39
Variance:
• Calculates the average of the squared differences between each value and the mean
• Denoted by σ²
• σ² = (Σ[x - μ]²) / n
• Example: The variance is 7.39² = 54.6
Standard Error of the Mean:
• Estimates the difference between a sample’s mean and the true population mean
• SEM = σ/√n
• SEM ↑ when σ ↑ or n ↓
• Example: The SEM is 3.305
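The running exam score example can be recomputed with the standard `statistics` module. This sketch uses the population formulas (dividing by n), which is what reproduces the SD of 7.39 quoted above; the SEM comes out marginally lower than 3.305 because it is computed from the unrounded SD.

```python
import math
import statistics

scores = [83, 90, 78, 94, 74]  # final exam scores (%) from the range example

score_range = max(scores) - min(scores)
mean_score = statistics.mean(scores)
variance = statistics.pvariance(scores)  # population variance: sum of squared deviations / n
sd = statistics.pstdev(scores)           # population SD: square root of the variance
sem = sd / math.sqrt(len(scores))        # standard error of the mean

print(f"range = {score_range}, mean = {mean_score:.1f}")
print(f"variance = {variance:.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```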
A graduate student is studying how much time adults living in her town spend exercising each week. She recruits a sample of
2,000 adults between the ages of 18 and 65, and asks them to self-report how much time they spent exercising each week for
three months. She then calculates the average amount of time exercising per week for each individual. She finds that the
mean amount of time spent exercising per week for the entire sample is 2.33 hours, and the variance is 0.43 hours squared.
Assuming this sample is normally distributed, approximately how many people in the sample spend fewer than 1 hour per
week exercising?
A. 3
B. 100
C. 320
D. 50
E. 640
Answer: D. 50
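The exercise question can be worked through in two ways: with the 68/95/99.7 rule, which is what the answer choices are built around, and with the normal cumulative distribution function as a cross-check. The sketch assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n_people = 2000
mean_hours = 2.33
variance = 0.43

sd = math.sqrt(variance)          # SD is the square root of the variance
z = (1.0 - mean_hours) / sd       # 1 hour sits roughly 2 SD below the mean

# Empirical rule: 95% lie within 2 SD, so about 2.5% fall below mean - 2 SD.
rule_estimate = n_people * (1 - 0.95) / 2

# Cross-check with the normal CDF evaluated at exactly 1 hour.
cdf_estimate = n_people * NormalDist(mean_hours, sd).cdf(1.0)

print(f"SD = {sd:.2f} hours, z = {z:.2f}")
print(f"empirical rule: {rule_estimate:.0f} people, normal CDF: {cdf_estimate:.0f} people")
```

The CDF gives a slightly smaller count than the empirical rule because 1 hour is a little more than 2 SD below the mean and the 95% figure is itself a rounded value.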
Test Your Knowledge (Question ID: 1010)
A hospital administration team is collecting data on inpatient stay durations at a local hospital. The results of their study are
shown in the graph below.
Biostatistics: Statistical Testing
1. Hypothesis Testing
A. Overview
B. Null Hypothesis
C. Alternative Hypothesis
D. Interpreting Hypothesis Testing Results
2. Testing Errors
A. Overview
Hypothesis Testing
Overview:
• Hypothesis: Proposed explanation based on limited evidence
o Developed prior to start of the study
o Example: ↑ Brain natriuretic peptide (BNP) levels are associated with heart failure (HF)
• Hypothesis Testing: Analysis of whether or not hypothesis is supported
o Conducted after obtaining study results from a sample
o Begins with assumption that hypothesis is wrong
o Calculates the likelihood of obtaining those results in a different sample
Null Hypothesis:
• No relationship/difference exists between variables
• Represented by the symbol H0
• Example: H0: BNP levels of people w/ HF = BNP levels of people w/out HF
Alternative Hypothesis:
• Relationship/difference exists between variables
• Represented by the symbol H1
• Example: H1 = BNP levels of people w/ HF > BNP levels of people w/out HF
Interpreting Hypothesis Testing Results:
• p-value: Probability of obtaining results equal to or more extreme than those observed, assuming H0 is true
o Calculated based on known theoretical distributions
o ↑ p-value → ↑ Likelihood of obtaining these results if H0 is true
• Arbitrarily chosen cut-off value represented by α
Testing Errors
Overview:
• Results of hypothesis testing from sample may not reflect truth in population
• Null hypothesis (H0): No relationship exists between the variables
• Alternative hypothesis (H1): A relationship exists between the variables
• Hypothesis testing —> Reject Ho or fail to reject Ho
Type I Errors:
• Incorrectly rejecting H0 (high yield)
• False positive
• Example: Stating BNP levels are ↑ in people w/ HF when they are not
• Alpha: Probability of making a Type I Error
o Pre-set value (often 0.05)
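To make the meaning of alpha concrete, the following is a minimal simulation sketch. It assumes a two-sided z-test with a known standard deviation; the sample size, standard deviation, and number of simulated studies are hypothetical values chosen only for illustration. Every simulated sample comes from a population where H0 is true, so each rejection is a Type I error.

```python
import random
from statistics import NormalDist, mean

random.seed(0)                                  # fixed seed so the run is repeatable
alpha = 0.05                                    # pre-set Type I error rate
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-sided critical value (about 1.96)

# Hypothetical settings: sample size, known SD, number of simulated studies.
n, sigma, trials = 30, 1.0, 2000

rejections = 0
for _ in range(trials):
    # Draw a sample from a population where H0 is true (true mean = 0).
    sample = [random.gauss(0, sigma) for _ in range(n)]
    z = mean(sample) / (sigma / n ** 0.5)       # z statistic for H0: mean = 0
    if abs(z) > z_crit:
        rejections += 1                         # H0 rejected although true: Type I error

type_1_rate = rejections / trials
print(f"Observed Type I error rate: {type_1_rate:.3f} (alpha = {alpha})")
```

Across many simulated studies the rejection rate should land close to the pre-set alpha, which is what "probability of making a Type I Error" means in practice.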
Type II Errors:
• Incorrectly failing to reject H0 (high yield)
• False negative
• Example: Stating BNP levels are not ↑ in people w/ HF when they are
• Beta (β): Probability of making a Type II Error
• Power: Probability of correctly rejecting H0
o Power = 1 - β
o ↑ Power by
■ ↑ α
■ ↑ Sample size
■ ↑ Expected effect size
■ ↑ Precision of measurement
The Truth in the Population
                     H1                         H0
Fail to Reject H0    Type II Error (β)          Correct Decision (1 - α)
Reject H0            Correct Decision (1 - β)   Type I Error (α)
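The four ways of raising power listed above can be checked numerically. This is a sketch under stated assumptions: a one-sided, one-sample z-test with a known standard deviation, with "precision of measurement" modeled as a smaller standard deviation. The function name and all numeric inputs are hypothetical.

```python
from statistics import NormalDist

def power_one_sided_z(effect, sd, n, alpha):
    """Power of a one-sided one-sample z-test when the true mean shift is `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)       # rejection threshold under H0
    shift = effect / (sd / n ** 0.5)             # true effect in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)    # power = 1 - beta

# Hypothetical baseline study, then one change at a time.
baseline = power_one_sided_z(effect=5, sd=20, n=50, alpha=0.05)
more_alpha = power_one_sided_z(effect=5, sd=20, n=50, alpha=0.10)
more_n = power_one_sided_z(effect=5, sd=20, n=100, alpha=0.05)
more_effect = power_one_sided_z(effect=10, sd=20, n=50, alpha=0.05)
more_precise = power_one_sided_z(effect=5, sd=10, n=50, alpha=0.05)

for label, value in [
    ("baseline", baseline),
    ("higher alpha", more_alpha),
    ("larger sample", more_n),
    ("larger effect size", more_effect),
    ("more precise measurement", more_precise),
]:
    print(f"{label:>26}: power = {value:.2f}")
```

Each single change raises power relative to the baseline, at the cost noted in the list: a higher alpha also raises the Type I error rate.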
Biostatistics: Statistical Testing Bootcamp.com
A research team hypothesizes that childhood exposure to pets decreases the chances of developing seasonal allergies later in
life. After conducting a power analysis to determine the necessary sample size for a statistical power of 80%, they recruit 1,300
people with seasonal allergies and 1,300 people without seasonal allergies and determine whether or not the participants had
pets in their home between the ages of 0 and 5. They find that 274 of the people with allergies had childhood pets, and 318 of
the people without allergies had childhood pets.
Which of the following statistical tests would be most appropriate for analyzing this dataset?
A. Regression analysis
B. Chi-square test
C. Paired t-test
D. ANOVA
E. Correlation coefficient
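Both variables here are categorical (allergy status and childhood pet exposure), which is the setting for a chi-square test. The sketch below computes the Pearson chi-square statistic by hand from the counts in the stem, assuming no continuity correction; the closed-form p-value applies only to 1 degree of freedom.

```python
from math import erfc, sqrt

# Observed counts from the question stem.
# Rows = allergy status, columns = childhood pets (yes, no).
observed = [
    [274, 1300 - 274],   # seasonal allergies
    [318, 1300 - 318],   # no seasonal allergies
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if allergy status and pet exposure were independent.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

# For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = erfc(sqrt(chi2 / 2))

print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

The statistic is compared against a chi-square distribution with (rows - 1) × (columns - 1) = 1 degree of freedom.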
A chi-square analysis of the data using an α level of 0.05 gives a p-value of 0.04. Assuming there truly is an association
between the two variables, which of the following statements is the most accurate interpretation of these results?
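The decision logic for this item can be sketched as follows, using only the alpha and p-value stated in the question. The variable names are illustrative, and the outcome labels follow the standard Type I / Type II error definitions.

```python
alpha = 0.05                      # significance level from the question
p_value = 0.04                    # p-value from the question
association_truly_exists = True   # assumption stated in the question

reject_h0 = p_value < alpha       # reject H0 when p falls below the cut-off

if reject_h0 and association_truly_exists:
    outcome = "correct decision (true positive); its probability is the power, 1 - beta"
elif reject_h0:
    outcome = "Type I error (false positive)"
elif association_truly_exists:
    outcome = "Type II error (false negative)"
else:
    outcome = "correct decision (true negative)"

print(f"Reject H0: {reject_h0} -> {outcome}")
```

Because p is below alpha, H0 is rejected, and under the stated assumption that the association is real this rejection is a correct decision rather than an error.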
A researcher is conducting a study to determine the effectiveness of a new antihypertensive medication. 36 participants are
recruited for the study and randomly divided into Groups 1 and 2. Group 1 participants take a placebo for one month, then take
no medication for two weeks, then take the new medication for one month. Group 2 participants take the new medication for
one month, then take a placebo for one month following the two-week washout period. The patients are unaware of their group
assignments. The researcher then subtracts each participant’s systolic blood pressure while taking the new medication from
their systolic blood pressure while taking the placebo. The average difference in the systolic blood pressure is 15 mmHg, and
the standard deviation is 27.4 mmHg.
Which of the following statements best describes the 95% confidence interval for these results?
A. 95% of the participants had a decrease in systolic blood pressure between 6 mmHg and 24 mmHg
B. Raising the confidence level to 99% would increase the interval width
C. Since the confidence interval does not include 0, the effect of the medication is not statistically significant
D. The average difference is expected to fall between 10.4 and 19.6 mmHg in 95% of independent samples
E. A confidence interval cannot be determined for this type of study
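The interval in this question can be reproduced from the numbers in the stem. This is a minimal sketch that assumes the normal approximation (z = 1.96 for 95%) on the paired differences; a t-based interval would be slightly wider at n = 36. The function name is illustrative.

```python
from statistics import NormalDist

n = 36              # participants, each serving as their own control
mean_diff = 15.0    # mean paired difference in systolic BP (mmHg)
sd_diff = 27.4      # standard deviation of the paired differences (mmHg)

standard_error = sd_diff / n ** 0.5   # SE of the mean = SD / sqrt(n)

def confidence_interval(level):
    """Normal-approximation confidence interval for the mean difference."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * standard_error
    return mean_diff - margin, mean_diff + margin

ci_95 = confidence_interval(0.95)
ci_99 = confidence_interval(0.99)

print(f"SE = {standard_error:.2f} mmHg")
print(f"95% CI: {ci_95[0]:.1f} to {ci_95[1]:.1f} mmHg")
print(f"99% CI: {ci_99[0]:.1f} to {ci_99[1]:.1f} mmHg")
```

The 95% interval is about 6 to 24 mmHg and describes the mean difference, not the range of individual participants. The 99% interval is wider, and neither includes 0.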
A researcher is conducting a study to determine the effectiveness of a new antihypertensive medication. 36 participants are
recruited for the study and randomly divided into Groups 1 and 2. Group 1 participants take a placebo for one month, then take
no medication for two weeks, then take the new medication for one month. Group 2 participants take the new medication for
one month, then take no medication for two weeks, then take a placebo for one month. The patients are unaware of their
group assignments. The researcher then calculates the difference between each participant’s systolic blood pressure while
taking the new medication and while taking the placebo. The average difference in the systolic blood pressure is 15 mmHg,
and the standard deviation is 27.4 mmHg.
Similar studies were conducted at 5 other hospitals, and the results were pooled for a meta-analysis. Which of the following
statements is the most appropriate conclusion to draw from the meta-analysis?