
Biostatistics: Hypothesis Testing

9/29/2011 9:56:00 AM

1. Describe the null and alternative hypotheses for a given study design.

Null hypothesis (H0) - the hypothesis to be tested, typically a statement of "no difference/effect/association"; this is the hypothesis that is assumed true unless the test produces sufficient evidence to the contrary.

Alternative hypothesis (H1 or Ha) - the hypothesis that contradicts the null hypothesis. The alternative hypothesis can be two-sided or one-sided; this should be decided before beginning analysis of the data.
- Two-sided (≠): used when the study is interested in any deviation from the null. More commonly used than one-sided.
- One-sided (<, >): used if prior evidence or the nature of the study directs the focus toward a deviation in just one direction. A one-sided test increases the power to detect a significant effect in that direction, but does not account for deviations in the other direction at all. Thus, deciding to use a one-sided test requires careful consideration.

OK: A newly developed drug is considerably cheaper than the option currently on the market. The researcher may choose a one-tailed test that will strongly detect whether the new drug is LESS effective. The other tail need not be addressed, because being more effective or merely equally effective are both fine: the main selling point of the new drug in this case is its affordability.

Not OK: A researcher wishes to demonstrate that a new drug is more effective than the current one, and chooses a one-tailed test in order to maximize detection of improvement in the data. However, this approach would fail to account for the possibility that the new drug is less effective.

2. Define and interpret p-values.

What the p-value IS: the probability of observing a result/difference as extreme as, or more extreme than, the one observed in the studied sample(s), assuming that the null hypothesis is true.

What the p-value is NOT: the probability that the null hypothesis is true.
(See #6 in the practice problem set for this lecture.) How do you obtain the p-value?

After calculating the test statistic (t or chi-squared), refer to a table or computer program that provides the corresponding p-value. Graphically speaking, the p-value is the area under the t or chi-squared distribution curve (probability density function) that lies beyond the calculated test statistic.
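A minimal stdlib-only Python sketch of that lookup, using the normal approximation to the t distribution (reasonable for large samples; a statistics table or program gives exact t and chi-squared tail areas):

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value: the area in both tails beyond |z| under the
    standard normal curve, which approximates the t distribution well
    when the sample is reasonably large."""
    return math.erfc(abs(z) / math.sqrt(2))

# A test statistic of 1.96 sits right at the conventional 5% cutoff.
print(round(two_sided_p_from_z(1.96), 3))  # → 0.05
print(round(two_sided_p_from_z(3.29), 4))  # → 0.001, "very highly significant"
```

Larger test statistics leave less area in the tails, hence smaller p-values.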

(For more on interpretation of p-values, see objective #4 regarding statistical significance.)

3. Recognize appropriate hypothesis tests for means (i.e., continuous variables).

- Paired samples: matched pairs (based on age, ethnicity, sex, etc.) or individual subjects "paired" with themselves; use the paired t-test.
- Two independent samples: e.g., subjects randomly assigned to one of two treatment groups; use the 2-sample independent t-test.
- More than two groups: use ANOVA (ANalysis Of VAriance) models.

4. Interpret statements of statistical significance.

General guidelines (for the typical significance level alpha = .05):
- p <= .001: very highly significant
- .001 < p <= .01: highly significant
- .01 < p <= .05: significant
- p > .05: non-significant (.05 < p <= .10 is sometimes described as borderline significance)

Consider the typical wording of a conclusion statement in a research article: "The data demonstrate that there is a statistically significant difference between X and Y (p-value provided)." Avoid absolute statements (see practice problems #7 and #8) and always keep the definition of the p-value in mind.

5. Distinguish between the statistical significance of a result and its importance in clinical application.

Statistical significance: the test results are unlikely to have occurred by pure chance.
Clinical significance: the test results indicate a difference/improvement/disparity that is meaningful from a healthcare standpoint.

A result can be statistically significant but clinically insignificant (ex., a drug that performs statistically significantly better than a placebo, but doesn't make enough of a difference in terms of treatment to be of any use in the clinical setting), or it can be statistically insignificant but still have potential for clinical significance (ex., the statistical insignificance was due to insufficient sample sizes; requires further investigation).

6. Recognize when proportions are appropriate summary statistics.

Proportions are useful when dealing with experiments with two outcomes (success vs. failure, yes vs. no, etc.). The sample proportion can be used to estimate the population proportion (probability) of a given outcome.

7. Compute and interpret proportions and associated confidence intervals.

(For the equations used in these calculations, see the handout or lecture slides.)

Interpretation of the sample estimate of a population proportion: "We estimate that X% of [the population in question] will have [outcome of interest]." –or– "The estimated probability of [outcome of interest] is X%."

Interpretation of a 95% confidence interval (z = 1.96) for a proportion: "We are 95% confident that the true proportion of [outcome of interest] in [the population in question] is between [lower limit of confidence interval] and [upper limit of confidence interval]."
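The proportion-plus-CI recipe above can be sketched in Python; the 45-out-of-180 sample is invented for illustration:

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Sample proportion and its normal-approximation confidence
    interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - z * se, p_hat + z * se

# Hypothetical sample: 45 of 180 students show the outcome of interest.
p, lo, hi = proportion_ci(45, 180)
print(f"Estimate {p:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")  # → Estimate 0.25, 95% CI (0.19, 0.31)
```

The interpretation would read: we are 95% confident the true proportion lies between the two limits printed.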

8. Understand the two-by-two table format for displaying categorical data.

Can test for association between variable A and variable B using a 2 x 2 chi-squared test that compares the observed values to the expected values computed under the assumption that the variables are independent (i.e., assuming the null hypothesis of no association).

9. Distinguish between when to use a t-test and when to use a chi-squared test.
- t-test: use to compare a continuous outcome variable between groups (involves means)
- chi-squared test: use to compare a categorical outcome variable between groups (involves proportions)

10. Interpret p-values and statements of statistical significance for chi-squared tests.

Follow the same interpretation guidelines as for t-tests. (See objectives #2 and #4.)
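A small Python sketch of the 2 x 2 chi-squared computation just described; the table counts are made up:

```python
def chi_squared_2x2(a, b, c, d):
    """Chi-squared statistic for a 2x2 table [[a, b], [c, d]]:
    sum of (observed - expected)^2 / expected, where the expected
    counts assume the null hypothesis of no association."""
    n = a + b + c + d
    observed = [a, b, c, d]
    expected = [(a + b) * (a + c) / n, (a + b) * (b + d) / n,
                (c + d) * (a + c) / n, (c + d) * (b + d) / n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical table: 30/70 diseased/healthy among exposed, 20/80 among unexposed.
stat = chi_squared_2x2(30, 70, 20, 80)
print(round(stat, 3))  # → 2.667; compare to the 1-df cutoff of 3.84 at alpha = .05
```

Here the statistic falls below 3.84, so this invented table would not reach significance at the .05 level.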

Correlation and Power


1. Describe the advantages and disadvantages of contingency table procedures versus correlation procedures for assessing associations between continuous variables.

Contingency tables are easy to interpret, can be easily stratified, and make no distributional assumptions. However, there may be some loss of information due to arbitrary grouping. Correlation procedures maintain the continuity of the data and model one variable as a function of another, but they can only measure linear relationships and are only useful when both variables are continuous.

Contingency table (2 x 2)
- Advantages: ease of interpretation; no distributional assumptions; easy to stratify by other variables; can calculate OR and RR
- Disadvantage: arbitrary grouping of continuous data (loss of information)

Correlation and regression
- Advantages: maintains the continuity of the data; models one variable as a function of another
- Disadvantages: only measures linear relationships; only useful when both variables are continuous

2. Understand a scatter plot for displaying the relationship between two variables and use it to distinguish between positive, negative, and zero correlation. When observing data in a scatter plot, the correlation can be visualized with a linear regression (best-fit) line. The slope of that line has the same sign as the correlation, and for standardized variables it equals the correlation between the two variables; a flat line corresponds to zero correlation.
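The correlation behind such a scatter plot can be computed directly; a minimal Python sketch with invented data points lying near a rising line:

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient: the sum of cross-deviations divided by
    the product of the root sums of squared deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up points close to a straight rising line -> r near +1.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(xs, ys)
print(round(r, 3), round(r ** 2, 3))  # r, then the coefficient of determination
```

Flipping the trend (y falling as x rises) would push r toward -1, and a shapeless cloud toward 0.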

3. Interpret the correlation coefficient (r) and the coefficient of determination (r squared).

The correlation coefficient represents how linearly correlated two variables are. It can range from -1 (perfectly negatively correlated) to +1 (perfectly positively correlated). The coefficient of determination is the square of the correlation coefficient, and represents the proportion of the variation in one variable that can be explained by its linear relationship with the other.

4. Interpret p-values and statements of statistical significance with regard to the correlation coefficient.

A linear relationship between two variables cannot be considered statistically significant unless the accompanying p-value is sufficiently small.

5. Distinguish between possible conclusions that can be drawn from a correlation coefficient.

Correlation does NOT imply causation. That is to say, even if we observe a statistically significant correlation, it only implies that the relationship between the variables MAY reflect a causal relationship. You must always be aware of extraneous variables. (NO2 exposure vs. FEV1 example.)

6. Distinguish between a simple regression equation and a multiple regression equation.

A simple regression equation models a dependent variable (Y) as a function of a single independent variable (X); think y = mx + b. Multiple linear regression models the dependent variable (Y) against multiple independent variables (X1, X2, X3, ...). An example is a model of carotid intima-media thickness against several factors including age, height, BMI, etc.
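A minimal Python sketch of the simple (y = mx + b) case with invented data; `fit_line` is a hypothetical helper, not anything from the lecture:

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    b0 = my - b1 * mx
    return b0, b1

# Hypothetical data lying exactly on y = 2x + 1; predictions are only
# justified inside the observed x range (here 1..5), not beyond it.
b0, b1 = fit_line([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(b0, b1)          # → 1.0 2.0 (intercept, slope)
print(b0 + b1 * 3.5)   # prediction at x = 3.5 → 8.0
```

The slope printed here reads exactly as objective #8 below describes: each one-unit increase in x adds 2.0 to the predicted y.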

7. Use a regression equation to make predictions and understand when these predictions are valid.

A regression equation can be used to make predictions by simply plugging in the hypothetical x value and seeing what y value you get. This is only valid within the confines of the study (the domain of x values observed); extrapolation is not justified because we don't know how the relationship between the variables behaves beyond these values.

8. Interpret slope coefficients in a regression equation.

The slope coefficient represents how much of a change in Y we expect to see for a one-unit increase in X. The y-intercept represents the value of the dependent variable (y) when the independent variable (x) is zero.

9. Contrast linear vs. logistic regression and simple vs. multiple regression.

(Multiple regression is covered in objective #6.) Linear regression is exactly what we've been describing up until now. Logistic regression is essentially the same thing, except the dependent variable becomes a dummy variable with only two values, 0 and 1 (often 0 = not diseased, 1 = diseased). Logistic regression is primarily used to determine whether X is a risk factor for Y.

10. Understand the difference between type 1 and type 2 errors.

A type 1 error is rejecting the null hypothesis when the null is in fact true; α = the probability of a type 1 error (α is also known as the significance level). A type 2 error is failing to reject the null when it should in fact be rejected; β = the probability of a type 2 error, and 1-β is known as the power of the study. Power is the chance of detecting a difference between treatments if the difference truly exists.

11. Identify the information needed before an appropriate sample size can be determined.

1. Whether the hypothesis is one-sided or two-sided
2. The significance level, α
3. The minimum difference in values you wish to detect, Δ
4. The power level you want, 1-β
5. An estimate of the standard deviation of the data you will collect
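Those five ingredients plug into the standard approximate formula for comparing two means, n per group = 2(z_α + z_β)²σ²/Δ². A Python sketch with hypothetical inputs (the SD of 10 and difference of 5 are invented):

```python
import math

def n_per_group(z_alpha, z_beta, sigma, delta):
    """Approximate sample size per group for a two-sample comparison
    of means: n = 2 * (z_alpha + z_beta)^2 * sigma^2 / delta^2,
    rounded up to a whole subject."""
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)

# Two-sided alpha = .05 (z = 1.96), power = 80% (z = 0.84),
# SD guess of 10, minimum difference worth detecting of 5.
print(n_per_group(1.96, 0.84, 10, 5))    # → 63 per group
print(n_per_group(1.96, 0.84, 10, 2.5))  # → 251: halving delta roughly quadruples n
```

The second call illustrates objective #13 below: shrinking the detectable difference inflates the required sample size quadratically.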

12. Understand the concept of statistical power and the importance of adequate sample size in testing hypotheses.

Increasing the sample size always increases power, and power is the chance you will discover an effect if the effect truly exists. If you have a new drug, you are much more likely to determine whether it truly works better than the old drug if you test it on thousands of people instead of dozens.

13. Recognize the effects on power, and the sample size needed to achieve the same power, when there are specific changes in the significance level, detectable difference, standard deviation of the outcome, and sidedness of the alternative hypothesis.

- When the significance level (α) increases, the required sample size decreases (and power at a fixed sample size increases).
- When the minimum detectable difference increases, the required sample size decreases.
- When the standard deviation decreases, the required sample size decreases.
- A one-sided test requires a smaller sample size than a two-sided test for the same power.

Epidemiology 1


Prevalence
The proportion of people who have the disease at a given point in time = (Diseased)/(Total Population).
Example: There are 180 medical students in class today. 18 have cholera. Therefore, prevalence = 18/180 = 0.1 of the population is sick.
Units: number of sick people over total population.

Cumulative Incidence
The proportion of cases that develop over a specified period of time.
Example: There are 180 medical students. In the next year, 36 develop anality retentiveness. Therefore, the cumulative incidence of anality retentiveness is (36 new cases)/(180 total people) = 0.2.
Note: the denominator must be restricted to the study population at risk (i.e., free of the disease at the start).
Example: There are 180 medical students. 120 of them already have anality retentiveness. In the next year, 15 develop anality retentiveness. Therefore, the cumulative incidence in this population is (15 new cases)/(180 - 120 students without the disorder) = 15/60 = 0.25.
Units: number of new cases over population without the disorder, over a certain amount of time.

Incidence Rate
The rate at which individuals develop a disease in a population. Person-time is the total amount of time all subjects contribute to the study.
Example: 120 students are followed for 1 year, 30 students are followed for 2 years, and 50 students are followed for 3 years. Therefore, total person-time = 120(1 year) + 30(2 years) + 50(3 years) = 330 person-years.
When calculating person-time, the people who get sick are considered to have contributed only half the amount of time.

Example: Say you have a study spanning one year and 10 people get sick. On average, half of these people will have gotten sick before the halfway point and half after, so they average out: each sick person contributes an average of 0.5 person-years. Therefore, their total person-time = 10(0.5 years) = 5 person-years.
Units: number of new cases over population without the disorder over a certain amount of time, or number of new cases per person-time.

Mortality Rate
A measure of the rate at which individuals in a population die over a specific amount of time.
Example: 200 people are followed over the next year. 30 of them die. Applying the half-time convention above, mortality rate = 30/[170(1 year) + 30(0.5 years)] = 30/185 person-years.

Cause-Specific Rates
The number of new events of a specific cause over a specified amount of time.
Example: 180 medical students are followed over 2 years. 60 develop cholera. The cause-specific rate is 60 new cases/[120(2 years) + 60(1 year)] = 60/300 person-years.

Sex-Specific Rates
Event rates within each sex.
Example: In a class of 180 medical students, 75 are women and 105 are men. Over the next year, 10 women develop cholera and 20 men develop cholera. Therefore:
- Sex-specific rate for women = 10/[65(1 year) + 10(0.5 years)] = 10/70 person-years.
- Sex-specific rate for men = 20/[85(1 year) + 20(0.5 years)] = 20/95 person-years.

Race-Specific Rates
Same as sex-specific rates, but stratified by race instead of sex.

Case Fatality Rate (not an actual rate)
The proportion of people who die from a specific disease.

Example: There are 180 medical students. 50 develop anality retentiveness. 40 die from it. The case fatality rate = 40/50 = 0.8.

Proportional Mortality
The proportion of deaths attributed to a specific cause.
Example: A medical class had 100 deaths in 2008. 40 were caused by anality retentiveness. The proportional mortality of anality retentiveness in 2008 = 40/100 = 40%.

Crude Rates vs. Age-Adjusted Rates
Crude rates are computed over the whole population without regard to its age structure, while age-adjusted rates stratify by age (applying age-specific rates to a standard population). See PM 4-8: the age-adjusted rates for 1990 are lower than for 1960, but the crude rate is higher. Simpson's Paradox!

Annual Rate
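The crude-vs-adjusted reversal can be reproduced with a toy Python sketch. All populations and rates here are invented; only the direction of the reversal matters:

```python
def crude_rate(strata):
    """Crude rate: total deaths / total population, ignoring age mix."""
    deaths = sum(pop * rate for pop, rate in strata)
    return deaths / sum(pop for pop, _ in strata)

def adjusted_rate(strata, standard):
    """Direct age adjustment: apply each age-specific rate to a fixed
    standard population so the age mix no longer drives the comparison."""
    return (sum(std * rate for std, (_, rate) in zip(standard, strata))
            / sum(standard))

# Invented numbers: every age-specific death rate FELL between the two
# years, but the later population is much older.
y1960 = [(900, 0.002), (100, 0.020)]  # (population, rate): young, old
y1990 = [(300, 0.001), (700, 0.015)]
standard = [500, 500]                 # 50/50 standard population

print(crude_rate(y1960), crude_rate(y1990))                            # crude rises
print(adjusted_rate(y1960, standard), adjusted_rate(y1990, standard))  # adjusted falls
```

The crude rate climbs (0.0038 to 0.0108) purely because of aging, while the adjusted rate drops (0.011 to 0.008) — the same flavor of reversal the PM 4-8 figure shows.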

Miscellaneous
Incidence rates are more helpful when studying the etiology of a disease: an incidence rate can show a change in causative factors or the effect of a preventive program. Prevalence is more helpful when planning public health programs. A change in prevalence could reflect a change in the incidence rate, a change in disease duration, or a change in the immigration/emigration rate of sick people.

Relationships
- Cumulative Incidence ≈ Incidence Rate × Time (for a rare disease over a short period)
- Prevalence ≈ Incidence Rate × Average Duration of Disease (in a steady state)
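The definitions above, sketched as tiny Python helpers using the numbers from these notes (the 33-case incidence figure is invented):

```python
def prevalence(diseased, total):
    """Proportion diseased at a given point in time."""
    return diseased / total

def cumulative_incidence(new_cases, at_risk):
    """New cases over a period / population initially free of disease."""
    return new_cases / at_risk

def incidence_rate(new_cases, person_years):
    """New cases per person-time at risk."""
    return new_cases / person_years

print(prevalence(18, 180))                  # → 0.1 (18 of 180 students sick)
print(cumulative_incidence(15, 180 - 120))  # → 0.25 (only the 60 at risk count)

# Person-time from the notes: 120 followed 1 yr, 30 for 2 yrs, 50 for 3 yrs.
pt = 120 * 1 + 30 * 2 + 50 * 3
print(pt)                                   # → 330 person-years
print(incidence_rate(33, pt))               # → 0.1 cases per person-year, if 33 cases arose
```

Note how the cumulative-incidence denominator excludes the 120 students who already had the disorder.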

Study Designs


1. State the "exposure-disease" hypothesis when given a description of a study.

Exposure (E) = the independent variable. Disease (D) = the dependent variable; may also be called "outcome." Basic hypothesis: "If exposed, then disease."

2. Recognize possible explanations for an observed association; e.g., recognize that any association observed in a hypothesis-testing study might be causal, or due to chance, confounding, or bias.

Questions to ask when evaluating possibilities besides causality:
- Chance: What is the p-value and/or alpha value? What is the probability that the observed association could have occurred by pure chance when there is no actual relationship? Is it possible that we are rejecting the null when it is true (type I error)?
- Confounding: Are there other variables that could be in play besides E and D? What other factors might be connected to E? What else might cause or contribute to D?
- Bias (ex. selection bias, information bias): Are there any issues with the study design (ex. sampling technique, reliability of reported data, etc.) that might affect the results?

3. Describe the criteria used for judging whether an association is causal.

1. Temporal relationship – E precedes onset of D
2. Strength of association – stronger associations are more likely to be causal; look for a high, statistically significant relative risk (RR) value
3. Dose-response relationship – the strength of the association shows a relationship with the level/dose of exposure
4. Consistency – the same association is observed in multiple different types of studies and in different populations
5. Biological plausibility – a reasonable explanation exists for how E → D; experimental evidence
6. Consideration of alternate explanations – other possibilities (see objective #2) have been controlled for or otherwise ruled out
7. Cessation of exposure – the rate of D decreases after E is eliminated/reduced; similar to the dose-response relationship

4. Understand the differences between descriptive and analytic studies.

Descriptive studies describe patterns of disease occurrence relative to characteristics of person, place, and time; they can be used to identify correlations and formulate hypotheses, but do not have the capacity to test them, and do not examine a specific exposure-disease association.

Analytic studies focus on and test hypotheses regarding exposure-disease associations.
*Note: Testing does not necessarily mean experimental or intervention studies (actively assigning E to subjects in order to observe results); observational studies are considered analytic too, in that they are also designed to find evidence of causality for a particular E-D association.
- Cross-sectional studies could be considered either descriptive or analytic.

5. List the major types of descriptive studies, and describe the advantages and limitations of descriptive study designs.

Case reports, case series (PERSON) – give information on a single patient or group of patients (usually for unusual/rare conditions)
- Pros: the most basic type of descriptive study – easy, inexpensive; early warning – brings attention to new/rare conditions; can help generate hypotheses
- Cons: no comparison groups – cannot test hypotheses; deals with a single or small number of cases (which might be exceptions, not a representative sample), so limited generalizability

Ecological studies (correlational studies) (PLACE) – document disease occurrence in relation to specific population characteristics (measures of exposure for the population as a whole, i.e., averages); E and D data are on populations, not individuals (usually based on geographic regions)
- Pros: also relatively easy and inexpensive – the data are usually already available; can work with data that is only available/possible to report as a population-based measure (ex. air pollution in a given city); can help identify potential risk factors for a disease and generate hypotheses applicable to individuals
- Cons: no way to control for confounding factors (if group data on these factors is not available) – cannot test hypotheses; beware of ecological fallacies – group data cannot distinguish whether the individuals with disease were in fact the ones who were exposed, so no E → D conclusion can be formed

Studies of disease frequency (PERSON, PLACE, and TIME) – examine change in disease/death frequency with respect to personal characteristics (age, gender, race, etc.), location (geographic area, urban or rural, etc.), and time
- Pros: can help identify potential exposures that are causing differing disease rates in specific populations, locations, or time periods; generate hypotheses
- Cons: again, cannot test hypotheses from these studies alone

6. For each of the major types of descriptive and analytic study designs: (a) describe in general terms the basic process for conducting the study; (b) distinguish between designs from a brief description.

Analytic studies – Observational (non-experimental; the three C's):
- Cross-sectional: take a sample from a specified population, then ascertain both E and D status at the same time for each subject in the sample.
- Cohort: first identify E or non-E status in subjects without D, then follow them to observe onset of D. (Can be retrospective if information on E status is obtained from past records.)
- Case-control: first identify cases and non-cases of new/incident D, then ascertain E status (current or past).

Analytic studies – Intervention (experimental):
- Intervention: randomly assign E to subjects, then observe the effect on D. Can be E1 vs. E2, or E vs. placebo/control.

For the types of descriptive studies, refer to objective #5 above.

Analytic Study Design


For each of the major epidemiologic study designs be able to:
- Identify the type of study design from a brief description of the study
- Recognize the major advantages and disadvantages inherent in the design
- Recognize situations in which each would be the most appropriate study design
- Describe which measures of association are commonly used with each design

Cross-sectional
- Pros: 2nd cheapest and quick; generates hypotheses; results can be generalized to the population
- Cons: temporality bias; prevalence bias; selection bias; hard to find subjects with rare exposures or diseases
- When to use: when you don't know time or don't have it; when you are observing a common E and O in a defined population
- Measures: prevalence; odds ratio; relative risk
- How it differs: lack of a time variable; no %s (AR%, PAR%); common E and O

Cohort
- Pros: no temporality issues; no ethical issues; rare exposures can be followed; it's okay if we get multiple outcomes; more robust to selection bias; less likely that results will bias our study; we can find incidence rates!
- Cons: expensive; not good for rare disease outcomes (though a rare exposure is fine); validity threats (information bias, confounding, and loss to follow-up can screw things up); nonparticipation (many will be ineligible for a rare-disease study, which makes the study difficult to generalize)
- When to use: when you just want the incidence of O; when E is rare
- Measures: absolute risk; relative risk; AR%; PAR%. You can't get an odds ratio.
- How it differs: you can control time! (prospective vs. retrospective)

Case-control
- Pros: cheapest; rare diseases can be studied; multiple exposures can be investigated simultaneously; efficient; the OR can be used to estimate relative risk
- Cons: validity threats (information bias, selection bias, prevalence bias, confounding, and temporal relationship bias)
- Measures: odds ratio; AR%; PAR%
- How it differs: you can't get relative risk directly. At all.

Intervention
- Pros: can "prove" causality; no temporal bias; no selection bias; multiple outcomes can be studied
- Cons: expensive; not always feasible; not always ethical; outcomes can be limited by time, money, etc.; validity threats (placebo effect, information bias, loss to follow-up, noncompliance, confounding)
- When to use: you assign the exposure and monitor through time to see the outcome
- Measures: incidence rate; relative risk; AR and AR%; PAR and PAR%
- How it differs: incidence rates can be measured directly (unlike case-control)
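The cohort measures named above (relative risk, AR%, PAR%) can be sketched in Python from a 2 x 2 cohort table; the counts are hypothetical:

```python
def cohort_measures(a, b, c, d):
    """Measures of association from a cohort 2x2 table:
    a = exposed with disease, b = exposed without,
    c = unexposed with disease, d = unexposed without."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_total = (a + c) / (a + b + c + d)
    rr = risk_exposed / risk_unexposed                         # relative risk
    ar_pct = 100 * (risk_exposed - risk_unexposed) / risk_exposed  # AR% among exposed
    par_pct = 100 * (risk_total - risk_unexposed) / risk_total     # PAR% in the population
    return rr, ar_pct, par_pct

# Hypothetical cohort: 40/100 exposed and 10/100 unexposed develop disease.
rr, ar_pct, par_pct = cohort_measures(40, 60, 10, 90)
print(rr, ar_pct, par_pct)  # ≈ 4.0, 75.0, 60.0
```

Reading it off: exposed subjects have 4 times the risk, 75% of their risk is attributable to the exposure, and 60% of the whole cohort's risk would vanish if the exposure were removed.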

Describe the differences between cohort studies: "specific exposure" vs. "general population" cohort studies, and "prospective" vs. "retrospective" cohort studies.

Specific exposure vs. general population
- Specific exposure: subjects are chosen to represent a specific exposure.
- General population: subjects are chosen to represent the population, which has a wide range of exposures.
- The difference: one exposure versus several.
- Examples (which is which?):
  - You are chosen for a study interested in whether taking the MCAT is related to the average stress level of all current graduate school students (exposure: MCAT, disease: stress level) over the course of 4 years.
  - You are chosen for a study interested in whether exposure to cadavers, hospitals, and libraries leads to the development of any stress-related diseases.

Prospective vs. retrospective
- Prospective: subjects are chosen who are exposed, and the investigator follows the cohort into the future to see if the exposure leads to disease.
- Retrospective: subjects are chosen whose exposure (and possibly disease) has already occurred; the investigator looks back using existing records.
- The difference: in one, the disease has not yet appeared when the study begins; in the other, the exposure information already exists in the past.
- Examples (which is which?):
  - You choose subjects for a study on lung cancer based on their exposure to asbestos before retirement.
  - You choose subjects for a study on lung cancer based on their current exposure to asbestos.

Explain the purpose, process, and effects of matching in a case-control study.
- Purpose: to prevent confounding (the reason why we do anything in statistics, really).
- Process: pair each case with a control matched on the chosen control variables (ex. age, race). Matching on more than a few variables quickly becomes difficult.
- Effect: by eliminating as many confounders as possible, you increase the chance that an observed relationship between exposure and disease reflects a real association rather than the influence of a confounding variable (think back to t-tests).
- Example: a case-control study of a new diabetes medication's efficacy in lowering blood glucose levels, compared to the current market's top-grossing diabetes medication, in which the control group and case group are matched on race, age, and sex. Your turn!

Describe the process of conducting and advantages of a nested case-control study, and recognize examples from brief descriptions of studies.
- Nested case-control: when members of a cohort develop the disease of interest, you select members of the same cohort who have not (yet) developed the disease to be their controls. So basically you create a case-control comparison inside a cohort study.
- How we do this: we separate the diseased from the non-diseased, match each person with the disease to a control "partner" in the non-diseased group, and then look back at the information we've already gathered on the cohort members to determine the exposure that preceded the disease.
- Example: Nurses' Health Study. If a group of nurses in this study cohort developed alcoholism due to exposure to over 200 trauma cases per year, we can examine this relation by making the non-alcoholic nurses the controls and the alcoholic nurses the cases. Your turn!

Explain why an incidence rate ratio cannot be calculated directly in a case-control study.
- The controls selected for the study are only a small, investigator-chosen fraction of the non-diseased population, so the person-time denominators needed for incidence rates are unknown.

Explain why the odds ratio is a good estimator of the RR in a case-control study.

The odds ratio is a good estimator because of the nature of the case-control study: subjects are sampled based on their disease (outcome) status rather than their exposure status, so we can compare the odds of exposure among cases and controls even though we cannot estimate the probability of disease itself. Example (which is totally not mine — it's from http://www.childrensmercy.org/stats/journal/oddsratio.asp and super brilliant): Consider a case-control study of prostate cancer risk and male pattern balding. The goal of this research was to examine whether men with certain hair patterns were at greater risk of prostate cancer. In that study, roughly equal numbers of prostate cancer patients and controls were selected. Among the cancer patients, 72 out of 129 had either vertex or frontal baldness, compared to 82 out of 139 among the controls (see table below).

         Cancer cases   Controls   Total
Balding       72            82      154
Hairy         55            57      112
Total        129           139      268
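A quick Python sketch of the odds-ratio arithmetic for this table:

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a case-control 2x2 table:
    a = cases exposed, b = controls exposed,
    c = cases unexposed, d = controls unexposed."""
    return (a * d) / (b * c)

# Balding as the "exposure": 72/55 among cases, 82/57 among controls.
or_hat = odds_ratio(72, 82, 55, 57)
print(round(or_hat, 2))  # → 0.91
```

An odds ratio near 1 (here slightly below it) suggests little or no association between balding and prostate cancer in these data.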

So from this table you can compute the odds ratio, (72 × 57)/(82 × 55) ≈ 0.91, but you cannot estimate the probability of cancer for bald patients. This is because the sampling was based on the outcome: there are plenty of bald people out there who don't have cancer, and that large group is clearly not represented in the data we have here. So you would need additional information, or a different type of research design, to estimate the relative risk of prostate cancer for patients with different types of male pattern balding. You can always calculate the odds ratio in a case-control study, and it is a good estimate of the relative risk as long as the outcome event is rare.

Recognize the purpose for randomization in intervention studies.
Ex: A study on 1st graders' height growth wants to see the association between drinking milk and increases in height. The plan is to give the control group one extra cup of water at lunch every day, and the exposure group one extra cup of milk at lunch every day.

The researchers notice that some of the children are definitely smaller than the others... and they worry that maybe those children aren't getting enough nutrition at home. So they decide to put all the smaller children in the milk-drinking group.

What could potentially happen to these smaller children that were not randomly assigned to the exposure group? What could potentially happen to the study's results as a whole?
- The small kids grow a lot → the study claims that kids have dramatic height growth when they drink milk → everyone makes their kids drink milk → many kids and parents are disappointed when they don't grow much.
- The small kids don't grow much, they just get fat → the study finds not enough information to associate milk and height, but instead reports an association between weight gain and milk → everyone stops their kids from drinking milk → many kids and parents are disappointed when they don't grow much.

And that's why you randomize your intervention studies.