
FCH Finals

Biostatistics - scientific discipline concerned with the application of statistical methods to problems in biology or medicine

Application of Statistics
– Public Health Statistics – for planning and monitoring of health status
– Vital Statistics – data related to events (birth, marriage, death)
– Health Statistics – health status of individual or community
– Hospital Statistics

Aids the researcher in:
1. Designing a research project
2. Processing, organizing, and summarizing research data
3. Quantifying variability
4. Interpreting results and drawing valid conclusions

Nature of Statistical Data
– Expressed numerically
– Treated as a mass or group of observations
– Subject to variation

Definitions
– Population: total enumeration of all elements to which the research is to be applied
– Target Population: the complete collection of elements that is of interest to the investigators
– Sampled Population: the collection of elements from which the sample was actually taken
– Sampling frame: the list of elements in the sampled population

Branches of Statistics

Descriptive Statistics – A set of statistical techniques whose main objective is to summarize and present data in a form that will make them easier to analyze and interpret

Inferential Statistics – A branch of Statistics concerned with making estimates, predictions, generalizations and conclusions about a target population based on information from a sample
– The foundation of the concept of inference is the theory of probability
– Estimation, ex:
  o Point estimate
  o Interval estimate
– Hypothesis Testing
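To make the difference between a point estimate and an interval estimate concrete, here is a minimal sketch in Python. The hemoglobin values are hypothetical, the function name `mean_with_ci` is made up for illustration, and the interval uses a normal approximation (an assumption; for a sample this small a t-based interval would be somewhat wider).

```python
import math
import statistics


def mean_with_ci(sample, confidence=0.95):
    """Return (point estimate, lower bound, upper bound) for a population mean.

    Assumption: normal approximation to the sampling distribution of the mean.
    """
    n = len(sample)
    point = statistics.mean(sample)                      # point estimate
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return point, point - z * standard_error, point + z * standard_error


# Hypothetical hemoglobin levels (g/dL) from a sample of 8 patients.
hemoglobin = [13.1, 12.4, 14.0, 13.6, 12.9, 13.3, 14.2, 12.7]

point, lower, upper = mean_with_ci(hemoglobin)
print(f"point estimate: {point:.2f}")
print(f"95% interval estimate: {lower:.2f} to {upper:.2f}")
```

The point estimate is a single number (the sample mean), while the interval estimate gives a range of plausible values for the population mean.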

Uses of Statistics
– A data reduction technique
– A tool for objective appraisal and evaluation
– A tool in the decision-making process

SBCM FCH II | Finals Reviewer Part 1


– Sampling units: Elements of the population that were sampled
– Observation units: Elements from which measurements were derived
– Sampling allows researchers to collect more accurate data of greater scope at a reduced cost and at greater speed.
  o More efficient / more feasible
– Probability sampling methods – random selection, with reduced bias or partiality
  o Representativeness is assumed
  o Best approach for clinical trials
– Random sampling: the process of choosing a random sample from the target population
– Random allocation: the method of assigning the sample into different treatment groups in an experimental study. Also termed “randomization”
– Variation: the tendency of a measurable characteristic to change
  o From one individual or setting to another
  o Within the same individual or setting at different periods of time
– Statistics is necessary to analyze variability
– Variables: measurement or a characteristic
  o Values that change within a category, or values vary from one individual to another or within the same individual at different periods of time
  o Examples: Age, gender, weight, height
– Constants: Non-changing, usually applicable to the physical sciences
  o Examples: speed of light, π, minutes in an hour

Types of Variables: Qualitative vs. Quantitative Variables
– Qualitative: Sex, Religion, Occupation, Residence, Educational Level, Disease Status
  o Merely labels to distinguish one group from another
  o Numerical representation only for coding / labeling and not for comparison
– Quantitative: Age, Height, Weight, Hemoglobin level
  o Values indicate a quantity or amount and can be expressed numerically
  o Values can be arranged according to magnitude
  o Examples: Age, weight, height, BP, WBC Count, # of students

Levels of Measurement
  o Nominal
  o Ordinal
  o Interval
  o Ratio
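Random sampling and random allocation, defined earlier in this part, are easy to confuse, so a short Python sketch may help. Everything here is hypothetical: the patient IDs, the frame size of 100, the sample size of 20, and the two group names are assumptions chosen only for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: the list of elements in the sampled population.
sampling_frame = [f"patient_{i:03d}" for i in range(1, 101)]

# Random sampling: decide WHO enters the study.
sample = rng.sample(sampling_frame, k=20)

# Random allocation (randomization): decide WHICH GROUP each sampled subject joins.
shuffled = sample[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
groups = {"treatment": shuffled[:half], "control": shuffled[half:]}

print(len(sample), len(groups["treatment"]), len(groups["control"]))
```

Sampling works on the sampling frame; allocation works only on the subjects already sampled.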

Nominal Scale
– Categories used as labels only
– Numbers / Names represent a set of mutually exclusive and exhaustive classes to which individuals or objects may be assigned
– Ex: Sex (M, F), Blood Type (A, B, AB, O), Patient ID, Disease Type

Ordinal Scale
– Categories used as labels (like nominal)
– Can be ordered or ranked
– However, the distance between two categories cannot be clearly quantified
– Ex: Psychosocial scales (Strongly agree, agree, disagree, strongly disagree), Age Groups (Infant, child, teenager, adult)

Interval Scale
– Same characteristics as ordinal scales
– Distances between all adjacent classes are equal
– Scales are infinite
– Zero point is arbitrary and does not mean the absence of the characteristic
– Ex: Calendar Time, Temperature, IQ

Ratio Scale
– Same as for interval scales with the additional feature that a meaningful zero point exists
– Ex: Weight, Height, Blood Pressure, Number of Teeth

Epidemiology
  o Is the study of epidemics of infectious disease
  o The study of both the distribution of diseases in human populations and the determinants of the observed distribution
  o It began as the study of infectious diseases but has expanded to include the study of chronic diseases, health care organization, health care delivery, occupational and environmental health
– 3 Components
  o Frequency, Distribution, Determinants
  o THESE 3 INTERRELATED COMPONENTS ENCOMPASS ALL EPIDEMIOLOGIC PRINCIPLES AND METHODS
– Disease Frequency:
  o The measurement of disease frequency involves quantification of the existence or occurrence of disease
  o The availability of such data is a prerequisite for any systematic investigation of patterns of disease occurrence in human populations
– Distribution of Disease:
  o Considers such questions as who is getting the disease within a population as well as where and when the disease is occurring
– Determinants of Disease:
  o Derives from the first two since the knowledge of frequency and distribution of disease is necessary to test an epidemiologic hypothesis
  o Describe patterns of disease as well as to formulate hypotheses concerning possible causal or preventive factors

FOURTH ASPECT OF THE DEFINITION OF EPIDEMIOLOGY
– 2 assumptions
  o That human disease does not occur at random
  o That human disease has causal and preventive factors that can be identified through systematic investigation of different populations or subgroups of individuals within a population in different places or at different times

History of Epidemiology
  o Hippocrates, Graunt, Farr, Snow

HIPPOCRATES - THE DEVELOPMENT OF HUMAN DISEASE RELATED TO THE EXTERNAL & PERSONAL ENVIRONMENT OF THE INDIVIDUAL

GRAUNT – 1662. Analyzed the weekly reports of births and deaths in London
  o For the first time, quantified patterns of disease in a population
  o Noted an excess of men for both births and deaths, high infant mortality and the seasonal variations in mortality
  o Attempted to provide a numerical assessment of the impact of plague on the population of the city
  o Examined characteristics of the years in which such outbreaks occurred
  o Recognized the value of routinely collected data in providing information about human illness (forms the basis of modern epidemiology)

FARR - 1839, responsible for medical statistics in the Office of the Registrar General for England and Wales
  o Set up a system for routine compilation of the number and causes of deaths
  o Established a tradition of careful application of vital statistical data to the evaluation of health problems of the general public
  o Recognized that data collected from human populations could be used to learn about illness
  o Compared mortality patterns of married and single persons and workers in different occupations (metal mines & earthenware industry)
  o Noted the association between the elevation above sea level and deaths from cholera
  o Attempted to ascertain the effect of imprisonment on mortality
  o Addressed many major methodologic issues relevant to modern epidemiology
  o Defined the exact population at risk
  o Chose an appropriate comparison group
  o Considered whether other factors could affect the results such as age, duration of exposure or general health status

SNOW - postulated that cholera was transmitted by contaminated water through a then unknown mechanism
  o He charted the frequency and distribution of cholera and also ascertained a cause, or determinant, of the outbreak.
  o Snow was the first investigator to draw together all 3 components of the definition of epidemiology

Epidemic - the occurrence of an illness, in a specified geographic area, that clearly exceeds the normal, expected incidence (new cases). Includes any disease, infectious or chronic, occurring at a greater frequency than usually expected. Ex. Coronary Heart Disease, Lung Cancer

Hyperendemic – a situation in which there is a persistent transmission of a disease among most of a population. Ex. Malaria in certain parts of Africa

Endemic – the constant presence of a disease in a specific geographic area. Ex. Schistosomiasis in Samar

Pandemic – the worldwide spread of an epidemic disease. Ex. HIV, SARS (1st pandemic of the 21st century)

Descriptive Epidemiology
– Is concerned with the distribution of disease, including consideration of what populations or subgroups do or do not develop a disease, in what geographic areas it is most or least common, and how the frequency of occurrence varies over time.

Analytic Epidemiology
– Focuses on the determinants of disease by testing the hypotheses formulated from descriptive studies, with the ultimate goal of judging whether a particular exposure causes or prevents disease.

Describing Data
QUALITATIVE – CATEGORIES ARE SIMPLY USED AS LABELS TO DISTINGUISH ONE GROUP FROM ANOTHER
QUANTITATIVE – CATEGORIES CAN BE MEASURED AND ORDERED ACCORDING TO QUANTITY OR AMOUNT OR WHOSE VALUES CAN BE EXPRESSED NUMERICALLY

Types of quantitative variables:
  o Discrete – can assume only integral values or whole numbers, that is, those having values that can fall into only a limited number of separate categories with no possible intermediate levels
  o Continuous – those that theoretically can assume all possible values along a continuum including fractions or decimals

Classification of Variables according to SCALE OF MEASUREMENT
  o Nominal – qualitative variables
    – Ex: Gender
  o Ordinal – can be ranked or ordered, can be either qualitative or quantitative
    – Ex: Severity of Disease

  o Interval – exact distance between 2 categories can be determined but the zero point is arbitrary
    – Ex: Temperature
  o Ratio – zero point is fixed
    – Ex: Weight, systole, diastole

Subtypes of Variables
  o 1. DICHOTOMOUS – can be categorized as one of only two categories
    – Ex: gender, survival status, exposure status
  o 2. MULTICHOTOMOUS – when more than two alternative categories are possible and there is no inherent order
    – Ex: blood type, race, marital status
  o 3. ORDINAL – when the possible categories have a natural progression or order
    – Ex: stage of disease at diagnosis, improvement in mobility, level of alcohol intake
  o 4. NUMERICAL DISCRETE – quantitative but cannot take on any intermediate levels
    – Ex: total number of live births, number of episodes of stroke

DESCRIPTIVE STATISTICS – are used to summarize data in a form that permits the clearest presentation of the most information and facilitates useful comparisons between study groups or populations.

Summarization – involves a reduction in the amount of data presented, so there is inevitably some loss of information.

The most common forms for presenting data:
  o textual or narrative,
  o tabular, and
  o graphical.
These formats provide complementary information, with text and tables giving more specific detail about the individual values while graphs give a general depiction of the overall pattern.

Graphs
– Are illustrations that deliver a specific message even more effectively than tables.
  o Can show trends or patterns in a large data set
  o Striking comparisons can be made
– Bar Graphs - for comparisons of absolute or relative counts, rates, etc. between categories of a qualitative or a discrete quantitative variable
– Component Bar Graph - Shows the breakdown of a group or total where the number of categories is not too many
– Histogram - Graphic representation of the frequency distribution of a continuous variable or measurement including age groups
– Line Graph - Shows trend data or changes with time or age with respect to some other variable
– Frequency Polygon - Graphic representation of the frequency distribution of a continuous variable or measurement including age groups
– Scatterplot - SHOWS CORRELATION BETWEEN TWO QUANTITATIVE VARIABLES
– Pie Chart - Shows the breakdown of a group or total where the number of categories is not too many
– Picture Graph - ALSO CALLED PICTOGRAPH
  o CONTAINS ILLUSTRATIONS OR PICTURES OR ICONS
– Forest Plot - GRAPH THAT DISPLAYS RESULTS OF STUDIES CALLED META-ANALYSES
  o SHOWS RESULTS OF INDIVIDUAL STUDIES PLUS THE STATISTICALLY POOLED OVERALL RESULT

Measures of Central Tendency
– Mean – the most commonly used measure of central tendency
  o Mean = average. Add all of the data and divide by the number of the data
– Median – value above or below which half the observations fall
  o Middle number in a group of data
  o First, make sure that the data are arranged in order and then find the one in the middle. This is the median number.
  o Example: in an ordered list of seven numbers, the fourth number is the median (5 in the original example).
  o If you have an even number of data samples, find the two numbers in the middle (3 and 4 in the original example) and find their mean (average), which in this case is (3 + 4)/2 = 7/2 = 3.5. Use this as the median.
– Mode - The data value that occurs most often
  o When collecting information, people often find that a certain number occurs more than once, while all of the other ones are only seen once.
  o Example: in a list where the number 4 occurs twice and every other number occurs once, 4 is the mode.

Measures of Dispersion
– The most common and most important is the Standard Deviation, which provides an average distance for each element from the mean
– Range – the difference between the greatest and the least value in a set of numbers.
– Variance - is the third method of measuring dispersion. Compare the two variance formulae with their corresponding standard deviation formulae, and we see that variance is just the square of the standard deviation. Statisticians tend to consider variance a primary measure and use it extensively (ANOVA, etc.), whereas scientists are very happy to use standard deviation exclusively.

Hypothesis testing: Process for finding out whether we can generalize about an association from a sample to a population.
– Null hypothesis (H_0): Represents the status quo to the party performing the sampling experiment; will be accepted unless the data provides convincing evidence it is false.
– Research hypothesis (H_1) (a.k.a. alternative hypothesis): will be accepted only if the data provides convincing evidence of its truth.

Chi-square Test - Observed vs. Expected
– Example: when we roll a die 6 times and we get three “3’s”, this is the observed (actual) frequency. The expected frequency is only one “3”, i.e., one “3” in six rolls of a die.
– When hypothesizing about an association between two variables, chi-square tells the likelihood that the degree of statistical dependence observed is simply the luck of the draw.
– Generally, the greater the value of chi-square, the more statistical dependence between two variables.
– A p value of 0.05 means that, if there were truly no association, a statistical dependence at least this strong would be expected by chance in no more than 5 samples out of 100. By convention this is treated as convincing evidence, so the null hypothesis, that is, “no association between variables,” is rejected.
– The lower the p value required before rejecting the null hypothesis, the less likely we are to make a Type I error.

Errors - “Mistakes” arising from whether a given sample may or may not be representative of a population
– If a Null Hypothesis assumes there is no association between two variables, and we reject it even though there is no association, then this is a Type I error, i.e., “we call someone a liar when he is telling the truth.”
– If a Null Hypothesis assumes there is no association between two variables, and we accept it even though there is an association, this is a Type II error, i.e., we “say someone is truthful when he is lying.”

One-tailed Hypothesis Test
– determines whether a particular population parameter is larger or smaller than some predefined value
– Uses one critical value of test statistic

Two-tailed Hypothesis Test
– determines the likelihood that a population parameter is within certain upper and lower bounds
– May use one or two critical values

Study Designs
Hierarchy of Study Designs
  o Meta-Analysis
  o Systematic Reviews
  o RCTs
  o Cohort Studies
  o Case-Control Studies
  o Cross-Sectional Studies
  o Case-series, case reports
  o Editorials, Expert Opinion
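Returning to the die example from the chi-square discussion (three “3’s” observed in 6 rolls against one expected), the statistic can be worked out by hand or in a few lines of Python. This is a sketch under stated assumptions: the outcomes are collapsed into two categories (“3” and “not 3”), giving 1 degree of freedom, and the p value uses the 1-degree-of-freedom identity with `math.erfc`. With expected counts this small the chi-square approximation is rough, so treat the p value as illustrative only.

```python
import math

# Observed vs. expected counts in 6 rolls, collapsed to "3" and "not 3".
observed = {"three": 3, "not_three": 3}
expected = {"three": 1, "not_three": 5}   # a fair die gives one "3" in six rolls

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[k] - expected[k]) ** 2 / expected[k] for k in observed
)

# For 1 degree of freedom, P(chi-square >= x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

alpha = 0.05  # conventional cut-off, as in the text
print(f"chi-square = {chi_square:.2f}")
print(f"p = {p_value:.3f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

A larger gap between observed and expected counts gives a larger chi-square and a smaller p value, which is the pattern the notes describe.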

Descriptive Studies - designed simply to describe certain characteristics of a problem
  • cause and effect relationship is not being answered by this design
  • used to generate hypothesis that can serve as a topic for future research
– Answers the questions who, what, when and where
– Case reports & case series - Detailed descriptions of one or more cases of a disease that is unusual for some reason
  o Not seen before
  o Not noted before
  o Not named before
  o Rarely seen in that form
– Cross Sectional (descriptive) – prevalence surveys, no attempt to establish a temporal relationship between the cause and effect
  o Examples
    – prevalence of different diseases in population
    – sex distribution among hemophiliacs
    – prevalence of use of tobacco among cancer patients

Analytical Studies - define the relationship between the outcome and its causes
– outcome can be a development of a disease or cure of a disease
– cause can be a risk factor
  – genetic predisposition
  – social characteristic
  – environmental exposure
  – unhealthy behavior (smoking, sedentary lifestyle)

Cross Sectional (analytical) - Study group chosen to be a cross-section of a population
– Exposure and outcome are both assessed at the same point in time
– Useful for examining exposures that do not change over time (e.g. sex, blood group)
– Special type of cross-sectional study: Diagnostic studies

Case-Control studies - inclusion of subjects starts with selecting those who have the outcome or effect (cases)
– this group is compared with subjects who do not have the outcome or effect (controls)
– both the cases and the controls should be taken from within the same population
– Advantages:
  o For rare diseases or diseases with long latency periods
  o Less expensive
  o No follow-up needed
  o Evaluate multiple risk factors
  o Minimal ethical concerns
– Disadvantages:
  o No temporal relationship is established
  o Cannot determine incidence of disease

Cohort Studies - selection of subjects starts with identifying individuals who have the presence or absence of a particular cause or exposure
– subjects are observed forward in time to determine who among them will develop the outcome or effect
– can be prospective or retrospective depending on the manner of patient recruitment

– Advantages (cohort studies):
  o Can establish temporal relationship between cause (exposure) and outcome (disease)
  o Evaluate multiple outcomes
  o Determine the incidence of disease
  o Minimal ethical concerns
– Disadvantages (cohort studies):
  o More expensive than case-control
  o Long follow-up required
  o More subjects needed

Randomized Controlled Trial – If done properly, the result is generally considered to have the highest validity and reliability among individual study designs
– Individuals are randomly assigned to two or more groups, one with the exposure, intervention or cause, the other without
– they are observed forward in time and their outcomes compared
– Control group may be given:
  o PLACEBO - Something that resembles the real treatment but is not active
  o Acceptable standard treatment (it would be unethical to withhold standard treatment from the control group)
– BLINDING
  o Minimize measurement bias
  o Single blind – only patient is unaware of treatment
  o Double blind – both patients and trial investigators are unaware of treatment
– Advantages:
  o Offer best evidence for causality
    – Establishes temporal relationship between exposure and outcome
  o Takes care of confounding
– Disadvantages:
  o Design is complex and expensive
  o Many ethical issues
  o Sometimes impractical

REVIEW = Non-systematic review w/o focused clinical question
OVERVIEW = Systematic review w/ focused clinical question/s
Systematic Review = Structured process involving several steps:
  o Well Formulated Question
  o Comprehensive Data Search
  o Unbiased Selection and Abstraction Process
  o Critical Appraisal of Data
  o Synthesis of Data
META-ANALYSIS = Overview + “statistical combining” of results

Cochrane Collaboration - International non-profit organization that prepares, maintains, and disseminates systematic up-to-date reviews of health care interventions. Review Manager (RevMan) is the software used for preparing and maintaining Cochrane Reviews.
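The “statistical combining” step of a meta-analysis can be sketched in Python. The reviewer does not say which method is meant; the sketch below assumes a fixed-effect, inverse-variance weighted average, which is one common approach. The three study results are hypothetical and are not taken from any real review.

```python
import math

# Hypothetical study results: (effect estimate, standard error).
# The estimates could be, for example, log risk ratios from three trials.
studies = [(-0.30, 0.15), (-0.10, 0.20), (-0.25, 0.10)]

# Inverse-variance weights: more precise studies (smaller SE) count for more.
weights = [1 / se ** 2 for _, se in studies]

# Pooled estimate = weighted average of the individual study estimates.
pooled = sum(w * est for w, (est, _) in zip(weights, studies)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

for (est, se), w in zip(studies, weights):
    print(f"study estimate {est:+.2f} (SE {se:.2f}), weight {w / sum(weights):.0%}")
print(f"pooled estimate {pooled:+.3f} (SE {pooled_se:.3f})")
```

The individual estimates with their pooled result are what a forest plot displays: one row per study plus the statistically pooled overall result.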