Medical Biostatistics, Third Edition

Book · January 2012

1 author:

Abhaya Indrayan
University College of Medical Sciences

All content following this page was uploaded by Abhaya Indrayan on 09 October 2014.


Contents
Preface to Third Edition
Summary Tables
Frequently Used Notations
1 Medical Uncertainties
1.1 Uncertainties in Health and Disease
1.1.1 Uncertainties due to Intrinsic Variation – Biologic, Genetic, Behavioral and
Other Host Factors, Environmental, Chance, Sampling Fluctuations
1.1.2 Natural Variation in Assessment – Observer, Treatment Strategies, Instrument
and Laboratory, Imperfect Tools, Incomplete Information on the Patient, Poor
Compliance with the Regimen
1.1.3 Inadequate Knowledge – Epistemic Uncertainties; Diagnostic, Therapeutic, and
Prognostic Uncertainties; Predictive and Other Uncertainties
1.2 Uncertainties in Medical Research
1.2.1 Empiricism in Medical Research – Laboratory Experiments, Clinical Trials,
Surgical Procedures, Epidemiological Research
1.2.2 Elements of Minimizing the Impact of Uncertainties on Research – Proper
Design, Improved Medical Methods, Analysis and Synthesis
1.2.3 Critique of a Report of a Medical Study – Introduction, Methodology, Results,
Discussion and Conclusions
1.3 Uncertainties in Health Planning and Evaluation
1.3.1 Health Situation Analysis – Identification of the Specifics of the Problem, Size
of the Target Population, Magnitude of the Problem, Health Infrastructure,
Feasibility of Remedial Steps
1.3.2 Evaluation of Health Programs
1.4 Management of Uncertainties: About This Book
1.4.1 Contents of the Book – Limitations and Strengths, New in Third Edition
1.4.2 Salient Features of the Text – System of Notations, Guide Chart of the
Biostatistical Methods
References
2 Basics of Medical Studies
2.1 Study Protocol
2.1.1 The Problem, Objectives, and Hypotheses
2.1.2 Protocol Content
2.2 Types of Medical Studies
2.2.1 Elements of Design
2.2.2 Basic Types of Study Design – Descriptive, Analytical, Basic Types of
Analytical Studies
2.2.3 Choosing a Design – Recommended Design for Particular Setups, Choice of
Design by Level of Evidence
2.3 Data Collection
2.3.1 Nature of Data – Factual, Knowledge-Based, and Opinion-Based Data; Method
of Obtaining the Data
2.3.2 Tools of Data Collection – Existing Records, Questionnaires and Schedules,
Likert Scale
2.3.3 Pretesting and Pilot Study
2.4 Nonsampling Errors and Other Biases
2.4.1 Nonresponse
2.4.2 Variety of Biases to Guard Against – List of Biases, Steps for Minimizing Bias
References
3 Sampling Methods
3.1 Sampling Concepts
3.1.1 Advantages and Limitations of Sampling – Sampling Fluctuations, Advantages
and Limitations
3.1.2 Some Special Terms Used in Sampling – Unit of Enquiry and Sampling Unit,
Sampling Frame, Parameters and Statistics, Sample Size, Nonrandom and
Random Sampling
3.2 Common Methods of Random Sampling
3.2.1 Simple Random Sampling
3.2.2 Stratified Random Sampling
3.2.3 Multistage Random Sampling
3.2.4 Cluster Random Sampling
3.2.5 Systematic Random Sampling
3.2.6 Choice of Method of Random Sampling
3.3 Some Other Methods of Sampling
3.3.1 Other Random Methods of Sampling – Probability Proportional to Size, Area
Sampling, Inverse Sampling, Consecutive Subjects Attending a Clinic,
Sequential Sampling
3.3.2 Nonrandom Methods of Sampling – Convenience Samples, Other Types of
Purposive Samples
References
4 Designs for Observational Studies
4.1 Some Basic Concepts
4.1.1 Antecedent and Outcome
4.1.2 Confounders
4.1.3 Effect Size
4.2 Prospective Studies
4.2.1 Variations of Prospective Studies – Cohort Study, Longitudinal Study,
Repeated Measures Study
4.2.2 Selection of Subjects for a Prospective Study – Comparison Group in a
Prospective Study
4.2.3 Potential Biases in Prospective Studies – Selection Bias, Bias due to Loss in
Follow-Up, Assessment Bias and Errors, Bias due to Change in the Status,
Confounding Bias, Post Hoc Bias, Validity Bias
4.2.4 Merits and Demerits of Prospective Studies
4.3 Retrospective Studies
4.3.1 Case-Control Design – Nested Case-Control Design
4.3.2 Selection of Cases and Controls – Sampling Methods in Retrospective Studies,
Confounders and Matching
4.3.3 Merits and Demerits of Case-Control Studies
4.4 Cross-Sectional Studies
4.4.1 Selection of Subjects for a Cross-Sectional Study
4.4.2 Merits and Demerits of Cross-Sectional Studies
4.5 Comparative Performance of Prospective, Retrospective,
and Cross-Sectional Studies
4.5.1 Performance of Prospective Studies
4.5.2 Performance of Retrospective Studies
4.5.3 Performance of Cross-Sectional Studies
References
5 Medical Experiments
5.1 Basic Features of Medical Experiments
5.1.1 Statistical Principles of Experimentation – Control Group, Randomization,
Replication
5.1.2 Advantages and Limitations of Experiments
5.2 Design of Experiments
5.2.1 Classical Designs: One-Way Design, Two-Way Design, Interaction, K-Way
and Factorial Experiments
5.2.2 Some Unconventional Designs – Repeated Measures Design, Crossover Design,
Other Complex Designs
5.3 Choice and Sampling of Units for Laboratory Experiments
5.3.1 Choice of Experimental Unit
5.3.2 Sampling Methods in Laboratory Experiments
5.3.3 Choosing a Design of Experiment
5.3.4 Pharmacokinetic Studies
References
6 Clinical Trials
6.1 Therapeutic Trials
6.1.1 Phases of a Clinical Trial – Phases I to IV
6.1.2 Selection of Subjects – Selection of Participants for RCT, Control Group in a
Clinical Trial
6.1.3 Randomization and Matching
6.1.4 Methods of Random Allocation – Allocation out of a Large Number of
Available Subjects; Random Allocation of Consecutive Patients Coming to a
Clinic; Block, Cluster and Stratified Randomization
6.1.5 Blinding and Masking
6.1.5.1 Blinding
6.1.5.2 Masking
6.2 Issues in Clinical Trials
6.2.1 Outcome Assessment – Specification of End-point or Outcome, Causal
Inference, Side Effects, Efficacy versus Effectiveness, Pragmatic Trials
6.2.1.1 End-Points or Outcome – Causal Inference, Side Effects
6.2.2 Various Equivalences in Clinical Trials – Superiority, Equivalence, and
Noninferiority Trials; Therapeutic Equivalence and Bioequivalence
6.2.3 Designs for Clinical Trials – One-Way, Two-Way, and Factorial Designs;
Crossover and Repeated Measures Designs; N-of-1, Up-and-Down, and
Sequential Designs; Choosing a Design for a Clinical Trial
6.2.4 Designs with Interim Appraisals – Design with Provision to Stop Early,
Adaptive Designs
6.2.5 Biostatistical Ethics for Clinical Trials – Equipoise, Ethical Cautions, Statistical
Considerations in a Multicentric Trial, Multiple Treatments with Different
Outcomes in the Same Trial, Size of the Trial, Compliance
6.2.6 Reporting Results of a Clinical Trial – CONSORT, Open Access
6.3 Trials Other than for Therapeutics
6.3.1 Clinical Trials for Diagnostic and Prophylactic Modalities
6.3.2 Field Trials for Screening, Prophylaxis, and Vaccines
6.3.3 Issues in Field Trials – Randomization and Blinding in Field Trials, Designs for
Field Trials
References
7 Numerical Methods for Representing Variation
7.1 Types of Measurement
7.1.1 Nominal, Metric, and Ordinal Scales
7.1.2 Other Classifications of the Types of Measurement – Discrete and Continuous
Variables, Qualitative and Quantitative Data, Stochastic and Deterministic
Variables
7.2 Tabular Presentation
7.2.1 Contingency Tables and Frequency Distribution – Empty Cells, Problems in
Preparing a Contingency Table on Metric Data
7.2.2 Multiple Response Tables and Other Features
7.2.3 Other Types of Statistical Tables – What is a Good Statistical Table?
7.3 Rates and Ratios
7.3.1 Proportion, Rate, and Ratio
7.4 Central and Other Locations
7.4.1 Central Values: Mean, Median, and Mode – Understanding Mean, Median, and
Mode, Calculation in Case of Grouped Data, Which Central Value to Use?,
Geometric Mean, Harmonic Mean
7.4.2 Other Locations: Quantiles – Ungrouped and Grouped Data, and Interpretation
7.5 Measuring Variability
7.5.1 Variance and Standard Deviation – Ungrouped and Grouped Data, Variance of
Sum or Difference of Two Measurements
7.5.2 Coefficient of Variation
References
8 Presentation of Variation by Figures
8.1 Graphs for Frequency Distribution
8.1.1 Histogram and Its Variants – Histogram, Stem-and-Leaf Plot, Line Histogram
8.1.2 Polygon and Its Variants – Frequency Polygon, Area Diagram
8.1.3 Frequency Curve
8.2 Pie, Bar, and Line Diagrams
8.2.1 Pie Diagram – Useful Features, Donut Diagram
8.2.2 Bar Diagram
8.2.3 Scatter and Line Diagrams
8.2.4 Choice and Cautions in Visual Display of Data
8.2.5 Mixed and Three-Dimensional Diagrams – Mixed Diagram, Box-and-Whiskers
Plot, Three-Dimensional Diagram, Biplot, Nomogram
8.3 Special Diagrams in Health and Medicine
8.3.1 Diagrams Used in Public Health – Epidemic Curve, Lexis Diagram
8.3.2 Diagrams Used in Individual Care and Research – Growth Charts, Partogram,
Dendrogram, Area Under the Concentration Curve, Radar Graph
8.4 Charts and Maps
8.4.1 Charts – Schematic Chart, Pedigree Chart
8.4.2 Maps – Spot Map, Thematic Choroplethic Map, Cartogram
References
9 Some Quantitative Aspects of Medicine
9.1 Some Epidemiological Measures of Health and Disease
9.1.1 Epidemiological Indicators of Neonatal Health – Birth Weight, Apgar Score
9.1.2 Epidemiological Indicators of Growth in Children – Weight-for-Age, Weight-
for-Height and Height-for-Age, Z-Scores and Percent of Median, Growth
Velocity, Skinfold Thickness
9.1.3 Epidemiological Indicators of Adolescent Health – Growth in Height and
Weight in Adolescence, Sexual Maturity Rating
9.1.4 Epidemiological Indicators of Adult Health – Obesity, Smoking, Physiological
Functions, Quality of Life
9.1.5 Epidemiological Indicators of Geriatric Health – Activities of Daily Living,
Mental Health of the Elderly
9.2 Reference Values
9.2.1 Gaussian and Other Distributions – Checking Gaussianity
9.2.2 Reference or Normal Values – Implications
9.2.3 Normal Range – Disease Threshold, Clinical Threshold, Statistical Threshold
9.3 Measurement of Uncertainty: Probability
9.3.1 Elementary Laws of Probability – Law of Multiplication, Law of Addition
9.3.2 Probability in Clinical Assessments – Probabilities in Diagnosis, Assessment of
Prognosis, Choice of Treatment
9.3.3 Further on Diagnosis: Bayes Rule
9.4 Validity of Medical Tests
9.4.1 Sensitivity and Specificity – Features of Sensitivity and Specificity, Likelihood
Ratio
9.4.2 Predictivities – Positive and Negative Predictivity, Predictivity and Prevalence,
The Meaning of Prevalence for Predictivity, Features of Positive and Negative
Predictivities
9.4.3 Combination of Tests – Tests in Series, Tests in Parallel, Gains from a Test,
When Can a Test Be Avoided?
9.4.4 Gains from a Test – When can a Test be Avoided
9.5 Search for the Best Threshold of Continuous Test: ROC Curve
9.5.1 Sensitivity–Specificity Based ROC Curve, Methods to Find the ‘Optimal’
Threshold Point, Area Under the ROC Curve
9.5.2 Predictivities Based ROC Curve
References
10 Clinimetrics and Evidence-Based Medicine
10.1 Indicators, Indexes, and Scores
10.1.1 Indicators – Merits and Demerits of Indicators, Choice of Indicators
10.1.2 Indexes – Some Commonly Used Indexes, Advantages and Limitations of
Indexes
10.1.3 Scores – Scoring System for Diagnosis, Scoring for Gradation of Severity
10.2 Clinimetrics
10.2.1 Method of Scoring – Method of Scoring for Graded Characteristics, Method of
Scoring for Diagnosis, Regression Method of Scoring
10.2.2 Validity and Reliability of a Scoring System
10.3 Evidence-Based Medicine
10.3.1 Decision Analysis – Decision Tree
10.3.2 Other Statistical Tools for Evidence-Based Medicine – Etiology Diagram,
Expert System
References
11 Measurement of Community Health
11.1 Indicators of Mortality
11.1.1 Crude and Standardized Death Rates – Crude Death Rate, Age-Specific Death
Rate, Standardized Death Rate, Comparative Mortality Ratio
11.1.2 Specific Mortality Rates – Fetal Deaths and Mortality in Children, Maternal
Mortality, Adult Mortality, Other Measures of Mortality
11.1.3 Death Spectrum
11.2 Measures of Morbidity
11.2.1 Prevalence and Incidence – Point Prevalence, Period Prevalence, Incidence,
The Concept of Person-Time, Capture–Recapture Methodology
11.2.2 Duration of Morbidity – Prevalence in Relation to Duration of Morbidity,
Incidence from Prevalence, Epidemiologically Consistent Estimates
11.2.3 Morbidity Measures for Acute Conditions – Attack Rates, Disease Spectrum
11.3 Indicators of Social and Mental Health
11.3.1 Indicators of Social Health – Education, Income, Occupation, Socioeconomic
Status, Dependency Ratio, Health Inequality
11.3.2 Indicators of Health Resources – Health Infrastructure, Health Expenditure
11.3.3 Indicators of Lack of Mental Health – Smoking and Other Addictions,
Divorces, Vehicular Accidents and Crimes, Other Measures of Lack of
Mental Health
11.4 Composite Indexes of Health
11.4.1 Indexes of Status of Comprehensive Health – Human Development Index,
Physical Quality of Life Index
11.4.2 Indexes of Health Gap – DALYs Lost, Human Poverty Index, Index of Need
for Health Resources
References
12 Confidence Intervals, Principles of Tests of Significance, and Sample Size
12.1 Sampling Distributions
12.1.1 Basic Concepts – Sampling Error, Point Estimate, Standard Error of p and x̄
12.1.2 Sampling Distribution of p and x̄ – Gaussian Conditions
12.1.3 Obtaining Probabilities from a Gaussian Distribution – Gaussian Probability,
Continuity Correction, Probabilities Relating to the Mean and the Proportion
12.1.4 The Case of σ Not Known (t-Distribution)
12.2 Confidence Intervals
12.2.1 Confidence Interval for π, μ and Median (Gaussian Conditions) – Confidence
Interval for Proportion π (Large n), Lower and Upper Bounds for π (Large n),
Confidence Interval for Mean μ (Large n), Confidence Bounds for Mean μ
(Large n), CI for Median (Gaussian Distribution)
12.2.2 Confidence Interval for Differences (Large n) – Two Independent Samples,
Paired Samples
12.2.3 Confidence Interval for π, μ and Median: NonGaussian Conditions –
Confidence Interval for π (Small n), Confidence Bound for π When the
Success or the Failure Rate in the Sample is Zero Percent, Confidence Interval
for Median (Small n): NonGaussian Conditions
12.3 P-Values and Statistical Significance
12.3.1 What Is Statistical Significance? – Court Judgment, Errors in Diagnosis, Null
Hypothesis, Philosophical Basis of Statistical Tests, Alternative Hypothesis,
One-Sided Alternatives: Which Tail is Wagging?
12.3.2 Errors, P-Values, and Power – Type-I Error, Type-II Error, Power
12.3.3 General Procedure to Obtain P-value – Subtleties of Statistical Significance
12.4 Assessing Gaussian Pattern
12.4.1 Significance Tests for Assessing Gaussianity
12.5 Initial Debate on Statistical Significance
12.5.1 Confidence Interval versus Test of H0
12.5.2 Medical Significance versus Statistical Significance
12.6 Sample Size Determination in Some Cases
12.6.1 Sample Size Required in Estimation Setup – General Considerations in the
Estimation Setup, General Procedure for Determining Size of Sample for
Estimation, Formulas for Sample Size Calculation for Estimation in Simple
Situations
12.6.2 Sample Size for Testing a Hypothesis with Specified Power – General
Considerations in a Testing-of-Hypothesis Setup, Sample Size Formulas for
Test of Hypothesis in Simple Situations, Nomograms and Tables of Sample
Size, Thumb Rules, Power Analysis
12.6.3 Sample Size Calculation in Clinical Trials – Stopping Rules in Case of Early
Evidence of Success or of Failure: Lan–deMets Procedure, Sample Size
Reestimation
References
13 Inference from Proportions
13.1 One Qualitative Variable
13.1.1 Dichotomous Categories: Binomial Distribution – Large n: Gaussian
Approximation to Binomial
13.1.2 Poisson Distribution
13.1.3 Polytomous Categories (Large n): Goodness-of-Fit Test – Chi-Square and Its
Explanation, Degrees of Freedom, Cautions in Using Chi-Square, Further
Analysis: Partitioning of Table
13.1.4 Goodness of Fit to Assess Gaussianity
13.1.5 Polytomous Categories (Small n): Exact Multinomial Test – Goodness-of-Fit
in Small Samples
13.2 Proportions in 2×2 Tables
13.2.1 Structure of 2×2 Table in Different Types of Study – Structure in Prospective
Study, Structure in Retrospective Study, Structure in Cross-Sectional Study
13.2.2 Two Independent Samples (Large n): Chi-Square Test and Proportion Test –
Chi-square Test, Yates Correction for Continuity, Z-Test for Proportions,
Detecting a Medically Important Difference in Proportions, Crossover Design
with Binary Response (Large n)
13.2.3 Equivalence Tests – Superiority, Equivalence and Noninferiority;
Equivalence; Determining Inferiority Margin
13.2.4 Two Independent Samples (Small n): Fisher Exact Test – Crossover Design
(Small n)
13.2.5 Proportions in Matched Pairs: McNemar Test (Large n) and Exact Test (Small
n) – Large n: McNemar Test, Small n: Exact Test (Matched Pairs),
Comparison of Two Tests for Sensitivity and Specificity: Paired Setup
13.3 Analysis of R × C Tables (Large n)
13.3.1 One Dichotomous and the Other Polytomous Variable (2×C Table) – The Test
Criterion, Trend in Proportions in Ordinal Categories, Dichotomy in Repeated
Measures: Cochran Q Test (Large n)
13.3.2 Two Polytomous Variables – Chi-square Test for Large n, Matched Pairs: I×I
Tables
13.4 Three-Way Tables
13.4.1 Assessment of Association in Three-Way Tables
13.4.2 Log–Linear Models – Two-Way Tables, Three-Way Tables
References
14 Relative Risk and Odds Ratio
14.1 Relative and Attributable Risks (Large n)
14.1.1 Risk, Hazard, and Odds – Ratios of Risks and Odds
14.1.2 Relative Risk – RR in Independent Samples, Confidence Interval for RR
(Independent Samples), Test of Hypothesis on RR (Independent Samples),
RR in the Case of Matched Pairs
14.1.3 Attributable Risk – AR in Independent Samples, AR in Matched Pairs,
Number Needed to Treat, Relative Risk Reduction, Population Attributable
Risk
14.2 Odds Ratio
14.2.1 OR in Two Independent Samples – CI for OR (Independent Samples), Test of
Hypothesis on OR (Independent Samples)
14.2.2 OR in Matched Pairs – Confidence Interval for OR (Matched Pairs), Test of
Hypothesis on OR (Matched Pairs), Multiple Controls
14.3 Stratified Analysis, Sample Size and Meta Analysis
14.3.1 Mantel–Haenszel Procedure – Pooled Odds Ratio and Chi-square
14.3.2 Sample Size Requirement for Statistical Inference on RR and OR
14.3.3 Meta Analysis
References
15 Inference from Means
15.1 Comparison of Means in One and Two Groups (Gaussian Conditions): Student t-
Test
15.1.1 Comparison with a Prespecified Mean – Student t-Test for One Sample
15.1.2 Difference in Means in Two Samples – Paired Samples Setup, Unpaired
(Independent) Samples Setup, Some Features of Student t, Effect of Unequal n,
Difference-in-Differences Approach
15.1.3 Analysis of Crossover Designs – Test for Group Effect, Test for Carry-Over
Effect, Test for Treatment Effect
15.1.4 Analysis of Data of Up-and-Down Trials
15.2 Comparison of Means in Three or More Groups (Gaussian Conditions):
ANOVA F-Test
15.2.1 One-Way ANOVA – The Procedure to Test H0, Checking the Validity of the
Assumptions of ANOVA
15.2.2 Two-Way ANOVA – Two-Factor Design, The Hypotheses and Their Test,
Main Effect and Interaction (Effect), Repeated Measures
15.2.3 Repeated Measures – Random Effects versus Fixed Effects, Sphericity and
Huynh–Feldt Correction, Repeated Measures versus Two-way ANOVA, Area
Under the Concentration Curve
15.2.4 Multiple Comparisons: Bonferroni, Tukey and Dunnett Tests – Intricacies of
Multiple Comparisons
15.3 Non-Gaussian Conditions: Nonparametric Tests for Location
15.3.1 Comparison of Two Groups: Wilcoxon Tests – Paired Data, Independent
Samples
15.3.2 Comparison of Three or More Groups: Kruskal–Wallis Test
15.3.3 Two-Way Layout: Friedman Test
15.4 When Significant is Not Significant
15.4.1 The Nature of Statistical Significance
15.4.2 Testing for Presence of Medically Important Difference in Means – Detecting
Specified Difference in Mean, Equivalence Tests for Means
15.4.3 Power and Level of Significance – Balancing Type-I and Type-II Error
References
16 Relationships: Quantitative Data
16.1 Some General Features of a Regression Setup
16.1.1 Dependent and Independent Variables – Simple, Multiple, and Multivariate
Regression
16.1.2 Linear, Curvilinear, and Nonlinear Regressions
16.1.3 The Concept of Residuals
16.1.4 General Method of Fitting a Regression
16.2 Linear Regression Models
16.2.1 Adequacy of a Regression Fit – 1 – Goodness of Fit and η², Multiple
Correlation in Linear Regression, Stepwise Procedure, Statistical Significance
of Individual Regression Coefficients
16.2.2 Adequacy of Regression – 2 – Validity of Assumptions, Choice of Form of
Regression, Outliers and Missing Values
16.2.3 Interpretation of the Regression Coefficients – Standardized Coefficients,
Other Implications of Regression Models
16.3 Some Issues in Linear Regression
16.3.1 Confidence Interval, Confidence Band, and Tests – SEs and CIs for the
Regression, Confidence Band for Simple Linear Regression, Equality of Two
Regression Lines, Difference-in-Differences Approach with Regression
16.3.2 Some Variations of Regression – Ridge Regression, Multilevel Regression,
Regression Splines, Analysis of Covariance, Some Generalizations
16.4 Measuring the Strength of Quantitative Relationship
16.4.1 Product–Moment and Related Correlations – Multiple Correlation, Product–
Moment Correlation, Covariance, Statistical Significance of r, Intraclass
Correlation, Serial Correlation
16.4.2 Rank Correlation – Spearman Rho, Kendall Tau
16.5 Assessment of Quantitative Agreement
16.5.1 Agreement in Quantitative Measurements
16.5.2 Approaches for Measuring Quantitative Agreement – Limits of Disagreement
Approach, Intraclass Correlation as a Measure of Agreement, Relative Merits
of the Two Methods, An Alternative Simple Approach
References
17 Relationships: Qualitative Dependent
17.1 Binary Dependent: Logistic Regression (Large n)
17.1.1 Meaning of a Logistic Model
17.1.2 Assessing Overall Adequacy of a Logistic Regression – Log Likelihood,
Classification Accuracy, Hosmer–Lemeshow Test
17.2 Inference from Logistic Coefficients
17.2.1 Interpretation of the Logistic Coefficients – Dichotomous Predictors,
Polytomous and Continuous Predictors
17.2.2 Confidence Interval and Test of Hypothesis on Logistic Coefficients
17.3 Issues in Logistic Regression
17.3.1 Conditional Logistic for Matched Data
17.3.2 Polytomous Dependent – Nominal Categories: Multinomial Logistic, Ordinal
Categories
17.4 Some Models for Qualitative Data and Generalizations
17.4.1 Cox Regression for Hazards
17.4.2 Classification and Regression Trees
17.4.3 Further Generalizations
17.5 Strength of Relationship in Qualitative Variables
17.5.1 Both Variables Qualitative – Dichotomous Categories, Polytomous
Categories: Nominal, Proportional Reduction in Error, Polytomous
Categories: Ordinal Association
17.5.2 One Qualitative and the Other Quantitative Variable
17.5.3 Agreement in Qualitative Measurements (Matched Pairs) – The Meaning of
Qualitative Agreement, Cohen Kappa
References
18 Survival Analysis
18.1 Life Expectancy
18.1.1 Life Table
18.1.2 Other Forms of Life Expectancy – Potential Years of Life Lost, Healthy Life
Expectancy, Application to Other Setups
18.2 Analysis of Survival Data
18.2.1 Nature of Survival Data – Types of Censoring, Collection of Survival Time
Data, Statistical Measures of Survival
18.2.2 Survival Observed in Time Intervals: Life Table Method
18.2.3 Continuous Observation of Survival Time: Kaplan–Meier Method – Using the
Survival Curve, Standard Error of Survival Rate, Hazard Function
18.3 Issues in Survival Analysis
18.3.1 Comparison of Survival in Two Groups – Comparing Survival Rates,
Comparing Survival Experience: Log-Rank Test
18.3.2 Factors Affecting Survival: Cox Model – Parametric Models, Cox Model for
Survival, Proportional Hazards
18.3.3 Sample Size for Survival Studies
References
19 Simultaneous Consideration of Several Variables
19.1 Scope of Multivariate Methods
19.1.1 The Essentials of a Multivariate Setup
19.1.2 Statistical Limitation on the Number of Variables
19.2 Dependent and Independent Sets of Variables
19.2.1 Dependents and Independents Both Quantitative: Multivariate Multiple
Regression
19.2.2 Quantitative Dependents and Qualitative Independents: Multivariate Analysis
of Variance (MANOVA) – MANOVA for Repeated Measures
19.2.3 Classification of Subjects into Known Groups: Discriminant Analysis –
Discriminant Function, Classification Rule, Classification Accuracy
19.3 Identification of Structure in the Observations
19.3.1 Identification of Clusters of Subjects: Cluster Analysis – Measures of
Similarity, Hierarchical Agglomerative Algorithm, Deciding on the Number
of Natural Clusters
19.3.2 Identification of Unobservable Underlying Factors: Factor Analysis – Steps
for Factor Analysis, Features of a Successful Factor Analysis, Factor Scores
References
20 Quality Considerations
20.1 Statistical Quality Control in Medical Care
20.1.1 Statistical Control of Medical Care Errors – Adverse Patient Outcomes,
Monitoring Fatality, Limits of Tolerance
20.1.2 Quality of Lots – The Lot Quality Method, LQAS in Health Assessment
20.1.3 Quality Control in a Medical Laboratory – Control Chart, Cusum Chart, Other
Errors in Medical Laboratory, Six Sigma Methodology, Nonstatistical Issues
20.2 Quality of Measurements
20.2.1 Validity of Instruments – Types of Validity
20.2.2 Reliability of Instruments – Internal Consistency, Cronbach Alpha, Test–
Retest Reliability
20.3 Quality of Statistical Models: Robustness
20.3.1 External Validation – Split-Sample Method, Another Sample Method
20.3.2 Sensitivity Analysis and Uncertainty Analysis
20.3.3 Resampling – Bootstrapping, Jackknife Resampling
20.4 Quality of Data
20.4.1 Errors in Measurement – Lack of Standardization in Definitions, Lack of Care
in Obtaining or Recording Information, Inability of the Observer to Get
Confidence of the Respondent, Bias of the Observer, Variable Competence of
the Observers
20.4.2 Missing Values – Approaches for Missing Values, Handling Nonresponse,
Imputations, Intention-to-Treat Analysis
20.4.3 Lack of Standardization in Values – Standardization Methods Already
Described, Standardization for Calculating Adjusted Rates, Standardized
Mortality Ratio
References
21 Statistical Fallacies
21.1 Problems with the Sample
21.1.1 Biased Sample – Survivors, Volunteers, Clinical Subjects, Publication Bias,
Inadequate Specification of Sampling Method, Abrupt Series
21.1.2 Inadequate Size of the Sample – Problems with Calculation of Sample Size
21.1.3 Incomparable Groups – Differential in Group Composition, Differential
Definitions, Differential Compliance, Variable Periods of Exposure, Improper
Denominator
21.1.4 Mixing of Distinct Groups – Effect on Regression, Effect on Shape of the
Distribution, Lack of Intragroup Homogeneity
21.2 Inadequate Analysis
21.2.1 Ignoring Reality – Looking for Linearity, Overlooking Assumptions, Selection
of Inappropriate Variables, Area Under the Concentration Curve, Further
Problems with Statistical Analysis, Anomalous Person-Years, Problems with
Intention-to-Treat Analysis and Equivalence
21.2.2 Choice of Analysis – Mean or Proportion? Forgetting Baseline Values
21.2.3 Misuse of Statistical Packages – Over-Analysis, Data Dredging, Quantitative
Analysis of Codes, Soft Data versus Hard Data
21.3 Errors in Presentation of Findings
21.3.1 Misuse of Percentages and Means – Unnecessary Decimals
21.3.2 Problems in Reporting – Incomplete Reporting, Over-Reporting, Selective
Reporting, Self-Reporting versus Objective Measurement, Misuse of Graphs
21.4 Misinterpretation
21.4.1 Misuse of P-Values – Magic Threshold 0.05, One-Tail or Two-Tail P-Values,
Multiple Comparisons, Dramatic P-Values, P-Values for Nonrandom Sample,
“Normal” with Respect to Several Parameters, Absence of Evidence is not
Evidence of Absence
21.4.2 Correlation versus Cause–Effect Relationship – Criteria for Cause–Effect,
Other Considerations
21.4.3 Sundry Issues – Diagnostic Test is Only an Additional Adjunct, Medical
Significance versus Statistical Significance, Interpretation of Standard Error of
p, Univariate Analysis but Multivariate Conclusions, Limitation of Relative
Risk, Misinterpretation of Improvements
21.4.4 Final Comments
References
Appendix A Statistical Software
A.1 General Purpose Statistical Software
A.2 Special Purpose Statistical Software
Appendix B Some Statistical Tables
Appendix C Software Illustrations
C.1 ROC Curves
C.2 Repeated Measures ANOVA
C.3 One-way ANOVA and Tukey Test
C.4 Stepwise Multiple Linear Regression
C.5 Curvilinear Regression
C.6 Analysis of Covariance (ANCOVA)
C.7 Logistic Regression
C.8 Survival Analysis (Life Table Method)
C.9 Cox Proportional Hazards Model
Index
Data sets for the examples in this text are available in Excel format for download at
http://MedicalBiostatistics.synthasite.com. Use these data sets to rework the examples
that interest you and to do further analysis where needed.

Preface to Third Edition


Biostatistical aspects are receiving increased emphasis in medical books, medical journals, and
pharmaceutical literature, yet there is a lack of appreciation of biostatistical methods as a medical
tool. This book arises from the desire to help biostatistics earn its rightful place as a medical,
rather than a mathematical, subject. Medical and health professionals may then perceive
biostatistics as their own discipline, instead of an alien discipline. A book that effectively focuses
on the statistical aspects of medicine with a medical perspective is clearly needed. To enhance
focus, this book is titled Medical Biostatistics. The prefix ‘medical’ excludes the fishes and plants that a
purist might include under the genre of ‘bio’statistics.
Variation is an essential, and perhaps the most enjoyable, aspect of life. But consequent
uncertainties are profound. Thus, methods are needed to measure the magnitude of uncertainties
and to minimize their impact on decisions. Biostatistics is the science of management of
uncertainties in health and medicine. Beginning with this premise, this book provides a new
orientation to the subject. This theme is kept alive throughout the text. I have tried to
demonstrate that biostatistics is not just statistics applied to medicine and health sciences but goes
two steps further, providing tools to manage some aspects of medical uncertainties.
The primary target audiences are students, researchers, and professionals of medicine and
health. These include clinicians who deal with medical uncertainties in managing patients and
want to practice evidence-based medicine; research workers who design and conduct empirical
investigations to advance knowledge, including research workers in the pharmaceutical industry who
search for new regimens that are safer and more effective yet less expensive and more convenient;
and health administrators who are concerned with epidemiological aspects of health and disease.
Although the text is tilted to the viewpoint of medical and health professionals, the contents
are of sufficient interest to a practicing biostatistician and a student of biostatistics as well. They
may find some sections very revealing, particularly the heuristic explanations provided for
various statistical methods.
The boundary between epidemiological methods and biostatistics is thin, if it exists at all. This text does
not limit itself to the conventional topics of confidence intervals and tests of significance. It
discusses at length study designs, measurement of health and diseases, clinimetrics, and quality
control in a medical setup. The text fosters the thought that medicine has to be individualized yet
participatory. It tries to develop pathways that can achieve this through biostatistical thinking.
Emphasis is laid on the concepts and interpretation of the methods rather than on theory or
intricacies. Theoretical development is intentionally de-emphasized and applications increasingly
emphasized. A large number of real-life examples are included that illustrate the method and
explain the medical meaning of the results. Many statistical concepts are repeatedly explained in
different contexts to bring home the point, keeping the requirement of the target audience in
mind.
In the process of projecting biostatistics as a medical discipline, it is imperative to place less
emphasis on mathematical aspects. But the essential algebra, which is needed to communicate
and understand some statistical concepts, is not ignored. In fact, the second half of the book
makes liberal use of notations. An attempt is made to strike an even balance. Medical and health
professionals, who are generally not well trained in mathematics, may find the language and
presentation very conducive. Equations and formulas are separately identified and manual
calculations are described for the fundamentals, but the emphasis is on the use of computers for
advanced calculations. Software illustrations for intricate methods are provided in the Appendix to
this book.
The text is fairly comprehensive and incorporates a large number of statistical concepts used
in medicine and health. The contents are more than an introduction and less than an advanced
treatise. References have been provided for further reading. A medical or a health professional
should be able to plan and carry out an investigation independently on the basis of this text and
intelligently seek the help of an expert biostatistician when needed. Medical laboratory
professionals, scientists in basic medical sciences, epidemiologists, public health specialists,
nutritionists, and others in health-related disciplines may also find this volume useful. The text is
expected to provide a good understanding of the statistical concepts required to critically
examine the medical literature. The material is suitable for use in preparation for professional
examinations such as that for membership in the College of Physicians. The content is also broad
enough to cover an undergraduate biostatistics course for medical and health science students.
I am thankful to the reviewers around the world who have examined the book
microscopically and provided extremely useful suggestions for its improvement while also
finding the first edition 'probably the most complete book on biostatistics' and the second edition
'almost encyclopedic in breadth'. This edition incorporates most of these suggestions. The
second edition increased the coverage and now the third edition increases the depth. Some details
left out earlier have been included to provide more intelligible reading. Yet, many important
techniques continue to be left out of this text. This reflects my choice to avoid discussing
complexities, as the book is designed primarily for medical professionals.
The sequence of chapters may not look natural to statisticians because their thoughts follow
a mathematical continuum, but it may look natural to medical and health professionals whose
biostatistics needs are for problem solving.
I am confident that the book will be found to be the most comprehensive treatise on
biostatistical methods. In the process, I realize I am undertaking the risk involved in including
elementary- and middle-level discussions in the same book. I would be happy to receive
feedback from readers.
Abhaya Indrayan

Summary Tables
SUMMARY-1: Methods to compute some confidence intervals
Parameter of Interest | Conditions | 95% CI
Proportion (π) | Large n, p ≠ 0 and p ≠ 1 | Equation 12.11
 | Small n, any p | Figure 12-4
 | Any n, p = 0 or 1 (bound) | Table 12-4
Mean (μ) | Large n, σ known, almost any underlying distribution | Equation 12.14
 | Small n, σ known or unknown, underlying nonGaussian | Table 12-5 (CI for median)
 | Any n, σ unknown, underlying Gaussian | Equation 12.15
 | Large n, σ unknown, underlying nonGaussian | Equation 12.15
 | Small n, σ known, underlying Gaussian | Equation 12.14
Median | Gaussian distribution | Equation 12.18
 | NonGaussian conditions | Table 12-5
Difference (π1 – π2) | Large n1, n2: independent samples | Equation 12.20
 | Large n1, n2: paired samples | Equation 12.23
Difference (μ1 – μ2) | Independent samples (σ unknop), large n1, n2: any underlying distribution | Equation 12.21
 | Independent samples (σ unknown), small n1, n2: underlying Gaussian | Equation 12.21
 | Paired samples | Same as for one sample after taking the difference
Relative risk | Large n1, n2: independent samples | Equation 14.4
 | Large n1, n2: paired samples | Same as for OR
Attributable risk | Large n1, n2: independent samples | Same as for π1 – π2
 | Large n1, n2: paired samples | Equation 14.12
Number needed to treat | Large n1, n2: independent samples | Section 14.1.3
Odds ratio | Large n1, n2: independent samples | Equation 14.18
 | Large n1, n2: paired samples | Equation 14.21
Regression coefficient | Large n | Section 16.3.1
Regression line | Large n | Section 16.3.1
Logistic coefficient | Large n | Section 17.2.2
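
As an illustration of the first row of SUMMARY-1, the following is a minimal sketch of a large-sample 95% CI for a proportion. Equation 12.11 is not reproduced in these front pages, so the usual Wald-type form p ± z·sqrt(p(1 – p)/n) is assumed here and may differ in detail from the book's formula. The function name and the example counts are hypothetical.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample (Wald-type) CI for a proportion.

    Assumed form: p +/- z * sqrt(p * (1 - p) / n). It is meant only for
    large n with p not equal to 0 or 1, as in the first row of SUMMARY-1.
    """
    if n <= 0 or not 0 < successes < n:
        raise ValueError("need n > 0 and 0 < successes < n")
    p = successes / n
    # Two-sided critical value of the standard Gaussian distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Hypothetical data: 30 positive findings among 100 subjects.
lower, upper = proportion_ci(30, 100)
print(f"95% CI: {lower:.3f} to {upper:.3f}")
```

The guard on `successes` mirrors the table's condition p ≠ 0 and p ≠ 1; for small n or for p at the boundary, the table points to other methods that this sketch does not attempt.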

SUMMARY-2: Statistical procedures for test of hypothesis on proportions


Parameter of Interest and Setup | Conditions | Main Criterion | Equation/Section
Small Sized Tables
One dichotomous variable | Independent trials, any n | Binomial | Use Equation 13.1
 | Independent trials, large n | Gaussian Z | Equation 13.3
One polytomous variable | Independent trials, large n | Goodness-of-fit chi-square | Equation 13.5
 | Independent trials, small n | Multinomial | Use Equation 13.6
Two dichotomous variables (2×2) | Two independent samples, large n | Chi-square or Gaussian Z | Equation 13.8 or Equation 13.9
 | Two independent samples, small n | Fisher exact | Equation 13.11
 | Detecting a medically important difference, large n | Gaussian Z | Equation 13.10
 | Equivalence test | TOSTs | Section 13.2.3
 | Matched pairs, large n | McNemar | Equation 13.12
 | Matched pairs, small n | Binomial | Equation 13.13
 | Crossover design, large n | Chi-square | Section 13.2.2
 | Crossover design, small n | Fisher exact | Equation 13.11
Bigger Tables, No Matching (large n required; the case of small n is not discussed in this text)
Association | 2×C tables | Chi-square | Equation 13.15
Trend in proportions | 2×C tables | Chi-square for trend | Equation 13.16
Dichotomy in repeated measures | Many related 2×2 tables | Cochran Q | Equation 13.18
Association | R×C tables | Chi-square | Equation 13.15
Association | Three-way tables, test of full independence | Chi-square | Equation 13.19
 | Three-way tables, test of other types of independence (log–linear models) | G2 | Three-way extension of Equation 13.22
I×I Table | Matched pairs | McNemar–Bowker | Section 13.3.2
Stratified | Stratified into many 2×2 tables | Mantel–Haenszel chi-square | Equation 14.26

SUMMARY-3: Procedures for test of hypothesis on relative risk (RR) and odds ratio (OR)
Parameter of Interest and Setup | Conditions | Main Criterion | Equation/Section
Relative and Attributable Risks (large n required; the case of small n is not discussed in this text)
ln(RR) | Two independent samples | Gaussian Z or Chi-square | Equation 14.5 or Equation 13.8
RR | Matched pairs | As for OR: Gaussian Z or McNemar | Section 14.2.2: Equation 14.22 or Equation 14.23
 | Stratified | Mantel–Haenszel chi-square | Equation 14.26
AR | Two independent samples | Chi-square or Gaussian Z | Equation 13.8 or Equation 13.9
 | Matched pairs | McNemar | Equation 13.12
Odds Ratio (large n required; the case of small n is not discussed in this text)
ln(OR) | Two independent samples | Chi-square | Equation 13.8
OR | Matched pairs | Gaussian Z or McNemar | Equation 14.22 or Equation 14.23
 | Stratified | Mantel–Haenszel chi-square | Equation 14.26

SUMMARY-4: Statistical procedures for test of hypothesis on means or locations


Setup | Conditions | Main Criterion | Equation/Section
One sample | Comparison with prespecified, Gaussian, σ known | Gaussian Z | Section 15.1.1
 | Comparison with prespecified, Gaussian, σ not known | Student t | Equation 15.1
Comparison of two groups | Paired, Gaussian | Student t | Equation 15.3
 | Paired, nonGaussian, any n | Sign test | Equation 15.17a–c
 | Paired, nonGaussian, 5 ≤ n ≤ 19 | Wilcoxon signed-ranks WS | Equation 15.18a
 | Paired, nonGaussian, 20 ≤ n ≤ 29 | Standardized WS referred to Gaussian Z | Equation 15.18b
 | Paired, nonGaussian, n ≥ 30 | Student t | Equation 15.3
 | Unpaired, Gaussian, equal variances | Student t | Equation 15.6a
 | Unpaired, Gaussian, unequal variances | Student t | Equation 15.6b
 | Unpaired, nonGaussian, n1, n2 between (4, 9) | Wilcoxon rank-sum WR | Equation 15.19
 | Unpaired, nonGaussian, n1, n2 between (10, 29) | Standardized WR referred to Gaussian Z | Equation 15.20
 | Unpaired, nonGaussian, n1, n2 ≥ 30 | Student t | Equation 15.6a or Equation 15.6b
 | Crossover design, Gaussian | Student t | Section 15.1.3
 | Up-and-down trial | | Section 15.1.4
 | Detecting medically important difference | Student t | Equation 15.23
 | Equivalence tests | Student t | Section 15.4.2
Comparison of three or more groups | One-way layout, Gaussian | ANOVA F | Equation 15.8
 | One-way layout, nonGaussian, n ≤ 5 | Kruskal–Wallis H | Equation 15.21
 | One-way layout, nonGaussian, n ≥ 6 | H referred to chi-square | Equation 15.21
 | Two-way layout, Gaussian | ANOVA F | Section 15.2.2
 | Two-way layout, nonGaussian (one observation per cell), J ≤ 13 and K = 3 | Friedman S | Equation 15.22a or Equation 15.22b
 | Two-way layout, nonGaussian (one observation per cell), J ≤ 8 and K = 4 | Friedman S | Equation 15.22a or Equation 15.22b
 | Two-way layout, nonGaussian (one observation per cell), J ≤ 5 and K = 5 | Friedman S | Equation 15.22a or Equation 15.22b
 | Two-way layout, nonGaussian (one observation per cell), larger J, K | S referred to chi-square | Equation 15.22a or Equation 15.22b
 | Multiple comparisons, Gaussian, all pairwise | Tukey D | Equation 15.15
 | Multiple comparisons, Gaussian, with control group | Dunnett | Section 15.2.4
 | Multiple comparisons, Gaussian, few comparisons | Bonferroni | Section 15.2.4
 | Repeated measures, Gaussian | | Section 15.2.3

SUMMARY-5: Methods for studying the nature of relationship


Dependent Variable (y) | Independent Variables (xs) | Method | Equation/Section
Quantitative (a) | Qualitative | ANOVA | Section 15.2
Quantitative | Quantitative | Quantitative regression | Chapter 16
Quantitative | Mixture of qualitative and quantitative | ANCOVA | Section 16.3.2
Qualitative (dichotomous) | Qualitative or quantitative or mixture | Logistic | Sections 17.1 and 17.2
Qualitative (polytomous) | Qualitative or quantitative or mixture | Logistic, any two categories at a time | Section 17.3.2
 | Quantitative | Discriminant | Section 19.2.3
Survival | Groups | Life table | Equation 18.8
 | | Kaplan–Meier | Equation 18.10
 | | Log–rank | Section 18.3.1
Hazard ratio | Mixture of qualitative and quantitative | Cox model | Section 18.3.2
Note: Large n required, particularly for tests of significance. Exact method for small n not
discussed in this text.
(a) Quantitative are variables on metric scale without any broad categories. Fine categories are
admissible.

SUMMARY-6: Main methods of measurement of strength of relationship between two variables


Type of Variables | Measure | Equation/Section
Both qualitative, binary categories | OR and several others | Section 17.5.1
Both qualitative, polytomous categories (nominal) | Phi-coefficient | Equation 17.7a
 | Contingency coefficient | Equation 17.7b
 | Cramer V | Equation 17.7c
 | Proportional reduction in error | Equation 17.8
Both qualitative, polytomous categories (ordinal) | Kendall tau, Goodman–Kruskal gamma, Somers d | Section 17.5.1
Dependent qualitative and independent quantitative | Odds ratio | Section 17.1
Dependent quantitative and independent qualitative | R2 from ANOVA | Equation 17.9
Both quantitative | η2 from regression | Equation 16.7
Both quantitative, for multiple linear | R2 from regression | Use Equation 16.7
Both quantitative, for simple linear | r | Equation 16.17
Both quantitative, for monotonic | rS | Equation 16.19
Both quantitative, for intraclass | rI | Equation 16.20 or 16.21
Agreement, qualitative | Cohen kappa | Equation 17.10
Agreement, quantitative | Limits of disagreement | Section 16.5.2
 | Intraclass | Equation 16.20 or 16.21
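
To make the agreement row of SUMMARY-6 concrete, here is a minimal sketch of Cohen kappa for a square table of counts from two raters. Equation 17.10 is not reproduced in these front pages; the sketch assumes the standard definition κ = (po – pe) / (1 – pe), where po is the observed and pe the chance-expected proportion of agreement. The function name and the counts are hypothetical.

```python
def cohen_kappa(table):
    """Cohen kappa for a square table of counts (rater 1 rows, rater 2 columns).

    Assumed standard form: kappa = (p_observed - p_expected) / (1 - p_expected).
    """
    k = len(table)
    if any(len(row) != k for row in table):
        raise ValueError("table must be square")
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: proportion of subjects on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance from the marginal totals.
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
    if p_expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings of 50 subjects by two observers (positive/negative).
ratings = [[20, 5],
           [10, 15]]
kappa = cohen_kappa(ratings)
print(f"kappa = {kappa:.2f}")
```

Here the observers agree on 35 of 50 subjects (po = 0.7) while the marginal totals give pe = 0.5, so only part of the raw agreement exceeds what chance alone would produce.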

SUMMARY-7: Multivariate methods in different situations (large n required)


Nature of the Variables | Objective | Types of Variables | Statistical Method | Section
A dependent set and an independent set | Relationship | Both quantitative | Multivariate multiple regression | Section 19.2.1
 | Equality of means of dependents | Dependent quantitative and independent qualitative | MANOVA | Section 19.2.2
Dependent is one of many groups | Classify subjects into known groups | Independent quantitative | Discriminant analysis | Section 19.2.3
All variables interrelated (none is dependent) | Discover natural clusters of subjects | Qualitative or quantitative or mixed | Cluster analysis | Section 19.3.1
 | Identify underlying factors that explain the interrelations | Quantitative | Factor analysis | Section 19.3.2

Note: Situations not mentioned in Summary Tables 1–7 are not discussed in this book.

Frequently Used Notations


I have tried to restrict the mathematical expressions to a minimum but notations have been used
so that clarity and generalizability do not suffer. Some notations have been used for more than
one quantity. The following list may help you to understand the text more easily. The list is not
exhaustive. Some notations that have been sparingly used in specific contexts are not included.

χ2 chi-square
1–α confidence level
1–β power
A antecedent characteristic
A, B, C, D frequencies in a 2×2 table—retrospective matched pairs
a, b, c, d frequencies in a 2×2 table, particularly in case of relative risk
and odds ratio
ak upper end point of the kth interval (k = 1, 2, …, K)
akm factor loading of mth factor on kth variable (m = 1, 2, …, M; k
= 1, 2, …, K)
b sample regression coefficient in simple linear regression
bar (–) over a variable mean of that variable, such as x̄ and ȳ
bk sample regression coefficient for the kth regressor
bmk factor score coefficient of mth factor for kth variable
C cumulative frequency, number of columns in a contingency
table, contingency coefficient, complaints or symptoms
complex, contribution to log-likelihood, number of controls
per case
d difference, discriminant value, Euclidean distance
D discriminant function, disease
Dk kth discriminant function
dot (•) sum over the corresponding group
e residual, Naperian base
Ek, Erc, Ercl expected frequency in the subscripted group
et expectation of life at age t in a life table
F ANOVA criterion
f function
fk frequency in the kth group
Fm mth factor (m = 1, 2, …, M)
H Kruskal–Wallis criterion
H0 null hypothesis
H1 alternative hypothesis
hat (^) over a parameter or a variable estimated or predicted value of the parameter or of the
variable
I sampling interval, smoking index
J number of groups, number of independent variables
K number of groups, number of dependent variables
L number of layers in a contingency table, likelihood, precision,
number of years lived (in life table)
L0 likelihood under H0
L1 likelihood under the model
M number of factors, number of observers, number of methods,
number of items
max maximum value among the columns
N number of subjects in a population
n size of sample
nk number of subjects in the kth group in a sample
O outcome
Ok, Orc, Orcl observed frequency in the subscripted group (k = 1, 2, …, K; r
= 1, 2, …, R; c = 1, 2, …, C; l = 1, 2, …, L)
P probability, particularly of Type–I error
p proportion in the sample, estimated probability
P(–) negative predictivity
P(+) positive predictivity
Q Cochran statistic
Q1 first quartile
Q3 third quartile
R number of rows in a contingency table
r product–moment correlation coefficient in a sample
R2 square of the coefficient of multiple correlation
rI intraclass correlation coefficient in a sample
Rij rank of Yij (in case of nonparametric methods)
rs Spearman's rank correlation coefficient in a sample
s sample standard deviation
S(–) specificity
S(+) sensitivity
S1, S2 Friedman criterion
sd standard deviation of difference in a sample
sp pooled estimate of the standard deviation
t Student t
T Test, number of time points (t = 1, 2, …, T)
tν t-value at ν degrees of freedom
WR Wilcoxon rank-sum criterion
WS Wilcoxon signed-ranks criterion
x, y variable values
X[i] ordered value of x at ith rank
xi, xk ith or kth observation, midpoint of an interval
xij, yij, yijk ith observed value of the variables x or y in the jth or (j,k)th
group (i = 1, 2, …, n; j = 1, 2, …, J; k = 1, 2, …, K)
z specific value of Z
Z Z-score or a standardized Gaussian variable
zα the value such that P(Z ≥ zα) = α
α level of significance
β probability of Type–II error
βk regression coefficient in the population for the kth regressor
ε relative precision
η2 coefficient of determination
κ Cohen kappa
λ logistic of π
Λ Wilks criterion
λ(t) Hazard at time t
μ population mean
μj, μk population mean in the subscripted group (j = 1, 2, …, J; k =
1, 2, …, K)
ν degrees of freedom
π population proportion or probability
πrc, πk π in the subscripted group (r = 1, 2, …, R; c = 1, 2, …, C; k =
1, 2, …, K)
ρ product–moment correlation coefficient in the population
ρs rank correlation coefficient in the population
σ population standard deviation
Σ sum
φ phi-coefficient
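
The notation zα above is defined by P(Z ≥ zα) = α for a standardized Gaussian variable Z. As a minimal sketch of what that definition means numerically, the value can be obtained from the inverse of the Gaussian cumulative distribution at 1 – α. The helper name `z_alpha` is hypothetical and is not taken from the book.

```python
from statistics import NormalDist


def z_alpha(alpha):
    """Return z such that P(Z >= z) = alpha for a standard Gaussian Z."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # P(Z >= z) = alpha is the same as P(Z <= z) = 1 - alpha.
    return NormalDist().inv_cdf(1 - alpha)


# Upper-tail cut-offs for a 5% level of significance.
z_one_sided = z_alpha(0.05)    # all of alpha in the upper tail
z_two_sided = z_alpha(0.025)   # alpha split equally between two tails

# Check the defining property: the upper-tail area equals alpha.
tail_area = 1 - NormalDist().cdf(z_two_sided)
print(round(z_one_sided, 3), round(z_two_sided, 3), round(tail_area, 4))
```

For a confidence level 1 – α = 0.95, the two-sided value is the multiplier commonly rounded to 1.96 in Gaussian-based confidence intervals.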
