# BASIC CONCEPTS IN BIOSTATISTICS

Definition
• Numerous definitions by numerous authors
• Example: it is a set of concepts, rules, and procedures that help us to:
– organize numerical information in the form of tables, graphs, and charts;
– understand statistical techniques underlying decisions that affect our lives and well-being; and
– make informed decisions.

Our working definition
• It is the scientific study of numerical data based on natural phenomena.
– Scientific..
– Data…quantities of information..groups of individuals.
– Numerical: quantified in one way or another
– Natural phenomena: both natural and introduced


statistics (singular) vs statistics (plural)
• The word "statistics" is also used in another, though related, way. It can be the plural of the noun statistic, which refers to any one of many computed or estimated statistical quantities, such as the mean, the standard deviation, or the correlation coefficient. Each one of these is a statistic.
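To make the idea of "a statistic" concrete, the sketch below computes two of the quantities named above from a small sample. The body weights are made-up illustrative values, not data from this lecture.

```python
import statistics

# Hypothetical body weights (kg) of five people; illustrative values only.
weights = [60.0, 72.5, 55.0, 80.0, 67.5]

# Each computed quantity below is one statistic.
mean_weight = statistics.mean(weights)   # sample mean
sd_weight = statistics.stdev(weights)    # sample standard deviation (n - 1 denominator)

print(f"mean = {mean_weight:.1f} kg, sd = {sd_weight:.2f} kg")
```

Taken together, the mean and the standard deviation are two statistics (plural) describing the same sample.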


Variable
• Variable in general sense
– property of an object or event that can take on different values.
– More than one variable can be measured on each smallest sampling unit.

• Variable in strict sense
– It is a property with respect to which individuals in a sample differ in some ascertainable way.
– A variable is measured through measuring or counting (e.g., weight), sorting (e.g., gender), or ordering (e.g., severity of depression).


Variate/datum
• Any particular measured instance of a variable is spoken of as a variate: e.g., the measured body weight of this, that, or the other person; male and female are variates/data of the variable gender.



Classification of variable
• Classified based on
– Limit of possible values: discrete vs continuous variable
– Cause-effect 1: independent vs dependent variable
– Cause-effect 2: intervention vs outcome variable
– Qualitative vs quantitative variable
– Method of measurement (measurement scale): nominal, ordinal, interval, ratio

Discrete vs continuous Variable
• Discrete Variable - a variable that can assume only a whole number of distinct values, with nothing in between (e.g., gender (male/female), college class (freshman/sophomore/junior/senior)).
• Continuous Variable - a variable that can take on many different values: in theory, any value between the lowest and highest points on the measurement scale.


Independent vs dependent Variable

• Independent Variable - a variable that is manipulated, measured, or selected by the researcher as an antecedent condition to an observed behavior.
• Dependent Variable - a variable that is not under the experimenter's control. It is the variable that is observed and measured in response to the independent variable.


Qualitative vs Quantitative Variable

• Qualitative Variable - a variable based on categorical data.
– Nominal:
– Ordinal:

• Quantitative Variable - a variable based on quantitative data.
– Interval: meaningful distance, but no absolute zero
  • Discrete: no decimal
  • Continuous
– Ratio: e.g., weight

The Population in biostatistics

• The biological definition refers to all the individuals of a given species (perhaps of a given life-history stage or sex) found in a circumscribed area at a given time.
• In statistics, population always means the totality of individual observations (and at times totality of individuals) about which inferences are to be made, existing anywhere in the world or at least within a definitely specified sampling area limited in space and time.

Finite and infinite population
• Finite: a concrete collection of objects or creatures, such as the tail lengths of all the white mice in the world, or the leucocyte counts of all the Chinese men in the world.
• Infinite: outcomes of experiments, such as all the heartbeat frequencies produced in guinea pigs by injections of adrenalin; the experiment can be repeated an infinite number of times (at least in theory).


Census, Parameter, Statistic

• Census: enumeration or count of every member of the population (e.g., a mental health survey of all Nigerians).
• Parameter: summary measure of the individual observations made in a census of an entire population.
– E.g., average GHQ score
• Statistic: summary measure obtained from a sample.
– E.g., average GHQ score of 5,000 Nigerians selected from the 6 geopolitical zones
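The distinction between a parameter and a statistic can be illustrated with a small simulation. This is only a sketch: the population is a made-up list of scores (not real GHQ data), and the sizes 10,000 and 500 are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical finite population: 10,000 made-up scores between 0 and 28.
population = [random.randint(0, 28) for _ in range(10_000)]

# Census: every member is measured, so the mean is a parameter.
parameter_mean = statistics.mean(population)

# Sample: only 500 members are measured, so the mean is a statistic.
sample = random.sample(population, 500)
statistic_mean = statistics.mean(sample)

# The statistic estimates the parameter; the difference is sampling error.
sampling_error = statistic_mean - parameter_mean
print(f"parameter = {parameter_mean:.2f}, statistic = {statistic_mean:.2f}")
```

The parameter is fixed for a given population, while the statistic changes from one sample to the next.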

Samples in biostatistics
• A sample is a subset of a population.
• A sampling unit is a single instance of the sample about which observations or measurements are taken. The sampling units frequently, but not necessarily, are also individuals in the ordinary biological sense. Discuss.



Sampling frame
• A complete list of all the eligible sampling units, from which the sample is drawn for the study.
• A requirement for simple random sampling
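A sampling frame makes simple random sampling straightforward, because every listed unit can be given the same chance of selection. A minimal sketch, assuming a hypothetical frame of 1,000 unit IDs and a sample size of 50 (both invented for illustration):

```python
import random

# Hypothetical sampling frame: one ID per eligible sampling unit.
sampling_frame = [f"unit-{i:04d}" for i in range(1, 1001)]

# Simple random sample without replacement: each unit in the frame
# has the same probability of being selected.
rng = random.Random(42)  # seeded generator for reproducibility
sample_ids = rng.sample(sampling_frame, k=50)

print(sample_ids[:5])
```

Without a complete frame, some eligible units could never be drawn, which is why the frame is a requirement for this design.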



TESTS OF SIGNIFICANCE 1: Conditions for parametric tests
• The data must have a normal distribution.
• Homogeneity of variance: 1. equality of variance of different groups; 2. the variance of one variable should be stable at all levels of the other variable.
• Data must be interval or ratio, not categorical.
• Independence of the different sets of data, that is, the sets of data are not from the same sample, except in the case of the repeated-sample t-test.
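The conditions on distribution shape and equal variances can be given a first look with simple descriptive summaries. The sketch below is illustrative only: the two groups are made-up values, the helper names `variance_ratio` and `sample_skewness` are hypothetical, and these summaries do not replace formal tests such as Levene's test or the Shapiro-Wilk test found in statistical packages.

```python
import statistics

def variance_ratio(a, b):
    """Ratio of the larger sample variance to the smaller one (always >= 1)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return max(va, vb) / min(va, vb)

def sample_skewness(values):
    """Moment-based skewness: 0 for perfectly symmetric data."""
    n = len(values)
    m = statistics.mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical measurements from two groups.
group_a = [5.1, 4.9, 5.6, 5.8, 5.0, 5.3]
group_b = [6.2, 5.9, 6.8, 6.1, 6.5, 6.0]

print("variance ratio:", round(variance_ratio(group_a, group_b), 2))
print("skewness of group_a:", round(sample_skewness(group_a), 2))
```

A variance ratio close to 1 and skewness close to 0 are consistent with the assumptions, but neither is proof that they hold.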

TESTS OF SIGNIFICANCE 2: Difference in means

| Analysis | Parametric test | Non-parametric test |
| --- | --- | --- |
| Difference between a mean value from a sample/population and that of a specified value | One-sample t-test | ?? One-sample Kolmogorov-Smirnov (K-S) test |
| Difference between two means from (two) independent samples/populations | Two-independent-samples t-test | Mann-Whitney U, ?? Kolmogorov-Smirnov Z, Moses Extreme Reactions, and Wald-Wolfowitz runs |
| Difference between two means from repeated measures of the same sample/population | Paired-sample t-test | Two-related-samples tests: Wilcoxon Signed Ranks Test, Sign, and McNemar |
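For the comparison of two independent samples, the parametric and non-parametric test statistics can be computed by hand. The sketch below assumes the pooled-variance (Student) form of the t-test and uses made-up data; the function names `pooled_t` and `mann_whitney_u` are illustrative. It returns the test statistics only; a p-value still requires the t distribution or U tables (or a statistical package).

```python
import math
import statistics

def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance, plus its df."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, na + nb - 2

def mann_whitney_u(a, b):
    """Mann-Whitney U: the smaller of the two U values; ties count one half."""
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)

# Hypothetical scores from two independent groups.
group_1 = [1.0, 2.0, 3.0]
group_2 = [4.0, 5.0, 6.0]

t_stat, df = pooled_t(group_1, group_2)
u_stat = mann_whitney_u(group_1, group_2)
print(f"t = {t_stat:.3f} (df = {df}), U = {u_stat}")
```

The t statistic uses the actual values and their variances, whereas U depends only on the ordering of the observations, which is why it does not need the normality assumption.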
TESTS OF SIGNIFICANCE 2: Difference in means (continued)

| Analysis | Parametric test | Non-parametric test |
| --- | --- | --- |
| Difference between more than two means from independent samples | One-Way ANOVA | Kruskal-Wallis Test |
| Difference between more than two means from repeated measures: independent variable is categorical; dependent is quantitative | Repeated-measures ANOVA | Friedman, Kendall's, Cochran's |
| Difference between more than two means from independent samples with two or more independent variables | ANCOVA | |
| Difference between more than two means from independent samples with two or more dependent variables | MANOVA | |
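The F statistic behind a one-way ANOVA is the ratio of between-group to within-group variability. A minimal sketch with made-up data for three independent groups; the function name `one_way_anova_f` is illustrative, and the p-value (which needs the F distribution) is left out.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    group_means = [statistics.mean(g) for g in groups]

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variability of the observations around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical measurements from three independent groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the scatter within groups would suggest.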

TESTS OF SIGNIFICANCE 3: relationships

| Analysis | Parametric test | Non-parametric test |
| --- | --- | --- |
| Correlation between two sets of quantitative variables (assuming NO cause-effect relationship: both variables quantitative) | Pearson Correlation | Kendall's tau_b; Spearman's rho |
| Correlation between two sets of quantitative variables (assuming NO cause-effect relationship: both variables quantitative) plus controlling for confounding variables | Partial correlation coefficient | |
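The difference between the Pearson coefficient and Spearman's rho can be seen on a relationship that is perfectly monotonic but not linear. The sketch below uses made-up data and illustrative helper names (`pearson_r`, `ranks`, `spearman_rho`); Spearman's rho is computed here as the Pearson correlation of the ranks, with tied values given their average rank.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: y increases with x, but along a curve.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Because rho uses only the ordering of the values, it reaches 1 for any strictly increasing relationship, while Pearson's r reaches 1 only when the relationship is a straight line.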

TESTS OF SIGNIFICANCE 3: relationships (continued)

| Analysis | Parametric test | Non-parametric test |
| --- | --- | --- |
| Correlation between two sets of quantitative variables (assuming there is a cause-effect relationship: both variables quantitative). Here, we talk of dependent and independent variables | Linear Regression | Logistic Regression |
| Correlation between two sets of quantitative variables (assuming there is a cause-effect relationship: both variables quantitative). Here, we talk of one dependent and two or more independent variables | Multiple Linear Regression | |
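With one independent and one dependent variable, simple linear regression fits a straight line by least squares. A minimal sketch, assuming made-up dose and response values; the names `fit_line`, `dose`, and `response` are illustrative.

```python
import statistics

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical data: dose is the independent variable, response the dependent one.
dose = [1.0, 2.0, 3.0, 4.0]
response = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(dose, response)
print(f"response = {intercept:.1f} + {slope:.1f} * dose")
```

The slope estimates how much the dependent variable changes per unit change in the independent variable; multiple linear regression extends the same idea to two or more independent variables.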

TESTS OF SIGNIFICANCE 4: Reconciliation

| Analysis | Parametric test | Non-parametric test |
| --- | --- | --- |
| Correlation between two sets of quantitative variables (assuming NO cause-effect relationship: both variables quantitative) | MANOVA | |
| Several variables, need to reduce to dimensions/factors, e.g., Cattell's 16 factors | Exploratory factor analysis | |
