


COURSE: B. Pharm 8th semester

Observations from biological laboratory experiments, clinical trials, and health
surveys always carry some amount of uncertainty. In many cases, especially in
laboratory experiments, this uncertainty cannot simply be ignored because the
observations vary widely. Tools from statistics are very useful
in analysing this uncertainty and filtering noise from data. Also, with
advances in microscopy and molecular tools, rich data sets can be generated
from experiments. Making sense of these data requires integrating them
using tools from statistics. Statistical tools can therefore be used to:
 analyse our observations
 design new experiments, and
 integrate a large number of observations into a single unified model.

Biostatistics can thus be described as the science that helps in managing medical
uncertainties. It mainly consists of steps such as the generation of hypotheses,
the collection of data and the application of statistical analysis.

Biostatistics is conventionally divided into two aspects:

1. RESEARCH: This includes designing and conducting experiments to
collect the data.
2. STATISTICAL ANALYSIS: This includes use of various types of statistical
tools to analyse and interpret the data collected in research.



FRANCIS GALTON is called the “FATHER OF BIOSTATISTICS”. He was the first to
use statistical tools to study differences among human populations. He
also introduced the use of questionnaires and surveys for collecting data
from human communities.
Research is a careful investigation or inquiry, especially through a search for new
facts, in any branch of knowledge. It is an original contribution to the existing
stock of knowledge, making for its advancement. Research can simply be defined
as the task of searching through available data to modify a certain result or theory.
There are various bases to classify the research:
1. On the Basis of Objectives of Research
On the basis of objectives of research they are of two types:
 Fundamental research and
 Action research.
2. On the Basis of Approach of Research
On the basis of approach of Research they are of two types:
 Longitudinal research: Historical research, case studies and genetic
research come under the longitudinal approach of research.
 Cross-sectional research: Experimental research and surveys are
examples of cross-sectional research.
3. On the Basis of Precision in Research Findings
On the basis of precision (accuracy), research is of two types:
 Experimental research and
 Non-experimental research.
Experimental research is precise while non-experimental research is not.
4. On the Basis of Nature of Findings
On the basis of findings, research is of two types:
 Explanatory research: Such research is more concerned with
theories, laws and principles.
 Descriptive research: Such research is more concerned with facts.
5. According to National Science Foundation
The National Science Foundation formulated a three-fold classification
of research.
 Basic research: Research that embraces original or unique
investigation for the advancement of knowledge.
 Applied research: Which may be characterized as the utilization in
 Development research: It is the use of scientific knowledge for the
production of useful materials, devices, systems, methods or
processes, excluding design and production engineering.
6. Another Classification
 Ad hoc research: Ad hoc research is a class of inquiry carried out for
one special purpose alone.
 Empirical research: Empirical research is that which depends upon
the experience or observation of phenomena and events.
 Explained research: Explained research is that which is based on a
 Borderline research: Borderline research is that which involves
two main branches or areas of science, for example the study of
public school finance.


Research methods are all those methods and techniques that are used for
conducting research. The term refers to the methods researchers use in
performing research operations. They can be put under three groups –

1. Methods concerned with the collection of data.

2. Statistical techniques used for establishing relationship between
3. Methods to evaluate the accuracy of the results.

Research Methodology is a way to systematically solve a research problem. It
is the science of studying how research is done scientifically. Essentially it is the
procedure by which researchers go about their work of describing,
evaluating and predicting phenomena. It aims to give the work plan of
research. It provides training in choosing methods, materials, scientific tools and
techniques relevant to the solution of the problem.


One should remember that the various steps involved in a research process are
not mutually exclusive; nor are they separate and distinct. They do not
necessarily follow each other in any specific order and the researcher has to be
constantly anticipating at each step in the research process the requirements of
the subsequent steps. However, the following order concerning various steps
provides a useful procedural guideline regarding the research process:
1. Formulating the research problem: Essentially two steps are involved in
formulating the research problem, viz., understanding the problem
thoroughly, and rephrasing the same into meaningful terms from an
analytical point of view. The researcher must at the same time examine
all available literature to get himself acquainted with the selected
problem. He may review two types of literature—the conceptual
literature concerning the concepts and theories, and the empirical
literature consisting of studies made earlier which are similar to the one
2. Extensive literature survey: Once the problem is formulated, a brief
summary of it should be written down. It is compulsory for a research
worker writing a thesis for a Ph.D. degree to write a synopsis of the topic
and submit it to the necessary Committee or the Research Board for
approval. At this juncture the researcher should undertake an extensive
literature survey connected with the problem. Academic journals,
conference proceedings, government reports, books etc., must be tapped
depending on the nature of the problem.
3. Development of working hypothesis: A working hypothesis is a tentative
assumption made in order to draw out and test its logical or empirical
consequences. As such the manner in which research hypotheses are
developed is particularly important since they provide the focal point for
research. They also affect the manner in which tests must be conducted
in the analysis of data and indirectly the quality of data which is required
for the analysis. The role of the hypothesis is to guide the researcher by
delimiting the area of research and to keep him on the right track.
4. Preparing the research design: Research design states the conceptual
structure within which research would be conducted. The function of
research design is to provide for the collection of relevant evidence with
minimal expenditure of effort, time and money. There are several
research designs, such as, experimental and non-experimental hypothesis
testing. Experimental designs can be either informal designs (such as
before-and-after without control, after-only with control, before-and-
after with control) or formal designs (such as completely randomized
design, randomized block design, Latin square design, simple and complex
factorial designs), out of which the researcher must select one for his own
5. Sampling Design: A complete enumeration of all the items in the
‘population’ is known as a census inquiry. Census inquiry is not possible in
practice under many circumstances. For instance, blood testing is done
only on sample basis. Hence, quite often we select only a few items from
the universe for our study purposes. The items so selected constitute
what is technically called a sample. The researcher must decide the way
of selecting a sample or what is popularly known as the sample design. In
other words, a sample design is a definite plan determined before any
data are actually collected for obtaining a sample from a given population.
Samples can be either probability samples or non-probability samples.
With probability samples each element has a known probability of being
included in the sample but the non-probability samples do not allow the
researcher to determine this probability.
6. Collection of data: Primary data can be collected either through
experiment or through survey. If the researcher conducts an experiment,
he observes some quantitative measurements, or the data, with the help
of which he examines the truth contained in his hypothesis. But in the
case of a survey, data can be collected in any one or more of the following ways:
 By observation
 By personal interviews
 By telephonic interviews
 By questionnaires
7. Execution of the project: The researcher should see that the project is
executed in a systematic manner and in time. Steps should be taken to
ensure that the survey is under statistical control so that the collected
information is in accordance with the pre-defined standard of accuracy.
8. Analysis of data: The analysis of data requires a number of closely related
operations such as establishment of categories, the application of these
categories to raw data through coding, tabulation and then drawing
statistical inferences. Various statistical tests come into play to analyse
the data.
9. Hypothesis testing: After analysing the data as stated above, the
researcher is in a position to test the hypotheses, if any, he had
formulated earlier. Do the facts support the hypotheses or do they happen
to be contrary? This is the usual question which should be answered while
testing hypotheses. Various tests, such as Chi square test, t-test, F-test,
have been developed by statisticians for this purpose.
10. Generalisation and interpretation of results: If a hypothesis is tested and
upheld several times, it may be possible for the researcher to arrive at
generalisation, i.e., to build a theory. If the researcher had no hypothesis
to start with, he might seek to explain his findings on the basis of some
theory. It is known as interpretation. The process of interpretation may
quite often trigger off new questions which in turn may lead to further
11. Preparation of a report or thesis: The layout for a research report
should be in the order:-
 Preliminary Pages – The research report must contain the full title,
foreword and acknowledgement in the preliminary pages.
 Main body or text – The main text must contain an introduction,
summary of findings, main report and conclusion.
 End Matter – The end matter of the report must contain
appendices covering all technical terms and data used in the
report and must end with a bibliography.
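
Steps 5 (sampling design) and 9 (hypothesis testing) above can be illustrated with a minimal Python sketch. It is not taken from these notes: the patient IDs, the sample size and the 2x2 counts are hypothetical, and the helper names simple_random_sample and chi_square_2x2 are introduced only for illustration. The sketch draws a simple random sample (a probability sample in which every item has the same known chance of selection) and then computes a Pearson chi-square statistic, comparing it with the standard 5% critical value for 1 degree of freedom (3.841).

```python
import random

def simple_random_sample(population, k, seed=0):
    """Probability sample: every item has the same known chance k/N of inclusion."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(population, k)

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Expected count for each cell = row total * column total / grand total
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical sampling frame of 1000 patient IDs
patient_ids = list(range(1, 1001))
sample = simple_random_sample(patient_ids, k=50)

# Hypothetical counts: rows = treated / control, columns = improved / not improved
chi2 = chi_square_2x2(30, 20, 18, 32)

CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05
reject_null = chi2 > CRITICAL_5PCT_DF1

print(len(sample), round(chi2, 3), reject_null)
```

With these made-up counts the statistic exceeds the critical value, so the null hypothesis of no association would be rejected at the 5% level; real data would of course need a test chosen to suit the study design.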

Statistics is a field of study concerned with (1) the organization and
summarization of data, and (2) the drawing of inferences about a body of data
when only a part of the data are observed. All statistical procedures can be
divided into two general categories: descriptive or inferential.
A. Descriptive statistics, as the name implies, describe data that we
collect or observe (empirical data). They represent all of the
procedures that can be used to organize, summarize, display, and
categorize data collected for a certain experiment or event.
Examples include: the frequencies and associated percentages; the
average or range of outcomes; and pie charts, bar graphs or other
visual representations for data.
B. Inferential statistics (sometimes referred to as analytical statistics
or inductive statistics) represent a wide range of procedures that
are traditionally thought of as statistical tests (e.g., Student’s t-tests,
analysis of variance, correlation, regression or various chi square
and related tests). These statistics infer or make predictions about
a larger body of information based on a sample (a small subunit)
from that body.
Descriptive statistics are one of the fundamental “must knows” for any set of
data. They give you a general idea of trends in your data. Descriptive statistics can
be further broken down into several sub-areas, like:

 Measures of central tendency.

 Measures of dispersion.
 Charts & graphs.
 Shapes of Distributions.

Measures of Central Tendency: Central tendency (sometimes called “measures
of location,” “central location,” or just “center”) is a way to describe what’s
typical for a set of data. Central tendency doesn’t tell you specifics about the
individual pieces of data, but it does give you an overall picture of what is going
on in the entire data set. There are three major ways to show central
tendency: mean, mode and median.
A. The mean is the average of a set of numbers. Add up all the numbers in a
set of data and then divide by the number of items in the set.
B. The median is the middle of a set of numbers. Place your data in order,
and the number in the exact center of the list is the median; with an even
number of values, take the average of the two middle values.
C. The mode is the most common number in a set of data.
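
The three measures above can be computed with Python's standard statistics module. The blood pressure readings below are hypothetical values chosen only to illustrate the calculations.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg)
readings = [118, 120, 120, 124, 130, 142]

mean_value = statistics.mean(readings)      # sum of the values / number of values
median_value = statistics.median(readings)  # even count: average of the two middle values
mode_value = statistics.mode(readings)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

Here the single high reading (142) pulls the mean above the median, which is one reason to report more than one measure of central tendency.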

Measures of Dispersion: Dispersion in statistics is a way of describing how
spread out a set of data is. When the dispersion of a data set is large, the values
in the set are widely scattered; when it is small, the items in the set are tightly
clustered. The spread of a data set can be described by a range of descriptive
statistics including variance, standard deviation, and interquartile range.
A. Variance measures how far a data set is spread out. It is mathematically
defined as the average of the squared differences from the mean.
B. Standard deviation is probably the most common measure. It tells you
how spread out numbers are from the mean, mathematically it is the
square root of variance.
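C. Interquartile range describes where the bulk of the data lies: it is the
difference between the third and first quartiles, i.e. the spread of the
middle 50% of the values.
D. Mean difference, or difference in means, measures the absolute difference
between the mean values of two different groups in a clinical trial.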
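
A minimal sketch of the four measures above, using Python's standard statistics module. All values are hypothetical. The variance is computed as the population variance (dividing by n), matching the definition in A; quartile conventions differ between textbooks and software, so the interquartile range shown depends on the default method of statistics.quantiles.

```python
import statistics

# Hypothetical data set
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(data)  # average squared difference from the mean
std_dev = statistics.pstdev(data)      # square root of the variance

# Quartiles: three cut points dividing the ordered data into four parts
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                          # spread of the middle 50% of the values

# Mean difference between two hypothetical trial arms (e.g. blood pressure, mmHg)
treatment = [120, 125, 130]
control = [130, 135, 140]
mean_difference = abs(statistics.mean(treatment) - statistics.mean(control))

print(variance, std_dev, iqr, mean_difference)
```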

Charts and Graphs: These are excellent for presenting a set of data and include
bar graphs, line charts, pie charts, box-and-whisker plots, histograms, scatter
plots, etc. Various computer software packages support graphical and tabular
presentation of a set of data. Visual presentation can reveal aspects of the
data that would otherwise go unnoticed.
Shapes of distributions: When a data set is graphed, the points arrange
themselves into one of dozens of different shapes. The shape of the distribution
gives a visual indication of:
 How spread out the data are (e.g. dispersion, variability, scatter),
 Where the mean lies,
 What the range of the data set is.
The factors that determine the shape are the number of peaks or modes (unimodal
or bimodal), symmetry about the mean (in a normal distribution the mean, median
and mode coincide) and skewness (the data points pile up on one side and the
distribution is not symmetrical).
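
Symmetry and skewness can also be checked numerically. The sketch below uses one common definition, the moment coefficient of skewness (third central moment divided by the 1.5 power of the second central moment); the notes do not name a specific formula, so this choice, the function name and the two small data sets are assumptions made for illustration.

```python
import statistics

def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5 (population form)."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]         # mirror image about the mean
right_skewed = [1, 1, 2, 2, 3, 10]  # long tail towards large values

print(skewness(symmetric), skewness(right_skewed))
```

A value near zero indicates a symmetrical distribution, a positive value a tail towards larger values, and a negative value a tail towards smaller values.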

Inferential Statistics allows you to make predictions (“inferences”) from
data. With inferential statistics, you take data from samples and make
generalizations about a population. There are two main areas of inferential
statistics:

1. Estimating parameters. This means taking a statistic from your sample data
(for example the sample mean) and using it to say something about a
population parameter (i.e. the population mean).
2. Hypothesis tests. This is where you can use sample data to answer research
questions. For example, you might be interested in knowing if a new cancer
drug is effective.
If we have sample data on a potential new cancer drug and have
calculated descriptive statistics to describe the sample, we can apply
inferential statistics to the data from this small number of people and try to
determine whether the drug will work for everyone (i.e. the population).
There are various ways this can be done, from calculating a z-score (z-scores
are a way to show where your data would lie in a normal distribution) to
post-hoc (advanced) testing.
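
As a small illustration of the z-score mentioned above, the sketch below standardises one observation and looks up its position in the standard normal distribution. The population mean, standard deviation and observed value are hypothetical, and the helper name z_score is introduced only for this example.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical population: mean 100, standard deviation 15
population_mean = 100
population_sd = 15
observation = 130

z = z_score(observation, population_mean, population_sd)

# Proportion of a standard normal distribution lying below this z value
proportion_below = NormalDist().cdf(z)

print(z, round(proportion_below, 4))
```

A z-score of 2 places the observation two standard deviations above the mean, higher than roughly 97.7% of values in a normal distribution.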
Inferential statistics use statistical models to help you compare your sample data
to other samples or to previous research. Most research uses statistical models
from the Generalized Linear Model family, which includes Student’s t-tests, ANOVA
(Analysis of Variance), regression analysis and various other models that result
in straight-line (“linear”) probabilities and results.
Statistical tests should conform to the sample features cited above: distribution
and pairing. But in order to select the best test, the number of groups or
observations should also be considered. The main tests for each situation are
summarised in the above flow chart.
The role and importance of biostatistics in clinical research date back to the 17th
century and continue to grow. After contributing to the work of scientific
greats like Charles Darwin, Karl Pearson, and others, it now helps budding
researchers in clinical research. Biostatistics also helps in presenting
scientific manuscripts with relatively sophisticated statistical analyses of
complex sets of medical data in renowned scientific journals.

Clinical research uses biostatistical methods to provide a formal accounting for
sources of variability in patients’ responses to treatment. It also allows
researchers to draw reasonable and precise inferences from the gathered
information and to make sound decisions in times of uncertainty. It enables
the collection, analysis, presentation and interpretation of data, and finds
application in various fields like:
 Epidemiology
 Clinical trials
 Population genetics
 Systems biology
Biostatistics helps clinical research right from the start: in designing,
conducting, analysing and reporting studies; minimizing biases and confounding
factors; measuring random errors; understanding the research; making suggestions
on hypothesis testing and analysis; calculating the sample size; determining the
power of the study; ensuring continuity throughout the research; assessing the
statistical significance of the results, efficacy & safety of the drug, line of treatment and
Not only in clinical trials but there are many other areas where the principles of
biostatistics can be applied:
1. PUBLIC HEALTH PROGRAMS: Biostatistics experts use statistical
techniques to assess the impact of public health programs and initiatives
being undertaken by governments, nonprofits and hospitals. They
develop robust studies and sampling techniques that help researchers
understand how well an initiative is performing and whether that model
can be replicated in other areas.
2. EPIDEMIOLOGICAL STUDIES: There are various factors that influence the
cause, outbreak and distribution of a disease. Epidemiological studies aim
to identify these factors by collecting various data related to the disease,
and deriving the link between the cause and effect. Based on the
outcome, public health policies and preventive healthcare measures are
developed. Understanding the correlation between the variables requires
the skill of a biostatistics specialist, who can extract the most meaningful
information from the data. Historically, statistical evidence has always
filled in the missing link: for example, the 1954 publication of the results
of a study led by Richard Doll and Austin Bradford Hill lent very strong
statistical support to the link between smoking and lung cancer. Data
analysts and statisticians also played a key role in understanding the Ebola
epidemic in West Africa through statistical analyses and visualization.
Survival statistics help estimate how long a terminally ill patient will live
and guide treatment accordingly. For patients affected with cancer,
these statistics (typically given as five-year relative rates) help understand
the best treatment option, chances of remission, chances of survival post-
remission and the chances of living a disease-free life after treatment.
Biostatistics experts are able to put forth numbers drawn from the careful
examination of previous studies, providing both doctor and patient with
reference data.
a huge role in systematic reviews and meta-analysis of medical data,
which can in turn be useful in developing evidence-based healthcare.
Meta-analysis is a structured study design which evaluates all the
previous medical research published on a specific topic. Drawing
conclusions from these studies requires integration of individual results
and outcomes into reliable findings that will guide further research. For
example, meta-analysis can be used to determine if an individual has a
risk factor for a disease, what kind of environmental and genetic factors
play a role in a disease, and what kind of treatment should be given to the
patient. Having a biostatistics expert study the literature and provide a
rigorous, quantitative review of it helps healthcare providers improve
their chances of delivering successful, data-driven treatment.
5. GENOME SEQUENCING DATA ANALYSIS: Genome sequencing generates
large amounts of data that can help scientists understand complex traits.
Sophisticated tools and software help sequence entire genomes quickly,
providing insights into identifying variants that cause disease. Data from
the variant genome is typically compared to the reference genome of the
individual or a control group. Biostatistics experts study variant
frequencies in populations according to specific parameters such as blood
type, ethnicity, etc. and draw conclusions as to how the genetic make-up
of an individual or subpopulation affects whether they are at risk of, or
prone to, specific conditions.
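
The integration of individual study results described under meta-analysis above can be sketched with one standard technique, fixed-effect inverse-variance pooling, in which each study is weighted by the inverse of its squared standard error. The notes do not specify a pooling method, so this choice is an assumption, and the three effect estimates and standard errors below are hypothetical.

```python
import math

def pooled_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted average of study estimates and its standard error."""
    weights = [1 / se ** 2 for se in standard_errors]  # more precise studies weigh more
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# Hypothetical effect estimates (e.g. mean differences) from three studies
study_estimates = [0.30, 0.10, 0.20]
study_standard_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = pooled_fixed_effect(study_estimates, study_standard_errors)

# Approximate 95% confidence interval using the normal critical value 1.96
ci_low = pooled - 1.96 * pooled_se
ci_high = pooled + 1.96 * pooled_se

print(round(pooled, 4), round(pooled_se, 4), round(ci_low, 4), round(ci_high, 4))
```

The pooled standard error is smaller than that of any single study, which is the sense in which combining studies yields more reliable findings. A fixed-effect model assumes all studies estimate the same underlying effect; when that is doubtful, a random-effects model is usually considered instead.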

 Pure research: Pure research, also called basic research, is
fundamental research. Its aim is not to find
solutions to current problems but to think about the wider areas of
life. Pure research often does not provide any benefit to the masses at
the time it is done, but in the future the results can be applied to gain
benefits. Scientists often make wonderful discoveries while doing this
kind of research. Some of these discoveries are purely coincidental and
unplanned. The purpose of pure research is to gain greater
knowledge of the world and to develop new theories. Some researchers do pure
research to discover new research tools, techniques, and strategies. In
basic research there is usually no time frame, and the researcher
can go on for several years.
 Applied research: Applied research is done to get answers to current
problems. It can also be done to verify previously done
research. Applied research seeks to solve practical problems; for example,
scientists are constantly searching for cures for diseases. The findings of the
research benefit the real world. In applied research, there is a
research, however, benefit the applied research in several ways. The
tools, techniques and procedures that are discovered in pure research are
used to find solutions in applied research.


1. EXPLORATORY: As the name suggests, exploratory research is
conducted to explore a group of questions. The answers and
analytics may not offer a final conclusion to the perceived problem.
It is conducted to handle new problem areas which haven’t been
explored before. This exploratory process lays the foundation for
more conclusive research and data collection.
2. DESCRIPTIVE: Descriptive research is a method which identifies the
characteristics of an observed phenomenon and collects more
information. This method is designed to depict the participants in a
very systematic and accurate manner. In simple words, descriptive
research is all about describing the phenomenon, observing it, and
drawing conclusions from it.
3. CORRELATIONAL RESEARCH: Correlational research examines the
relationship between two or more variables. Consider a
researcher studying the correlation between cancer and marriage in
women. Suppose marriage is found to be negatively correlated with
cancer. In this example, there are two variables: cancer and marital
status. A negative correlation means that women who are
married are less likely to develop cancer. However, it doesn’t mean
that marriage directly prevents cancer.
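
The idea of a negative correlation can be made concrete with Pearson's correlation coefficient, which ranges from -1 (perfect negative relationship) to +1 (perfect positive relationship). The sketch below uses hypothetical numeric data on weekly exercise and resting heart rate rather than the marriage example, and the function name pearson_r is introduced only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for five people
exercise_hours_per_week = [1, 2, 3, 4, 5]
resting_heart_rate = [80, 78, 74, 72, 66]

r = pearson_r(exercise_hours_per_week, resting_heart_rate)
print(round(r, 3))
```

The coefficient is close to -1, meaning higher values of one variable go with lower values of the other. As with the marriage example, this describes an association only and does not show that one variable causes the other.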
