Research Methodology
Research
Research is a systematic and organized process of inquiry, investigation, and exploration
that is conducted to acquire new knowledge, answer questions, solve problems, or
contribute to the existing body of knowledge in a particular field or discipline. It involves
the collection, analysis, and interpretation of data and information, often using established
methodologies and methods, with the aim of generating insights, making discoveries, and
advancing understanding in various domains, such as science, academia, business, and
social sciences. Research can take on various forms, including empirical studies, theoretical
investigations, experimental work, and literature reviews, and it plays a fundamental role in
expanding human knowledge and driving progress and innovation in numerous fields.
Research and its importance
Research:
Research is a systematic and methodical process of inquiry and investigation that
seeks to expand knowledge, discover new information, and address questions or
problems.
Importance
Advancing Knowledge: Research contributes to the growth of human knowledge
by exploring new ideas, theories, and concepts. It builds on existing information
and expands our understanding of the world.
Human Welfare: Research in fields like healthcare, public health, and social
sciences directly impacts human well-being, leading to improved healthcare
practices, disease prevention, and social policy development.
Objectives of Research
The objectives of research serve as the specific goals and purposes that guide the research
process. These objectives vary depending on the type and nature of the research, but they
generally fall into several categories.
To Explore and Describe: Research often begins with the objective of exploring a
new area of study or describing a phenomenon. This type of research aims to create
a foundational understanding of a topic or issue.
To Understand and Explain: Research seeks to uncover the underlying reasons,
factors, and causes behind phenomena. The objective here is to gain insights and
provide explanations for observed patterns or occurrences.
To Predict and Forecast: Some research aims to make predictions or forecasts
based on existing data and patterns. This can be particularly important in fields like
economics, meteorology, and marketing.
To Evaluate and Assess: Research may be conducted to evaluate the effectiveness
or impact of a particular program, intervention, policy, or product. This involves
assessing outcomes and measuring success.
To Compare and Contrast: Comparative research objectives involve examining
similarities and differences between groups, variables, or scenarios. This can help in
drawing meaningful comparisons and identifying patterns.
To Test Hypotheses: Research often involves the testing of specific hypotheses,
which are statements or propositions that can be either confirmed or refuted through
data collection and analysis.
To Develop and Create: In fields like engineering and product design, research
objectives may involve developing new technologies, products, or solutions to
address specific needs or challenges.
To Contribute to Existing Knowledge: Research aims to contribute to the body of
existing knowledge by either confirming, expanding, or refuting previous findings
and theories.
To Solve Practical Problems: Applied research has the objective of addressing
practical problems, often with the goal of improving processes, solving real-world
issues, or developing innovative solutions.
To Generate Theory: The objective here is to develop new theoretical frameworks
or refine existing ones. This type of research is common in social sciences and
theoretical fields.
To Document and Record: Research can be conducted to document historical,
cultural, or natural phenomena. The goal is to create a record and preserve
information for future generations.
To Inform Decision-Making: Research often provides information that supports
informed decision-making in various fields, including government, healthcare, and
business.
Various requirements of research
Conducting research involves several requirements to ensure a systematic, rigorous, and
meaningful process. These requirements may vary depending on the specific research
project and field, but some common elements include:
Clear Research Objective: A well-defined and focused research question or
objective is essential to guide the research process and provide a clear direction for
the study.
Research Proposal or Plan: A comprehensive research proposal outlines the
research objectives, methodology, scope, and timeline. It helps in securing funding
and gaining approval.
Literature Review: A thorough review of existing literature is essential to
understand the context of the research, identify gaps, and build on prior knowledge.
Research Design and Methodology: Develop a detailed research design, including
data collection methods, data analysis techniques, and sampling strategies. The
methodology should be appropriate for the research question.
Data Collection: Collect data using relevant and valid instruments. This could
involve surveys, experiments, interviews, observations, or the analysis of existing
data.
Ethical Considerations: Ensure that the research complies with ethical standards,
including obtaining informed consent from participants, protecting their privacy,
and addressing any potential harm or conflicts of interest.
Data Management and Analysis: Organize and manage the collected data in a
structured manner. Use appropriate statistical or analytical tools to analyze the data
and draw meaningful conclusions.
Resources and Funding: Secure the necessary resources, equipment, and funding
to carry out the research effectively. This may include financial support, access to
laboratories, or data storage facilities.
Time Management: Plan and manage the research timeline to ensure that the
project stays on track and is completed within the allocated time frame.
Research Team and Collaboration: Depending on the scope of the research, a research
team with various expertise may be required. Collaboration with other researchers and
institutions can also be valuable.
Data Validation and Quality Control: Implement mechanisms to validate and
control the quality of the data collected, ensuring that it is accurate and reliable.
Data Security and Storage: Protect research data from loss or unauthorized
access by employing secure storage and backup systems.
Types of Study Designs
Research studies follow a variety of designs, each suited to different questions:
Case-Control Study: Case-control studies start with individuals who have a specific
condition or outcome (cases) and compare them to individuals without the condition
(controls). These studies are often used to investigate risk factors and causes of diseases.
Experimental or Randomized Controlled Trial (RCT): RCTs are designed to test the
effectiveness of a specific intervention or treatment. Participants are randomly assigned to
treatment and control groups to assess the impact of the intervention.
Cross-Over Study: Cross-over studies are typically used to compare two or more
treatments. Participants receive multiple treatments in a specific sequence, with washout
periods in between, allowing for within-subject comparisons.
Case Series and Case Reports: Case series involve a description of a series of cases with
a particular condition or outcome. Case reports provide detailed information about
individual cases, often highlighting rare or unique clinical findings.
Ecological Study: Ecological studies analyze population-level data to explore associations
between variables. These studies are commonly used in environmental health research.
Systematic Review and Meta-Analysis: Systematic reviews and meta-analyses are not
original research studies but involve the comprehensive review and synthesis of existing
research on a specific topic. They aim to provide a summary of the evidence and a
quantitative assessment of the data.
Longitudinal Study: Longitudinal studies involve collecting data from the same
individuals or groups over an extended period. They are useful for tracking changes and
developments over time, such as disease progression or aging.
Nested Case-Control Study: These studies are often part of larger cohort studies.
Researchers select a subset of cases and controls from the cohort and investigate specific
exposures or risk factors in more detail.
Qualitative Studies: Qualitative research methods, such as interviews and focus groups,
are used to gather in-depth information about individuals' experiences, perceptions, and
behaviors.
Cross-Sectional Time-Series Study: This study design combines elements of cross-
sectional and time-series studies, aiming to assess the impact of interventions or policies
over time within a population.
Community-Based Participatory Research (CBPR): CBPR involves collaboration
between researchers and the community, engaging community members in various stages
of the research process to address local health issues.
Elimination of errors and bias
Eliminating errors and bias is crucial in research to ensure the accuracy and validity of
findings. Here are some strategies to minimize and eliminate errors and bias in the research
process, along with explanations for each:
Randomization: Randomization involves assigning subjects or treatments randomly. It
helps reduce selection bias by ensuring that each subject has an equal chance of being in
the treatment or control group, making the groups comparable.
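For illustration, here is a minimal Python sketch of simple randomization; the participant IDs and group sizes are hypothetical:
```python
# Simple randomization sketch: shuffle hypothetical participant IDs and
# split them evenly, so each participant has an equal chance of either group.
import random

participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical IDs
random.shuffle(participants)                        # put IDs in random order

half = len(participants) // 2
treatment_group = participants[:half]
control_group = participants[half:]

print("Treatment:", treatment_group)
print("Control:  ", control_group)
```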
Blinding and Double-Blinding: Blinding involves keeping participants, researchers, or
assessors unaware of group assignments or treatment details to prevent bias. Double-
blinding extends this concept to both participants and researchers. This minimizes observer
and participant biases, as they do not know whether they are in the control or treatment
group.
Control Groups: Control groups are used to account for confounding variables. By
comparing a treatment group to a control group, researchers can isolate the effect of the
treatment and minimize the influence of other factors.
Crossover Design: Crossover studies involve exposing each participant to both the
treatment and control conditions in a random order. This design helps eliminate individual
differences and carryover effects, reducing bias in the results.
Random Sampling: Random sampling ensures that every member of a population has an
equal chance of being selected for a study. This minimizes selection bias and allows the
sample to represent the population accurately.
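A minimal sketch of simple random sampling, assuming a hypothetical sampling frame of 1,000 member IDs:
```python
# Simple random sampling: every member of the frame has an equal chance
# of selection; random.sample draws without replacement.
import random

population = list(range(1, 1001))        # hypothetical frame of IDs 1..1000
sample = random.sample(population, 50)   # draw 50 members at random
print(sorted(sample)[:10])               # peek at a few of the selected IDs
```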
Data Validation and Cleaning: Validate and clean data to identify and correct errors,
inconsistencies, and outliers. This process helps ensure the accuracy and reliability of the
dataset.
Peer Review: Subject research to peer review, where experts in the field assess the study's
design, methods, and findings. Peer review helps identify methodological flaws and
potential biases.
Replication: Replicate the study to confirm the results independently. Replication helps
ensure the reliability of findings and minimizes the impact of isolated errors or biases.
Transparency and Reproducibility: Clearly document and report all aspects of the
research, including methods, data, and statistical analysis. This transparency allows others
to reproduce the study and verify the results, reducing the potential for undisclosed bias or
errors.
Robust Experimental Design: Carefully design experiments or studies to minimize
potential biases from the outset. Consider factors like randomization, blinding, and control
groups in the research design.
Minimizing Participant Bias: If participants know which group they are in, they may
consciously or unconsciously alter their behavior or responses, affecting the study results.
Enhancing the Credibility of the Study: Blinding is a fundamental aspect of good research
practice and is often expected by peer reviewers and regulatory agencies to ensure the
validity of the study.
Biostatistics
Biostatistics is a branch of statistics that focuses on the application of statistical
methods and techniques to biological, medical, and public health-related data.
It plays a critical role in the design, analysis, and interpretation of experiments and
studies in these fields.
Biostatistics helps researchers make data-driven decisions, draw meaningful
conclusions, and generate evidence-based recommendations in areas such as
epidemiology, clinical trials, genetics, and health research.
Key aspects of Biostatistics
Study Design: Biostatisticians assist in designing research studies and experiments,
helping to determine sample sizes, data collection methods, and the overall structure
of investigations. They aim to ensure that studies are well-designed and capable of
addressing research questions effectively.
Importance of Sample Size
Sample size directly affects the quality and interpretability of a study in several ways:
Statistical Power: Statistical power is the ability of a study to detect true effects or
differences. A larger sample size increases statistical power, making it more likely to
identify significant relationships, effects, or associations. In medical research, adequate
power is crucial for detecting treatment effects or differences in health outcomes.
Precision: With a larger sample size, estimates of population parameters, such as means or
proportions, become more precise. Smaller sample sizes lead to wider confidence intervals,
which indicate greater uncertainty in the estimates.
Generalizability: Biostatistical studies often aim to draw conclusions about a larger
population based on the findings from a sample. A sufficiently large sample size enhances
the ability to generalize study results to the broader population.
Confidence in Results: A larger sample size results in more reliable and robust findings.
Researchers and healthcare professionals have greater confidence in the accuracy of results
when they are derived from larger and more representative samples.
Minimizing Type II Errors: Type II errors occur when a study fails to detect a true effect
or difference. Increasing the sample size reduces the likelihood of making Type II errors,
ensuring that real relationships or effects are not overlooked.
Hypothesis Testing: Sample size affects hypothesis testing. In studies where researchers
want to test whether there is a significant difference or association, a larger sample size
increases the chances of detecting such differences or associations if they exist.
Cost Efficiency: While larger sample sizes generally lead to more precise results, they can
be costly and resource-intensive. Determining the optimal sample size balances the need for
accuracy with budget constraints.
Ethical Considerations: In clinical trials and biomedical research involving human
subjects, larger sample sizes may reduce the burden on individual participants, as the
study's results are more likely to be clinically meaningful and applicable to the broader
patient population.
Publication and Peer Review: Journals and peer reviewers often consider the adequacy of
the sample size when evaluating the quality and validity of research studies. Inadequate
sample sizes may lead to study rejection or the need for additional data collection.
Clinical Relevance: In healthcare and clinical research, larger sample sizes are often
necessary to detect clinically meaningful effects or differences that are relevant to patient
care and medical practice.
Factors influencing sample size
The determination of an appropriate sample size in biostatistics is influenced by a range of
factors that researchers need to consider carefully. The choice of sample size depends on
the specific goals and requirements of the study.
Some key factors that influence sample size in biostatistics include:
Effect Size: The effect size refers to the magnitude of the difference or effect that
researchers aim to detect. Larger effects are typically easier to detect with smaller sample
sizes, while smaller effects require larger sample sizes for reliable detection.
Statistical Power: Statistical power is the probability of detecting a true effect or
difference when it exists. Researchers often specify a desired level of statistical power (e.g.,
80% or 90%) when calculating sample size. Higher levels of power require larger sample
sizes.
Significance Level (Alpha): The significance level (alpha) represents the acceptable risk
of Type I error (false positive), typically set at 0.05. A smaller alpha level may necessitate a
larger sample size to achieve statistical significance.
Population Variability: The variability or dispersion of data within the population affects
sample size. Highly variable data may require a larger sample size to obtain precise
estimates.
Confidence Interval Width: Researchers may specify the width of the confidence interval
they want to achieve around their estimates. Narrower intervals require larger sample sizes.
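To see how these factors interact, here is a sketch of an a-priori sample-size calculation for an independent-samples t-test, assuming the statsmodels package is available; the effect size, alpha, and power values below are illustrative assumptions:
```python
# Solve for the per-group sample size given an assumed effect size,
# significance level, and desired statistical power.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,  # assumed medium effect (Cohen's d)
    alpha=0.05,       # significance level (Type I error risk)
    power=0.80,       # desired probability of detecting a true effect
)
print(f"Required sample size per group: {n_per_group:.1f}")  # roughly 64
```
Note how shrinking the assumed effect size or tightening alpha pushes the required sample size up, exactly as described above.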
Effects of Dropout Rates
Participant dropout (attrition) also influences how large a sample must be and how
trustworthy the results are:
Biased Results: When participants drop out of a study, their reasons for leaving may be
related to the treatment, intervention, or adverse effects being studied. This can introduce
bias into the results, as those who continue to participate may differ systematically from
those who drop out.
Reduced Statistical Power: A higher dropout rate can reduce the statistical power of a
study. Statistical power is the ability of a study to detect a true effect if it exists. With
missing data, the study may become underpowered, meaning it might not have sufficient
statistical strength to detect real effects.
Generalizability: High dropout rates can affect the generalizability of the study results. If a
significant portion of the study population drops out, the findings may not accurately
represent the intended population.
Bias in Adverse Event Reporting: High dropout rates can result in underreporting of
adverse events, as individuals who experience adverse effects may be more likely to
discontinue their participation, leading to an underestimation of the potential risks
associated with a treatment.
Resource Utilization: High dropout rates can increase the cost and duration of a study, as
researchers may need to recruit additional participants to account for attrition.
Several analytic strategies help address dropout and missing data:
Per-Protocol Analysis: This analysis includes only participants who completed the study
according to the protocol. It may provide insights into the treatment's efficacy when taken
as prescribed but can be subject to bias.
Multiple Imputation: This statistical technique involves imputing missing data multiple
times to account for uncertainty, allowing researchers to provide more accurate estimates.
Sensitivity Analysis: Researchers can conduct sensitivity analyses to assess the impact of
different assumptions about missing data on the study's results.
Statistical Tests of Significance
Statistical tests of significance, often referred to simply as hypothesis tests, are fundamental
tools in statistics and data analysis. They help researchers and analysts determine whether
the observed differences or relationships in data are statistically significant, or if they could
have occurred due to random chance.
Key concepts and steps involved in conducting statistical tests of significance:
1. Formulate Hypotheses:
Null Hypothesis (H0): This is the default assumption, stating that there is no significant
effect, relationship, or difference in the population. It represents the status quo or no
change.
Alternative Hypothesis (Ha or H1): This is what you want to prove or show evidence for. It
asserts that there is a significant effect, relationship, or difference in the population.
2. Choose a Test Statistic:
The choice of test statistic depends on the type of data and the nature of the hypothesis
being tested. Common test statistics include the t-statistic, chi-squared statistic, F-statistic,
and z-statistic.
3. Set the Significance Level (α):
The significance level is the acceptable risk of a Type I error (falsely rejecting H0),
commonly set at 0.05.
4. Compute the Test Statistic:
Calculate the chosen statistic from the sample data.
5. Determine the p-Value:
Compare the test statistic to its reference distribution to obtain a p-value, or compare it
directly to the critical value at the chosen significance level.
6. Draw a Conclusion:
If you reject the null hypothesis, you conclude that there is evidence supporting the
alternative hypothesis, i.e., a significant effect or relationship.
If you fail to reject the null hypothesis, you do not have sufficient evidence to support the
alternative hypothesis.
7. Interpret Results:
Consider the practical significance of the findings in addition to statistical significance. A
small p-value does not necessarily imply that the effect is practically meaningful.
8. Report Findings:
Clearly state the results, including the test statistic, critical values, p-value, and the
decision (reject or fail to reject the null hypothesis) in your report or research paper.
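The steps above can be traced in a short sketch, assuming SciPy is available; the two samples are made up for illustration:
```python
# H0: the two group means are equal; Ha: they differ.
from scipy import stats

group_a = [5.1, 4.9, 6.2, 5.8, 5.5, 6.0, 5.3, 5.7]   # illustrative data
group_b = [4.2, 4.8, 4.5, 5.0, 4.4, 4.7, 4.1, 4.6]

alpha = 0.05                                          # significance level
t_stat, p_value = stats.ttest_ind(group_a, group_b)   # independent t-test

print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject H0: evidence of a difference in means.")
else:
    print("Fail to reject H0: insufficient evidence.")
```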
Common Statistical Tests:
t-Test: Used to compare means of two groups.
Z-Test: Similar to the t-test, but used when the population standard deviation is known.
Statistical tests of significance provide a systematic and objective way to make data-driven
decisions and draw conclusions based on evidence. Properly conducted tests are crucial for
scientific research, quality control, and decision-making across various fields, including
medicine, social sciences, business, and more.
Types of significance tests
There are various types of significance tests or hypothesis tests, each designed to assess
different types of research questions and data scenarios. Here are some common types of
significance tests:
Z-Test: The Z-test is used to compare a sample mean to a known population mean when
the population standard deviation is known. It is typically employed when dealing with a
large sample size.
t-Test: The t-test is used to compare the means of two groups (independent samples) or to
determine if a sample mean is significantly different from a hypothesized population mean
(one-sample t-test). There are two main types: independent samples t-test and paired
(dependent) samples t-test.
Chi-Square Test: The chi-square test is used to examine the association between
categorical variables. There are two primary types: the test of independence (association
between two categorical variables) and the goodness-of-fit test (whether an observed
frequency distribution matches an expected one).
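A sketch of the test of independence on a made-up 2x2 table, assuming SciPy is available:
```python
# Does exposure status appear associated with the outcome?
from scipy.stats import chi2_contingency

table = [[30, 10],   # exposed:   30 with outcome, 10 without
         [20, 40]]   # unexposed: 20 with outcome, 40 without

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, df = {dof}")
```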
F-Test: The F-test is often used in the context of ANOVA to compare variances between
multiple groups. It is used to determine if there are significant differences in variability
between groups.
Paired-Difference Test: This test is used when working with paired or dependent data
(e.g., before and after measurements). The paired t-test is a common example of this type of
test.
Mann-Whitney U Test (Wilcoxon Rank-Sum Test): This non-parametric test is used to
compare two independent groups when the assumption of normal distribution is not met. It
assesses whether there are differences in the distributions of the two groups.
Wilcoxon Signed-Rank Test: Similar to the Mann-Whitney U test, this non-parametric test
is used to compare paired (dependent) samples when the data is not normally distributed.
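Both rank-based tests can be sketched as follows, assuming SciPy; the measurements are illustrative:
```python
from scipy.stats import mannwhitneyu, wilcoxon

# Two independent groups (Mann-Whitney U / Wilcoxon rank-sum)
group_1 = [12, 15, 9, 20, 14, 11]
group_2 = [8, 7, 13, 6, 10, 9]
u_stat, p_u = mannwhitneyu(group_1, group_2, alternative="two-sided")

# Paired before/after measurements on the same subjects (signed-rank)
before = [140, 132, 138, 145, 150, 137]
after = [135, 130, 132, 140, 146, 135]
w_stat, p_w = wilcoxon(before, after)

print(f"Mann-Whitney U: U = {u_stat}, p = {p_u:.4f}")
print(f"Wilcoxon signed-rank: W = {w_stat}, p = {p_w:.4f}")
```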
Binomial Test: The binomial test is used to assess if the observed proportion of successes
in a binary outcome (success/failure) differs from a hypothesized proportion.
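A quick sketch, assuming SciPy 1.7 or later (which provides binomtest); the counts are made up:
```python
# 13 successes in 20 trials versus a hypothesized success probability of 0.5.
from scipy.stats import binomtest

result = binomtest(13, n=20, p=0.5)
print(f"p-value: {result.pvalue:.4f}")
```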
Logistic Regression: This is used when the outcome variable is binary (e.g., yes/no) and
the goal is to understand the relationship between predictor variables and the probability of
the outcome.
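A minimal logistic-regression sketch with a made-up age/disease dataset, assuming the statsmodels package is available:
```python
import numpy as np
import statsmodels.api as sm

age = np.array([25, 30, 35, 40, 45, 50, 55, 60, 65, 70])  # predictor
disease = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])        # binary outcome

X = sm.add_constant(age)                  # add an intercept term
model = sm.Logit(disease, X).fit(disp=0)  # fit quietly
print(model.params)                       # intercept and log-odds slope
print("Odds ratio per year of age:", np.exp(model.params[1]))
```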
Survival Analysis: Survival analysis tests, like the Kaplan-Meier survival analysis and log-
rank test, are used to compare survival or time-to-event data between different groups or
treatments.
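A sketch of a Kaplan-Meier fit and log-rank comparison on made-up follow-up data, assuming the lifelines package is installed:
```python
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

# Months of follow-up and event indicators (1 = event, 0 = censored)
time_a, event_a = [6, 7, 10, 15, 19, 25], [1, 0, 1, 1, 0, 1]
time_b, event_b = [1, 3, 4, 8, 9, 12], [1, 1, 1, 1, 0, 1]

kmf = KaplanMeierFitter()
kmf.fit(time_a, event_observed=event_a, label="treatment")
print(kmf.survival_function_)             # estimated survival curve

result = logrank_test(time_a, time_b,
                      event_observed_A=event_a, event_observed_B=event_b)
print(f"log-rank p-value: {result.p_value:.4f}")
```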
Parametric tests
Parametric tests in biostatistics are statistical methods used to make inferences and draw
conclusions about population parameters when certain assumptions about the data are met.
These assumptions typically include the assumption of normality and the assumption of
homogeneity of variances. When these assumptions are satisfied, parametric tests tend to be
more powerful (i.e., better at detecting true effects) than non-parametric tests. Here are
some common parametric tests used in biostatistics:
Student's t-Test:
One-Sample t-Test: Used to determine if a sample mean is significantly different from a
hypothesized population mean.
Independent Samples t-Test: Used to compare the means of two independent groups to
assess if they are significantly different.
Paired Samples t-Test: Used to compare the means of paired or dependent data (e.g., before
and after measurements).
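The one-sample and paired variants can be sketched as follows, assuming SciPy (the independent-samples form appeared earlier); values are illustrative:
```python
from scipy import stats

# One-sample: is mean systolic BP different from a reference of 120?
sample = [118, 125, 130, 121, 119, 127, 124, 122]
t1, p1 = stats.ttest_1samp(sample, popmean=120)

# Paired: before vs. after measurements on the same subjects
before = [130, 128, 135, 140, 132, 129]
after = [126, 127, 130, 136, 130, 128]
t2, p2 = stats.ttest_rel(before, after)

print(f"one-sample: t = {t1:.2f}, p = {p1:.4f}")
print(f"paired:     t = {t2:.2f}, p = {p2:.4f}")
```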
Regression
Regression analysis is a statistical technique used in data analysis to model the
relationship between a dependent variable and one or more independent variables.
The primary goal of regression analysis is to understand how changes in the
independent variables are associated with changes in the dependent variable.
It is widely used in various fields, including economics, social sciences, biology,
finance, and many others, to make predictions, understand patterns, and infer causal
relationships.
Types of Regression: There are various types of regression models, including logistic
regression (used for binary outcomes), Poisson regression (for count data), ridge regression
and lasso regression (for variable selection and regularization), and many others, tailored to
specific data and research questions.
Regression analysis is a versatile and powerful statistical technique that is widely applied to
a broad range of research and practical problems. It allows researchers and analysts to make
predictions, infer relationships, and understand the impact of independent variables on the
dependent variable.
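A minimal linear-regression sketch on made-up dose/response data, assuming the statsmodels package is available:
```python
import numpy as np
import statsmodels.api as sm

dose = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)  # independent variable
response = np.array([2.1, 4.3, 5.9, 8.2, 9.8, 12.1, 13.9, 16.2])

X = sm.add_constant(dose)          # include an intercept
model = sm.OLS(response, X).fit()
print(model.params)                # intercept and slope estimates
print(f"R-squared: {model.rsquared:.3f}")
```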
Non-parametric tests
Non-parametric tests, also known as distribution-free tests, are a class of statistical tests
used when the assumptions of parametric tests (which assume specific characteristics of
the population distribution, such as normality) are not met or when dealing with non-
continuous data. Non-parametric tests are robust to violations of distributional
assumptions and are often used in various research areas, including psychology, social
sciences, and biology. Common examples include the Mann-Whitney U test, the Wilcoxon
signed-rank test, and the Kruskal-Wallis test.
Non-parametric tests are valuable when dealing with data that may not meet the
assumptions of parametric tests or when you want to make fewer assumptions about the
population distribution.
However, they can be less powerful than their parametric counterparts when the data
does conform to the parametric assumptions.
Analysis of Variance (ANOVA)
ANOVA is a parametric method for comparing means across groups. Its key components are:
Dependent Variable (DV): The dependent variable is the outcome or response variable
you want to study or compare among groups. It is typically continuous and numerical.
Independent Variable (IV): The independent variable is the factor or category that divides
the data into groups. ANOVA is used to compare the means of the dependent variable
across these groups.
One-Way ANOVA: One-way ANOVA is used when you have one independent variable
with three or more levels or groups. It tests whether there are any significant differences
among these groups. If ANOVA indicates significant differences, post hoc tests (e.g.,
Tukey, Bonferroni) can be used to identify which groups differ.
Two-Way ANOVA:
Two-way ANOVA is used when you have two independent variables, and you want to
examine the main effects of each factor and their interaction. It assesses how each factor
independently influences the dependent variable and whether their interaction has an
additional impact.
Repeated Measures ANOVA: Repeated measures ANOVA is used when you have
repeated measurements on the same subjects, and you want to determine if there are
significant differences across time or conditions.
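A one-way ANOVA sketch across three illustrative groups, assuming SciPy:
```python
from scipy.stats import f_oneway

diet_a = [2.1, 2.5, 1.9, 2.3, 2.7]   # weight loss under diet A (made up)
diet_b = [3.0, 3.4, 2.9, 3.1, 3.6]
diet_c = [1.5, 1.8, 1.2, 1.7, 1.4]

f_stat, p_value = f_oneway(diet_a, diet_b, diet_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value says at least one group mean differs; a post hoc test
# (e.g., Tukey's HSD) would identify which pairs differ.
```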
Null Hypothesis
o The null hypothesis, often denoted as H0, is a fundamental concept in statistical
hypothesis testing.
o It represents the default or status quo assumption in a hypothesis test and is used to
assess whether there is a significant effect, relationship, or difference in the population.
o The null hypothesis plays a crucial role in the scientific method and statistical analysis
for several reasons:
Testable Statement: The null hypothesis provides a testable statement about the
population or a phenomenon. It defines a specific condition or state of affairs that can be
subjected to empirical investigation.
Basis for Comparison: The null hypothesis serves as a point of reference against which
the alternative hypothesis (H1 or Ha) is compared. The alternative hypothesis represents
what the researcher aims to demonstrate or the effect they expect to find.
Statistical Decision Rule: When conducting a hypothesis test, statistical criteria are used to
determine whether to reject or fail to reject the null hypothesis. The decision rule is based
on the null hypothesis and a pre-defined significance level (α).
Significance of Null Hypothesis
The significance of the null hypothesis lies in its central role in hypothesis testing and
scientific inquiry. Here are the key aspects that highlight the importance of the null
hypothesis:
o Control of Type I Errors: The null hypothesis plays a crucial role in controlling
Type I errors (false positives) in statistical testing. By establishing a predefined
significance level (α), researchers set a threshold for the level of evidence required
to reject the null hypothesis. This control is essential to maintain the reliability and
validity of research results.
o Replicability: The null hypothesis allows for the replication of experiments and
studies. Researchers can test whether the results are consistent with previous
findings or if new evidence supports the alternative hypothesis. This replicability is
a cornerstone of the scientific method.
o Test of Falsifiability: A critical tenet of scientific investigation is the principle of
falsifiability. The null hypothesis should be formulated in a way that makes it
falsifiable, meaning that it can be proven false through empirical evidence. This
strengthens the scientific rigor of the research process.
o Clarity in Communication: Defining the null hypothesis clearly and precisely in
research and statistical analysis helps researchers and scientists communicate the
research question and objectives to their peers, the scientific community, and other
stakeholders. Clarity is essential for effective scientific communication.
o Rigorous Research: The null hypothesis is integral to maintaining the rigor and
validity of scientific research. It ensures that conclusions are based on a systematic
and unbiased evaluation of the evidence, and it serves as a safeguard against
unwarranted claims.
Significance of the P-value
The p-value measures the strength of evidence against the null hypothesis and supports
several aspects of sound research:
o Control of Type I Errors: The p-value and the significance level (α) help control
Type I errors (false positives). By setting a specific α level (e.g., 0.05), researchers
determine the threshold for statistical significance, reducing the risk of claiming a
significant effect when none exists.
o Effect Size and Practical Significance: While the p-value assesses statistical
significance, it doesn't provide information about the size or practical importance of
an effect. Researchers should also consider effect size measures to determine the
magnitude of the observed effect in addition to its statistical significance.
o Sensitivity Analysis: The p-value can be used in sensitivity analysis, where
researchers assess how robust their conclusions are to changes in the significance
level. By adjusting α, researchers can explore the impact on the findings.
o Hypothesis Refinement: When p-values are not statistically significant, researchers
can use the findings to refine their hypotheses, explore alternative explanations, or
revise their study design.
o Research Quality: The p-value is an essential tool for evaluating the quality of
research and the validity of its conclusions. It helps researchers and decision-makers
distinguish between findings that are likely due to chance and those that represent
real effects.
Degrees of Freedom
Degrees of freedom (often abbreviated as "df" or "ν") are a fundamental concept in
statistics and are used in various statistical tests and calculations. Degrees of freedom
represent the number of values in the final calculation of a statistic that are free to vary. In
different statistical contexts, degrees of freedom have slightly different interpretations.
Some common uses of degrees of freedom include:
t-Distribution: In the context of the t-distribution and t-tests, degrees of freedom refer to
the number of independent observations used to calculate the sample mean. In a one-
sample t-test, there are n - 1 degrees of freedom, where "n" is the sample size. For a two-
sample independent t-test, the degrees of freedom depend on the sample sizes of both
groups.
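For example, the critical value for a one-sample test with n = 10, i.e., df = n - 1 = 9, can be sketched with SciPy:
```python
# Two-sided critical t value at alpha = 0.05 with 9 degrees of freedom.
from scipy.stats import t

n = 10
df = n - 1
critical = t.ppf(1 - 0.05 / 2, df)   # upper 2.5% point of t(9)
print(f"df = {df}, critical t = {critical:.3f}")   # about 2.262
```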
Regression Analysis: In linear regression, degrees of freedom are used in hypothesis tests
for the overall model (total degrees of freedom) and individual model parameters. The total
degrees of freedom are typically (n - 1), where "n" is the sample size. The degrees of
freedom for the residuals (error terms) are equal to (n - p - 1), where "p" is the number of
predictor variables.
Sample Variance: Degrees of freedom are used in the calculation of the sample variance.
The formula for the sample variance divides by (n - 1), where "n" is the number of data
points. This adjustment (using n - 1 instead of n) is known as Bessel's correction and
corrects for bias in the estimation of population variance.
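Bessel's correction in practice, using NumPy's ddof (delta degrees of freedom) argument; the data are illustrative:
```python
import numpy as np

data = np.array([4.0, 7.0, 6.0, 5.0, 8.0])
biased = np.var(data)             # divides by n (ddof=0)
unbiased = np.var(data, ddof=1)   # divides by n - 1 (Bessel's correction)
print(f"divide by n:     {biased:.3f}")    # 2.000
print(f"divide by n - 1: {unbiased:.3f}")  # 2.500
```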
Non-Parametric Tests: In some non-parametric tests, such as the Wilcoxon signed-rank
test and Mann-Whitney U test, degrees of freedom are used to determine the critical values
for the test statistic.