Biostats and Research Methodology Unit 4 Notes

The document provides an overview of Design of Experiments (DOE), highlighting its principles, types, processes, advantages, limitations, and applications in various fields. It also discusses statistical analysis using Microsoft Excel, SPSS, and R, detailing their key features, processes, advantages, and limitations. Additionally, it covers sampling in biostatistics, including definitions, types, and the importance of standard error of the mean.

Uploaded by

binnusowji

Design of Experiments (DOE) – 10 Marks Answer

Introduction

Design of Experiments (DOE) is a systematic method used to determine the relationship
between factors (independent variables) affecting a process and the output (dependent
variable). It helps in planning experiments efficiently to obtain valid and objective
conclusions with minimum trials.

It is widely used in pharmaceutical, industrial, agricultural, and clinical research to
optimize processes, identify critical parameters, and improve quality.

Basic Principles of DOE

1. Replication – Repeating trials to estimate experimental error.

2. Randomization – Randomly assigning treatments to eliminate bias.

3. Blocking – Grouping similar experimental units to reduce variability.

4. Factorial Concept – Studying multiple factors simultaneously.

Types of DOE Designs

| Type | Description | Use |
|---|---|---|
| Full Factorial | All possible combinations of factors and levels | Detailed analysis |
| Fractional Factorial | A subset of full factorial | Resource-saving |
| Randomized Block Design | Blocks based on known variability | Reduce confounding |
| Central Composite Design (CCD) | Includes center and axial points | RSM & optimization |
| Plackett-Burman | Screening design | Identify important factors |
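
To make the difference between a full and a fractional factorial concrete, here is a minimal Python sketch. The three factors A, B, C, the coded levels -1/+1, and the defining relation I = ABC are assumptions chosen for illustration; they do not come from these notes.

```python
from itertools import product

# Hypothetical factor names; -1 and +1 are coded low/high levels.
factors = ["A", "B", "C"]

# Full factorial: every combination of levels, 2**3 = 8 runs.
full_factorial = [dict(zip(factors, levels))
                  for levels in product([-1, 1], repeat=len(factors))]

# Half fraction: keep only runs where A*B*C == +1 (defining relation I = ABC).
half_fraction = [run for run in full_factorial
                 if run["A"] * run["B"] * run["C"] == 1]

print(len(full_factorial), "runs in the full factorial")
print(len(half_fraction), "runs in the half fraction")
for run in half_fraction:
    print(run)
```

The half fraction needs only 4 of the 8 runs. The price is aliasing: with I = ABC, each main effect is confounded with a two-factor interaction.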

Steps in DOE Process

1. Define the Objective – What is to be optimized or studied?

2. Select Factors and Levels – Choose independent variables and their values.

3. Choose Design – Select the type of experiment (e.g., factorial, CCD).

4. Randomize and Replicate – To reduce bias and estimate error.

5. Conduct the Experiment – Perform trials as per the design.

6. Analyze Results – Use ANOVA, regression, or software tools.

7. Draw Conclusions – Identify significant factors and interactions.

8. Optimize and Validate – Fine-tune process and confirm with repeat runs.
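
Steps 2 to 7 can be sketched for a small 2x2 factorial with two replicates. This is only an illustration: the factor names (temperature, time), the response values, and the random seed are made up, and a real analysis would normally add ANOVA to judge significance.

```python
import random
from itertools import product
from statistics import mean

# Steps 2-3: two hypothetical factors at coded levels, 2x2 full factorial.
design = list(product([-1, 1], repeat=2))      # (temperature, time)

# Step 4: two replicates of each run, executed in a random order.
runs = design * 2
rng = random.Random(42)                        # fixed seed so the order is reproducible
rng.shuffle(runs)
print("run order:", runs)

# Step 5: made-up responses (e.g. % yield) per treatment combination.
responses = {
    (-1, -1): [60.0, 62.0],
    (1, -1): [70.0, 72.0],
    (-1, 1): [65.0, 67.0],
    (1, 1): [85.0, 87.0],
}

def effect(contrast):
    """Mean response where the contrast is +1 minus mean response where it is -1."""
    high = [y for run, ys in responses.items() for y in ys if contrast(run) == 1]
    low = [y for run, ys in responses.items() for y in ys if contrast(run) == -1]
    return mean(high) - mean(low)

# Steps 6-7: main effects and the two-factor interaction.
temp_effect = effect(lambda r: r[0])
time_effect = effect(lambda r: r[1])
interaction = effect(lambda r: r[0] * r[1])

print("temperature effect:", temp_effect)   # 15.0
print("time effect:", time_effect)          # 10.0
print("interaction:", interaction)          # 5.0
```

With these invented numbers, temperature has the largest effect, and the non-zero interaction means the effect of temperature depends on the level of time.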

Advantages

Reduces number of trials, time, and cost.

Studies interaction effects among variables.

Improves process quality and robustness.

Aids in identifying key influencing factors.

Suitable for multivariate optimization.

Limitations

Requires statistical knowledge and planning.

Complex designs need software (e.g., Design Expert, JMP).

Risk of misleading results if design is poorly chosen.

Difficult to execute for very large factor combinations.

Applications
Industrial:

Process optimization in manufacturing.

Quality improvement (Six Sigma).

Product formulation and stability testing.

Clinical/Pharmaceutical:

Drug dosage optimization.

Bioavailability & bioequivalence studies.

Stability and shelf-life analysis.

Conclusion

DOE is a powerful statistical tool for experimental planning and optimization. It enables
researchers to study the effect of multiple variables simultaneously and helps in making
data-driven decisions for quality and performance enhancement.

✍️ Statistical Analysis Using Microsoft Excel – 10 Marks Answer


🧾 Introduction

Microsoft Excel is a spreadsheet application widely used for basic to intermediate statistical
analysis. With built-in functions and tools like the Data Analysis ToolPak, Excel enables users to
perform descriptive statistics, correlation, regression, ANOVA, and graphical representation. It is
popular due to its accessibility, user-friendliness, and compatibility with various file formats.
📘 Key Statistical Functions in Excel

1. Descriptive Statistics:
o AVERAGE() – Mean
o MEDIAN() – Middle value
o MODE.SNGL() – Most frequent value
o STDEV.S() – Standard deviation
o VAR.S() – Variance
2. Correlation & Regression:
o CORREL() – Correlation coefficient
o SLOPE() – Slope of linear regression
o INTERCEPT() – Y-intercept
o RSQ() – R² value
3. Data Analysis ToolPak:
An add-in that provides tools for:
o Regression analysis
o ANOVA
o t-tests
o Histograms
o Descriptive statistics
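
As a rough cross-check of what these worksheet functions compute, the sketch below reproduces them in Python using only the standard library. The data values are made up for illustration, and the comments map each line to the Excel function it is meant to mirror.

```python
from math import sqrt
from statistics import mean, median, mode, stdev, variance

# Made-up paired observations, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 4.0, 4.0, 5.0, 4.0]

# Descriptive statistics (sample-based, like STDEV.S and VAR.S).
print("mean:", mean(y))          # AVERAGE()
print("median:", median(y))      # MEDIAN()
print("mode:", mode(y))          # most frequent value
print("sd:", stdev(y))           # STDEV.S()
print("variance:", variance(y))  # VAR.S()

# Least-squares line and correlation, computed from sums of squares.
mx, my = mean(x), mean(y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)

slope = sxy / sxx                 # SLOPE()
intercept = my - slope * mx       # INTERCEPT()
r = sxy / sqrt(sxx * syy)         # CORREL()
r_squared = r ** 2                # RSQ()

print("slope:", slope, "intercept:", intercept)
print("r:", r, "r squared:", r_squared)
```

For simple linear regression, RSQ() is the square of CORREL(), which is why the last line squares r instead of fitting the model again.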

🔄 Step-by-Step Process

1. Enter data into an Excel spreadsheet.
2. Apply functions for mean, median, SD, etc.
3. Load the Data Analysis ToolPak:
File → Options → Add-ins → Manage: Excel Add-ins → Go → tick Analysis ToolPak → OK.
4. Choose the statistical test from Data → Data Analysis (e.g., regression, t-test, ANOVA).
5. Specify the input range and output location.
6. Generate and interpret output tables and graphs.

✅ Advantages

• Easy to learn and widely accessible.
• Built-in functions for quick calculations.
• Data visualization with charts (line, bar, pie).
• Imports and exports common file formats (CSV, TXT, etc.).
• Suitable for small datasets and classroom use.

❌ Limitations

• Not ideal for large or complex datasets.
• Limited to basic and some intermediate statistical tests.
• Prone to user errors during manual data entry.
• Advanced analysis requires additional add-ins or macros.
• No built-in checks of statistical model assumptions (e.g., normality, equal variances).

🏭 Applications

Industrial:

• Inventory trend analysis
• Sales data interpretation
• Quality control summaries

Clinical/Pharmaceutical:

• Patient data summaries
• Health monitoring dashboards
• Basic epidemiological analysis

📌 Conclusion

Microsoft Excel is an excellent starting tool for statistical analysis due to its simplicity and
accessibility. While limited for advanced statistical modeling, it remains highly useful for
preliminary data analysis, reporting, and visualization in academic, industrial, and clinical
settings.

✍️ Statistical Analysis Using SPSS – 10 Marks Answer

🧾 Introduction

SPSS (Statistical Package for the Social Sciences) is a powerful, user-friendly software package used for
statistical data analysis. It is especially popular in social sciences, health sciences, marketing, and
clinical research. SPSS offers both a graphical user interface (GUI) and syntax-based options for
running statistical tests such as t-tests, ANOVA, and regression, and for producing descriptive statistics.

📘 Key Statistical Capabilities

• Descriptive Statistics: Mean, median, standard deviation, frequency tables.
• Inferential Tests: t-test, ANOVA, Chi-square, correlation, regression.
• Advanced Analyses: Factor analysis, MANOVA, logistic regression.
• Graphical Output: Histograms, boxplots, scatter plots, bar charts.

🔄 Step-by-Step Process

1. Open SPSS software.
2. Enter Data:
o Use Data View to input raw data.
o Use Variable View to define variable names, labels, types, and measures.
3. Choose Statistical Test:
o Go to Analyze → choose test (e.g., Descriptive, Compare Means, Correlation).
4. Select Variables for analysis.
5. Set Parameters (e.g., confidence level, grouping variable).
6. Click OK to run analysis.
7. Interpret Output in Output Viewer (tables, charts, p-values, etc.).

✅ Advantages

• User-Friendly GUI – No programming skills required.
• Wide Range of Tests – Covers basic to advanced statistical methods.
• Formatted Output – Clean, readable tables and graphs.
• Handles Missing Data – Built-in tools for data cleaning.
• Custom Reports – Output export to Word, Excel, or PDF.

❌ Limitations

• Costly – Expensive for students and individual users.
• Limited Customization – Graphs are less customizable than in R or Python.
• Learning Curve – Complex analyses require understanding of statistical concepts.
• Not Open Source – Locked ecosystem with limited flexibility.

🏭 Applications

Industrial:

• Market research analysis
• Employee performance evaluation
• Customer satisfaction modeling

Clinical/Pharmaceutical:

• Clinical trial data analysis
• Treatment efficacy comparison
• Survey-based research and patient outcomes

📌 Conclusion

SPSS is a comprehensive and widely trusted tool for statistical analysis, especially suitable for
users without a programming background. It simplifies complex analyses through an intuitive
interface and produces professional output ideal for research, publication, and industry reporting.

✍️ Statistical Analysis Using R (Online) – 10 Marks Answer

🧾 Introduction

R is a free, open-source programming language and software environment used for statistical
computing, data visualization, and advanced analytics. It is widely used in research, industry,
bioinformatics, and epidemiology. Online platforms such as RStudio Cloud allow users to
access R without needing local installation, making statistical analysis accessible anywhere with
internet access.

📘 Key Statistical Features in R

• Descriptive Analysis: mean(), median(), sd(), summary()
• Inferential Statistics: t.test(), aov(), chisq.test(), lm(), anova()
• Data Manipulation: dplyr, tidyr, reshape2
• Data Visualization:
o Basic: plot(), hist(), boxplot()
o Advanced: ggplot2, lattice

🔄 Step-by-Step Process (Online via RStudio Cloud)

1. Access RStudio Cloud or install R & RStudio locally.
2. Import Data:
Use read.csv() or the import wizard to load datasets.
3. Run Basic Analysis:
o Use summary(data) for overview
o Use mean(data$column), sd(), etc.
4. Perform Tests:
o t.test() and chisq.test() for hypothesis tests, aov() for ANOVA, lm() for regression
5. Create Graphs:
o plot(), hist(), boxplot() for basic plots
o ggplot(data, aes(x, y)) + geom_line() for advanced plots
6. Interpret Results:
Look for p-values, confidence intervals, and model summaries.
7. Export Output:
Generate RMarkdown reports or export plots and tables.

✅ Advantages

• Free and Open Source – No licensing cost.
• Advanced Capabilities – Ideal for complex models and big data.
• Reproducible Research – Scripts can be saved, shared, and re-run.
• Huge Library Support – Thousands of packages (e.g., ggplot2, caret, survival).
• Online Access via RStudio Cloud – No installation required.

❌ Disadvantages

• Requires Programming Knowledge – Steeper learning curve than Excel/SPSS.
• Not GUI-Based by Default – Commands must be typed.
• Internet Needed for Cloud Access – Offline use requires installation.
• Error-Prone for Beginners – Mistyped code can stop execution.

🏭 Applications

Industrial:

• Predictive analytics & time-series forecasting
• Real-time dashboards for production performance
• Anomaly detection in process control

Clinical/Pharmaceutical:

• Survival analysis, epidemiological modeling
• Genomics and proteomics data analysis
• Clinical trial statistics and bioequivalence testing

📌 Conclusion

R (especially through online platforms like RStudio Cloud) is a powerful tool for statistical
analysis, ideal for advanced research and large datasets. Though it requires programming
knowledge, its flexibility, speed, and reproducibility make it a top choice in data-driven
industries and clinical studies.
UNIT 2

📘 1. Sampling (In Biostatistics & Research Methodology)


🔹 Definition:

In biostatistics, sampling is the process of selecting a subset of individuals, measurements, or
observations from a larger target population for the purpose of statistical analysis and
drawing inferences about the population.

🔹 Need for Sampling in Research:

• Entire population studies are impractical and costly
• Required when destructive testing is involved (e.g., tablet dissolution)
• Essential in clinical and epidemiological studies

🔹 Example:

In a clinical trial, sampling is used to select 200 diabetic patients from a hospital population of
10,000 for testing a new antidiabetic drug.
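
A draw like the one in this example can be sketched as a simple random sample. The patient IDs 1 to 10,000 and the seed are placeholders assumed for illustration; a real study would draw from the actual hospital register and apply its eligibility criteria first.

```python
import random

# Hypothetical sampling frame: patient IDs 1..10000 stand in for hospital records.
population = list(range(1, 10001))

rng = random.Random(2024)               # fixed seed so the draw can be reproduced
sample = rng.sample(population, k=200)  # sampling without replacement

print(len(sample), "patients selected")
print("sampling fraction:", len(sample) / len(population))
```

Because `sample` draws without replacement, no patient can be selected twice, and every patient in the frame has the same chance of selection.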

🔹 Application in Biostatistics:

• Used in hypothesis testing
• Estimation of population parameters (mean, proportion)
• Enables use of statistical tests like t-tests, ANOVA

📘 2. Essence of Sampling (In Biostatistics & Research Methodology)


🔹 Essence:

The essence of sampling lies in the idea that studying a representative part can yield
conclusions about the whole, without studying every unit. It ensures efficiency, validity, and
reliability of research findings.

🔹 Biostatistical Perspective:

• An adequately sized sample provides statistical power without examining the full population
• Enables estimation of standard errors and confidence intervals
• Keeps sampling error small when an appropriate method and sample size are used

🔹 Research Methodology View:

• Makes data collection feasible
• Maintains ethical standards in human/animal studies
• Supports randomization and blinding in experimental design

🔹 Example:

Sampling patients from a cancer registry to evaluate treatment outcomes instead of reviewing all
registered cases.

📘 3. Types of Sampling (In Biostatistics & Research Methodology)


🔹 I. Probability Sampling (used in inferential statistics)

| Type | Description | Example |
|---|---|---|
| Simple Random Sampling | Each unit has an equal chance of selection | Using a random number generator to select 50 patient files |
| Systematic Sampling | Select every kth item | Every 5th patient from the OPD register |
| Stratified Sampling | Divide into strata, then sample from each | Sampling equal numbers of males and females in a drug study |
| Cluster Sampling | Entire groups are selected | Selecting 3 hospitals and surveying all patients in them |
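
The systematic, stratified, and cluster schemes can be sketched as follows. The frame of 100 patients, the sex and hospital labels, the sample sizes, and the seed are all invented for illustration.

```python
import random

rng = random.Random(7)  # fixed seed, for reproducibility only

# Hypothetical frame: 100 patients, each with a sex and a hospital label.
patients = [
    {"id": i, "sex": "M" if i % 2 else "F", "hospital": f"H{i % 5}"}
    for i in range(1, 101)
]

# Systematic sampling: random start, then every k-th patient.
k = 5
start = rng.randrange(k)
systematic = patients[start::k]

# Stratified sampling: draw the same number from each stratum (sex).
stratified = []
for sex in ("M", "F"):
    stratum = [p for p in patients if p["sex"] == sex]
    stratified.extend(rng.sample(stratum, 10))

# Cluster sampling: pick whole hospitals at random, keep every patient in them.
hospitals = sorted({p["hospital"] for p in patients})
chosen = rng.sample(hospitals, 3)
cluster = [p for p in patients if p["hospital"] in chosen]

print(len(systematic), "patients by systematic sampling")
print(len(stratified), "patients by stratified sampling")
print(len(cluster), "patients by cluster sampling from", chosen)
```

Note the difference in where randomness enters: stratified sampling randomizes within every group, while cluster sampling randomizes which groups are taken and then keeps them whole.
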

🔹 II. Non-Probability Sampling (used in exploratory/descriptive research)

| Type | Description | Example |
|---|---|---|
| Convenience Sampling | Based on ease | Selecting students available in class |
| Judgmental Sampling | Based on researcher’s judgment | Choosing patients with specific criteria |
| Quota Sampling | Fixed proportion from groups | 30% from rural, 70% from urban areas |
| Snowball Sampling | Participants recruit others | Rare disease studies via patient networks |

🔹 Research Use:

• Probability sampling → best for generalizability
• Non-probability sampling → used in pilot or qualitative studies

📘 4. Standard Error of the Mean (In Biostatistics & Research Methodology)

🔹 Definition:

The Standard Error of the Mean (SEM) is a biostatistical measure indicating the precision
of the sample mean in estimating the true population mean.

🔹 Formula:

SEM = \frac{SD}{\sqrt{n}}

where SD is the sample standard deviation and n is the sample size.

🔹 Biostatistical Significance:

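• Lower SEM means higher precision
• Used to construct confidence intervals (CI):

CI = \bar{x} \pm z \times SEM

where \bar{x} is the sample mean and z is the critical value (1.96 for a 95% CI under the normal approximation).
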
🔹 Example:

In a clinical study with:

• Mean systolic BP = 130 mmHg
• SD = 8 mmHg
• n = 64

SEM = \frac{8}{\sqrt{64}} = \frac{8}{8} = 1 (in mmHg)
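
The same arithmetic can be checked in a few lines of Python. The 95% confidence interval uses the normal critical value 1.96, which is an assumption added here; for small samples a t critical value is the usual choice.

```python
from math import sqrt

mean_bp = 130.0   # sample mean systolic BP, mmHg
sd = 8.0          # sample standard deviation, mmHg
n = 64            # sample size

sem = sd / sqrt(n)

# 95% CI using the normal critical value 1.96 (assumed; see the note above).
z = 1.96
ci_low = mean_bp - z * sem
ci_high = mean_bp + z * sem

print(f"SEM = {sem:.2f} mmHg")                           # SEM = 1.00 mmHg
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f}) mmHg")    # 95% CI = (128.04, 131.96) mmHg
```

Because n sits under a square root, quadrupling the sample size to 256 would only halve the SEM to 0.5 mmHg.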

🔹 Research Application:

• Helps in interpreting the variability of data
• Crucial for data presentation and analysis
• Aids in comparing treatment effects in clinical research
