Uploaded by Siddharth Doshi

The document provides a detailed explanation of conducting a two-way ANOVA without replication to assess the mean crushing strength of aspirin tablets across different testing methods and batches. It includes step-by-step calculations, critical values for F-tests, and concludes that there are no significant differences among the methods or batches. Additionally, it outlines a Chi-square test of independence to evaluate the association between smoking status and blood pressure, presenting expected frequencies and the Chi-square statistic calculations.

Solution Explanation (Two‐Way ANOVA Without Replication)
Below is a step‐by‐step worked solution. We first restate the problem exactly as given, then show
how to set up and carry out a two‐factor (two‐way) ANOVA without replication to decide whether:

1. Different testing methods (Pfizer, Monsanto, Dr. Schleuniger) give the same mean crushing
strength or not.
2. Different batches of aspirin tablets (A, B, C, D, E) have the same mean crushing strength or
not.

Finally, we compare our computed F ‐values with the critical F ‐values provided to draw conclusions.

1. Problem Statement (as is)


Q.7 (a)
Tablets of Aspirin were subjected to crushing strength testing. The results obtained are as
follows (Values indicate average crushing strength in N/sq. m.):

Method/Batch   Pfizer tester   Monsanto tester   Dr. Schleuniger tester
A              90              91                94
B              95              89                92
C              92              94                87
D              94              92                93
E              95              87                91

Find out whether all the methods and/or batches are similar or not.
Given critical values:
F(0.05, 4, 8) = 3.83

F(0.05, 2, 8) = 4.45

2. Data Layout
We have 5 “rows” (Batches A, B, C, D, E ) and 3 “columns” (Testing Methods: Pfizer, Monsanto,
Dr. Schleuniger). The (single) observation in each row‐column cell is:

        Pfizer   Monsanto   Dr. Schleuniger
A       90       91         94
B       95       89         92
C       92       94         87
D       94       92         93
E       95       87         91

Printed using ChatGPT to PDF, powered by PDFCrowd HTML to PDF API. 1/24
Number of data points N = 5 × 3 = 15.

3. ANOVA Setup and Formulas


Because there is only one measurement per (Batch, Method) combination (i.e., no replication), we
use a two‐way ANOVA without replication. The usual formulas are:

1. Grand Total:

   T = Σ (all cells) X_ij.

2. Correction Factor (CF):

   CF = T² / N.

3. Total Sum of Squares (SST):

   SST = Σ (all cells) X_ij² − CF.

4. Row (Batch) Sum of Squares (SSA):

   SSA = Σ (rows i) (RowSum_i)² / (number of columns) − CF.

5. Column (Method) Sum of Squares (SSB):

   SSB = Σ (columns j) (ColSum_j)² / (number of rows) − CF.

6. Error Sum of Squares:

   SSE = SST − SSA − SSB.

7. Degrees of Freedom:

   Total: N − 1 = 15 − 1 = 14.
   Factor A (Batches): a − 1 = 5 − 1 = 4.
   Factor B (Methods): b − 1 = 3 − 1 = 2.
   Error: (a − 1)(b − 1) = 4 × 2 = 8.

8. Mean Squares:

   MSA = SSA/4,  MSB = SSB/2,  MSE = SSE/8.

9. F-ratios:

   F_A = MSA/MSE,  F_B = MSB/MSE.

10. Compare each F-ratio to its corresponding critical value F(α, df₁, df₂) to decide significance.

4. Step‐by‐Step Calculations
4.1 Row and Column Totals

Row sums (Batches):


A : 90 + 91 + 94 = 275
B : 95 + 89 + 92 = 276
C : 92 + 94 + 87 = 273
D : 94 + 92 + 93 = 279
E : 95 + 87 + 91 = 273
Column sums (Methods):
Pfizer: 90 + 95 + 92 + 94 + 95 = 466
Monsanto: 91 + 89 + 94 + 92 + 87 = 453
Dr. Schleuniger: 94 + 92 + 87 + 93 + 91 = 457
Grand total

T = 275 + 276 + 273 + 279 + 273 = 1376 (also 466 + 453 + 457 = 1376).

Number of observations: N = 15.

4.2 Correction Factor (CF)

CF = T²/N = 1376²/15 = 1,893,376/15 ≈ 126,225.07.

4.3 Total Sum of Squares (SST)


We need Σ X_ij². Square each cell and sum:

Batch A: 90² = 8100, 91² = 8281, 94² = 8836; sum = 25,217.
Batch B: 95² = 9025, 89² = 7921, 92² = 8464; sum = 25,410.
Batch C: 92² = 8464, 94² = 8836, 87² = 7569; sum = 24,869.
Batch D: 94² = 8836, 92² = 8464, 93² = 8649; sum = 25,949.
Batch E: 95² = 9025, 87² = 7569, 91² = 8281; sum = 24,875.

Summing these:

Σ X_ij² = 25,217 + 25,410 + 24,869 + 25,949 + 24,875 = 126,320.

Hence

SST = Σ X_ij² − CF = 126,320 − 126,225.07 ≈ 94.93.


4.4 Sum of Squares for Batches (SSA)

Using

SSA = Σ (rows i) (RowSum_i)² / (number of columns) − CF,

with 3 columns:

275²/3 = 25,208.33,  276²/3 = 25,392,  273²/3 = 24,843,
279²/3 = 25,947,  273²/3 = 24,843.

Summing them:

25,208.33 + 25,392 + 24,843 + 25,947 + 24,843 = 126,233.33.

Subtract CF:

SSA = 126,233.33 − 126,225.07 ≈ 8.26.

4.5 Sum of Squares for Methods (SSB)

Using

SSB = Σ (columns j) (ColSum_j)² / (number of rows) − CF,

with 5 rows:

Pfizer: 466²/5 = 43,431.2.
Monsanto: 453²/5 = 41,041.8.
Dr. Schleuniger: 457²/5 = 41,769.8.

Sum = 43,431.2 + 41,041.8 + 41,769.8 = 126,242.8.

Subtract CF:

SSB = 126,242.8 − 126,225.07 ≈ 17.73.

4.6 Error Sum of Squares (SSE)

SSE = SST − SSA − SSB = 94.93 − 8.26 − 17.73 ≈ 68.94.

4.7 Degrees of Freedom and Mean Squares

df_A = 5 − 1 = 4.
df_B = 3 − 1 = 2.
df_E = (5 − 1) × (3 − 1) = 8.
df_Total = 15 − 1 = 14.

Mean squares:

MSA = 8.26/4 ≈ 2.07,  MSB = 17.73/2 ≈ 8.87,  MSE = 68.94/8 ≈ 8.62.

4.8 F ‐Ratios

F_A = MSA/MSE = 2.07/8.62 ≈ 0.24,  F_B = MSB/MSE = 8.87/8.62 ≈ 1.03.

4.9 Compare with F ‐Critical Values

From the problem statement:

F(0.05, 4, 8) = 3.83.

F(0.05, 2, 8) = 4.45.

We have:

F_A = 0.24, which is much less than 3.83 ⇒ not significant for Batches.
F_B = 1.03, which is less than 4.45 ⇒ not significant for Methods.

5. Conclusion
Since both computed F ‐values are below their respective critical values, we fail to reject both null
hypotheses at the 5 % level. In other words:

1. There is no statistically significant difference among the five batches (A–E) in terms of
average crushing strength.
2. There is no statistically significant difference among the three test methods (Pfizer,
Monsanto, Dr. Schleuniger) in terms of average crushing strength.

Hence, all methods appear similar and all batches appear similar with respect to their mean
crushing strength at the 5 % level of significance.
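As a numerical check, the entire ANOVA above can be reproduced in a few lines of plain Python (a minimal sketch using only built-in language features, no statistics library):

```python
# Two-way ANOVA without replication for the crushing-strength data.
data = [  # rows = batches A..E, columns = Pfizer, Monsanto, Dr. Schleuniger
    [90, 91, 94],
    [95, 89, 92],
    [92, 94, 87],
    [94, 92, 93],
    [95, 87, 91],
]
a, b = len(data), len(data[0])          # 5 batches, 3 methods
N = a * b
T = sum(sum(row) for row in data)       # grand total = 1376
CF = T ** 2 / N                         # correction factor
SST = sum(x ** 2 for row in data for x in row) - CF
SSA = sum(sum(row) ** 2 for row in data) / b - CF          # batches
SSB = sum(sum(col) ** 2 for col in zip(*data)) / a - CF    # methods
SSE = SST - SSA - SSB
MSA, MSB, MSE = SSA / (a - 1), SSB / (b - 1), SSE / ((a - 1) * (b - 1))
FA, FB = MSA / MSE, MSB / MSE
print(round(FA, 2), round(FB, 2))       # 0.24 1.03
print(FA < 3.83, FB < 4.45)             # True True -> fail to reject both H0
```

Both F-ratios come out below their critical values, matching the hand computation.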

Solution Using the Chi‐Square Test of Independence
Below is an in‐depth, step‐by‐step solution showing how to determine whether smoking status and
blood‐pressure status are statistically associated. The data given form a 2 × 3 contingency table:

Blood Pressure   Non-Smokers   Moderate Smokers   Chain Smokers   Row Totals
High BP          90            108                120             318
No High BP       60            42                 30              132
Column Totals    150           150                150             450

We want to test:

H0: Smoking status and blood-pressure status are independent (not associated).
H1: They are dependent (associated).


1. Compute Row Totals, Column Totals, and Grand Total


From the table:

Row totals:
High BP: 90 + 108 + 120 = 318
No High BP: 60 + 42 + 30 = 132
Column totals:
Non‐Smokers: 90 + 60 = 150
Moderate Smokers: 108 + 42 = 150
Chain Smokers: 120 + 30 = 150
Grand total: 450

2. Expected Frequencies Under H0

If the two classifications (Smoking, Blood Pressure) are independent, the expected count in each
cell is given by:

E_ij = (Row total_i × Column total_j) / Grand total.

2.1 High BP Row (Row total = 318)

High BP & Non-Smokers:      E = (318 × 150)/450 = 106.
High BP & Moderate Smokers: E = (318 × 150)/450 = 106.
High BP & Chain Smokers:    E = (318 × 150)/450 = 106.

2.2 No High BP Row (Row total = 132)

No High BP & Non-Smokers:      E = (132 × 150)/450 = 44.
No High BP & Moderate Smokers: E = (132 × 150)/450 = 44.
No High BP & Chain Smokers:    E = (132 × 150)/450 = 44.

Hence the expected-frequency table (under H0) is:

             Non-Smokers   Moderate   Chain
High BP      106           106        106
No High BP   44            44         44

3. Compute the Chi-Square Statistic

The formula for the Chi-square test statistic is:

χ² = Σ (all cells) (O − E)²/E,

where O is the observed frequency and E is the expected frequency.

We label cells as follows (observed frequencies O_ij):

                  Non-Smokers (A1)   Moderate (A2)   Chain (A3)
High BP (B1)      90                 108             120
No High BP (B2)   60                 42              30

Their corresponding expected values E_ij are:

             Non-Smokers   Moderate   Chain
High BP      106           106        106
No High BP   44            44         44

3.1 Individual (O − E)²/E Terms

1. (High BP, Non-Smokers): O = 90, E = 106, O − E = −16, (O − E)² = 256, 256/106 ≈ 2.415.
2. (High BP, Moderate Smokers): O = 108, E = 106, O − E = 2, (O − E)² = 4, 4/106 ≈ 0.0377.
3. (High BP, Chain Smokers): O = 120, E = 106, O − E = 14, (O − E)² = 196, 196/106 ≈ 1.849.
4. (No High BP, Non-Smokers): O = 60, E = 44, O − E = 16, (O − E)² = 256, 256/44 ≈ 5.818.
5. (No High BP, Moderate Smokers): O = 42, E = 44, O − E = −2, (O − E)² = 4, 4/44 ≈ 0.0909.
6. (No High BP, Chain Smokers): O = 30, E = 44, O − E = −14, (O − E)² = 196, 196/44 ≈ 4.4545.

3.2 Summation

χ2 = 2.415 + 0.038 + 1.849 + 5.818 + 0.091 + 4.455 ≈ 14.67.

4. Degrees of Freedom and Critical Value


For a contingency table with r rows and c columns,

d.f. = (r − 1) × (c − 1).

Here, r = 2 and c = 3, so

d.f. = (2 − 1) × (3 − 1) = 1 × 2 = 2.

From the problem statement, the critical χ2 value for α = 0.05 and 2 d.f. is:

χ2(2, 0.05) = 5.99.


5. Decision and Conclusion


Our computed χ2 ≈ 14.67.
The critical value at 2 d.f. (5 % level) is 5.99.
Since 14.67 > 5.99, we reject the null hypothesis of independence.

Conclusion: Smoking and Blood Pressure are associated.

In practical terms, the data suggest that the proportion of individuals with high blood pressure
differs significantly across the three smoking categories, indicating a statistical relationship between
smoking status and blood‐pressure status.
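As a check on the hand computation, the same test of independence can be sketched in plain Python (no external libraries):

```python
# Chi-square test of independence for the 2x3 smoking / blood-pressure table.
observed = [
    [90, 108, 120],   # High BP
    [60,  42,  30],   # No High BP
]
row_totals = [sum(row) for row in observed]          # [318, 132]
col_totals = [sum(col) for col in zip(*observed)]    # [150, 150, 150]
grand = sum(row_totals)                              # 450

# Expected counts under independence: E_ij = row_i * col_j / grand.
expected = [[r * c / grand for c in col_totals] for r in row_totals]

chi2 = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2) for j in range(3)
)
df = (2 - 1) * (3 - 1)           # 2 degrees of freedom
print(round(chi2, 2), df)        # 14.67 2
print(chi2 > 5.99)               # True -> reject independence
```

The computed statistic agrees with the 14.67 obtained by hand, so the conclusion (reject H0) stands.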



Below is a step‐by‐step, in‐depth solution showing how to perform a paired t‐test on the “before”
and “after” blood‐sugar levels for 10 patients. We then conclude whether the new drug has a
significant hypoglycemic (blood‐sugar‐lowering) effect.

1. Data

The table below shows blood‐sugar levels (BSL) for 10 patients before and after the treatment:

Patient   Before   After   Difference (d_i = Before − After)
1         148      132     16
2         152      136     16
3         149      129     20
4         143      130     13
5         152      144     8
6         136      134     2
7         134      122     12
8         139      124     15
9         150      127     23
10        147      129     18

Number of patients n = 10.

2. Paired t‐Test Setup


We test the hypotheses:

H0: the true mean difference μ_d = 0 (i.e., the treatment has no effect on blood sugar).
H1: μ_d ≠ 0 (two-tailed), or specifically μ_d > 0 if we assume the drug lowers BSL.

Typically, if we specifically want to show a reduction in BSL, we use a one-tailed test.
However, the standard form is often two-tailed, which still shows significance if the mean
difference is large in either direction.

2.1 Differences d_i

We define d_i = (Before)_i − (After)_i. If the drug lowers blood sugar, then the d_i should be positive on average.

From the table, the differences d_i are:

16, 16, 20, 13, 8, 2, 12, 15, 23, 18.

2.2 Sample Mean of Differences d̄

Σ d_i = 16 + 16 + 20 + 13 + 8 + 2 + 12 + 15 + 23 + 18 = 143,   d̄ = 143/10 = 14.3.

2.3 Sample Standard Deviation of Differences s_d

We compute the deviations (d_i − d̄) and square them:

1. 16 − 14.3 = 1.7 → (1.7)² = 2.89
2. 16 − 14.3 = 1.7 → (1.7)² = 2.89
3. 20 − 14.3 = 5.7 → (5.7)² = 32.49
4. 13 − 14.3 = −1.3 → (−1.3)² = 1.69
5. 8 − 14.3 = −6.3 → (−6.3)² = 39.69
6. 2 − 14.3 = −12.3 → (−12.3)² = 151.29
7. 12 − 14.3 = −2.3 → (−2.3)² = 5.29
8. 15 − 14.3 = 0.7 → (0.7)² = 0.49
9. 23 − 14.3 = 8.7 → (8.7)² = 75.69
10. 18 − 14.3 = 3.7 → (3.7)² = 13.69

Sum of squares:

2.89 + 2.89 + 32.49 + 1.69 + 39.69 + 151.29 + 5.29 + 0.49 + 75.69 + 13.69 = 326.10.

Then the sample variance of the differences is:

s_d² = 326.10/(n − 1) = 326.10/9 ≈ 36.23.

Hence,

s_d = √36.23 ≈ 6.02.

2.4 Test Statistic t

The paired t-test statistic is:

t = (d̄ − 0)/(s_d/√n) = 14.3/(6.02/√10) = 14.3/(6.02/3.1623) ≈ 14.3/1.90 ≈ 7.53.

The degrees of freedom for a paired t-test with n = 10 pairs is n − 1 = 9.

3. Compare with Critical t

The problem statement gives a few critical t values, including

t(0.05, 9) = 2.26 (two-tailed at α = 0.05, df = 9).

Our computed test statistic is t_calc ≈ 7.53, which far exceeds 2.26. We therefore reject H0 and conclude that the drug produces a statistically significant reduction in blood-sugar level.



Solution Using a Paired t‐Test

Below is a step‐by‐step worked solution. First, we restate the problem exactly as given, then show
how to set up and carry out a paired t‐test to decide whether the new drug has a significant
hypoglycemic (blood‐sugar‐lowering) effect.

1. Problem Statement (As Is)


Q.3 (a)
Ten patients were evaluated for hypoglycemic effect of a new drug. Data of blood sugar
level are as follows; perform a t-test and suggest whether the treatment is effective or not.
(t(0.05, 9) = 2.26, t(0.05, 19) = 2.09, t(0.05, 20) = 2.08)

Patient              1    2    3    4    5    6    7    8    9    10
BSL Before (mg/dl)   148  152  149  143  152  136  134  139  144  147
BSL After (mg/dl)    132  136  129  143  144  134  122  124  127  132

We want to determine if the average blood sugar is significantly lower after treatment (i.e., if the
drug is effective).

2. Why a Paired t‐Test?

Each patient is measured before and after receiving the same new drug, so the observations come
in pairs for the same subject. This calls for a paired t-test (also known as the dependent-samples t-test).

Null and Alternative Hypotheses

1. H0: The drug does not change mean blood-sugar level (no difference).
2. H1: The drug reduces the mean blood-sugar level (i.e., the before–after difference is positive and significant).

3. Differences d_i = (Before) − (After)

Let each patient's difference be d_i = X_Before,i − X_After,i. We have n = 10 patients:

d1 = 148 − 132 = 16,
d2 = 152 − 136 = 16,
d3 = 149 − 129 = 20,
d4 = 143 − 143 = 0,
d5 = 152 − 144 = 8,
d6 = 136 − 134 = 2,
d7 = 134 − 122 = 12,
d8 = 139 − 124 = 15,
d9 = 144 − 127 = 17,
d10 = 147 − 132 = 15.


3.1 Mean of the Differences, d̄

Σ d_i = 16 + 16 + 20 + 0 + 8 + 2 + 12 + 15 + 17 + 15 = 121,   d̄ = 121/10 = 12.1.

3.2 Standard Deviation of the Differences, s_d

We first compute each (d_i − d̄)², then sum and divide by (n − 1):

d1 = 16:   16 − 12.1 = 3.9,    (3.9)² = 15.21
d2 = 16:   16 − 12.1 = 3.9,    (3.9)² = 15.21
d3 = 20:   20 − 12.1 = 7.9,    (7.9)² = 62.41
d4 = 0:    0 − 12.1 = −12.1,   (−12.1)² = 146.41
d5 = 8:    8 − 12.1 = −4.1,    (−4.1)² = 16.81
d6 = 2:    2 − 12.1 = −10.1,   (−10.1)² = 102.01
d7 = 12:   12 − 12.1 = −0.1,   (−0.1)² = 0.01
d8 = 15:   15 − 12.1 = 2.9,    (2.9)² = 8.41
d9 = 17:   17 − 12.1 = 4.9,    (4.9)² = 24.01
d10 = 15:  15 − 12.1 = 2.9,    (2.9)² = 8.41

Sum of squares:

15.21 + 15.21 + 62.41 + 146.41 + 16.81 + 102.01 + 0.01 + 8.41 + 24.01 + 8.41 = 398.9.

Hence the sample variance:

s_d² = 398.9/(n − 1) = 398.9/9 ≈ 44.322,

s_d = √44.322 ≈ 6.656.

3.3 Standard Error of d̄

SE(d̄) = s_d/√n = 6.656/√10 ≈ 6.656/3.1623 ≈ 2.106.

4. Paired t-Statistic

The test statistic for a paired t-test is:

t = d̄/(s_d/√n) = d̄/SE(d̄).

Substituting our values:

t = 12.1/2.106 ≈ 5.74.

4.1 Degrees of Freedom and Critical t

For a paired test with n = 10 pairs, df = n − 1 = 9.
From the problem statement, t(0.05, 9) = 2.26.

5. Decision and Conclusion


Our computed t ≈ 5.74.
The critical t at 5 % level with 9 d.f. is 2.26.
Since 5.74 > 2.26, we reject the null hypothesis and conclude that the drug does lower
blood sugar significantly.

Yes, the treatment is effective in reducing blood sugar (significant at the 5% level).
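A quick way to verify the arithmetic is to run the same paired t-test in plain Python (a minimal sketch; carrying full precision gives t ≈ 5.75 rather than the hand-rounded 5.74, with the same conclusion):

```python
import math

# Paired t-test on the before/after blood-sugar levels (Q.3a data).
before = [148, 152, 149, 143, 152, 136, 134, 139, 144, 147]
after  = [132, 136, 129, 143, 144, 134, 122, 124, 127, 132]

d = [b - a for b, a in zip(before, after)]    # per-patient differences
n = len(d)
d_bar = sum(d) / n                            # mean difference = 12.1
ss = sum((x - d_bar) ** 2 for x in d)         # sum of squared deviations = 398.9
s_d = math.sqrt(ss / (n - 1))                 # sample SD of the differences
t = d_bar / (s_d / math.sqrt(n))              # paired t-statistic

print(round(d_bar, 1), round(t, 2))           # 12.1 5.75 (hand value: 5.74)
print(abs(t) > 2.26)                          # True -> treatment is effective
```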



Below is a step‐by‐step illustration of how to construct a simple bar chart (column chart) to
compare the percentage results for each subject code. We also show a quick numerical summary.

1. Given Data (as is)


Subject code: 848801, 848802, 848803, 848804, 848805, 848806
Result (%): 90, 80, 85, 85, 80, 75

A convenient way to arrange them:

Subject Code 848801 848802 848803 848804 848805 848806

Result (%) 90 80 85 85 80 75

2. Constructing the Bar (Column) Chart
1. Label the axes:
Horizontal (x‐axis): Subject Code (categorical labels: 848801, 848802, etc.).
Vertical (y‐axis): Result (%) from 0 to (at least) 100.
2. Draw bars:
For each subject code, draw one vertical bar whose height corresponds to its percentage
result.
For example, subject 848801’s bar goes up to 90 %. Subject 848802’s bar goes up to 80 %,
and so on.
3. Add a title: e.g., “Results by Subject Code”.

That’s it! You get a quick visual comparison of all six subjects’ percentages.

3. Optional Numerical Summary


Sometimes, you might want to compute basic statistics:

Sum of all percentages:

90 + 80 + 85 + 85 + 80 + 75 = 495.

Number of subjects: n = 6.
Mean (average):

x̄ = 495/6 = 82.5%.

Range: from a minimum of 75 % (subject 848806) to a maximum of 90 % (subject 848801).

These values can be used for additional interpretation, but typically a bar chart alone is sufficient to
visualize and compare the results for each subject code.

Final Note

A bar (column) chart is the most straightforward choice here because we have categorical
labels (subject codes) on the x‐axis and quantitative values (percentages) on the y‐axis.
One would not normally use a pie chart, because these data do not represent parts of a single
whole (they do not sum to 100 in a way that represents a single “pie”).
A line chart is sometimes used if you want to show a trend over a sequential variable (e.g.,
time), but for distinct subject codes, a bar chart is typically clearer.
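For completeness, here is a minimal plain-Python sketch that prints a text version of the column chart together with the summary statistics (a plotting library such as matplotlib could of course draw the real chart; this version keeps the example dependency-free):

```python
# Text rendering of the column chart: one bar per subject code,
# bar length proportional to the result (1 character per 2%).
results = {
    "848801": 90, "848802": 80, "848803": 85,
    "848804": 85, "848805": 80, "848806": 75,
}

print("Results by Subject Code")
for code, pct in results.items():
    print(f"{code} | {'#' * (pct // 2)} {pct}%")

mean = sum(results.values()) / len(results)
print(f"mean = {mean}%")      # mean = 82.5%
```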



Below is a step‐by‐step outline for testing whether there is a statistically significant difference in
viral load between two independent groups of HIV patients: those receiving Standard Therapy vs
those receiving a New Anti‐Retroviral Therapy. The problem statement suggests a nonparametric
approach (the Mann–Whitney or Wilcoxon rank‐sum test) because:

1. We have two independent samples (Standard vs New).
2. The question mentions a critical value of 64, which matches the Mann–Whitney U tables for sample sizes n1 = 15 and n2 = 15 at α = 0.05.

   If the data were known to be approximately normal, one could do a two-sample t-test.
   However, a critical t-value would be around 2 (not 64!), so the mention of "64" clearly
   points to the rank-based approach.

1. Restate the Problem (as is)


Q.1(a)
A clinical trial is run to assess the effectiveness of a new anti‐retroviral therapy for
patients with HIV. Patients are randomized to receive a standard therapy (usual care) or
the new anti‐retroviral therapy, and their viral load (HIV copies per ml of blood) is
measured after 3 months. A total of 30 participants (15 per group) are randomized, and
the data are shown below. The critical value can be found in the Wilcoxon rank‐sum table
for α = 0.05. The critical value is 64.

Is there statistical evidence of a difference in viral load in patients receiving the standard
therapy vs. the new anti‐retroviral therapy?

(Exact numeric data for each group would typically be listed—15 values for Standard, 15 for New—though
only a partial snapshot is shown here.)

2. Outline of the Wilcoxon Rank‐Sum (Mann–Whitney) Test


We denote the two groups as:

Group 1 (Standard Therapy), with n1 = 15 observations.
Group 2 (New Therapy), with n2 = 15 observations.


2.1 Hypotheses

Null Hypothesis (H0): the two populations (viral-load distributions) are the same. In other words, there is no difference in typical viral load between the Standard and New therapy groups.
Alternative Hypothesis (H1): the two populations differ (the distribution of viral load in one group is shifted compared to the other). Often we suspect the New therapy might yield lower viral loads, but the rank-sum test in its basic form is a two-sided test of any difference.

2.2 Significance Level and Critical Value

We use α = 0.05 (two-sided).

From standard tables for n1 = 15, n2 = 15, the critical value is 64. Note that this 64 is the critical value of the Mann–Whitney U statistic, not of a rank sum itself: with 15 observations the smallest possible rank sum is 1 + 2 + ⋯ + 15 = 120, so a rank sum can never be ≤ 64. The U statistic is obtained from a group's rank sum W by U = W − n(n + 1)/2, and we reject H0 when the smaller U is ≤ 64.

2.3 Test Statistic: From Rank Sums to U

1. Combine all n1 + n2 = 30 observations into a single list.
2. Rank them from smallest (rank 1) to largest (rank 30). If there are ties, assign the average of the tied ranks.
3. Sum the ranks for each group; call these W1 (Standard) and W2 (New). As a check, W1 + W2 must equal the sum of all ranks, 1 + 2 + ⋯ + 30 = (30 × 31)/2 = 465.
4. Convert each rank sum to a Mann–Whitney U statistic, U = W − n(n + 1)/2 (here n(n + 1)/2 = 120 for each group), and take U = min(U1, U2).

2.4 Decision Rule

If U ≤ 64, reject H0: one group's values are significantly "shifted" relative to the other. Otherwise, fail to reject H0.

3. Step-by-Step (Conceptual)

1. List the data from both groups (15 viral loads for Standard, 15 for New).
2. Combine into one list of 30 values.
3. Sort from smallest to largest.
4. Assign ranks 1 to 30 (averaging ranks for ties).
5. Sum the ranks for each group: W1 = Σ (ranks in Standard group), W2 = Σ (ranks in New group).
6. Compute U1 = W1 − 120 and U2 = W2 − 120, and let U = min(U1, U2).
7. Compare U to the critical value 64: if U ≤ 64, reject H0; otherwise, fail to reject H0.

3.1 Typical Outcome

If the New Therapy group tends to have lower viral loads, its ranks will be smaller, so W_New (and hence U_New) will be smaller. If that smaller U is indeed ≤ 64, it implies a significant difference at α = 0.05.

4. Conclusion

After you do the ranking and get the actual rank sums:

1. If the smaller U ≤ 64, then at the 5% level we have evidence that the two distributions differ. Practically, if it is the New Therapy group that has the smaller rank sum, that suggests lower viral loads under the new therapy (i.e., it is more effective).
2. Otherwise, if U > 64, we do not reject the null hypothesis and cannot claim a statistically significant difference.

Because the problem specifically asks, "Is there a difference in viral load?", the final statement would be:

If the computed U is ≤ 64, we conclude there is a statistically significant difference in viral load between the Standard and New therapies (favoring whichever group has the smaller rank sum).
If U > 64, we conclude there is no statistically significant difference at the 5% level.

Quick Notes

The question gave 30 total patients, 15 in each group, and a critical value of 64 for α = 0.05, which corresponds to the Mann–Whitney U statistic.
If someone tried a two-sample t-test, the "critical t" would be around 2, not 64, confirming that the "64" indeed refers to the rank-based (Mann–Whitney) method.
In many practical analyses, one might check data normality and choose between a parametric (two-sample t) or nonparametric (Mann–Whitney) test. Here, the question's reference to "critical value 64" makes the Mann–Whitney approach the clear choice.

Final Answer (in Words)

Performing the rank-based test step by step (ranking all 30 viral loads, summing ranks by group, converting the smaller rank sum to U = W − 120, and comparing U to 64) will determine whether there is a statistically significant difference at the 5% level.

If U ≤ 64, we reject the null hypothesis and conclude there is evidence of a difference in viral load between the standard and new therapies.
If U > 64, we fail to reject the null, meaning we do not have enough evidence to conclude a difference at α = 0.05.
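Since this excerpt omits the actual 15 + 15 viral-load values, here is a minimal Python sketch of the whole procedure using made-up illustrative numbers (the `standard` and `new` lists are placeholders, not the trial's data; substitute the real measurements before drawing any conclusion):

```python
def rank_sums(group1, group2):
    """Pool both samples, rank them (tied values get the average of
    their positions), and return each group's rank sum."""
    pooled = [(v, 0) for v in group1] + [(v, 1) for v in group2]
    pooled.sort(key=lambda pair: pair[0])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1                      # j is one past the tie run
        avg = (i + 1 + j) / 2           # average of positions i+1 .. j
        for k in range(i, j):
            ranks[k] = avg
        i = j
    sums = [0.0, 0.0]
    for (_, g), r in zip(pooled, ranks):
        sums[g] += r
    return sums

# HYPOTHETICAL viral loads (copies/ml) -- placeholders for the real data.
standard = [10400, 9800, 8800, 7200, 6900, 6500, 6100, 5800, 5500, 5200,
            4900, 4600, 4200, 3900, 3600]
new = [5400, 5100, 4800, 4400, 4100, 3800, 3500, 3200, 2900, 2600,
       2300, 2000, 1700, 1400, 1100]

w_std, w_new = rank_sums(standard, new)
assert w_std + w_new == 30 * 31 / 2     # ranks 1..30 must sum to 465
n1 = n2 = 15
U = min(w_std - n1 * (n1 + 1) / 2,      # U = W - n(n+1)/2 per group
        w_new - n2 * (n2 + 1) / 2)
print(U, "reject H0" if U <= 64 else "fail to reject H0")
```

With these placeholder numbers the new-therapy group has the smaller rank sum (W = 141, so U = 21 ≤ 64) and H0 would be rejected; the real data may of course behave differently.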



Below is the exact list of trial‐related variables, followed by a brief explanation of which are
naturally discrete (including nominal or ordinal categories, or integer counts) and which are
continuous (measured on a real‐valued scale).

1. Variables (as given)

1. Experimental vs. controls (placebo)
2. Dosage form — tablet/capsule/other
3. Bioavailability measurements (Cmax, Tmax, AUC)
4. Test drug vs. reference standard
5. Fed vs. fasted state (before/after meals)
6. Prolactin levels (ng/l)
7. Manufacturer (generic vs. brand)
8. Male vs. female subjects
9. Age (in years)
10. Smoking history (cigarettes per day)
11. “Normal” vs. geriatric population
12. Pain Score (1–10)

2. Classification: Discrete vs. Continuous


1. Experimental vs. controls (placebo)
Discrete (categorical, nominal)
2. Dosage form — tablet/capsule/other
Discrete (categorical)
3. Bioavailability measurements (Cmax, Tmax, AUC)
Continuous (real‐valued pharmacokinetic measures)
4. Test drug vs. reference standard
Discrete (categorical)
5. Fed vs. fasted state
Discrete (categorical)
6. Prolactin levels (ng/l)
Continuous (a laboratory measurement that can take any real value within a range)
7. Manufacturer (generic vs. brand)
Discrete (categorical)
8. Male vs. female subjects
Discrete (categorical)
9. Age (in years)
   Typically treated as continuous in most statistical analyses, even though it is sometimes recorded as an integer. In principle, age can be measured to decimal places (e.g., 18.4 years).
10. Smoking history (cigarettes per day)
   Discrete integer count (1, 2, 3, …). Sometimes treated as approximately continuous if the range is large enough.

11. “Normal” vs. geriatric population
    Discrete (categorical)
12. Pain Score (1–10)
    Strictly discrete (an integer on a 1–10 ordinal scale). Often called ordinal rather than purely numeric continuous.

Summary

Discrete (categorical or integer count): 1, 2, 4, 5, 7, 8, 10, 11, 12.
Continuous: 3, 6, 9.

This classification reflects the most common conventions in clinical and statistical practice.



Below is a step‐by‐step solution for finding the least‐squares line of Y on X from the given data
and then using it to predict Y at X = 95.

1. Problem Restatement (as is)


Q.5 (a)
The following are the results of five assays of different but known potency.

Drug Potency (X)   60   80   90   100   120
Assay (Y)          61   79   91   102   119

2. Computations for the Regression Line

We have n = 5 pairs (X_i, Y_i). We want a line of the form

Y = a + bX.

2.1 Summations

Σ X = 60 + 80 + 90 + 100 + 120 = 450,
Σ Y = 61 + 79 + 91 + 102 + 119 = 452,
Σ X² = 60² + 80² + 90² + 100² + 120² = 3,600 + 6,400 + 8,100 + 10,000 + 14,400 = 42,500,
Σ XY = (60 × 61) + (80 × 79) + (90 × 91) + (100 × 102) + (120 × 119)
     = 3,660 + 6,320 + 8,190 + 10,200 + 14,280 = 42,650.

2.2 Slope b

Using the standard least-squares formula:

b = [n Σ XY − (Σ X)(Σ Y)] / [n Σ X² − (Σ X)²].

Substitute n = 5:

b = (5 × 42,650 − 450 × 452) / (5 × 42,500 − 450²).

Numerator: 5 × 42,650 = 213,250 and 450 × 452 = 203,400, so the numerator = 213,250 − 203,400 = 9,850.

Denominator: 5 × 42,500 = 212,500 and 450² = 202,500, so the denominator = 212,500 − 202,500 = 10,000.

Hence

b = 9,850/10,000 = 0.985.

2.3 Intercept a

a = (Σ Y − b Σ X)/n = (452 − 0.985 × 450)/5.

We have 0.985 × 450 = 443.25, so

Σ Y − b Σ X = 452 − 443.25 = 8.75,   a = 8.75/5 = 1.75.

2.4 Final Regression Equation

Y = 1.75 + 0.985 X.

3. Estimating Y at X = 95

Plug X = 95 into the fitted line:

Y(95) = 1.75 + 0.985 × 95 = 1.75 + 93.575 = 95.325.

One might round to Y(95) ≈ 95.3.

4. Conclusion

1. Regression line: Y = 1.75 + 0.985 X.
2. Estimated assay for a drug potency of 95: Y(95) ≈ 95.3.
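As a check, the fit and the prediction can be reproduced in a few lines of plain Python:

```python
# Least-squares line of Y on X for the five assay results.
X = [60, 80, 90, 100, 120]
Y = [61, 79, 91, 102, 119]
n = len(X)

Sx, Sy = sum(X), sum(Y)                          # 450, 452
Sxx = sum(x * x for x in X)                      # 42,500
Sxy = sum(x * y for x, y in zip(X, Y))           # 42,650

b = (n * Sxy - Sx * Sy) / (n * Sxx - Sx ** 2)    # slope = 0.985
a = (Sy - b * Sx) / n                            # intercept = 1.75

predict = lambda x: a + b * x
print(round(b, 3), round(a, 2), round(predict(95), 3))   # 0.985 1.75 95.325
```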

Solution: Pearson’s Correlation for Temperature (X) and Pulse Rate (Y )

Below is a step‐by‐step computation of the sample correlation coefficient r for the 8 patients’ data:

Patient #     1     2     3     4     5     6     7     8
X_i (Temp)    98    97    102   100   99    101   99    101
Y_i (Pulse)   100   91    63    80    92    70    90    72

1. Summations

We use the standard formula for Pearson's r:

r = [n Σ X_i Y_i − (Σ X_i)(Σ Y_i)] / √{[n Σ X_i² − (Σ X_i)²] [n Σ Y_i² − (Σ Y_i)²]}.

Let n = 8.

1. Σ X and Σ Y:

   Σ X = 98 + 97 + 102 + 100 + 99 + 101 + 99 + 101 = 797,
   Σ Y = 100 + 91 + 63 + 80 + 92 + 70 + 90 + 72 = 658.

2. Σ X² and Σ Y²:

   Σ X² = 98² + 97² + 102² + 100² + 99² + 101² + 99² + 101² = 79,421,
   Σ Y² = 100² + 91² + 63² + 80² + 92² + 70² + 90² + 72² = 55,298.

3. Σ X_i Y_i:

   Σ XY = (98 × 100) + (97 × 91) + (102 × 63) + (100 × 80) + (99 × 92) + (101 × 70) + (99 × 90) + (101 × 72) = 65,413.

(Individual squares/products may be checked with a small table if desired.)

2. Plugging Into the Formula


2.1 Numerator of r

n ∑ XY − (∑ X)(∑ Y ) = 8 × 65,413 − (797) × (658).

8 × 65,413 = 523,304.
797 × 658 = 524,426.

So

Numerator = 523,304 − 524,426 = −1,122.

2.2 Denominator of r

We compute each bracket:

n Σ X² − (Σ X)² = 8 × 79,421 − 797² = 635,368 − 635,209 = 159,
n Σ Y² − (Σ Y)² = 8 × 55,298 − 658² = 442,384 − 432,964 = 9,420.

Hence,

Denominator = √(159 × 9,420) = √1,497,780 ≈ 1,224.

2.3 Final Correlation

r = −1,122/1,224 ≈ −0.917.

3. Interpretation
The computed correlation r≈ −0.92 indicates a strong negative linear relationship between
temperature (X ) and pulse rate (Y ) in this particular data set—namely, higher temperatures here
correspond to lower observed pulse rates, and vice versa.
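A short plain-Python sketch reproduces r:

```python
import math

# Pearson correlation between temperature and pulse rate (8 patients).
X = [98, 97, 102, 100, 99, 101, 99, 101]   # temperature
Y = [100, 91, 63, 80, 92, 70, 90, 72]      # pulse rate
n = len(X)

Sx, Sy = sum(X), sum(Y)
Sxx = sum(x * x for x in X)
Syy = sum(y * y for y in Y)
Sxy = sum(x * y for x, y in zip(X, Y))

num = n * Sxy - Sx * Sy                    # -1,122
den = math.sqrt((n * Sxx - Sx ** 2) * (n * Syy - Sy ** 2))
r = num / den
print(round(r, 3))                         # -0.917
```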



Below is a one-sample t-test of whether the true mean BMI of the population (from which these
10 males were drawn) is 35 kg/m². We take

H0: μ = 35   vs   H1: μ ≠ 35,

at a 5% significance level, with the given critical value t(9, 0.05) = 2.262.
1. Sample Statistics
The 10 BMIs are:

21, 23, 32, 24, 47, 22, 45, 37, 24, 35.

1. Sample size: n = 10.

2. Sample mean, x̄:

   Σ BMI = 21 + 23 + 32 + 24 + 47 + 22 + 45 + 37 + 24 + 35 = 310,   x̄ = 310/10 = 31.0.

3. Sample standard deviation, s:

   We first compute deviations from the mean, (X_i − 31), and square them:

   (21 − 31)² = 100,  (23 − 31)² = 64,  (32 − 31)² = 1,
   (24 − 31)² = 49,  (47 − 31)² = 256,  (22 − 31)² = 81,
   (45 − 31)² = 196,  (37 − 31)² = 36,  (24 − 31)² = 49,
   (35 − 31)² = 16.

   Sum of squares: 100 + 64 + 1 + 49 + 256 + 81 + 196 + 36 + 49 + 16 = 848.

   Sample variance: s² = 848/(n − 1) = 848/9 ≈ 94.22.

   Sample standard deviation: s = √94.22 ≈ 9.71.

2. Test Statistic

The one-sample t-statistic is:

t = (x̄ − μ0)/(s/√n) = (31 − 35)/(9.71/√10).

Numerator: 31 − 35 = −4.
Denominator: 9.71/√10 ≈ 9.71/3.1623 ≈ 3.07.

Thus,

t ≈ −4/3.07 ≈ −1.30.

2.1 Compare with t-Critical

At α = 0.05 (two-sided) and df = 9, the critical value is t(9, 0.05) = 2.262.
Our test statistic is −1.30, whose absolute value |−1.30| is less than 2.262.

3. Decision and Conclusion

Since |t| = 1.30 < 2.262, we fail to reject H0. There is no statistically significant evidence (at the
5% level) that the true mean BMI differs from 35 kg/m². In other words, based on this sample, we
cannot conclude that the population's mean BMI is different from 35.
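The whole test can be checked with a few lines of plain Python:

```python
import math

# One-sample t-test: is the mean BMI different from 35 kg/m^2?
bmi = [21, 23, 32, 24, 47, 22, 45, 37, 24, 35]
mu0 = 35
n = len(bmi)

mean = sum(bmi) / n                        # 31.0
ss = sum((x - mean) ** 2 for x in bmi)     # 848
s = math.sqrt(ss / (n - 1))                # about 9.71
t = (mean - mu0) / (s / math.sqrt(n))      # about -1.30

print(round(t, 2), abs(t) < 2.262)         # -1.3 True -> fail to reject H0
```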