OF
COMPUTER
SUBMITTED BY :- Ashana, 11187807, MSc Chem 4th Sem
SUBMITTED TO :- Mrs Charu Sharma
Q1. What is a t-test? When is it used, and for what purpose? Explain by
means of an example.
Ans:- A t-test is a type of inferential statistic used to determine if there is a significant
difference between the means of two groups, which may be related in certain features.
It is mostly used when the data sets, like the data set recorded as the outcome from flipping
a coin 100 times, would follow a normal distribution and may have unknown variances. A t-
test is used as a hypothesis testing tool, which allows testing of an assumption applicable to
a population.
A t-test looks at the t-statistic, the t-distribution values, and the degrees of freedom to
determine the statistical significance. To conduct a test with three or more means, one must
use an analysis of variance.
For example:- Assume that we are taking a diagonal measurement of paintings received in
an art gallery. One group of samples includes 10 paintings, while the other includes 20
paintings. The data sets, with the corresponding mean and variance values, are as follows:

No.        Set 1    Set 2
1          19.7     28.3
2          20.4     26.7
3          19.6     20.1
4          17.8     23.3
5          18.5     25.2
6          18.9     22.1
7          18.3     17.7
8          18.9     27.6
9          19.5     20.6
10         21.95    13.7
11                  23.2
12                  17.5
13                  20.6
14                  18
15                  23.9
16                  21.6
17                  24.3
18                  20.4
19                  23.9
20                  13.3
Mean       19.4     21.6
Variance   1.4      17.1

Since the number of data records is different (n1 = 10 and n2 = 20) and the variance is
also different, the t-value and degrees of freedom are computed for the above data set
using the unequal-variance formula:

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$

where $\bar{x}_1, \bar{x}_2$ are the sample means, $s_1^2, s_2^2$ the sample variances and
$n_1, n_2$ the sample sizes of the two sets.

The t-value is -2.24787. Since the minus sign can be ignored when comparing the computed
t-value with the tabulated t-value, the computed value is taken as 2.24787.
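
The figures quoted in this example can be checked with a short script. The following is a minimal sketch, assuming the unequal-variance form of the t-statistic (difference of means divided by the square root of var1/n1 + var2/n2) and sample variances with an n - 1 denominator; the function name `welch_t` is chosen here for illustration only.

```python
from math import sqrt
from statistics import mean, variance

# Diagonal measurements from the two sets in the example above.
set1 = [19.7, 20.4, 19.6, 17.8, 18.5, 18.9, 18.3, 18.9, 19.5, 21.95]
set2 = [28.3, 26.7, 20.1, 23.3, 25.2, 22.1, 17.7, 27.6, 20.6, 13.7,
        23.2, 17.5, 20.6, 18, 23.9, 21.6, 24.3, 20.4, 23.9, 13.3]


def welch_t(a, b):
    """Unequal-variance t-statistic (illustrative helper, not from the text)."""
    # statistics.variance is the sample variance (divides by n - 1).
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


t_value = welch_t(set1, set2)
print(round(mean(set1), 1), round(mean(set2), 1))          # 19.4 21.6
print(round(variance(set1), 1), round(variance(set2), 1))  # 1.4 17.1
print(round(t_value, 5))                                   # -2.24787
```

The script only reproduces the t-value; deciding significance still requires comparing its absolute value with the tabulated t-value for the chosen significance level and degrees of freedom.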
Q2. What do you mean by hypothesis? Differentiate between null and alternate
hypotheses.
Ans:- A hypothesis may be defined as a proposition, or a set of propositions, set forth as an
explanation for the occurrence of some specified group of phenomena, either asserted
merely as a provisional conjecture to guide some investigation or accepted as highly
probable in the light of established facts.
Characteristics of Hypothesis:-
• Hypothesis should be clear and precise.
• Hypothesis should be capable of being tested.
• Hypothesis should state the relationship between variables, if it happens to be a relational
hypothesis.
• Hypothesis should be limited in scope and must be specific.
• Hypothesis should be consistent with most known facts, i.e. it must be consistent with a
substantial body of established facts.
[Table: sample counts by neighbourhood, with columns A, B, C, D and Total; cell values not preserved]
We use the number of people in the sample living in neighbourhood A, 150, to estimate what
proportion of the whole 1,000,000 live in neighbourhood A. Similarly, we take 349/650 to
estimate what proportion of the 1,000,000 are white-collar workers. By the assumption of
independence under the hypothesis, we should "expect" the number of white-collar workers
in neighbourhood A to be
150 × (349∕650) ≈ 80.54
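With an observed count of 90 in that cell, its contribution to the test statistic is
$$ \frac{(O_i - E_i)^2}{E_i} = \frac{(90 - 80.54)^2}{80.54} \approx 1.11 $$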
The sum of these quantities over all of the cells is the test statistic; in this case, ≈ 24.6.
Under the null hypothesis, this sum has approximately a chi-squared distribution whose
number of degrees of freedom is
(number of rows - 1)(number of columns - 1) = (3 - 1)(4 - 1) = 6
If the test statistic is improbably large according to that chi-squared distribution, then one
rejects the null hypothesis of independence.
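
The arithmetic for this single cell can be sketched in a few lines. This is a minimal illustration using only the counts quoted in the text (150, 349, 650 and the observed 90); the helper names are hypothetical, and the full test statistic is not recomputed because it needs every cell of the table.

```python
def expected_count(row_total, column_total, grand_total):
    """Expected cell count under independence."""
    return row_total * column_total / grand_total


def cell_contribution(observed, expected):
    """One cell's term (O - E)^2 / E in the chi-squared statistic."""
    return (observed - expected) ** 2 / expected


def chi2_degrees_of_freedom(n_rows, n_cols):
    """(rows - 1) * (columns - 1) for a contingency table."""
    return (n_rows - 1) * (n_cols - 1)


# 349 white-collar workers, 150 residents of neighbourhood A, 650 sampled in total.
expected = expected_count(349, 150, 650)
print(round(expected, 2))                         # 80.54
print(round(cell_contribution(90, expected), 2))  # 1.11
print(chi2_degrees_of_freedom(3, 4))              # 6
```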
Q4. Explain the meaning of ANOVA. Describe ANOVA for one-way & two-way
classification.
Ans:- Analysis of variance (ANOVA) is a collection of statistical models and their
associated estimation procedures (such as the "variation" among and between groups) used
to analyze the differences among group means in a sample using the F-distribution.
ANOVA helps us decide whether to reject the null hypothesis in favour of the alternate
hypothesis.
One-way ANOVA
The one-way ANOVA is a hypothesis test in which only one categorical variable or single
factor is taken into consideration. With the help of F-distribution, it enables us to compare the
means of three or more samples. It compares the means between the groups you are
interested in and determines whether any of those means are statistically significantly
different from each other. Specifically, it tests the null hypothesis:
$$ H_0 : \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k $$
where $\mu$ = group mean and $k$ = number of groups. If, however, the one-way ANOVA returns
a statistically significant result, we accept the alternative hypothesis ($H_A$), which is that there
are at least two group means that are statistically significantly different from each other.
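
To make the comparison of group means concrete, the following is a minimal sketch of the one-way ANOVA F-statistic, computed as the between-group mean square divided by the within-group mean square. The three groups are hypothetical numbers chosen so that the arithmetic is easy to follow; they are not data from this assignment.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    k = len(groups)                              # number of groups
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical data: three groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value = one_way_anova_f(groups)
print(f_value)  # 3.0
```

The computed F is then compared with the critical value of the F-distribution with (k - 1, N - k) degrees of freedom; that lookup is left out of this sketch.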
Two-way ANOVA
Two-way ANOVA examines the effect of two independent factors on a dependent variable. It
also studies the inter-relationship between independent variables influencing the values of the
dependent variable, if any.
For example, consider analyzing the test scores of a class based on gender and age. Here the
test score is the dependent variable, and gender and age are the independent variables.
Two-way ANOVA can be used to find the relationship between these dependent and
independent variables.
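
One common way to write down what two-way ANOVA examines is the following model and sum-of-squares split. This is a sketch under stated assumptions: a balanced design with replication, and the symbols $\alpha_i$ (effect of the first factor), $\beta_j$ (effect of the second factor) and $(\alpha\beta)_{ij}$ (their interaction) are notation introduced here, not taken from the text above.

```latex
% Observation k in level i of factor A and level j of factor B
Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}

% Total variation split into the two main effects, their interaction, and error
SS_{\text{total}} = SS_A + SS_B + SS_{AB} + SS_{\text{error}}
```

Each of the three effects is tested with its own F-ratio: the mean square of that effect divided by the error mean square.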
Advantages of ANOVA
• It is an improved technique over the t-test and z-test.
• Suitable for multidimensional variables.
• Allows analysis of several factors at a time.
• Economical method of parametric testing.
• Can be used with three or more groups.
Disadvantages of ANOVA
• ANOVA is difficult to apply because of its strict assumptions regarding the nature of the data.
• Unlike the t-test, it gives no specific interpretation of the significance of the difference
between any two particular means.
• A post-ANOVA t-test is required for further testing.
Applications of ANOVA
• Recommendation of a fertilizer against others for the improvement of crop yield.
• ANOVA has immensely useful practical applications in business, particularly Lean-Six
Sigma/operational efficiency.
• Comparing the gas mileage of different vehicles, or the same vehicle under different fuel
types, or road types.
• Understanding the impact of temperature, pressure or chemical concentration on some
chemical reaction (power reactors, chemical plants, etc).
• Understanding the impact of different catalysts on chemical reaction rates.
• Studying whether advertisements of different kinds elicit different numbers of customer
responses.
• Understanding the performance, quality or speed of manufacturing processes based on
the number of cells or steps they’re divided into.
• Consider a data sample consisting of five positive integers. The values could be any
numbers with no known relationship between them. This data sample would, theoretically,
have five degrees of freedom.
• Four of the numbers in the sample are {3, 8, 5, and 4} and the average of the entire data
sample is revealed to be 6.
• This must mean that the fifth number has to be 10. It can be nothing else. It does not
have the freedom to vary.
• So the Degrees of Freedom for this data sample is 4.
The degrees of freedom equal the size of the data sample minus one:
Df = N - 1
where Df = degrees of freedom
N = sample size
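
The five-number example can be replayed in code: once the mean and four of the values are fixed, the last value is forced. This is a minimal sketch with hypothetical helper names.

```python
def missing_value(known_values, sample_mean, sample_size):
    """The one value that is forced once the mean and the other values are known."""
    return sample_mean * sample_size - sum(known_values)


def degrees_of_freedom(sample_size):
    """Df = N - 1 for a sample whose mean is fixed."""
    return sample_size - 1


print(missing_value([3, 8, 5, 4], 6, 5))  # 10
print(degrees_of_freedom(5))              # 4
```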
Degrees of freedom are commonly discussed in relation to various forms of hypothesis
testing in statistics, such as the chi-square test.
For chi-square tests, degrees of freedom are used to determine whether a certain null
hypothesis can be rejected based on the total number of variables and samples within the
experiment. For example, when considering students and course choice, a sample size of 30
or 40 students is likely not large enough to generate significant data. Getting the same or
similar results from a study using a sample size of 400 or 500 students is more reliable.