
JHELUM COLLEGE OF NURSING AND HEALTH SCIENCES, JHELUM

BIOSTATISTICS

IMPORTANT QUESTIONS

CHAPTER: 1 INTRODUCTION
MEANING OF STATISTICS in plural sense:

It refers to numerical facts in any field of study. These facts are collected in a systematic manner with a
definite purpose in view. We also use the word data to refer to statistics in this sense.

MEANING OF STATISTICS in singular sense:

It refers to the science comprising methods which are used in the collection, presentation, analysis and
interpretation of numerical data.

MEANING OF STATISTICS in technical sense:

It is plural of statistic. By statistic, we mean a quantity calculated from few observations taken on
sample basis. For example, if we select at random ten students from a class of fifty students, measure
their height and find the average height; this average is a statistic.

POPULATION:

It refers to a collection of individuals or objects having some common characteristics.

BIOSTATISTICS:

When statistics is applied in biology including human biology, medicine and public health, it is known as
biostatistics. Francis Galton (1822-1911) has been called the father of Biostatistics.

QUESTIONS FROM PAST PAPERS:


1. What do you understand by the term variable?
A measurable quantity which can vary from one individual or object to another is called a
variable. Examples of variables are heights and weights of individuals, number of children in a
family etc. A variable is usually denoted by one of the last letters of the alphabet: X, Y or Z.

2. Define data.
A set of observations, like heights of students, temperatures of patients, blood pressures of patients
and number of paramedical staff, is called data.

3. Classify the variables with one example of each:


(a) Qualitative variable
(b) Quantitative variable
(c) Continuous and discrete variable
(d) Independent and dependent variable
Define the following:
(a) Continuous variable
(b) Discrete variable
(c) Chronological data

Classify the variables with examples of each type

CONTINUOUS VARIABLE: A variable which can assume any value within a given range is called a
continuous variable. Examples of continuous variables are heights and weights of individuals,
level of mercury in a thermometer etc. The value of a continuous variable varies without any gaps
or jumps.
DISCONTINUOUS OR DISCRETE VARIABLE: A variable which can assume only some specific
values within a given range is called a discontinuous or discrete variable. Examples of discontinuous
variables are number of children in a family, number of students in a class.
DEPENDENT VARIABLE:
INDEPENDENT VARIABLE:
4. Define the following
(a) Primary data
(b) Secondary data
(c) Descriptive statistics
(d) Inferential statistics

Define the inferential statistics.

What is the difference between descriptive statistics and inferential statistics?

PRIMARY DATA: Primary data are the firsthand information collected, compiled and
published by an organization for a certain purpose. The data in the Population Census Reports
are primary because these are collected, compiled and published by the Population Census
Organization.
SECONDARY DATA: The data published or used by an organization other than the one which
originally collected them are known as secondary data.
DESCRIPTIVE STATISTICS: Descriptive statistics deals with collection of data, its presentation in
various forms, such as tables, graphs and diagrams, and finding averages and other measures
which would describe the data.
INFERENTIAL OR INDUCTIVE STATISTICS: Inferential statistics deals with techniques used for analysis of data, making
the estimates and drawing conclusions from limited information taken on sample basis and
testing the reliability of the estimates.
5. Describe the quantitative and qualitative data

QUALITATIVE DATA: Data which are described by a qualitative variable, e.g., marital status, sex,
etc. are called qualitative data.
QUANTITATIVE DATA: Data described by a quantitative variable, e.g., heights, weights etc. are
called quantitative data

6. How does statistics help in Nursing profession?


(i) It helps nurses to apply the most current and up to date research and evidence to
patient care delivery.
(ii) Researchers link the statistical analyses they choose with the research question, design
and level of data collected.
(iii) It allows the nurses to critically analyze the results.
(iv) It provides organization and meaning to data.
(v) It helps in comparison.
(vi) It helps to test null hypothesis.
(vii) It helps to draw inferences and make the generalization.

7. State the difference between


(a) Data on ordinal scale
(b) Data on interval scale

Discuss the ratio scale of measurement.

(a) Define scales. (01)


(b) Describe types of scale. (04)

Classify scales of measurement and write down briefly characteristics of each scale.

Enlist scales used for measurements of statistical data.

Classify scales of measurement and write down briefly characteristics of each scale

DATA ON NOMINAL SCALE: It is the weakest of the four measurement scales. The nominal scale
distinguishes one object or event from another on the basis of a name. For example, we classify
(name) items from an assembly line as defective or non-defective. A new born baby is male or
female. Students in a class may be judged as average, good, very good or excellent.
DATA ON ORDINAL SCALE: Objects or events measured on the ordinal scale are distinguished
from one another on the basis of the relative importance of some characteristic they possess.
For example, contestants in a race may be ranked 1, 2, 3, …… according to the order in which
they cross the finish line. Data of this type are usually called rank data.
DATA ON AN INTERVAL SCALE: An interval scale has equal units but an arbitrary zero point. A
familiar example of interval measurement is the measurement of temperature in Fahrenheit
degrees or Celsius degrees (centigrade). It is important to note that 0 is just a point on the scale.
It does not represent the absence of the condition. Zero degrees Fahrenheit does not represent the
absence of heat. In fact, 0 degrees Fahrenheit is about -18 degrees on the Celsius scale.
DATA ON RATIO SCALE: The scale of measurement is the ratio scale when measurements have
the properties of the first three scales and the additional property that their ratios are
meaningful, because the scale has a true zero point. The measurements of height and weight are examples of measurements on the ratio
scale (e.g., 30 Kg is thrice 10 Kg; 20 cm is twice 10 cm; 8 hours is four times 2 hours).

8. Define the following: (a) Proportion and Ratio


PROPORTION: A proportion is the number (a) of observations with a given characteristic divided by
the total number of observations (a + b) in a given group, i.e., Proportion = a / (a + b)
RATIO: A ratio is one part divided by another part, i.e., Ratio = a / b. For example, among physicians taking aspirin, the ratio
of those who had an MI to those who did not is 62/5200 ≈ 0.012.

9. Name the scale of measurement for the following examples


(a) Sex of the participant (Nominal scale)
(b) Social class categories (ordinal scale)
(c) Temperature recorded in centigrade (interval scale)
(d) Heart rate of the participant (ratio scale)
(e) Weight of an object (ratio scale)
(f) A person’s name (Nominal scale)

CHAPTER: 2 PRESENTATIONS OF DATA


1. Which type of data is good for a histogram?
Continuous or discrete data is good for a histogram.

2. How can we present the data diagrammatically?


We can present the data diagrammatically by
(I) Line graph
(II) Simple bar chart
(III) Multiple bar chart
(IV) Component bar chart
(V) Pie chart
(VI) Pictogram
(VII) Cartogram

3. Discuss the properties of Histogram.


(i) A histogram has an appearance similar to a vertical bar chart, but there are no gaps
between the bars.
(ii) A histogram will have bars of equal width.
(iii) Histogram gives an idea of the distribution of the data.
(iv) The histogram is a set of blocks.
(v) Horizontal scale shows income in thousands of dollars.
(vi) Area of blocks represent percentages.
CHAPTER: 3 MEASURES OF CENTRAL TENDENCY
1. Discuss the desirable properties of an ideal average.
(i) It should be clearly defined, preferably by a mathematical formula.
(ii) It should be simple to understand and easy to calculate.
(iii) It should be based on all the observations so that if we change the value of any
observation, the value of the average is also changed.
(iv) It should be capable of algebraic manipulation.
(v) It should not be affected by fluctuations of sampling.
(vi) It should
2. What are the common measures of central tendency?
The common measures of central tendency are (i) Arithmetic Mean (ii) Median (iii) Mode

3. Which type of average is suitable in these situations?

(a) Growth rate of population Geometric Mean


(b) Average sale at a shop Arithmetic Mean
(c) BP of normal person Arithmetic Mean
(d) BP of normal person and patient Median

4. Find a mean incubation period, median and mode of 9 polio cases given below:
17, 20, 18, 24, 16, 19, 21, 22, 23
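SOLUTION:

𝑥̅ = ∑x / n = (17 + 20 + 18 + 24 + 16 + 19 + 21 + 22 + 23) / 9 = 180 / 9 = 20
Arranged in ascending order: 16, 17, 18, 19, 20, 21, 22, 23, 24
Median = [(n + 1)/2]th value = 5th value = 20
Every value occurs only once, so the data have no mode.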
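
The three averages asked for in this question can be checked with a short Python sketch. It uses only the standard `statistics` module; the variable names are illustrative and not part of the original notes.

```python
import statistics

# Incubation periods (days) of the 9 polio cases from the question.
incubation_days = [17, 20, 18, 24, 16, 19, 21, 22, 23]

mean_days = statistics.mean(incubation_days)      # sum of values / number of values
median_days = statistics.median(incubation_days)  # middle value after sorting
modes = statistics.multimode(incubation_days)     # all values tied for most frequent

# If every distinct value is returned, each occurs equally often: no single mode.
has_mode = len(modes) < len(set(incubation_days))

print(mean_days, median_days, has_mode)
```

Here `multimode` returns every value because each occurs once, which is read as "no mode" under the definition used in these notes.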

5. In the given data: 1, 2, 3, 4, 5


(a) Calculate mean
(b) Calculate median

SOLUTION:

𝑥̅ = ∑X / n = (1 + 2 + 3 + 4 + 5) / 5 = 15 / 5 = 3

Median = [(n + 1)/2]th value
= [(5 + 1)/2]th value = 3rd value = 3

6. Define Mode.
MODE: The most repeated value in the data is called Mode.

7. Find the mean weight of 100 persons from the following frequency distribution
Weight (Kg):      45   50   55   60   65   70   75
No. of persons:    5   12   18   20   33   10    2

SOLUTION:
Weight in Kg (X)    No. of persons (f)    fX
45                  5                     225
50                  12                    600
55                  18                    990
60                  20                    1200
65                  33                    2145
70                  10                    700
75                  2                     150
Total               100                   6010

Mean = 𝑥̅ = ∑fX / ∑f = 6010 / 100 = 60.1
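
The mean of a frequency distribution is a weighted mean, which the following minimal Python sketch reproduces. It takes the frequencies from the solution table (they sum to 100); the variable names are illustrative.

```python
# Weights (Kg) and number of persons, as in the solution table.
weights_kg = [45, 50, 55, 60, 65, 70, 75]
persons = [5, 12, 18, 20, 33, 10, 2]

total_f = sum(persons)                                       # sum of f
total_fx = sum(x * f for x, f in zip(weights_kg, persons))   # sum of f * X
mean_weight = total_fx / total_f                             # mean = sum(fX) / sum(f)

print(total_f, total_fx, mean_weight)
```

The same pattern works for grouped data if the class marks (midpoints) are used in place of `weights_kg`.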

8. Calculate mode of the given data: 5, 6, 8, 6, 9, 4

Solution: Mode = 6

9. The following are the marks obtained by 96 students in an examination


Marks obtained: 0 – 9, 10 – 19, 20 – 29, 30 – 39, 40 – 49
Number of students: 7, 23, 46, 15, 5

SOLUTION:
MARKS      f     Class mark (X)    fX
0 – 9      7     4.5               31.5
10 – 19    23    14.5              333.5
20 – 29    46    24.5              1127.0
30 – 39    15    34.5              517.5
40 – 49    5     44.5              222.5
Total      96                      2232.0

𝑥̅ = ∑fX / ∑f = 2232 / 96 = 23.25

10. The gain in weights of 5 albino rats in a period of 5 days are 5, 6, 4, 8, 7. Calculate the mean.

SOLUTION:

𝑥̅ = ∑x / n = (5 + 6 + 4 + 8 + 7) / 5 = 30 / 5 = 6

11. In the given values 4, 5, 6, 7, 8, 9: (a) Calculate the median (b) calculate the mean

SOLUTION:

Median = [(n + 1)/2]th value
= [(6 + 1)/2]th value = (7/2)th value = 3.5th value
= 3rd value + 0.5 [4th value – 3rd value]
= 6 + 0.5 [7 – 6]
= 6 + 0.5 [1] = 6.5

𝑥̅ = ∑x / n = (4 + 5 + 6 + 7 + 8 + 9) / 6 = 39 / 6 = 6.5
12. (a) The arithmetic mean of a distribution is 40 and median is 43, find its mode.

Mode = 3 Median – 2 Mean
Mode = 3(43) – 2(40)
Mode = 129 – 80 = 49


13. (a) Define term measures of central tendency. (01) (b) Calculate the mean, median, mode and
standard deviation of the given data. X = 4, 5, 6, 7, 7, 7,8, 9, 10 (04)

CHAPTER: 4 MEASURES OF DISPERSION


1. Enlist the measures of dispersion of which four are absolute measures and three are relative
measures

ABSOLUTE MEASURES OF DISPERSION:


(I) Range
(II) Quartile Deviation
(III) Mean Deviation
(IV) Standard Deviation

RELATIVE MEASURES OF DISPERSION:

(I) Coefficient of Range


(II) Coefficient of variation
(III) Coefficient of Quartile Deviation
(IV) Coefficient of Mean Deviation
2. (a) There are a number of asymmetric distributions, but skewness is one of its kind. Define the term
skewness. (01) (b) Briefly describe and draw the types of skewness. (04)

3. Define the following: (i) Coefficient of variation (ii) Standard Deviation

COEFFICIENT OF VARIATION:
The coefficient of variation expresses the standard deviation as a percentage of the arithmetic
mean. Symbolically, the coefficient of variation, denoted by C.V., is given by C.V = (S.D / Mean) * 100
STANDARD DEVIATION:
The positive square root of the mean of the squared deviation of the values from their mean.

4. Briefly describe absolute and relative measures of dispersion.


There are two main types of dispersion namely absolute and relative dispersion.
ABSOLUTE MEASURE OF DISPERSION: It measures the variation present in the observation in the
unit of measurement.
RELATIVE MEASURE OF DISPERSION: It measures the variation in the values relative to their
central values. The relative dispersion is independent of the unit of measurement

5. Find the standard deviation of the respiration rate per minute found to be 16, 18, 10, 17, 21,
24, 22 and 23 in 8 individuals.

SOLUTION:
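
No worked figures are given for this question. A minimal Python sketch of the calculation follows. It assumes the definition used elsewhere in these notes (square root of the mean of squared deviations, i.e., divisor n) and also shows the divisor (n – 1) form for comparison. Variable names are illustrative.

```python
import math

# Respiration rates per minute of the 8 individuals in the question.
rates = [16, 18, 10, 17, 21, 24, 22, 23]
n = len(rates)

mean_rate = sum(rates) / n
sum_sq_dev = sum((x - mean_rate) ** 2 for x in rates)  # sum of (X - mean)^2

sd_population = math.sqrt(sum_sq_dev / n)        # divisor n (definition in these notes)
sd_sample = math.sqrt(sum_sq_dev / (n - 1))      # divisor n - 1 (shown for comparison)

print(mean_rate, round(sd_population, 2), round(sd_sample, 2))
```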

6. Briefly describe absolute and relative measures of dispersion.


7. Find the quartiles of the following raw scores: 27, 19, 25, 33, 29, 35, 20
First arranging the given raw scores in ascending order of magnitude, i.e., 19, 20, 25, 27, 29, 33,
35

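SOLUTION:
Q1 = [(n + 1)/4]th value = (7 + 1)/4 = 8/4 = 2nd term; the value of the 2nd term is 20

Q2 = [(n + 1)/2]th value = (7 + 1)/2 = 8/2 = 4th term; the value of the 4th term is 27

Q3 = [3(n + 1)/4]th value = 3(7 + 1)/4 = 3(8)/4 = 24/4 = 6th term; the value of the 6th term is 33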

8. Find the quartile deviation of the weights (in Kg) of 07 people given below: 55, 50, 62, 58, 65,
48, 52.

SOLUTION:
Arranging the given values in ascending order of magnitude, i.e., 48, 50, 52, 55, 58, 62, 65
Q1 = [(n + 1)/4]th value = (7 + 1)/4 = 8/4 = 2nd value = 50

Q3 = [3(n + 1)/4]th value = 3(7 + 1)/4 = 3(8)/4 = 24/4 = 6th value = 62

Quartile Deviation = (Q3 – Q1)/2 = (62 – 50)/2 = 6
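
The (n + 1)/4 positional rule used in the two quartile solutions above can be sketched in Python as follows. The function names are hypothetical, and the linear interpolation for fractional positions is an assumption carried over from the median example in Chapter 3; statistical packages often use other quartile conventions.

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based, possibly fractional, position (linear interpolation)."""
    lower = int(position)
    frac = position - lower
    value = sorted_values[lower - 1]
    if frac > 0 and lower < len(sorted_values):
        value += frac * (sorted_values[lower] - sorted_values[lower - 1])
    return value


def quartiles(values):
    """Q1, Q2, Q3 by the (n + 1)/4, (n + 1)/2, 3(n + 1)/4 position rule."""
    data = sorted(values)
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 observations for this rule")
    return tuple(value_at_position(data, k * (n + 1) / 4) for k in (1, 2, 3))


scores_q = quartiles([27, 19, 25, 33, 29, 35, 20])
weights_q = quartiles([55, 50, 62, 58, 65, 48, 52])
quartile_deviation = (weights_q[2] - weights_q[0]) / 2

print(scores_q, weights_q, quartile_deviation)
```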

9. Five patients had the following Hemoglobin level in grams: 3, 5, 2, 7, 8


Calculate the mean, median and standard deviation

SOLUTION:

𝑥̅ = ∑X / n = (3 + 5 + 2 + 7 + 8) / 5 = 25 / 5 = 5

After arranging the data in ascending order, we get 2, 3, 5, 7, 8

Median = [(n + 1)/2]th value
= [(5 + 1)/2]th value = (6/2)th value = 3rd value
The median of these values is 5

X           (X – 𝑥̅)²
3           4
5           0
2           9
7           4
8           9
Total 25    26

Variance = ∑(X – 𝑥̅)² / n = 26 / 5 = 5.2
S.D = √5.2 ≈ 2.28

10. A child born to Mrs. X every year for 7 consecutive years compute standard deviation of
children’s age when youngest is 9 years old.

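SOLUTION:
The ages of the seven children are 9, 10, 11, 12, 13, 14 and 15 years.
x          x²
9          81
10         100
11         121
12         144
13         169
14         196
15         225
∑x = 84    ∑x² = 1036

S = √[∑x²/n − (∑x/n)²]
S = √[1036/7 − (84/7)²] = √[148 − (12)²] = √(148 − 144) = √4 = 2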
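
As a check, the shortcut formula S = √[∑x²/n − (∑x/n)²] can be compared with the definitional form in a few lines of Python. This is a sketch only; `statistics.pstdev` is the standard-library population standard deviation (divisor n).

```python
import math
import statistics

# Ages of the seven children: youngest is 9, one child born each year.
ages = list(range(9, 16))
n = len(ages)

sum_x = sum(ages)                    # 84
sum_x2 = sum(x * x for x in ages)    # 1036

sd_shortcut = math.sqrt(sum_x2 / n - (sum_x / n) ** 2)
sd_definition = statistics.pstdev(ages)  # sqrt of mean squared deviation

print(sd_shortcut, sd_definition)
```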

CHAPTER: 5 PROBABILITIES
1. Define the following terms: (a) Probability

PROBABILITY:
The chance of occurrence of an event, i.e., the relative frequency with which the event is expected to occur on average.

2. State the difference between: (a) Equally likely events (b) Mutually exclusive events

EQUALLY LIKELY EVENTS:


When each outcome of a sample space is as likely to occur as any other, the outcomes are said
to be equally likely. For example, if we toss a fair coin, the head is as likely to occur as the tail.
MUTUALLY EXCLUSIVE EVENTS:
If the occurrence of an outcome prevents the occurrence of other outcomes, i.e., if one
outcome occurs, others cannot occur, they are called mutually exclusive outcomes. Suppose we
toss a coin. If the head occurs the tail cannot occur.

CHAPTER: 6 NORMAL DISTRIBUTIONS


1. Discuss the importance of normal distribution curve.
(i) It is the most important of the continuous probability distributions.
(ii) It provides a reasonable approximation in a great many situations.
(iii) The inference procedures based on normal distribution have wide applications.
(iv) Data obtained from biological measurements approximately follow normal distribution.
(v) For large samples, any statistic (sample mean, sample standard deviation)
approximately follows normal distribution.
(vi) Normal distribution also forms the basis for the tests of significance.

2. All normal distributions have a particular internal distribution of area under the curve:
whether the mean or standard deviation is large or small, the relative area between any two
designated points is always the same. How much area is included under the normal curve
within µ ± 1σ, µ ± 2σ, µ ± 3σ?

SOLUTION:
µ ± 1σ contains 68.27%
µ ± 2σ contains 95.45%
µ ± 3σ contains 99.73%
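
These three areas can be verified numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `area_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Area under the normal curve between mean - k*SD and mean + k*SD."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Percentage of area within 1, 2 and 3 standard deviations of the mean.
areas_percent = {k: round(100 * area_within(k), 2) for k in (1, 2, 3)}
print(areas_percent)
```

Because the areas depend only on the number of standard deviations from the mean, the same percentages hold for any mean and standard deviation.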

3. Assume that the systolic blood pressure of men is normally distributed with mean B.P 120mmHg
and standard deviation 10mmHg. (a) Determine the proportion of men whose B.P is above
140mmHg (b) What is the value of B.P such that 5% of men have B.P above it?
SOLUTION:
(a) µ = 120, 𝜎 = 10mmHg, P(X ≥ 140) = ?
Z = (X – µ)/𝜎 = (140 – 120)/10 = 2     P(X ≥ 140) = P(Z ≥ 2) = 0.0228

(b) For the required point X, the probability above it is 0.05

For an upper-tail area of 0.05, the Z value is 1.64
Z = (X – µ)/𝜎 → 1.64 = (X – 120)/10
1.64 * 10 = X – 120
X = 16.4 + 120 = 136.4
P(X ≥ 136.4) = 0.05
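
Both parts can be cross-checked without tables. This is a minimal sketch using the standard-library `statistics.NormalDist`. The software uses the unrounded Z value (about 1.645), so part (b) comes out slightly above the table-based 136.4.

```python
from statistics import NormalDist

# Systolic B.P of men: mean 120 mmHg, standard deviation 10 mmHg.
bp = NormalDist(mu=120, sigma=10)

# (a) Proportion of men with B.P above 140 mmHg.
p_above_140 = 1 - bp.cdf(140)

# (b) B.P value with 5% of men above it, i.e., the 95th percentile.
cutoff_top_5_percent = bp.inv_cdf(0.95)

print(round(p_above_140, 4), round(cutoff_top_5_percent, 1))
```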

4. What are the qualities of a normal distribution curve?


Discuss the properties of Normal distribution curve.

(i) The distribution is bell-shaped, unimodal and symmetrical.


(ii) The total area under the normal curve is one.
(iii) The parameters µ and σ of the normal distribution are in fact its mean and standard
deviation.
(iv) As the distribution is symmetrical, its mean, median and mode coincide and are equal to
µ.
(v) The function has a positive value for every value of X. The curve, therefore, lies entirely
above the X-axis.
(vi) All odd moments about mean are zero.
(vii) The normal distribution has two parameters mean and standard deviation.
(viii) The mean deviation is approximately 4/5 of SD and quartile deviation is approximately 2/3 of SD.
(ix) In normal distribution, the limits µ ± 1σ contain 68.27%, µ ± 2σ contain 95.45%, µ ± 3σ
contain 99.73% of the area.
5. What is the difference between normal distribution and standard normal distribution?

CHAPTER: 7 SAMPLING TECHNIQUES


1. (a) Define sampling. (1)(b) briefly describe its types along with its application in research. (4)
2. Distinguish between a parameter and statistic.
What do you understand by the term sample?
Define the following: (a) sample and population (b) statistic and parameter
Distinguish between a parameter and a statistic

SAMPLE: A sample is a part of the whole selected with the object that it will represent the
characteristics of the whole.
POPULATION: The whole from which a sample is drawn is known, in statistical language, as
population or universe.
PARAMETER: A numerical value such as mean, median or standard deviation calculated from
the population is called population parameter or simply a parameter.
STATISTIC: A numerical value such as mean, median and standard deviation calculated from the
sample is called sample statistic or simply a statistic.

3. Enlist various methods of sampling.


Sampling techniques or methods may be categorized into two main headings.
(a) PROBABILITY SAMPLING / RANDOM SAMPLING
(i) Simple Random Sampling
(ii) Stratified Random Sampling
(iii) Systematic Sampling
(iv) Cluster Sampling
(b) NON-PROBABILITY SAMPLING
(i) Judgement or Purposive Sampling
(ii) Quota Sampling

4. Write a brief note on cluster sampling along with examples.

CLUSTER SAMPLING: In cluster sampling, we first select at random clusters (i.e., groups) of
individual items from the population and then choose all or sub-sample of the items within each
cluster to make up the overall sample.

5. Define the following: (a) systematic and stratified sampling


How is systematic random sample drawn?
Write a brief note on systematic random sampling along with one example

STRATIFIED RANDOM SAMPLING: In this sampling procedure, the material or area to be
sampled is divided into groups or classes called strata. Items within each such stratum are
homogeneous with respect to the characteristics under study. From each stratum, a simple
random sample is taken and the overall sample is obtained by combining the samples for all
strata. This overall sample is called a stratified random sample.

SYSTEMATIC SAMPLING: Here the samples are equally spaced throughout the area or population
to be sampled. For example, in house-to-house sampling, every tenth or twentieth house may
be taken. More specifically, a systematic sample is obtained by taking every kth unit in the
population after the units in the population have been numbered.
To select a sample of n = 5 units from a population of N = 50 units numbered 1 to N, we take
a unit at random from the first k = N/n = 10 units and every 10th unit afterwards. For instance, if the
first unit drawn is number 8, the subsequent units are those numbered 18, 28, 38 and 48. The
selection of the first unit determines the whole sample.
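
The systematic sampling procedure described above can be sketched in Python. The function name and parameters are hypothetical, and the population size of 50 is an assumption chosen so that k = 10 gives a sample of 5 units.

```python
import random


def systematic_sample(population_size, k, rng=None):
    """Pick a random start among the first k units, then every kth unit after it."""
    rng = rng or random.Random()
    start = rng.randint(1, k)  # random start between 1 and k inclusive
    return list(range(start, population_size + 1, k))


# Units numbered 1 to 50 (assumed), sampling interval k = 10.
sample = systematic_sample(population_size=50, k=10, rng=random.Random(0))

# With a start of 8, the sample is fixed by the first unit alone.
sample_from_8 = list(range(8, 50 + 1, 10))

print(sample, sample_from_8)
```

Only the start is random; every later unit follows from it, which is why the first unit determines the whole sample.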

6. Define probability sampling and Non probability sampling. Enlist type of Non probability
sampling.
PROBABILITY SAMPLING: Probability sampling is a procedure of drawing a sample in which
following a sampling plan, every element of the population has a known probability of being
included in the sample.
NON-PROBABILITY SAMPLING: It is a procedure of drawing a sample in which the sample
elements are arbitrarily selected by the sampler because in his judgement the elements thus
chosen will most effectively represent the population.
TYPES OF NON-PROBABILITY SAMPLING:
(i) Judgement or Purposive Sampling
(ii) Quota Sampling

CHAPTER: 9 ESTIMATION AND TESTING OF HYPOTHESIS


1. Define the following: (a) Null Hypothesis and Alternative Hypothesis (b) Level of significance
(c) degree of freedom
Differentiate between null hypothesis and alternative hypothesis.
NULL HYPOTHESIS: An assumption to be tested for possible rejection is called a null hypothesis
and is denoted by H0.
ALTERNATIVE HYPOTHESIS: Any hypothesis that is different from the null hypothesis and is set
up in parallel to the null hypothesis is called an alternative hypothesis and is denoted by H1.
LEVEL OF SIGNIFICANCE: The probability of making type-I error is called the level of significance
of the test and is denoted by α.
DEGREE OF FREEDOM: If there are n values or numbers in a sample with a specified mean, then
the degrees of freedom would be (n – 1). But if there are two samples of sizes n1 and n2 with
specified means 𝑋1 and 𝑋2, then the degrees of freedom would be (n1 + n2 – 2).

2. Enumerate type-I and type-II errors in brief according to statistical inference in hypothesis
testing.
Briefly describe type-I and type-II errors. (5)

TYPE-I ERROR: Rejection of H0 when H0 is true

TYPE-II ERROR: Acceptance of H0 when H1 is true

The probability of making type-I and type-II errors is denoted by α and β respectively.

3. A random sample of size 64 is drawn from a finite population consisting of 122 units. If
population standard deviation is 16.8, find the standard error of mean.

SOLUTION:
S.E. of sample mean = 𝜎/√𝑛 = 16.8/ √64 =16.8/8 = 2.1

4. (a) What is standard error of proportion? (b) when is it used?


(a) A statistic indicating how greatly a particular sample proportion, p, is likely to differ from the
proportion in the population, P. Standard error of proportion is calculated as
S.E(p) = √(pq/n), where q = 1 – p and n is the sample size.
(b) It is useful for determining the proportion or percentage above or below the mean of a group using
the area under the normal curve. It is also used to measure the precision of the sampling distribution of a statistic: the larger
the standard error, the smaller the precision, and vice versa.

5. Systolic blood pressure of 566 males was measured and standard deviation 13.5mm of Hg.
Calculate standard error

SOLUTION:
S.E. (𝑥̅) = S.D/√n = 13.5/√566 = 13.5/23.79 ≈ 0.57

CHAPTER: 10 LARGE SAMPLE TEST


1. A random sample of size 30 is drawn from a finite population consisting of 121 units. If the population
standard deviation is 11, find the standard error of the sample mean when the sample is drawn (i)
with replacement, and (ii) without replacement.

SOLUTION:

(I) When the sample is drawn with replacement, the standard error of the sample mean is
S.E. of 𝑥̅ = 𝜎/√n = 11/√30 = 11/5.48 ≈ 2

(II) When the sample is drawn without replacement, the standard error of the sample mean is
S.E. of 𝑥̅ = (𝜎/√n) * √[(N – n)/(N – 1)] = 2 * √[(121 – 30)/(121 – 1)] = 2 * √(91/120) = 2 * 0.87 = 1.74
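
The two standard errors can be computed with one small Python function. This is a sketch: the function name and the optional `population_size` argument are illustrative. When a population size is supplied, the finite population correction √[(N – n)/(N – 1)] is applied.

```python
import math


def standard_error_mean(sigma, n, population_size=None):
    """S.E. of the sample mean; applies the finite population correction
    when population_size is given (sampling without replacement)."""
    se = sigma / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


se_with_replacement = standard_error_mean(sigma=11, n=30)
se_without_replacement = standard_error_mean(sigma=11, n=30, population_size=121)

print(round(se_with_replacement, 3), round(se_without_replacement, 3))
```

Without intermediate rounding the second value is about 1.75, slightly above the hand-calculated 1.74, which rounds 2.008 to 2 before multiplying.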

2. A seed company claims that 90% of its radish seeds germinate. Out of 100 planted, 14 failed
to germinate. Formulate the hypothesis about the company’s claim and perform a test of
hypothesis.

SOLUTION:

Null hypothesis: H0: P = 0.90
Alternative hypothesis: H1: P ≠ 0.90
Level of significance (𝛼) = 0.05
Test statistic: Z = (p – P)/√(PQ/n), where Q = 1 – P
Reject H0 if |𝑍𝑐𝑎𝑙| ≥ 1.96
Calculations:
p = (100 – 14)/100 = 0.86
Z = (0.86 – 0.90)/√[(0.90)(0.10)/100] = –0.04/0.03 = –1.33, so |𝑍𝑐𝑎𝑙| = 1.33
Since |𝑍𝑐𝑎𝑙| = 1.33 < 1.96, H0 is accepted
Sample proportion → p (small p)
Population proportion → P (capital P)
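
The steps of this large-sample test of a proportion can be sketched in Python as follows. Variable names are illustrative. The critical value 1.96 is the two-tailed 5% point of the standard normal distribution used in the solution.

```python
import math

claimed_P = 0.90   # population proportion under H0
n = 100            # seeds planted
failed = 14        # seeds that failed to germinate

p = (n - failed) / n                                    # sample proportion
standard_error = math.sqrt(claimed_P * (1 - claimed_P) / n)
z = (p - claimed_P) / standard_error                    # test statistic

critical_value = 1.96                                   # two-tailed, alpha = 0.05
reject_h0 = abs(z) >= critical_value

print(p, round(z, 2), reject_h0)
```

The sign of Z is negative because the sample proportion falls below the claimed value; only its absolute value is compared with 1.96.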

CHAPTER: 11 SMALL SAMPLE TEST


1. What are the properties of Student’s t-distribution?

(i) t-distribution is asymptotic to x-axis.


(ii) The shape of the curve or form of t-distribution varies with degree of freedom.
(iii) t-distribution is a symmetrical distribution with mean zero.
(iv) t-distribution has a greater spread than normal distribution.
(v) t-distribution becomes closer to the Z distribution with increasing degree of freedom.
(vi) Theoretically, when d.f approaches to ∞, then the t-distribution approaches the
standard normal curve.

2. Write the procedure to test the hypothesis about the population mean in small sample.

The steps for testing a hypothesis about the mean of a normal population with unknown
variance against various alternative hypothesis (based on a sample of size n < 30) are
summarized as follows:
(i) H0 : µ = µ0 versus H1 : µ < µ0 or µ > µ0 or µ ≠ µ0
(ii) Choose the level of significance α
(iii) Determine the critical region
t < -tα for the alternative µ < µ0
t > tα for the alternative µ > µ0
|𝑡| > tα/2 (t < -tα/2 or t > tα/2) for the alternative µ ≠ µ0
(iv) Test to be used: t = (𝑥̅ – µ0)/(s/√n), with (n – 1) degrees of freedom, where s is the sample standard deviation

(v) Calculations
(vi) Draw the conclusion

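3. The height of 10 students selected at random from a school had a mean of 116 cm and a variance of
96 cm². Test the hypothesis at the 5% level of significance that students are on average less than
120 cm in all.

SOLUTION:
Null Hypothesis H0 : µ = 120
Alternative Hypothesis H1: µ < 120
Level of significance: α = 0.05
Critical region: t < -t0.05 = -1.833 with d.f. = n – 1 = 9
S.E of 𝑥̅ = S/√(n – 1) = √96/√9 = 9.80/3 ≈ 3.27 (taking 96 as the sample variance computed with divisor n)
Test statistic: t = (𝑥̅ – µ)/S.E.(𝑥̅) = (116 – 120)/3.27 = -1.22 or |𝑡| = 1.22

Since t = -1.22 does not fall in the critical region, H0 is accepted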
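
A minimal Python sketch of this small-sample test follows. It rests on two assumptions that are not stated in the question: the variance of 96 is treated as computed with divisor n, so the standard error is S/√(n – 1), and the one-tailed table value t0.05 with 9 d.f. is taken as 1.833.

```python
import math

n = 10
sample_mean = 116.0
sample_variance = 96.0   # assumed to be computed with divisor n
mu_0 = 120.0             # hypothesised population mean

standard_error = math.sqrt(sample_variance) / math.sqrt(n - 1)
t = (sample_mean - mu_0) / standard_error

t_critical = 1.833       # assumed table value: one-tailed, alpha = 0.05, 9 d.f.
reject_h0 = t < -t_critical   # lower-tailed test, H1: mu < 120

print(round(standard_error, 3), round(t, 3), reject_h0)
```

If the variance were instead computed with divisor (n – 1), the standard error would be √96/√10 and the statistic would change slightly, but the conclusion would be the same.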

CHAPTER: 12 CHI-SQUARE TEST


1. Discuss the properties of chi-square distribution.
(i) Chi-square values increase with the increase in degree of freedom.
(ii) The value of ꭓ2 lies between 0 and ∞.
(iii) Chi-square curve is always positively skewed.
(iv)
2. Nephropathy was observed in 100 cases of each class diabetics, divided into 4 classes as per
severity of disease.
Class:                 1     2     3     4
Observed frequency:    8     15    14    7
Is this inequality in different classes due to severity? (ꭓ2 0.05,3 = 7.81)

SOLUTION:
Null Hypothesis: H0: the severity of disease and incidence of nephropathy are independent
Alternative Hypothesis: H1: Not independent
Test statistic: ꭓ2 = ∑ (O – E)²/E
Calculation:
Expected frequency in each class should be the same, i.e., (8 + 15 + 14 + 7)/4 = 44/4 = 11
O       E       O – E     (O – E)²/E
8       11      -3        0.82
15      11      4         1.45
14      11      3         0.82
7       11      -4        1.45
Total                     4.54
Here d.f. = 4 – 1 = 3
ꭓ2 = ∑ (O – E)²/E = 4.54
The table value of ꭓ2 at α = 0.05 with 3 d.f. is 7.81
Since ꭓ2 calculated = 4.54 < ꭓ2 0.05,3 = 7.81
H0 is accepted
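
The goodness-of-fit calculation can be reproduced with a few lines of Python. This sketch uses the observed frequencies as given in the question (8, 15, 14, 7) and the table value 7.81 quoted there; the variable names are illustrative.

```python
# Observed nephropathy cases in the four classes of diabetics.
observed = [8, 15, 14, 7]

# Under H0 every class has the same expected frequency.
expected_each = sum(observed) / len(observed)

chi_square = sum((o - expected_each) ** 2 / expected_each for o in observed)
degrees_of_freedom = len(observed) - 1

critical_value = 7.81    # table value at alpha = 0.05 with 3 d.f.
reject_h0 = chi_square > critical_value

print(expected_each, round(chi_square, 2), degrees_of_freedom, reject_h0)
```

Summing the unrounded terms gives about 4.55; the 4.54 in the hand calculation comes from rounding each term to two decimals first.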
3. From the following data, use ꭓ2-test and conclude whether inoculation is effective in
preventing tuberculosis.
Group             Attacked    Non Attacked    Total
Inoculated        10          90              100
Non inoculated    26          74              100
Total             36          164             200

SOLUTION:
Expected frequency of each cell = (row total * column total)/grand total, e.g., 100 * 36/200 = 18
CELL     Observed frequency    Expected frequency    (O – E)²      (O – E)²/E
A        10                    18                    (-8)² = 64    3.555
B        26                    18                    8² = 64       3.555
C        90                    82                    8² = 64       0.780
D        74                    82                    (-8)² = 64    0.780
TOTAL                                                              8.67
The test statistic is
ꭓ2 = ∑ (O – E)²/E = 8.67 with d.f (2-1)(2-1) = 1
The table value of ꭓ2 at α = 0.05 with 1 d.f. is 3.84
ꭓ2 (calculated) = 8.67 > ꭓ2 (table value) = 3.84
The null hypothesis is rejected, so inoculation appears to be effective in preventing tuberculosis
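
For a 2 x 2 table, the expected frequencies come from the row and column totals. The Python sketch below shows the calculation without a continuity correction, which matches the hand calculation. Variable names are illustrative, and 3.84 is the table value at α = 0.05 with 1 d.f.

```python
# Rows: inoculated, non inoculated. Columns: attacked, non attacked.
table = [[10, 90],
         [26, 74]]

row_totals = [sum(row) for row in table]
column_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        # Expected frequency = row total * column total / grand total.
        expected = row_totals[i] * column_totals[j] / grand_total
        chi_square += (observed - expected) ** 2 / expected

critical_value = 3.84    # table value at alpha = 0.05 with 1 d.f.
reject_h0 = chi_square > critical_value

print(column_totals, round(chi_square, 2), reject_h0)
```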

CHAPTER: 13 CORRELATION AND REGRESSION


1. Discuss the properties of regression line.
(i) The sum of errors is zero i.e., ∑(Y - Ŷ) = 0
(ii) The sum of observed values is equal to the sum of fitted values.
(iii) The sum of the squared errors is minimum.
(iv) The regression line always passes through the point (𝑥̅, ȳ).
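
Properties (i), (ii) and (iv) can be checked numerically on a small data set. The sketch below fits a least-squares line to made-up (hypothetical) values of X and Y using the usual formulas slope = ∑(x – 𝑥̅)(y – ȳ)/∑(x – 𝑥̅)² and intercept = ȳ – slope * 𝑥̅.

```python
# Hypothetical data, chosen only to illustrate the properties.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n

slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

fitted = [intercept + slope * x for x in xs]       # fitted values of Y
errors = [y - f for y, f in zip(ys, fitted)]       # observed minus fitted

print(slope, intercept, sum(errors))
```

Up to floating-point rounding, the errors sum to zero, the fitted values sum to the same total as the observed values, and the line evaluated at 𝑥̅ gives ȳ.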

2. Define correlation and interpret the value of correlation coefficient ‘r’ = -0.982.
