With Lots of Lucks Statistics Pa g e |1

1. a) What is the difference between a qualitative and quantitative variable?

Ans: Qualitative variables are based on qualitative aspects or descriptive characteristics of a
phenomenon, viz. sex, beauty, literacy, honesty, intelligence, religion, eye-sight etc.

Such variables are usually dichotomous in nature in which the whole data are divided
into two groups viz. a group with presence of the attribute and a group with absence of the
attribute such as blind and not blind, deaf and not deaf etc.

However, in certain cases the classification can also be manifold, with the data grouped
under more than two classes. This type of classification is made when the qualitative aspect
is defined by some grade or performance. For instance, in the field of education, the
classification can be made into different groups, viz. primary, secondary, higher secondary,
and higher education. Similarly, on the basis of eye-sight, the data may be grouped under
different grades, viz. A, B, C, etc. Further, when more than one attribute is taken into
consideration at a time, the classification will also be manifold.

Quantitative variables are numerical in nature. Put simply, these variables can be measured in
quantitative terms, for example mark, income, expenditure, profit, loss, height, weight, age,
price, production etc., all of which are capable of quantitative expression and measurement.
A quantitative variable may be defined as a characteristic which varies in amount or magnitude
at different times and places, e.g. mark, age, and height. These variables can be of two
types, viz. a) discrete variables, b) continuous variables. A variable that assumes only some
specified values in a given range is known as a discrete variable. A variable that can assume
any value within a given range is known as a continuous variable.

b) Before answering this question we need to know what a population is and what a sample is.
The totality of all individuals in a survey is called the population or universe. If the number of
objects in a population is finite then it is called a finite population; otherwise it is known as
an infinite population.

A sample is a part or subset of the population. By studying the sample, we can predict
the characteristics of the entire population from which the sample is taken. A measure that
describes a characteristic of the sample is known as a statistic.

Now if we interview only one particular neighborhood then it would be a sample survey,
not a population survey, because we interviewed every individual of one particular group
and not the whole population. Selecting 100 people at random from all neighborhoods for a
survey, on the other hand, would give a random sample.
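
The contrast between the two survey designs can be sketched in Python. The population below is hypothetical (five neighborhoods of 200 residents each, with made-up labels); only the idea of interviewing one neighborhood versus drawing 100 people at random from all neighborhoods comes from the answer above.

```python
import random

# Hypothetical population: 5 neighborhoods (N1..N5) with 200 residents each.
population = [(f"N{n}", person) for n in range(1, 6) for person in range(200)]

# Interviewing everyone in one neighborhood: a sample, but not a random one.
one_neighborhood = [unit for unit in population if unit[0] == "N1"]

# Simple random sample of 100 people drawn without replacement from the
# whole population, so every resident has the same chance of selection.
rng = random.Random(42)  # fixed seed only to make the sketch repeatable
random_sample = rng.sample(population, 100)

print(len(population), len(one_neighborhood), len(random_sample))
```

The first design covers a single group completely, while the second gives every individual in the population an equal chance of being chosen.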

2. a) Explain the steps involved in planning of a statistical survey?

Ans: Stages in a statistical survey:
1. Nature of the problem to be investigated should be clearly defined in an unambiguous
manner.
2. Objectives of the investigation should be stated at the outset. Objectives could be to:
§ Obtain certain estimates.
§ Establish a theory.
§ Verify an existing statement.
§ Find relationship between characteristics.
3. The scope of investigation has to be made clear. The scope of the investigation refers to
the area to be covered, identification of units to be studied, nature of characteristics to
be observed, accuracy of measurement, analytical method, time, cost and other resources.
4. Whether to use data collected from primary sources or secondary sources should be
determined in advance.
5. The organization of investigation is the final step in the process. It encompasses the
determination of the number of investigators required, their training, supervision work
needed, funds required.

b) What are the merits & demerits of Direct personal observation and Indirect Oral Interview?

Ans: Direct personal observation: In the direct personal observation method, the
investigator collects data by having direct contact with the units of investigation. The accuracy
of the data depends upon the ability, training, and attitude of the investigator.
Merits:
• We get original data, which are more accurate and reliable.
• Satisfactory information can be extracted by the investigator through indirect questions.
• Data are homogeneous and comparable.
• Additional information can be gathered.
• Misinterpretation of questions can be avoided.
Demerits:
• This method costs more money.
• This method takes more time.
• This cannot be used when the scope of the investigation is wide.
Indirect oral interview: Indirect oral interview is used when the area to be covered is large.
The investigator collects the data from a third party, a witness or the head of the institution. This
method is generally used by police departments in enquiries into the causes of fires,
thefts or murders.
Merits:
• Economical in terms of time, cost and manpower.
• Confidential information can be collected.
• Information is likely to be unbiased and reliable.
Demerits:
• The degree of accuracy of information is less.

3. a)
Central value   Limits   Frequency   Less than   Cum. freq.   Greater than   Cum. freq.
5               0-10     5           10          5            0              63
15              10-20    11          20          16           10             58
25              20-30    21          30          37           20             47
35              30-40    16          40          53           30             26
45              40-50    10          50          63           40             10
Total                    63

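Now from the meeting point of these two ogives, if we draw a perpendicular to the X
axis, the point where it meets the X axis gives the median of the series. The two ogives
cross at a cumulative frequency of N/2 = 63/2 = 31.5, which lies in the 20-30 class, and
the perpendicular meets the X axis at about 27.4.

By actual calculation
Here N = 63, so N/2 = 31.5 and the median class is 20-30 (cumulative frequency below it
c = 16, frequency f = 21, class width 10).
Median = 20 + (10/21) × (31.5 - 16) = 20 + 7.38 = 27.38
So the ogive median and the calculated median agree.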
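
As a cross-check, the cumulative columns and the median can be recomputed from the class limits and frequencies in the table. This is a minimal sketch using the standard interpolation formula for grouped data; the function and variable names are illustrative, not part of the original answer.

```python
# (lower limit, upper limit, frequency) for each class in the table.
classes = [(0, 10, 5), (10, 20, 11), (20, 30, 21), (30, 40, 16), (40, 50, 10)]
n = sum(f for _, _, f in classes)

# Points of the "less than" and "greater than" ogives.
less_than, greater_than = [], []
cum = 0
for lower, upper, f in classes:
    greater_than.append((lower, n - cum))  # items at or above the lower limit
    cum += f
    less_than.append((upper, cum))         # items below the upper limit

def grouped_median(classes):
    """Median of grouped data by linear interpolation in the median class."""
    total = sum(f for _, _, f in classes)
    half = total / 2
    below = 0  # cumulative frequency of the classes before the current one
    for lower, upper, f in classes:
        if f > 0 and below + f >= half:
            return lower + (half - below) / f * (upper - lower)
        below += f
    raise ValueError("no observations")

median = grouped_median(classes)
print(less_than)
print(greater_than)
print(round(median, 2))
```

The two ogives cross where both cumulative frequencies equal N/2, which is the same point the interpolation formula locates.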

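b)
Size            f           cf
1000-1500       120         120
1500-2000       f1          120+f1
2000-2500       400         520+f1
2500-3000       500         1020+f1
3000-4000       410-f1*     1430
4000-5000       50          1480
5000-6000       20          1500
* The total frequency is 1500 and the known frequencies add up to 1090, so the two missing
frequencies together are 1500 - 1090 = 410.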

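Median = (N/2)th item = 1500/2 = 750th item, and the median is 2600 (given).
This lies in the 2500-3000 group.
Now M = L1 + ((L2 - L1)/f) × (m - c)

where L1 and L2 are the limits of the median class, f is its frequency, m = N/2 and c is the
cumulative frequency of the class before it.
=> 2600 = 2500 + ((3000 - 2500)/500) × (750 - (520 + f1))
=> 2600 = 2500 + (500/500) × (750 - 520 - f1)
=> 2600 = 2500 + 230 - f1
=> 2600 - 2500 = 230 - f1
=> 100 = 230 - f1
=> f1 = 130
Then f2 = 410 - 130 = 280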
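
The same result can be checked by rearranging the median formula for f1. A minimal sketch; the variable names are illustrative and the numbers are those given in the question and the table.

```python
# Known quantities from the question and the frequency table.
N = 1500
median = 2600
lower, upper = 2500, 3000   # limits of the median class
f_median = 500              # frequency of the median class
cf_known = 520              # cumulative frequency below 2500, excluding f1
missing_total = 410         # f1 + f2, since all frequencies sum to 1500

# median = lower + (upper - lower) / f_median * (N/2 - (cf_known + f1))
# Rearranged for f1:
f1 = N / 2 - cf_known - (median - lower) * f_median / (upper - lower)
f2 = missing_total - f1

print(f1, f2)
```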

Ci              f       m       fm
1000-1500       120     1250    150000
1500-2000       130     1750    227500
2000-2500       400     2250    900000
2500-3000       500     2750    1375000
3000-4000       280     3500    980000
4000-5000       50      4500    225000
5000-6000       20      5500    110000
Total           1500            3967500

Mean = Σfm / Σf
= 3967500/1500 = 2645 (ans)
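
The table totals and the mean can be reproduced in a few lines. This is only a sketch of the arithmetic above, with each class represented by its mid-point m.

```python
# (lower limit, upper limit, frequency), using f1 = 130 and f2 = 280.
classes = [
    (1000, 1500, 120),
    (1500, 2000, 130),
    (2000, 2500, 400),
    (2500, 3000, 500),
    (3000, 4000, 280),
    (4000, 5000, 50),
    (5000, 6000, 20),
]

total_f = sum(f for _, _, f in classes)
# fm = frequency times class mid-point.
total_fm = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
mean = total_fm / total_f

print(total_f, total_fm, mean)
```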

4. a) What is the main difference between correlation analysis and regression analysis?

Ans: Correlation analysis: When two or more variables move in sympathy with each other, they
are said to be correlated. If both variables move in the same direction then they are said to be
positively correlated. If the variables move in opposite directions then they are said to be
negatively correlated. If they move haphazardly then there is no correlation between them.

Regression analysis: Regression analysis is used to estimate the values of the dependent
variable from the values of the independent variables. It is also used to get a
measure of the error involved in using the regression line as a basis for estimation.
The regression coefficients can be used to calculate the correlation coefficient.
The main difference between the two is that correlation analysis attempts to study the
relationship between the variables ‘X’ and ‘Y’, whereas regression analysis attempts to predict the
average ‘X’ for a given ‘Y’. It attempts to quantify the dependence of one variable on the
other.

Difference between regression coefficient and correlation coefficient
Correlation coefficient                                 Regression coefficient
• The correlation coefficients, rxy = ryx.              • The regression coefficients, byx ≠ bxy.
• It indirectly helps in estimation.                    • It is meant for estimation.
• It has no units attached to it.                       • It has units attached to it.
• There exists nonsense correlation.                    • There is no such nonsense correlation.
• It is not based on a cause and effect relationship.   • It is based on a cause and effect relationship.
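
A small numerical illustration may help here. The sketch below uses made-up data (the x and y values are assumptions, not from the question) to show that the correlation coefficient is symmetric in X and Y, that the two regression coefficients generally differ, and that their product equals the square of r.

```python
from math import sqrt

def coefficients(x, y):
    """Return (r, b_yx, b_xy) for paired observations x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / sqrt(sxx * syy)   # correlation coefficient
    b_yx = sxy / sxx            # regression coefficient of Y on X
    b_xy = sxy / syy            # regression coefficient of X on Y
    return r, b_yx, b_xy

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_xy, b_yx, b_xy = coefficients(x, y)
r_yx, _, _ = coefficients(y, x)   # swap the roles of X and Y

print(round(r_xy, 4), round(r_yx, 4), b_yx, b_xy)
```

Swapping the variables leaves r unchanged but exchanges the two regression coefficients, which is why the regression coefficient carries a direction (and units) while r does not.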

b) Multiple regression analysis is an extension of two-variable regression analysis. In this
analysis, two or more independent variables are used to estimate the values of a dependent
variable, instead of one independent variable.

Objectives of multiple regression analysis are:
• To derive an equation which provides estimates of the dependent variable from values of
the two or more independent variables.
• To obtain the measure of the error involved in using the regression equation as a basis
of estimation.
• To obtain a measure of the proportion of variance in the dependent variable accounted
for or explained by the independent variables.
In the given question N=12, hence the degrees of freedom will be v=n-1, where n is the sample
size. So the degrees of freedom will be 12-1=11.
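
The three objectives of multiple regression listed above can be sketched with ordinary least squares on two independent variables. Everything in this example is an assumption made for illustration: the data are invented, the helper names are hypothetical, and the residual sum of squares stands in for the "measure of error".

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_two_predictors(x1, x2, y):
    """Least-squares coefficients (b0, b1, b2) from the normal equations."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    return solve(xtx, xty)

# Hypothetical observations.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [5, 7, 10, 11, 14, 17]

# Objective 1: the estimating equation y = b0 + b1*x1 + b2*x2.
b0, b1, b2 = fit_two_predictors(x1, x2, y)
predicted = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]

# Objective 2: a measure of error (residual sum of squares).
residuals = [t - p for t, p in zip(y, predicted)]
sse = sum(e ** 2 for e in residuals)

# Objective 3: proportion of variance explained (R squared).
mean_y = sum(y) / len(y)
sst = sum((t - mean_y) ** 2 for t in y)
r_squared = 1 - sse / sst

print(b0, b1, b2, sse, r_squared)
```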

5. a) Discuss what is meant by Quality control and quality improvement.
Ans: a) Quality Control – is defined as the part of quality management focused on fulfilling
quality requirements. Ideally, prevention-based controls should prevent problems from
occurring, but in reality, no system is foolproof and problems do occur. Accordingly, controls
to detect quality problems must be established so that customers receive only products that
meet their requirements. Detection-based controls are
reactive – the problem and cost have already occurred and the company is resorting to damage
control. The intent of detection is to evaluate output from processes and activities by
implementing controls to catch problems when they do occur. For example, final inspection
catches defective product before it gets shipped.

Quality Improvement – is defined as the part of quality management focused on increasing
the ability to fulfill requirements. Continual improvement results from ongoing actions taken
to enhance product characteristics or increase process effectiveness and efficiency. This is one
of the key characteristics that differentiate a quality management system from a quality
assurance system, i.e., being able to improve the effectiveness and efficiency of a process
or activity by setting measurable objectives and using performance data to manage the
achievement of these objectives.
Effectiveness is defined as the extent to which planned activities are realized and planned
results are achieved. In determining the effectiveness of quality assurance and quality
improvement activities, the following questions should be asked:
– To what extent have problems in product or processes been prevented?
– To what extent have planned objectives for quality been met?
Efficiency is defined as the relationship between the result achieved and the resources used.
The measure of efficiency is determined by asking the following:
– Can we get the same output using fewer resources?
– Can we get more output without adding resources?
These questions may be applied to the output of any activity within the quality management
system of an organization.
It should be noted that ISO 9001 requires organizations to achieve QMS effectiveness through
quality assurance and continual improvement activities. QMS efficiency is desirable, but not
currently required by ISO 9001. ISO 9004 provides guidelines that consider both the
effectiveness and efficiency of the QMS.
Quality improvement actions may include:
• Measuring and analyzing situations
• Establishing improvement objectives
• Searching for possible solutions
• Evaluating these solutions
• Implementing the selected solution
• Measuring, verifying, and analyzing results
• Formalizing the changes
b) What are the limitations of quality control charts?
The Pareto chart, a common quality control chart, is based on the research of Vilfredo Pareto. He found that
approximately 80 percent of all wealth of the Italian cities he researched was held by only 20
percent of the families. The Pareto principle has been found to apply in other areas, from
economics to quality control. Pareto charts have several disadvantages, however.

Easy to Make but Difficult to Troubleshoot

• Based on the Pareto principle, any process improvement should focus on the 20 percent
of issues that cause the majority of problems in order to have the greatest impact.
However, one of the disadvantages of Pareto charts is that they provide no insight into
the root causes. For example, a Pareto chart may demonstrate that half of all problems
occur in shipping and receiving, but not why. Failure Mode and Effects Analysis, Statistical Process
Control charts, run charts and cause-and-effect charts are needed to determine the most
basic reasons that the major issues identified by the Pareto chart are occurring.

Multiple Pareto Charts May Be Needed

• Pareto charts can show where the major problems are occurring. However, one chart
may not be enough. To trace the cause of the errors to its source, lower levels of Pareto
charts may be needed. If mistakes are occurring in shipping and receiving, further
analysis and more charts are needed to show whether the biggest contributor is
order-taking or label-printing. Another disadvantage of Pareto charts is that as more are
created with finer detail, it is also possible to lose sight of these causes in comparison to
each other. The top 20 percent of root causes in a Pareto analysis two to three layers
down from the original Pareto chart must also be compared to each other so that the
targeted fix will have the greatest impact.

Qualitative Data versus Quantitative Data

• Pareto charts can only show qualitative data that can be observed. They merely show the
frequency of an attribute or measurement. One disadvantage of Pareto charts
is that they cannot be used to calculate the average of the data, its variability or changes
in the measured attribute over time. They cannot be used to calculate the mean, the standard
deviation or other statistics needed to translate data collected from a sample into an
estimate of the state of the real-world population. Without quantitative data and the
statistics calculated from that data, it isn't possible to mathematically test the values.
Quantitative statistics are needed to determine whether or not a process can stay within a
specification limit. While a Pareto chart may show which problem is the greatest, it
cannot be used to calculate how bad the problem is or how far changes would bring a
process back into specification.
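
What a Pareto chart does and does not contain can be seen from the table behind it. The sketch below builds that table for invented defect counts (the categories echo the shipping example above, but the numbers are assumptions) and picks out the causes that account for the first 80 percent of problems.

```python
# Hypothetical defect counts per problem area.
defects = {
    "shipping and receiving": 50,
    "order-taking": 20,
    "label-printing": 15,
    "packaging": 10,
    "billing": 5,
}

total = sum(defects.values())
ranked = sorted(defects.items(), key=lambda item: item[1], reverse=True)

# Pareto table: (cause, count, cumulative percentage), largest cause first.
table = []
running = 0
for cause, count in ranked:
    running += count
    table.append((cause, count, 100 * running / total))

# The "vital few": causes needed to reach 80 percent of all defects.
vital_few = []
for cause, count, cum_pct in table:
    vital_few.append(cause)
    if cum_pct >= 80:
        break

for row in table:
    print(row)
print(vital_few)
```

The table holds only frequencies per category, which is why it can rank problems but says nothing about means, variability or root causes.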

Q6. a) Suggest a more suitable average in each of the following cases:
(i) Average size of ready-made garments.
(ii) Average marks of a student.
Ans: Average size of ready-made garments: Arithmetic mean will be used because it is
continuous and additive in nature.
Average marks of a student: Arithmetic mean will be used because the data are on an interval
scale and the distribution is symmetrical.

b) State the nature of symmetry in the following cases:
When median is greater than mean, the series is said to have negative skewness. The
following characteristics can be seen
• Mode > Median > Mean
• The left tail of the curve is longer than the right tail, when the data are plotted through a
histogram, or a frequency polygon.
• The formula of skewness and its coefficients give negative figures.
When mean is greater than median, the series is said to have positive skewness. The
following characteristics can be seen
• Mean > Median > Mode
• The right tail of the curve is longer than its left tail, when the data are plotted through a
histogram, or a frequency polygon.
• The formula of skewness and its coefficients give positive figures.
The following example would show the above distributions and their respective

Value (X)   Positively Skewed           Negatively Skewed
            f     fX     c.f.           f     fX     c.f.
10          5     50     5              5     50     5
20          15    300    20             7     140    12
30          13    390    33             9     270    21
40          11    440    44             11    440    32
50          9     450    53             13    650    45
60          7     420    60             15    900    60
70          5     350    65             5     350    65
Total       65    2400   -              65    2800   -

Positively skewed: Mean = 2400/65 = 37; Median = (65+1)/2 = 33rd item = 30
Negatively skewed: Mean = 2800/65 = 43; Median = (65+1)/2 = 33rd item = 50
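
The mean, median and mode of the two example distributions can be recomputed to confirm the ordering stated for each type of skewness. A minimal sketch; the helper name is illustrative, and the median rule used here assumes an odd total frequency, as in this example.

```python
values = [10, 20, 30, 40, 50, 60, 70]
positive = [5, 15, 13, 11, 9, 7, 5]   # frequencies, positively skewed series
negative = [5, 7, 9, 11, 13, 15, 5]   # frequencies, negatively skewed series

def summary(values, freqs):
    """Return (mean, median, mode) of a discrete frequency distribution."""
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    target = (n + 1) // 2   # position of the middle item (n is odd here)
    cum = 0
    median = None
    for v, f in zip(values, freqs):
        cum += f
        if cum >= target:
            median = v
            break
    mode = values[freqs.index(max(freqs))]   # value with the highest frequency
    return mean, median, mode

pos_mean, pos_median, pos_mode = summary(values, positive)
neg_mean, neg_median, neg_mode = summary(values, negative)

print(round(pos_mean), pos_median, pos_mode)
print(round(neg_mean), neg_median, neg_mode)
```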