(C) Amity University Online: Module-1: Introduction To Statistics
Objectives
1. To introduce the limitations, applications and functions of statistics
2. To discuss data collection and presentation techniques
Outcomes
1. The learner will be able to utilize the knowledge of statistics in answering statistical
questions
According to A.M. Tuttle:
"... to exhibit important interrelationships among them"
1.1 Introduction
The word Statistics is derived from the Italian word ‘Stato’, which means ‘state’,
and the word ‘Statista’ refers to a person who is involved with the affairs of state. Thus,
statistics originally meant the collection of facts useful for the affairs of the state, such as
taxes, land records, population demography, etc. There is evidence that some of the
principles of statistics were used by the ancient Indian civilization as well. Some of the
techniques find mention in Vedic Mathematics. However, modern
statistical methods spread from Italy to France, Holland and Germany in the 16th century.
Definitions of Statistics
The definitions of statistics are as follows: “Statistics are the classified facts
representing the conditions of the people in the state, specially those facts which can
be stated in number or in table of numbers or in any tabular or classified arrangement.”
– Webster
Functions of Statistics
for planning and decision-making. Predictions based on gut feeling or hunch
can be harmful for the business. For example, to decide the refining capacity of a
petrochemical plant, one must predict the demand for the petrochemical product
mix, the supply of crude oil, the cost of crude, substitute products, etc., for the next 10 to
20 years before committing to an investment.
3. Testing of hypotheses: Hypotheses are statements about population parameters
based on past knowledge or information. They must be checked for validity in the light
of current information. Inductive inference about the population based on sample
estimates involves an element of risk. However, sampling keeps decision-making
costs low. Statistics provides a quantitative base for testing our beliefs about
the population.
4. Relationship between Facts: Statistical methods are used to investigate the cause
and effect relationship between two or more facts. The relationship between demand
and supply, money-supply and price level can be best understood with the help of
statistical methods.
5. Expectation: Statistics provides the basic building block for framing suitable policies.
For example, how much raw material should be imported, how much capacity should
be installed, or how much manpower should be recruited depends upon the expected value of
the outcome of our present decisions.
1.2 Limitations of Statistics
Statistical techniques, because of their flexibility, have become popular and
are used in numerous fields. But statistics is not a cure-all technique and has a few
limitations. It cannot be applied to all kinds of situations and cannot be made to answer
all queries. The major limitations are:
1. Statistics deals only with those problems that can be expressed in quantitative
terms and are amenable to mathematical and numerical analysis. It is
not suitable for qualitative data such as customer loyalty, employee integrity,
emotional bonding, motivation, etc.
2. Statistics deals only with data in aggregate, and no importance is attached to
an individual item.
3. Statistical results are only approximations and are not mathematically exact.
There is always a possibility of random error.
6. The greatest limitation is that statistical data can be used properly only by a
professional. Only a person with thorough knowledge of the methods of statistics
and proper training can draw sound conclusions.
7. If statistical data are not uniform and homogeneous, then the study of the problem
is not possible. Homogeneity of data is essential for a proper study.
8. Statistical methods are not the only method for studying a problem. There are
other methods as well, and a problem can be studied in various ways.
The collection and analysis of data constitute the primary stages of execution of
any statistical investigation. The procedure for collection of data depends upon various
considerations such as the scope, objective, nature of investigation, etc. Availability
of resources such as time, money, manpower, etc., also affects the choice of procedure.
Data may be collected either from a primary or from a secondary source, which are
described below.
The data used in a statistical study is termed either ‘primary’ or ‘secondary’,
depending upon whether it was collected specifically for the study being undertaken or for
some other purpose.
When the data used in a statistical study is collected under the control and
supervision of the investigator, it is referred to as ‘primary data’.
Primary data is collected afresh and for the first time, and is thus original
in character. On the other hand, when the data is not collected for this purpose, but is
derived from other sources, it is referred to as ‘secondary data’. Often,
secondary data is collected by some other organization to satisfy its own needs, but it is
used by someone else for entirely different reasons.
The difference between primary and secondary data is only one of degree. For
example, data that is primary in the hands of one person becomes secondary in the hands
of another. Suppose an investigator wants to study the working conditions of labourers in
an industry. If the investigator or their agent collects the data directly, then it is called
‘primary data’. But if someone else subsequently uses this collected data for some
other purpose, it becomes ‘secondary data’ for that user.
The study of statistics can be categorized into two main branches. These branches
are descriptive statistics and inferential statistics.
Descriptive statistics is used to summarize and graph the data for a chosen group.
Descriptive statistics give information that describes the data in some manner. For
example, suppose a pet shop sells cats, dogs, birds and fish. If 100 pets are sold, and
35 out of the 100 were dogs, then one description of the data on the pets sold would be
that 35% were dogs.
represents the population. This requirement affects our process. At a broad level, we
must do the following:
●● Draw a representative sample from that population.
●● Use analyses that incorporate the sampling error.
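The two steps above can be sketched in Python. This is only an illustration under stated assumptions: the population below is simulated (hypothetical household incomes), simple random sampling stands in for "a representative sample", and the standard error of the mean stands in for "sampling error".

```python
import random
import statistics

# Hypothetical population: monthly incomes (in thousands) of 1,000 households.
random.seed(42)  # fixed seed so the sketch is reproducible
population = [random.gauss(50, 10) for _ in range(1000)]

# Step 1: draw a simple random sample (without replacement) from the population.
sample = random.sample(population, k=100)

# Step 2: report the estimate together with a measure of its sampling error.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f} (standard error {standard_error:.2f})")
```

In practice the population mean is unknown; it is printed here only so that the sample estimate can be compared with it.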
1.5 Methods of Collecting Data
Methods of collecting primary data
Generally, for managerial decision-making, it is necessary to analyze information
regarding a large number of characteristics. Collection of primary data can be
time-consuming and expensive, and hence requires a great deal of deliberation. According to the
nature of the information required, one of the following methods or a combination of them can be
selected.
1. Observation Method: In this method, the investigator collects the data through
personal observations. This method is very useful if data is created in the system
by capturing transactions. Computerized transaction processing can be
modified to generate the necessary data or information. An investigator well versed with
the system or a part of the system is ideally suited for collecting this kind of data. Since
the investigator is solely involved in collecting the data, their training, knowledge and
skills play an important role as far as the quality of the data is concerned. Sometimes,
audio/video aids can also be used to record the observations.
2. Indirect Investigation: In this case, the information collected by oral or written
interrogation forms the primary data. Usually enquiry commissions, boards of
investigation, investigation teams and committees collect data in this manner.
The quality of the data largely depends upon the person interviewed, their motives,
memory, overall cooperation, and the interviewer’s repute with the person being
interviewed.
3. Questionnaire with Personal Interview: This is the most common and popular
method for data collection. In this method, individuals are personally interviewed and
their answers are recorded to collect the data. The questionnaire is structured and followed
in a specific sequence. Occasionally, a part of the questionnaire may be unstructured
to motivate the interviewee to give additional information or information on intimate
matters. The accuracy of the data depends on the ability, sincerity and tactfulness of
the interviewer in conducting the interview in a friendly and professional environment.
4. Mailed Questionnaire: In this method, the structured questionnaire is mailed to
selected people with a request to fill it in and return it. Along with the questions, the
[...] explain the reason for the data collection and, if any, to alleviate the respondent’s
fears. The respondents are assumed to be literate and able to answer the
questions without any confusion. This is a less expensive and faster method to collect
a large volume of data, over a wide geographic area, in a standard form, and at the
convenience of the respondent. Hence this method is most popular and extensively
used. However, this method needs a guard against two drawbacks, viz. the absence
of an interviewer, which results in a large proportion of non-response, and the
possibility of reduced reliability of the replies if the respondent is not sufficiently
motivated. These shortcomings can be overcome by increasing the sample size and
designing the questionnaire comprehensively.
5. Telephonic Interview: This method is less expensive but is limited in scope, as
the respondent must possess a telephone and have it listed. Further, the respondent
must be available and in the frame of mind to provide correct answers. This method
is comparatively less reliable for public surveys. However, for industrial surveys, in
developed regions, and with known customers, this method is best suited. There is
a limit to the number of questions that the interviewee can answer in three to four
minutes. The method is efficient if there are just three to five yes/no type questions
and two to three short questions.
6. Internet Surveys: Of late, Internet surveys have become popular. These are less
expensive, fast and can be interactive. However, their scope is limited to those who have
regular Internet access. With rapid growth in technology and Internet connectivity, they
are likely to become one of the main methods of collecting primary data. With their interactivity
and multimedia facilities, they also combine the advantages of other methods.
Secondary data is data that has been collected or analyzed by some other agency
for another purpose. Sources of secondary data are:
using this data. For international situations this data could be very useful and
authentic.
3. Journals of trade, commerce, economics, science, engineering, medicine, etc.
can provide authentic and much cheaper information, provided we can
identify the source.
6. Diaries, letters and mailers can also provide secondary data. The problem with
unpublished data is that it is difficult to locate and access.
Applications of Statistics
Data is a collection of any number of related observations. We can collect the
Data is used everywhere in day to day life. It is applicable in very large number
medicine, psychology, education. All these fields lean heavily on data and its analysis. The
application of data is so vast and ever-expanding that it is very difficult to define. Its use
has permeated almost every facet of our lives.
Application of Statistics in Business Decision
Statistics is not restricted to information about the State; it also extends
to almost every realm of business. Statistics is about scientific methods to gather,
organize, summarize and analyze data. More important still is to draw valid conclusions
and make effective decisions based on such analysis. To a large degree, company
performance depends on the precision and accuracy of forecasts. Statistics
is an indispensable instrument for manufacturing control and market research.
Statistical tools are extensively used in business for time and motion study, consumer
behaviour study, investment decisions, credit ratings, performance measurement and
compensation, inventory management, accounting, quality control, distribution channel
design, etc.
statistical analysis of various business circumstances has greatly increased. Prior to
this, when businesses were small and without much complexity, a single
person, usually the owner or manager of the firm, used to take all decisions regarding
the business. Example: A manager used to decide from where the necessary raw
materials and other factors of production were to be acquired, how much output would
be produced, where it would be sold, etc. This type of decision-making was usually based
on the experience and expectations of this single individual and as such had no scientific
basis.
Bases of Classification
Some common types of bases of classification are:
1. Geographical classification: In this type, data is classified according to area or region.
For example, State wise industrial production, city wise consumer behaviour, area
wise sales figures, etc.
2. Chronological classification: In this type, data is classified according to the time of its
occurrence; for example, monthly sales, daily demand, yearly production, etc.
3. Qualitative classification: When the data is classified according to some attributes
which are not capable of measurement, it is known as qualitative classification. In
dichotomous classification, an attribute is divided into two classes, one possessing
the attribute and the other not possessing it; for example, smoker, non-smoker, employed,
unemployed, etc. In many-fold classification, the attribute is divided so as to form several
classes like education level, religion, mother tongue, etc.
4. Quantitative classification (classification according to measurable characteristics): It refers
to the classification of data according to some characteristics which can be measured; for example, age,
salary, height, etc. Quantitative data may be further classified into two types, namely
discrete and continuous. In the discrete case, the values taken by the variable are
countable (the set of possible values may still be infinite, for example, the integers). Examples of these
are the number of accidents, the number of defectives, etc. In the case of continuous quantities,
data can take any real value; for example, weight, height, distance, volume, etc.
Diagrams and graphs are extensively used for the following reasons:
(iv) Diagrams and graphs give a bird’s eye view of the entire data; therefore, they convey
meaning very quickly.
a. Bar Diagram
In a bar diagram, only the length of the bar is taken into account, not the width.
In other words, a bar is a thick line whose width is shown merely for visibility; since only
the length of the bar is taken into account, it is called a one-dimensional diagram.
sales, production, population figures etc. for various years may be shown by simple bar
charts
Illustration - 1
The following table gives the birth rate per thousand of different countries over a
certain period of time.
Country      India   Germany   U.K.   New Zealand   Sweden   China
Birth Rate   33      16        20     30            15       40
[Figure: Simple Bar Diagram. x-axis: Countries; y-axis: Birth Rate]
Comparing the sizes of the bars, China’s birth rate is the highest and India’s is next, whereas
Germany and Sweden occupy the lowest positions, with Sweden (15) just below Germany (16).
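As a rough check on the comparison above, the birth-rate table can be drawn as a text bar diagram and ranked in Python. This is a minimal sketch: the helper name `text_bar_chart` is hypothetical, and one `#` is assumed to stand for one unit of birth rate.

```python
# Birth rate per thousand, taken from the table in Illustration 1.
birth_rates = {
    "India": 33, "Germany": 16, "U.K.": 20,
    "New Zealand": 30, "Sweden": 15, "China": 40,
}

def text_bar_chart(data, symbol="#"):
    """Return one line per category; bar length is proportional to the value."""
    width = max(len(label) for label in data)
    return [f"{label:<{width}} | {symbol * value} {value}" for label, value in data.items()]

for line in text_bar_chart(birth_rates):
    print(line)

# Rank the countries from the longest bar to the shortest.
ranked = sorted(birth_rates, key=birth_rates.get, reverse=True)
print("Highest:", ranked[0], "| Lowest:", ranked[-1])
```

Only the length of each bar carries information, which is why the same comparison can be made from a sorted list.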
Illustration 2 - Represent the data by using a simple bar diagram.
Countries:                         A    B    C    D    E    F
Production of Rice (000’s tons):   38   42   29   28   18   11
[Figure: simple bar diagram, "Production of Rice (000's tons)", for countries A to F]
Sub-divided Bar Diagram
In a subdivided bar diagram, each bar representing the magnitude of a given value is
further subdivided into various components. Each component occupies a part of the bar
proportional to its share in the total.
Illustration -
Present the following data in a sub-divided bar diagram.
Year/Faculty   Science   Humanities   Commerce
2014-2015      240       560          220
2015-2016      280       610          280
[Figure: sub-divided bar diagram for 2014-15 and 2015-16. Scale: 1 cm = 200. Index: Sci, Hum, Com]
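To draw a sub-divided bar, one needs the total height of each bar and the height at which each component ends. The following Python sketch computes both for the table above; the variable names are illustrative, and stacking the components in the table's column order is an assumption.

```python
from itertools import accumulate

# Enrolment by faculty, taken from the table above.
enrolment = {
    "2014-2015": {"Science": 240, "Humanities": 560, "Commerce": 220},
    "2015-2016": {"Science": 280, "Humanities": 610, "Commerce": 280},
}

totals = {}        # full height of each bar
segment_tops = {}  # running totals: where each component ends on the bar

for year, parts in enrolment.items():
    totals[year] = sum(parts.values())
    segment_tops[year] = list(accumulate(parts.values()))
    shares = {name: round(100 * value / totals[year], 1) for name, value in parts.items()}
    print(year, "| total:", totals[year], "| segment tops:", segment_tops[year],
          "| shares (%):", shares)
```

The running totals are the heights to mark on the Y-axis; the shares show each component's proportion of its bar.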
Illustration – 2
[Figure: sub-divided bar diagram of the number of students in Arts, Commerce and Science for the years 2008-2009, 2009-2010 and 2010-2011]
In a multiple bar diagram, two or more sets of related data are represented and the
components are shown as separate adjoining bars. The height of each bar represents
the actual value of the component. The components are shown by different shades or
colours.
Illustration 1 - Construct a suitable bar diagram for the following data on the number of
students in two different colleges in different faculties.
College   Number of students in each of three faculties   Total
A         1200      800      600                          2600
B          700      500      600                          1800
[Figure: multiple bar diagram. y-axis: No. of students; legend: College 'A', College 'B']
Fig: A multiple bar diagram showing numbers of students in two different colleges
in different departments.
Illustration 2
University held in May 2006, 2007 and 2008 in a multiple bar diagram
In a percentage bar diagram, the length of the entire bar is kept equal to 100 (hundred).
The segments of each bar may vary and represent percentages of the aggregate.
Illustration 1
Year   Men   Women   Children
1995   45%   35%     20%
1996   44%   34%     22%
1997   48%   36%     16%
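A percentage bar can be sketched in text form with one character per percentage point, so that every bar has length 100. In the Python sketch below, the function name `percentage_bar` and the use of each category's initial letter as its shading are illustrative assumptions.

```python
# Percentage composition by year, taken from the table above.
composition = {
    1995: {"Men": 45, "Women": 35, "Children": 20},
    1996: {"Men": 44, "Women": 34, "Children": 22},
    1997: {"Men": 48, "Women": 36, "Children": 16},
}

def percentage_bar(parts):
    """Build a 100-character bar; each segment uses the category's initial letter."""
    if sum(parts.values()) != 100:
        raise ValueError("percentages must add up to 100")
    return "".join(name[0] * pct for name, pct in parts.items())

bars = {year: percentage_bar(parts) for year, parts in composition.items()}
for year, bar in bars.items():
    print(year, bar)
```

Because every bar has the same length, only the relative sizes of the segments differ from year to year.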
[Figure: bar diagram with legend Ist Class, IInd Class for the years 2006, 2007 and 2008]
1.8 Line Graph
A line graph is a type of chart used to show information changing over time. It is plotted
as a series of points connected by straight lines. It is also known as a
line chart. The line graph has two axes, the x-axis and the y-axis.
Plotting a line graph is easy. There are simple steps to follow while plotting a line
graph.
●● Draw the x-axis and y-axis on the graph paper. Make sure to write the title
[...] per their respective factors. For example, the x-axis can be labeled as time or
day.
●● Afterward, with the help of the given data, the exact values can be marked on the
graph. Once the points are joined, a clear inference about the
trend can be made.
A pie chart (or circle chart) is a circular statistical graphic that is divided into
slices to illustrate a numerical proportion. In a pie chart, the arc length of each slice
is proportional to the quantity it represents. While it is named for its resemblance to a
pie which has been sliced, there are variations on the way it can be presented. In a pie
chart, categories of data are represented by wedges in the circle and are proportional in
size to the percent of individuals in each category.
Pie charts are very widely used in the business world and the mass media. Pie
charts are generally used to show percentage or proportional data, and usually the
percentage represented by each category is provided next to the corresponding slice of
pie. Pie charts are good for displaying data for around six categories or fewer.
Example:
Show the following data of expenditure of an average working class family by a
suitable diagram
Item                Expenditure (%)
Food                65
Clothing            10
Housing             12
Fuel and Lighting   5
Miscellaneous       8
Solution:
1. Food = 65/100 x 360 = 234°
2. Clothing = 10/100 x 360 = 36°
3. Housing = 12/100 x 360 = 43.2°
4. Fuel and Lighting = 5/100 x 360 = 18°
5. Miscellaneous = 8/100 x 360 = 28.8°
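The angle of each slice follows the rule used in the solution: value / total x 360. A short Python sketch of that arithmetic for all five items is given below; the variable names are illustrative.

```python
# Expenditure of an average working-class family, from the example above.
expenditure = {
    "Food": 65,
    "Clothing": 10,
    "Housing": 12,
    "Fuel and Lighting": 5,
    "Miscellaneous": 8,
}

total = sum(expenditure.values())

# Angle of each slice in degrees: share of the total multiplied by 360.
angles = {item: value * 360 / total for item, value in expenditure.items()}

for item, angle in angles.items():
    print(f"{item:<18} {angle:6.1f} degrees")
print(f"{'Sum':<18} {sum(angles.values()):6.1f} degrees")
```

The angles should always add up to 360 degrees, which is a quick check on the calculation.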
A classification of data that shows the different values of a variable and their respective
frequencies of occurrence is called a frequency distribution of the values.
distribution (or simple, or ungrouped frequency distribution), and continuous frequency
distribution (or condensed or grouped frequency distribution).
a. Discrete Frequency Distribution
The process of preparing a discrete frequency distribution is simple. First, all the
possible values of the variable are arranged in ascending order in a column. Then another
column of ‘Tally’ marks is prepared to count the number of times a particular value of the
variable is repeated. To facilitate counting, blocks of five ‘Tally’ marks are prepared. The
last column contains the frequency. To illustrate this, let us consider one example.
Example:
Construct a frequency distribution table for the following data on the number of family
members in 30 families:
4 3 2 3 4 5 5 7 3 2
3 4 2 1 1 6 3 4 5 4
2 7 3 4 5 6 2 1 5 3
Number of Family Members   ‘Tally Marks’   Frequency
1                          |||             3
2                          ||||            5
3                          |||| ||         7
4                          |||| |          6
5                          ||||            5
6                          ||              2
7                          ||              2
Total                                      N = 30
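The tallying procedure can be reproduced in Python with `collections.Counter`. This is a sketch only: the helper `tally` is hypothetical, and it writes a completed block of five as `||||/` (four strokes plus a crossing stroke), which is a notation chosen here for plain text.

```python
from collections import Counter

# Number of family members in 30 families, from the example above.
family_sizes = [
    4, 3, 2, 3, 4, 5, 5, 7, 3, 2,
    3, 4, 2, 1, 1, 6, 3, 4, 5, 4,
    2, 7, 3, 4, 5, 6, 2, 1, 5, 3,
]

frequency = Counter(family_sizes)

def tally(count):
    """Render a count as tally marks grouped in blocks of five."""
    blocks, rest = divmod(count, 5)
    groups = ["||||/"] * blocks + (["|" * rest] if rest else [])
    return " ".join(groups)

print(f"{'Members':<8} {'Tally':<12} Frequency")
for value in sorted(frequency):
    print(f"{value:<8} {tally(frequency[value]):<12} {frequency[value]}")
print("Total N =", sum(frequency.values()))
```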
1. Class limits: Class limits denote the lowest and highest values which can be included
in the class. The two boundaries of a class are known as the lower limit and upper limit
of the class. For example, 10-18, 20-28, where 10 and 18 are the limits of the first class,
20 and 28 are the limits of the second class, etc.
2. Class intervals: The class interval represents the width, the span or the size of a
class. The width may be determined by subtracting the lower limit of one class from
the lower limit of the following class. For example, classes 10-15, 15-20, etc. have
a class interval of 5.
3. Class frequency: The number of observations falling within a class
is known as its class frequency. Total frequency indicates the total number of
observations, N = Σf.
4. Mid-point of a class (class mark): It is defined as the sum of two
successive lower limits divided by two. Thus the class mark is the value lying halfway
between the lower and upper class limits. For example, classes 10-20, 20-30, etc. have
class marks 15, 25, etc.
5. Types of class intervals: There are many different ways in which the limits of class
intervals can be shown.
6. Exclusive method: In this method, the class intervals are so arranged that the upper
limit of one class is the lower limit of the next class. This method always presumes that
the upper limit is excluded from the class; for example, with class limits 20-25, 25-30,
an observation with value 25 is included in class 25-30.
7. Inclusive method: In this method, the upper limit of the class is included in the
same class itself. In such a case there is no overlap between the upper limit of one class and
the lower limit of the next class. For example, with class limits 20-29.5, 30-39.5,
40-49.5, etc. there is no ambiguity, but values from 29.5 to 30 or 39.5 to 40 etc. are not
allowed.
8. Open end: In an open-end distribution, the lower limit of the very first class or the upper
limit of the last class is not given. For example, while stating the distribution of
monthly salary of managers in rupees, one may specify class limits as below 10000,
10000-15000, 15000-20000, 20000-25000, above 25000. Similarly, while recording
weights of college students in kg as grouped data, the class intervals could be less
than 40, 40 to 50, 50 to 60, 60 to 70, 70 to 80, greater than 80.
9. Unequal class interval: This method is used where
the width of the classes is not equal for all classes. It is of practical use
when there are large gaps in the data, or the distribution of the data is uneven. It is used
for explaining, visualizing and plotting data with unequal class intervals. However, we
must adjust the formulae for calculations accordingly.
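The definitions of class interval, class mark and the exclusive method can be expressed as small Python helpers. The function names below are hypothetical and the sketch assumes classes are given as (lower limit, upper limit) pairs.

```python
def class_width(lower_limit, next_lower_limit):
    """Class interval: lower limit of the following class minus this lower limit."""
    return next_lower_limit - lower_limit

def class_mark(lower_limit, upper_limit):
    """Mid-point of a class: halfway between its lower and upper limits."""
    return (lower_limit + upper_limit) / 2

def find_class(value, classes):
    """Exclusive method: the upper limit is excluded from the class."""
    for lower, upper in classes:
        if lower <= value < upper:
            return (lower, upper)
    return None  # value lies outside every class

print(class_width(10, 15))                     # classes 10-15, 15-20
print(class_mark(10, 20), class_mark(20, 30))  # classes 10-20, 20-30
print(find_class(25, [(20, 25), (25, 30)]))    # 25 belongs to class 25-30
```

With the exclusive method, a value equal to an upper limit always falls into the next class, which is what the `lower <= value < upper` test encodes.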
In many situations, rather than listing the actual frequency opposite each class, it
may be appropriate to list either cumulative frequencies or relative frequencies or both.
Cumulative Frequencies
The cumulative frequency of a given class interval represents the total of all
the previous class frequencies including the class against which it is written.
Relative Frequencies
Relative frequency is obtained by dividing the frequency of each class by the total
number of observations, i.e. the total frequency.
These are:
●● Relative frequencies facilitate the comparison of two or more sets of data.
●● Relative frequencies constitute the basis of understanding the probability
concept.
Example: Age of 50 employees is given. Find cumulative frequency, relative
frequency and percentage frequency.
22 21 37 33 28 42 56 33 32 59
40 47 29 65 45 48 55 43 42 40
37 39 56 54 38 49 60 37 28 27
32 33 47 36 35 42 43 55 53 48
29 30 32 37 43 54 55 47 38 62
Class      Class         Cumulative       Relative       Percentage
Interval   Frequency     Frequency        Frequency      Frequency
20-30      7             (0+7) = 7        7/50 = 0.14    14
30-40      16            (7+16) = 23      16/50 = 0.32   32
40-50      15            (23+15) = 38     15/50 = 0.30   30
50-60      9             (38+9) = 47      9/50 = 0.18    18
60-70      3             (47+3) = 50      3/50 = 0.06    6
           N = Σf = 50                    Total = 1      Total = 100
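The table can be recomputed from the 50 ages with a few lines of Python. The sketch assumes the exclusive method (an age of 40 is counted in class 40-50); the variable names are illustrative.

```python
from itertools import accumulate

# Ages of 50 employees, from the example above.
ages = [
    22, 21, 37, 33, 28, 42, 56, 33, 32, 59,
    40, 47, 29, 65, 45, 48, 55, 43, 42, 40,
    37, 39, 56, 54, 38, 49, 60, 37, 28, 27,
    32, 33, 47, 36, 35, 42, 43, 55, 53, 48,
    29, 30, 32, 37, 43, 54, 55, 47, 38, 62,
]
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]

# Class frequency: count ages with lower <= age < upper (exclusive method).
frequencies = [sum(lower <= age < upper for age in ages) for lower, upper in classes]
n = sum(frequencies)

cumulative = list(accumulate(frequencies))        # running total of frequencies
relative = [f / n for f in frequencies]           # frequency / total frequency
percentage = [100 * f / n for f in frequencies]   # relative frequency as a percent

for (lower, upper), f, cf, rf, pf in zip(classes, frequencies, cumulative, relative, percentage):
    print(f"{lower}-{upper}  f={f:<3} cf={cf:<3} rf={rf:.2f}  pf={pf:.0f}")
```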
measures.
1.11 Histogram
A histogram consists of contiguous boxes and has both a horizontal axis and a
vertical axis. The horizontal axis is labeled with what the data represents (for instance,
distance from your home to school). The vertical axis is labeled either frequency or
relative frequency. The graph will have the same shape with either label. The histogram
(like the stemplot) can give you the shape of the data, the center, and the spread of the
data. (The next section tells you how to calculate the center and the spread.)
The relative frequency is equal to the frequency for an observed value of the data
divided by the total number of data values in the sample. (In the chapter on Sampling
and Data (Section 1.1), we defined frequency as the number of times an answer
occurs.)
RF = f/n
where f is the frequency and n is the total number of data values (or the sum of the
individual frequencies). [...] 100%, then
f = 3, n = 40 and
RF = f/n
= 3/40
= 0.075
Seven and a half percent of the students received 90% to 100%. Ninety percent to
100% are quantitative measures.
Example:
Construct a histogram from the following data:
Class Interval   Frequency
10.5 – 18.5      3
18.5 – 26.5      5
26.5 – 34.5      5
34.5 – 42.5      2
42.5 – 50.5      4
50.5 – 58.5      2
Solution:
[Figure: Histogram]
Also, when several distributions are to be compared on the same graph paper,
frequency polygons are better than Histograms.
Illustration 1
Draw a histogram and frequency polygon from the following data.
Age in Years   Number of Persons
10-20          3
20-30          16
30-40          22
40-50          35
50-60          24
60-70          15
70-80          2
[Figure: Histogram and frequency polygon. x-axis: age; y-axis: No. of persons; scale along y-axis: 1 cm = 5 units]
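A frequency polygon joins the points (class mid-point, frequency). The Python sketch below computes those points for the age data above. Closing the polygon on the x-axis one class width beyond each end is a common drawing convention and is an assumption here, not something stated in the illustration.

```python
# Age distribution from Illustration 1.
age_classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
persons = [3, 16, 22, 35, 24, 15, 2]

# Histogram: one bar per class, height equal to the class frequency.
for (lower, upper), f in zip(age_classes, persons):
    print(f"{lower}-{upper} | {'#' * f} {f}")

# Frequency polygon: frequencies plotted against the class mid-points.
midpoints = [(lower + upper) / 2 for lower, upper in age_classes]
polygon_points = list(zip(midpoints, persons))

# Assumed convention: close the polygon at zero frequency on both sides.
width = age_classes[0][1] - age_classes[0][0]
closed_polygon = (
    [(midpoints[0] - width, 0)] + polygon_points + [(midpoints[-1] + width, 0)]
)
print(closed_polygon)
```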
Ogives
When frequencies are added, they are called the cumulative frequencies. The
class, to get the cumulative frequencies. (ii) Plot classes on the horizontal (x-axis) and
cumulative frequencies on the vertical (y-axis).
Less than Ogive: To plot a less than ogive, the data is arranged in ascending order of
magnitude and the frequencies are cumulated from the top, i.e. by adding. Cumulative frequencies
are plotted against the upper class limits. An ogive drawn by this method is a rising curve.
Greater than Ogive: To plot a greater than ogive, the data is arranged in
ascending order of magnitude and the frequencies are cumulated from the bottom, or
subtracted from the total starting from the top. Cumulative frequencies are plotted against the
lower class limits. An ogive drawn by this method is a falling curve.
two distributions.
Illustration 1 –
Draw less than and more than ogive curves for the following frequency distribution
and obtain median graphically. Verify the result.
CI   0-20   20-40   40-60   60-80   80-100   100-120   120-140   140-160
f    5      12      18      25      15       12        8         5

Upper limit   Less than c.f.   More than c.f.   Lower limit
20            5                100              0
40            17               95               20
60            35               83               40
80            60               65               60
100           75               40               80
120           87               25               100
140           95               13               120
160           100              5                140
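Both cumulative columns, and the median that the two ogives locate graphically, can be checked numerically. The Python sketch below uses linear interpolation within the median class, which is the standard grouped-data formula and is an assumption about the intended "verify" step; the variable names are illustrative.

```python
from itertools import accumulate

# Frequency distribution from Illustration 1.
limits = [0, 20, 40, 60, 80, 100, 120, 140, 160]   # class boundaries
freq = [5, 12, 18, 25, 15, 12, 8, 5]

# Less than ogive: cumulate from the top, plot against upper limits.
less_than = list(accumulate(freq))
total = less_than[-1]

# More than ogive: subtract from the total, plot against lower limits.
more_than = [total - c for c in [0] + less_than[:-1]]

# Median by interpolation: find the class where the cumulative frequency reaches N/2.
half = total / 2
for i, cf in enumerate(less_than):
    if cf >= half:
        lower = limits[i]
        width = limits[i + 1] - limits[i]
        cf_before = less_than[i - 1] if i > 0 else 0
        median = lower + (half - cf_before) * width / freq[i]
        break

print("less than:", less_than)
print("more than:", more_than)
print("median:", median)
```

The two ogives intersect where both cumulative frequencies equal N/2, so the x-coordinate of the intersection should agree with the interpolated median.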
[Figure: less than and more than ogives plotted on the same axes (X: class limits, Y: cumulative frequency)]
Key takeaways
●● Sample: A sample consists of one or more observations drawn from the population.
The sample is the group of people who actually took part in your research.
●● Population: A population includes all of the elements from a set of data.
The population is the broader group of people that you expect to generalize your study
results to.
●● Frequency Polygon: The frequencies are plotted against the mid-points of
the class intervals and the points thus obtained are joined by line segments.
●● Bar Diagram: Only the length of the bar is taken into account, not the width. In
other words, a bar is a thick line whose width is shown merely for visibility; since only the
length of the bar is taken into account, it is called a one-dimensional diagram.
●● Simple Bar Diagram: It represents only one variable. Since the bars are of
the same width and vary only in length (height), it becomes very easy to make a
comparative study. Simple bar diagrams are very popular in practice.
●● Percentage bar diagram: The length of the entire bar is kept equal to 100
(hundred). The segments of each bar may vary and represent percentages of
the aggregate.
●● Range: The ‘Range’ of the data is the difference between the largest value of the data
and the smallest value of the data.
●● Multiple bar diagram: Two or more sets of related data are represented,
and the components are shown as separate adjoining bars. The height of each
bar represents the actual value of the component. The components are shown by
different shades or colours.
●● Deviation bars: They are used to represent net quantities (excess or deficit),
i.e. net profit, net loss, net exports or imports, etc. Such bars have both positive
and negative values. Positive values lie above the base line and negative values
lie below it.
●● Cumulative Frequency: The cumulative frequency of a given class interval
represents the total of all the previous class frequencies including the class
against which it is written.
●● Pie Chart: A pie chart or a circle chart is a circular statistical graphic that is divided
into slices to illustrate a numerical proportion. In a pie chart, the arc length of each
slice is proportional to the quantity it represents.
b) Descriptive Statistics
c) Inferential statistics
d) Probability
2. In which technique does the information collected by oral or written interrogation form
the primary data?
a) Indirect investigation
b) Descriptive Statistics
c) Inferential statistics
d) Observation method
3. A type of chart used to show information changing over time
a) Bar chart
b) Pie chart
c) Line graph
d) Multiple bar diagram
4. _________________ represents the total of all the previous class frequencies
including the class against which it is written.
a) Cumulative frequency
b) Pie chart
c) Relative frequency
a) Cumulative frequency
b) Pie chart
c) Histogram
Objectives
1. To introduce the limitations, applications and functions of statistics
2. To know about the measures of central tendency
Outcomes
1. The learner will be able to use the measures of central tendency in business situations
Croxton and Cowden define statistics as follows:
“It is the science of collection, presentation, analysis, and interpretation of
numerical data from logical analysis”
2.1.1 Introduction
A measure of central tendency is a single value that can be considered
representative of a set of observations. The value around which the observations
can be considered as centered is known as an average, an average value or a
location center. Since such representative values tend to lie centrally within a set of
observations when arranged according to magnitude, these averages are called
measures of central tendency.
2.1.2 Central Tendency Measures
Central tendency has three main measures: mode, median and mean. Each
of these measures gives a specific indication of the distribution’s typical or
central value.
●● Mean - The mean is the average of the numbers. It is easy to calculate: add up all
the numbers, then divide by how many numbers there are. In other words, it is the
sum divided by the count.
●● Mode - The mode is the number most frequently seen in a dataset. A collection of
numbers may have one mode, more than one mode, or no mode at all. Other popular central
tendency measures include a set’s mean, or average, and a set’s median, or
middle value.
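The three measures described above can be sketched with Python's standard `statistics` module. The sample values are hypothetical and chosen only for illustration.

```python
from statistics import mean, median, multimode

# Hypothetical sample, chosen only for illustration
values = [4, 7, 7, 9, 12, 15, 7, 9]

avg = mean(values)          # sum divided by the count
mid = median(values)        # middle value of the sorted data
modes = multimode(values)   # list of the most frequent values (may hold several)

print(avg, mid, modes)
```

`multimode` returns a list, which matches the point that a data set may have one mode or more than one.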
2.1.3 Average
An average is a single figure that sums up the characteristics of a whole group of
figures.
In the words of Clark, “average is an attempt to find one single figure to describe
whole of figures.” An average is described as a measure of central tendency as it is
more or less a central value around which various values cluster.
In the words of Croxton and Cowden, “an average is a single value within the
range of the data that is used to represent all of the values in the series. Since an average
is somewhere within the range of the data, it is called a measure of central value.”
Objectives served by Averages
Averages serve the following purposes:
1. To obtain a clear and concise picture of a large body of numerical data.
2. To compare different groups by means of averages.
3. To obtain a clear picture of a whole group by studying sample data.
4. To provide definite rates to the relationship between different groups.
2. It is easy to understand and calculate, hence it is very popular.
3. It is based on all the observations, so it can be a good representative.
4. It can be easily used for comparisons.
5. It is capable of further algebraic treatment, such as finding the sum of the
observations from the mean and the number of observations, and finding the
combined arithmetic mean when different groups are given.
Symbolically: $\bar{x} = (x_1 + x_2 + x_3 + \dots + x_n)/n$
1. The sum of the deviations, of all the values of x, from their arithmetic mean, is zero.
2. The product of the arithmetic mean and the number of items gives the total of all
items.
3. Finding the combined arithmetic mean when different groups are given.
Demerits of Arithmetic Mean
1. Arithmetic mean is affected by the extreme values.
2. Arithmetic mean cannot be determined by inspection and cannot be located
graphically.
3. Arithmetic mean cannot be obtained if a single observation is lost or missing.
4. Arithmetic mean cannot be calculated when open-end class intervals are present in
the data.
A) Individual Series
1. Direct Method
The following steps are involved in calculating arithmetic mean under an individual
series using direct method:
- Add up all the values of all the items in the series.
- Divide the sum of the values by the number of items. The result is the arithmetic
mean.
The following formula is used: $\bar{X} = \sum x / N$
Illustration 1 – Value (x): 125 128 132 135 140 148 155 157 159 161
Solution –
$\sum x = 125 + 128 + 132 + 135 + 140 + 148 + 155 + 157 + 159 + 161 = 1440$
$\bar{X} = \sum x / N = 1440/10$
= 144
2. Find out the deviation of each value from the assumed average.
3. Add up the deviations.
4. Apply the following formula: $\bar{X} = A + \sum d / N$
where $\bar{X}$ = arithmetic mean, A = assumed average, $\sum d$ = sum of the deviations,
N = number of items
Illustration - 1
Calculate the arithmetic average of the data given below using the short-cut method.
Roll No   1   2   3   4   5   6   7   8   9   10
Marks     43  48  65  57  31  60  37  48  78  59
Solution –
Taking the assumed average A = 60:
Roll No   Marks (x)   d = x – 60
1         43          -17
2         48          -12
3         65          5
4         57          -3
5         31          -29
6         60          0
7         37          -23
8         48          -12
9         78          18
10        59          -1
∑d = – 74
$\bar{X} = A + \sum d / N = 60 + (-74)/10 = 60 - 7.4 = 52.6$
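The short-cut method can be sketched in a few lines of Python. The function name `shortcut_mean` is an illustrative choice, not something the text defines; the marks and the assumed average of 60 come from the illustration above.

```python
def shortcut_mean(values, assumed):
    """Arithmetic mean via deviations from an assumed average A."""
    deviations = [x - assumed for x in values]       # d = x - A
    return assumed + sum(deviations) / len(values)   # A + (sum of d) / N

marks = [43, 48, 65, 57, 31, 60, 37, 48, 78, 59]
result = shortcut_mean(marks, assumed=60)
print(round(result, 2))
```

Any assumed average gives the same mean; a value close to the centre of the data only keeps the deviations small.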
When the arithmetic means and the numbers of items of two or more related groups are known,
the combined mean of the entire group can be calculated. The combined average of two series can be
calculated by the given formula –
$\bar{x}_{12} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$
Where, n1 = No. of items of the first group, n2 = No. of items of the second group,
$\bar{x}_1$ = mean of the first group, $\bar{x}_2$ = mean of the second group
Example - From the following data ascertain the combined mean of a factory
Solution:
Let the no. of workers in branch A be n1 = 500
Average salary $\bar{x}_1$ = 300
$\bar{x}_{12} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$
= (1,50,000 + 2,50,000)/1500
= 266.67
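The combined mean formula can be sketched as below. The helper `combined_mean` is a hypothetical name. Branch A (500 workers at 300) is from the text; the second pair (1000 workers at 250) is an assumption chosen only so that the totals match the 2,50,000 and 1500 shown in the solution.

```python
def combined_mean(groups):
    """groups: iterable of (n, mean) pairs; returns the mean of the pooled data."""
    groups = list(groups)
    total_items = sum(n for n, _ in groups)
    total_value = sum(n * m for n, m in groups)
    return total_value / total_items

# Branch A as in the text; branch B is an assumed pair consistent with the totals
branches = [(500, 300), (1000, 250)]
print(round(combined_mean(branches), 2))
```

The same function works for more than two groups, since it only sums n times mean over all groups.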
2.1.6 Weighted Arithmetic Mean
Sometimes, some observations are relatively more important than other
observations. Each such observation must be given a weight on the basis of its
relative importance. In the weighted arithmetic mean, the value of
each item is multiplied by its weight, and the sum of the products is divided by the sum of the
weights.
Symbolically: $\bar{x}_w = \sum wx / \sum w$
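A minimal sketch of the weighted mean follows. The prices and tonnages are hypothetical illustration values, not the data of the example in the text, and `weighted_mean` is an assumed helper name.

```python
def weighted_mean(values, weights):
    """Sum of w*x divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical prices per tonne and tonnes purchased
prices = [40, 50, 60]
tonnes = [10, 30, 60]

simple = sum(prices) / len(prices)
weighted = weighted_mean(prices, tonnes)
print(simple, weighted)
```

Because most of the purchases were made at the higher prices, the weighted mean lies above the simple mean here.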
Example – Calculate simple and weighted average from the following data –
Month Jan Feb March April May June
Solution:
Month | Price per tonne (in ’000) (x) | No. of tonnes purchased (w) | wx
Simple AM
$\bar{X} = \sum x / n = 294/6 = 49$
Weighted AM
The correct average price paid is ₹50.30 and not ₹49, i.e., the weighted arithmetic mean is
more accurate than the simple arithmetic mean.
2.1.7 Median
Median is defined as the value of the item dividing the series into two equal
halves, where one half contains all values less than (or equal to) it and the other half
contains all values greater than (or equal to) it. It is also defined as the “central value
of the variable”. The values of the items must be arranged in order of their size or
magnitude to find the median.
Median is a positional average. The term position refers to the place of a value in
the series; the place of the median is such that an equal number of items
lie on either side of it. It is therefore also called a locative average.
Merits of Median
Following are the advantages of median:
1. It is rigidly defined.
2. It is easy to calculate and understand.
3. It can be located graphically.
4. Unlike the arithmetic mean, it is not affected by extreme values.
5. It can be found by mere inspection.
6. It can be used for qualitative studies.
7. Even if the extreme values are unknown, the median can be calculated if one knows the
number of items.
Demerits of Median
1. In the case of individual observations, the values have to be arranged in order of their
size to locate the median. Such an arrangement of data is a tedious task if the number of
items is large.
2. If the median is multiplied by the number of items, the total value of all the items
cannot be obtained as in the case of the arithmetic average.
3. It is not suitable for complex algebraic or mathematical treatment.
S. No   Value or Size
1       15
2       20
3       23
4       23
5       25
6       25
7       25
8       27
9       40
Median = value of the (N + 1)/2 th item = (9 + 1)/2 = 10/2
= 5th term
= 25
Example:
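The following steps are involved in calculating median in continuous series:
1) Find out the cumulative frequency.
2) Find out the median item, i.e., the N/2 th item.
3) Find out the group or class containing the median.
4) Estimate the median by applying the following formula:
$Me = l + \frac{N/2 - cf}{f_m} \times i$
where Me = median, l = lower limit of the median class, cf = cumulative frequency of the
class preceding the median class, $f_m$ = frequency of the median class and i = class width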
Example 1:
Calculate the median mark from the following frequency distribution.
Marks   Cumulative frequency
0-10    5
0-20    13
0-30    20
0-40    32
0-50    60
0-60    80
0-70    90
Solution:
Marks   F    CF
0-10    5    5
10-20   8    13
20-30   7    20
30-40   12   32
40-50   28   60
50-60   20   80
60-70   10   90
Median item = N/2 = 90/2 = 45th item, which lies in the class 40-50.
$Me = l + \frac{N/2 - cf}{f_m} \times i = 40 + \frac{45 - 32}{28} \times 10$
$= 40 + \frac{13}{28} \times 10$
= 40 + 4.64 = 44.64
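The grouped median calculation can be sketched as follows. The function `grouped_median` and the (lower, upper, frequency) layout are illustrative assumptions; the class frequencies are obtained by differencing the cumulative frequencies of the example.

```python
def grouped_median(classes):
    """classes: (lower, upper, frequency) triples, contiguous and in ascending order."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("empty distribution")
    target = n / 2                 # position of the median item
    cum = 0                        # cumulative frequency before the current class
    for lower, upper, f in classes:
        if cum + f >= target:
            # l + (N/2 - cf) / f * i
            return lower + (target - cum) / f * (upper - lower)
        cum += f

marks = [(0, 10, 5), (10, 20, 8), (20, 30, 7), (30, 40, 12),
         (40, 50, 28), (50, 60, 20), (60, 70, 10)]
median_mark = grouped_median(marks)
print(round(median_mark, 2))
```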
Find the median from the following series. Also draw less than ogive, more than
ogive and locate median on a graph.
C.I.     F
0-20     82
20-40    112
40-60    150
60-80    95
80-100   48
Solution:
C.I.     F     Class (less than)   L.C.F.   Class (more than)   M.C.F.
0-20     82    20                  82       0                   487
20-40    112   40                  194      20                  405
40-60    150   60                  344      40                  293
60-80    95    80                  439      60                  143
80-100   48    100                 487      80                  48
[Figure: less than ogive and more than ogive; x-axis: income (20 to 100), y-axis: no. of persons (0 to 600); the median is located at the point where the two ogives intersect.]
2.1.9 Mode
The word “mode” is derived from the French word “la mode”, which means fashion.
So it can be regarded as the most fashionable item in the series or the group.
Croxton and Cowden regard mode as “the most typical of a series of values”.
As a result it can sum up the characteristics of a group more satisfactorily than the
arithmetic mean or median. Mode is defined as the value of the variable occurring most
frequently in a distribution. In other words, it is the most frequent size of item in a series.
Merits of Mode
The following are the merits of mode:
1. The most important advantage of mode is that it is usually an actual value.
2. In the case of discrete series, mode can be easily located by inspection.
Demerits of Mode
The following are the demerits of mode:
4. It will not truly represent the group if there are a small number of items of the same
size in a large group of items of different sizes.
5. It is not suitable for further mathematical treatment.
The mode of this series can be obtained by mere inspection. The number which
occurs most often is the mode.
Illustration - 1
Locate mode in the data 7, 12, 8, 5, 9, 6, 10, 9, 4, 9, 9
Solution:
On inspection, it is observed that the number 9 has the maximum frequency, i.e., it occurs
4 times, more often than any other number. Therefore mode (Z) = 9
b) Discrete Series
The mode is calculated by applying a grouping table and an analysis table.
i) Grouping table: consists of six columns. The 1st
column holds the frequencies, the 2nd and 3rd columns hold the frequencies grouped in twos,
and the 4th, 5th and 6th columns hold the frequencies grouped in threes.
ii) Analysis table: consists of two columns, namely tally bar and frequency.
1. Group the frequencies by twos.
2. Leave the first frequency and group the other frequencies in twos.
3. Group the frequencies in threes.
4. Leave the frequency of the first size and add the frequencies of other sizes in
threes.
5. Leave the frequencies of the first two sizes and add the frequencies of the other
sizes in threes.
6. Prepare an analysis table to know the size occurring the maximum number
of times. Find out the size which occurs the largest number of times. That
size is the mode.
c) Continuous Series
1. Find out the modal class. Modal class can be easily found out by inspection.
The group containing maximum frequency is the modal group. Where two or
more classes appear to be a modal class group, it can be decided by grouping.
$Mo = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times i$
where l = lower limit of the modal class, $f_m$ = frequency of the modal class, $f_1$ = frequency
of the preceding class, $f_2$ = frequency of the succeeding class and i = class width
Example:
Daily wages in ₹ (x) : 20-25 25-30 30-35 35-40 40-45 45-50
No. of workers (f) : 1 2 8 12 7 5
Solution:
Here, the maximum frequency is 12, corresponding to the class interval (35-40),
which is the modal class. Therefore, l = 35, fm = 12, f1 = 8, f2 = 7 and i = 5.
X       F
20-25   1
25-30   2
30-35   8    f1
35-40   12   fm
40-45   7    f2
45-50   5
$Mode = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times i = 35 + \frac{12 - 8}{2(12) - 8 - 7} \times 5$
$= 35 + \frac{4}{24 - 15} \times 5 = 35 + \frac{20}{9} = 35 + 2.22 = 37.22$
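The modal-class formula can be sketched as below, assuming equal class widths and a single clearly highest class (the grouping method for ties is not covered). `grouped_mode` is an illustrative name.

```python
def grouped_mode(classes):
    """classes: (lower, upper, frequency) triples with equal widths, ascending."""
    freqs = [f for _, _, f in classes]
    k = freqs.index(max(freqs))                      # first class with the highest frequency
    lower, upper, fm = classes[k]
    f1 = freqs[k - 1] if k > 0 else 0                # frequency of the preceding class
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0   # frequency of the succeeding class
    return lower + (fm - f1) / (2 * fm - f1 - f2) * (upper - lower)

wages = [(20, 25, 1), (25, 30, 2), (30, 35, 8),
         (35, 40, 12), (40, 45, 7), (45, 50, 5)]
modal_wage = grouped_mode(wages)
print(round(modal_wage, 2))
```

The sketch divides by zero when the modal class and both neighbours have the same frequency; that case needs the grouping and analysis tables described earlier.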
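Example 2:
Less than:   10  20  30  40  50  60   70   80
Frequency:   4   16  40  76  96  112  120  125
Solution:
The lower limit of each continuous class must first be ascertained (LL = UL – CL). Class length (CL)
= 20 – 10 = 10, i.e., the first class runs from 10 – 10 = 0 to 10, and so on. Differencing the
cumulative frequencies gives the class frequencies 4, 12, 24, 36, 20, 16, 8 and 5, so the modal
class is 30-40 with fm = 36, f1 = 24 and f2 = 20.
$Z = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times i = 30 + \frac{36 - 24}{2(36) - 24 - 20} \times 10$
$= 30 + \frac{12}{72 - 44} \times 10 = 30 + \frac{120}{28} = 30 + 4.29$
Z = 34.29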
2.1.11 Empirical Relationship between Mean, Median and Mode
When the mode is ill defined, it is difficult to find its value. For a moderately skewed
distribution, an empirical relationship exists among the mean, median and mode such that the median
lies between the mode and the mean: the mode lies about two-thirds of the distance between
mean and mode away from the median, and the mean about one-third of that distance away, on the
other side. In a positively skewed distribution the mode is to the left of the median and the
mean to the right; in a negatively skewed distribution the order is reversed. Karl Pearson expressed this
relationship as $Z = 3M - 2\bar{X}$, where Z is the mode, M the median and $\bar{X}$ the mean.
Solution :
$Z = 3M - 2\bar{X}$
= 3(28) – 2(29)
= 84 – 58
= 26
Mean > Median > Mode: 29 > 28 > 26
– M = ?, AM = 39, Z = 36.5
Solution:
$Z = 3M - 2\bar{X}$
36.5 = 3M – 2(39)
36.5 = 3M – 78
3M = 36.5 + 78 = 114.5
M = 114.5/3
= 38.17
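The two worked solutions can be checked with a short sketch of the empirical relationship. The function names are illustrative; the second function is just the same relation rearranged for the median.

```python
def mode_from(mean, median):
    """Empirical relation Z = 3M - 2X for moderately skewed data."""
    return 3 * median - 2 * mean

def median_from(mean, mode):
    """Same relation solved for the median: M = (Z + 2X) / 3."""
    return (mode + 2 * mean) / 3

z = mode_from(mean=29, median=28)
m = median_from(mean=39, mode=36.5)
print(z, round(m, 2))
```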
Key Takeaways
●● Measures of central tendency: It is a single value which can be considered as
representative of a set of observations and around which the observations can be
considered as centered.
●● Average: It is described as a measure of central tendency as it is more or less a
central value around which various values cluster. In the words of Croxton and
Cowden, “an average is a single value within the range of the data that is used to
represent all of the values in the series”.
●● Median: It is defined as the value of that item which divides the series into two
equal halves, where one half contains all values less than (or equal to) it and the other half
contains all values greater than (or equal to) it. It is also defined as the “central
value of the variable”.
●● Mode: It is derived from the French word “la mode” meaning fashion. So it can be
regarded as the most fashionable item in the series or the group.
●● Range: The ‘Range’ of the data is the difference between the largest value of data
and smallest value of data.
1. _______________ is a single figure that sums up the characteristics of a whole
group of figures.
a) Average
b) Median
c) Standard deviation
d) Histogram
2. __________________ is defined as the value obtained by dividing the total values
of all items in the series by their number
a) Mean
b) Median
c) Standard deviation
d) Histogram
3. Arithmetic mean and number of items of two or more related groups are known as
combined mean of the entire group.
a) Standard arithmetic mean
b) Grouping table
c) Analysis table
d) Frequency table
b) Median
c) Mode
d) Average
2. What is an average? What are the characteristics and objectives of a good average?
3. Describe the empirical relationship between mean, median and mode.
4. Explain the steps in calculating mode in discrete series.
5. Determine the median from the following – 25, 15, 23, 41, 28, 26, 24, 25, 20
1. a) Average
2. a) Mean
3. b) Combined arithmetic mean
4. d) Frequency table
5. c) Mode
Objectives
1. To understand the concept of dispersion and the difference between absolute and
relative measures
2. To know about all the measures of dispersion
3. To be introduced to the concepts of skewness, moments and kurtosis
4. To understand the measures used to find the coefficient of skewness
Outcomes
1. The learner will be able to use the measures of dispersion in business situations
2. The learner will be able to apply concepts like skewness, kurtosis and moments in
order to answer business queries
Webster defines statistics as follows:
“Statistics are the classified facts representing the conditions of the people
in the state. Specially those facts which can be stated in number or in table
of numbers or in any tabular or classified arrangement.”
3.1.1 Introduction
Different series may possess different dispersions of items around the average.
Measures of central tendency are averages of the first order. Measures of dispersion
are averages of the second order. A measure of dispersion gives an idea about the
extent of lack of uniformity in the sizes and qualities of the items in a series. It helps us
to know the degree of uniformity and consistency in the series. If the difference between
items is large, the dispersion or variation is large, and vice versa.
A measure of dispersion or variation in any data shows the extent to which the
numerical values tend to spread about an average. If the difference between items is
small, the average represents and describes the data adequately. For large differences
it is proper to supplement the information by calculating a measure of dispersion in addition
to an average. It is useful to determine data for the knowledge it may serve:
are expressed. For example, if the series is expressed as marks of the students in a
particular subject, the absolute dispersion will provide the value in marks. The only
difficulty is that if two or more series are expressed in different units, the series cannot
be compared on the basis of dispersion.
Definition: ‘Relative’ dispersion or ‘coefficient’ of dispersion is the ratio or the percentage of a
measure of absolute dispersion to an appropriate average. The basic advantage of this
measure is that two or more series can be compared with each other despite the fact that
they are expressed in different units.
Definition: A precise measure of dispersion is one that gives the magnitude of the
variation in a series, i.e. it measures in numerical terms the extent of the scatter of the
values around the average.
dispersion or variability. It is difficult to compare absolute values of dispersion in
different series, especially when the series are in different units or have different sets of
values. A good measure of dispersion should have properties similar to those described
for a good measure of central tendency.
Measures of Dispersion      Relative Variability
The Range                   Relative Range
The Quartile Deviation      Relative Quartile Deviation
The Mean Deviation          Relative Mean Deviation
The Standard Deviation      Coefficient of Variation
There are two main types of dispersion measures in statistics: the absolute
measure of dispersion and the relative measure of dispersion.
●● A measure of spread gives us an idea of how well the mean, for example,
represents the data.
●● It is usually used in conjunction with a measure of central tendency, such as
the mean or median, to provide an overall description of a set of data.
●● The range is the difference between the highest and lowest scores in a data
set.
●● Quartiles tell about the spread of a data set by breaking the data set into
quarters, just like the median breaks it in half.
3.1.4 Range
The ‘Range’ of the data is the difference between the largest value of data and
smallest value of data.
This is an absolute measure of variability. However, if we have to compare two sets
of data, ‘Range’ may not give a true picture. In such a case, a relative measure of range,
called the coefficient of range, is used.
Formulae: Range = L – S
Coefficient of range = (L – S)/(L + S)
where L is the largest value and S is the smallest value.
In individual observations and discrete series, L and S are easily identified. In
continuous series, the following two methods are used:
Method 1: L - Upper limit of the highest class.
S - Lower limit of the lowest class.
Method 2: L - Mid value of the highest class.
S - Mid value of the lowest class.
Example 1:
Find the range and the coefficient of range of the set of observations 10, 5, 8, 11, 12, 9
Solution: L = 12, S = 5
Range = L – S
= 12 – 5
= 7
Coefficient of range = (L – S)/(L + S)
= (12 – 5)/(12 + 5)
= 7/17
= 0.4118
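The range and its coefficient for individual observations can be sketched as follows; both helper names are illustrative.

```python
def value_range(values):
    """Range = L - S, the largest minus the smallest value."""
    return max(values) - min(values)

def coefficient_of_range(values):
    """(L - S) / (L + S), a unit-free measure for comparing data sets."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)

observations = [10, 5, 8, 11, 12, 9]
r = value_range(observations)
c = coefficient_of_range(observations)
print(r, round(c, 4))
```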
Example 2:
Compute the range and the co-efficient of range from the following distribution.
150-160   12
160-170   5
Solution:
In finding the range the frequencies are never taken into account. Only the upper limit
of the highest class and the lower limit of the lowest class are taken into account.
Range = L - S
= 170 - 120 = 50
Co-efficient of Range = (L - S)/(L + S) = (170 – 120)/(170 + 120)
= 50/290
= 0.1724
Inter-quartile range and deviations are described in the following sub sections.
Inter-quartile Range
Inter-quartile range is the difference between the upper quartile (third quartile) and the lower
quartile (first quartile). Thus, Inter-Quartile Range = (Q3 - Q1)
Quartile deviation
Quartile Deviation (QD) gives the average deviation of the upper and lower
quartiles from the median.
QD = (Q3 - Q1)/2
Coefficient of QD = (Q3 - Q1)/(Q3 + Q1)
Example 1:
Weekly wages of labourers is given below. Calculate Q.D. and coefficient of Q.D.
Solution:
Wages (x)   f    c.f.
200         8    13
400         21   34
500         12   46
600         6    52
N = 52
Q1 = value of the (N + 1)/4 th item
= (52 + 1)/4
= 13.25th item
Q1 = 13th value + 0.25 (14th value – 13th value)
= 200 + 0.25 (400 – 200)
= 200 + 50
= 250
Q3 = value of the 3(N + 1)/4 th item
= 3 x 13.25
= 39.75th item
Q3 = 39th value + 0.75 (40th value – 39th value)
= 500 + 0.75 x 0
= 500
Q.D. = (Q3 - Q1)/2
= (500 – 250)/2
= 250/2
= 125
Coefficient of Q.D. = (Q3 - Q1)/(Q3 + Q1)
= 250/750
= 0.333
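The positional method used in this example can be sketched as below. The helper name is hypothetical, and the first table row (wage 100, frequency 5) is an assumption: only the rows from 200 upward survive in the table, and any first wage of at most 200 with frequency 5 gives the same quartiles.

```python
def positional_value(sorted_values, position):
    """Value at a 1-based, possibly fractional position, interpolating linearly."""
    whole = int(position)
    frac = position - whole
    lower = sorted_values[whole - 1]
    upper = sorted_values[min(whole, len(sorted_values) - 1)]
    return lower + frac * (upper - lower)

# (wage, frequency); the first pair is assumed, the rest are from the table
table = [(100, 5), (200, 8), (400, 21), (500, 12), (600, 6)]
wages = [w for w, f in table for _ in range(f)]   # expand to individual items

n = len(wages)
q1 = positional_value(wages, (n + 1) / 4)
q3 = positional_value(wages, 3 * (n + 1) / 4)
qd = (q3 - q1) / 2
coeff = (q3 - q1) / (q3 + q1)
print(q1, q3, qd, round(coeff, 3))
```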
Example 2:
Determine the interquartile range and percentile range of the following distribution:
11 - 13   8
13 - 15   10
15 - 17   15
17 - 19   20
19 - 21   12
21 - 23   11
23 - 25   4
Solution:
Class Intervals   Frequency   Less than C.F.
11 - 13           8           8
13 - 15           10          18
15 - 17           15          33
17 - 19           20          53
19 - 21           12          65
21 - 23           11          76
23 - 25           4           80
Calculation of Q1
Since N/4 = 80/4 = 20, the first quartile class is 15–17
∴ $l_{Q1}$ = 15, $f_{Q1}$ = 15, h = 2 and C = 18
Hence, $Q_1 = 15 + \frac{20 - 18}{15} \times 2 = 15.27$
Calculation of Q3
Since 3N/4 = (3 x 80)/4 = 60, the third quartile class is 19–21
∴ $l_{Q3}$ = 19, $f_{Q3}$ = 12, h = 2 and C = 53
Hence, $Q_3 = 19 + \frac{60 - 53}{12} \times 2 = 20.17$
Interquartile range = Q3 – Q1 = 20.17 – 15.27 = 4.90
Calculation of P10
Since 10N/100 = (10 x 80)/100 = 8, P10 lies in the class interval 11–13
Hence, $P_{10} = 11 + \frac{8 - 0}{8} \times 2 = 13$
Calculation of P90
Since 90N/100 = (90 x 80)/100 = 72, P90 lies in the class interval 21–23
Hence, $P_{90} = 21 + \frac{72 - 65}{11} \times 2 = 22.27$
Percentile range = P90 – P10 = 22.27 – 13 = 9.27
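Quartiles and percentiles of a grouped distribution all use the same interpolation, so one function can cover them. This is a sketch under the assumption of contiguous classes in ascending order; `grouped_percentile` is an illustrative name.

```python
def grouped_percentile(classes, percent):
    """classes: (lower, upper, frequency) triples; percent: 25 for Q1, 90 for P90, etc."""
    n = sum(f for _, _, f in classes)
    target = percent * n / 100     # position of the required item
    cum = 0                        # cumulative frequency before the current class
    for lower, upper, f in classes:
        if f > 0 and cum + f >= target:
            return lower + (target - cum) / f * (upper - lower)
        cum += f
    raise ValueError("percent must lie between 0 and 100")

dist = [(11, 13, 8), (13, 15, 10), (15, 17, 15), (17, 19, 20),
        (19, 21, 12), (21, 23, 11), (23, 25, 4)]
q1 = grouped_percentile(dist, 25)
q3 = grouped_percentile(dist, 75)
p10 = grouped_percentile(dist, 10)
p90 = grouped_percentile(dist, 90)
print(round(q3 - q1, 2), round(p90 - p10, 2))
```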
3.1.6 Mean Deviation
Mean deviation is the arithmetic mean of the absolute deviations of the values
about their arithmetic mean or median or mode. In other words, Mean Deviation (MD) is the
average absolute deviation of the observations from the mean (or the median or the
mode) of the data. It shows how spread out (dispersed) the data are.
$MD = \frac{\sum |x - A|}{N}$
Where,
A = the average used for calculating deviations, which can be the mean, the median or the mode,
and N = number of observations.
Usually the mean is used. There is also an advantage in taking deviations
from the median, because the mean deviation from the median is the lowest of all
mean deviations. Since absolute values of deviations, ignoring sign, are taken
for calculating mean deviation, it is not amenable to further algebraic
treatment.
3.1.7 Mean Deviation Application
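Formulae:
Mean deviation about mean = $\sum |x - \bar{x}| / N$
Coefficient of Mean deviation (about mean) = Mean deviation about Mean / Mean
Mean deviation about median = $\sum f|x - M| / N$
Coefficient of Mean deviation (about Median) = Mean deviation about Median / Median
Mean deviation about mode = $\sum f|x - Z| / N$
Coefficient of Mean deviation (about Mode) = Mean deviation about Mode / Mode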
Example:
12 7 9 7 7 4 10 9 15 20
Solution:
$\bar{X}$ = (12 + 7 + 9 + 7 + 7 + 4 + 10 + 9 + 15 + 20)/10
= 100/10
= 10
Mean deviation = $\sum |x - \bar{X}| / N$
= (2 + 3 + 1 + 3 + 3 + 6 + 0 + 1 + 5 + 10)/10
= 34/10
= 3.4
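The calculation above can be sketched with a small helper that takes the mean as the default centre and accepts any other average. The function name and the optional `center` parameter are illustrative choices.

```python
from statistics import mean, median

def mean_deviation(values, center=None):
    """Average absolute deviation from `center` (defaults to the arithmetic mean)."""
    c = mean(values) if center is None else center
    return sum(abs(x - c) for x in values) / len(values)

data = [12, 7, 9, 7, 7, 4, 10, 9, 15, 20]
md_mean = mean_deviation(data)
md_median = mean_deviation(data, center=median(data))
print(md_mean, md_median)
```

For this data the deviation about the median is smaller than the deviation about the mean, in line with the property stated in the section.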
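Example 1:
MD in Individual series
Value (x) 125 128 132 135 140 148 155 157 159 161
Solution:
Step 1: First compute the AM: $\bar{X} = \sum x / n = 1440/10 = 144$
Step 2: Compute the deviation of each value from the mean, $D_x = x - \bar{X}$
Sl. No.   Value (x)   (x – X̄) = Dx
A         125         125 – 144 = –19
B         128         128 – 144 = –16
C         132         132 – 144 = –12
D         135         135 – 144 = –9
E         140         140 – 144 = –4
F         148         148 – 144 = 4
G         155         155 – 144 = 11
H         157         157 – 144 = 13
I         159         159 – 144 = 15
J         161         161 – 144 = 17
Σ|Dx| = 120
$MD = \sum |D_x| / n = 120/10 = 12$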
Example 2:
MD in Discrete Series
X   35  40  45  50  55  60  65  70  75  80  85  90  95
f   3   8   12  9   4   7   15  5   10  7   5   3   2
Solution:
X    f    fx    (X – X̄) = Dx            f|Dx|
35   3    105   35 – 61.95 = –26.95     80.85
40   8    320   40 – 61.95 = –21.95     175.60
45   12   540   45 – 61.95 = –16.95     203.40
50   9    450   50 – 61.95 = –11.95     107.55
55   4    220   55 – 61.95 = –6.95      27.80
60   7    420   60 – 61.95 = –1.95      13.65
65   15   975   65 – 61.95 = 3.05       45.75
70   5    350   70 – 61.95 = 8.05       40.25
75   10   750   75 – 61.95 = 13.05      130.50
80   7    560   80 – 61.95 = 18.05      126.35
85   5    425   85 – 61.95 = 23.05      115.25
90   3    270   90 – 61.95 = 28.05      84.15
95   2    190   95 – 61.95 = 33.05      66.10
N = 90    Σfx = 5575    Σf|Dx| = 1217.20
AM: $\bar{X} = \sum fx / N = 5575/90 \approx 61.95$
Mean deviation: $MD = \sum f|D_x| / N = 1217.20/90 = 13.52$
Coefficient of MD = MD/$\bar{X}$ = 13.52/61.95 = 0.218
Example 3:
MD in Continuous Series
Calculate mean deviation and its co-efficient for the following data:
X       f
10-20   5
20-30   4
30-40   7
40-50   12
50-60   10
60-70   8
70-80   4
Solution
X | f | Mid point (x) | fx | AM | (x – x̄) = dx | f|dx|
3.1.8 Variance
Variance is defined as the average of the squared deviations of data points from their
mean.
When the data constitute a sample, the variance is denoted by $\sigma_x^2$ and averaging is
done by dividing the sum of the squared deviations from the mean by ‘n – 1’. When the
observations constitute the population, the variance is denoted by $\sigma^2$ and we divide by
N for the average.
Sample variance:
$Var(x) = \sigma_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Where,
$x_i$ for i = 1, 2, ..., n are observation values.
$\bar{x}$ = Sample mean
n = Sample size
µ = Population mean
N = Population size
Population Variance is,
$Var(x) = \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
$= \frac{\sum_{i=1}^{N} (x_i^2 - 2\mu x_i + \mu^2)}{N} = \frac{\sum_{i=1}^{N} x_i^2 - 2\mu \sum_{i=1}^{N} x_i + \mu^2 \sum_{i=1}^{N} 1}{N}$
$= \frac{\sum_{i=1}^{N} x_i^2}{N} - \mu^2$
$Var(x) = E(X^2) - [E(X)]^2$
arithmetic mean. S.D. is denoted by the symbol σ (read sigma). The Standard Deviation
(SD) of a set of data is the positive square root of the variance of the set. This is also
referred to as the Root Mean Square (RMS) value of the deviations of the data points. The SD
of a sample is the square root of the sample variance, i.e. equal to $\sigma_x$, and the standard
deviation of a population is the square root of the variance of the population,
denoted by σ.
●● It is affected least by sampling fluctuations.
●● It is affected by the extreme values and it gives more importance to the values
that are away from the mean.
●● The main limitation is that we cannot compare the variability of different data sets
given in different units.
$\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$
If an assumed value A is taken for mean and d = X - A, then
$\sigma = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2}$
For a frequency distribution
$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times c$
Where d = (X – A)/c and c is the true class interval
N = Total frequency
Example:
Solution:
Let the event that ‘a student selected at random has the book’ be termed as a
success. Since the group of students is large, 3 trials, i.e., the selection of 3 students,
Where r = 0, 1, 2 and 3
The mean is np = 3 x 0.8 = 2.4 and Variance is npq = 2.4 x 0.2 = 0.48
Frequency 6 14 10 8 1 3 8
Class   Mid value (x)   f    fx    d = x – 30   d²     fd²
0-10    5               6    30    -25          625    3750
10-20   15              14   210   -15          225    3150
20-30   25              10   250   -5           25     250
30-40   35              8    280   5            25     200
40-50   45              1    45    15           225    225
50-60   55              3    165   25           625    1875
60-70   65              8    520   35           1225   9800
Σf = 50   Σfx = 1500   Σfd² = 19250
Mean = 1500/50 = 30
SD = √(19250/50) = √385 = 19.62
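The table above can be reproduced with a short sketch that uses class mid-values and divides by the total frequency (population form). `grouped_mean_sd` is an illustrative name.

```python
from math import sqrt

def grouped_mean_sd(classes):
    """classes: (lower, upper, frequency) triples; returns (mean, standard deviation)."""
    n = sum(f for _, _, f in classes)
    mids = [(lo + hi) / 2 for lo, hi, _ in classes]
    freqs = [f for _, _, f in classes]
    mean = sum(m * f for m, f in zip(mids, freqs)) / n
    variance = sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs)) / n
    return mean, sqrt(variance)

dist = [(0, 10, 6), (10, 20, 14), (20, 30, 10), (30, 40, 8),
        (40, 50, 1), (50, 60, 3), (60, 70, 8)]
mean_value, sd = grouped_mean_sd(dist)
print(mean_value, round(sd, 2))
```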
3.1.12 Combined Standard Deviation
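The combined standard deviation of two or more groups can be calculated in a way similar to
the combined mean. It may be denoted by $\sigma_{12}$. The formula for the combined standard
deviation of two groups is:
$\sigma_{12} = \sqrt{\frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$
where $d_1 = \bar{x}_1 - \bar{x}_{12}$ and $d_2 = \bar{x}_2 - \bar{x}_{12}$ are the deviations of the group means from the
combined mean $\bar{x}_{12}$.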
3.1.14 Coefficient of Variation
CV = (σ/μ) × 100
This is also called variability. A smaller value of CV indicates greater stability and
lesser variability.
Example:
Two batsmen A and B made the following scores in the preliminary round of World
Who will you select for the final? Justify your answer.
Solution:
We will first calculate the mean, standard deviation and Karl Pearson’s coefficient of
variation. We will select the player based on the average score as well as consistency.
We want a player who has not only been scoring at a high average but has also been doing it
consistently. Thus, the probability of his playing a good innings in the final is high.
xi    (xi – µ)   (xi – µ)²   xi²
14    –26        676         196
13    –27        729         169
26    –14        196         676
53    13         169         2809
17    –23        529         289
29    –11        121         841
79    39         1521        6241
36    –4         16          1296
84    44         1936        7056
49    9          81          2401
∑ xi = 400   ∑ (xi – µ) = 0   ∑ (xi – µ)² = 5974   ∑ xi² = 21974
Now,
$Mean = \mu = \frac{\sum_{i=1}^{10} x_i}{N} = \frac{400}{10} = 40$
$Variance = Var(x) = \frac{\sum_{i=1}^{10} (x_i - \mu)^2}{N} = \frac{5974}{10} = 597.4$
$Standard\ Deviation = \sigma = \sqrt{Var(x)} = \sqrt{597.4} = 24.44$
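The coefficient of variation for the scores tabulated above can be sketched with the standard library. `pstdev` is the population standard deviation (division by N), which matches the working shown; the function name is illustrative.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """CV = (sigma / mu) * 100, using the population standard deviation."""
    return pstdev(values) / mean(values) * 100

scores = [14, 13, 26, 53, 17, 29, 79, 36, 84, 49]
mu = mean(scores)
sigma = pstdev(scores)
cv = coefficient_of_variation(scores)
print(mu, round(sigma, 2), round(cv, 2))
```

Computing the same figure for the second batsman and comparing the two CVs would show which player is more consistent relative to his average.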
When the distribution stretches more to the right than it does to the left, the
distribution is said to be ‘right skewed’ or ‘positively skewed’. Similarly, a left-skewed
distribution is the one that stretches asymmetrically to the left. Thus, the skewness is
a measure of the extent of symmetry or asymmetry of the distribution. In symmetrical
distribution, with single mode, we have (mode = mean = median). In such case
skewness is zero. In case of positive skewness (i.e. right skewness) the mean is to the
right of median, which in turn lies to the right of the mode. The opposite is for negative
skewness. Skewness can be measured either in absolute term as ‘mean minus mode’
or in relative terms. Some of the relative measures are as follows:
2. Bowley’s coefficient of skewness ($SK_B$). It is defined as:
$SK_B = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{(Q_3 - Q_2) + (Q_2 - Q_1)} = \frac{Q_3 + Q_1 - 2 \times Md}{Q_3 - Q_1}$
Where, Q is quartile.
3. Kelly’s coefficient of skewness ($Sk_K$). It is defined as:
$Sk_K = \frac{P_{90} + P_{10} - 2P_{50}}{P_{90} - P_{10}}$
Where, P is percentile.
Skewness is also defined in terms of the moments about the mean. One such
measure is defined as:
$Skewness = \frac{\sum (x_i - \mu)^3}{N \sigma^3}$
curve is very useful for comparing two populations particularly when their means and
SD are same
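Pearson’s first coefficient of skewness (mode skewness) is the difference between the mean and
the mode divided by the standard deviation:
$SK_1 = \frac{\bar{X} - Mo}{s}$
Where $\bar{X}$ = the mean, Mo = the mode and s = the standard deviation for the sample.
Pearson’s second coefficient of skewness (median skewness) is calculated by multiplying the
difference between the mean and the median by three. The result is divided by the standard deviation.
$SK_2 = \frac{3(\bar{X} - Md)}{s}$
Where $\bar{X}$ = the mean, Md = the median and s = the standard deviation for the sample.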
Bowley skewness is a method to figure out whether a distribution is positively skewed or
negatively skewed. It is used as an alternative way to find out
more about the asymmetry of a distribution. It is very useful if there are extreme data
values, i.e. outliers, or if the distribution is open-ended.
Skewness = 0 means that the curve is symmetrical.
Skewness > 0 means the curve is positively skewed.
Skewness < 0 means the curve is negatively skewed.
In a symmetric distribution, like the normal distribution, the first (Q1) and third
(Q3) quartiles are at equal distances from the median (Q2). In other words, (Q3-Q2) and
(Q2-Q1) will be equal. If you have a skewed distribution then there will be a difference
between those two values.
The difference (Q3-Q2) – (Q2-Q1) is an absolute measure of skewness. It gives a result in the
units that the distribution is in, so one cannot use it to compare the skewness of
distributions measured in different units. Dividing by (Q3-Q1), as in Bowley’s coefficient, or
by the standard deviation, as in the Pearson mode skewness, gives a dimensionless result that
can be compared across distributions.
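Example:
Find the Bowley’s coefficient of the data
Value       Frequency   Cumulative frequency
0           60          60
1           60          120
2           50          170
3           20          190
4           25          215
5           10          225
6 or more   5           230
Solution –
Step 1: Find the quartile positions for the data set, looking for the “nth” observation
using the following formulas:
Q1 = (N + 1)/4 th observation = 231/4 = 57.75th observation
Q2 = (N + 1)/2 th observation = 231/2 = 115.5th observation
Q3 = 3(N + 1)/4 th observation = 3 × 231/4 = 173.25th observation
Step 2: Look in the table to find the nth observations as calculated in Step 1:
Q1 = 57.75th observation = 0
Q2 = 115.5th observation = 1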
Amity Directorate of Distance & Online Education
Business Statistics 51
Q3 = 173.25th observation = 3
Step 3: Plug the above values into the formula:
Skq = (Q3 + Q1 – 2Q2)/(Q3 – Q1)
Skq = (3 + 0 – 2 × 1)/(3 – 0) = 1/3
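Step 3 can be sketched as a one-line function of the three quartiles; `bowley_skewness` is an illustrative name and the quartile values are the ones found in Step 2.

```python
def bowley_skewness(q1, q2, q3):
    """(Q3 + Q1 - 2*Q2) / (Q3 - Q1); zero when the quartiles are symmetric about Q2."""
    return (q3 + q1 - 2 * q2) / (q3 - q1)

sk = bowley_skewness(q1=0, q2=1, q3=3)
print(round(sk, 4))
```

A positive result, as here, indicates that the upper quartile lies further from the median than the lower quartile does.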
3.1.18 Measure of Kurtosis
Kurtosis is a measure of the peakedness of a distribution. The larger the kurtosis, the
more peaked the distribution. Kurtosis is calculated either as an absolute or
a relative value. Absolute kurtosis is always a positive number. The absolute kurtosis of a
normal distribution (symmetric bell-shaped distribution) is taken as 3. Absolute and relative kurtosis
can be calculated as follows:
$Absolute\ Kurtosis = \frac{\sum (x_i - \mu)^4}{N \sigma^4}$
Relative kurtosis = Absolute kurtosis – 3
●● Relative kurtosis can be negative. Managers usually work with relative
kurtosis.
●● Negative kurtosis indicates a flatter distribution than the normal distribution,
and is called platykurtic.
●● A positive kurtosis means a more peaked curve, called leptokurtic.
Example:
The first four central moments of a distribution are 0, 2.5, 0.7 and 18.75. Test the skewness and kurtosis of the distribution.
Testing Skewness:
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(0.7)^2}{(2.5)^3} = \frac{0.49}{15.625} = 0.031 \]
Since β1 = +0.031, the distribution is slightly skewed.
Testing Kurtosis:
When a distribution is more peaked than the normal, β2 is more than 3, and when it is less peaked than the normal, β2 is less than 3.
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} \]
μ4 = 18.75, μ2 = 2.5
\[ \beta_2 = \frac{18.75}{(2.5)^2} = \frac{18.75}{6.25} = 3 \]
Since β2 = 3, the distribution has the same peakedness as the normal distribution (mesokurtic).
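The moment-based test above can be reproduced with a few lines of Python. This is a minimal sketch that uses the moments given in the example; the function name `classify` is hypothetical.

```python
import math

# Second, third and fourth central moments from the example.
mu2, mu3, mu4 = 2.5, 0.7, 18.75

beta1 = mu3 ** 2 / mu2 ** 3    # moment measure of skewness
beta2 = mu4 / mu2 ** 2         # absolute kurtosis
relative_kurtosis = beta2 - 3  # 0 for a normal distribution

def classify(beta2):
    """Name the shape implied by beta2, using 3 as the normal benchmark."""
    if math.isclose(beta2, 3):
        return "mesokurtic"
    return "leptokurtic" if beta2 > 3 else "platykurtic"

print(round(beta1, 3), beta2, classify(beta2))  # prints 0.031 3.0 mesokurtic
```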
Key Takeaways:
Measure of dispersion: It gives an idea about the extent of lack of uniformity in the sizes and qualities of the items in a series. It helps us to know the degree of uniformity and consistency in the series. If the difference between items is large, the dispersion or variation is large, and vice versa.
Range: The ‘Range’ of the data is the difference between the largest value and the smallest value of the data.
Variance: It is the average squared deviation of the data from their mean. For sample data, we take the average by dividing by (n − 1), where n is the sample size. This accounts for the degrees of freedom. For population data, we average by dividing by the population size N.
effectively.
The Standard Deviation (SD) of a set: It is the positive square root of the variance of the set. It is also referred to as the Root Mean Square (RMS) value of the deviations of the data points. The SD of a sample is the square root of the sample variance.
1. Negative kurtosis indicates a flatter distribution than the normal distribution, and is called
a) Platykurtic
b) Mesokurtic
c) Leptokurtic
d) Relative kurtosis
2. A special type of graph, designed to describe how much a certain distribution varies from a completely uniform distribution
a) Histogram
b) Lorenz Curve
c) Skewness curve
d) Pie chart
3. A ________________ in any data shows the extent to which the numerical values tend to spread about an average.
a) Hypothesis
b) Measure of dispersion
c) Measure of Central Tendency
d) Standard deviation
4. The ________________ of data is the difference between the largest value of data and smallest value of data.
a) Mean
b) Median
c) Range
d) Mode
5. The formula for inter-quartile range is -
a) Q2-Q1
b) Q3-Q1
c) Q1-Q2
d) Q1-Q3
1. a) Platykurtic
2. b) Lorenz Curve
3. b) Measure of dispersion
4. c) Range
5. b) Q3-Q1
Analysis
Objectives
1. To get introduced to the concepts of correlation and regression analysis
2. To discuss data collection and presentation techniques
3. To understand the importance and role of time series and its analysis in any business organization
Outcomes
1. The learner will be able to use statistical tools like correlation and regression analysis in business management
2. The learner will be able to utilize an understanding of time series analysis in problem solving
A.M. Tuttle says, “Correlation is an analysis of the covariation between two or more variables.”
4.1.1 Introduction
We often encounter situations where data appear as pairs of figures relating to two variables, for example, price and demand of a commodity, money supply and inflation, industrial growth and GDP, advertising expenditure and market share, etc.
as they naturally occur, since neither variable can be fixed at predetermined levels. Correlation and regression analysis show how to determine the nature and strength of the relationship between the variables.
nature, the appropriate statistical tool for discovering and measuring the
relationship and expressing it in a brief formula is known as correlation”.
●● L.R. Conner says, “If two or more quantities vary in sympathy so that the
movement in one tends to be accompanied by corresponding movements in
variables. It may be the case that one is the cause and the other is an effect, i.e. independent and dependent variables respectively. On the other hand, both may be dependent on a third variable. In some cases there may not be any cause-effect relationship at all. Therefore, if we do not consider and study the underlying economic or physical relationship, correlation may sometimes give absurd results. Take, for example, the global average temperature and the Indian population. Both have been increasing over the past 50 years but are obviously not related. Correlation is an analysis of the degree to which two or more variables fluctuate with reference to each other.
Correlation is expressed by a coefficient ranging between –1 and +1. A positive (+ve) sign indicates movement of the variables in the same direction. For example, the quantity of fertilizer used on a farm and the yield show a positive relationship within technological limits. A negative (–ve) coefficient indicates movement of the variables in opposite directions, i.e. when one variable decreases, the other increases.
relationship. Absence of correlation is indicated if the coefficient is close to zero. A value of the coefficient close to ±1 denotes a very strong linear relationship.
●● To identify the relationship of various factors and decision variables.
●● To estimate the value of one variable for a given value of the other if both are correlated.
●● To understand economic behaviour and market forces.
●● To reduce uncertainty in decision-making to a large extent.
In business, correlation analysis often helps managers to take decisions by estimating the effects of changing the values of decision variables like promotion, advertising, price and production processes on objective parameters like costs, sales, market share, consumer satisfaction and competitive price. The decision becomes more objective by removing subjectivity to a certain extent. However, it must be understood that correlation analysis only tells us whether two or more variables in the data fluctuate together or not. The co-movement is not necessarily due to a cause and effect relationship. Whether the fluctuations in one of the variables indeed affect the other has to be established through a logical understanding of the business environment.
does not deal with a cause or effect relationship. For instance, in a survey of a classroom, the researcher may be looking to count the number of boys and girls. In this instance, the data would simply reflect the number, i.e. a single variable and its quantity, as per the table below. The key objective of univariate analysis is simply to describe the data and find patterns within it. This is done by looking into the mean, median, mode, dispersion, variance, range, standard deviation, etc.
Variable = x    Number = n
Boys            40
Girls           45
Univariate analysis is conducted in several ways, which are mostly descriptive in nature:
●● Frequency Polygons
●● Pie and Bar Charts
4.2 Bivariate Analysis - Introduction
The analysis of bivariate data is slightly more analytical than the analysis of univariate data. If two variables are included in the data set and researchers plan to compare them, the correct analysis technique is bivariate analysis.
analysis the ratio of students who scored above 90% corresponding to their genders. In this case, there are two variables – gender = X (independent variable) and result = Y (dependent variable). A bivariate analysis will measure the correlation between the two variables, as shown in the table below.
Variable = x    Number = n    Ratio of students scoring above 90%
Boys            40            10
Girls           45            7
Bivariate analysis is conducted
1. Correlation coefficients
Correlation is a statistical measure of the intensity of the relationship between two variables. It indicates whether the correlation is strong or weak and is graded on a scale of –1 to 1, where 1 is a perfect direct correlation, –1
2. Regression analysis
Regression analysis is used to calculate the relationship between two different variables. It involves methods for modelling and analysing multiple variables when the emphasis is on the relationship between a dependent variable and one or more independent variables. It helps to explain how the value of the dependent variable changes when any one of the independent variables is modified. Regression analysis is also used for advanced data modelling purposes like prediction and forecasting.
for the two regression coefficients for a bivariate frequency distribution as given below:
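\[ b = \frac{N \sum\sum f_{ij} X_i Y_j - (\sum f_i X_i)(\sum f_j Y_j)}{N \sum f_i X_i^2 - (\sum f_i X_i)^2} \]
or, if we define \( u_i = \frac{X_i - A}{h} \) and \( v_j = \frac{Y_j - B}{k} \),
\[ b = \frac{k}{h} \cdot \frac{N \sum\sum f_{ij} u_i v_j - (\sum f_i u_i)(\sum f_j v_j)}{N \sum f_i u_i^2 - (\sum f_i u_i)^2} \]
Similarly,
\[ d = \frac{N \sum\sum f_{ij} X_i Y_j - (\sum f_i X_i)(\sum f_j Y_j)}{N \sum f_j Y_j^2 - (\sum f_j Y_j)^2} \]
or
\[ d = \frac{h}{k} \cdot \frac{N \sum\sum f_{ij} u_i v_j - (\sum f_i u_i)(\sum f_j v_j)}{N \sum f_j v_j^2 - (\sum f_j v_j)^2} \]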
4.4 Linear-Nonlinear-Relationship
A graph is plotted to find the relationship between two variables. It is common practice to put the variable you are actively modifying (the independent variable) on the x-axis and the outcome being examined (the dependent variable) on the y-axis.
The relationships between two quantities are illustrated by linear and non-linear relationships. A straight line forms the graph of a linear equation, while the graph of a non-linear relationship is curved. In a non-linear relationship, each unit change in the x variable does not always bring about the same change in the y variable.
A nonlinear relationship between two variables is one for which the slope of the curve showing the relationship changes as the value of one of the variables changes. If a relationship between two variables is not linear, the rate of increase or decrease can change as one variable changes, causing a “curved pattern” in the data. This curved trend might be better modeled by a nonlinear function, such as a quadratic
4.5 Correlation-Coefficient
are grouped for X variable and columns are grouped for Y variable. Each cell, say (i, j), represents the frequency or count that falls in both groups of a particular range of values of Xi and Yj. In this case the correlation coefficient is given by
\[ r = \frac{\sum f_{XY}\, m_X m_Y - \frac{1}{n} \sum (f_X m_X) \sum (f_Y m_Y)}{\sqrt{\left[\sum f_X m_X^2 - \frac{\left(\sum f_X m_X\right)^2}{n}\right]\left[\sum f_Y m_Y^2 - \frac{\left(\sum f_Y m_Y\right)^2}{n}\right]}} \]
Where mX and mY are the class marks of the frequency distributions of the X and Y variables, fX and fY are the marginal frequencies of X and Y, and fXY are the joint frequencies of X and Y respectively.
y \ x       0-500   500-1000   1000-1500   1500-2000   2000-2500   Total
0-200       12      6          -           -           -           18
200-400     2       18         4           2           1           27
400-600     -       4          7           3           -           14
600-800     -       1          -           2           1           4
800-1000    -       -          1           2           3           6
Total       14      29         12          9           5           69
Solution:
Let the assumed mean for X be a = 1250 and the scaling factor g = 500. Therefore, we can calculate f × dx and f × dx² from the marginal distribution of X as,
X           Class mark mx   dx = (mx − a)/g   Frequency f   f × dx   f × dx²
0-500       250             -2                14            -28      56
500-1000    750             -1                29            -29      29
1000-1500   1250            0                 12            0        0
1500-2000   1750            1                 9             9        9
2000-2500   2250            2                 5             10       20
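The grouped-data formula can be applied to the bivariate table above with a short Python sketch. It is an illustration under stated assumptions: the "-" cells are read as zero frequencies, class marks are taken as class mid-points, and the calculation uses the class marks directly rather than step deviations (the correlation coefficient is unchanged by a change of origin and scale).

```python
import math

# Class marks (mid-points) of the X and Y classes in the table.
x_marks = [250, 750, 1250, 1750, 2250]
y_marks = [100, 300, 500, 700, 900]

# Joint frequencies: one row per Y class, one column per X class ("-" read as 0).
joint = [
    [12, 6, 0, 0, 0],
    [2, 18, 4, 2, 1],
    [0, 4, 7, 3, 0],
    [0, 1, 0, 2, 1],
    [0, 0, 1, 2, 3],
]

n = sum(sum(row) for row in joint)
f_x = [sum(row[j] for row in joint) for j in range(len(x_marks))]  # column totals
f_y = [sum(row) for row in joint]                                   # row totals

sum_x = sum(f * m for f, m in zip(f_x, x_marks))
sum_y = sum(f * m for f, m in zip(f_y, y_marks))
sum_xx = sum(f * m * m for f, m in zip(f_x, x_marks))
sum_yy = sum(f * m * m for f, m in zip(f_y, y_marks))
sum_xy = sum(
    joint[i][j] * x_marks[j] * y_marks[i]
    for i in range(len(y_marks))
    for j in range(len(x_marks))
)

numerator = sum_xy - sum_x * sum_y / n
denominator = math.sqrt((sum_xx - sum_x ** 2 / n) * (sum_yy - sum_y ** 2 / n))
r = numerator / denominator
print(n, f_x, f_y, round(r, 3))
```

The marginal totals computed by the sketch should match the Total row and column of the table, which is a quick check that the joint frequencies were entered correctly.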
\[ r = \frac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y} \]
\[ r = \frac{\frac{1}{n} \sum (x - \bar{x})(y - \bar{y})}{\sigma_x \sigma_y} \qquad (1) \]
variables X and Y. It is denoted as Cov (x,y). The Correlation Coefficient r is a dimensionless number whose value lies between +1 and –1. Positive values of r indicate positive (or direct) correlation between the two variables X and Y, i.e. both X and Y increase or decrease together.
that an increase in one variable X or Y results in a decrease in the value of the other variable. A zero correlation means that there is no linear association between the two variables.
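formula can be modified as:
\[ r = \frac{\frac{1}{n} \sum (x - \bar{x})(y - \bar{y})}{\sigma_x \sigma_y} = \frac{\frac{1}{n} \sum (xy - x\bar{y} - \bar{x}y + \bar{x}\bar{y})}{\sigma_x \sigma_y} \]
\[ = \frac{\frac{\sum xy}{n} - \frac{\sum x}{n} \cdot \frac{\sum y}{n}}{\sqrt{\left[\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2\right]\left[\frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2\right]}} \qquad (2) \]
\[ = \frac{E[XY] - E[X]E[Y]}{\sqrt{\left(E[X^2] - (E[X])^2\right)\left(E[Y^2] - (E[Y])^2\right)}} \qquad (3) \]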
Equations (2) and (3) are alternate forms of equation (1). These have the advantage that we don’t have to subtract the mean from each value.
4.6 Correlation Coefficient Application
Example:
The data on advertisement expenditure (X) and sales (Y) of a company for the past 10-year period are given below. Determine the correlation coefficient between these variables and comment on the correlation.
X    50    50    50    40    30    20    20    15    10    5
Y    700   650   600   500   450   400   300   250   210   200
Solution:
We shall take U to be the deviation of X values from the assumed mean of 30 divided by 5. Similarly, V represents the deviation of Y values from the assumed mean of 400 divided by 10.
S. No.   X    Y     U    V     UV    U²    V²
1        50   700   4    30    120   16    900
2        50   650   4    25    100   16    625
3        50   600   4    20    80    16    400
4        40   500   2    10    20    4     100
5        30   450   0    5     0     0     25
6        20   400   -2   0     0     4     0
7        20   300   -2   -10   20    4     100
8        15   250   -3   -15   45    9     225
9        10   210   -4   -19   76    16    361
10       5    200   -5   -20   100   25    400
Total               -2   26    561   110   3136
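Short cut procedure for calculation of correlation coefficient
\[ r = \frac{\sum_{i=1}^{n} u_i v_i - \frac{1}{n} \sum_{i=1}^{n} u_i \sum_{i=1}^{n} v_i}{\sqrt{\left[\sum_{i=1}^{n} u_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} u_i\right)^2\right]\left[\sum_{i=1}^{n} v_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} v_i\right)^2\right]}} \]
\[ = \frac{561 - \frac{(-2)(26)}{10}}{\sqrt{\left[110 - \frac{4}{10}\right]\left[3136 - \frac{676}{10}\right]}} = \frac{561 + 5.2}{\sqrt{109.6 \times 3068.4}} = 0.976 \]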
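The short-cut calculation can be verified in Python. The sketch below uses the advertising and sales data of the example; the function name `pearson_r` is hypothetical. It also shows that the coefficient computed from the step deviations U and V equals the one computed from the raw X and Y values.

```python
import math

x = [50, 50, 50, 40, 30, 20, 20, 15, 10, 5]             # advertisement expenditure
y = [700, 650, 600, 500, 450, 400, 300, 250, 210, 200]  # sales

def pearson_r(a, b):
    """Correlation coefficient from raw sums (no deviations from the mean needed)."""
    n = len(a)
    s_ab = sum(p * q for p, q in zip(a, b)) - sum(a) * sum(b) / n
    s_aa = sum(p * p for p in a) - sum(a) ** 2 / n
    s_bb = sum(q * q for q in b) - sum(b) ** 2 / n
    return s_ab / math.sqrt(s_aa * s_bb)

# Step deviations used in the worked solution.
u = [(xi - 30) / 5 for xi in x]
v = [(yi - 400) / 10 for yi in y]

r_uv = pearson_r(u, v)
r_xy = pearson_r(x, y)
print(sum(u), sum(v), round(r_uv, 3), round(r_xy, 3))  # prints -2.0 26.0 0.976 0.976
```

A value of about 0.976 indicates a very strong positive linear relationship between advertisement expenditure and sales in this data set.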
Interpretation of r
●● The correlation coefficient r ranges from −1 to 1. A value of 1 implies that a linear equation describes the relationship between X and Y perfectly, with all data points lying on a line for which Y increases as X increases. A value of −1 implies that all data points lie on a line for which Y decreases as X increases. A value of 0 implies that there is no linear correlation between the variables.
●● More generally, note that (Xi − X̄)(Yi − Ȳ) is positive if and only if Xi and Yi lie on the same side of their respective means. Thus the correlation coefficient is positive
There are three broad reasons for computing a correlation matrix:
The observable trend in our example above is that all the variables correlate strongly with each other.
2. To input into other analyses. For example, correlation matrices are used as inputs for exploratory factor analysis, confirmatory factor analysis, structural equation models, and linear regression when excluding missing values pairwise.
3. As a diagnostic when checking other analyses. For example, with linear regression, a high degree of correlation among the predictors suggests that the linear regression estimates will be unreliable.
4.8 Rank Correlation
Quite often the data is available in the form of rankings for different variables. Also, there are occasions where it is difficult to measure the cause-effect variables. For example, while selecting a candidate, there are a number of factors on which the experts base their assessment. It is not possible to measure many of these parameters in physical units, e.g. sincerity, loyalty, integrity, tactfulness, initiative, etc. Similar is the case during dance contests. However, in these cases the experts may rank the candidates. It is then necessary to find out whether the two sets of ranks are in agreement with each other. This is measured by the Rank Correlation Coefficient. The purpose of computing a correlation coefficient in such situations is to determine the extent to which the two sets of rankings are in agreement. The coefficient that is determined from these ranks is known as Spearman’s rank coefficient, rS.
\[ r_S = 1 - \frac{6 \sum_{i=1}^{n} D_i^2}{n(n^2 - 1)} \]
Where, n = Number of observation pairs
Di = Xi − Yi
Rank in Maths (X)      1   2   3   4   5   6   7   8    9   10
Rank in Physics (Y)    3   1   4   2   6   9   8   10   5   7
Solution:
Computation of Spearman’s Rank Correlation is shown below:
Individual   Rank in Maths (X = xi)   Rank in Physics (Y = yi)   di = xi − yi   di²
1            1                        3                          -2             4
2            2                        1                          +1             1
3            3                        4                          -1             1
4            4                        2                          +2             4
5            5                        6                          -1             1
6            6                        9                          -3             9
7            7                        8                          -1             1
8            8                        10                         -2             4
9            9                        5                          +4             16
10           10                       7                          +3             9
Total                                                                           50
Now, n = 10 and \( \sum_{i=1}^{n} d_i^2 = 50 \)
\[ r_S = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} = 1 - \frac{6 \times 50}{10(100 - 1)} = 1 - \frac{300}{990} = 0.697 \]
We can say that there is a high degree of correlation between the performance in
mathematics and physics.
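Spearman's formula is easy to check in code. The following Python sketch applies it to the Maths and Physics ranks of the example above; the function name `spearman_rs` is hypothetical and the sketch assumes there are no tied ranks.

```python
# Ranks of the ten individuals in the two subjects.
maths = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
physics = [3, 1, 4, 2, 6, 9, 8, 10, 5, 7]

def spearman_rs(rank_x, rank_y):
    """Return (sum of squared rank differences, Spearman's coefficient)."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

sum_d2, rs = spearman_rs(maths, physics)
print(sum_d2, round(rs, 3))  # prints 50 0.697
```

Because the differences are squared, the sign convention chosen for d (X − Y or Y − X) does not affect the result.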
X    75    88    95    70    60    80    81    50
Y    120   134   150   115   110   140   142   100
X     Y     R1   R2   d = R1 – R2   d²
75    120   5    5    0             0
88    134   2    4    –2            4
95    150   1    1    0             0
70    115   6    6    0             0
60    110   7    7    0             0
80    140   4    3    1             1
81    142   3    2    1             1
50    100   8    8    0             0
Total                               6
Coefficient of correlation
\[ \rho = 1 - \frac{6 \sum d^2}{N(N^2 - 1)} = 1 - \frac{6 \times 6}{8(64 - 1)} = +0.93 \]
In this method the biggest item gets the first rank, the next biggest second rank and
so on.
Rank correlation for tied ranks
In case of a tie, i.e., when two or more individuals have the same rank, each individual is assigned a rank equal to the mean of the ranks that would have been assigned to them in the event of there being slight differences in their values. To understand this, let us consider the series 20, 21, 21, 24, 25, 25, 25, 26, 27, 28. Here the value 21 is repeated two times and the value 25 is repeated three times. When we rank these values, rank 1 is given to 20. The values 21 and 21 could have been assigned ranks 2 and 3 if these were slightly different from each other. Thus, each value will be assigned a rank equal to the mean of 2 and 3, i.e., 2.5. Further, the value 24 will be assigned a rank equal to 4 and each of the values 25 will be assigned a rank equal to 6, the mean of 5, 6 and 7, and so on.
Since Spearman’s formula is based upon the assumption of different ranks for different individuals, its correction becomes necessary in case of tied ranks. It should be noted that the means of the ranks will remain unaffected.
Σd1 .
2
For example, if the ranks of X are 1, 2, 3, 3, 5, ..., showing that there are two items with the same 3rd rank and the fourth rank is skipped, then instead of writing 3, we write 3½ for both. Thus the sum of these ranks, which is 7 (3 + 4 = 3½ + 3½ = 7), remains the same, keeping the mean of ranks unaffected. But in such cases the standard deviation is affected. Therefore, a correction is required for the Rank Correlation Coefficient. For this, Σdi² is increased by (m³ – m)/12 for each tie, where m is the number of items in each tie. We must remember that if there is more than one group of items with a common rank, this correction factor is to be added once for each group.
Example: Twelve salesmen are ranked for efficiency and length of service as below:
Salesman                 A   B   C   D   E   F   G   H   I   J    K    L
Efficiency (X)           1   2   3   4   4   4   7   8   9   10   11   12
Length of service (Y)    2   1   5   3   9   7   7   6   4   11   10   11
Solution:
Salesman   Rank in efficiency (X)   Rank in length of service (Y)   d = X − Y   d²
A          1                        2                               -1          1
B          2                        1                               1           1
C          3                        5                               -2          4
D          (4+5+6)/3=5              3                               2           4
E          (4+5+6)/3=5              9                               -4          16
F          (4+5+6)/3=5              (7+8)/2=7.5                     -2.5        6.25
G          7                        (7+8)/2=7.5                     -0.5        0.25
H          8                        6                               2           4
I          9                        4                               5           25
J          10                       (11+12)/2=11.5                  -1.5        2.25
K          11                       10                              1           1
L          12                       (11+12)/2=11.5                  0.5         0.25
Total                                                                           65
Now, n = 12 and \( \sum_{i=1}^{n} d_i^2 = 65 \)
Using the formula
\[ r_S = 1 - \frac{6\left\{\sum_{i=1}^{n} d_i^2 + \frac{1}{12}(3^3 - 3) + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(2^3 - 2)\right\}}{n(n^2 - 1)} = 1 - \frac{6(65 + 2 + 0.5 + 0.5)}{12(144 - 1)} = 1 - \frac{408}{1716} = 0.762 \]
We can conclude that there is a high degree of correlation between efficiency and
length of service.
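The tie correction can be sketched in Python as follows. The efficiency ranks are those given in the example; the length-of-service ranks are an assumption read off the solution table (7 for F and G, 11 for J and L). The function names `average_ranks`, `tie_correction` and `spearman_tied` are hypothetical.

```python
from collections import Counter

efficiency = [1, 2, 3, 4, 4, 4, 7, 8, 9, 10, 11, 12]
service = [2, 1, 5, 3, 9, 7, 7, 6, 4, 11, 10, 11]  # assumed from the solution table

def average_ranks(scores):
    """Give tied items the mean of the positions they would jointly occupy."""
    positions = {}
    for pos, value in enumerate(sorted(scores), start=1):
        positions.setdefault(value, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in scores]

def tie_correction(scores):
    """Sum of (m^3 - m) / 12 over every group of m tied items."""
    return sum((m ** 3 - m) / 12 for m in Counter(scores).values() if m > 1)

def spearman_tied(x, y):
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    corrected = sum_d2 + tie_correction(x) + tie_correction(y)
    return sum_d2, corrected, 1 - 6 * corrected / (n * (n ** 2 - 1))

sum_d2, corrected, rs = spearman_tied(efficiency, service)
print(sum_d2, corrected, round(rs, 3))  # prints 65.0 68.0 0.762
```

The correction adds 2 for the three-way tie in efficiency and 0.5 for each of the two two-way ties in length of service, so Σd² rises from 65 to 68 before the usual formula is applied.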
There is a need for a statistical model that will extract information from the given data to establish the regression relationship between the independent and dependent variables. The model should capture the systematic behaviour of the data. The non-systematic behaviour cannot be captured and is called error. The error is due to a random component that cannot be predicted, as well as to components not adequately considered in the statistical model. A good statistical model captures the entire systematic component, leaving only random errors.
which is closest to the observations is called the ‘best fit’.
The best fit is calculated as per Legendre’s principle of the least sum of squares of deviations of the observed data points from the corresponding values on the ‘best fit’ curve. This is called the minimum squared error criterion. It may be noted that the deviation (error) can be measured in the X direction or the Y direction. Accordingly, we will get two ‘best fit’ curves. If we measure the deviation in the Y direction, i.e. for a given x value of a data point (x, y) we measure the corresponding y value on the ‘best fit’ curve and then take the deviation in y, we call it the regression of Y on X. In the other case, if we measure deviations in the X direction, we call it the regression of X on Y.
Definition: According to Morris Myers Blair, regression is the measure of the average relationship between two or more variables in terms of the original units of the data.
4.12 Linear Regression Model
Linear regression is a linear model, i.e. a model that assumes a linear relationship between the input variables (x) and the single output variable (y). More specifically, y can be calculated from a linear combination of the input variables (x).
A simple and widely used kind of predictive analysis is linear regression. Two questions are discussed in the general concept of regression:
how they influence the outcome variable, as shown by the magnitude and sign of the
beta estimates.
The relationship between one dependent variable and one or more independent
When there is a single input variable (x), the method is referred to as simple linear regression. When there are multiple input variables, literature from statistics often refers to the method as multiple linear regression.
Regression Lines
For a bivariate data (Xi, Yi), i = 1,2, ...... n, we can have either X or Y as
one cannot be derived from the other by mere transfer of terms, because the derivation
of each line is dependent on a different set of assumptions.
Line of Regression of Y on X
The general form of the line of regression of Y on X is YCi = a + bXi, where YCi denotes the average or predicted or calculated value of Y for a given value of X = Xi. This line has two constants, a and b. The constant a is defined as the average value of Y when X = 0. Geometrically, it is the intercept of the line on the Y-axis. Further, the constant b, which gives the average rate of change of Y per unit change in X, is known as the regression coefficient.
[Figure: line of regression of Y on X, YCi = a + bXi, with intercept a on the Y-axis]
The above line is known if the values of a and b are known. These values are estimated from the observed data (Xi, Yi), i = 1,2, ...... n.
Line of Regression of X on Y
The general form of the line of regression of X on Y is XCi = c + dYi, where XCi denotes the predicted or calculated or estimated value of X for a given value of Y = Yi, and c and d are constants. d is known as the regression coefficient of the regression of X on Y.
[Figure: line of regression of X on Y, XCi = c + dYi, with intercept c on the X-axis]
In this case, we have to calculate the values of c and d so that S′ = Σ(Xi – XCi)² is minimised.
This shows that the line of regression also passes through the point (X̄, Ȳ). Since both lines of regression pass through the point (X̄, Ȳ), it is their point of intersection, as shown
[Figure: the two lines of regression, YCi = a + bXi and XCi = c + dYi, intersecting at the point (X̄, Ȳ)]
We can write c = X̄ – dȲ
4.13 Population Regression
The regression model is based on a sample of n bivariate observations drawn from a larger population of measurements.
ŷ = b0 + b1x
To construct an ordinary least-squares regression line, we use the means and standard deviations of our sample data to calculate the slope (b1) and y-intercept (b0).
of the response variable y are normally distributed with a mean that depends on x. μy is used to represent such means. It is also assumed that these means all lie on a straight line when plotted against x (a line of means). The sample data then fit the statistical model:
yi = (β0 + β1xi) + εi
where the errors (εi) are independent and normally distributed N(0, σ). Linear regression also assumes equal variance of y (σ is the same for all values of x). We use ε (Greek epsilon) to stand for the residual part of the statistical model. A response y is the sum of its mean and a chance deviation ε from the mean. The deviations ε represent the “noise” in the data. In other words, the noise is the variation in y due to other causes that prevent the observed (x, y) from forming a perfectly straight line.
The sample data used for regression are the observed values of y and x. The response y to a given x is a random variable, and the regression model describes the mean and standard deviation of this random variable y. The intercept β0, slope β1, and standard deviation σ of y are the unknown parameters of the regression model and must be estimated from the sample data.
●● The value of ŷ from the least squares regression line is really a prediction of the mean value of y (μy) for a given value of x.
sample data is the best estimate of the true population regression line (μy = β0 + β1x).
4.14 Least Squares Method
The generally used method to find the ‘best’ fit that a straight line of this kind can give is the least-squares method. To use it efficiently, we first determine the following sums, where xi = Xi − X̄ and yi = Yi − Ȳ are deviations from the means:
\[ \sum x_i^2 = \sum X_i^2 - n\bar{X}^2 \]
\[ \sum y_i^2 = \sum Y_i^2 - n\bar{Y}^2 \]
\[ \sum x_i y_i = \sum X_i Y_i - n\bar{X}\bar{Y} \]
\[ b = \frac{\sum x_i y_i}{\sum x_i^2}, \qquad a = \bar{Y} - b\bar{X} \]
These measures define a and b which will give the best possible fit through the original X and Y points, and the value of r can then be worked out as under
\[ r = b \sqrt{\frac{\sum x_i^2}{\sum y_i^2}} \]
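As an illustration of the least-squares formulas, the Python sketch below fits the line Y = a + bX to the advertising and sales data of Section 4.6. Reusing that data set here is an assumption made only for demonstration. The sketch also checks that the fitted a and b satisfy the two normal equations ΣY = na + bΣX and ΣXY = aΣX + bΣX².

```python
import math

x = [50, 50, 50, 40, 30, 20, 20, 15, 10, 5]             # X values (Section 4.6 data)
y = [700, 650, 600, 500, 450, 400, 300, 250, 210, 200]  # Y values

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and products of deviations from the means.
s_xx = sum(xi * xi for xi in x) - n * x_bar ** 2
s_yy = sum(yi * yi for yi in y) - n * y_bar ** 2
s_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar

b = s_xy / s_xx                 # slope (regression coefficient of Y on X)
a = y_bar - b * x_bar           # intercept
r = b * math.sqrt(s_xx / s_yy)  # correlation coefficient recovered from b

# Residuals of the two normal equations; both should be numerically zero.
eq1 = sum(y) - (n * a + b * sum(x))
eq2 = sum(xi * yi for xi, yi in zip(x, y)) - (a * sum(x) + b * sum(xi * xi for xi in x))

print(round(a, 2), round(b, 3), round(r, 3))
```

Solving the normal equations directly and using the deviation formulas give the same a and b, which is why either route can be used in hand calculations.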
Thus, regression analysis is a statistical method for formulating a mathematical model of the relationship amongst variables, which can be used to predict the values of the dependent variable given the values of the independent variable.
values of X and Y variables, we can find the values of the two constants viz., a and b by using the following two normal equations:
\[ \sum Y_i = na + b \sum X_i \]
\[ \sum X_i Y_i = a \sum X_i + b \sum X_i^2 \]
and then solving these equations for finding a and b values. Once these values are obtained and have been put in the equation Y = a + bX, we say that we have fitted the
the nth degree. Polynomial regression fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted E(y | x).
It is used when –
●● There are some relationships that a researcher will hypothesize to be curvilinear.
●● Inspection of residuals suggests it. If we try to fit a linear model to curved data, a scatter plot of residuals (Y axis) on the predictor (X axis) will have patches of many positive residuals in the middle. Hence in such a situation a linear model is not appropriate.
●● An assumption in usual multiple linear regression analysis is that all the independent variables are independent. In a polynomial regression model, this assumption is not satisfied.
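The residual pattern described in the second point can be seen in a small Python sketch. The data are hypothetical: a concave quadratic y = 10x − x² is generated and a straight line is fitted to it by least squares, after which the residuals are inspected.

```python
# Hypothetical curved data: a concave quadratic evaluated at x = 0, 1, ..., 10.
x = list(range(11))
y = [10 * xi - xi ** 2 for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Ordinary least-squares straight line y = a + b x.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b = s_xy / s_xx
a = y_bar - b * x_bar

# Residual = observed value minus value on the fitted line.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
print([round(e, 1) for e in residuals])
```

In this sketch the residuals are negative at both ends and positive in the middle, which is the kind of systematic patch that signals a straight line is not adequate and a polynomial term may be needed.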
These are basically used to define or describe non-linear phenomena such as:
●● Growth rate of tissues
●● Progression of disease epidemics
●● Distribution of carbon isotopes in lake sediments
The basic goal of regression analysis is to model the expected value of a dependent variable y in terms of the value of an independent variable x. In simple regression, we used the equation y = a + bx + e, where y is the dependent variable, a is the y-intercept, b is the slope and e is the error term.
A model parameter is a configuration variable that is internal to the model and whose value can be estimated from data. Parameters are needed by the model when making predictions. Their values determine the model’s ability to solve a problem and are estimated or learned from data.
For a variable, we might assume a distribution, such as a Gaussian distribution. The mean (mu) and the standard deviation (sigma) are the two parameters of the Gaussian distribution. The same holds in machine learning, where these parameters can be estimated from data and used as part of a predictive model. Examples are the coefficients of a linear regression or logistic regression.
Key Takeaways
●● Correlation: It is an analysis of covariation between two or more variables.
●● Correlation Coefficient: It is a numerical measure of the degree of association
a) Negative
b) Different
c) Same
d) Zero
2. Regression analysis is used to study ___________ between the variables
a) Relationship
b) Dependence
c) Positivity
d) Negativity
3. If a regression coefficient is negative then the correlation between the variables would also be
a) Negative
b) Different
c) Same
d) Zero
4. ______________ correlation refers to the movement of the variables in opposite direction.
a) Negative
b) Different
c) Same
d) Zero
5. Graph plotted to show relationship between two variables is
a) Pie chart
b) Histogram
c) Scatter Diagram
d) Bar graph
3. Write a note on the standard error of the estimate.
4. Explain scatter diagrams.
5. Explain fully the meaning of regression of one variable Y on another variable X. Discuss the method of least squares for fitting a linear regression of the form Y = a + bX.
1. c) Same
2. b) Dependence
3. a) Negative
4. a) Negative
5. c) Scatter Diagram
Objectives
1. To understand the probability theory
2. To know about various probability distributions applicable in different scenarios
Outcomes
1. To be able to connect business problems to probability theory and to utilize the knowledge of probability distributions
collection of estimates.”
W.I. King-
5.1.1 Introduction
Time series analysis systematically identifies and isolates different kinds of time-related patterns in the data. Four common patterns are horizontal, trend, seasonal and cyclic. The random component is superimposed on these patterns. There is a procedure for decomposing the time series into these patterns, which are then used for forecasting. However, a more accurate and statistically sound procedure is to identify the patterns in a time series using auto-correlations, as explained in the previous subsection. Auto-correlation is the correlation between the values of the same variable at different time lags.
When the time series represents completely random data, the auto-correlation for various time lags is close to zero, with values fluctuating on both the positive and negative sides. If the auto-correlation drops to zero slowly, and more than two or three auto-correlations differ significantly from zero, it indicates the presence of a trend in the data. The trend can be removed by taking the difference between consecutive values and constructing a new series. This is called differencing (a form of numerical differentiation).
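The two ideas in this introduction, auto-correlation at a given lag and removal of trend by differencing, can be sketched in Python. The series below is hypothetical (a linear trend with a small alternating component), and the function names `autocorrelation` and `difference` are illustrative only.

```python
def autocorrelation(series, lag):
    """Correlation between the series and itself shifted by `lag` periods."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((value - mean) ** 2 for value in series)
    numer = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numer / denom

def difference(series):
    """New series of differences between consecutive values."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Hypothetical series: upward trend of 3 per period plus an alternating +1 / -1.
trend_series = [3 * t + (1 if t % 2 else -1) for t in range(1, 21)]
differenced = difference(trend_series)

for lag in (1, 2, 3):
    print(lag, round(autocorrelation(trend_series, lag), 3))
print(differenced[:6])  # prints [1, 5, 1, 5, 1, 5]
```

For the trending series the auto-correlations stay well above zero for several lags, whereas the differenced series no longer climbs over time, which illustrates how differencing removes the trend.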
Definition
A time series is a collection of data obtained by observing a response variable at
periodic points in time. If repeated observations on a variable produce a time series, the
variable is called a time series variable. We use Yi to denote the value of the variable at
m
time i.
The analysis of time series implies its decomposition into various factors that affect
the value of its variable in a given period. It is a quantitative and objective evaluation of
the effects of various factors on the activity under consideration.
There are two main objectives of the analysis of any time series data:
(c
2. To make forecasts for the future. The study of past behaviour is essential because it provides us the knowledge of the effects of various forces. This can facilitate the process of anticipation of the future course of events and, thus, forecasting the value of the variable as well as planning for the future.
Secular trend or simply trend is the general tendency of the data to increase or
decrease or stagnate over a long period of time. Most of the business and economic
time series would reveal a tendency to increase or to decrease over a number of years.
For example, data regarding industrial production, agricultural production, population,
bank deposits, deficit financing, etc., show that, in general, these magnitudes have
been rising over a fairly long period. As opposed to this, a time series may also reveal
a declining trend, e.g., in the case of substitution of one commodity by another, the
demand of the substituted commodity would reveal a declining trend such as the
demand for cotton clothes, demand for coarse grains like bajra, jowar, etc. With
the improved medical facilities, the death rate is likely to show a declining trend, etc.
The change in trend, in either case, is attributable to the fundamental forces such as
changes in population, technology, composition of production, etc.
5.1.4 Time Series Analysis - Seasonal Component
Seasonal variations are cycles that occur over short periods of time, normally less than one year, e.g. monthly, weekly or daily. A time series, where the time interval between successive observations is less than or equal to one year, may have the effects of both the seasonal and cyclical variations. However, the seasonal variations are absent if the time interval between successive observations is greater than one year.
●● Climatic Conditions
●● Customs and Traditions
(i) Climatic Conditions: The changes in climatic conditions affect the value
of time series variable and the resulting changes are known as seasonal
variations. For example, the sale of woolen garments is generally at its
peak in the month of November and December because of the beginning
of winter season. Similarly, timely rainfall may increase agricultural output,
(ii) Customs and Traditions: The customs and traditions of the people
also give rise to the seasonal variations in time series. For example, the
purchase of clothing and ornaments may be highest during the marriage
season, sale of sweets during Diwali, etc., are variations which are the
results of customs and traditions of the people.
Cyclical variations are revealed by most of the economic and business time
series and, therefore, are also termed as trade or the business cycles. Any trade cycle
has four phases which are respectively known as boom, recession, depression and
recovery.
Various phases repeat themselves regularly one after another in the given sequence. The time interval between two identical phases is known as the period of cyclical variations. The period is always greater than one year. Normally, the period of cyclical variations lies between 3 and 10 years.
Objectives of Measuring Cyclical Variations
The main objectives of measuring cyclical variations are:
(i) To analyse the behaviour of cyclical variations in the past.
(ii) To predict the effect of cyclical variations so as to provide guidelines for future
business policies.
5.1.6 Time Series Analysis - Random Component
As the name suggests, these variations do not reveal any regular pattern of movement. These variations are caused by random factors such as strikes, fire, floods, war, famines, etc. Random variations are that component of a time series which cannot be explained in terms of any of the components discussed so far. This component is obtained as a residue after the elimination of trend, seasonal and cyclical components and hence is often termed the residual component. Random variations are usually short-term variations, but sometimes their effect may be so intense that the value of the trend may get permanently affected.
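$$\sum x_i^2 = \sum X_i^2 - n\bar{X}^2$$
$$\sum y_i^2 = \sum Y_i^2 - n\bar{Y}^2$$
$$\sum x_i y_i = \sum X_i Y_i - n\bar{X}\,\bar{Y}$$
$$b = \frac{\sum x_i y_i}{\sum x_i^2}, \qquad a = \bar{Y} - b\bar{X}$$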
These measures define a and b which will give the best possible fit through the original X and Y points and the value of r can then be worked out as under:
$$r = \frac{b\sqrt{\sum x_i^2}}{\sqrt{\sum y_i^2}}$$
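The deviation formulas for a, b and r can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the function name `fit_line` and the five data points are hypothetical and are not taken from the text.

```python
import math


def fit_line(X, Y):
    """Least-squares fit of Y = a + bX using sums of deviations from the means."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    # Sums of squares and products of deviations, computed from raw sums.
    sum_x2 = sum(x * x for x in X) - n * x_bar ** 2
    sum_y2 = sum(y * y for y in Y) - n * y_bar ** 2
    sum_xy = sum(x * y for x, y in zip(X, Y)) - n * x_bar * y_bar
    b = sum_xy / sum_x2
    a = y_bar - b * x_bar
    r = b * math.sqrt(sum_x2) / math.sqrt(sum_y2)
    return a, b, r


# Hypothetical observations (illustration only).
a, b, r = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"a = {a:.2f}, b = {b:.2f}, r = {r:.4f}")
```

For these made-up points the sums of deviations are 10, 6 and 6, so the fitted line is Y = 2.2 + 0.6X.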
Alternatively, for fitting a regression equation of the type Y = a + bX to the given values of X and Y variables, we can find the values of the two constants, viz., a and b, by using the following two normal equations:
$$\sum Y_i = na + b\sum X_i$$
$$\sum X_i Y_i = a\sum X_i + b\sum X_i^2$$
and then solving these equations for a and b. Once these values are obtained and have been put in the equation Y = a + bX, we say that we have fitted the regression equation of Y on X to the given data. In a similar fashion, we can develop the regression equation of X on Y, viz., X = a + bY, presuming Y as the independent variable and X as the dependent variable.
5.1.8 Method of Least Square Parabolic Trend
The mathematical form of a parabolic trend is given by Y_t = a + bt + ct² or Y = a + bt + ct² (dropping the subscript for convenience). Here a, b and c are constants to be determined from the given data. Using the method of least squares, the normal equations for the simultaneous solution of a, b and c are:
$$\sum Y = na + b\sum t + c\sum t^2$$
$$\sum tY = a\sum t + b\sum t^2 + c\sum t^3$$
$$\sum t^2 Y = a\sum t^2 + b\sum t^3 + c\sum t^4$$
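By selecting a suitable year of origin, i.e., defining X = t - origin such that ∑X = 0, the computation work can be considerably simplified. Also note that if ∑X = 0, then ∑X³ will also be equal to zero. Thus, the above equations can be rewritten as:
$$\sum Y = na + c\sum X^2 \qquad \ldots(i)$$
$$\sum XY = b\sum X^2 \qquad \ldots(ii)$$
$$\sum X^2 Y = a\sum X^2 + c\sum X^4 \qquad \ldots(iii)$$
From equation (ii), we get
$$b = \frac{\sum XY}{\sum X^2} \qquad \ldots(iv)$$
Further, from equation (i), we get
$$a = \frac{\sum Y - c\sum X^2}{n} \qquad \ldots(v)$$
And from equation (iii), after substituting for a, we get
$$c = \frac{n\sum X^2 Y - (\sum X^2)(\sum Y)}{n\sum X^4 - (\sum X^2)^2} \qquad \ldots(vi)$$
Thus, equations (iv), (v) and (vi) can be used to determine the values of the constants a, b and c.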
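The closed-form expressions for a, b and c lend themselves to a short numerical check. The following Python sketch is illustrative only: the function name `fit_parabolic_trend` and the data are assumptions, and the years are assumed to be equally spaced so that shifting the origin to their mean makes both ∑X and ∑X³ vanish.

```python
def fit_parabolic_trend(years, values):
    """Fit Y = a + bX + cX^2 with X = t - origin, assuming equally spaced years."""
    n = len(years)
    origin = sum(years) / n  # choosing the mean as origin gives sum(X) = 0
    X = [t - origin for t in years]
    sum_x2 = sum(x ** 2 for x in X)
    sum_x4 = sum(x ** 4 for x in X)
    sum_y = sum(values)
    sum_xy = sum(x * y for x, y in zip(X, values))
    sum_x2y = sum(x * x * y for x, y in zip(X, values))
    b = sum_xy / sum_x2
    c = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)
    a = (sum_y - c * sum_x2) / n
    return origin, a, b, c


# Hypothetical data generated from Y = 1 + 2X + 3X^2 around the year 2003.
origin, a, b, c = fit_parabolic_trend(
    [2001, 2002, 2003, 2004, 2005], [9, 2, 1, 6, 17]
)
print(origin, a, b, c)
```

Because the sample values were generated from an exact parabola, the fit recovers the generating constants, which makes the example easy to verify by hand.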
This method is based on the principle that the total effect of periodic variations at different points of time in its cycle gets completely neutralised, i.e., ∑S_t = 0 in one year and ∑C_t = 0 in the period of cyclical variations.
from overlapping groups of successive values of a time series. Each group includes all
the observations in a given time interval, termed as the period of moving average. The
next group is obtained by replacing the oldest value by the next value in the series. The
averages of such groups are known as the moving averages.
The moving average of a group is always shown at the centre of its period. The
process of computing moving averages smoothens out the fluctuations in the time
series data. It can be shown that if the trend is linear and the oscillatory variations are
regular, the moving average with period equal to the period of oscillatory variations
would completely eliminate them. Further, the effect of random variations would get
minimised because the average of a number of observations must lie between the
smallest and the largest observation. It should be noted here that the larger is the
period of moving average the more would be the reduction in the effect of random
component but more information is lost at the two ends of data. When the trend is non-
linear, the moving averages would give biased rather than the actual trend values.
Example: Determine the trend values of the following data by using 3-year moving
average. Also find short-term fluctuations for various years, assuming additive model.
Year        1991  1992  1993  1994  1995  1996  1997  1998  1999  2000
Production    26    27    28    30    29    27    30    31    32    31

Solution:
Year   Production (Y)   3-year moving total   3-year moving average (trend)   Short-term fluctuation (Y - trend)
1991        26                  -                        -                              -
1992        27                 81                      27.00                           0.00
1993        28                 85                      28.33                          -0.33
1994        30                 87                      29.00                           1.00
1995        29                 86                      28.67                           0.33
1996        27                 86                      28.67                          -1.67
1997        30                 88                      29.33                           0.67
1998        31                 93                      31.00                           0.00
1999        32                 94                      31.33                           0.67
2000        31                  -                        -                              -
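The 3-year moving average computation in this example can be reproduced with a short Python sketch. The function name `moving_average_trend` is an assumption made for illustration, and the sketch handles only odd periods, where the average can be placed directly at the centre of the group.

```python
def moving_average_trend(values, period=3):
    """Centred moving averages for an odd period, keyed by position in the series."""
    half = period // 2
    trend = {}
    for centre in range(half, len(values) - half):
        window = values[centre - half : centre + half + 1]
        trend[centre] = sum(window) / period
    return trend


years = list(range(1991, 2001))
production = [26, 27, 28, 30, 29, 27, 30, 31, 32, 31]

trend = moving_average_trend(production, period=3)
# Additive model: short-term fluctuation = observed value - trend value.
fluctuation = {i: production[i] - t for i, t in trend.items()}

for i in sorted(trend):
    print(years[i], production[i], round(trend[i], 2), round(fluctuation[i], 2))
```

No trend value is produced for 1991 and 2000, which illustrates the loss of information at the two ends of the data.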
Merits
1. This method is easy to understand and also easy to use as there are no mathematical
complexities involved.
2. It is an objective method.
3. It is a flexible method in the sense that if a few more observations are added, the entire calculations are not changed.
4. When the period of oscillatory movements is equal to the period of moving average, these movements are completely eliminated.
5. By the indirect use of this method, it is also possible to isolate seasonal, cyclical and random components.
5.1.11 Method of Least Square - Exponential Trend
The general form of an exponential trend is Y = a·b^t, where a and b are constants to be determined from the observed data. Taking logarithms of both sides, we have log Y = log a + t log b. This is a linear equation in log Y and t and can be fitted in a similar way as done in case of linear trend. Let A = log a and B = log b, then the above equation can be written as log Y = A + Bt. The normal equations are:
$$\sum \log Y = nA + B\sum t$$
$$\sum t\log Y = A\sum t + B\sum t^2$$
By selecting a suitable origin, i.e., defining X = t - origin, such that ∑X = 0, the computation work can be simplified. The values of A and B are given by
$$A = \frac{\sum \log Y}{n} \quad \text{and} \quad B = \frac{\sum X\log Y}{\sum X^2}$$
respectively. Thus, the fitted trend equation can be written as log Y = A + BX.
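The logarithmic transformation for the exponential trend can be sketched as follows. The function name `fit_exponential_trend` and the data are assumptions chosen for illustration; base-10 logarithms are used, and the origin is taken as the mean of the years so that ∑X = 0.

```python
import math


def fit_exponential_trend(years, values):
    """Fit Y = a * b**X (X = t - origin) by least squares on log10(Y)."""
    n = len(years)
    origin = sum(years) / n  # makes sum(X) = 0
    X = [t - origin for t in years]
    log_y = [math.log10(y) for y in values]
    A = sum(log_y) / n
    B = sum(x * ly for x, ly in zip(X, log_y)) / sum(x * x for x in X)
    # Convert back from logarithms: a = antilog(A), b = antilog(B).
    return origin, 10 ** A, 10 ** B


# Hypothetical data generated from Y = 5 * 2**X around the year 2003.
origin, a, b = fit_exponential_trend(
    [2001, 2002, 2003, 2004, 2005], [1.25, 2.5, 5, 10, 20]
)
print(origin, round(a, 4), round(b, 4))
```

Since the sample values double every year, the sketch recovers a growth factor b close to 2 and a level a close to 5 at the origin.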
5.2.1 Introduction
A probability is the quantitative measure of risk. Statistician I.J. Good suggests,
“The theory of probability is much older than the human species, since the assessment
of uncertainty incorporates the idea of learning from experience, which most creatures
do.”
used in theory of probability. Although these terms are commonly used in business, they
have precise technical meaning.
in outcomes under study is called experiment, for example, sampling from a
production lot. Random experiment is an experiment whose outcome is not
predictable in advance. There is a chance or risk (sometimes also called as
uncertainty) associated with each outcome.
●● Sample Space: It is a set of all possible outcomes of an experiment. It is usually
represented as S.
Example:
If the random experiment is rolling of a die, the sample space is a set, S = {1, 2, 3,
4, 5, 6}.
Similarly, if the random experiment is tossing of three coins, the sample space is, S
= {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT} with total of 8 possible outcomes. (H is
heads, and T is Tails showing up.)
If we select a random sample of 2 items from a production lot and check them for
defect, the sample space will be S = {DD, DS, DR, RS, RR, SS} where D stands for
defective, S stands for serviceable and R stands for re-workable.
and F. It is denoted as E ∪ F.
●● Intersection of events: If E and F are two events, then another event defined to
include all outcomes that are in both E and F is called as an intersection of events
E and F. It is denoted as E∩ F.
●● Mutually exclusive events: The events E and F are said to be mutually exclusive
events if they have no outcome of the experiment common to them. In other
words, events E and F are said to be mutually exclusive events if E∩ F = φ, where
all outcomes that are not in E. It is denoted as E^C. Thus, E ∩ E^C = φ and E ∪ E^C = S.
If one task can be done in n1 ways and other task can be done in n2 ways and if
these tasks cannot be done at the same time, then there are (n1 + n2) ways of doing
one of these tasks (either one task or the other). When logical OR is used in deciding
outcomes of the experiment and events are mutually exclusive then the ‘Sum Rule’ is
applicable.
The Addition rule of probability states that:
1. If ‘A’ and ‘B’ are any two events then the probability of the occurrence of either ‘A’ or ‘B’ is given by:
P (A ∪ B) = P (A) + P (B) – P (A ∩ B)
2. If ‘A’ and ‘B’ are two mutually exclusive events then the probability of occurrence of either A or B is given by
P (A ∪ B) = P (A) + P (B)
Example:
An urn contains 10 balls of which 5 are white, 3 black and 2 red. If we select one
ball randomly, how many ways are there that the ball is either white or red?
Solution:
Answer is 5 + 2 = 7.
Example:
In a triangular series the probability of the Indian team winning its match with Zimbabwe is 0.7 and that with Australia is 0.4. If the probability of India winning both matches is 0.3, what is the probability that India will win at least one match so that it can enter the final?
Solution:
Let A be the event that the Indian team wins the match with Zimbabwe and B the event that it wins the match with Australia. We are given that P (A) = 0.7, P (B) = 0.4 and P (A ∩ B) = 0.3.
Therefore, the probability that India will win at least one match is,
P (A ∪ B) = P (A) + P (B) – P (A ∩ B) = 0.7 + 0.4 – 0.3
= 0.8
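The addition rule used in this solution can be written as a small helper. This is a hedged sketch: the function name `prob_union` is hypothetical, and exact fractions are used only to avoid floating-point rounding in the illustration.

```python
from fractions import Fraction


def prob_union(p_a, p_b, p_both=Fraction(0)):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B).

    For mutually exclusive events, leave p_both at zero.
    """
    return p_a + p_b - p_both


# Cricket example: P(A) = 0.7, P(B) = 0.4, P(A and B) = 0.3.
p_at_least_one = prob_union(Fraction(7, 10), Fraction(4, 10), Fraction(3, 10))

# Urn example: white (5 of 10) and red (2 of 10) are mutually exclusive.
p_white_or_red = prob_union(Fraction(5, 10), Fraction(2, 10))

print(float(p_at_least_one), float(p_white_or_red))
```

The second call shows the special case for mutually exclusive events, where the intersection term drops out.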
having completed the first experiment the second experiment outcome can be in n2 ways, then similarly the outcome of the third experiment can be in n3 ways, and so on. Then there is a total of n1 × n2 × n3 × … × nr possible outcomes of the r experiments.
If ‘A’ and ‘B’ are two independent events then the probability of occurrence of ‘A’ and ‘B’ is given by:
P (A ∩ B) = P (A) P (B)
Remember that when the logical AND is used to indicate successive experiments, the ‘Product Rule’ is applicable.
Example:
How many outcomes are there if we toss a coin and then throw a die?
Answer is 2 × 6 = 12.
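The product rule can be confirmed by listing the outcomes explicitly. The sketch below uses Python's standard `itertools.product`; the variable names are chosen for illustration only.

```python
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Every (coin face, die face) pair: 2 x 6 outcomes.
coin_then_die = list(product(coin, die))

# Sample space for tossing three coins: 2 x 2 x 2 outcomes.
three_coins = ["".join(outcome) for outcome in product(coin, repeat=3)]

print(len(coin_then_die), len(three_coins))
print(three_coins)
```

Enumerating outcomes this way is practical only for small experiments, but it makes the counting argument behind the rule concrete.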
Example:
It has been found that 80% of all tourists who visit India visit Delhi, 70% of them
visit Mumbai and 60% of them visit both.
1. What is the probability that a tourist will visit at least one city?
2. Also, find the probability that he will visit neither city.
Solution:
Let D indicate visit to Delhi and M denote visit to Mumbai.
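With this notation, the two answers follow from the addition rule and the complement rule. The working below is a sketch based only on the figures given in the example, reading "60% of them visit both" as P(D ∩ M) = 0.6.

$$P(D \cup M) = P(D) + P(M) - P(D \cap M) = 0.8 + 0.7 - 0.6 = 0.9$$

$$P(\text{neither city}) = 1 - P(D \cup M) = 1 - 0.9 = 0.1$$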
of the event F happening. The probability that E occurs given that F has occurred is the conditional probability, denoted by P(E | F). If event F occurs, then our sample space is reduced to the event space of F. Also, for event E to occur now, both events E and F must occur simultaneously. Hence the probability that event E occurs, given that F has occurred, is P(E | F) = P(E ∩ F) / P(F), provided P(F) > 0.
Conditional probability satisfies all the properties and axioms of probabilities. Now
onwards, we would write (E ∩ F) as EF, which is a common convention.
Definition: Conditional probability is the probability that an event will occur given
that another event has already occurred. If A and B are two events, then the conditional
probability of A given B is written as P (A/B) and read as “the probability of A given that
B has already occurred.”
Example: The probability that a new product will be successful if a competitor does not launch a similar product is 0.67. The probability that a new product will be successful in the presence of a competitor’s new product is 0.42. The probability that the competitor will launch a new product is 0.35. What is the probability that the product will be a success?
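Solution: Let S denote that the product is successful, L denote that the competitor will launch a product and L^C denote that the competitor will not launch the product. Now, from given data,
P(S | L^C) = 0.67, P(S | L) = 0.42, P(L) = 0.35, so that P(L^C) = 1 – 0.35 = 0.65
Now, using the conditional probability formula, the probability that the product will be a success, P(S), is,
P(S) = P(S | L) P(L) + P(S | L^C) P(L^C) = 0.42 × 0.35 + 0.67 × 0.65 = 0.147 + 0.4355 = 0.5825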
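The probability of success in this example comes from weighting the two conditional probabilities by the chance of each scenario. A minimal Python sketch of that calculation follows; the function name `total_probability` is an assumption made for illustration.

```python
def total_probability(p_f, p_e_given_f, p_e_given_not_f):
    """P(E) = P(E|F) P(F) + P(E|not F) P(not F)."""
    return p_e_given_f * p_f + p_e_given_not_f * (1 - p_f)


# Figures from the example: P(L) = 0.35, P(S|L) = 0.42, P(S|not L) = 0.67.
p_success = total_probability(p_f=0.35, p_e_given_f=0.42, p_e_given_not_f=0.67)
print(round(p_success, 4))
```

The result lies between the two conditional probabilities, closer to 0.67 because the competitor is more likely not to launch.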
Consider two events, E and F. Whatever the events, we can always say that the probability of E is equal to the probability of the intersection of E and F, plus the probability of the intersection of E and F^C:
P (E) = P (E ∩ F) + P (E ∩ F^C)
Bayes’ Formula
we have by Axiom 3,
Suppose now that E has occurred and we are interested in determining the probability that Fi has occurred. Then, using the above equations, we have the following proposition:
$$P(F_i \mid E) = \frac{P(EF_i)}{P(E)} = \frac{P(E \mid F_i)\,P(F_i)}{\sum_{j} P(E \mid F_j)\,P(F_j)}$$
possible ‘hypotheses’ about proportionality of some subject matter, say market shares of competitors, then Bayes’ formula shows how these should be modified by the new evidence of the experiment, say a market survey.
Example:
A bin contains 3 different types of lamps. The probability that a type 1 lamp will give
over 100 hours of use is 0.7, with the corresponding probabilities for type 2 and 3 lamps
being 0.4 and 0.3 respectively. Suppose that 20 per cent of the lamps in the bin are of
type 1, 30 per cent are of type 2 and 50 per cent are of type 3. What is the probability
that a randomly selected lamp will last more than 100 hours?
Given that a selected lamp lasted more than 100 hours, what are the conditional
probabilities that it is of type 1, type 2 and type 3?
Solution:
Let type 1, type 2 and type 3 lamps be denoted by T1, T2 and T3 respectively.
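Also, we denote by S the event that a lamp lasts more than 100 hours and by S^C the event that it does not. Now, as per given data,
P(S|T1) = 0.7, P(S|T2) = 0.4, P(S|T3) = 0.3
P(T1) = 0.2, P(T2) = 0.3, P(T3) = 0.5
Hence the probability that a randomly selected lamp will last more than 100 hours is,
P(S) = P(S|T1) P(T1) + P(S|T2) P(T2) + P(S|T3) P(T3) = 0.14 + 0.12 + 0.15
= 0.41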
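The second part of the question, the conditional probabilities of each lamp type given that the lamp lasted more than 100 hours, follows from Bayes' formula. The sketch below is one way to carry out the arithmetic; the function name `bayes_posteriors` and the dictionary keys are assumptions made for illustration.

```python
def bayes_posteriors(priors, likelihoods):
    """Return P(evidence) and the posterior P(hypothesis | evidence) for each hypothesis."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # law of total probability
    posteriors = {h: joint[h] / evidence for h in joint}
    return evidence, posteriors


priors = {"T1": 0.2, "T2": 0.3, "T3": 0.5}        # share of each lamp type in the bin
likelihoods = {"T1": 0.7, "T2": 0.4, "T3": 0.3}   # P(S | type)

p_s, posteriors = bayes_posteriors(priors, likelihoods)
print(round(p_s, 2))
for lamp_type, p in posteriors.items():
    print(lamp_type, round(p, 4))
```

Although type 1 lamps make up only 20 per cent of the bin, their high survival rate raises their posterior share to roughly 34 per cent.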
function f(x), for all real values of x, having the property that for any set B of real numbers,
$$P(X \in B) = \int_{B} f(x)\,dx$$
The function f(x) is called the probability density function of the random variable X.
Again note that f(x) must satisfy axioms of probability.
5.2.8 Probability Distributions of Random Variables
In many practical situations, the random variable of interest follows a specific
pattern. Random variables are often classified according to the probability mass
function in case of discrete, and probability density function in case of continuous
random variable. When the distributions are known fully, all statistical calculations are
possible. In practice, however, the distributions may not be known fully. But we may be
able to approximate the random variable to one of the known types of standard random
variables by examining the processes that make it random.
can be calculated using known closed formulae. We will study some of the common
types of probability distributions. The normal distribution is the backbone of statistical
inference and hence we will study it in more detail.
There are broadly four theoretical distributions which are generally applied in practice. They are:
1. Bernoulli distribution
2. Binomial distribution
3. Poisson distribution
4. Normal distribution
Suppose we perform n independent Bernoulli trials, each of which results in a success with probability p and in a failure with probability (1 – p). If the random variable X represents the number of successes that occur in the n trials (the order of successes is not important), then X is said to be a Binomial random variable with parameters (n, p).
Note that a Bernoulli random variable is a Binomial random variable with parameters (1, p), i.e. n = 1. The probability mass function of a binomial random variable with parameters (n, p) is given by,
$$P(X = i) = \binom{n}{i} p^{i} (1-p)^{n-i}, \quad i = 0, 1, \ldots, n$$
Its expected value is,
μ = E[X] = np
●● Trials are finite (and not very large), performed repeatedly for ‘n’ times.
●● Each trial (random experiment) should be a Bernoulli trial, the one that results in
either success or failure.
●● Probability of success in any trial is ‘p’ and is constant for each trial.
●● All the trials are independent.
●● These trials are usually the experiments of selection ‘with replacement’. In cases where the population is very large, drawing a small sample from it does not change the probability of success significantly. Hence, we could still treat the distribution as a binomial distribution.
Following are some of the real life examples of applications of binomial distribution.
●● Number of female births out of n births in a hospital.
●● Number of correct answers in a multiple-choice test.
●● Number of seeds germinated in a row of n planted seeds.
●● Number of recaptured fish in a sample of n fish.
●● Number of missiles hitting the targets out of n fired.
Example:
Suppose that the probability that a light in a classroom will be burnt out is 1/3. The
classroom has in all five lights and it is unusable if the number of lights burning is less
than two. What is the probability that the class room is unusable on a random occasion?
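Solution:
This is a case of binomial distribution with n = 5 and p = 1/3. The class room is unusable if the number of burnouts is 4 or 5, that is, i = 4 or 5.
Noting that,
$$P(X = i) = \binom{n}{i} p^{i} (1-p)^{n-i}$$
the probability that the class room is unusable on a random occasion is,
$$P(X = 4) + P(X = 5) = \binom{5}{4}\left(\frac{1}{3}\right)^{4}\left(\frac{2}{3}\right)^{1} + \binom{5}{5}\left(\frac{1}{3}\right)^{5}\left(\frac{2}{3}\right)^{0} = \frac{10}{243} + \frac{1}{243} = \frac{11}{243} \approx 0.0453$$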
Example:
It is observed that 80% of T.V. viewers watch Aap ki Adalat programme. What is
the probability that at least 80% of the viewers in a random sample of 5 watch this
programme?
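Solution:
This is a case of binomial distribution with n = 5 and p = 0.8. At least 80% of a sample of 5 means at least 4 viewers, so the required probability is,
$$P(X \ge 4) = \binom{5}{4}(0.8)^{4}(0.2)^{1} + \binom{5}{5}(0.8)^{5}(0.2)^{0} = 0.4096 + 0.3277 = 0.7373$$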
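Both binomial examples above can be verified with a few lines of Python. This is a sketch using the standard-library function `math.comb`; the helper name `binomial_pmf` is an assumption made for illustration.

```python
from math import comb


def binomial_pmf(i, n, p):
    """P(X = i) for a binomial random variable with parameters (n, p)."""
    return comb(n, i) * p ** i * (1 - p) ** (n - i)


# Class room example: n = 5 lights, p = 1/3 burnt out, unusable if 4 or 5 are burnt out.
p_unusable = binomial_pmf(4, 5, 1 / 3) + binomial_pmf(5, 5, 1 / 3)

# T.V. viewers example: n = 5, p = 0.8, at least 4 of the 5 watch the programme.
p_at_least_four = binomial_pmf(4, 5, 0.8) + binomial_pmf(5, 5, 0.8)

print(round(p_unusable, 4), round(p_at_least_four, 4))
```

Summing the probability mass function over the required values of i is exactly what a cumulative binomial table does for a fixed n and p.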
that the binomial random variable falls within a specified range (e.g., is greater than or equal to a stated lower limit and less than or equal to a stated upper limit).
5.2.11 Poisson Distribution
A random variable X, taking one of the values 0, 1, 2, …, is said to be a Poisson random variable with parameter λ, if for some λ > 0,
$$P(X = i) = \frac{e^{-\lambda}\lambda^{i}}{i!}, \quad i = 0, 1, 2, \ldots$$
P(X = i) is a probability mass function (p.m.f.) of the Poisson random variable. Its
expected value and variance are,
μ = E[X] = λ
Var(X) = λ
The Poisson random variable has a wide range of applications. It can also be used as an approximation for a binomial random variable with parameters (n, p) if n is large and p is small enough to make the product np of moderate size. In this case, we call np = λ an average rate. Some of the common examples where a Poisson random variable can be used to define the probability distribution are:
1. Number of accidents per day on expressway.
2. Number of earthquakes occurring over fixed time span.
3. Number of misprints on a page.
4. Number of arrivals of calls on telephone exchange per minute.
Procedure for Using Cumulative Poisson Probabilities Table
Poisson p.m.f. for given λ and i can be easily calculated using scientific calculators. But while calculating cumulative probabilities, i.e., the ‘c.d.f.’, manual calculations become too tedious. In such cases, we can use the Cumulative Poisson Probabilities Table. The Cumulative Poisson Probabilities Table is referred to as follows:
Example:
Average number of accidents on express way is five per week. Find the probability
of exactly two accidents that would take place in a given week. Also find the probability
of at the most two accidents that will take place in next week.
Solution:
Method I
p (0) + p (1) = 10C0 (0.1)^0 (0.9)^10 + 10C1 (0.1)^1 (0.9)^9 = 0.7361. Or, using the Cumulative Binomial Probabilities Table, we can read for n = 10, p = 0.1 and i = 1 the cumulative probability as 0.7361.
Method II
parameter λ = 10 × 0.1 = 1 we get, P {X ≤ 1} = p (0) + p (1) = [e^–1 (1)^0] / 0! + [e^–1 (1)^1] / 1! = e^–1 + e^–1 = 0.7358. Or, using the Cumulative Poisson Probabilities Table.
Note: The Poisson distribution gives a reasonably good approximation.
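The comparison between the exact binomial probability and its Poisson approximation can be reproduced with the sketch below. The helper names are assumptions made for illustration; the last two lines also apply the Poisson formulas to the expressway example (λ = 5 accidents per week), purely as an arithmetic check.

```python
import math


def binomial_cdf(k, n, p):
    """P(X <= k) for a binomial random variable with parameters (n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def poisson_pmf(i, lam):
    """P(X = i) for a Poisson random variable with parameter lam."""
    return math.exp(-lam) * lam ** i / math.factorial(i)


def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson random variable with parameter lam."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


exact = binomial_cdf(1, 10, 0.1)       # Method I
approx = poisson_cdf(1, 10 * 0.1)      # Method II, with lambda = np = 1
print(round(exact, 4), round(approx, 4))

# Expressway example: exactly two accidents, and at most two accidents, in a week.
print(round(poisson_pmf(2, 5), 4), round(poisson_cdf(2, 5), 4))
```

The two methods differ only in the fourth decimal place here, even though n = 10 is not especially large.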
Example:
Average time for updating a passbook by a bank clerk is 15 seconds. Someone
arrives just ahead of you. Find the probability that you will have to wait for your turn,
Solution:
Now, λ = 60/15 = 4 passbooks per minute
= 1 - 0.1353
= 0.8647
If random variable is affected by many independent causes, and the effect of each
cause is not significantly large as compared to other effects, then the random variable
will closely follow the normal distribution, e.g., weights of coffee filled in packs, lengths
of nails manufactured on a machine, hardness of ball bearing surface, diameters of
shafts produced on lathe, effectiveness of training programme on the employees’
productivity, etc., are examples of normally distributed random variables.
Further, many sampling statistics, e.g., sample means X̄, are normally distributed. A random variable X is a normal random variable with parameters μ and σ if the probability density function (p.d.f.) of X is given by,
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$$
This distribution is a bell-shaped curve that is symmetric about μ. It gives a theoretical base to the observation that, in practice, many random phenomena obey, approximately, a normal probability distribution. The mean of a normal random variable is E(X) = μ and its variance is Var(X) = σ². If X is normally distributed with parameters μ and σ, then another random variable Y = aX + b is also normally distributed with parameters (aμ + b) and (|a|σ).
1. It is perfectly symmetric about the mean μ.
2. For a normal distribution mean = median = mode.
3. It is uni-modal (one mode), with skewness = 0 and excess kurtosis = 0.
4. Normal distribution is a limiting form of binomial distribution when the number of trials n is large, and neither the probability p nor (1 – p) is very small.
5. Normal distribution is a limiting case of Poisson distribution when mean μ = λ is very large.
6. While working on probability of normal distribution we usually use normal
distribution (more often standard normal distribution) tables.
While reading these tables, properties are:
(a) The probability that a normally distributed random variable with mean μ and
variance σ² lies between two specified values a and b is P (a < X < b) = area
tabulation also has a problem that we must have tables for every possible value of μ and σ² (which is not feasible). Hence, we transform the Normal Random Variable to another random variable known as the Standard Normal Random Variable. For this, we use the transformation,
$$z = \frac{x - \mu}{\sigma} = \frac{1}{\sigma}x - \frac{\mu}{\sigma}$$
z is a normally distributed random variable with parameters, μ = 0 and σ = 1. Any
normal random variable can be transformed to standard normal random variable z. We
This has been calculated for various values of ‘a’ and tabulated. Also, we know
that,
Amity Directorate of Distance & Online Education
Business Statistics 89
5.2.15 Standard Normal Distribution - Empirical Rule & App
A statement about normal distributions is the Empirical Rule. An abbreviated version of this, known as the 95 percent Rule, is used in your textbook since 95 percent is the most widely used interval. The 95 percent Rule states that, in a normal distribution, about 95 percent of observations fall within two standard deviations of the mean.
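The 95 percent Rule can be checked against the standard normal distribution directly. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the helper name `coverage_within` is an assumption made for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def coverage_within(k):
    """Probability that a normal observation lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(coverage_within(k), 4))
```

The value for k = 2 is slightly above 0.95, which is why the rule is stated as "about 95 percent".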
Example:
Tea is filled in packs of 200 gm by a machine with a variance of 0.25 gm². Packs weighing less than 200 gm would be rejected by customers and are not legally acceptable. Therefore, the marketing and legal departments request the production manager to set the machine to fill a slightly larger quantity in each pack. However, the finance department objects to this since it would lead to financial loss due to overfilling the packs. The general manager wants to know the 99% confidence interval, when the machine is set at 200 gm, so that he can take a decision. Find the confidence interval. What is your advice to the production manager?
Solution:
Let weight of the tea in a pack is a random variable X.
We know that the mean μ = 200 gm and variance σ² = 0.25 gm², i.e. σ = 0.5 gm.
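First, we find the value of z for 99% confidence. The Standard Normal Distribution curve is symmetric about the mean, so the area on each side of the mean is
= 0.99/2
= 0.495.
From the standard normal table, the z value corresponding to an area of 0.495 is approximately 2.575. Hence the 99% confidence interval is μ ± zσ = 200 ± 2.575 × 0.5, i.e.,
(198.71 to 201.29).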
Hence, we can advise the production manager to set his machine to fill tea with mean weight as 201.2875 or say 201.29. In that case we have 99% confidence of meeting the legal requirement and at the same time keep the cost of excess filling of the tea to a minimum.
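The interval in this example can be recomputed without tables. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library to obtain the z value; the variable names are assumptions made for illustration, and the software value of z (about 2.576) differs slightly from the tabulated 2.575.

```python
from statistics import NormalDist

mu = 200.0        # machine setting in gm
sigma = 0.5       # standard deviation in gm (variance 0.25)
confidence = 0.99

# Two-sided interval: leave (1 - confidence) / 2 in each tail.
z = NormalDist().inv_cdf(0.5 + confidence / 2)
lower = mu - z * sigma
upper = mu + z * sigma

print(round(z, 3), round(lower, 2), round(upper, 2))
```

Rounded to two decimals, the limits agree with the interval quoted in the solution.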
Key Takeaways
●● Time Series: A time series is a collection of data obtained by observing a
response variable at periodic points in time.
●● Standard Error of Estimate: Standard Error of Estimate is the measure of
mathematical model.
●● Random Component: Variations that do not reveal any regular pattern of the
movements
●● Probability: Probability of a given event is an expression of likelihood or chance of occurrence of an event. A probability is a number which ranges from zero to one.
●● Continuous Probability Distributions: Continuous random variables are those
that take on any value including fractions and decimals. Continuous random
variables give rise to continuous probability distributions. Continuous is the
opposite of discrete.
●● Event: One or more possible outcomes that belong to certain category of our
interest are called as event. A sub set E of the sample space S is an event. In
other words, an event is a favorable outcome.
●● Event space: It is a set of all possible events. It is usually represented as E. Note
that usually in probability and statistics; we are interested in number of elements in
sample space and number of elements in event space.
●● Random Experiment: In theory of probability, a process or activity that results
in outcomes under study is called experiment, for example, sampling from a
production lot.
●● Sample: A sample is that part of the universe which we select for the purpose of investigation. A sample exhibits the characteristics of the universe. The word sample literally means small universe.
●● Sampling: Sampling is defined as the selection of some part of an aggregate
1. In probability theories, events which can never occur together are classified as
a. Collectively exclusive events
b. Mutually exhaustive events
2. Value which is used to measure distance between mean and random variable x in terms of standard deviation is called
a. Z-value
b. Variance
c. Probability of x
d. Density function of x
3. ______ test is applied when samples are less than 30.
a. T
b. Z
c. Rank
d. None of these
4. Under non-random sampling method, samples are selected on the basis of
a. Stages
b. Strategy
c. Originality
d. Convenience
5. Probability of second event in situation if first event has occurred is classified as
a. Series probability
b. Conditional probability
c. Joint probability
d. Dependent probability
2. a) Z value
3. a) T test
4. d) Convenience
5. b) Conditional probability