QUANTITATIVE TECHNIQUES

Meaning and Definition: Quantitative techniques may be defined as those techniques which provide the decision maker with a systematic and powerful means of analysis, based on quantitative data. It is a scientific method employed for problem solving and decision making by the management. With the help of quantitative techniques, the decision maker is able to explore policies for attaining the predetermined objectives. In short, quantitative techniques are indispensable in the decision-making process.

Classification of Quantitative Techniques:
1. Mathematical Quantitative Techniques
2. Statistical Quantitative Techniques
3. Programming Quantitative Techniques

Mathematical Quantitative Techniques: A technique in which quantitative data are used along with the principles of mathematics is known as a mathematical quantitative technique. Mathematical quantitative techniques involve:
1. Permutations and Combinations: Permutation means arrangement of objects in a definite order. The number of arrangements depends upon the total number of objects and the number of objects taken at a time for arrangement. The number of permutations or arrangements is calculated by using the following formula:-
nPr = n!/(n-r)!
Combination means selection or grouping of objects without considering their order. The number of combinations is calculated by using the following formula:-
nCr = n!/(r!(n-r)!)
2. Set Theory:- Set theory is a modern mathematical device which solves various types of critical problems.
3. Matrix Algebra: A matrix is an orderly arrangement of certain given numbers or symbols in rows and columns. It is a mathematical device for finding out the results of different types of algebraic operations on the basis of the relevant matrices.
4. Determinants: It is a powerful device developed over the matrix algebra. This device is used for finding out values of different variables connected with a number of simultaneous equations.
5. Differentiation: It is a mathematical process of finding out changes in the dependent variable with reference to a small change in the independent variable.
6. Integration: Integration is the reverse process of differentiation.
7. Differential Equation: It is a mathematical equation which involves the differential coefficients of the dependent variables.

Statistical Quantitative Techniques: Statistical techniques are those techniques which are used in conducting a statistical enquiry concerning a certain phenomenon. They include all the statistical methods beginning from the collection of data till the interpretation of those collected data. Statistical techniques involve:
1. Collection of data
2. Measures of central tendency, dispersion, skewness and kurtosis
3. Correlation and Regression Analysis
4. Index Numbers
5. Time Series Analysis
6. Interpolation and Extrapolation
7. Statistical Quality Control
8. Ratio Analysis
9. Probability Theory
10. Testing of Hypothesis

Programming Techniques:
1. Linear Programming: Linear programming technique is used in finding a solution for optimizing a given objective under certain constraints.
2. Queuing Theory: Queuing theory deals with the mathematical study of queues. It aims at minimizing the cost of both servicing and waiting.
3. Game Theory: Game theory is used to determine the optimum strategy in a competitive situation.
4. Decision Theory: This is concerned with making sound decisions under conditions of certainty, risk and uncertainty.
5. Inventory Theory: Inventory theory helps in optimizing the inventory levels. It focuses on minimizing the cost associated with holding of inventories.
6. Network Programming: It is a technique of planning, scheduling, controlling, monitoring and co-ordinating large and complex projects comprising a number of activities and events. It serves as an instrument in resource allocation and adjustment of time and cost up to the optimum level. It includes CPM, PERT etc.
7. Simulation: It is a technique of testing a model which resembles a real life situation.
8. Replacement Theory: It is concerned with the problems of replacement of machines etc. due to their deteriorating efficiency or breakdown. It helps to determine the most economic replacement policy.
9. Non-Linear Programming: It is a programming technique which involves finding an optimum solution to a problem in which some or all variables are non-linear.
10. Sequencing: The sequencing tool is used to determine a sequence in which given jobs should be performed by minimizing the total efforts.
11. Quadratic Programming: Quadratic programming technique is designed to solve certain problems, the objective function of which takes the form of a quadratic equation.
12. Branch and Bound Technique: It is a recently developed technique. It is designed to solve the combinatorial problems of decision making where there are a large number of feasible solutions. Problems of plant location, problems of determining minimum cost of production etc. are examples of combinatorial problems.

Functions of Quantitative Techniques:
1. To facilitate the decision-making process
2. To provide tools for scientific research
3. To help in choosing an optimal strategy
4. To enable proper deployment of resources
5. To help in minimizing costs
6. To help in minimizing the total processing time required for performing a set of jobs

USES OF QUANTITATIVE TECHNIQUES
Business and Industry
1. The quantitative technique of linear programming is used for optimal allocation of scarce resources in the problem of determining product mix.
2. Inventory control techniques are useful in deciding when and how many items are to be purchased so as to maintain a balance between the cost of holding and the cost of ordering the inventory.
3. The quantitative techniques of CPM and PERT help in determining the earliest and the latest times for the events and activities of a project. This helps the management in proper deployment of resources.
4. Decision tree analysis and simulation technique help the management in taking the best possible course of action under the conditions of risk and uncertainty.
5. Queuing theory is used to minimize the cost of waiting and servicing of the customers in queues.
6. Replacement theory helps the management in determining the most economic replacement policy regarding replacement of an equipment.

Limitations of Quantitative Techniques:
1. Quantitative techniques involve mathematical models, equations and other mathematical expressions.
2. Quantitative techniques are based on a number of assumptions. Therefore, due care must be taken while using quantitative techniques, otherwise they will lead to wrong conclusions.
3. Quantitative techniques are very expensive.
4. Quantitative techniques do not take into consideration intangible factors like skill, attitude etc.
5. Quantitative techniques are only tools for analysis and decision-making. They are not decisions in themselves.

CORRELATION ANALYSIS
Definition: Two or more variables are said to be correlated if the change in one variable results in a corresponding change in the other variable. According to Simpson and Kafka, "Correlation analysis deals with the association between two or more variables". Ya-Lun Chou defines, "Correlation analysis attempts to determine the degree of relationship between variables". Boddington states that "Whenever some definite connection exists between two or more groups or classes of series of data, there is said to be correlation." In a nutshell, correlation analysis is an analysis which helps to determine the degree of relationship that exists between two or more variables.

Correlation Coefficient: Correlation analysis is actually an attempt to find a numerical value to express the extent of relationship that exists between two or more variables. The numerical measurement showing the degree of correlation between two or more variables is called the correlation coefficient. The correlation coefficient ranges between -1 and +1.

SIGNIFICANCE OF CORRELATION ANALYSIS
1. Correlation analysis helps us to find a single figure to measure the degree of relationship that exists between the variables.
2. Correlation analysis helps to understand economic behavior.
3. Correlation analysis enables business executives to estimate cost, price and other variables.
4. Correlation analysis can be used as a basis for the study of regression. Once we know that two variables are closely related, we can estimate the value of one variable if the value of the other is known.
5. Correlation analysis helps to reduce the range of uncertainty associated with decision making. The prediction based on correlation analysis is always near to reality.
6. It helps to know whether the correlation is significant or not. This is possible by comparing the correlation coefficient with 6 PE. If 'r' is more than 6 PE, the correlation is significant.

Classification of Correlation
1. Positive and Negative correlation
2. Simple, partial and multiple correlation
3. Linear and Non-linear correlation

Positive and Negative Correlation
Positive Correlation: When the variables are varying in the same direction, it is called positive correlation. In other words, if an increase in the value of one variable is accompanied by an increase in the value of the other variable, or if a decrease in the value of one variable is accompanied by a decrease in the value of the other variable, it is called positive correlation. Eg:
1) A: 10 20 30 40 50
   B: 80 100 150 170 200
2) X: 78 60 52 46 38
   Y: 20 18 14 10 5
Negative Correlation: When the variables are moving in opposite directions, it is called negative correlation. In other words, if an increase in the value of one variable is accompanied by a decrease in the value of the other variable, or if a decrease in the value of one variable is accompanied by an increase in the value of the other variable, it is called negative correlation. Eg:
1) A: 5 10 15 20 25
   B: 16 10 8 6 2
2) X: 40 32 25 20 10
   Y: 2 3 5 8 12

Simple, Partial and Multiple correlation
Simple Correlation: In a correlation analysis, if only two variables are studied, it is called simple correlation. Eg. the study of the relationship between price and demand of a product, or price and supply of a product, is a problem of simple correlation.
Multiple Correlation: In a correlation analysis, if three or more variables are studied simultaneously, it is called multiple correlation. For example, when we study the relationship between the yield of rice and both rainfall and fertilizer together, it is a problem of multiple correlation.
Partial Correlation: In a correlation analysis, we recognize more than two variables, but consider one dependent variable and one independent variable, keeping the other independent variables constant. For example, the yield of rice is influenced by the amount of rainfall and the amount of fertilizer used. But if we study the correlation between the yield of rice and the amount of rainfall by keeping the amount of fertilizer used constant, it is a problem of partial correlation.

Linear and Non-linear correlation
Linear Correlation: In a correlation analysis, if the ratio of change between the two sets of variables is the same, then it is called linear correlation. For example, when a 10% increase in one variable is accompanied by a 10% increase in the other variable, it is a problem of linear correlation.
X: 10 15 30 60
Y: 50 75 150 300
Here the ratio of change between X and Y is the same. When we plot the data on graph paper, all the plotted points would fall on a straight line.
Non-linear Correlation: In a correlation analysis, if the amount of change in one variable does not bring the same ratio of change in the other variable, it is called non-linear correlation.
X: 2 4 6 10 15
Y: 8 10 18 22 26
Here the change in the value of X does not bring the same proportionate change in the value of Y. This is a problem of non-linear correlation; when we plot the data on graph paper, the plotted points would not fall on a straight line.

Degrees of correlation:
1. Perfect Positive Correlation: If an increase in the value of one variable is followed by the same proportion of increase in the other related variable, or if a decrease in the value of one variable is followed by the same proportion of decrease in the other related variable, it is perfect positive correlation. Eg: if a 10% rise in the price of a commodity results in a 10% rise in its supply, the correlation is perfectly positive. Similarly, if a 5% fall in price results in a 5% fall in supply, the correlation is perfectly positive.
2. Perfect Negative Correlation: If an increase in the value of one variable is followed by the same proportion of decrease in the other related variable, or if a decrease in the value of one variable is followed by the same proportion of increase in the other related variable, it is perfect negative correlation. For example, if a 10% rise in price results in a 10% fall in its demand, the correlation is perfectly negative. Similarly, if a 5% fall in price results in a 5% increase in demand, the correlation is perfectly negative.
3. Limited Degree of Positive Correlation: When an increase in the value of one variable is followed by a non-proportional increase in the other related variable, or when a decrease in the value of one variable is followed by a non-proportional decrease in the other related variable, it is called limited degree of positive correlation. For example, if a 10% rise in the price of a commodity results in a 5% rise in its supply, it is limited degree of positive correlation. Similarly, if a 10% fall in the price of a commodity results in a 5% fall in its supply, it is limited degree of positive correlation.
4. Limited Degree of Negative Correlation: When an increase in the value of one variable is followed by a non-proportional decrease in the other related variable, or when a decrease in the value of one variable is followed by a non-proportional increase in the other related variable, it is called limited degree of negative correlation. For example, if a 10% rise in price results in a 5% fall in its demand, it is limited degree of negative correlation. Similarly, if a 5% fall in price results in a 10% increase in demand, it is limited degree of negative correlation.
5. Zero Correlation (Zero Degree Correlation): If there is no correlation between variables, it is called zero correlation. In other words, if the values of one variable cannot be associated with the values of the other variable, it is zero correlation.

Methods of measuring correlation
I Graphic Methods: 1) Scatter Diagram 2) Correlation graph
II Algebraic Methods: 1) Karl Pearson's Coefficient of correlation 2) Spearman's Rank correlation method 3) Concurrent deviation method

Scatter Diagram: This is the simplest method for ascertaining the correlation between variables. Under this method all the values of the two variables are plotted in a chart in the form of dots. Therefore, it is also known as a dot chart. By observing the scatter of the various dots, we can form an idea of whether the variables are related or not. A scatter diagram indicates the direction of correlation and tells us how closely the two variables under study are related. The greater the scatter of the dots, the lower is the relationship.
Merits of Scatter Diagram method
1. It is a simple method of studying correlation between variables.
2. It is a non-mathematical method of studying correlation between the variables. It does not require any mathematical calculations.
3. It is very easy to understand. It gives an idea about the correlation between variables even to a layman.
4. It is not influenced by the size of extreme items.
5. Making a scatter diagram is, usually, the first step in investigating the relationship between two variables.
Demerits of Scatter Diagram method
1. It gives only a rough idea about the correlation between variables.
2. The numerical measurement of the correlation coefficient cannot be calculated under this method.
3. It is not possible to establish the exact degree of relationship between the variables.

Correlation Graph Method: Under the correlation graph method the individual values of the two variables are plotted on graph paper. Then the dots relating to these variables are joined separately so as to get two curves. By examining the direction and closeness of the two curves, we can infer whether the variables are related or not. If both the curves are moving in the same direction (either upward or downward), correlation is said to be positive. If the curves are moving in opposite directions, correlation is said to be negative.
Merits of Correlation Graph Method
1. This is a simple method of studying the relationship between the variables.
2. This does not require mathematical calculations.
3. This method is very easy to understand.
Demerits of Correlation Graph Method:
1. A numerical value of correlation cannot be calculated.
2. It is only a pictorial presentation of the relationship between variables.
3. It is not possible to establish the exact degree of relationship between the variables.

Karl Pearson's Coefficient of Correlation: Karl Pearson's Coefficient of Correlation is the most popular method among the algebraic methods for measuring correlation. This method was developed by Prof. Karl Pearson in 1896. It is also called the product moment correlation coefficient. Pearson's coefficient of correlation is defined as the ratio of the covariance between X and Y to the product of their standard deviations. This is denoted by 'r' or rxy.
r = (Covariance of X and Y) / ((SD of X) x (SD of Y))

Interpretation of Coefficient of Correlation
Pearson's coefficient of correlation always lies between +1 and -1. The following general rules will help to interpret the coefficient of correlation:
1. When r = +1, it means there is perfect positive relationship between variables.
2. When r = -1, it means there is perfect negative relationship between variables.
3. When r = 0, it means there is no relationship between the variables.
4. When 'r' is closer to +1, it means there is a high degree of positive correlation between variables.
5. When 'r' is closer to -1, it means there is a high degree of negative correlation between variables.
6. When 'r' is closer to 0, it means there is less relationship between variables.

Properties of Pearson's Coefficient of Correlation
1. If there is correlation between variables, the coefficient of correlation lies between +1 and -1.
2. If there is no correlation, the coefficient of correlation is denoted by zero (ie r = 0).
3. It measures the degree and direction of change.
4. It simply measures the correlation and does not help to predict causation.
5. It is the geometric mean of the two regression coefficients.
r = √(bxy × byx)

Computation of Pearson's Coefficient of Correlation:
Pearson's correlation coefficient can be computed in different ways. They are:
a. Arithmetic mean method
b. Assumed mean method
c. Direct method

Arithmetic mean method:-
Under the arithmetic mean method, the coefficient of correlation is calculated by taking the actual mean.
r = Σxy / (N σx σy)
or, equivalently,
r = Σxy / √(Σx² × Σy²), where x = X − X̄ and y = Y − Ȳ

Assumed mean method:
r = [N Σdxdy − (Σdx)(Σdy)] / [√(N Σdx² − (Σdx)²) × √(N Σdy² − (Σdy)²)]
where dx and dy are the deviations of X and Y from their assumed means.

Direct Method:
Under the direct method, the coefficient of correlation is calculated without taking the actual mean or an assumed mean.
r = [N ΣXY − (ΣX)(ΣY)] / [√(N ΣX² − (ΣX)²) × √(N ΣY² − (ΣY)²)]

Probable Error and Coefficient of Correlation
Probable error (PE) of the coefficient of correlation is a statistical device which measures the reliability and dependability of the value of the coefficient of correlation.
Probable Error = 0.6745 × (1 − r²)/√N
OR
Probable Error = 0.6745 × standard error
Standard Error (SE) = (1 − r²)/√N
If the value of the coefficient of correlation (r) is less than the PE, then there is no evidence of correlation. If the value of 'r' is more than 6 times the PE, the correlation is certain and significant. By adding PE to and subtracting PE from the coefficient of correlation, we can find the upper and lower limits within which the population coefficient of correlation may be expected to lie.
Uses of PE:
1) PE is used to determine the limits within which the population coefficient of correlation may be expected to lie.
2) It can be used to test whether the value of the correlation coefficient of a sample is significant with that of the population.

Coefficient of Determination
One very convenient and useful way of interpreting the value of the coefficient of correlation is the use of the square of the coefficient of correlation. The square of the coefficient of correlation is called the coefficient of determination.
Coefficient of determination = r²
Coefficient of non-determination = 1 − r² = 1 − coefficient of determination

Merits of Pearson's Coefficient of Correlation:-
1. This is the most widely used algebraic method to measure the coefficient of correlation.
2. It gives a numerical value to express the relationship between variables.
3. It gives both direction and degree of relationship between variables.
4. It can be used for further algebraic treatment such as coefficient of determination, coefficient of non-determination etc.
5. It gives a single figure to explain the accurate degree of correlation between two variables.
Demerits of Pearson's Coefficient of Correlation
1. It is very difficult to compute the value of the coefficient of correlation.
2. It is very difficult to understand.
3. It requires complicated mathematical calculations.
4. It takes more time.
5. It is unduly affected by extreme items.
6. It assumes a linear relationship between the variables. But in real life situations, it may not be so.

Spearman's Rank Correlation Method
Pearson's coefficient of correlation method is applicable when variables are measured in quantitative form. But there are many cases where measurement is not possible because of the qualitative nature of the variable. For example, we cannot measure beauty, morality, intelligence, honesty etc. in quantitative terms. However, it is possible to rank these qualitative characteristics in some order. The correlation coefficient obtained from ranks of the variables instead of their quantitative measurement is called rank correlation. This was developed by Charles Edward Spearman in 1904.
Spearman's coefficient of correlation (R) = 1 − 6ΣD² / (N³ − N)
Where D = difference of ranks between the two variables
N = number of pairs

Computation of Rank Correlation Coefficient when Ranks are Equal
There may be chances of obtaining the same rank for two or more items. In such a situation, it is required to give the average rank to all such items. For example, if two observations got 4th rank, each of those observations should be given the rank 4.5 (ie. (4 + 5)/2 = 4.5).
When there are equal ranks, we have to apply the following formula to compute the rank correlation coefficient:-
R = 1 − 6[ΣD² + (1/12)(m³ − m) + (1/12)(m³ − m) + ...] / (N³ − N)
D - Difference of rank in the two series
N - Total number of pairs
m - Number of times each rank repeats

Merits of Rank Correlation method
1. It is very simple to understand the method.
2. It can be applied to any type of data, ie quantitative and qualitative.
3. It is the only way of studying correlation between qualitative data such as honesty, beauty etc.
4. As the sum of rank differences of the two qualitative data is always equal to zero, this method facilitates a cross check on the calculation.
Demerits of Rank Correlation method
1. The rank correlation coefficient is only an approximate measure as the actual values are not used for calculations.
2. It is not convenient when the number of pairs (ie. N) is large.
3. Further algebraic treatment is not possible.

Concurrent Deviation Method: The concurrent deviation method is a very simple method of measuring correlation. Under this method, we consider only the directions of deviations. The magnitudes of the values are completely ignored. Therefore, this method is useful when we are interested in studying correlation between two variables in a casual manner and not interested in degree (or precision). Under this method, the nature of correlation is known from the direction of deviation in the values of variables. If the deviations of 2 variables are concurrent, then they move in the same direction, otherwise in the opposite direction.
The formula for computing the coefficient of concurrent deviation is:-
r = ±√(±(2C − N)/N)
N = No. of pairs of symbols
C = No. of concurrent deviations (ie, No. of + signs in the 'dxdy' column)
The sign inside and outside the root is taken as the sign of (2C − N).
Steps:
1. Every value of the 'X' series is compared with its preceding value. An increase is shown by the '+' symbol and a decrease is shown by '-'. This gives 'dx'.
2. The above step is repeated for the 'Y' series and we get 'dy'.
3. Multiply 'dx' by 'dy' and the product is shown in the next column. The column heading is 'dxdy'.
4. Take the total number of '+' signs in the 'dxdy' column. The '+' signs in the 'dxdy' column denote the concurrent deviations, and their number is indicated by 'C'.
5. Apply the formula.
Merits of concurrent deviation method:
1. It is very easy to calculate the coefficient of correlation.
2. It is very simple to understand the method.
3. When the number of items is very large, this method may be used to form a quick idea about the degree of relationship.
4. This method is more suitable when we want to know the type of correlation (ie, whether positive or negative).
Demerits of concurrent deviation method:
1. This method ignores the magnitude of changes, ie. equal weight is given to small and big changes.
2. The result obtained by this method is only a rough indicator of the presence or absence of correlation.
3. Further algebraic treatment is not possible.
4. A combined coefficient of concurrent deviation of different series cannot be found as in the case of arithmetic mean and standard deviation.

REGRESSION ANALYSIS
Definition:
"Regression is the measure of the average relationship between two or more variables in terms of the original units of the data". "Regression analysis is an attempt to establish the nature of the relationship between variables - that is, to study the functional relationship between the variables and thereby provide a mechanism for prediction or forecasting". It is clear from the above definitions that Regression Analysis is a statistical device with the help of which we are able to estimate the unknown values of one variable from known values of another variable. The variable which is used to predict another variable is called the independent variable (explanatory variable), and the variable we are trying to predict is called the dependent variable (explained variable). The dependent variable is denoted by Y and the independent variable is denoted by X. The analysis used in regression is called simple linear regression analysis. It is called simple because there is only one predictor (independent variable). It is called linear because it is assumed that there is a linear relationship between the independent variable and the dependent variable.

Types of Regression:-
There are two types of regression. They are linear regression and multiple regression.
Linear Regression:
It is a type of regression which uses one independent variable to explain and/or predict the dependent variable.
Multiple Regression:
It is a type of regression which uses two or more independent variables to explain and/or predict the dependent variable.

Regression Lines:
A regression line is a graphic technique to show the functional relationship between the two variables X and Y. It is a line which shows the average relationship between two variables X and Y. If there is perfect positive correlation between 2 variables, then the two regression lines coincide with each other and give one line. There would be two regression lines when there is no perfect correlation between the two variables. The nearer the two regression lines are to each other, the higher is the degree of correlation, and the farther the regression lines are from each other, the lesser is the degree of correlation.
Properties of Regression lines:-
1. The two regression lines cut each other at the An event whose occurrence is inevitable is called (a) Addition theorem (Mutually Exclusive 4. Mean of the Binomial distribution increases as ‘n’ is valid or not. The main objective of hypothesis
point of average of X and average of Y sure even. Eg:- Getting a white ball from a box Events) increases with ‘p’ remaining constant. testing is whether to accept or reject the hypothesis.
2. When r = 1, the two regression lines coincide each containing all while balls. If two events, ‘A’ and ‘B’, are mutually exclusive 5. The mean of Binomial distribution is np. Procedure for Testing of Hypothesis:
other and give one line. Impossible Events the probability of the occurrence of either ‘A’ or ‘B’ 6. The Standard deviation of Binomial distribution is 1. Set Up a Hypothesis: The first step in testing of
3. When r = 0, the two regression lines are mutually An event whose occurrence is impossible, is called is the sum of the individual probability of A and B. ���� hypothesis is to set p a hypothesis about population
perpendicular. impossible event. Eg:- Getting a white ball from a P(A or B) = P(A) + P(B) 7. If ‘n’ is large and if neither ‘p’ nor ‘q’ is too close parameter. Normally, the researcher has to fix two
Regression Equations (Estimating Equations) box containing all red balls. i.e., P(A B) = P(A) + P(B) zero, Binomial distribution may be approximated to types of hypothesis. They are null hypothesis and
Regression equations are algebraic expressions of Uncertain Events (b)Addition theorem (Not mutually exclusive Normal Distribution. alternative hypothesis.
the regression lines. Since there are two regression An event whose occurrence is neither sure nor events) 8. If two independent random variables follow Null Hypothesis:- Null hypothesis is the original
lines, therefore two regression equations. They are :- impossible is called uncertain event. Eg:- Getting a If two events, A and B are not mutually exclusive Binomial distribution, their sum also follows hypothesis. It states that there is no significant
1. Regression Equation of X on Y:- This is used to white ball from a box containing white balls and the probability of the occurrence of Binomial distribution. difference between the sample and population
describe the variations in the values of X for given black balls. either A or B is the sum of their individual Fitting a Binomial Distribution regarding a particular matter under consideration.
changes in Y. Equally likely Events probability minus probability for both to happen. Steps: The word “null” means ‘invalid’ of ‘void’ or
2. Regression Equation of Y on X :- This is used to Two events are said to be equally likely if anyone of P(A or B) = P(A) + P(B) – P(A and B) 1. Find the value of n, p and q ‘amounting to nothing’. Null hypothesis is denoted
describe the variations in the value of Y for given them cannot be expected to occur in preference to i.e., P(A B) = P(A) + P(B) – P(A∩B) 2. Substitute the values of n, p and q in the Binomial by Ho. For example, suppose we want to test
changes in X. other. For example, getting herd and getting tail MULTIPLICATION THEOREM Distribution function of nC r prqn-r whether a medicine is effective in curing cancer.
Least Square Method of computing Regression when a coin is tossed are equally likely events. (a)Multiplication theorem (independent events) 3. Put r = 0, 1, 2, ……….. in the function nC r prqn- Hence, the null hypothesis will be stated as follows:-
Equation: Mutually exclusive events If two events are independent, then the probability of r H0: The medicine is not effective in curing cancer
The method of least square is an objective method of A set of events are said to be mutually exclusive of occurring both will be the product of 4. Multiply each such terms by total frequency (N) (i.e., there is no significant difference between the
determining the best relationship between the two the occurrence of one of them excludes the the individual probability to obtain the expected frequency. given medicine and other medicines in curing cancer
variables constituting a bivariate data. To find out possibiligy of the occurrence of the others. P(A and B) = P(A).P(B) POISSON DISTRIBUTION disease.)
best relationship means to determine the values of Exhaustive Events: i.e., P(A B) = P(A).P(B) Meaning and Definition: Alternative Hypothesis:-
the constants involved in the functional relationship A group of events is said to be exhaustive when it (b)Multiplication theorem (dependent Events):- Poisson Distribution is a limiting form of Binomial Any hypothesis other than null hypothesis is called
between the two variables. This can be done by the includes all possible outcomes of the random If two events, A and B are dependent, the probability Distribution. In Binomial Distribution, the total alternative hypothesis. When a null hypothesis is
principle of least squares: The principle of least experiment under consideration. of the second event occurring will be affected by the number of trials is known in advance. But in certain rejected, we accept the other hypothesis, known as
squares says that the sum of the squares of the Dependent Events: outcome of the first. real life situations, it may be impossible to count the alternative hypothesis. Alternative hypothesis is
deviations between the observed values and Two or more events are said to be dependent if the P(A∩B) = P(A).P(B/A) total number of times a particular event occurs or denoted by H1. In the above example, the alternative
estimated values should be the least. In other words, happening of one of them affects the happening of Inverse Probability does not occur. In such cases Poisson Distribution is hypothesis may be stated as follows:-
Σ(y – yc)² will be the minimum. With a little algebra the other. If an event has happened as a result of several more suitable. Poisson Distribution is a discrete H1: The medicine is effective in curing cancer. (i.e.,
and differential calculus we can develop some PERMUTATIONS causes, then we may be interested to find out the probability distribution. It was originated by Simeon there is significant difference between the given
equations (2 equations in case of a linear Permutation means arrangement of objects in a probability of a particular cause of happening that Denis Poisson. The Poisson Distribution is defined medicine and other medicines in curing cancer
relationship) called normal equations. By solving definite order. The number of arrangements events. This type of problem is called inverse as:- disease.)
these normal equations, we can find out the best (permutations) depends upon the total number of probability. Bayes’ theorem is based upon inverse 2. Set up a suitable level of significance: After
values of the constants. objects and the number of objects taken at a time for probability. p(r) = $\frac{e^{-m} m^r}{r!}$ setting up the hypothesis, the researcher has to set up
Regression Equation of Y on X:- arrangement. The number of permutations is $^nP_r = \frac{n!}{(n-r)!}$ BAYES’ THEOREM: r = random variable (i.e., number of successes in ‘n’ a suitable level of significance. The level of
Y = a + bX Bayes’ theorem is based on the proposition that trials) significance is the probability with which we may
The normal equations to compute ‘a’ and ‘b’ are: - probabilities should be revised on the basis of all the e = 2.7183 reject a null hypothesis when it is true. For example,
Σy = Na + bΣx ! = Factorial available information. The revision of probabilities m = mean of Poisson distribution if level of significance is 5%, it means that in the
Σxy = aΣx + bΣx² n = Total number of objects based on available information will help to reduce Properties of Poisson Distribution long run, the researcher is rejecting true null
Regression Equation of X on Y:- r = Number of objects taken at a time for the risk involved in decision-making. The 1. Poisson Distribution is a discrete probability hypothesis 5 times out of every 100 times. Level of
X = a + bY arrangement probabilities before revision are called prior distribution. significance is denoted by α (alpha).
The normal equations to compute ‘a’ and ‘b’ are:- If all the objects are taken at a time for probabilities and the probabilities after revision are 2. Poisson Distribution has a single parameter ‘m’. α = Probability of rejecting H0 when it is true.
Σx = Na + bΣy arrangement, then number of permutations is called posterior probabilities. According to Bayes’ When ‘m’ is known all the terms can be found out. Generally, the level of significance is fixed at 1% or
Σxy = aΣy + bΣy² calculated by using the formula : theorem, the posterior probability of event (A) for a 3. It is a positively skewed distribution. 5%.
Regression Coefficient method of computing n particular result of an investigation (B) may be 4. Mean and Variance of Poisson Distribution are 3. Decide a test criterion: The third step in testing
Regression Equations: found from the following formula:- equal to ‘m’. of hypothesis is to select an appropriate test
Regression equations can also be computed by the = , 5. In Poisson Distribution, the number of success is criterion. Commonly used tests are z-test, t-test, X2
use of regression coefficients.
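As a rough illustration of the least squares method, the two normal equations for the regression of Y on X (Σy = Na + bΣx and Σxy = aΣx + bΣx²) can be solved directly for ‘a’ and ‘b’. The sketch below is only an illustration: the function name `fit_y_on_x` and the sample data are hypothetical and do not come from these notes.

```python
def fit_y_on_x(xs, ys):
    """Solve the two normal equations for the line Y = a + bX."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminating 'a' from the two normal equations gives 'b';
    # 'a' then follows from the first equation, sum_y = n*a + b*sum_x.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical data, for illustration only.
a, b = fit_y_on_x([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"Y = {a:.2f} + {b:.2f}X")
```

Swapping the roles of the two lists gives the regression of X on Y in the same way.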
P(A/B) = $\frac{P(A)\,P(B/A)}{P(B)}$ – test, F-test, etc.
$^nP_n = n!$ relatively small.
Regression coefficient X on Y is denoted as bxy and
DIFFERENT SCHOOLS OF THOUGHT ON Steps in computation 6. The standard deviation of Poisson Distribution is 4. Calculation of test statistic: The next step is to
that of Y on X is denoted as byx. √m. calculate the value of the test statistic using
PROBABILITY 1. Find the prior probability
Regression Equation x on y:
Different Approaches/Definitions of Probability 2. Find the conditional probability. Practical situations where Poisson Distribution appropriate formula. The general form for computing
$x - \bar{x} = b_{xy}(y - \bar{y})$ can be used the value of test statistic is:-
There are 4 important schools of thought on 3. Find the joint probability by multiplying step 1
i.e., $x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}(y - \bar{y})$ probability :- and step 2. 1. To count the number of telephone calls arising at Value of Test statistic = Difference / Standard Error
Regression Equation y on x: 1. Classical or Priori Approach Objective Probability 4. Find posterior probability as percentage of total a telephone switch board in a unit of time. 5. Making Decision:
2. Relative frequency or Empirical Approach joint probability. 2. To count the number of customers arising at the Finally, we may draw conclusions and take
$y - \bar{y} = b_{yx}(x - \bar{x})$
Approach PROBABILITY DISTRIBUTION super market in a unit of time. decisions. The decision may be either to accept or
i.e., $y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}(x - \bar{x})$ 3. To count the number of defects in Statistical
3. Subjective or Personalistic Approach (THEORETICAL DISTRIBUTION) reject the null hypothesis. If the calculated value is
Properties of Regression Coefficient: Quality Control. more than the table value, we reject the null
4. Modern or Axiomatic Approach DEFINITION
1. There are two regression coefficients. They are Probability distribution (Theoretical Distribution) 4. To count the number of bacteria per unit. hypothesis and accept the alternative hypothesis. If
bxy and byx 1. Classical or Priori Approach
If out of ‘n’ exhaustive, mutually exclusive and can be defined as a distribution obtained for a 5. To count the number of defectives in a pack of the calculated value is less than the table value, we
2. Both the regression coefficients must have the manufactured goods.
equally likely outcomes of an experiment; ‘m’ are random variable on the basis of a mathematical accept the null hypothesis.
same signs. If one is +ve, the other will also be a +ve 6. To count the number of persons dying due to heart Sampling Distribution
favourable to the occurrence of an event ‘A’, then model. It is obtained not on the basis of actual
value. observation or experiments, but on the basis of attack in a year. The distribution of all possible values which can be
3. The geometric mean of regression coefficients the probability of ‘A’ is defined to be
probability law. 7. To count the number of accidents taking place in a assumed by some statistic, computed from samples
will be the coefficient of correlation. P(A) = m/n Random variable Random variable is a variable day on a busy road. of the same size randomly drawn from the same
r = $\sqrt{b_{xy} \times b_{yx}}$ According to Laplace, a French Mathematician, “ whose value is determined by the outcome of a NORMAL DISTRIBUTION population is called Sampling distribution of that
4. If σx and σy are the same, then the regression the probability is the ratio of the number of random experiment. Random variable is also called Definition of Normal Distribution statistic.
coefficient and correlation coefficient will be the favourable cases to the total number of equally chance variable or stochastic variable. For example, A continuous random variable ‘X’ is said to follow Standard Error (S.E)
same. likely cases.” suppose we toss a coin. Obtaining of head in this Normal Distribution if its probability function is: Standard Error is the standard deviation of the
Computation of Regression Co-efficients P(A) = random experiment is a random variable. Here the P (X) = sampling distribution of a statistic. Standard error
1. Actual mean method random variable of “obtaining heads” can take the π = 3.146 plays a very important role in the large sample
2. Assumed mean method Limitations of Classical Definition: numerical values. Now, we can prepare a table theory. The following are the important uses of
1. Classical definition has only limited application in e = 2.71828
3. Direct method showing the values of the random variable and standard errors:-
Actual mean method:- coin-tossing die throwing etc. It fails to answer
corresponding probabilities. This is called μ = mean of the distribution 1. Standard Error is used for testing a given
question like “What is the probability that a female
Regression coefficient x on y (bxy) = $\frac{\Sigma xy}{\Sigma y^2}$ probability distributions or theoretical distribution. σ = standard deviation of the distribution hypothesis
will die before the age of 64?” 2. S.E. gives an idea about the reliability of a
In the above example, probability distribution is :- Properties of Normal Distribution (Normal
Regression coefficient y on x (byx) = $\frac{\Sigma xy}{\Sigma x^2}$ 2. Classical definition cannot be applied when the sample, because the reciprocal of S.E. is a measure
Properties of Probability Distributions: Curve)
possible outcomes are not equally likely. How can
x = $X - \bar{X}$ 1. Every value of probability of random variable will 1. Normal distribution is a continuous distribution. of reliability of the sample.
we apply classical definition to find the probability 3. S.E. can be used to determine the confidence
y = $Y - \bar{Y}$ be greater than or equal to zero. 2. Normal curve is symmetrical about the mean.
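The actual mean method can be sketched in a few lines. This sketch assumes the usual textbook forms bxy = Σxy/Σy² and byx = Σxy/Σx², where x and y are deviations from the respective means, and then checks the property that the coefficient of correlation is the geometric mean of the two regression coefficients. The function names and the data are hypothetical.

```python
from math import sqrt


def regression_coefficients(xs, ys):
    """Return (bxy, byx) using deviations taken from the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sum_xy = sum(p * q for p, q in zip(dev_x, dev_y))
    bxy = sum_xy / sum(q * q for q in dev_y)  # regression coefficient of x on y
    byx = sum_xy / sum(p * p for p in dev_x)  # regression coefficient of y on x
    return bxy, byx


def correlation_from_coefficients(bxy, byx):
    """Geometric mean of the two coefficients, carrying their common sign."""
    r = sqrt(bxy * byx)
    return r if bxy >= 0 else -r


# Hypothetical data, for illustration only.
bxy, byx = regression_coefficients([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r = correlation_from_coefficients(bxy, byx)
print(bxy, byx, round(r, 4))
```

Because both coefficients always share the same sign, their product is never negative and the square root is always defined.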
of rains? Here, two possibilities are “rain” or “no limits within which the population parameters are
Assumed mean method: i.e., P(X) ≥ 0 3. Both sides of normal curve coincide exactly.
rain”. But at any given time these two possibilities expected to lie.
Regression coefficient x on y (bxy) = $\frac{N\Sigma dx\,dy - \Sigma dx\,\Sigma dy}{N\Sigma dy^2 - (\Sigma dy)^2}$ i.e., P(X) ≠ negative value 4. Normal curve is a bell shaped curve.
are not equally likely.
3. Classical definition does not consider the
2. Sum of all the probability values will be 1 5. Mean, Median and Mode coincide at the centre of Test Statistic
ΣP(X) = 1 the curve. The decision to accept or to reject a null hypothesis
outcomes of actual experimentations.
E(X) = Σ[X.P(X)] 6. Quartiles are equidistant from the median. Q3 – Q2 is made on the basis of a statistic computed from the
Relative Frequency Definition or Empirical sample. Such a statistic is called the test statistic.
Classification of Probability Distribution = Q2 – Q1
Regression coefficient y on x Approach There are different types of test statistics. All these
Discrete Probability Distribution 7. Normal curve is asymptotic to the base line.
According to Relative Frequency definition, the test statistics can be classified into two groups. They
(byx) = $\frac{N\Sigma dx\,dy - \Sigma dx\,\Sigma dy}{N\Sigma dx^2 - (\Sigma dx)^2}$ If the random variable of a probability distribution 8. Total area under a normal curve is 100%.
probability of an event can be defined as the relative are
dx = deviation from assumed mean of X assumes specific values only, it is called discrete 9. The ordinate at the mean divide the whole area
frequency with which it occurs in an indefinitely
probability distributions. Binomial distribution and under a normal curve into two equal parts. (50% on a. Parametric Tests
dy = deviation from assumed mean of Y large number of trials. If an event ‘A’ occurs ‘f’ b. Non-Parametric Tests
poisson distribution are discrete probability either side).
Direct method:- number of trials when a random experiment is
repeated for ‘n’ number of times
distributions. 10. The height of normal curve is at its maximum at PARAMETRIC TESTS
Regression Coefficient x on y (bxy) = $\frac{N\Sigma XY - \Sigma X\,\Sigma Y}{N\Sigma Y^2 - (\Sigma Y)^2}$ Continuous Probability Distributions:- The statistical tests based on the assumption that
For practical convenience, the above equation may the mean.
If the random variable of a probability distribution 11. The normal curve is unimodal, i.e., it has only population or population parameter is normally
Regression Coefficient y on x (byx) = $\frac{N\Sigma XY - \Sigma X\,\Sigma Y}{N\Sigma X^2 - (\Sigma X)^2}$ be written as P(A) = f/n assumes any value in a given interval, then it is one mode. distributed are called parametric tests. The important
THEORY OF PROBABILITY Here, probability lies between 0 and 1, called continuous probability distributions. Normal 12. Normal curve is mesokurtic. parametric tests are:-
Definition of Probability i.e. 0 ≤ P(A) ≤ 1 distribution is a continuous probability distribution. 13. No portion of normal curve lies below the x-axis. 1. z-test
The probability of given event may be defined as the Subjective (Personalistic) Approach to BINOMIAL DISTRIBUTION 2. t-test
14. Theoretically, the range of normal curve is – ∞ to
numerical value given to the likelihood of the Probability Meaning & Definition: 3. F-test
occurrence of that event. It is a number lying The exponents of personalistic approach define Binomial Distribution is associated with James + ∞. But practically the range is μ - 3σ to μ + 3σ. Z-test:
between ‘0’ and ‘1’. ‘0’ denotes the event which probability as a measure of personal confidence or Bernoulli, a Swiss Mathematician. Therefore, it is Fitting of a Normal Distribution Z-test is applied when the test statistic follows
cannot occur, and ‘1’ denotes the event which is belief based on whatever evidence is available. For also called Bernoulli distribution. Binomial Procedure : normal distribution. It was developed by
certain to occur. For example, when we toss a example, if a teacher wants to find out the distribution is the probability distribution expressing 1. Find the mean and standard deviation of the given Prof. R.A. Fisher. The following are the important
coin, we can enumerate all the possible outcomes probability that Mr. X topping in M.Com the probability of one set of dichotomous distribution. (i.e., μ and σ) uses of z-test:-
(head and tail), but we cannot say which one will examination, he may assign a value between zero alternatives, i.e., success or failure. In other words, it 2. Take the lower limit of each class. 1. To test the population mean when the sample is
happen. Hence, the probability of getting a head is and one according to his degree of belief for possible is used to determine the probability of success in large or when the population standard deviation is
neither 0 nor 1 but between 0 and 1. It is 50% or ½ occurrence. He may take into account such factors as experiments on which there are only two mutually 3. Find Z value for each of the lower limit. known.
Terms used in Probability. the past academic performance in terminal exclusive outcomes. Binomial distribution is a discrete Z = $\frac{X - \mu}{\sigma}$ 2. To test the equality of two sample means when
Random Experiment examinations etc. and arrive at a probability figure. probability distribution. Binomial Distribution can 4. Find the area for z values from the table. The first the samples are large or when the population
A random experiment is an experiment that has two The probability figure arrived under this method be defined as follows: “A random variable r is said and the last values are taken as 0.5. standard deviation is known.
or more outcomes which vary in an unpredictable may vary from person to person. Hence it is called to follow Binomial Distribution with parameters n 5. Find the area for each class. Take difference 3. To test the population proportion.
manner from trial to trial when conducted under subjective method of probability. and p if its probability function is: P(r) = $^nC_r\,p^r q^{n-r}$” between 2 adjacent values if same signs and take 4. To test the equality of two sample proportions.
uniform conditions. In a random experiment, all the Axiomatic Approach (Modern Approach) to p = probability of success in a single trial total of adjacent values if opposite signs. 5. To test the population standard deviation when the
possible outcomes are known in advance but none of Probability q=1–p 6. Find the expected frequency by multiplying area sample is large.
the outcomes can be predicted with certainty. For Let ‘S’ be the sample space of a random experiment, n = number of trials for each class by N. 6. To test the equality of two sample standard
example, tossing of a coin is a random experiment and ‘A’ be an event of the random experiment, so r = number of successes in ‘n’ trials. TESTING OF HYPOTHESIS deviations when the samples are large or when
because it has two outcomes (head and tail), but we that ‘A’ is the subset of ‘S’. Then we can associate a Assumptions of Binomial Distribution OR Statistical Inference: population standard deviations are known.
cannot predict any of them with certainty. real number to the event ‘A’. This number will be (Situations where Binomial Distribution can be Statistical inference refers to the process of selecting 7. To test the equality of correlation coefficients.
Sample Point called probability of ‘A’ if it satisfies the following applied) and using a sample statistic to draw conclusions Z-test is used in testing of hypothesis on the basis of
Every indecomposable outcome of a random three axioms or postulates :- 1. The random experiment has two outcomes i.e., about the population parameter. Statistical inference some assumptions. The important assumptions in z-
experiment is called a sample point. It is also called (1) The probability of an event ranges from 0 to 1. success and failure. deals with two types of problems. test are:-
simple event or elementary outcome. Eg. When a die If the event is certain, its probability shall be 1. If the 2. The probability of success in a single trial remains They are:- 1. Sampling distribution of test statistic is normal.
is thrown, getting ‘3’ is a sample point. event cannot take place, its probability shall be zero. constant from trial to trial of the experiment. 1. Testing of Hypothesis 2. Sample statistics are close to the population
Sample space (2) The sum of probabilities of all sample points of 3. The experiment is repeated for finite number of 2. Estimation parameter and therefore, for finding standard error,
Sample space of a random experiment is the set the sample space is equal to 1. i.e., P(S) = 1 times. Hypothesis: sample statistics are used in place where population
containing all the sample points of that random (3) If A and B are mutually exclusive (disjoint) 4. The trials are independent. Hypothesis is a statement subject to verification. parameters are to be used.
experiment. Eg:- When a coin is tossed, the sample events, then the probability of occurrence of either A Properties (features) of Binomial Distribution: More precisely, it is a quantitative statement about a T-test:
space is (Head, Tail) or B shall be : 1. It is a discrete probability distribution. population, the validity of which remains to be t-distribution was originated by W.S.Gosset in the
Event P(A∪B) = P(A) + P(B) 2. The shape and location of Binomial distribution tested. In other words, hypothesis is an assumption early 1900s. t-test is applied when the test statistic
An event is the result of a random experiment. It is a THEOREMS OF PROBABILITY changes as ‘p’ changes for a given ‘n’. made about a population parameter. follows t-distribution. Uses of t-test are:-
subset of the sample space of a random experiment. Addition Theorem 3. The mode of the Binomial distribution is equal to Testing of Hypothesis: 1. To test the population mean when the sample is
Sure Event (Certain Event) (a) Events are mutually exclusive the value of ‘r’ which has the largest probability. Testing of hypothesis is a process of examining small and the population S.D. is unknown.
(b) Events are not mutually exclusive whether the hypothesis formulated by the researcher
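The binomial probability function P(r) = nCr p^r q^(n−r), with q = 1 − p, can be evaluated directly. The sketch below computes the probabilities for r = 0, 1, …, n, picks out the mode as the value of r with the largest probability, and multiplies each probability by a total frequency N to get expected frequencies. The function names and the figures used (n = 4, p = 0.5, N = 16) are hypothetical.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(r) = nCr * p^r * q^(n - r), where q = 1 - p."""
    q = 1 - p
    return comb(n, r) * p ** r * q ** (n - r)


def expected_frequencies(n, p, total_frequency):
    """Multiply each binomial probability by the total frequency N."""
    return [total_frequency * binomial_pmf(r, n, p) for r in range(n + 1)]


# Hypothetical example: 4 tosses of a fair coin, repeated 16 times.
probs = [binomial_pmf(r, 4, 0.5) for r in range(5)]
mode = max(range(5), key=lambda r: probs[r])  # r with the largest probability
freqs = expected_frequencies(4, 0.5, 16)
print(probs, mode, freqs)
```

The probabilities over all values of r add up to 1, which is a quick check on any fitted distribution.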
2. To test the equality of two sample means when significant difference, we can consider the samples As a non-parametric test, χ²-test is mainly used to 2. It tests whether the difference in the means of
the samples are small and population S.D. is are drawn from the same population. test the goodness of fit between the observed different samples is due to chance or due to any
unknown. Procedure: frequencies and expected frequencies. Procedure:- significant cause.
3. To test the difference in values of two dependent 1. Set up null hypothesis that there is goodness of fit 3. It uses the statistical test called F – Ratio.
1. Set up null hypothesis that there is no significant
samples. difference between the two means. between observed and expected frequencies. Types of Variance Analysis:
4. To test the significance of correlation coefficients. H0 : μ1 = μ2 2. Find the χ² value using the following formula:- 1. One way Analysis of Variance
The following are the important assumptions in t- H1 : μ1 ≠ μ2 χ² = $\Sigma \frac{(O - E)^2}{E}$ 2. Two way analysis of Variance
test:- 2. Decide the test criterion: One way Analysis of Variance:
1. The population from which the sample is drawn is O = Observed frequencies In one way analysis of variance, observations are
• If sample is large, apply z – test E = Expected frequencies
normal. classified into groups on the basis of a single
2. The sample observations are independent. • If sample is small, but population S.D. is known, 3. Compute the degree of freedom. criterion. For example, yield of a crop is influenced
3. The population S.D. is unknown. apply z-test. d. f. = n – r – 1 by quality of soil, availability of rainfall, quantity of
4. When the equality of two population means is Where ‘r’ is the number of independent constraints seed, use of fertilizer, etc. If we study the influence
• If sample is small and population S.D. is unknown, to be satisfied by the frequencies
tested, the samples are assumed to be independent of one factor, it is called one way analysis of
and the population variances are assumed to be equal apply t-test. 4. Obtain the table value corresponding to the level of variance. If we want to study the effect of fertilizer
and unknown. 3. Apply the formula: significance and degrees of freedom. on yield of crop, we apply different kinds of
F-test: Z or t = $\frac{\bar{x}_1 - \bar{x}_2}{S.E.}$ 5. Decide whether to accept or reject the null fertilizers on different paddy fields and try to find
F-test is used to determine whether two SE is computed as follows: hypothesis. If the calculated χ² value is less than out the difference in the effect of these different
independent estimates of population variance the table value, we accept the null hypothesis and kinds of fertilizers on yield.
• If population S.D.s are known and equal, S.E. = $\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ conclude that there is goodness of fit. If the
significantly differ or to establish both have come Procedure:-
from the same population. For carrying out the test calculated value is more than the table value we 1. Set up null and alternative hypothesis:
of significance, we calculate a ratio, called F-ratio. reject the null hypothesis and conclude that there is H0: There is no significant difference.
F-test is named in honour of the great statistician • If population S.D.s are known but different, S.E. = $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ no goodness of fit. H1: There is significant difference.
R.A. Fisher. It is also called Variance Ratio Test. χ²-test as a test of independence: χ²-test is used 2. Compute sum of squares Total (SST)
F-ratio is defined as follows:- to find out whether two or more attributes are
associated or not. SST = Sum of squares of all observations – $\frac{T^2}{N}$ (T = grand total, N = total number of observations)
F = $\frac{S_1^2}{S_2^2}$ Procedure:-
• If population S.D.s are unknown and samples are 1. Set up null and alternative hypothesis. 3. Compute sum of squares between samples
large, then assuming H0: Two attributes are independent (i.e., there is no SSC = $\frac{(\Sigma X_1)^2}{n_1} + \frac{(\Sigma X_2)^2}{n_2} + \dots - \frac{T^2}{N}$
population S.D. are different, association between the attributes)
H1: Two attributes are dependent (i.e., there is an
S.E. = $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
association between the attributes) 4. Compute sum of squares within sample (SSE)
While calculating F-ratio, the numerator is the • If population S.D.s are unknown and samples are 2. Find the χ² value. SSE = SST – SSC
greater variance and denominator is the smaller χ² = $\Sigma \frac{(O - E)^2}{E}$
small, then assuming 5. Compute MSC
variance. So,
population S.D. are equal, 3. Find the degree of freedom MSC = $\frac{SSC}{C - 1}$
F = $\frac{\text{Larger variance}}{\text{Smaller variance}}$ S.E. = $s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ (s = pooled sample S.D.) d.f. = (r-1)(c-1)
r = Number of rows 6. Compute MSE
Uses of F-distribution:-
1. To test the equality of variances of two 4. Fix the degree of freedom: c = Number of columns MSE = $\frac{SSE}{N - C}$
populations. For Z-test : Infinity 4. Obtain table value corresponding to the level of
For t-test: n1 + n2 – 2 7. Compute F – ratio:
2. To test the equality of means of three or more significance and degree of freedom.
populations. 5. Obtain the table value. 5. Decide whether to accept or reject the H0. If the F = $\frac{MSC}{MSE}$
3. To test the linearity of regression 6. Decide whether to accept or reject the H0. calculated value is less than the table value, we 8. Incorporate all these in an ANOVA TABLE
Assumptions of F-distribution:- to find out the same of the 2 groups. accept the H0 and conclude that the attributes are as follows:
1. The values in each group are normally distributed. TESTING OF EQUALITY OF TWO SAMPLE independent. If the calculated value is more than the table value, we reject the H0 and conclude that the attributes ANOVA TABLE
2. The variance within each group should be equal STANDARD DEVIATIONS are dependent.
for all groups. This test is used to test whether there is any χ²-test as a test of homogeneity
3. The error (Variation of each value around its own significant difference between χ²-test is used to find whether the samples are
Source of variance | Sum of squares | Degree of freedom | Mean square | F-ratio
Between samples | SSC | C – 1 | MSC = SSC/(C – 1) | F = MSC/MSE
Within samples | SSE | N – C | MSE = SSE/(N – C) |
Total | SST | N – 1 | |
the standard deviations of two samples. homogeneous as far as a particular attribute is
group mean) should be independent for each value.
Procedure: concerned.
TYPES OF ERRORS IN TESTING OF
1. Set the null hypothesis that there is no Steps:
HYPOTHESIS:
In any test of hypothesis, the decision is to accept or significant difference between two standard 1. Set up null and alternative hypotheses:
reject a null hypothesis. The four possibilities of the H0: There is homogeneity.
deviations.
decision are:- H1: There is no homogeneity (heterogeneity)
1. Accepting a null hypothesis when it is true. 2. Find the χ² value.
2. Rejecting a null hypothesis when it is false.
3. Rejecting a null hypothesis when it is true.
4. Accepting a null hypothesis when it is false. 2. Decide the test criterion: 3. Find the degree of freedom
d.f. = (r-1)(c-1)
Out of the above 4 possibilities, 1 and 2 are correct, If sample is large, apply Z – test 4. Obtain the table value
while 3 and 4 are errors. The error included in the If sample is small, apply F – test
above 3rd possibility is called type I error and that in 5. Decide whether to accept or reject the null 9. Obtain the table value corresponding to the level of
the 4th possibility is called type II error.
3. Apply the formula: hypothesis.
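The χ² statistic described in these procedures is the sum of (O − E)²/E over all cells. The sketch below shows the calculation for a goodness-of-fit case and for a table of rows and columns with d.f. = (r−1)(c−1). It assumes that, for the table case, each expected frequency is taken as (row total × column total) / grand total, which is the usual convention but is not spelled out in these notes. The function names and the data are hypothetical.

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_table(table):
    """Chi-square value and degrees of freedom for an r x c frequency table.

    Assumption: expected frequency = row total * column total / grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    value = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            value += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return value, degrees_of_freedom


# Hypothetical data, for illustration only.
fit_value = chi_square([10, 20, 30], [20, 20, 20])
table_value, table_df = chi_square_table([[10, 20], [20, 10]])
print(fit_value, round(table_value, 4), table_df)
```

The calculated value is then compared with the table value at the chosen level of significance and the computed degrees of freedom.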
significance and for degree of freedom of (C-1, N-
Type I Error If Z test: Limitations of Chi-square tests:- C).
The error committed by rejecting a null hypothesis 1. It is not as reliable as a parametric test. Hence it 10. Decide whether to accept or reject the null
Z= should be used only when parametric tests cannot be hypothesis.
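The one way analysis of variance procedure can be followed step by step in code. This sketch assumes the usual textbook forms: SST = Σx² − T²/N, SSC = Σ(sample total)²/(sample size) − T²/N, SSE = SST − SSC, MSC = SSC/(C−1), MSE = SSE/(N−C) and F = MSC/MSE, where T is the grand total, N the number of observations and C the number of samples. The function name and the data are hypothetical.

```python
def one_way_anova(samples):
    """Return SST, SSC, SSE, MSC, MSE and F for a list of samples."""
    values = [x for sample in samples for x in sample]
    n_total = len(values)            # N: total number of observations
    n_samples = len(samples)         # C: number of samples (columns)
    correction = sum(values) ** 2 / n_total  # T^2 / N
    sst = sum(x * x for x in values) - correction
    ssc = sum(sum(s) ** 2 / len(s) for s in samples) - correction
    sse = sst - ssc
    msc = ssc / (n_samples - 1)
    mse = sse / (n_total - n_samples)
    return {"SST": sst, "SSC": ssc, "SSE": sse,
            "MSC": msc, "MSE": mse, "F": msc / mse}


# Hypothetical yields from three kinds of fertilizer, for illustration only.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

The F value is compared with the table value for (C−1, N−C) degrees of freedom.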
when it is true, is called Type I error. The
probability of committing Type I error is denoted by used.
SE = (When population S.D. are 2. χ² value cannot be computed when the given TWO WAY ANALYSIS OF VARIANCE
α (alpha). Two way analysis of variance is used to test the
α = Prob. (Type I error) values are proportions or percentages. effect of two factors simultaneously on a particular
known) WILCOXON MATCHED PAIRS TEST
= Prob. (Rejecting H0 when it is true) variable.
Type II Error SE = (When population S.D. are SIGNED RANK TEST
Signed rank test was developed by Frank Wilcoxon.
Procedure:-
The error committed by accepting a null hypothesis 1. Set up null and alternative hypothesis.
not known) It is an important non-parametric test. This method
is used when we can determine both direction and H0: There is no significant difference between
when it is false is called Type II error. The
probability of committing Type II error is denoted If F – test: columns. There is no significant difference
by β (beta). magnitude of difference between matched values.
Here there are two cases:- between rows.
β = Prob. (Type II error) F= (Larger value must be numerator and a) When the number of matched pairs are less than H1: There is significant difference between
β = Prob. (Accepting H0 when it is false)
Small and Large samples or equal to 25. columns. There is significant difference
When the size of the sample is 30 or less, the sample smaller must be denominator) b) When the number of matched pairs are more than between rows.
is called small sample. When the size of sample 4. Fix the degree of freedom 25. 2. Compute SST
exceeds 30, the sample is called large sample. For Z – test: Infinity Case:1
When the number of matched pairs are less than or SST = Sum of squares of all observations – $\frac{T^2}{N}$ (T = grand total, N = total number of observations)
Degree of freedom For F – test: (n1 – 1, n2 – 1) equal to 25 Procedure:-
Degree of freedom is defined as the number of 5. Obtain the table value.
independent observations which is obtained by 1. Set up null hypothesis:
6. Decide whether to accept or reject the null H0: There is no significant difference. SSC = $\frac{(\Sigma X_1)^2}{n_1} + \frac{(\Sigma X_2)^2}{n_2} + \dots - \frac{T^2}{N}$ (ΣX1, ΣX2, etc. denote the column totals)
subtracting the number of constraints from the total
number of observations. hypothesis. H1: There is significant difference. 4. Compute SSR
Degree of freedom = Total number of observations NON-PARAMETRIC TESTS 2. Find the difference between each pair of values. SSR = $\frac{(\Sigma X_1)^2}{n_1} + \frac{(\Sigma X_2)^2}{n_2} + \dots - \frac{T^2}{N}$
– Number of constraints. A non-parametric test is a test which is not 3. Assign ranks to the differences from the smallest
concerned with testing of parameters. to the largest without any regard to sign. Here ΣX1, ΣX2, etc. denote the row totals
Rejection region and Acceptance region
The entire area under a normal curve may be divided Nonparametric tests do not make any assumption 4. Then actual signs of each difference are put to the 5. Compute SSE
into two parts. They are rejection region and regarding the form of the population. Therefore, corresponding ranks. SSE = SST – (SSC + SSR)
acceptance region. Rejection Region: Rejection non-parametric tests are also called distribution free 5. Find the total of positive ranks and negative ranks. 6. Compute MSC
region is the area which corresponds to the tests. 6. Smaller value, as per steps 5 is taken as the
MSC = $\frac{SSC}{C - 1}$
predetermined level of significance. If the calculated Following are the important non-parametric tests:- calculated value.
value of the test statistic falls in the rejection region, 1. Chi-square test (χ²) 7. Obtain the table value from Wilcoxon’s T-Table. 7. Compute MSR
we reject the null hypothesis. Rejection region is 2. Sign test 8. Decide whether to accept or reject the null MSR = $\frac{SSR}{R - 1}$
also called critical region. It is denoted by α (alpha). 3. Signed rank test (Wilcoxon matched pairs test) hypothesis.
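For the small-sample case, the steps of the signed rank test (find the differences, rank them without regard to sign, restore the signs, total the positive and negative ranks, take the smaller total) can be sketched as below. Two points are assumptions that these notes do not state: pairs with zero difference are dropped, and tied absolute differences share the average of their ranks. The function name and the data are hypothetical.

```python
def wilcoxon_t(first, second):
    """Return (T, positive rank total, negative rank total) for matched pairs."""
    # Assumption: pairs with no difference are left out of the ranking.
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ordered = sorted(diffs, key=abs)  # smallest to largest, ignoring sign
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Assumption: tied absolute differences share the average rank.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    positive = sum(r for d, r in zip(ordered, ranks) if d > 0)
    negative = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(positive, negative), positive, negative


# Hypothetical matched pairs, for illustration only.
t_value, positive_total, negative_total = wilcoxon_t([10, 12, 9, 15], [12, 11, 13, 18])
print(t_value, positive_total, negative_total)
```

The smaller total is the calculated value that is compared with the table value from Wilcoxon’s T-Table.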
4. Rank sum test (Mann-Whitney U-test and Case :2
Acceptance Region:
Acceptance region is the area which corresponds to Kruskal-Wallis H test) When the number of matched pairs are more than 25 MSE = $\frac{SSE}{(C-1)(R-1)}$
1 – α. 5. Run test Procedure:-
1. Set up null hypothesis:
9. Compute F – ratio in respect of columns
Acceptance region = 1 – rejection region 6. Kolmogorov-Smirnov Test (K-S test)
= 1 – α. CHI-SQUARE TEST (χ²) H0: There is no significant difference. Fc = $\frac{MSC}{MSE}$
If the calculated value of the test statistic falls in the The value of chi-square describes the magnitude of H1: There is significant difference. 10. Compute F – ratio in respect of rows
acceptance region, we accept the null hypothesis. difference between observed frequencies and 2. Find the difference between each pair of values.
Fr = $\frac{MSR}{MSE}$
TWO TAILED AND ONE TAILED TESTS: expected frequencies under certain assumptions. 3. Assign ranks to the differences from the smallest
A two tailed test is one in which we reject the null value ( quantity) ranges from zero to infinity. It to the largest without any regard to sign. 11. Obtain the table value
hypothesis if the computed value of the test statistic is zero when the expected frequencies and observed 4. Then actual signs of each difference are put to the 12. Decide whether to accept or reject the H0:
is significantly greater or lower than the critical frequencies completely coincide. So greater the corresponding ranks.
value (table value) of the test statistic. Thus, in two value of , greater is the discrepancy between 5. Find the total of positive ranks and negative ranks. TWO WAY ANOVA TABLE
tailed test the critical region is represented by both observed and expected frequencies. -test is a 6. Apply Z test and compute the value of ‘Z’
Source of Sum of Degree of Mean F-ratio
tails of the normal curve. If we are testing statistical test which tests the significance of Z= variance squares freedom square
hypothesis at 5 % level of significance, the size of difference between observed frequencies and Where T = Smaller value as per steps (5) Between SSC C-1 MSC= =
the acceptance region is 0.95 and the size of the corresponding theoretical frequencies of a
U= Columns
rejection region is 0.05 on both sides together. (i.e. distribution without any assumption about the
Between SSR R-1 MSE=
0.025 on left side and 0.025 on right side of the distribution of the population. This is one of the
= rows
curve). Procedure: simplest and most widely used nonparametric test in
1.ser the null hypothesis that that there is no statistical work. This test was developed by Prof. 7. Obtain table value of Z at specified level of residual SSE (C-1)(r-1) MSE=
significant difference b/w sample mean and Karl Pearson in 1990. significance for infinity degrees of freedom.
population mean Uses of - test 8. Decide whether to accept or reject the null total SST N-1
= 1. Useful for the test of goodness of fit:- - test can hypothesis.
= be used to test whether there is goodness of fit ANALYSIS OF VARIANCE
2. Decide the test criterion between the observed frequencies and expected Definition of Analysis of Variance
 If sample is large apply Z-test frequencies. Analysis of variance may be defined as a technique
 If sample is small but population 2. Useful for the test of independence of attributes:- which analyses the variance of two or more
deviation is know. Apply z test test can be used to test whether two attributes are comparable series (or samples) for determining the
 If sample is small and population associated or not. significance of differences in their arithmetic means
standard is unknown apply t test 3. Useful for the test of homogeneity:- -test is and for determining whether different samples under
3.apply formula very useful t5o test whether two attributes are study are drawn from same population or not, with
homogeneous or not. the of the statistical technique, called F – test.
Z or t = \
4. Useful for testing given population variance:- - Characteristics of Analysis of Variance:
TESTING OF EQUALITY OF TWO SAMPLE test can be used for testing whether the given 1. It makes statistical analysis of variance of two or
MEANS population variance is acceptable on the basis of more samples.
This test is used to test whether there is significant samples drawn from that population.
difference between two sample means. It there is no
-test as a test of goodness of fit:
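The two-way ANOVA steps (SST, SSC, SSR, SSE, the mean squares and the two F-ratios) can be sketched in a few lines of Python. This is only a minimal illustration, assuming one observation per cell and the usual correction factor T²/N; the function name `two_way_anova` and the sample figures are hypothetical and do not come from these notes.

```python
def two_way_anova(table):
    """Two-way ANOVA for a table with one observation per cell.

    table[i][j] is the value in row i and column j (illustrative layout).
    Returns the sums of squares, mean squares and F-ratios.
    """
    r = len(table)            # number of rows
    c = len(table[0])         # number of columns
    n = r * c                 # total number of observations (N)

    grand_total = sum(sum(row) for row in table)      # T
    correction = grand_total ** 2 / n                 # T^2 / N

    # SST: sum of squares of all values minus the correction factor
    sst = sum(x * x for row in table for x in row) - correction

    # SSC and SSR: squared column/row totals divided by items per column/row
    col_totals = [sum(row[j] for row in table) for j in range(c)]
    row_totals = [sum(row) for row in table]
    ssc = sum(t * t for t in col_totals) / r - correction
    ssr = sum(t * t for t in row_totals) / c - correction

    sse = sst - (ssc + ssr)   # residual sum of squares

    msc = ssc / (c - 1)
    msr = ssr / (r - 1)
    mse = sse / ((c - 1) * (r - 1))

    return {
        "SST": sst, "SSC": ssc, "SSR": ssr, "SSE": sse,
        "MSC": msc, "MSR": msr, "MSE": mse,
        "Fc": msc / mse, "Fr": msr / mse,
    }


# Hypothetical 3 x 3 data, chosen only so the arithmetic is easy to follow.
example = [
    [1, 2, 3],
    [2, 4, 3],
    [3, 6, 6],
]
result = two_way_anova(example)
print(result["Fc"], result["Fr"])  # 8.0 14.0
```

The computed Fc and Fr are then compared with the table value of F for (c – 1, (c – 1)(r – 1)) and (r – 1, (c – 1)(r – 1)) degrees of freedom respectively.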
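The signed rank procedure for more than 25 matched pairs (rank the absolute differences, restore the signs, take the smaller rank total as T, then compute Z) can be sketched as follows. The sketch assumes the standard large-sample values U = n(n + 1)/4 and σ = √[n(n + 1)(2n + 1)/24]. Dropping zero differences and giving tied differences the average rank are assumptions added here, not steps stated in the notes, and the function name `wilcoxon_z` is hypothetical.

```python
import math


def wilcoxon_z(first, second):
    """Large-sample Wilcoxon matched pairs test: returns (T, Z)."""
    # Step 2: difference between each pair (zero differences dropped: assumption)
    diffs = [a - b for a, b in zip(first, second) if a != b]
    n = len(diffs)

    # Step 3: rank the differences by absolute size, ignoring sign.
    # Tied absolute differences share the average rank (assumption).
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Steps 4-5: put the signs back and total the positive and negative ranks
    positive = sum(rk for rk, d in zip(ranks, diffs) if d > 0)
    negative = sum(rk for rk, d in zip(ranks, diffs) if d < 0)

    # Step 6: T is the smaller total; Z = (T - U) / sigma
    t = min(positive, negative)
    u = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return t, (t - u) / sigma


# Hypothetical data: 26 pairs whose differences are -1, -2, -3, 4, 5, ..., 26.
before = [-1, -2, -3] + list(range(4, 27))
after = [0] * 26
t_value, z_value = wilcoxon_z(before, after)
print(t_value, round(z_value, 2))
```

The absolute value of Z is then compared with the table value of Z at the chosen level of significance, as in steps 7 and 8.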