Note: Each question carries 10 Marks. Answer all the questions.
Q1. (a) ‘Statistics is the backbone of decision-making’. Comment. [5 marks]
(b) Give the plural meaning of the word Statistics. [5 marks]
Answer: (a) Decision making is a very important part of human life. For decisions to succeed, they need to be supported by methods, trends or results that have already proved successful. Statistics plays a very important role in finding these trends, because through statistics we can identify the patterns that lead our decisions to success. For example, when buying a car we need to know about factors such as mileage, fuel capacity, maintenance and cost. To assess these factors with reasonable confidence, we need to analyze the different types of car available in the market from different vendors, and this can be done only with the help of statistics. We have to collect data on the cars available from different brands, such as mileage, size, features (A/C, central locking system, windows and many others), braking system, reliability and durability; only on the basis of these factors can one decide on a suitable car. Collecting this data, organizing it and analyzing it is nothing but statistics.
(b) In the plural sense, the word statistics refers to numerical facts and figures collected in a systematic manner with a definite purpose in any field of study. In this sense, statistics are also aggregates of facts expressed in numerical form. Examples are statistics on industrial production and statistics on the population growth of a country in different years.
Q2. a. In a bivariate data on ‘x’ and ‘y’, variance of ‘x’ = 49, variance of ‘y’ = 9 and covariance (x,y) = -17.5. Find coefficient of correlation between ‘x’ and ‘y’. [5 marks]
b. Enumerate the factors which should be kept in mind for proper planning. [5 marks]
Answer: (a) Coefficient of correlation (x,y)
= Cov(x,y) / (SD(x) . SD(y))
= Cov(x,y) / (sqrt(var(x)) . sqrt(var(y)))
= -17.5 / (sqrt(49) . sqrt(9))
= -17.5 / (7 x 3)
= -17.5 / 21
= -0.8333
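The arithmetic in (a) can be checked with a short Python sketch. The function name correlation_from_moments is illustrative only and does not come from the assignment; the inputs are the variance and covariance values given in the question.

```python
import math


def correlation_from_moments(cov_xy, var_x, var_y):
    """Coefficient of correlation r = Cov(x, y) / (SD(x) * SD(y))."""
    if var_x <= 0 or var_y <= 0:
        raise ValueError("variances must be positive")
    return cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))


# Values stated in Q2(a): var(x) = 49, var(y) = 9, cov(x, y) = -17.5
r = correlation_from_moments(cov_xy=-17.5, var_x=49, var_y=9)
print(round(r, 4))  # -0.8333
```

The negative sign indicates that 'x' and 'y' move in opposite directions, and the magnitude is always between 0 and 1 for valid inputs.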
(b) The factors which should be kept in mind for proper planning:
1. Nature of the problem to be investigated should be defined in an unambiguous manner.
2. Objectives of the investigation should be stated at the outset.
3. Scope of the investigation has to be made clear.
4. Determine whether to use primary or secondary data.
5. Determination of the number of investigators.
6. Training of investigators.
7. Supervision of investigators.
8. Requirement of funds.
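Q3. The percentage sugar content of Tobacco in two samples was represented in table 11.11. Test whether their population variances are same. [10 marks]
Table 1. Percentage sugar content of Tobacco in two samples
Sample A: 2.4 2.7 2.6 2.1 2.5
Sample B: 2.7 3.0 2.8 3.1 2.2 3.6
Answer.
Sample A (N = 5)
X      X-Mean(X)            sqr(X-Mean(X))
2.4    2.4-2.46 = -0.06     0.0036
2.7    2.7-2.46 = 0.24      0.0576
2.6    2.6-2.46 = 0.14      0.0196
2.1    2.1-2.46 = -0.36     0.1296
2.5    2.5-2.46 = 0.04      0.0016
                            SUM = 0.2120
Mean(X) = SUM(X) / N = (2.4 + 2.7 + 2.6 + 2.1 + 2.5) / 5 = 12.3 / 5 = 2.46
Variance = SUM(SQR(X-Mean(X))) / N = 0.212 / 5 = 0.0424
Sample B (N = 6)
X      X-Mean(X)            sqr(X-Mean(X))
2.7    2.7-2.9 = -0.20      0.04
3.0    3.0-2.9 = 0.10       0.01
2.8    2.8-2.9 = -0.10      0.01
3.1    3.1-2.9 = 0.20       0.04
2.2    2.2-2.9 = -0.70      0.49
3.6    3.6-2.9 = 0.70       0.49
                            SUM = 1.08
Mean(X) = SUM(X) / N = (2.7 + 3.0 + 2.8 + 3.1 + 2.2 + 3.6) / 6 = 17.4 / 6 = 2.9
Variance = SUM(SQR(X-Mean(X))) / N = 1.08 / 6 = 0.18
To test whether the population variances are the same, the F-ratio is formed from the unbiased variance estimates SUM / (N - 1): for Sample A, 0.212 / 4 = 0.053, and for Sample B, 1.08 / 5 = 0.216. Hence F = 0.216 / 0.053 = 4.08 (approximately) with (5, 4) degrees of freedom. The tabulated value of F at the 5% level for (5, 4) degrees of freedom is 6.26. Since 4.08 is less than 6.26, the hypothesis that the two population variances are the same is not rejected.
Q4. a. Explain the characteristics of business forecasting. [5 marks]
b. Differentiate between prediction, projection and forecasting. [5 marks]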
a. Characteristics of Business Forecasting
i. Based on past and present conditions: Business forecasting is based on the past and present economic condition of the business. To forecast the future, data, information and facts concerning the economic condition of the business in the past and present are analyzed.
ii. Based on mathematical and statistical methods: The process of forecasting uses statistical and mathematical methods. With these methods, the trend likely to take place in the future can be forecast.
iii. Period: A forecast can be made for the long term, short term, medium term or any specific term.
iv. Estimation of future: Business forecasting aims to forecast probable future economic conditions.
v. Scope: Forecasting can be physical as well as financial.
b. Difference between prediction, projection and forecasting:
A prediction is an estimate based solely on past data of the series under investigation. It is purely mechanical extrapolation.
A projection is a prediction where the extrapolated values are subject to certain numerical assumptions.
A forecast is an estimate which relates the series in which we are interested to external factors. Forecasts are made by estimating future values of the external factors by means of prediction, projection or forecast, and from these values calculating the estimate of the dependent variable.
Q5. What are the components of time series? Bring out the significance of moving average in analyzing a time series and point out its limitations. [10 marks]
Answer: There are 4 components of time series:
1. Secular trend
2. Seasonal variation
3. Cyclical variation
4. Irregular variation
1. Secular trend: A time series may show an upward or downward trend over a period of years, due to factors such as increase in population, technological progress, large scale shifts in consumers’ demands, etc.
Example - Population increases over a period of time, prices increase over a period of years, and the production of goods in the capital market of the country increases over a period of years. These are examples of upward trend. The sales of a commodity may decrease over a period of time because of better products coming to the market. This is an example of declining or downward trend. The increase or decrease in the movements of a time series is called secular trend.
2. Seasonal variation: Seasonal variations are short-term fluctuations in a time series which occur periodically within a year and repeat year after year.
Example - The major factors responsible for the repetitive pattern of seasonal variations are weather conditions and the customs of people. More woolen clothes are sold in winter than in summer. Regardless of the trend, in each year more ice creams are sold in summer and very few in winter. The sales in departmental stores are higher during festive seasons than on normal days.
3. Cyclical variations: Cyclical variations are recurrent upward or downward movements in a time series where the period of the cycle is greater than a year. These variations are also not as regular as seasonal variations. There are different types of cycles, varying in length and size.
Example - The ups and downs in business activities are the effects of cyclical variation.
A business cycle showing these oscillatory movements has to pass through four phases: prosperity, recession, depression and recovery. In a business, these four phases are completed by passing from one to another in this order.
4. Irregular variation: Irregular variations are fluctuations in a time series that are short in duration, erratic in nature and follow no regular pattern of occurrence. These variations are also referred to as residual variations since, by definition, they represent what is left in a time series after trend, cyclical and seasonal variations are removed.
Example - Irregular fluctuations result from unforeseen events like floods, earthquakes, wars, famines etc.
Q6. a. List down various measures of central tendency and explain the difference between them. [5 marks]
b. What is a confidence interval, and why is it useful? What is a confidence level? [5 marks]
Answer: a. Central Tendency - The term central tendency refers to the "middle" value or a typical value of the data. There are 3 measures of central tendency: mean, median and mode. Each of these measures is calculated differently, and the one that is best to use depends upon the situation.
Difference between mean, median and mode:
# | Mean | Median | Mode
1. | This is the average value of a set of numbers. | This is the number that is in the middle of the set. | This is the number that occurs most frequently in the set.
2. | It is based on all the values. | It is not based on all the values. | It is not based on all the values.
3. | It is affected by extreme values. | It is not affected by extreme values. | Much affected by sampling fluctuations.
4. | It can sometimes be located graphically. | Can be located graphically. | Can be located graphically.
5. | It cannot be determined for distributions with open-end class intervals. | It can be determined for distributions with open-end class intervals. | It can be determined for distributions with open-end class intervals.
6. | Can be used for quantitative data. | Can be used for qualitative data. | Can be used for qualitative data.
7. | Capable of further algebraic treatment. | Not capable of further algebraic treatment. | Not capable of further algebraic treatment.
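The contrast between the three measures, in particular their sensitivity to extreme values, can be illustrated with a small Python sketch. The two data sets below are hypothetical and chosen only for illustration; the helper name summarize is likewise an assumption, and the calculations rely on Python's standard statistics module.

```python
import statistics


def summarize(values):
    """Return the three measures of central tendency for a list of numbers."""
    return {
        "mean": statistics.mean(values),      # uses every value
        "median": statistics.median(values),  # middle value after sorting
        "mode": statistics.mode(values),      # most frequent value
    }


# Hypothetical data: the second set replaces the largest value with an extreme one
base = [2, 3, 3, 4, 5]
with_outlier = [2, 3, 3, 4, 50]

base_summary = summarize(base)
outlier_summary = summarize(with_outlier)
print(base_summary)
print(outlier_summary)
```

In this example the mean changes from 3.4 to 12.4 when the extreme value is introduced, while the median and the mode both stay at 3.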
b. Confidence interval – It gives an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data. A confidence interval is used to describe the amount of uncertainty associated with a sample estimate of a population parameter, and so expresses the precision and uncertainty associated with a particular sampling method.
A confidence interval consists of three parts:
1. Confidence level
2. Statistic
3. Margin of error
The confidence level describes the uncertainty of a sampling method. The statistic and the margin of error define an interval estimate that describes the precision of the method. The interval estimate of a confidence interval is defined by the sample statistic ± margin of error.
For example, we might say that we are 95% confident that the true population mean falls within a specified range. This statement is a confidence interval. It means that if we used the same sampling method to select different samples and computed an interval estimate for each, the true population mean would fall within the range defined by the sample statistic ± margin of error 95% of the time.
Confidence level - The probability part of a confidence interval is called a confidence level. The confidence level describes how strongly we believe that a particular sampling method will produce a confidence interval that includes the true population parameter.
Example - Suppose we collected many different samples and computed a confidence interval for each sample. Some confidence intervals would include the true population parameter; others would not. A 95% confidence level means that 95% of the intervals contain the true population parameter; a 90% confidence level means that 90% of the intervals contain the population parameter; and so on.
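The "statistic ± margin of error" structure described above can be sketched in Python. This is only an illustration under stated assumptions: it uses a normal (z) multiplier from the standard statistics module, which is a large-sample approximation (a t multiplier would usually be preferred for a sample this small), and the sample values and the function name mean_confidence_interval are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Interval estimate for a population mean: statistic +/- margin of error."""
    n = len(sample)
    mean = statistics.mean(sample)                           # the statistic
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # z multiplier for the chosen confidence level (about 1.96 for 95%)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error                              # the margin of error
    return mean - margin, mean + margin


# Hypothetical sample of measurements
sample = [52, 48, 50, 47, 53, 49, 51, 50, 46, 54]
low95, high95 = mean_confidence_interval(sample, confidence=0.95)
low99, high99 = mean_confidence_interval(sample, confidence=0.99)
print(f"95% interval: {low95:.2f} to {high95:.2f}")
print(f"99% interval: {low99:.2f} to {high99:.2f}")
```

Raising the confidence level from 95% to 99% widens the interval: more confidence that the interval covers the parameter is bought at the cost of precision.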
MBA SEMESTER 1
MB0040 – STATISTICS FOR MANAGEMENT - 4 Credits (Book ID: B1129)
Assignment Set- 2 (60 Marks)
Note: Each question carries 10 Marks. Answer all the questions
Q1. (a) What are the characteristics of a good measure of central tendency? [5 marks]
(b) What are the uses of averages? [5 marks]
Answer: (a) The characteristics of a good measure of central tendency are:
1. Present mass data in a concise form
The mass data is condensed to make the data readable and to use it for further analysis.
2. Facilitate comparison
It is difficult to compare two different sets of mass data, but we can compare them after computing the averages of the individual data sets. While comparing, the same measure of average should be used. Comparing the mean salary of one group of employees with the median salary of another leads to incorrect conclusions.
3. Establish relationship between data sets
The average can be used to draw inferences about the unknown relationships between the data sets. Computing the averages of the data sets is helpful for estimating the average of the population.
4. Provide basis for decision-making
In many fields, such as business, finance, insurance and other sectors, managers compute averages and draw useful inferences or conclusions for taking effective decisions.
The following are the requisites of a measure of central tendency:
• It should be simple to calculate and easy to understand
• It should be based on all values
• It should not be affected by extreme values
• It should not be affected by sampling fluctuation
• It should be rigidly defined
• It should be capable of further algebraic treatment
(b) Appropriate Situations for the use of Various Averages
1. Arithmetic mean is used when:
a. In depth study of the variable is needed
b. The variable is continuous and additive in nature
c. The data are in the interval or ratio scale
d. The distribution is symmetrical
2. Median is used when:
a. The variable is discrete
b. There exist abnormal values
c. The distribution is skewed
d. The extreme values are missing
e. The characteristics studied are qualitative
f. The data are on the ordinal scale
3. Mode is used when:
a. The variable is discrete
b. There exist abnormal values
c. The distribution is skewed
d. The extreme values are missing
e. The characteristics studied are qualitative
4. Geometric mean is used when:
a. The rate of growth, ratios and percentages are to be studied
b. The variable is of multiplicative nature
5. Harmonic mean is used when:
a. The study is related to speed, time
b. Average of rates which produce equal effects has to be found.
Q2. Calculate the 3 yearly and 5 yearly averages of the data in table below.
[10 marks]
Table 1: Production data from 1988 to 1997
Year    Production (in Lakh ton)
1988    15
1989    18
1990    16
1991    22
1992    19
1993    24
1994    20
1995    28
1996    22
1997    30
The table displays the calculated values of 3 yearly and 5 yearly averages. Each moving total is centred on the middle year of its period.
Year    Production    3-yearly moving total    3-yearly moving average    5-yearly moving total    5-yearly moving average
1988    15            -                        -                          -                        -
1989    18            49                       16.33                      -                        -
1990    16            56                       18.67                      90                       18.0
1991    22            57                       19.00                      99                       19.8
1992    19            65                       21.67                      101                      20.2
1993    24            63                       21.00                      113                      22.6
1994    20            72                       24.00                      113                      22.6
1995    28            70                       23.33                      124                      24.8
1996    22            80                       26.67                      -                        -
1997    30            -                        -                          -                        -
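The 3 yearly and 5 yearly averages asked for in Q2 can be computed with a short Python sketch. It uses the production figures from Table 1 of the question (1988 to 1997); the function name moving_averages is illustrative and not part of the assignment.

```python
def moving_averages(values, window):
    """Simple moving averages: the mean of each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Production (in Lakh ton) for 1988..1997, taken from Table 1 of the question
production = [15, 18, 16, 22, 19, 24, 20, 28, 22, 30]

three_yearly = moving_averages(production, 3)  # centred on 1989..1996
five_yearly = moving_averages(production, 5)   # centred on 1990..1995

print([round(v, 2) for v in three_yearly])
print([round(v, 2) for v in five_yearly])
```

Because both windows have an odd length, each average is centred on the middle year of its period, so the first and last years of the series have no moving average.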
Q3. (a) What is meant by secular trend? Discuss any two methods of isolating trend values in a time series. [5 marks]
(b) What is seasonal variation of a time series? Describe the various methods you know to evaluate it and examine their relative merits. [5 marks]
Answer:
(a) Secular Trend - This refers to the smooth or regular long term growth or decline of the series.
This movement can be characterized by a trend curve. If this curve is a straight line, then it is called a trend line. If the variable is increasing over a long period of time, then it is called an upward trend. If the variable is decreasing over a long period of time, then it is called a downward trend. If the variable moves upward or downwards along a straight line then the trend is called a linear trend, otherwise it is called a non-linear trend.
Methods of isolating trend values in a time series
1. Free hand or graphic method - This is the simplest method of drawing a trend curve. We plot the values of the variable against time on graph paper and join these points. The trend line is then fitted by inspecting the graph of the time series. Fitting a trend line by this method is arbitrary. The trend line is drawn such that the numbers of fluctuations on either side are approximately the same. The trend line should be a smooth curve. The free hand method has the following disadvantages:
i. It depends on individual judgment
ii. It cannot be used for any predictions of trends, as drawing the trend curve is arbitrary
2. Method of semi average - The steps for fitting a linear trend by the semi average method are as follows:
i. When the number of years is even, the data of the time series is divided into two equal parts. The items in each part are totalled and the total is divided by the number of items to obtain the arithmetic mean of each part. Each average is then centred in the period of time from which it has been computed and plotted on the graph paper. A straight line is drawn passing through these points. This is the required trend line.
ii. When the number of years is odd, the value of the middle year is omitted to divide the time series into two equal parts. Then the procedure described in ‘i’ is followed.
A trend value of any future year may be predicted by multiplying the periodic increment by the number of years into the future that is desired and adding the result to the last trend value listed in the series.
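The semi average method described above can be sketched in Python as follows. The yearly figures are hypothetical, and the function name semi_average_trend is an assumption made for illustration. The sketch returns a function that gives the trend value for any year, which is equivalent to adding the yearly increment to a known trend value.

```python
def semi_average_trend(years, values):
    """Fit a linear trend through the two semi-averages of a time series.

    If the number of years is odd, the middle year is left out so that
    both halves contain the same number of items.
    """
    n = len(values)
    half = n // 2
    if half < 1:
        raise ValueError("at least two observations are needed")

    # Centre each half: mean of its years and mean of its values
    first_centre = sum(years[:half]) / half
    first_average = sum(values[:half]) / half
    second_centre = sum(years[n - half:]) / half
    second_average = sum(values[n - half:]) / half

    # Yearly increment of the straight line through the two points
    increment = (second_average - first_average) / (second_centre - first_centre)
    return lambda year: first_average + increment * (year - first_centre)


# Hypothetical series with an even number of years
years = [2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008]
sales = [10, 13, 12, 17, 18, 19, 23, 24]

trend = semi_average_trend(years, sales)
print(trend(2008))  # trend value for the last year in the series
print(trend(2010))  # predicted trend value two years ahead
```

Here the two semi-averages are 13 (centred at 2002.5) and 21 (centred at 2006.5), so the increment is 2 per year.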
(b) Seasonal variation: Variations in a time series that are periodic in nature and occur regularly
over short periods of time during a year are called seasonal variations. By definition, these variations are precise and can be forecasted.
The following are examples of seasonal variations in a time series.
i. The prices of vegetables drop after the rainy season or in winter months and go up during summer, every year.
ii. The prices of cooking oils fall after the harvesting of oil seeds and go up after some time.
Methods of evaluating seasonal variations are:
i. Seasonal variation index or seasonal average method
ii. Seasonal variation through moving averages
iii. Chain or link relative method
iv. Ratio to trend method
Seasonal average method
In the seasonal average method, the steps followed are described below.
i) The time series is arranged by years and months or quarters.
ii) Totals of each month or quarter over all the years are obtained.
iii) The average for each month or quarter is obtained. The average may be mean or median. In general, we take the mean if not specified otherwise.
iv) Taking the average of the monthly or quarterly averages as equal to 100, the seasonal index for each month or quarter is calculated by the following formula:
Seasonal Index for a month (or quarter) = [Monthly (or quarterly) average for the month (or quarter) / Average of monthly (or quarterly) averages] × 100
Symbolically, the seasonal index for the first term is given by:
I1 = (S1 / S) × 100
Where,
S1 = Average of first term
S = Average of all terms = ∑Sj / k, j = 1, 2, 3, 4, ..., k
k = 12 for monthly data
k = 4 for quarterly data
Merits
This method is the simplest one.
Demerits
Most economic time series have trends and therefore, the seasonal index computed by this method is really an index of trends and seasons. The simple averages method of isolating seasonal fluctuations in a time series is based on the assumption that the series contains only seasonal and irregular fluctuations. This method does not give a true reflection of the normal seasonal variation, because it is obtained from the original data, which are affected not only by seasonal movements but also by the remaining three components.
The effects of cycles in the original data are not eliminated by the process of averaging.
This method is useful where no definite trend exists in the time series.
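The steps of the seasonal average method can be sketched in Python. The quarterly figures below are hypothetical and used only for illustration, as is the function name seasonal_indices; the mean is used as the average, as the text suggests when nothing else is specified.

```python
def seasonal_indices(data_by_year):
    """Seasonal average method: index = (season average / average of season averages) * 100.

    `data_by_year` maps each year to a list of values, one per season
    (4 values for quarterly data, 12 for monthly data).
    """
    rows = list(data_by_year.values())
    k = len(rows[0])  # number of seasons in a year
    if any(len(row) != k for row in rows):
        raise ValueError("every year must have the same number of seasons")

    # Steps ii and iii: average of each season over all the years
    season_averages = [sum(row[j] for row in rows) / len(rows) for j in range(k)]

    # Step iv: the average of the season averages is taken as 100
    overall_average = sum(season_averages) / k
    return [s / overall_average * 100 for s in season_averages]


# Hypothetical quarterly sales for three years
quarterly_sales = {
    2001: [30, 40, 36, 34],
    2002: [34, 52, 50, 44],
    2003: [40, 58, 54, 48],
}

indices = seasonal_indices(quarterly_sales)
print([round(i, 2) for i in indices])
```

The indices always add up to 100 × k (400 for quarterly data), which is a convenient check on the arithmetic.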
Q4. The probability that a contractor will get an electrical job is 0.8, that he will get a plumbing job is 0.6, and that he will get both is 0.48. What is the probability that he gets at least one? Are the events of getting an electrical job and getting a plumbing job independent? [10 marks]
Answer:
(a) Probability that the contractor will get an electrical job = P(E) = 0.8
Probability that the contractor will get a plumbing job = P(P) = 0.6
P(E ∩ P) = 0.48
P(E ∪ P) = P(E) + P(P) - P(E ∩ P) = 0.8 + 0.6 - 0.48 = 1.40 - 0.48 = 0.92
(b) As per the multiplication rule, if ‘A’ and ‘B’ are two independent events then the probability of the occurrence of both ‘A’ and ‘B’ is given by P(A ∩ B) = P(A) × P(B).
So for the events of getting an electrical job and a plumbing job to be independent of each other, P(E ∩ P) should be equal to P(E) × P(P).
Here P(E ∩ P) = 0.48 and P(E) × P(P) = 0.8 × 0.6 = 0.48, i.e. P(E ∩ P) = P(E) × P(P).
Hence getting an electrical job and getting a plumbing job are independent events.
Q5. (a) Discuss the errors that arise in statistical survey. [5 marks]
(b) What is quota sampling and when do we use it? [5 marks]
Answer (a) The term ‘error’ denotes the difference between the population value and its estimate provided by a sampling technique. The term is therefore not used in its ordinary sense in statistics. There are four types of errors in statistics:
Sampling errors
The sample results are bound to differ from the population results, since a sample is only a small portion of the population. This error is also known as inherent error and cannot be avoided. It is not worthwhile to try to eliminate it completely. These errors may be due to the following factors:
i. Faulty selection of sample
ii. Substitution of units to be studied
iii. Faulty demarcation of sampling units
iv. Error due to bias in estimation
However, sampling errors follow random or chance variations and tend to cancel each other out on averaging.
Non-sampling errors
Non-sampling errors are attributed to factors that can be controlled and eliminated by suitable actions. It is worthwhile to eliminate these errors. They are due to the following factors:
i. Faulty planning, faulty definitions
ii. Defective methods of interviewing
iii. Personal bias of investigator
iv. Lack of trained and qualified investigators
v. Respondents’ failure to answer
vi. Improper coverage
vii. Compiling errors
viii. Publication errors
Biased errors
These arise in both census and sampling methods. They occur due to the personal bias of the investigator and the instruments used for measuring. They are also due to faulty collection of data, respondents’ bias and bias due to non-response. Biased errors have a tendency to grow with sample size and are therefore also known as cumulative errors. The magnitude of biased errors is directly proportional to the sample size.
Unbiased errors
Errors due to over-estimation and under-estimation that are equal in magnitude are known as unbiased errors. They are also known as compensatory errors. They do not increase with sample size.
(b) Quota sampling
It is a type of judgment sampling. Under this design, quotas are set up according to some specified characteristic such as age groups or income groups. From each group a specified number of units are sampled according to the quota allotted to the group. Within the group the selection of sample units depends on personal judgment. It has a risk of personal prejudice and bias entering the process. This method is often used in public opinion studies.
Q6. (a) Why do we use a chi-square test? [ 5 marks] (b) Why do we use analysis of variance? [ 5 marks] Answer
(a) Chi square is a non-parametric test of statistical significance for bivariate tabular analysis (also known as crossbreaks). Any appropriately performed test of statistical significance lets you know the degree of confidence you can have in accepting or rejecting a hypothesis. Typically, the hypothesis tested with chi square is whether or not two different samples (of people, texts, whatever) are different enough in some characteristic or aspect of their behavior that we can generalize from our samples that the populations from which they are drawn are also different in that behavior or characteristic.
A non-parametric test, like chi square, is a rough estimate of confidence; it accepts weaker, less accurate data as input than parametric tests (like t-tests and analysis of variance, for example) and therefore has less status in the pantheon of statistical tests. Nonetheless, its limitations are also its strengths; because chi square is more 'forgiving' in the data it will accept, it can be used in a wide variety of research contexts.
Chi square is used most frequently to test the statistical significance of results reported in bivariate tables, and interpreting bivariate tables is integral to interpreting the results of a chi square test.
Chi-Square tests also allow us to do a lot more than just test for the equality of several proportions. If we classify a population into several categories with respect to two attributes (such as age and job performance), we can then use a Chi-Square test to determine whether the two attributes are independent of each other. So, Chi-Square tests can be applied to contingency tables.
(b) We use the analysis of variance for the following reasons:
When we have more than two populations, we use the analysis of variance to evaluate the mean differences between them.
It enables us to test for the significance of the differences among more than two sample means.
Using analysis of variance, we are able to make inferences about whether our samples are drawn from populations having the same mean or not.
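The comparison of several sample means described in (b) rests on an F-ratio of the variation between samples to the variation within samples. A minimal Python sketch of a one-way analysis of variance is given below; the three groups are hypothetical data and the function name one_way_anova_f is an assumption made for illustration.

```python
def one_way_anova_f(groups):
    """F-ratio for a one-way analysis of variance.

    F = (between-group mean square) / (within-group mean square)
    """
    k = len(groups)                           # number of samples
    n_total = sum(len(g) for g in groups)     # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the sample means around the grand mean
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own sample mean
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical samples from three populations
samples = [[8, 10, 12], [14, 16, 18], [20, 22, 24]]
f_ratio = one_way_anova_f(samples)
print(f_ratio)
```

For these figures the between-group mean square is 108 and the within-group mean square is 4, giving F = 27 with (2, 6) degrees of freedom. A value this large relative to the tabulated F would lead to rejecting the hypothesis that all three population means are equal.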