
Name:
Roll No:
Learning Centre:
Subject: STATISTICS MANAGEMENT
Assignment No: TWO

Date of Submission at the learning centre:

Q1.

What are the characteristics of a good measure of central tendency?

Ans. In statistics, the term central tendency relates to the way in which quantitative data tend to cluster around some value. A measure of central tendency is any of a number of ways of specifying this "central value". In practical statistical analyses, the terms are often used before one has chosen even a preliminary form of analysis: thus an initial objective might be to "choose an appropriate measure of central tendency". In the simplest cases, the measure of central tendency is an average of a set of measurements, the word average being variously construed as mean, median, or other measure of location, depending on the context. However, the term is applied to multidimensional data as well as to univariate data, and in situations where a transformation of the data values for some or all dimensions would usually be considered necessary; in the latter cases, the notion of a "central location" is retained by converting an "average" computed for the transformed data back to the original units. In addition, there are several different kinds of calculations for central tendency, where the kind of calculation depends on the type of data (level of measurement). Both "central tendency" and "measure of central tendency" apply either to statistical populations or to samples from a population.

Three measures of central tendency are: mean, median, and mode.

Mean: The mean of a distribution is the sum of the scores divided by the number of scores:

Sample Mean: M = Σx / n

Population Mean: μ = ΣX / N
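As a minimal sketch of the sample-mean formula (the scores are hypothetical, chosen for this example):

```python
# Sample mean: M = (sum of the scores) / (number of scores)
scores = [4, 8, 15, 16, 23, 42]   # hypothetical sample
M = sum(scores) / len(scores)
print(M)  # 18.0
```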

Some characteristics of the mean include:
1. Every score influences the mean: changing a score changes the mean, and adding or subtracting a score changes the mean (unless that score equals the mean).
2. If a constant value is added to every score, the same constant will be added to the mean; if a constant value is subtracted from every score, the same constant will be subtracted from the mean.
3. If every score is multiplied or divided by a constant, the mean will change in the same way (see the sketch below).
4. It is inappropriate to use the mean to summarize nominal and ordinal data; it is appropriate to use the mean to summarize interval and ratio data.
5. If the distribution is skewed or has outliers, the mean will be distorted.
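The shift and scale properties above can be checked numerically; a small sketch with made-up scores:

```python
# Minimal sketch: checking the shift and scale properties of the mean.
scores = [2, 4, 6, 8, 10]         # made-up illustrative scores

def mean(xs):
    return sum(xs) / len(xs)

print(mean(scores))                    # 6.0
print(mean([x + 3 for x in scores]))   # 9.0  -> the mean shifts by the constant
print(mean([x * 2 for x in scores]))   # 12.0 -> the mean scales by the constant
```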

Median: If the scores in a distribution are listed in order, the median is the midpoint of the list: half of the scores are below the median, and half of the scores are above it. To find the median:
1. Place the data in descending order (ascending would have worked too).
2. Find the score that cuts the sample into two halves.

Characteristics of the Median include:
1. It is inappropriate to use the median to summarize nominal data; it is appropriate to use the median to summarize ordinal, interval, and ratio data.
2. The median depends on the frequency of the scores, not on the actual values.
3. The median is not distorted by outliers or extreme scores.
4. The median is the preferred measure of central tendency when the distribution is skewed or distorted by outliers.

Mode: In a frequency distribution, the mode is the score or category that has the greatest frequency.

Characteristics of the Mode include: The mode may be used to summarize nominal, ordinal, interval, and ratio data. There may be more than one mode. The mode may not exist.

Relationships among the Mean, Median, and Mode: The mean and median are equal if the distribution is symmetric. The mean, median, and mode are all equal if the distribution is unimodal and symmetric. Otherwise, they generally give different answers.
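To illustrate (the data values are made up), Python's standard statistics module gives all three measures and shows how an outlier separates them:

```python
import statistics

# Made-up right-skewed sample: the outlier (40) pulls the mean upward.
scores = [1, 2, 2, 3, 4, 5, 40]

print(statistics.mean(scores))    # 8.14... (distorted by the outlier)
print(statistics.median(scores))  # 3 (robust to the outlier)
print(statistics.mode(scores))    # 2 (the most frequent score)
```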

Q 2. Your company has launched a new product. Your company is a reputed company with a 50% market share of a similar range of products. Your competitors also enter with their new products, equivalent to your new product. Based on your earlier experience, you initially estimated that your market share of the new product would be 50%. You carry out random sampling of 25 customers who have purchased the new product and realize that only eight of them have actually purchased your product. Plan a hypothesis test to check whether you are likely to have half of the market share.

Ans. A company has launched a new product. From earlier experience, it initially estimated that the market share of the new product would be 50%. Any hypothesis which specifies the population distribution completely is a statistical hypothesis, and statistical hypothesis testing plays a fundamental role in checking it.[6] The usual line of reasoning is as follows:

1. We start with a research hypothesis of which the truth is unknown.

2. The first step is to state the relevant null and alternative hypotheses. This is important, as mis-stating the hypotheses will muddy the rest of the process. Specifically, the null hypothesis should be chosen in such a way that it allows us to conclude whether the alternative hypothesis can either be accepted or stays undecided as it was before the test.

3. The second step is to consider the statistical assumptions being made about the sample in doing the test; for example, assumptions about the statistical independence of the observations or about the form of their distributions. This is equally important, as invalid assumptions will mean that the results of the test are invalid.

4. Decide which test is appropriate, and state the relevant test statistic T.

5. Derive the distribution of the test statistic under the null hypothesis from the assumptions. In standard cases this will be a well-known result; for example, the test statistic may follow a Student's t distribution or a normal distribution.

6. The distribution of the test statistic partitions the possible values of T into those for which the null hypothesis is rejected (the so-called critical region) and those for which it is not.

7. Compute from the observations the observed value t_obs of the test statistic T.

8. Decide either to fail to reject the null hypothesis or to reject it in favor of the alternative. The decision rule is to reject the null hypothesis H0 if the observed value t_obs is in the critical region, and to accept or "fail to reject" the hypothesis otherwise.

It is important to note the philosophical difference between accepting the null hypothesis and simply failing to reject it. The "fail to reject" terminology highlights the fact that the null hypothesis is assumed to be true from the start of the test; if there is a lack of evidence against it, it simply continues to be assumed true. The phrase "accept the null hypothesis" may suggest it has been proved simply because it has not been disproved, a logical fallacy known as the argument from ignorance. Unless a test with particularly high power is used, the idea of "accepting" the null hypothesis may be dangerous. Nonetheless the terminology is prevalent throughout statistics, where its meaning is well understood. Alternatively, if the testing procedure forces us to reject the null hypothesis (H-null), we can accept the alternative hypothesis (H-alt) and conclude that the research hypothesis is supported by

the data. This expresses that our procedure is based on probabilistic considerations, in the sense that we accept that using another sample could lead us to a different conclusion. A worked sketch of this test for the market-share problem follows below.
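As a minimal sketch (assuming the normal approximation to the binomial and a 5% significance level, neither of which the assignment fixes), the test can be carried out in Python:

```python
import math

# H0: p = 0.5 (half market share)   H1: p != 0.5 (two-tailed)
n, x, p0 = 25, 8, 0.5
p_hat = x / n                                      # observed proportion = 0.32

# Test statistic under H0 (normal approximation to the binomial).
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)    # = -1.8

# Assumed 5% significance level, two-tailed: critical region is |z| > 1.96.
if abs(z) > 1.96:
    print(f"z = {z:.2f}: reject H0 -- market share differs from 50%")
else:
    print(f"z = {z:.2f}: fail to reject H0 -- a 50% share remains plausible")
```

Since |z| = 1.8 does not exceed 1.96, we fail to reject H0: the sample of 25 customers is consistent with a 50% market share at the assumed 5% level.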

3. The upper and the lower quartile incomes of a group of workers are Rs 8 and Rs 3 per day respectively. Calculate the quartile deviation and its coefficient.

Ans. Quartile Deviation: The difference between the upper quartile (Q3) and the lower quartile (Q1) is called the inter-quartile range. Half of this difference is called the semi-inter-quartile range, or the quartile deviation. Thus:

Quartile Deviation (Q.D.) = (Q3 - Q1) / 2

In this question, Q1 = 3 and Q3 = 8, so:

Q.D. = (8 - 3) / 2 = 2.5

Hence the quartile deviation is Rs 2.5 per day.

Coefficient of Quartile Deviation: A relative measure of dispersion based on the quartile deviation is called the coefficient of quartile deviation. It is defined as:

Coefficient of Quartile Deviation = [(Q3 - Q1) / 2] / [(Q3 + Q1) / 2] = (Q3 - Q1) / (Q3 + Q1)

Here, Q1 = 3 and Q3 = 8, so:

Coefficient of Quartile Deviation = (8 - 3) / (8 + 3) = 5/11 = 0.455

The coefficient of quartile deviation is 0.455; as a relative measure, it is a pure number with no units.
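A small Python sketch of the same computation (the helper function is hypothetical, written just for this example):

```python
# Hypothetical helper computing both measures from the two quartiles.
def quartile_deviation(q1, q3):
    qd = (q3 - q1) / 2               # semi-inter-quartile range
    coeff = (q3 - q1) / (q3 + q1)    # relative, unit-free measure
    return qd, coeff

qd, coeff = quartile_deviation(3, 8)
print(qd)     # 2.5 (Rs per day)
print(coeff)  # 0.4545...
```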

4.

The cost of living index number on a certain date was 200. From the base period, the percentage increases in prices were: Rent 60, Clothing 250, Fuel and Light 150 and Miscellaneous 120. The weights for the different groups were: Food 60, Rent 16, Clothing 12, Fuel and Light 8 and Miscellaneous 4.

Ans. Arranging the data in tabular form for easy representation:

Item            P (% increase)  w (Weight)  wP
Rent            60              16          960
Clothing        250             12          3000
Fuel and Light  150             8           1200
Miscellaneous   120             4           480
Food                            60          60
Total                           Σw = 100    ΣwP = 5700

P01 = ΣwP / Σw = 5700 / 100 = 57. Hence the cost of living index number is 57.
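A minimal sketch of the weighted-average computation in Python, using the table's figures as given:

```python
# P01 = sum(wP) / sum(w), with the figures from the table above.
wP = [960, 3000, 1200, 480, 60]   # Rent, Clothing, Fuel and Light, Misc., Food
w = [16, 12, 8, 4, 60]
P01 = sum(wP) / sum(w)
print(P01)   # 57.0
```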

5.

Education seems to be a difficult field in which to use quality techniques. One

possible outcome measure for colleges is the graduation rate (the percentage of the students matriculating who graduate on time). Would you recommend using P or R charts to examine graduation rates at a school? Would this be a good measure of quality?

Ans. In statistical quality control, the p-chart is a type of control chart used to monitor the proportion of nonconforming units in a sample, where the sample proportion nonconforming is defined as the ratio of the number of nonconforming units to the sample size, n. The p-chart only accommodates "pass"/"fail"-type inspection, as determined by one or more go/no-go gauges or tests, effectively applying the specifications to the data before they are plotted on the chart. Other types of control charts display the magnitude of the quality characteristic under study, making troubleshooting possible directly from those charts. Some practitioners have pointed out that the p-chart is sensitive to its underlying assumptions, since it uses control limits derived from the binomial distribution rather than from the observed sample variance. Because of this sensitivity, p-charts are often implemented incorrectly, with control limits that are either too wide or too narrow, leading to incorrect decisions about process stability. A p-chart is a form of the individuals chart (also referred to as "XmR" or "ImR"), and these practitioners recommend the individuals chart as a more robust alternative for count-based data.
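As a minimal sketch (the cohort size and graduation counts are made-up illustrative figures), binomial-based p-chart limits for a graduation rate could be computed as:

```python
import math

# Made-up data: yearly cohorts of 200 students, counts graduating on time.
cohort_n = 200
grad_counts = [150, 160, 142, 155, 148]

p_bar = sum(grad_counts) / (cohort_n * len(grad_counts))   # centre line

# Binomial-based 3-sigma control limits for the p-chart.
sigma = math.sqrt(p_bar * (1 - p_bar) / cohort_n)
ucl = min(1.0, p_bar + 3 * sigma)
lcl = max(0.0, p_bar - 3 * sigma)
print(f"centre = {p_bar:.3f}, LCL = {lcl:.3f}, UCL = {ucl:.3f}")
```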

R Chart:

Range charts are used when you can rationally collect measurements in groups (subgroups) of between two and ten observations. Each subgroup represents a "snapshot" of the process at a given point in time. The charts' x-axes are time-based, so the charts show a history of the process. For this reason, you must have time-ordered data; that is, data entered in the sequence in which it was generated. If this is not the case, trends or shifts in the process may not be detected, but instead be attributed to random (common cause) variation. For subgroup sizes greater than ten, use X-bar / Sigma charts, since the range statistic is a poor estimator of process sigma for large subgroups; in fact, the subgroup sigma is always a better estimate of subgroup variation than the subgroup range. The popularity of the range chart is due only to its ease of calculation, dating to its use before the advent of computers. For subgroup sizes equal to one, an Individual-X / Moving Range chart can be used, as can EWMA or CUSUM charts. X-bar charts are efficient at detecting relatively large shifts in the process average, typically shifts of ±1.5 sigma or larger; the larger the subgroup, the more sensitive the chart will be to shifts, provided a rational subgroup can be formed. Hence, the R chart will be a good measure of quality instead of the P chart.

6. (a) Why do we use a chi-square test?

Ans. The Chi-Square test is a non-parametric test. It is used to test the independence of attributes, goodness of fit and a specified variance. The Chi-Square test does not require any assumptions regarding the shape of the population distribution from which the sample was drawn. It assumes that samples are drawn at random and that external forces, if any, act on them in equal magnitude. The Chi-Square distribution is a family of distributions: for every degree of freedom, there is one chi-square distribution. An important criterion for applying the Chi-Square test is that the sample size should be very large; none of the theoretical (expected) values calculated should be less than five. The important applications of the Chi-Square test are the tests for independence of attributes, the test of goodness of fit and the test for a specified variance. The chi-square (χ²) test measures the alignment between two sets of frequency measures. These must be categorical counts and not percentages or ratio measures (for those, use another correlation test). Note that the frequency numbers should be significant: each should be at least above 5 (although an occasional lower figure may be possible, as long as it is not part of a pattern of low figures).

Goodness of fit: A common use is to assess whether a measured/observed set of frequencies follows an expected pattern. The expected frequencies may be determined from prior knowledge (such as a previous year's exam results) or by calculation of an average from the given data. The null hypothesis, H0, is that the two sets of measures are not significantly different.

Independence: The chi-square test can be used in the reverse manner to goodness of fit. If two sets of measures are compared, then just as you can show that they align, you can also determine whether they do not align. The null hypothesis here is that the two sets of measures are similar. The main difference between goodness-of-fit and independence assessments is in the use of the Chi-Square table: for goodness of fit, attention is on the 0.05, 0.01 or 0.001 figures; for independence, it is on the 0.95 or 0.99 figures (this is why the table has two ends to it).
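A minimal goodness-of-fit sketch in Python (the grade counts are hypothetical, and scipy is assumed to be available):

```python
from scipy.stats import chisquare

# Hypothetical exam-grade counts: observed this year vs. the expected
# pattern from a previous year (all counts comfortably above 5).
observed = [18, 30, 32, 20]
expected = [25, 25, 25, 25]

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {stat:.2f}, p = {p_value:.4f}")
# H0: the observed counts follow the expected pattern.
# Reject H0 if p is below the (assumed) 0.05 significance level.
```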

(b) Why do we use analysis of variance?

Ans. Let's start with the basic concept of a variance. It is simply the difference between what you expected and what you actually received. If you expected something to cost $1 and it in fact cost $1.25, then you have a variance of $0.25 more than expected: you spent $0.25 more than you planned. When calculating your variances, take materiality into consideration. A variance of $0.25 isn't a big deal if the quantity produced is very small; however, as the production run increases, that variance can add up quickly. Most projects generate tons of variances every day. To avoid a tidal wave of inconsequential numbers, focus instead on the large variances. For example, it is far more important to find out why there is a $10,000 cost variance than to spend two days determining why an expense report was $75 over budget. We do variance analysis in order to learn. One of the easiest and most objective ways to see that things need to change is to watch the financials and ask questions. Don't get me wrong: you cannot and should not base important decisions solely on financial data. You must use the data as a basis to understand areas for further analysis. For example, if a bandsaw is a bottleneck, go to the department and ask why. The reasons for the variance may range from the normal operator being out sick, to a worn blade, to there not being enough crewing and a great deal of overtime being incurred. Use the numbers to highlight areas to investigate, but do not make decisions without first investigating further. Point-in-time variances, meaning singular occurrences, can help some. To make real

gains, look at trends over time. If our earlier variance of $0.25 is judged as a one-time event, is that good or bad? We cannot tell from just one value, so let's look at the trend over time. If we see that the negative variance over time was $0.01, $0.05, $0.10, $0.12 and $0.25, then we can see that there apparently is a steady trend of increasing costs which, if large enough to be material, should be investigated. Yes, this can take a lot of time if done manually. However, spreadsheets and computer systems can be used to generate real-time variance reports that are incredibly useful with little to no work to actually run the report. Variance analysis, and cost accounting in general, are very interesting fields with a great deal of specialized knowledge. By using variance analysis to identify areas of concern, management has another tool to monitor project and organizational health. People reviewing the variances should focus on the important exceptions so management can become aware of changes in the organization, the environment and so on. Without this information, management risks blindly proceeding down a path that cannot be judged as good or bad.
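As a small sketch using the illustrative trend figures from the paragraph above:

```python
# Point-in-time variances (actual - expected) over five periods,
# using the illustrative figures from the text.
variances = [0.01, 0.05, 0.10, 0.12, 0.25]

# A steadily increasing trend suggests rising costs worth investigating.
increasing = all(a < b for a, b in zip(variances, variances[1:]))
print(f"total variance = ${sum(variances):.2f}, steadily increasing: {increasing}")
```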

