
# ASA University Bangladesh Nonparametric Methods

Jesmin Akter, Assistant Professor, Faculty of Business

Nonparametric methods are statistical methods that require less restrictive assumptions about the level of data measurement and fewer assumptions about the form of the probability distributions generating the sample data. Nonparametric methods are often called distribution-free methods. In general, for a statistical method to be classified as nonparametric, it must satisfy at least one of the following conditions:

- The method can be used with nominal data.
- The method can be used with ordinal data.
- The method can be used with interval or ratio data when no assumption can be made about the population probability distribution.

## Sign Test

The sign test is a nonparametric statistical test for identifying differences between two populations based on the analysis of nominal data. A common application of the sign test involves using a sample of potential customers to identify a preference for one of two brands of a product. To record the preference data, a plus sign is used if the individual prefers one brand and a minus sign if the individual prefers the other brand. Since the data are recorded as plus and minus signs, this test is called the sign test.

### Example 1

Sun Coast Farms produces an orange juice product marketed under the name Citrus Valley. A competitor of Sun Coast Farms produces an orange juice product known as Tropical Orange. In a study of consumer preferences for the two brands, 12 individuals were given unmarked samples of each product. The following table lists the preferences given by the 12 individuals; a plus sign records a preference for Citrus Valley and a minus sign a preference for Tropical Orange.

| Individual | Brand Preference | Recorded Data |
|---|---|---|
| 1 | Tropical Orange | − |
| 2 | Tropical Orange | − |
| 3 | Citrus Valley | + |
| 4 | Tropical Orange | − |
| 5 | Tropical Orange | − |
| 6 | Tropical Orange | − |
| 7 | Tropical Orange | − |
| 8 | Tropical Orange | − |
| 9 | Citrus Valley | + |
| 10 | Tropical Orange | − |
| 11 | Tropical Orange | − |
| 12 | Tropical Orange | − |


With $\alpha = 0.05$, test whether there is a significant difference in the preference for the two brands of orange juice and make an appropriate conclusion.

Solution

Let $p$ denote the proportion of the population of customers favoring Citrus Valley. We have to test the following hypotheses:

$$H_0: p = 0.50 \qquad H_a: p \neq 0.50$$

From the given information, the number of plus signs is 2. Adding the probabilities for 2, 1, and 0 plus signs from the binomial table with $n = 12$ and $p = 0.50$, we have 0.0161 + 0.0029 + 0.0002 = 0.0192. Because the test is two-tailed, the p-value is

p-value = 2(0.0192) = 0.0384

Decision rule: As p-value $= 0.0384 < \alpha = 0.05$, we may reject the null hypothesis.

Comment: We may conclude that the customer preferences are different for the two brands of orange juice. We can also say that the customers prefer Tropical Orange to Citrus Valley.

### Assignment 3 Problem 1

The following table lists the preferences indicated by 10 individuals in a taste test involving two brands of a product.

Individual 1 2 3 4 5 Brand A Versus Brand B + Individual 6 7 8 9 10 Brand A Versus Brand B +

At the given level of significance $\alpha$, test for a significant difference in the preferences for the two brands. A plus indicates a preference for brand A over brand B.
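The two-tailed p-value in Example 1 can also be computed exactly rather than read from a binomial table. The following is a minimal sketch using only the Python standard library; the function name `sign_test_p_value` is an illustrative choice, not something defined in these notes. Because it does not round the individual binomial probabilities, it gives 0.0386 instead of the table-based 0.0384; the conclusion at the 0.05 level is the same.

```python
from math import comb

def sign_test_p_value(n_plus: int, n: int) -> float:
    """Exact two-tailed sign-test p-value under H0: p = 0.50."""
    # Use the smaller of the two sign counts as the tail boundary.
    k = min(n_plus, n - n_plus)
    # P(X <= k) for X ~ Binomial(n, 0.5): each outcome has probability 1 / 2**n.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-tail probability, capping at 1.
    return min(1.0, 2 * tail)

# Example 1: 2 plus signs (Citrus Valley) among 12 individuals.
p_value = sign_test_p_value(n_plus=2, n=12)
print(round(p_value, 4))   # 0.0386
print(p_value < 0.05)      # True, so H0 is rejected at the 0.05 level
```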


## Wilcoxon Signed-Rank Test

The Wilcoxon signed-rank test is a nonparametric statistical test for identifying differences between two populations based on the analysis of two matched or paired samples.

### Example 2

A manufacturing firm is attempting to determine whether two production methods differ in task completion time. A sample of 11 workers was selected, and each worker completed a production task using each of the production methods. The production task completion times are given in the following table:

| Worker | Method 1 | Method 2 |
|---|---|---|
| 1 | 10.2 | 9.5 |
| 2 | 9.6 | 9.8 |
| 3 | 9.2 | 8.8 |
| 4 | 10.6 | 10.1 |
| 5 | 9.9 | 10.3 |
| 6 | 10.2 | 9.3 |
| 7 | 10.6 | 10.5 |
| 8 | 10.0 | 10.0 |
| 9 | 11.2 | 10.6 |
| 10 | 10.7 | 10.2 |
| 11 | 10.6 | 9.8 |

Use the Wilcoxon signed-rank test to see whether the task completion times differ for the two methods at the 5% level of significance. What is your conclusion?

Solution

We have two populations of task completion times, one population associated with each method. We have to test the following hypotheses:

$H_0$: The populations are identical

$H_a$: The populations are not identical

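Calculation Table

| Worker | Method 1 | Method 2 | Difference | Absolute Value of Difference | Signed Rank |
|---|---|---|---|---|---|
| 1 | 10.2 | 9.5 | 0.7 | 0.7 | +8 |
| 2 | 9.6 | 9.8 | -0.2 | 0.2 | -2 |
| 3 | 9.2 | 8.8 | 0.4 | 0.4 | +3.5 |
| 4 | 10.6 | 10.1 | 0.5 | 0.5 | +5.5 |
| 5 | 9.9 | 10.3 | -0.4 | 0.4 | -3.5 |
| 6 | 10.2 | 9.3 | 0.9 | 0.9 | +10 |
| 7 | 10.6 | 10.5 | 0.1 | 0.1 | +1 |
| 8 | 10.0 | 10.0 | 0.0 | 0.0 | |
| 9 | 11.2 | 10.6 | 0.6 | 0.6 | +7 |
| 10 | 10.7 | 10.2 | 0.5 | 0.5 | +5.5 |
| 11 | 10.6 | 9.8 | 0.8 | 0.8 | +9 |

Let $T$ denote the sum of signed rank values. The zero difference for worker 8 is discarded, which leaves $n = 10$ ranked differences, and the signed ranks sum to $T = 44$. Under the null hypothesis, the mean and standard deviation of $T$ are

$$\mu_T = 0 \quad \text{and} \quad \sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{6}} = \sqrt{\frac{(10)(11)(21)}{6}} = 19.62$$

Test Statistic: The test statistic for the Wilcoxon signed-rank test will be

$$z = \frac{T - \mu_T}{\sigma_T} = \frac{44 - 0}{19.62} = 2.24$$

Decision rule: As $|z| = 2.24 > z_{0.025} = 1.96$, we may reject the null hypothesis.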

Comment: We may conclude that the two populations are not identical and that the methods differ in task completion time. Method 2's shorter completion times for 8 of the workers lead us to conclude that method 2 is the preferred production method.
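The signed ranks and the test statistic of Example 2 can be reproduced with a short script. This is a sketch under the large-sample normal approximation, using the standard formula $\sigma_T = \sqrt{n(n+1)(2n+1)/6}$; the helper `average_rank` is an illustrative name introduced here, and differences are rounded to one decimal place on the assumption that the data are recorded to one decimal.

```python
from math import sqrt

method_1 = [10.2, 9.6, 9.2, 10.6, 9.9, 10.2, 10.6, 10.0, 11.2, 10.7, 10.6]
method_2 = [9.5, 9.8, 8.8, 10.1, 10.3, 9.3, 10.5, 10.0, 10.6, 10.2, 9.8]

def average_rank(value, pooled):
    """Rank of `value` within `pooled`; tied values share their average rank."""
    smaller = sum(x < value for x in pooled)
    equal = sum(x == value for x in pooled)
    return smaller + (equal + 1) / 2

# Round to one decimal so binary floating-point noise cannot hide tied differences.
differences = [round(a - b, 1) for a, b in zip(method_1, method_2)]
nonzero = [d for d in differences if d != 0]   # zero differences are discarded
absolute = [abs(d) for d in nonzero]

# Rank the absolute differences, then restore the sign of each difference.
signed_ranks = [average_rank(abs(d), absolute) * (1 if d > 0 else -1)
                for d in nonzero]

n = len(nonzero)                               # number of ranked differences
T = sum(signed_ranks)                          # sum of signed ranks
sigma_T = sqrt(n * (n + 1) * (2 * n + 1) / 6)  # standard deviation of T under H0
z = T / sigma_T                                # mean of T under H0 is 0

print(signed_ranks)  # [8.0, -2.0, 3.5, 5.5, -3.5, 10.0, 1.0, 7.0, 5.5, 9.0]
print(n, T, round(sigma_T, 2), round(z, 2))    # 10 44.0 19.62 2.24
```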


### Assignment 3 Problem 2

Two additives are tested to determine their effects on miles per gallon for passenger cars. Test results for 12 cars follow; each car was tested with both fuel additives. Use the given level of significance $\alpha$ and the Wilcoxon signed-rank test to see whether there is a significant difference in the additives.

| Car | Additive 1 | Additive 2 |
|---|---|---|
| 1 | 20.12 | 18.05 |
| 2 | 23.56 | 21.77 |
| 3 | 22.03 | 22.57 |
| 4 | 19.15 | 17.06 |
| 5 | 21.23 | 21.22 |
| 6 | 24.77 | 23.80 |
| 7 | 16.16 | 17.20 |
| 8 | 18.55 | 14.98 |
| 9 | 21.87 | 20.03 |
| 10 | 24.23 | 21.15 |
| 11 | 23.21 | 22.78 |
| 12 | 25.02 | 23.70 |

## Mann-Whitney-Wilcoxon (MWW) Test

The Mann-Whitney-Wilcoxon test is a nonparametric statistical test for identifying differences between two populations based on the analysis of two independent samples. The test was jointly developed by Mann, Whitney, and Wilcoxon. It is sometimes called the Wilcoxon rank-sum test.

### Small Sample Case

The small sample case for the MWW test should be used whenever the sample sizes for both populations are less than or equal to 10.

### Example 3

The majority of the students attending Johnston High School previously attended either Garfield Junior High School or Mulberry Junior High School. The school administration wishes to know whether the population of students who had attended Garfield is identical to the population of students who had attended Mulberry in terms of academic potential. The sample ordinal class standings are presented in the following table:

| Garfield Student | Class Standing | Mulberry Student | Class Standing |
|---|---|---|---|
| Fields | 8 | Hart | 70 |
| Clark | 52 | Phipps | 202 |
| Jones | 112 | Kirkwood | 144 |
| Tibbs | 21 | Abbott | 175 |
| | | Guest | 146 |

At the 5% level of significance, use the MWW test to test whether the population of students who had attended Garfield is identical to the population of students who had attended Mulberry in terms of academic potential.


Calculation Table

| Garfield Student | Class Standing | Sample Rank | Mulberry Student | Class Standing | Sample Rank |
|---|---|---|---|---|---|
| Fields | 8 | 1 | Hart | 70 | 4 |
| Clark | 52 | 3 | Phipps | 202 | 9 |
| Jones | 112 | 5 | Kirkwood | 144 | 6 |
| Tibbs | 21 | 2 | Abbott | 175 | 8 |
| | | | Guest | 146 | 7 |
| Sum of Ranks | | T = 11 | | | |

The rank sum for the sample of four students from Garfield is $T = 11$. To judge the calculated rank sum $T$, we need the lower and upper critical values $T_L$ and $T_U$. The value of $T_L$ is read directly from the table of critical values. From the table, for $n_1 = 4$ and $n_2 = 5$ at $\alpha = 0.05$, we have $T_L = 12$, and

$$T_U = n_1(n_1 + n_2 + 1) - T_L$$

where $n_1$ and $n_2$ are the two sample sizes. Therefore,

$$T_U = 4(4 + 5 + 1) - 12 = 28$$

Thus, the decision rule for the MWW test is that the null hypothesis of identical populations is rejected if the rank sum for the first sample (Garfield) is less than 12 or greater than 28. As $T = 11 < T_L = 12$, we may reject the null hypothesis.

Comment: We conclude that the population of students at Garfield differs from the population of students at Mulberry in terms of academic potential. The higher class rankings obtained by the sample of Garfield students suggest that Garfield students are better prepared for high school than the Mulberry students.
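The small-sample decision rule of Example 3 can be sketched in a few lines. The lower critical value `T_L = 12` is taken from the table quoted in the example (it is not computed here), and the variable names are illustrative. The ranking step assumes there are no tied class standings, which holds for this sample.

```python
garfield = {"Fields": 8, "Clark": 52, "Jones": 112, "Tibbs": 21}
mulberry = {"Hart": 70, "Phipps": 202, "Kirkwood": 144, "Abbott": 175, "Guest": 146}

# Pool both samples and rank from best (lowest) class standing upward.
pooled = sorted(list(garfield.values()) + list(mulberry.values()))
rank_of = {standing: rank for rank, standing in enumerate(pooled, start=1)}  # no ties here

n1, n2 = len(garfield), len(mulberry)
T = sum(rank_of[standing] for standing in garfield.values())  # rank sum, first sample

T_L = 12                         # from the critical-value table for n1 = 4, n2 = 5, alpha = 0.05
T_U = n1 * (n1 + n2 + 1) - T_L   # upper critical value

reject_h0 = T < T_L or T > T_U
print(T, T_L, T_U, reject_h0)    # 11 12 28 True
```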

### Assignment 3 Problem 3

Two fuel additives are being tested to determine their effect on gas mileage. Seven cars were tested with additive 1 and nine cars were tested with additive 2. The following data show the miles per gallon obtained with the two additives. Use the given level of significance $\alpha$ and the MWW test to see whether there is a significant difference in gasoline mileage for the two additives.

| Additive 1 | Additive 2 |
|---|---|
| 17.3 | 18.7 |
| 18.4 | 17.8 |
| 19.1 | 21.3 |
| 16.7 | 21.0 |
| 18.2 | 22.1 |
| 18.6 | 18.7 |
| 17.5 | 19.8 |
| | 20.7 |
| | 20.2 |

### Large Sample Case

The large sample case of the MWW test is used when the sample sizes for both samples are greater than or equal to 10.

### Example 4

Third National Bank has two branch offices. Account balance data for each branch were collected through independent simple random sampling and are presented in the following table:

| Account | Branch 1 Balance (\$) | Branch 2 Balance (\$) |
|---|---|---|
| 1 | 1095 | 885 |
| 2 | 955 | 850 |
| 3 | 1200 | 915 |
| 4 | 1195 | 950 |
| 5 | 925 | 800 |
| 6 | 950 | 750 |
| 7 | 805 | 865 |
| 8 | 945 | 1000 |
| 9 | 875 | 1050 |
| 10 | 1055 | 935 |
| 11 | 1025 | |
| 12 | 975 | |

Do the data indicate whether the populations of balances for the two branches are identical? Use $\alpha = 0.05$.

Solution

In order to find whether the populations of balances for the two branches are identical, we have to test the following hypotheses:

$H_0$: The two populations of account balances are identical

$H_a$: The two populations of account balances are not identical

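Calculation Table

| Account | Branch 1 Balance (\$) | Rank | Branch 2 Balance (\$) | Rank |
|---|---|---|---|---|
| 1 | 1095 | 20 | 885 | 7 |
| 2 | 955 | 14 | 850 | 4 |
| 3 | 1200 | 22 | 915 | 8 |
| 4 | 1195 | 21 | 950 | 12.5 |
| 5 | 925 | 9 | 800 | 2 |
| 6 | 950 | 12.5 | 750 | 1 |
| 7 | 805 | 3 | 865 | 5 |
| 8 | 945 | 11 | 1000 | 16 |
| 9 | 875 | 6 | 1050 | 18 |
| 10 | 1055 | 19 | 935 | 10 |
| 11 | 1025 | 17 | | |
| 12 | 975 | 15 | | |
| Sum of Ranks | | 169.5 | | 83.5 |

Test Statistic: Let $T$ denote the sum of ranks for the Branch 1 sample, so $T = 169.5$. Under the null hypothesis, the test statistic will be

$$z = \frac{T - \mu_T}{\sigma_T}$$

where, with $n_1 = 12$ and $n_2 = 10$,

$$\mu_T = \frac{1}{2} n_1 (n_1 + n_2 + 1) = \frac{1}{2}(12)(12 + 10 + 1) = 138$$

and

$$\sigma_T = \sqrt{\frac{1}{12} n_1 n_2 (n_1 + n_2 + 1)} = \sqrt{\frac{1}{12}(12)(10)(12 + 10 + 1)} = 15.17$$

Hence,

$$z = \frac{169.5 - 138}{15.17} = 2.08$$

Critical Value: The critical value of $z$ at the 5% level of significance is $z_{0.025} = 1.96$.

Decision rule: As $|z| = 2.08 > 1.96$, we may reject the null hypothesis.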
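The ranking and the normal approximation used in Example 4 can be checked with the sketch below. It assumes the standard large-sample formulas $\mu_T = \tfrac{1}{2} n_1(n_1+n_2+1)$ and $\sigma_T = \sqrt{n_1 n_2 (n_1+n_2+1)/12}$; `average_rank` is an illustrative helper, and 1.96 is the two-tailed critical value for $\alpha = 0.05$.

```python
from math import sqrt

branch_1 = [1095, 955, 1200, 1195, 925, 950, 805, 945, 875, 1055, 1025, 975]
branch_2 = [885, 850, 915, 950, 800, 750, 865, 1000, 1050, 935]

def average_rank(value, pooled):
    """Rank of `value` within `pooled`; tied values share their average rank."""
    smaller = sum(x < value for x in pooled)
    equal = sum(x == value for x in pooled)
    return smaller + (equal + 1) / 2

pooled = branch_1 + branch_2
ranks_1 = [average_rank(x, pooled) for x in branch_1]
ranks_2 = [average_rank(x, pooled) for x in branch_2]

n1, n2 = len(branch_1), len(branch_2)
T = sum(ranks_1)                               # rank sum for the Branch 1 sample
mu_T = n1 * (n1 + n2 + 1) / 2                  # mean of T under H0
sigma_T = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # standard deviation of T under H0
z = (T - mu_T) / sigma_T

print(T, sum(ranks_2))                         # 169.5 83.5
print(mu_T, round(sigma_T, 2), round(z, 2))    # 138.0 15.17 2.08
print(abs(z) > 1.96)                           # True, so H0 is rejected
```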

Comment: From the hypothesis test we may conclude that the populations of account balances at the two branches are not the same.

### Assignment 3 Problem 4

Mileage performance tests were conducted for two models of automobiles. Twelve automobiles of each model were selected randomly and a miles-per-gallon rating for each model was developed on the basis of 1000 miles of highway driving. The data follow:

| Automobile | Model 1 Miles per Gallon | Model 2 Miles per Gallon |
|---|---|---|
| 1 | 20.6 | 21.3 |
| 2 | 19.9 | 17.6 |
| 3 | 18.6 | 17.4 |
| 4 | 18.9 | 18.5 |
| 5 | 18.8 | 19.7 |
| 6 | 20.2 | 21.1 |
| 7 | 21.0 | 17.3 |
| 8 | 20.5 | 18.8 |
| 9 | 19.8 | 17.8 |
| 10 | 19.8 | 16.9 |
| 11 | 19.2 | 18.0 |
| 12 | 20.5 | 20.1 |

Use the given level of significance $\alpha$ and the MWW test to see whether there is a significant difference in the populations of miles-per-gallon ratings for the two models.

## Spearman Rank Correlation Coefficient and Test

The Spearman rank correlation coefficient is a measure of the association between two variables measured as ordinal (rank) data. It is denoted by $r_s$ and calculated as

$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

where,

- $n$ = the number of items or individuals being ranked
- $x_i$ = the rank of item $i$ with respect to one variable
- $y_i$ = the rank of item $i$ with respect to the second variable
- $d_i = x_i - y_i$

### Example 5

A company wants to determine whether individuals who were expected at the time of employment to be better salespersons actually turn out to have better sales records. To investigate this question, the vice president in charge of personnel ranked the 10 individuals in terms of their potential for success, basing the assessment solely on the information available at the time of employment. Then a list was obtained of the number of units sold by each salesperson over the first two years. On the basis of actual performance, a second ranking of the 10 salespersons was carried out. The following table gives the relevant data:

| Salesperson | Ranking of Potential | Two-Year Sales (Units) | Ranking According to Two-Year Sales |
|---|---|---|---|
| A | 2 | 400 | 1 |
| B | 4 | 360 | 3 |
| C | 7 | 300 | 5 |
| D | 1 | 295 | 6 |
| E | 6 | 280 | 7 |
| F | 3 | 350 | 4 |
| G | 10 | 200 | 10 |
| H | 9 | 260 | 8 |
| I | 8 | 220 | 9 |
| J | 5 | 385 | 2 |

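a) Compute the rank correlation coefficient for the data.

b) Test whether the rank correlation coefficient is significant or not at the 5% level of significance. Make an appropriate comment.

Solution

a) We know that the Spearman rank correlation coefficient can be calculated as:

$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

Calculation Table

| Salesperson | Ranking of Potential ($x_i$) | Ranking of Sales Performance ($y_i$) | $d_i = x_i - y_i$ | $d_i^2$ |
|---|---|---|---|---|
| A | 2 | 1 | 1 | 1 |
| B | 4 | 3 | 1 | 1 |
| C | 7 | 5 | 2 | 4 |
| D | 1 | 6 | -5 | 25 |
| E | 6 | 7 | -1 | 1 |
| F | 3 | 4 | -1 | 1 |
| G | 10 | 10 | 0 | 0 |
| H | 9 | 8 | 1 | 1 |
| I | 8 | 9 | -1 | 1 |
| J | 5 | 2 | 3 | 9 |
| | | | | $\sum d_i^2 = 44$ |

Therefore, with $n = 10$,

$$r_s = 1 - \frac{6(44)}{10(10^2 - 1)} = 1 - \frac{264}{990} = 0.7333$$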


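b) We test the null hypothesis that there is no rank correlation in the population against the alternative that the rank correlation is not zero.

Test Statistic: Under the null hypothesis, the test statistic will be

$$z = \frac{r_s - \mu_{r_s}}{\sigma_{r_s}}$$

where,

$$\mu_{r_s} = 0 \quad \text{and} \quad \sigma_{r_s} = \sqrt{\frac{1}{n - 1}} = \sqrt{\frac{1}{10 - 1}} = 0.3333$$

Therefore, using the value $r_s = 0.7333$ from part a),

$$z = \frac{0.7333 - 0}{0.3333} = 2.20$$

Critical Value: The critical value of $z$ at the 5% level of significance is $z_{0.025} = 1.96$.

Decision rule: As $|z| = 2.20 > 1.96$, we may reject the null hypothesis.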
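Both parts of Example 5 can be verified numerically. The sketch below assumes the formula $r_s = 1 - 6\sum d_i^2 / (n(n^2-1))$ and the large-sample standard deviation $\sigma_{r_s} = \sqrt{1/(n-1)}$; the variable names are illustrative, and 1.96 is the two-tailed critical value for $\alpha = 0.05$.

```python
from math import sqrt

# Ranks for salespersons A to J.
potential_rank = [2, 4, 7, 1, 6, 3, 10, 9, 8, 5]
sales_rank = [1, 3, 5, 6, 7, 4, 10, 8, 9, 2]

n = len(potential_rank)
d = [x - y for x, y in zip(potential_rank, sales_rank)]   # rank differences d_i
sum_d2 = sum(di ** 2 for di in d)                         # sum of squared differences

r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))   # Spearman rank correlation coefficient
sigma_rs = sqrt(1 / (n - 1))                # standard deviation of r_s under H0
z = (r_s - 0) / sigma_rs                    # mean of r_s under H0 is 0

print(d)                                    # [1, 1, 2, -5, -1, -1, 0, 1, -1, 3]
print(sum_d2, round(r_s, 4), round(z, 2))   # 44 0.7333 2.2
print(abs(z) > 1.96)                        # True, so the correlation is significant
```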

Comment: From the hypothesis test, we may conclude that there is a significant rank correlation between sales potential and sales performance.

### Assignment 3 Problem 4

Consider the following set of rankings for a sample of 10 elements.

| Element | $x_i$ | $y_i$ |
|---|---|---|
| 1 | 10 | 8 |
| 2 | 6 | 4 |
| 3 | 7 | 10 |
| 4 | 3 | 2 |
| 5 | 4 | 5 |
| 6 | 2 | 7 |
| 7 | 8 | 6 |
| 8 | 5 | 3 |
| 9 | 1 | 1 |
| 10 | 9 | 9 |

a) Compute the rank correlation coefficient for the data.

b) Test whether the rank correlation coefficient is significant or not at the 5% level of significance. Make an appropriate comment.