Nonparametric Statistical Inference, Fourth Edition

Published by Mario Balderas

https://www.scribd.com/doc/111733291/Nonparametric-Statistical-Inference-Fourth-Edition

07/01/2013

In a $k \times 2$ contingency table, the B family is simply a dichotomy with, say, success and failure as the two possible outcomes. Then it is a simple algebraic exercise to show that the test statistic for independence can be written in an equivalent form as

$$
Q = \sum_{i=1}^{k} \sum_{j=1}^{2} \frac{(X_{ij} - X_{i.}X_{.j}/N)^2}{X_{i.}X_{.j}/N} = \sum_{i=1}^{k} \frac{(Y_i - n_i\hat{p})^2}{n_i\hat{p}(1-\hat{p})} \tag{3.1}
$$

where

$$
Y_i = X_{i1}, \qquad n_i - Y_i = X_{i2}, \qquad \hat{p} = \sum_{i=1}^{k} Y_i / N
$$

If $B_1$ and $B_2$ are regarded as success and failure, and $A_1, A_2, \ldots, A_k$ are termed sample 1, sample 2, ..., and sample k, we see that the chi-square test statistic in (3.1) is the sum of the squares of k standardized binomial variables with parameter p estimated by its consistent estimator $\hat{p}$. Thus the test based on (3.1) is frequently called the test for the equality of k proportions, previously covered in Section 10.8 and illustrated here by Example 3.1.
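The algebra behind the second equality in (3.1) can be sketched row by row. The sketch uses only the definitions given with (3.1), together with the facts that $X_{i.} = n_i$, $X_{.1}/N = \hat{p}$, and $X_{.2}/N = 1 - \hat{p}$; it is one possible route, not necessarily the one the authors had in mind.

```latex
\begin{align*}
\sum_{j=1}^{2} \frac{(X_{ij} - X_{i.}X_{.j}/N)^2}{X_{i.}X_{.j}/N}
  &= \frac{(Y_i - n_i\hat{p})^2}{n_i\hat{p}}
   + \frac{\bigl[(n_i - Y_i) - n_i(1-\hat{p})\bigr]^2}{n_i(1-\hat{p})} \\
  &= (Y_i - n_i\hat{p})^2
     \left[\frac{1}{n_i\hat{p}} + \frac{1}{n_i(1-\hat{p})}\right] \\
  &= \frac{(Y_i - n_i\hat{p})^2}{n_i\hat{p}(1-\hat{p})}
\end{align*}
```

The second line uses $(n_i - Y_i) - n_i(1-\hat{p}) = -(Y_i - n_i\hat{p})$, so both cells in row i contribute the same squared deviation. Summing over $i = 1, \ldots, k$ gives the right-hand side of (3.1).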

**Table 2.2 Expected frequencies** (observed counts, with expected frequencies in parentheses)

| Alcohol \ Nicotine | 0 | 1–15 | 16 or more | Total |
|---|---|---|---|---|
| 0 | 105 (82.7) | 7 (17.7) | 11 (22.6) | 123 |
| 0.01–0.10 | 58 (51.1) | 5 (10.9) | 13 (14.0) | 76 |
| 0.11–0.99 | 84 (109.6) | 37 (23.4) | 42 (30.0) | 163 |
| 1.00 or more | 57 (60.5) | 16 (12.9) | 17 (16.5) | 90 |
| Total | 304 | 65 | 83 | 452 |

**Example 3.1** A marketing research firm has conducted a survey of businesses of different sizes. Questionnaires were sent to 200 randomly selected businesses of each of three sizes. The data on responses are summarized below. Is there a significant difference in the proportion of nonresponses by small, medium, and large businesses?

| Business size | Small | Medium | Large |
|---|---|---|---|
| Response | 125 | 81 | 40 |

Solution The frequencies of nonresponses are 75, 119, and 160. The best estimate of the common probability of nonresponse is $(75 + 119 + 160)/600 = 0.59$ for each size business. The value of Q from (3.1) is 74.70 with 2 degrees of freedom. From Table B we find $P < 0.001$, and we conclude that the proportions of nonresponse are not the same for the three sizes of businesses.
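The arithmetic in this solution can be checked with a short Python sketch. The function name `equal_proportions_q` is an illustrative choice, not from the text; the data are those of Example 3.1, and the statistic is the right-hand form of (3.1).

```python
# Example 3.1: 200 questionnaires were sent to each business size (from the text).
responses = {"Small": 125, "Medium": 81, "Large": 40}
n_per_group = 200


def equal_proportions_q(successes, sizes):
    """Q from (3.1): sum of squared standardized binomial counts,
    with the common proportion p estimated by the pooled p-hat."""
    p_hat = sum(successes) / sum(sizes)
    return sum(
        (y - n * p_hat) ** 2 / (n * p_hat * (1 - p_hat))
        for y, n in zip(successes, sizes)
    )


# "Success" here is a nonresponse, as in the solution above.
nonresponses = [n_per_group - r for r in responses.values()]
sizes = [n_per_group] * len(nonresponses)
p_hat = sum(nonresponses) / sum(sizes)
q = equal_proportions_q(nonresponses, sizes)

print(nonresponses)      # counts of nonresponses per size
print(round(p_hat, 2))   # pooled estimate of the nonresponse probability
print(round(q, 2))       # Q, to be compared with chi-square on k - 1 = 2 df
```

Because Q in (3.1) is symmetric in success and failure, passing the response counts instead of the nonresponse counts gives the same value.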

If $k = 2$, the expression in (3.1) can be written as

$$
Q = \frac{(Y_1/n_1 - Y_2/n_2)^2}{\hat{p}(1-\hat{p})(1/n_1 + 1/n_2)} \tag{3.2}
$$

Now the chi-square test statistic in (3.2) is the square of the difference between two sample proportions divided by the estimated variance of their difference. In other words, Q is the square of the classical standard normal theory test statistic used for the hypothesis that two population proportions are equal.

Substituting the original $X_{ij}$ notation in (3.2), a little algebraic manipulation gives another equivalent form for Q as

$$
Q = \frac{N(X_{11}X_{22} - X_{12}X_{21})^2}{X_{.1}X_{.2}X_{1.}X_{2.}} \tag{3.3}
$$
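As a numerical check that the three forms agree for a 2 × 2 table, the following Python sketch evaluates the double sum in (3.1), the two-proportion form (3.2), and the cross-product form (3.3) on the same counts. The table values and the function names are hypothetical, chosen only for illustration.

```python
import math


def q_general(table):
    """Q as the double sum in (3.1): (observed - expected)^2 / expected."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    return sum(
        (table[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
        for i in range(len(row))
        for j in range(len(col))
    )


def q_two_proportions(table):
    """Q as in (3.2): squared difference of two sample proportions
    divided by the estimated variance of the difference."""
    (y1, f1), (y2, f2) = table
    n1, n2 = y1 + f1, y2 + f2
    p_hat = (y1 + y2) / (n1 + n2)
    return (y1 / n1 - y2 / n2) ** 2 / (p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))


def q_cross_product(table):
    """Q as in (3.3): N (X11 X22 - X12 X21)^2 over the product of the margins."""
    (x11, x12), (x21, x22) = table
    n = x11 + x12 + x21 + x22
    margins = (x11 + x21) * (x12 + x22) * (x11 + x12) * (x21 + x22)
    return n * (x11 * x22 - x12 * x21) ** 2 / margins


# Hypothetical 2 x 2 counts (rows = samples, columns = success / failure).
example = [[30, 20], [10, 40]]
values = [f(example) for f in (q_general, q_two_proportions, q_cross_product)]
print(values)
print(all(math.isclose(v, values[0]) for v in values))
```

Form (3.3) is usually the most convenient for hand calculation, since it needs only the four cell counts and the four marginal totals.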

This expression is related to the sample Kendall tau coefficient of Chapter 11. Suppose that the two families A and B are factors or qualities, both dichotomized into categories which can be called presence and absence of the factor, or possessing and not possessing the quality. Suppose further that we have a single sample of size N, and that we make two observations on each element in the sample, one for each of the two factors. We record the observations using the code 1 for presence and 2 for absence. The observations then consist of N sets of pairs, for which the Kendall tau coefficient T of Chapter 11 can be determined as a measure of association between the factors. The numerator of T is the number of sets of pairs of observations, say $(a_i, b_i)$ and $(a_j, b_j)$, whose differences $a_i - a_j$ and $b_i - b_j$ have the same sign, minus the number whose differences have opposite signs, counting only sets for which both differences are not zero. The differences here are both positive or both negative only for a set (1,1) and (2,2), and are of opposite signs for a set (1,2) and (2,1). If $X_{ij}$ denotes the number of observations where factor A was recorded as i and factor B was recorded as j for $i, j = 1, 2$, the number of differences with the same sign is the product $X_{11}X_{22}$, the number of pairs which agreed in the sense that both factors were present or both were absent. The number of differences with opposite signs is $X_{12}X_{21}$, the number of pairs which disagreed. Since there are so many ties, it seems most appropriate to use the definition of T modified for ties, given in (11.2.37) and called tau b. Then the denominator of T is the square root of the product of the numbers of pairs with no ties for each factor, that is, the square root of $X_{1.}X_{2.}X_{.1}X_{.2}$. Therefore the tau coefficient is

$$
T = \frac{X_{11}X_{22} - X_{12}X_{21}}{(X_{.1}X_{.2}X_{1.}X_{2.})^{1/2}} = \left(\frac{Q}{N}\right)^{1/2} \tag{3.4}
$$

and $Q/N$ estimates $\tau^2$, the parameter of association between factors A and B. For this type of data, the Kendall measure of association is sometimes called the phi coefficient, as defined in (2.6).
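The relation between tau b and Q/N can be illustrated with a brute-force Python sketch. It expands a 2 × 2 table into N coded pairs (1 for presence, 2 for absence), counts same-sign and opposite-sign differences over all sets of two pairs, and compares the resulting T with the square root of Q/N, where Q is the chi-square statistic written in terms of the cell counts and marginal totals. The table counts and function names are hypothetical.

```python
import math
from itertools import combinations


def tau_b_from_table(table):
    """Kendall's tau b computed directly from the N coded pairs.
    Numerator: same-sign minus opposite-sign differences.
    Denominator: sqrt of (pairs untied on A) * (pairs untied on B)."""
    obs = [
        (i, j)
        for i, row in enumerate(table, start=1)
        for j, count in enumerate(row, start=1)
        for _ in range(count)
    ]
    same = opposite = untied_a = untied_b = 0
    for (a1, b1), (a2, b2) in combinations(obs, 2):
        da, db = a1 - a2, b1 - b2
        untied_a += da != 0
        untied_b += db != 0
        if da * db > 0:
            same += 1
        elif da * db < 0:
            opposite += 1
    return (same - opposite) / math.sqrt(untied_a * untied_b)


def q_cross_product(table):
    """Chi-square statistic for a 2 x 2 table from cells and margins."""
    (x11, x12), (x21, x22) = table
    n = x11 + x12 + x21 + x22
    margins = (x11 + x21) * (x12 + x22) * (x11 + x12) * (x21 + x22)
    return n * (x11 * x22 - x12 * x21) ** 2 / margins


# Hypothetical counts: rows = factor A (1, 2), columns = factor B (1, 2).
table = [[30, 20], [10, 40]]
n = sum(map(sum, table))
t = tau_b_from_table(table)
phi = math.sqrt(q_cross_product(table) / n)
print(t, phi)
```

In this example the agreeing pairs outnumber the disagreeing ones, so T is positive and equals the square root of Q/N. When $X_{11}X_{22} < X_{12}X_{21}$ the directly computed T is negative, and it is then its absolute value that matches the square root.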

**Example 3.2** The researchers in the study reported in Example 2.1 really might have been more interested in a one-sided alternative of positive dependence between the variables alcohol and nicotine. Since the data are measurements of level of consumption, we could regard them as 452 pairs of measurements with many ties. For example, the 37 mothers in cell (3,2) of Table 2.1 represent the pair of measurements $(A_{\mathrm{III}}, B_{\mathrm{II}})$, where $A_{\mathrm{III}}$ indicates alcohol consumption in the 0.11–0.99 range and $B_{\mathrm{II}}$ represents nicotine consumption at level 1–15. For these kinds of data we can then calculate Kendall's tau for the 452 pairs. The number of concordant pairs C and the number of discordant pairs Q are calculated as shown in Table 2.3. Because the ties are quite extensive, we need to incorporate the correction for ties in the calculation of T from (11.2.38). Then we use the normal approximation to the distribution of T in (11.2.30) to calculate the right-tailed P value for this one-sided alternative.

**Table 2.3 Calculations for C and Q**

| C | Q |
|---|---|
| 105(5 + 13 + 37 + 42 + 16 + 17) = 13,650 | 7(58 + 84 + 57) = 1,393 |
| 7(13 + 42 + 17) = 504 | 11(58 + 84 + 57 + 5 + 37 + 16) = 2,827 |
| 58(37 + 42 + 16 + 17) = 6,496 | 5(84 + 57) = 705 |
| 5(42 + 17) = 295 | 13(84 + 57 + 37 + 16) = 2,522 |
| 84(16 + 17) = 2,772 | 37(57) = 2,109 |
| 37(17) = 629 | 42(57 + 16) = 3,066 |
| Total: 24,346 | Total: 12,622 |

$$
T = \frac{24{,}346 - 12{,}622}{\sqrt{\left[\binom{452}{2} - \binom{304}{2} - \binom{65}{2} - \binom{83}{2}\right]\left[\binom{452}{2} - \binom{123}{2} - \binom{76}{2} - \binom{163}{2} - \binom{90}{2}\right]}} = 0.1915
$$

$$
Z = \frac{3(0.1915)\sqrt{452(451)}}{\sqrt{2(904 + 5)}} = 6.08
$$

We find P = 0.000 from Table A of the Appendix.
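The calculations of Example 3.2 can be reproduced with the Python sketch below. It takes the observed counts from Table 2.2 (rows are alcohol levels, columns are nicotine levels), counts concordant and discordant pairs cell by cell, applies the tie-corrected denominator, and then forms Z as in the expression above. The function and variable names are illustrative; the discordant count is stored in `d` rather than Q to avoid confusion with the chi-square statistic. The tail probability uses the standard normal survival function written with `math.erfc`, which stands in for the table lookup.

```python
import math

# Observed counts: rows = alcohol levels, columns = nicotine levels.
table = [
    [105, 7, 11],
    [58, 5, 13],
    [84, 37, 42],
    [57, 16, 17],
]


def concordant_discordant(t):
    """For each cell, pair it with cells in later rows:
    larger column -> concordant, smaller column -> discordant."""
    c = d = 0
    for i, row in enumerate(t):
        for j, count in enumerate(row):
            for later in t[i + 1:]:
                c += count * sum(later[j + 1:])
                d += count * sum(later[:j])
    return c, d


def pairs(n):
    """Number of unordered pairs among n items."""
    return n * (n - 1) // 2


c, d = concordant_discordant(table)
n = sum(map(sum, table))
row_totals = [sum(r) for r in table]
col_totals = [sum(col) for col in zip(*table)]

# Pairs not tied on nicotine (columns) and not tied on alcohol (rows).
untied_cols = pairs(n) - sum(pairs(x) for x in col_totals)
untied_rows = pairs(n) - sum(pairs(x) for x in row_totals)

t_stat = (c - d) / math.sqrt(untied_cols * untied_rows)
z = 3 * t_stat * math.sqrt(n * (n - 1)) / math.sqrt(2 * (2 * n + 5))
p_right = 0.5 * math.erfc(z / math.sqrt(2))  # right-tail normal probability

print(c, d)
print(round(t_stat, 4), round(z, 2))
print(p_right)
```

The sketch carries T at full precision into Z, whereas the text rounds T to 0.1915 first; the two routes agree to the two decimal places reported.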

There is also a relationship between the value of the chi-square statistic in a $2 \times 2$ contingency table and Kendall's partial tau coefficient. If we compare the expression for $T_{XY.Z}$ in (12.6.1) with the expression for Q in (3.3), we see that

$$
T_{XY.Z} = \sqrt{Q/N} \qquad \text{for } N = \binom{m}{2}
$$

A test for the significance of $T_{XY.Z}$ cannot be carried out by using Q, however. The contingency table entries in Table 6.1 of Chapter 12 are not independent even if X and Y are independent for fixed Z, since all categories involve pairings with the Z sample.
