
Uploaded by Wiween Mihad

Definitions

Parametric Tests

Statistical tests that involve assumptions about, or estimations of, population parameters (what we've been learning).

Nonparametric Tests

Also known as distribution-free tests. Statistical tests that do not rely on assumptions about distributions or on parameter estimates (what we're going to be learning).

More Definitions

The chi-square (χ²) test is a nonparametric test that is used to test hypotheses about distributions of frequencies across categories of data. This is different from what we've been learning:

Then: averages, scales
Now: frequencies, categories

The χ² goodness-of-fit test.

Used when we have distributions of frequencies across two or more categories on one variable. The test determines how well a hypothesized distribution fits an obtained distribution.

The χ² test of independence.

Used when we compare the distribution of frequencies across categories in two or more independent samples. Also used in a single sample when we want to know whether two categorical variables are related.

In my backyard, I have a new hybrid rose bush. I hypothesize (according to Mendelian genetic theory) that I should have 50% pink flowers, 25% white flowers, and 25% red flowers.

[Figure: Punnett square for a Pp x Pp cross, with offspring PP, Pp, Pp, and pp. Title: Flowers]

I grow 120 of these plants from seed. The resulting colors of flowers are as follows:

            Pink   White   Red
Observed     75     20     25

Recall, my expectations were 50% pink, 25% white, 25% red. So, if I planted 120 seeds, I'd expect this set of colored flowers:

            Pink   White   Red
Observed     75     20     25
Expected     60     30     30

If my hypothesis is true (50%, 25%, 25%), how likely is it that I could get this difference between my actual distribution and my expected distribution of colored flowers?


The χ² test is used to determine whether this probability is less than α, in which case the hypothesis is rejected, or greater than α, in which case the hypothesis is not rejected.

Hypotheses

H0: P(pink, white, red) = .5, .25, .25

The population proportions of pink, white, and red flowers are .5, .25, and .25, respectively.

H1: P(pink, white, red) ≠ .5, .25, .25

The population proportions of pink, white, and red flowers are something other than .5, .25, and .25, respectively.

The categories are mutually exclusive and exhaustive (ΣP = 1).

Notice that the hypotheses for the chi-square goodness-of-fit test are stated in terms of proportions. The chi-square test itself is conducted on actual frequencies, not proportions. Specifically, the χ² test operates on differences between observed and expected frequencies. First, make sure everything is a frequency.

E = N x Expected Proportion
E = N x P(cell)

            Pink            White            Red
Observed    75              20               25
Expected    120(.5) = 60    120(.25) = 30    120(.25) = 30

We calculate (O - E)²/E in each cell, sum all of the (O - E)²/E values over all cells, and compare this summed value to a critical value.

$\chi^2_o = \sum \frac{(O - E)^2}{E}$
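As a minimal sketch of the first step, E = N x P(cell), the snippet below converts the hypothesized proportions into expected frequencies for the 120 plants. The function name `expected_frequencies` and the dictionary layout are illustrative choices, not part of the slides.

```python
def expected_frequencies(n, proportions):
    """Return expected counts E = N * P(cell) for each category."""
    total = sum(proportions.values())
    if abs(total - 1.0) > 1e-9:
        # Categories must be mutually exclusive and exhaustive.
        raise ValueError("hypothesized proportions must sum to 1")
    return {category: n * p for category, p in proportions.items()}


# Hypothesized proportions from H0 and the 120 plants grown from seed.
h0_proportions = {"pink": 0.50, "white": 0.25, "red": 0.25}
expected = expected_frequencies(120, h0_proportions)
print(expected)
```

The check on the sum of proportions mirrors the requirement that the categories be exhaustive.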

Statisticians have found that if H0 is true and we calculate the χ² statistic for all possible samples of size N, the values form a probability distribution called the χ² distribution.

A family of distributions varying in df (like the t distribution). Positively skewed; the amount of skew decreases as df increases. Minimum value = 0 (χ² can't be negative). Average (typical) value increases (the entire distribution shifts to the right) as df increases.
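These properties can be checked with a small simulation. The sketch below relies on a standard fact that the slides do not state: a χ² variable with df degrees of freedom is distributed as the sum of df squared independent standard normal values. The sample size, seed, and helper names are arbitrary choices for illustration.

```python
import random


def simulate_chi_square(df, n_samples, rng):
    """Draw values as sums of df squared standard normal deviates."""
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
            for _ in range(n_samples)]


def mean(values):
    return sum(values) / len(values)


def skewness(values):
    """Sample skewness: third central moment over the cubed standard deviation."""
    m = mean(values)
    var = mean([(v - m) ** 2 for v in values])
    third = mean([(v - m) ** 3 for v in values])
    return third / var ** 1.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable
summary = {}
for df in (1, 3, 5, 10):
    draws = simulate_chi_square(df, 20000, rng)
    # (minimum, mean, skewness) for this df
    summary[df] = (min(draws), mean(draws), skewness(draws))
    print(df, [round(x, 2) for x in summary[df]])
```

In each row the minimum should be non-negative, the mean should sit near df, and the skewness should shrink as df grows.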

[Figure: χ² distributions for df = 1, df = 3, df = 5, and df = 10]

$\chi^2_o = \sum \frac{(O - E)^2}{E}$

As differences between Os and Es get bigger, χ² gets bigger. Since we are only interested in rejecting H0 if the differences between the obtained frequencies and the expected frequencies are greater than expected by chance, the rejection region is in the upper tail.

Finding χ²c

Table E.1 has the tabled values. What is df?

df = k - 1, where k is the number of categories. Why?

If you have 3 categories, only the counts in 2 of them are free to vary; the third is fixed by the total N.

Critical values of χ² by df and upper-tail probability:

df     0.050     0.025     0.010     0.005
 1     3.841     5.024     6.635     7.879
 2     5.991     7.378     9.210    10.597
 3     7.815     9.348    11.345    12.838
 4     9.488    11.143    13.277    14.860
 5    11.070    12.832    15.086    16.750
 6    12.592    14.449    16.812    18.548
 7    14.067    16.013    18.475    20.278
 8    15.507    17.535    20.090    21.955
 9    16.919    19.023    21.666    23.589
10    18.307    20.483    23.209    25.188
11    19.675    21.920    24.725    26.757
12    21.026    23.337    26.217    28.300
13    22.362    24.736    27.688    29.819
14    23.685    26.119    29.141    31.319
15    24.996    27.488    30.578    32.801
16    26.296    28.845    32.000    34.267
17    27.587    30.191    33.409    35.718
18    28.869    31.526    34.805    37.156
19    30.144    32.852    36.191    38.582
20    31.410    34.170    37.566    39.997
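One row of the table can be verified by hand. For df = 2 the upper-tail probability of the χ² distribution has the closed form exp(-x/2), a standard result not given in the slides, so the critical value is -2 ln(α). The sketch below compares that formula with the tabled df = 2 values; `critical_value_df2` is a hypothetical helper name.

```python
import math


def critical_value_df2(alpha):
    """Upper-tail critical value for chi-square with 2 df: solve exp(-x/2) = alpha."""
    return -2.0 * math.log(alpha)


# Tabled critical values for df = 2, keyed by upper-tail probability.
tabled_df2 = {0.050: 5.991, 0.025: 7.378, 0.010: 9.210, 0.005: 10.597}
for alpha, tabled in tabled_df2.items():
    print(alpha, tabled, round(critical_value_df2(alpha), 3))
```

Other df values have no such simple formula, which is why a table (or statistical software) is normally used.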

1) State H0 and H1.
2) Choose α.
3) Relevant probability distribution is χ² with k - 1 df.
4) Find χ²c and state the decision rule: I will reject H0 if χ²o > χ²c.
5) Calculate χ²o.
6) Apply the decision rule.

Calculating χ²o

                                     Pink    White    Red     Sum
Observed                              75      20       25     120
Expected                              60      30       30     120
Observed - Expected                   15     -10       -5       0
(Observed - Expected)²               225     100       25
(Observed - Expected)² / Expected    3.75    3.33     .83    7.91

Σ(O - E) = 0, always.

Components of χ²: the values in the last row are summed to give χ²o = 3.75 + 3.33 + .83 = 7.91.

Interpretation

Since χ²o = 7.91 exceeds the tabled critical value for df = 2 (5.991 at α = .05), we reject H0: the geneticist's hypothesis does not fit the data. The population distribution across the three categories is probably different from .50 pink, .25 white, .25 red.
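The whole flower example can be reproduced in a few lines. This is a sketch under stated assumptions: the critical value 5.991 is the tabled entry for df = 2 at α = .05 (the slides do not name the α used for this example), and `chi_square_statistic` is an illustrative name. Because the code does not round the individual components, it gives 7.92 where adding the rounded components gives 7.91.

```python
def chi_square_statistic(observed, expected):
    """Return the per-cell components (O - E)^2 / E and their sum."""
    components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return components, sum(components)


observed = [75, 20, 25]             # pink, white, red
proportions = [0.50, 0.25, 0.25]    # hypothesized under H0
n = sum(observed)
expected = [n * p for p in proportions]

components, chi2_o = chi_square_statistic(observed, expected)
df = len(observed) - 1              # k - 1 categories free to vary
chi2_c = 5.991                      # tabled value for df = 2, alpha = .05 (alpha assumed)
reject_h0 = chi2_o > chi2_c         # decision rule: reject H0 if chi2_o > chi2_c
print([round(c, 2) for c in components], round(chi2_o, 2), df, reject_h0)
```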

The manufacturer of Post's Raisin Bran cereal, which lags behind Kellogg's in sales, believes that, given the chance to try both, most consumers will prefer Post's. They devise a blind taste test. A sample of 100 people each eat a small bowl of each cereal, without knowing which is which, and are asked which cereal they like better. Fifty-seven people say they like Post's better, while 43 choose Kellogg's. Can the manufacturer advertise, "More people prefer Post's"?

H0: P(Post's) = P(Kellogg's), or P(Post's, Kellogg's) = .5, .5
H1: P(Post's) ≠ P(Kellogg's), or P(Post's, Kellogg's) ≠ .5, .5

Breakfast Answers

2) Use α = .05
3) df = 1; χ² distribution with 1 df
4) χ²c for α = .05, df = 1, is 3.84. Decision rule: reject H0 if χ²o > 3.84
5) Calculations: E(Post's) = E(Kellogg's) = 100(.5) = 50

Cereal        O      E     O-E    (O-E)²    (O-E)²/E
Post's        57     50      7      49        0.98
Kellogg's     43     50     -7      49        0.98
Sum          100    100      0                1.96 = χ²o

6) Since χ²o < χ²c (1.96 < 3.84), we retain H0. The manufacturer cannot claim that more people prefer Post's.

In the Breakfast Example, we found that a 57 to 43 majority isn't enough to reject H0. What is the smallest number of Post's preferences that will lead to a significant finding (rejection of H0) at α = .05? Correct and well-reasoned answers are worth 5 pts on top of your final (total) grade.
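For the breakfast data, the observed statistic can also be turned into a probability without a table. The sketch below uses a standard identity that is not in the slides: with 1 df, χ² is the square of a standard normal value, so the upper-tail probability is erfc(sqrt(χ²/2)). The helper name `p_value_df1` is illustrative.

```python
import math


def p_value_df1(chi2):
    """Upper-tail probability for chi-square with 1 df: P(Z^2 > chi2)."""
    return math.erfc(math.sqrt(chi2 / 2.0))


observed = {"Post's": 57, "Kellogg's": 43}
n = sum(observed.values())
expected = n * 0.5                  # H0: both proportions are .5
chi2_o = sum((o - expected) ** 2 / expected for o in observed.values())
p = p_value_df1(chi2_o)
print(round(chi2_o, 2), round(p, 3))
```

A probability larger than α = .05 leads to the same decision as comparing χ²o with the critical value of 3.84.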

Assumptions of the χ² test: Observations in different categories are independent. Categories are mutually exclusive. Categories are exhaustive. No expected frequency is less than 2. Few expected frequencies are less than 5.

The χ² distribution does not accurately describe the probabilities of various sampling outcomes if expected frequencies are small.
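A check on the expected-frequency conditions might look like the sketch below. The slides say only that "few" expected frequencies may fall below 5; reading "few" as at most 20% of cells is an assumption made here for illustration, exposed as the hypothetical parameter `max_share_below_5`. The second set of expected frequencies is invented to show a failing case.

```python
def check_expected_frequencies(expected, max_share_below_5=0.20):
    """Flag expected frequencies too small for the chi-square approximation.

    max_share_below_5 is an assumed reading of "few" (at most 20% of cells).
    """
    below_2 = [e for e in expected if e < 2]
    below_5 = [e for e in expected if e < 5]
    share_below_5 = len(below_5) / len(expected)
    return {
        "none_below_2": not below_2,
        "few_below_5": share_below_5 <= max_share_below_5,
    }


flower_check = check_expected_frequencies([60, 30, 30])       # flower example
sparse_check = check_expected_frequencies([1.5, 4.0, 94.5])   # made-up sparse case
print(flower_check)
print(sparse_check)
```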

The χ² test of independence is used when:

We want to compare the distribution of frequencies across categories in two or more independent samples.
We want to determine whether the paired observations obtained on two or more categorical variables are independent or associated.

A developmental psychologist hypothesizes that mothers who have physical contact with their infants immediately after birth are more likely to hold them on the left side, where the sound of the mother's heartbeat is more pronounced, than mothers who do not have such early contact with their infants. She observes 125 early-contact mothers and 105 late-contact mothers with the following results:

         Left   Right   Total
Early     80     45      125
Late      55     50      105
                         230

Is there a significant difference?

Same test.

Contingency Tables

         Left   Right
Early     80     45
Late      55     50

Stating H0 & H1

When two or more groups are being compared, H0 states that the population distributions across all categories are the same. H1 states that the population distributions differ.

We are trying to determine if the frequencies in one variable are contingent upon the frequencies of the other variable.

The hypotheses can be stated in several equivalent ways:

H0: Early and late contact mothers do not differ in how they hold their neonates.
H1: Early and late contact mothers hold their neonates differently.

OR

H0: Group membership and distribution across categories are unrelated.
H1: Group membership and distribution across categories are related.

OR

H0: Time of first contact and how neonates are held are not related.
H1: Time of first contact and how neonates are held are related.

OR

H0: Time of first contact and how neonates are held are independent.
H1: Time of first contact and how neonates are held are dependent / correlated / related.

Test statistic for the test of independence is the same as in the goodness-of-fit test:

$\chi^2_o = \sum \frac{(O - E)^2}{E}$

Two differences: the calculation of expected frequencies and the calculation of df.

Observed Frequencies

               Left   Right   Row Sums
Early           80     45       125
Late            55     50       105
Column Sums    135     95     N = 230

For each cell,

$E = \frac{(\text{row sum})(\text{column sum})}{N}$

where N = total number of observations.

This follows from expected frequencies = N x P. With P(row) = row sum / N and P(column) = column sum / N, independence under H0 gives P(row AND column) = P(row) x P(column), so E(row AND column) = N x P(row) x P(column).

E(early, left) = (125)(135) / 230 = 73.4
E(early, right) = (125)(95) / 230 = 51.6
E(late, left) = (105)(135) / 230 = 61.6
E(late, right) = (105)(95) / 230 = 43.4

Expected Frequencies

               Left   Right   Row Sums
Early          73.4   51.6      125
Late           61.6   43.4      105
Column Sums    135     95     N = 230
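The expected frequencies for the contingency table can be computed directly from the marginal sums. The sketch below applies (row sum x column sum) / N to the observed table of early- and late-contact mothers; `expected_table` is an illustrative name and the list-of-rows layout is a convenience, not something the slides prescribe.

```python
def expected_table(observed):
    """Expected counts under independence: (row sum * column sum) / N."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    n = sum(row_sums)
    return [[r * c / n for c in col_sums] for r in row_sums]


observed = [[80, 45],   # early contact: left, right
            [55, 50]]   # late contact:  left, right
expected = expected_table(observed)
rounded = [[round(e, 1) for e in row] for row in expected]
for row in rounded:
    print(row)
```

The expected table keeps the same row and column sums as the observed table, which is a quick sanity check on the arithmetic.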

χ² Analyses

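df in χ² Independence Tests

df = (# rows - 1)(# columns - 1). Why?

Remember, marginals are fixed in χ² independence tests. In a 2 x 2 table with fixed row and column sums, once one cell frequency is known, the other three are determined, so only one cell is free to vary.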
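The role of the fixed marginals can be made concrete with the mothers' table. In the sketch below, one cell plus the row and column sums is enough to rebuild the full 2 x 2 table, which is why df = 1. The function names `fill_2x2` and `df_independence` are hypothetical.

```python
def fill_2x2(top_left, row_sums, col_sums):
    """Given one cell and fixed marginals, the other three cells follow."""
    top_right = row_sums[0] - top_left
    bottom_left = col_sums[0] - top_left
    bottom_right = row_sums[1] - bottom_left
    return [[top_left, top_right], [bottom_left, bottom_right]]


def df_independence(n_rows, n_cols):
    """Degrees of freedom for a chi-square test of independence."""
    return (n_rows - 1) * (n_cols - 1)


# Marginals from the early/late contact example: rows 125 and 105, columns 135 and 95.
table = fill_2x2(80, row_sums=(125, 105), col_sums=(135, 95))
print(table, df_independence(2, 2))
```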

1) H0: Time of first contact and how neonates are held are independent. H1: Time of first contact and how neonates are held are dependent / correlated / related.
2) α = .05
3) χ² distribution with 1 df
4) χ²c for α = .05, 1 df = 3.84; decision rule: reject H0 if χ²o > 3.84

Observed Frequencies

         Left   Right
Early     80     45
Late      55     50

Expected Frequencies

         Left   Right
Early    73.4   51.6
Late     61.6   43.4

Observed - Expected

         Left   Right
Early     6.6   -6.6
Late     -6.6    6.6

(Observed - Expected)²

         Left    Right
Early    43.56   43.56
Late     43.56   43.56

(Observed - Expected)² / Expected

         Left   Right
Early     .59    .84
Late      .71   1.00

$\chi^2_o = \sum \frac{(O - E)^2}{E}$ = .59 + .84 + .71 + 1.00 = 3.14
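The calculation for the test of independence can be sketched end to end as follows. The function name `chi_square_independence` is illustrative, and 3.84 is the critical value for α = .05 with 1 df. Because the code does not round the expected frequencies to one decimal place, it gives about 3.18 rather than the 3.14 obtained from the rounded values; the decision is the same.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, df) for an r x c table of counts."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    n = sum(row_sums)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_sums[i] * col_sums[j] / n   # expected count under H0
            chi2 += (o - e) ** 2 / e
    df = (len(row_sums) - 1) * (len(col_sums) - 1)
    return chi2, df


# Rows: early, late contact. Columns: left, right.
chi2_o, df = chi_square_independence([[80, 45], [55, 50]])
chi2_c = 3.84                       # critical value for alpha = .05, df = 1
reject_h0 = chi2_o > chi2_c
print(round(chi2_o, 2), df, reject_h0)
```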

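χ²o = 3.14

Since χ²o < χ²c (3.14 < 3.84), we retain H0. These data do not show that time of first contact and how neonates are held are related.

Overview of χ²

χ² is a nonparametric test applied to categorical, frequency data. The relevant probability distribution is the χ² distribution.

It is a family of distributions varying in df. It is positively skewed with minimum = 0. Skew decreases as df increases. The center of the distribution and the critical values increase as df increases.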

The rejection region is in the upper tail. Decision rule: reject H0 if χ²o > χ²c. Two forms:

Goodness-of-fit

used to determine whether an obtained distribution fits a hypothetical one.

Independence

used to test whether two categorical variables are related, or whether two or more independent samples differ in their distribution of frequencies across categories.
