
by

Norm Shklov, Ph.D.

Department of Mathematics and Statistics

University of Windsor

Windsor, Ontario, Canada

(Retired)

and

Jay C. Powell, Ph.D.

Faculty of Education

University of Windsor

(Retired)

Running Head: Frequency table data analysis

September 2001.

ABSTRACT

Having established clear evidence for a form of learning that is transformational and

discontinuous, the authors herewith present a way to consider frequency data in the

educational and social-sciences setting. This paper describes the procedure in detail and gives

several instances of its application.


The Problem

These two researchers have, for some time, been investigating non-traditional

approaches to observational data. The initial problem arose because they were trying to find

some way to examine, in multiple-choice tests, the patterns in which students' answers

(whether right or wrong) change with learning. Their presumption has been that there may be a

discontinuous component to learning in addition to the linear cumulative one commonly

assumed.

As background to this problem there are two possibilities. The first possibility is that

all wrong answers are blind guesses and that students will change from any wrong answer to

the right one with appropriate instruction. In this case the frequent current practice of

considering only right answers when observing student behavior for quality control would be

entirely appropriate. There would be no meaningful information to be found among the

wrong answers.

The second possibility, which comes from Powell's teaching experience and from

Piaget's many years of clinical observation (see: Flavell, 1963), is that answers change in

some sort of sequential order that is dependent upon how students interpret the questions

being asked. Their interpretations, in turn, could be dependent upon their current style of

thinking and their current depth of intellectual maturity.

In this case, learning would display discontinuous transformational properties

(systematic answer changes) in addition to cumulative properties (increasing numbers of

correct answers). There might be considerable meaningful information in all answers, not

merely the right ones.

To look at change within the testing context, it is necessary to give the same test more

than once. When this is done, it becomes possible to compare the answers selected upon the


first administration to those selected on the second administration and build frequency tables

that summarize these selections. There are two possibilities. First, the students may select the

same answer on both administrations. Second, they may change from one answer to another

within each item.

If the first possibility (that students guess blindly until they know the answer) is

correct, the typical observation should be changes to the right answer from any wrong one

followed by systematic repetition of the right answer over time. No other combination of

selection should be statistically significant beyond a chance level.

If the second possibility is correct, then a much more complex pattern of answer

stability and answer change should become evident as students transform their thinking with

increased maturity.


The Approach

In a four-option multiple-choice test there will be a four by four frequency matrix for

each item containing 16 cells. One direction may be used to represent the responses from the

first administration of the test and the other direction to represent the responses from the

second administration. In this case each cell will be the frequency of the event of joint

selection for each pre-post response pair. Those students who, for some reason, omitted

making a choice at either of the two administrations may be dropped from consideration because

these data will not add to the information about the dynamics of answering from within the

test.

With these frequency tables, there are four pieces of information that can be

considered firm for each cell frequency. These are:


1) The number of times the members of the sample chose both members of the pair

of responses being considered (the observed frequency),

2) The number of times this group chose the first member of the pair on the first

administration (row or column sums),

3) The number of times this group of students chose the second member of this pair

on the second administration (column or row sums), and

4) The total number of students in the entire frequency table (N).

In mathematical terms, the values being considered are the cell frequency, its

associated row and column sums, and the table sum.
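These four quantities can be read off directly once the table is in hand. A minimal Python sketch, using a hypothetical 4 x 4 pre-post table (the frequencies below are illustrative only, not drawn from the paper's data):

```python
# Hypothetical 4 x 4 pre-post frequency table (illustrative values only):
# rows index the answer chosen on the first administration, columns the
# answer chosen on the second administration.
table = [
    [5, 2, 1, 0],
    [3, 33, 4, 2],
    [1, 6, 7, 3],
    [0, 2, 5, 9],
]

i, j = 1, 1  # cell under study: option 2 repeated on both administrations

cell = table[i][j]                  # 1) observed joint frequency
row_sum = sum(table[i])             # 2) first-administration total for this option
col_sum = sum(r[j] for r in table)  # 3) second-administration total for this option
N = sum(sum(r) for r in table)      # 4) total number of students
```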

Unlike typical contingency tables, the concern here is with each cell instead of the

overall departure of the entire matrix from a chi-square or some other generalized mathematical

distribution. The problem being considered here is trying to decide how large (or how small)

an observed frequency in any cell must be in order to indicate a meaningful (statistically

significant) joint (or mutually exclusive) event.

Insert Table 1 about here

Suppose, as illustrated in Table 1.a, that we have a joint choice of 11, a first variable

frequency of 15, and a second variable frequency of 19. If the group size is 23, the expected

joint choice frequency is (15 x 19)/23 = 12.39. In this case, 11 is smaller than this expectation but may not fall far enough below it to conclude that choosing one

option implies the rejection of the other. If the group size is 90, as illustrated in Table 1.b, the

expected frequency is (15 x 19)/90 = 3.17. Are 11 observations enough larger than this

expectation to indicate a systematic joint choice?
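The expected-frequency arithmetic for both group sizes can be sketched as:

```python
def expected_joint(row_sum, col_sum, n):
    """Expected cell frequency under independence: (row sum x column sum) / N."""
    return row_sum * col_sum / n

# The two group sizes discussed in the text:
small_group = expected_joint(15, 19, 23)  # 12.39...: the observed 11 falls below it
large_group = expected_joint(15, 19, 90)  # 3.17...: the observed 11 lies well above it
```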


This problem led to an investigation of the literature. The nearest procedure that seemed to have

credibility was the proposal of Fuchs and Kennett (1980). Using a simulation of this problem

with 10,000 cases on two by two tables, the standard deviation from their proposed procedure

was consistently greater than 1.00. This observation indicates that their proposal has a

conservative bias in which frequencies that should be accepted would be rejected.

In this same simulation, several approaches for finding a way to put a probability value on one cell with known marginal sums and any total frequency were employed. Shklov and Powell (1988)

summarized the most important of these attempts, in which they compared, in simulated

conditions: the binomial, multinomial, hypergeometric, and uniform distributions. Only the

multinomial distribution gave a statistically significant fit with constrained marginal totals.

Historically, this problem arose from the observation that, on highly discriminating

test questions, particular "wrong" answers tend to cluster at specific narrow ranges of the total scores. These answers tend to cluster around their modes of selection across an entire

test. This property is well known from item response theory (IRT) but the interpretation of

these answers has been problematic.

To address this interpretation issue, two approaches are possible. First, interpretations

can be assigned to distracters based upon errors in information or logic. Powell (1970) and

Powell and Isbister (1974) attempted this approach with mixed results. These classifications

were less stable, using linear statistical procedures, than would be hoped for if a major improvement in testing technology were to be achieved.

The second possibility would be to cluster these answers statistically and then attempt

to provide interpretations from written reports of reasoning or from interviews. Powell (1968)

showed that, with adults, answers in the "wrong answer" set could be predicted from

reasoning reports about two thirds of the time. Using interviews with students from the third


through the eighth grades, Powell (1977) found a Piaget-like sequence of interpretations of these answer clusters with a consistency of more than 50% but less than 65%. This study

also showed, as would be expected because these subsets clustered around their modes of

selection, a strong age-dependent sequence in their order.

There were unanswered questions remaining from this latter study. First, how much

influence do the thought processes leading to the "wrong" answers have upon the "right" answers being observed? By predicting the total scores on another test from all of the

subscores on the test showing the Piaget-like sequence, Powell (1976) showed that the

"wrong" answer subscores were first in frequently and explained more of the

variability than the "right" answer subscores (both concrete and abstract).

Second, if there is truly a Piaget-like developmental progression among these

answers, then this learning sequence must be discontinuous and the changes of answers from one phase to the next should be appropriate, both in direction and by age level.

Giving the test twice across the age range from less than eight to more than nineteen and

using the multinomial procedure (described in detail in this present paper) to establish which

cells were statistically significant, Powell and Shklov (1992) showed that both these

conditions were met. Of additional interest is the observation that although nearly every age

level was highly statistically significant for the repeated selection of the "right" answer, this test-retest value accounted for only 23% of all the answer pairs on this test. When the sum

of the frequencies in all the significant cells was tabulated, 75 % of all the answer pairs were

explained as meaningful (P ≈ 0.945).

Discussions with other psychometricians suggested that some of the more recent

econometric techniques might serve this same purpose. Keswick (2001) undertook this

attempt. This is what he reported:


I applied a logistic regression, an ANOVA (using the categorizations - two

by two and four by four transition matrices), a Chi Square (again using the

categorization), and since the categories are ordinals I applied a tobit (which is just an

ordered probit analysis). No procedure explained more variance. Of greater

concern, none really showed a direct detectable relationship.

It would seem, therefore, that to detect discontinuities in learning that may influence

the scores we obtain for assessing student performance, some other approach than the typical

linear statistical analysis is required. One possible approach is the application of the adaptation we have developed of the multinomial procedures. It is the purpose of this paper to provide the details of this procedure, with examples from different types of qualitative research.


The Procedure

Step 1.

This procedure begins with an m x n matrix of frequencies o_ij as follows:

    o_11   o_12   ...   o_1n   R_1
    o_21   o_22   ...   o_2n   R_2
    ...
    o_m1   o_m2   ...   o_mn   R_m
    C_1    C_2    ...   C_n    N

where:

    R_i = Σ_{j=1}^{n} o_ij,        (1)

    C_j = Σ_{i=1}^{m} o_ij,        (2)


and:

    N = Σ_{i=1}^{m} Σ_{j=1}^{n} o_ij.        (3)

Step 2

The next step is to collapse this matrix around any cell o_ij. This step produces a 2 x 2 matrix, with the observed frequency of o_ij written simply as o, of the sort:

    o      f      R'_1
    g      h      R'_2
    C'_1   C'_2   N

where: f = R'_1 - o,
       R'_2 = N - R'_1,
       g = C'_1 - o,
       C'_2 = N - C'_1,
       and h = N - (o + f + g).

In words, f is the frequency of the remainder of the row, g is the frequency of the remainder of the column, and h is the residual frequency of the total table. Tables 2 and 3 give a numerical example of these first two steps in this procedure, beginning with a 4 x 4 frequency matrix such as:

Insert Table 2 about here

By choosing to look at o_22 (with a frequency of o = 33; the repeated choice of the correct answer, as shown in the box in Table 2) the collapsed matrix becomes:

Insert Table 3 about here
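The collapse in Step 2 can be sketched in Python. Since the full Table 2 frequencies are not reproduced here, the call below reuses the Table 1.a figures (o = 11, R'_1 = 15, C'_1 = 19, N = 23) as a stand-in:

```python
def collapse(o, r1, c1, n):
    """Collapse a frequency table around one cell into a 2 x 2 table.

    o  -- observed frequency of the cell under study
    r1 -- its row sum (R'1), c1 -- its column sum (C'1), n -- table total (N)
    Returns (o, f, g, h) as defined in Step 2.
    """
    f = r1 - o            # remainder of the row
    g = c1 - o            # remainder of the column
    h = n - (o + f + g)   # residual frequency of the total table
    return o, f, g, h
```

For the Table 1.a figures, collapse(11, 15, 19, 23) yields (11, 4, 8, 0): the row remainder is 4, the column remainder is 8, and the residual cell is empty.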

In order to determine the probability that o would be 33 by chance alone, it is

necessary to find two cumulative probabilities. First, it is necessary to find the cumulative


probability for the range of all possible values in that cell with the three marginal values kept

constant. Second, there is the need to find the cumulative probability for all single tables that

have a value of 33 or less.

To achieve this end, it is necessary to find the smallest possible value and the largest

possible value that o can achieve within these marginal constraints.

Since these values are all frequencies, and the least possible frequency is zero (0), the least possible value that o can possess occurs when either the frequency value of o or of h is zero (0). Similarly, the greatest possible value that cell o can contain occurs when either the frequency of f or of g is zero (0).

Mathematically:

    O(min) = 0, if o ≤ h; otherwise O(min) = o - h        (4)

    O(max) = the lesser of R'_1 and C'_1        (5)

In the present case, the minimum value is 0 (zero) and the maximum value is 46.
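Equations (4) and (5) can be sketched as a short function. The calls below reuse the Table 1.a margins (the Table 3 margins are not reproduced here) under both group sizes discussed earlier:

```python
def cell_range(o, r1, c1, n):
    """Smallest and largest values the cell can take with its margins fixed."""
    h = n - (r1 + c1 - o)           # residual cell implied by the margins
    o_min = 0 if o <= h else o - h  # equation (4)
    o_max = min(r1, c1)             # equation (5): the lesser of R'1 and C'1
    return o_min, o_max
```

With the Table 1.a margins, cell_range(11, 15, 19, 23) gives (11, 15): the table is so tight that the observed 11 is itself the minimum. With N = 90 the range opens up to (0, 15).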

Step 3

The multinomial equation is applied to the range of all possible values from the

minimum (0) to the maximum (46), adding these partial probabilities to find the total

possible probability (P_1) of events within these constrained conditions. The equation

becomes:

    P_1 = Σ_{o = O(min)}^{O(max)} [ N! / (o! f! g! h!) ]
          (R'_1 C'_1 / N^2)^o (R'_1 C'_2 / N^2)^f (R'_2 C'_1 / N^2)^g (R'_2 C'_2 / N^2)^h,        (6)

where, for each value of o in the sum, f = R'_1 - o, g = C'_1 - o, and h = N - (o + f + g).
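The Step 3 summation can be sketched in Python. The cell probabilities below are assumed to be products of the marginal proportions (R'_1 C'_1 / N^2, and so on), one natural multinomial form consistent with the constrained margins rather than necessarily the authors' exact equation; the calls reuse the Table 1.a figures:

```python
from math import factorial

def multinomial_total_and_cumulative(o_obs, r1, c1, n):
    """Sum multinomial probabilities over every value the collapsed cell can
    take with R'1, C'1, and N held fixed.  Cell probabilities are taken as
    products of marginal proportions (an assumption about the form of the
    equation).  Returns (P1 = total over the admissible range,
    cumulative probability of values <= o_obs)."""
    r2, c2 = n - r1, n - c1
    p = (r1 * c1 / n**2, r1 * c2 / n**2, r2 * c1 / n**2, r2 * c2 / n**2)
    o_min = max(0, r1 + c1 - n)  # equation (4), rewritten without reference to h
    o_max = min(r1, c1)          # equation (5)
    total = cumulative = 0.0
    for o in range(o_min, o_max + 1):
        f, g, h = r1 - o, c1 - o, n - r1 - c1 + o
        coef = factorial(n) // (factorial(o) * factorial(f) * factorial(g) * factorial(h))
        term = coef * p[0]**o * p[1]**f * p[2]**g * p[3]**h
        total += term
        if o <= o_obs:
            cumulative += term
    return total, cumulative
```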
