
Critique of Test and Suggestions for Improvement

This test was administered at the end of the Term 1, Form 4 examinations in 2020. It was a combination of a teacher-made test and CSEC POA past-paper questions. Upon reflection, the test had face validity: to me as the teacher, it appeared to cover all topics taught using the scheme of work for the term. Content validity, however, requires testing all relevant areas of a topic, and the questions asked of the students did not reflect all areas. For instance, in question 1(a) students were asked to identify one internal user of accounting information, so the question did not address the full range of internal and external users of accounting information. In question 1(b), by contrast, all business types were addressed, making it more valid than question 1(a). Content validity was therefore insufficient for some questions. I would need to ensure that all teacher-made questions conform to all areas of the topic taught, as gaps cheat students of a chance to demonstrate their cognitive knowledge in the area.

For further scrutiny, the item discrimination index was calculated. This index measures an item's ability to discriminate between the high and low scorers on a test.

Total number of students who took the test: 30
Number who got high scores on the test: 10
Number who got low scores on the test: 20
Total number of students who got the item correct: 15
Number who got the item correct and got high scores: 2
Number who got the item correct and got low scores: 13
Discrimination index = difficulty index for high-scoring group − difficulty index for low-scoring group

Discrimination index = 2/10 − 13/20

Discrimination index = 0.20 − 0.65 = −0.45
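The calculation above can be sketched in Python; the function name and parameter names are my own for illustration, and the figures are those recorded for this test:

```python
def discrimination_index(correct_high, n_high, correct_low, n_low):
    """Difference between the item's difficulty index (proportion correct)
    in the high-scoring group and in the low-scoring group."""
    p_high = correct_high / n_high   # difficulty index, high-scoring group
    p_low = correct_low / n_low      # difficulty index, low-scoring group
    return p_high - p_low

# Figures from the test: 2 of the 10 high scorers and
# 13 of the 20 low scorers answered the item correctly.
index = discrimination_index(correct_high=2, n_high=10, correct_low=13, n_low=20)
print(round(index, 2))  # -0.45
```

A positive index means the item favours the stronger students, as intended; the negative value here shows the reverse pattern.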

The calculated discrimination index of −0.45 shows the item was not valid: a negative index means proportionally more low scorers than high scorers answered it correctly. Students' low scores on this test certainly reflect factors arising from the COVID-19 pandemic that were not taken into consideration, such as internet connectivity issues that distracted students and power outages that prevented some students from completing all questions. In addition, most questions were geared towards lower-order rather than higher-order skills; the test was biased towards recollection of theory, and only question 3 involved actual bookkeeping. Furthermore, the table of specifications was prepared after the test, which affected students' scores: not all areas were tested, cheating those students who knew parts of the content that were never examined. A suggestion is that the table of specifications be prepared before the exam to ensure all areas, objectives and competencies are tested. The teacher also needs to ensure that self-made questions reflect the varying levels of skill within the cognitive domain, both lower and higher order.

For reliability, the same concepts were tested during the term and again at the end of term, more than two weeks later, but the correlation of marks was different than expected. The test-retest check therefore shows a discrepancy in marks. A factor that may have contributed to this is the time span between the two sittings. As such, test-retest checks on topics should be done on a continuous basis, not only when the topic is taught and again at the end-of-term test; this can reduce the discrepancies in scores. I also suggest that, instead of one three-hour exam as under normal face-to-face conditions, I change to two online examinations of 1 hour 30 minutes each, or make the exam even shorter.
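Test-retest reliability is usually quantified as the Pearson correlation between the two sets of marks. A minimal sketch follows; the marks are hypothetical figures of my own, not the actual class data:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two sets of marks (test and retest)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical marks for five students on the first test and the retest.
first = [55, 70, 62, 48, 80]
retest = [40, 65, 50, 52, 60]
print(round(pearson_r(first, retest), 2))  # prints 0.67
```

A coefficient near 1.0 would indicate stable scores across the two sittings; a value this far below 1.0 would confirm the discrepancy described above.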

Additionally, inter-rater reliability was compromised because only one teacher, me, marked the exams. This was biased, as I had a relationship with my students and may have given or deducted marks based on their performance during the term: for instance, giving an extra mark to a student who always participated in class while deducting from one who did not, or rewarding a student who showed workings over one who did not. The marks awarded would have been generalized, since there was no table of specifications before the examination and the rubric would therefore have been unclear. As suggested before, a table of specifications should be prepared before the exam, along with a detailed rubric indicating how many marks are to be awarded for workings and for showing calculations.

For parallel-forms reliability, two different tests with similar constructs were conducted: the monthly tests and the end-of-term test. Again the results varied and students' marks were low; however, the time constraints of an online examination would affect reliability, since the monthly tests allowed a different time frame (2 days to submit) compared with the examination's 3 hours.

Another threat to reliability was that, for the teacher-made questions, the teacher would have used the everyday language of the classroom, which could have been misinterpreted or unclear to some students. As I am the only accounts teacher, a solution is to give the exam to a business colleague or an English Language teacher to check the document for ambiguity and wording.
