Testing: criteria for evaluating tests

Content validity
Refers to a test testing what it is supposed to test. In constructing a test you should draw up a list of the skills, structures etc. that you want to test, then devise the test using this list. The test may not contain all of these things, but it should contain a representative selection of them. This helps avoid testing what is easy to test rather than what is important to test.

Face validity
Refers to a test appearing to test what it is trying to test. This is not a scientific concept; it refers to how the test appears to the users. For example, if you aim to test a student's ability to read and understand whole texts, it might appear strange to do this by giving them a multiple-choice grammar test.

Test reliability
This means that if the same students, with the same amount of knowledge, took the same test at a different time, they would get more or less the same results. (It would never be exactly the same, because humans aren't like that.) The closer the results, the more reliable the test. It is unlikely that teachers designing tests will be able to test this kind of reliability. If a student does surprisingly badly or well in a test, what do you do? (A disadvantage of tests as the sole means of assessment!)

Scorer reliability
This means that different markers or scorers would give the same marks to the same tests. This is easy with discrete-item tests such as multiple choice, if there really is only one correct answer and the markers mark accurately. But with, for example, a piece of 'free writing', the marking may be more subjective, particularly if the marker knows the students who did the test. To improve scorer reliability you can use things like clear guidelines for marking (criteria and points awarded), standardisation meetings (to compare sample tests and agree on what constitutes an A, B or a C, for example), or double marking (two teachers mark each piece of work and the score is averaged).
