Sylvester Saimon Simin Keningau Teachers Training College
You should be able to…
Knowledge
6. Evaluation
6.1 Test Blue Print and Question Construction (10 hours)
Skills
• Prepare a test blue print based on KBSM Science Syllabus
• Construct objective, structure, and essay questions based on the test blue print
• Prepare marking schemes for the above questions
• Study and analyze the format of the examination papers on the following aspects: distribution of multiple choice, structure and essay questions
• Analyze each MCQ and classify it based on Bloom’s Taxonomy
Values / Remarks
T & L Resources:
• Kementerian Pendidikan Malaysia (1995d)
• Past year PMR and SPM Examination papers
6.2 Centralised examinations (PMR and SPM) Format of PMR and SPM examination papers
Values: To be aware of the importance of planning and preparation before administering a test. To be aware of the accountability of the test.
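A test blue print can be pictured as a grid of topics against Bloom's levels, with a planned number of items in each cell. The sketch below is only an illustration: the topic names and item counts are hypothetical, and the helper functions (`items_per_level`, `items_per_topic`, `check_paper`) are not part of any syllabus document. It shows how a set of classified questions could be checked against the plan.

```python
from collections import Counter

# The six levels of Bloom's original taxonomy (cognitive domain).
BLOOM_LEVELS = ["Knowledge", "Comprehension", "Application",
                "Analysis", "Synthesis", "Evaluation"]

# Hypothetical blue print: blueprint[topic][level] = number of items planned.
blueprint = {
    "Topic A": {"Knowledge": 3, "Comprehension": 2, "Application": 1},
    "Topic B": {"Knowledge": 2, "Comprehension": 2, "Application": 2, "Analysis": 1},
}


def items_per_level(blueprint):
    """Total planned items at each Bloom level, across all topics."""
    totals = Counter()
    for levels in blueprint.values():
        totals.update(levels)
    return {level: totals.get(level, 0) for level in BLOOM_LEVELS}


def items_per_topic(blueprint):
    """Total planned items for each topic, across all levels."""
    return {topic: sum(levels.values()) for topic, levels in blueprint.items()}


def check_paper(blueprint, classified_items):
    """Compare written questions with the plan.

    classified_items is a list of (topic, level) pairs, one per question.
    Returns {(topic, level): (planned, actual)} for every cell that differs.
    """
    actual = Counter(classified_items)
    planned = {
        (topic, level): count
        for topic, levels in blueprint.items()
        for level, count in levels.items()
    }
    mismatches = {}
    for cell in set(planned) | set(actual):
        p, a = planned.get(cell, 0), actual.get(cell, 0)
        if p != a:
            mismatches[cell] = (p, a)
    return mismatches
```

An empty result from `check_paper` means the drafted paper matches the blue print cell by cell; any entry points to a topic and level with too many or too few questions.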
I Have a Dream About Assessment Roger Farr
• I have a dream that assessment...
– ...will be accepted as a means to help teachers plan instruction rather than as a contrivance to force teachers to jump through hoops;
– ...will be based on trust in a teacher's judgment as much as numbers on a page are trusted;
– ...will become a helpful means to guide children to identify their own literacy strengths rather than a means to conveniently label them;
– ...will support each child in becoming the best he or she can be rather than a means to sort children into groups of the best and the worst;
– ...will be put to use to honor what children can do rather than destroying them for what they can’t do.
• I have a dream that assessment...
• And I have a dream that assessment...
• If we all work together we can make such dreams become a reality as we work to help each child grow.
Purpose of Evaluation
• to determine the students’ achievement of certain knowledge and skills as specified by the syllabus of the subject,
• to measure students’ progress over time,
• to rank students in terms of their achievement,
• to diagnose the main difficulties faced by the students in the areas of study,
• to determine how effective the teacher’s instructional strategies are,
• to determine the effectiveness of the curriculum, its strengths and weaknesses,
• to encourage good study habits,
• to motivate students
• Performance Assessments
– Assessment requiring students to demonstrate their achievement of understandings and skills by actually performing a task or set of tasks (e.g., writing a story, giving a speech, conducting an experiment, operating a machine)
• Alternative Assessment
– A title for performance assessments that emphasizes that these assessment methods provide an alternative to traditional paper-and-pencil testing.
• Authentic Assessment
– A title for performance assessments that stresses the importance of focusing on the application of understandings and skills to real problems in “real-world” contextual settings
WORKING DEFINITION
The official endorsement of the procedures and/or standards of an institution by an authority. For example, an examination board may accredit a center for the assessment of course work.
aim (educational aim): A long-term goal which may or may not be achievable within the teaching program.
A challenge by a candidate or a school to the results awarded by an examining authority.
assessment: General term used for the 'measurement' of a behavior or characteristic.
assessment component: One part of an assessment package, e.g., a written paper, a practical test, an oral exam, a piece of coursework.
assessment objective: A statement of an expected learning outcome which will be assessed.
assessment package: The total assessment scheme which may be composed of one or more components.
aural examination: Listening test (not to be confused with an 'oral test', i.e., a test of speaking).
backwash effect (occasionally 'washback effect'): The effect (positive or negative) of the scheme of assessment on the teaching/learning program which precedes it.
bias: Tendency of a test, or an item, to place one group at an advantage over another on the basis of a factor (e.g., gender, ethnicity, language) other than that which the test purports to assess.
camera-ready copy (CRC): Final proof of an examination paper as it will appear, after printing, on the candidate's desk.
Administrative arrangement where all answer scripts are brought to a central location for marking. Where markers remain at the center throughout the marking period, this may be referred to as 'residential marking'.
Use of examination results to provide individuals with documentary evidence of achievement (i.e., a certificate).
classical item statistics: Statistics describing the behavior of a test item (typically its level of difficulty and its discriminatory power) by analyzing the responses of a particular group of test-takers. Note that such statistics are dependent on the group taking the test. (See also IRT).
Special preparation of candidates for an examination, typically by practicing the techniques of test taking, rote learning of past questions and answers, 'question spotting' etc.
code of practice: Set of guidelines and/or regulations controlling the procedures of assessment authorities in the conduct of public examinations. Where examination bodies have constitutional autonomy, this may have to be a voluntary code of practice.
All educational aspects of an institution and its teaching programs including non-examined subjects.
Test score at which students are deemed successful (and below which they are deemed unsuccessful). See also grade threshold.
Procedure in which answer scripts are independently scored by two raters. Where there is a discrepancy between scores, set procedures apply for reaching the final score. Typically these include averaging small differences and using an 'expert marker' as an arbiter where differences are large.
Individuals or institutions who use examination results for their own purposes e.g., universities, schools, employers.
An equitable examination ensures that all students who possess the same degree of ability receive the same result. Where there are inequities, an individual or group gains an unfair advantage over others. It follows that inequity places some individuals and/or groups at a disadvantage due to factors other than the ability that the examination purports to assess.
Assessment for the purpose of making a value judgment, e.g., to judge the effectiveness of a teaching program.
Place officially recognized for the conduct of examinations. Typically centers are state schools, private schools, university halls or private buildings hired for examination purposes.
The systematic flow of information gained from an assessment to educationists, policy makers, and others e.g., examiner reports for teachers.
Assessment which takes place as an integral part of the teaching-learning program (see also summative assessment).
Test score between two reporting grades. For example, if the A-grade threshold is 81%, students scoring 80% will be awarded grade B and those scoring 81%, grade A.
Examination system which requires candidates to take a prescribed number and combination of subjects. The award of the certificate is dependent on the candidate meeting pre-determined criteria for success.
An examination where students, parents and teachers invest a great deal of effort, and perhaps money, in preparing because success can potentially bring great rewards whilst failure may damage the candidate's life-chances.
Form of malpractice where someone takes an examination in place of the registered candidate.
Person who supervises and is responsible for the conduct of an examination in a particular examination room/hall.
Item Response Theory (sometimes IRM - Item Response Modeling). Psychometric tool which, in its simplest form, uses a mathematical model to link a student's chance of being successful on an item with the student's ability and the item's difficulty. This allows items to be calibrated on an absolute measurement scale.
A collection of items categorized according to their characteristics e.g., type of item, topic, skill being assessed, level of difficulty, etc. Items are then drawn from the bank to build a test according to predetermined test specifications.
A table which ranks schools on the basis of examination results and other indicators (see also 'value added').
Unauthorized release of examination materials and/or information prior to the official release date.
Where an independent country takes responsibility for the maintenance and further development of an examination system introduced by a former colonial authority.
Any deliberate act of wrongdoing, contrary to the rules of the examination, designed to give a candidate an unfair advantage or, albeit less frequently, to place a candidate at a disadvantage.
marker: One who marks/scores candidate responses (also rater).
marking scheme: Instructions as to how marks are to be allocated to student responses (answers). These may be detailed for objective and semi-objective tasks. For open-ended and subjective tasks, they may take the form of general descriptions ('band descriptors').
An assessment made using the concept of a well-defined ability scale to quantify a behavior or characteristic e.g. mathematical ability.
General term used by examining authorities for the process of checking quality. Question paper moderation typically involves the review of draft question papers by an expert panel. Moderation of school-based assessment may involve a Board representative visiting the school to look at work and interview teachers and students. Alternatively, samples of student work may be sent for review by a Board moderator.
Assessment designed to determine national standards usually conducted using a representative sample of students.
Item that can be scored without the marker making a personal judgment as to the quality of the response e.g., multiple-choice.
Optical Mark Reader - scanning device for reading marks from special forms thereby allowing the automatic input of student responses to, for example, multiple-choice question papers.
Term applied, especially in Africa, to an organization established by a government but which, through its constitution and budgetary arrangements, enjoys a great degree of operational freedom and insulation from direct political interference.
pedagogy: The science of teaching including both theory and practice.
private candidate: Candidate who enters, and pays for, his/her own entry to a public examination as compared with a candidate who is entered by the institution (school) in which he/she is studying and which is recognized by the examining authority as an authorized center.
Field concerned with the measurement, and hence quantification, of human behaviors and characteristics. Psychometric strategies are built on statistical models of measurement and human behavior.
An examination offered by a national or provincial (state) authority, or on behalf of such an authority, to students at a particular level of an education system. The primary purpose is to certify the level of achievement of individual students and/or to select students for the next level of the education system.
Form of selection system where the share of available opportunities to be awarded to a particular group is pre-determined. For example, in order to ensure gender balance in a selective secondary school system, 50% of places may be awarded to boys and 50% to girls. As a consequence, some boys may be selected with lower examination scores than those achieved by girls who are rejected (or vice versa).
One who marks/scores candidate responses - a marker.
Key process whereby the details of individuals (students) are entered into the administrative database as candidates for forthcoming examinations.
Term used, particularly in the Asian sub-continent, for candidates registering through recognized centers for a series of examinations for the first time. Private candidates and those re-sitting examinations are considered irregular.
A measure of the stability of the results produced by an examination. This includes the stability of scores on re-testing, the stability of scores with remarking, and the correlation of scores for sub-sections within the test (homogeneity).
Any assessment of student performance which takes place in a school and is incorporated into the public examination result. Note that the degree of freedom allowed to the school will depend on the regulations and moderation procedures of the examining authority.
General term for an answer booklet or sheets produced by a candidate in response to an assessment task.
Use of examination results to select individuals for educational or employment opportunities where the number of such opportunities is limited. In many developing countries, examination results are used to select students for the next phase of education e.g. primary-secondary, lower secondary-higher secondary, secondary-tertiary.
A plan or 'blueprint' giving the format of a question paper or other assessment component.
stakes (of an examination): The importance of an examination as judged by what may be gained through success, and what may be lost through failure. Therefore, a 'high-stakes' examination will typically be highly competitive because the successful will enjoy greatly enhanced opportunities.
Task composed of a number of sub-questions (items) linked by a common context or piece of stimulus material. The sub-questions may be independent of each other or may be sequenced to lead candidates through a more complex task (progressive).
Item that requires the marker (rater) to make a personal judgment as to the quality of the response e.g. the literary merits of an essay or the artistic merits of a painting. Note that in order to minimize variation, rater judgments may be guided and constrained by marking schemes and descriptors of performance.
Assessment which takes place at the end of the teaching-learning program to record 'final achievement' (see also formative assessment).
A follow-up examination allowing students to retake subjects in which they have not reached the required level. This issue is of particular importance in systems awarding group certificates.
syllabus (examination syllabus): A document formally specifying what will be assessed by the examination and how the assessment will be carried out.
tamper-evident packaging: Plastic envelopes for examination materials which cannot be resealed without showing obvious signs of being opened.
teaching objective (curriculum objective): A specific short-term goal of the teaching program.
teaching program: The program of instruction.
teaching/learning program: The instruction delivered by a teacher coupled with the learning that takes place during the program.
Extent to which the processes involved in the examination system are visible to the public, especially schools, teachers and students.
A measure of the extent to which an examination measures what it purports to measure.
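The glossary describes classical item statistics as an item's level of difficulty and its discriminatory power, computed from one group's responses. A minimal sketch of two commonly used indices follows. It assumes items are scored 0 or 1, and it compares the top and bottom 27% of test-takers by total score, which is a common convention rather than anything stated in the glossary. The function names and the sample data are hypothetical.

```python
def difficulty_index(item_scores):
    """Proportion of test-takers who answered the item correctly (scores are 0 or 1)."""
    return sum(item_scores) / len(item_scores)


def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Difference in proportion correct between high and low scorers.

    The upper and lower groups are the top and bottom `fraction` of
    test-takers ranked by total test score (27% is an assumed convention).
    """
    n = len(total_scores)
    k = max(1, round(n * fraction))
    # Indices of test-takers ordered from highest to lowest total score.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical data: ten students, one multiple-choice item.
item = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]               # 1 = correct on this item
totals = [38, 35, 33, 30, 28, 25, 22, 18, 15, 10]   # total test scores

p = difficulty_index(item)              # 6 of 10 correct
d = discrimination_index(item, totals)  # top 3 versus bottom 3 students
```

Because both values are computed from one particular group, the same item can look easier or harder, and more or less discriminating, with a different class.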
Achievement: A demonstration of learning at a particular moment in time
Alternative assessment: Any and all assessments that differ from the multiple choice, one word answer, timed items that characterize standard tests
Assessment: The gathering of data about students or program, often used as a formative process to guide instruction
Criterion: The standard against which performance is measured
Criterion-referenced: Judgement of performance against a previously agreed standard
Diagnostic assessment: Determines the level of achievement/performance prior to entering a
Evaluation: The application of judgement to the data in the form of a grade or comment, placing a value on that work
Formative assessment: Ongoing feedback on a student's performance throughout the learning process
Grading: Assigning a letter, percentage or score
Ipsative assessment: The measure of student growth
Learning outcome: A general statement which describes an observable result by which a student demonstrates knowledge, skill or attitude
Norm-referenced: Judgement of performance against the norm for the group
Objective: A specific statement of intent
Peer assessment: Reflective practice in which students make observations about the performance of their peers
Performance assessment: Usually an alternate or authentic assessment, where a student completes a relevant task which demonstrates learning by using or applying knowledge
Portfolio assessment: The assessment of a representative collection of a student's work over time
Process assessment: Focuses on the variety of strategies, thinking skills and processes that a student uses to complete a task
Product assessment: Focuses on the end product of a learning process
Reporting: Communicating process or achievement to the student or his/her parents or guardian
Rubric: A set of quality criteria
Self-assessment: Reflective practice in which students make observations about their own performance
Self-referenced: Reflective practice in which students make observations about their own performance
Standard: A point of reference against which judgements can be made
Summative Evaluation: A report on the final achievement, given at the end of a unit of work, semester or year
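The difference between a criterion-referenced and a norm-referenced judgement can be made concrete with a small sketch. The scores, the agreed standard of 65 and the function names below are hypothetical. The norm-referenced figure used here, the percentage of the group scoring lower, is only one of several ways to express standing within a group.

```python
def criterion_referenced(score, standard):
    """Judge a score against a previously agreed standard."""
    return "meets standard" if score >= standard else "below standard"


def norm_referenced(score, group_scores):
    """Judge a score against the group: percentage of the group scoring lower."""
    below = sum(1 for s in group_scores if s < score)
    return 100 * below / len(group_scores)


# Hypothetical data: the same score of 62 placed in two different classes.
strong_class = [40, 55, 60, 62, 70, 75, 80, 85, 90, 95]
weak_class = [30, 35, 40, 45, 62]

judgement = criterion_referenced(62, standard=65)  # same in any class
standing_strong = norm_referenced(62, strong_class)
standing_weak = norm_referenced(62, weak_class)
```

The criterion-referenced judgement does not change with the class, while the norm-referenced standing of the same score rises when the rest of the group scores lower.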
What is performance assessment?
• A performance assessment is an assessment activity that requires students to construct a response, create a product or demonstrate a skill they have acquired. Rubrics, based on the selected criteria, are given to students to ensure that they know what they need to do to meet or exceed the learner outcomes.
• Well-constructed performance assessments:
– are the most authentic types of assessment since they replicate out-of-school experiences, encourage self-evaluation and demonstrate what students know and can do;
– put students in a role (e.g. scientist, newspaper editor) and provide an audience for their task;
– provide degrees of proficiency based on criteria and make public the criteria.
A few things to know ……
• Bloom’s taxonomy
• Difference between:
– testing, measurement, evaluation
– objective & subjective items
– formative & summative evaluation
– criterion-referenced test & norm-referenced test
• Validity & Reliability
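Reliability concerns how stable the results of a test are. One common way to put a number on it is the correlation between two sets of scores from the same students, for example a test and a re-test. The sketch below computes Pearson's correlation coefficient by hand. The function name and the scores are hypothetical, and the coefficient says nothing about validity.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five students on a test and on a re-test.
first_sitting = [10, 20, 30, 40, 50]
second_sitting = [12, 22, 32, 42, 52]

r = pearson_r(first_sitting, second_sitting)
```

A value close to 1 means students keep roughly the same rank order across the two sittings; a value near 0 means the scores are unstable.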
The Assessment Process
1. Preparation (including Test / Task Blueprint)
– Determine the kind of information needed and decide how and when to obtain it.
2. Information gathering
– Obtain a variety of information as accurately as possible.
3. Forming judgements
– Judgements are made by comparing the information to selected criteria.
4. Decision making and reporting
– Record significant findings and determine appropriate courses of action.
• Information gathering techniques
– Procedure for obtaining information
– Inquiry (asking), observation (senses), analysis (performance, product), testing (a common situation to which all students respond, a common set of instructions governing response, a set of rules for scoring responses, and a description of performance, i.e., a score)
• Information gathering instrument
– Tools used to gather information – 3 basic types : tests, rubrics and questionnaires
• Teacher made test / classroom tests vs standardized tests
• Rubric : set of rules for scoring student products or performance. Typically takes the form of a checklist or a rating scale
• Questionnaires : useful for getting opinions, feelings and interests
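The two usual forms of a rubric can be sketched in a few lines. A checklist records only whether each criterion is present, while a rating scale records a quality level for each criterion. The criteria, the 1 to 4 scale and the function names below are hypothetical, chosen only to illustrate the distinction.

```python
def score_checklist(criteria, observed):
    """Checklist: record presence (True) or absence (False) of each criterion."""
    return {criterion: criterion in observed for criterion in criteria}


def score_rating_scale(ratings, scale_max=4):
    """Rating scale: each criterion gets a quality rating from 1 to scale_max.

    Returns (points awarded, maximum possible points).
    """
    for criterion, rating in ratings.items():
        if not 1 <= rating <= scale_max:
            raise ValueError(f"{criterion}: rating {rating} is outside 1..{scale_max}")
    return sum(ratings.values()), scale_max * len(ratings)


# Hypothetical criteria for a student's experiment report.
criteria = ["states hypothesis", "records data in a table", "draws a conclusion"]

checklist_result = score_checklist(criteria, observed={"states hypothesis", "draws a conclusion"})
rating_result = score_rating_scale({
    "states hypothesis": 3,
    "records data in a table": 4,
    "draws a conclusion": 2,
})
```

The checklist result shows which criteria were missing but not how well the others were done; the rating scale adds that judgement of quality.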
Information Gathering Techniques
inquiry
– Kind of information obtainable: opinions; self-perceptions; subjective judgements; affective (especially attitudes); social perceptions
– Objectivity: least objective; highly subject to bias and error
– Cost: inexpensive but can be time consuming

observation
– Kind of information obtainable: performance or end products of some performance; affective (especially emotional reactions); social interaction; psychomotor skills; typical behavior
– Objectivity: subjective, but can be objective if care is taken in the construction and use of the instruments
– Cost: inexpensive but time-consuming

analysis
– Kind of information obtainable: learning outcomes during the learning process (intermediate goals); cognitive and psychomotor skills; some affective outcomes
– Objectivity: objective but not stable over time
– Cost: fairly inexpensive; preparation time is somewhat lengthy but crucial

testing
– Kind of information obtainable: attitude and achievement; terminal goals; cognitive outcomes; maximum performance
– Objectivity: most objective and reliable
– Cost: most expensive, but most information gained per unit of time
Information Gathering Instrument
Standardized test
– Used: when accurate information is needed
– Advantage: Usually well developed and reliable. Include norms for comparing the performance of a class or an individual.
– Disadvantage: Often not measuring exactly what had been taught. Expensive. Limited in what is measured.

Teacher made test
– Used: routinely as a way to obtain achievement information
– Advantage: Usually measure exactly what has been taught. Inexpensive. Can be constructed as need arises.
– Disadvantage: No norms beyond the class are available. Often unreliable. Require quite a bit of time to construct.

Rubrics
– Used: to assess the quality of student performance

Checklists
– Used: to determine the presence or absence of specific characteristics of performance
– Advantage: Helpful in keeping observations focused on key points or critical behaviors.
– Disadvantage: Measure only presence or absence of a trait or behavior.

Rating Scales
– Used: to judge quality of performance
– Advantage: Allow observational data to be used in making qualitative as well as quantitative judgements.
– Disadvantage: Take time and effort to construct. Can be clumsy to use if too complex.

Questionnaires
– Used: to inquire about feelings, opinions, and interests
– Advantage: Keep inquiry focused and help teacher obtain the same information from each student.
– Disadvantage: Take time and effort to construct. Difficult to score. No right or wrong answers. Data difficult to summarize.