
Basic Concepts in Assessment

How can we use assessment as a tool to improve our teaching?

Assessments as Tools

Assessment is the process of observing a sample of students' behavior and drawing inferences about their knowledge and abilities. Because we can never observe everything a student knows, we use a sample of behavior to draw inferences about achievement.

Forms of Educational Assessment
- Informal vs. formal assessment: Informal assessments are spontaneous, day-to-day observations of students' performance in class. Formal assessment is planned in advance and used for a specific purpose, such as determining what has been learned in a specific domain.
- Paper-pencil vs. performance assessment: Paper-pencil assessment asks students to respond in writing to questions. Performance assessment asks students to demonstrate knowledge or skills in some other fashion, by performing in some way.
- Traditional vs. authentic assessment: Traditional assessment measures basic knowledge and skills in isolation from real-world tasks. Authentic assessment measures students' ability to use what they have learned in tasks similar to those in the outside world.
- Standardized vs. teacher-developed tests: Standardized tests are developed by testing experts and published for use in many schools. Teacher-developed tests are created by a teacher for use in an individual classroom.

Purposes for Assessment

Formative evaluation: assessing what students know before and during instruction, so that lesson plans can be redesigned as needed. Summative evaluation: assessment after instruction, to determine what students have learned and to compute grades.

Assessments also promote learning:
- as motivators
- as mechanisms for review
- as influences on cognitive processing (students study differently depending on the types of test items they expect)
- as learning experiences
- as feedback

Qualities of Good Assessments: RSVP
Reliability, Standardization, Validity, Practicality

Reliability
The extent to which the instrument gives consistent information about the abilities being measured. The reliability coefficient is a correlation coefficient ranging from -1 to +1. The standard error of measurement (SEM) shows how close a student's observed score is likely to be to his or her true score, the hypothetical error-free score the student would earn if the test were perfectly reliable. Test manuals report the SEM, which lets us place an observed score within a confidence interval, the range of scores that probably contains the true score. (A worked example follows the guidelines below.)

Enhancing the reliability of classroom assessments:
- Use several tasks in each instrument.
- Define each task clearly enough that students know what is being asked.
- Use specific, concrete criteria.
- Keep expectations out of judgments.
- Avoid assessing a child when s/he is ill, tired, or out of sorts in some way.
- Use the same techniques and environment for assessing all students.
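The SEM arithmetic is simple enough to show directly. Below is a minimal Python sketch using hypothetical test-manual figures (a reliability coefficient of .91 and a norm-group SD of 15):

```python
import math

# A minimal sketch of the standard error of measurement (SEM) and the
# confidence interval it implies; all figures here are hypothetical.
reliability = 0.91    # reliability coefficient from a test manual
sd = 15.0             # standard deviation of scores in the norm group
observed_score = 108  # one student's observed score

# A common formula: SEM = SD * sqrt(1 - reliability coefficient)
sem = sd * math.sqrt(1 - reliability)

# The true score falls within one SEM of the observed score roughly
# 68% of the time, and within two SEMs roughly 95% of the time.
low, high = observed_score - sem, observed_score + sem
print(f"SEM = {sem:.1f}")                                # SEM = 4.5
print(f"68% confidence interval: {low:.1f}-{high:.1f}")  # 103.5-112.5
```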

Standardization
The principle that assessment instruments have similar, consistent content and format and are administered and scored in the same way for everyone. Standardized procedures reduce error in assessment results and are considered more reliable.

Validity
The extent to which an instrument measures what it is designed to measure.
- Content validity: the items are representative of the skills and content domain described.
- Predictive validity: how well the instrument predicts future performance (e.g., SAT, ACT).
- Construct validity: how well the instrument measures an abstract, internal characteristic such as motivation, intelligence, or visual-spatial ability.

Essentials of testing: An assessment tool may be more valid for some purposes than for others, and reliability is necessary for validity, but reliability does not guarantee validity.

Practicality
The extent to which an instrument is easy to use. How much time will it take? How easily can it be administered to a group of children? Are expensive materials needed? How easily can performance be evaluated?

Standardized Tests

Criterion-referenced scores show what a student can do relative to certain standards. Norm-referenced scores compare a student's performance with that of other students on the same task; the norms are derived from testing large numbers of students.

Types of standardized tests:
- Achievement tests: assess how much students have learned of what has been taught.
- Scholastic aptitude tests: assess students' capability to learn and predict general academic success.
- Specific aptitude tests: predict how students are likely to perform in a particular content area.

Technology and assessment:
- Allows adaptive testing
- Can include animation, simulation, video, and audio
- Enables easy assessment of specific problems
- Assesses students' abilities with varying levels of support
- Provides immediate scoring

Guidelines for choosing standardized tests:
- Choose a test with high validity for your purpose and high reliability.
- Be sure the test's norm group is relevant to your student population.
- Follow the administration directions closely.

Types of test scores (a sketch contrasting the two interpretations follows this list):
- Raw scores: based on the number of correct responses.
- Criterion-referenced scores: compare performance to criteria or standards for success.
- Norm-referenced scores: compare a student's performance to that of students of the same age or grade.
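To make the contrast concrete, here is a small Python sketch showing how the same raw score can be interpreted both ways; the mastery cutoff and the norm-group scores are invented for illustration:

```python
# Hypothetical sketch: interpreting one raw score two ways.
raw_score = 42
total_items = 50
mastery_cutoff = 0.80  # invented criterion: 80% correct = mastery

# Criterion-referenced: compare the score to a fixed standard.
percent_correct = raw_score / total_items
mastered = percent_correct >= mastery_cutoff
print(f"Criterion-referenced: {percent_correct:.0%} correct ->",
      "mastered" if mastered else "not mastered")

# Norm-referenced: compare the score to same-grade peers' scores.
norm_group = [28, 31, 35, 36, 38, 40, 41, 43, 45, 47]
below = sum(1 for s in norm_group if s < raw_score)
percentile_rank = 100 * below / len(norm_group)
print(f"Norm-referenced: percentile rank {percentile_rank:.0f}")
```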

Norm-Referenced Scores

- Grade-equivalents and age-equivalents compare a student's performance to the average performance of students at the same grade or age.
- Percentile ranks show the percentage of students of the same age or grade who scored lower than the individual.
- Standard scores show how far the individual's performance is from the mean, in standard deviation units.

Standard scores rest on the normal distribution (bell curve), its mean, and its standard deviation (the variability of a set of scores). Common standard-score scales (a conversion sketch follows):
- IQ scores: mean of 100, SD of 15
- ETS scores (Educational Testing Service tests such as the SAT and GRE): mean of 500, SD of 100
- Stanines (used for standardized achievement tests): mean of 5, SD of 2
- z-scores (used statistically): mean of 0, SD of 1

Norm- vs. criterion-referenced scores: Norm-referenced scoring ("grading on the curve," based on the class average) sets up a competitive environment rather than a sense of community, though it may be appropriate in performance contexts, such as deciding who gets to be first chair in band. Criterion-referenced scores show whether students have mastered the objectives.

Interpreting test scores:
- Compare two norm-referenced test scores only when those scores come from equivalent norm groups.
- Have a clear rationale for cutoff scores that define acceptable performance.
- Never use a single test score to make important decisions.
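Because each of these scales is a linear transformation of the z-score, converting between them is mechanical. Here is a Python sketch using a hypothetical score of 115 on the IQ scale:

```python
from statistics import NormalDist

# Convert a score on one standard-score scale to the others via its
# z-score; the student's score below is hypothetical.
def to_z(score, mean, sd):
    return (score - mean) / sd

def from_z(z, mean, sd):
    return mean + z * sd

iq = 115                       # one SD above the IQ mean of 100
z = to_z(iq, mean=100, sd=15)  # z = 1.0

print(f"z-score:   {z:.1f}")
print(f"ETS scale: {from_z(z, mean=500, sd=100):.0f}")  # 600
print(f"Stanine:   {from_z(z, mean=5, sd=2):.0f}")      # 7
# Percentile rank: the share of a normal distribution below this z.
print(f"Percentile rank: {NormalDist().cdf(z):.0%}")    # 84%
```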

High-Stakes Testing and Accountability

High-stakes testing means making major decisions on the basis of a single assessment; accountability means holding teachers and administrators responsible for students' performance on those tests. Some tests have determined whether a student passes a grade or graduates.

Problems with high-stakes testing:
- Tests do not always reflect instructional objectives.
- Teachers spend time teaching to the tests.
- Low achievers and special education students are often not included.
- Criteria are often biased against students from lower-SES backgrounds.
- There is not enough emphasis on helping schools and students improve.

Potential solutions:
- Identify what is most important for students to know.
- Educate the public about what test scores can and cannot tell us.
- Look at alternatives to tests.
- Use multiple measures in making high-stakes decisions.

Confidentiality and communication of test results: The Family Educational Rights and Privacy Act limits school testing to achievement and scholastic aptitude, and restricts test results to students, parents, and teachers. It rules out practices such as having students grade one another's papers, posting scores publicly, or letting students go through a stack of papers to find their own. Parents and students can review test scores and school records.

Communicating classroom assessment results: Assessment exists primarily to help students learn and achieve more effectively.

Class results must be communicated to parents to support student success.

Explaining standardized test results:
- Be sure you understand the test results yourself.
- It may be sufficient to explain test results in general terms.
- Use percentile ranks rather than IQ scores or grade equivalents.
- Describe the SEM and confidence intervals if you know them.

Taking student diversity into account: developmental differences, test anxiety, cultural bias, language differences, and testwiseness.

Accommodating students with special needs:
- Modify the format of the test
- Modify the response format
- Modify the timing
- Modify the setting
- Administer part, not all, of the test
- Use instruments that are more compatible with the student's level

Testing

In general, testing is finding out how well something works. Applied to human beings, testing tells what level of knowledge or skill has been acquired. In computer hardware and software development, testing is used at key checkpoints in the overall process to determine whether objectives are being met. For example, in software development, product objectives are sometimes tested by product user representatives. When the design is complete, coding follows, and the finished code is then tested at the unit or module level by each programmer, at the component level by the group of programmers involved, and at the system level when all components are combined. At early or late stages, a product or service may also be tested for usability. At the system level, the manufacturer or an independent reviewer may subject a product or service to one or more performance tests, possibly using one or more benchmarks. Whether viewed as a product, a service, or both, a Web site can also be tested in various ways: by observing user experiences, by asking questions of users, by timing the flow through specific usage scenarios, and by comparing it with other sites. (A unit-test sketch follows.)
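Unit-level testing is the first checkpoint mentioned above. Here is a minimal Python sketch of what a programmer's unit test looks like, using the standard unittest module; the function under test is made up for illustration:

```python
import unittest

def discount_price(price: float, percent: float) -> float:
    """Hypothetical module-level function under test."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

class TestDiscountPrice(unittest.TestCase):
    """Unit tests: each method checks one objective of the function."""

    def test_typical_discount(self):
        self.assertEqual(discount_price(80.0, 25), 60.0)

    def test_zero_discount_leaves_price_unchanged(self):
        self.assertEqual(discount_price(19.99, 0), 19.99)

    def test_invalid_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            discount_price(10.0, 150)

if __name__ == "__main__":
    unittest.main()
```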

Meaning of Evaluation

Evaluation has its origin in the Latin root valere, "to be worth," and refers to the value of a particular thing, idea, or action. Evaluation thus helps us to understand the worth, quality, significance, amount, degree, or condition of any intervention designed to tackle a social problem.

Meaning of evaluation:
- Evaluation means finding out the value of something.
- Evaluation simply refers to procedures of fact finding.
- Evaluation consists of assessing whether or not certain activities, treatments, and interventions conform to generally accepted professional standards.
- Any information obtained by any means on either the conduct or the outcome of interventions, treatments, or social change projects is considered evaluation.
- Evaluation is designed to provide systematic, reliable, and valid information on the conduct, impact, and effectiveness of projects.
- Evaluation is essentially the study and review of past operating experience.

Purpose of Evaluation

From an accountability perspective, the purpose of evaluation is to make the best possible use of funds by the program managers who are accountable for the worth of their programs:
- Measuring accomplishment in order to avoid weaknesses and future mistakes
- Observing the efficiency of the techniques and skills employed
- Identifying scope for modification and improvement
- Verifying whether the benefits reached the people for whom the program was meant

From a knowledge perspective, the purpose of evaluation is to establish new knowledge about social problems and the effectiveness of policies and programs designed to alleviate them:
- Understanding people's participation and the reasons for it
- Helping to make plans for future work

Principles of Evaluation

The following principles should be kept in view in evaluation:
1. Evaluation is a continuous process (continuity).
2. Evaluation should involve the minimum possible cost (inexpensiveness).
3. Evaluation should be done without prejudice to day-to-day work (minimum hindrance to day-to-day work).
4. Evaluation must be done on a cooperative basis in which the entire staff and the board members participate (total participation).
5. As far as possible the agency should evaluate its own program, but occasionally outside evaluation machinery should also be used (external evaluation).
6. A total, overall examination of the agency will reveal strengths and weaknesses (agency/program totality).
7. The results of evaluation should be shared with the workers of the agency (sharing).

Stages in Evaluation

1. Program planning stage: pre-investment evaluation, also called formative evaluation, ex ante evaluation, early/formulation evaluation, pre-project evaluation, exploratory evaluation, or needs assessment.
2. Program monitoring stage: monitoring evaluation, also called ongoing/interim or concurrent evaluation.
3. Program completion stage: impact evaluation, also called ex post evaluation or summative/terminal/final evaluation.

Types of Evaluation


Evaluation can be categorized under different headings.

A) By timing (when to evaluate):
- Formative evaluation: done during the program's development stages (process evaluation, ex ante evaluation, project appraisal).
- Summative evaluation: taken up when the program achieves a stable state of operation or when it is terminated (outcome evaluation, ex post evaluation, etc.).

B) By agency (who is evaluating):
- Internal evaluation
- External evaluation

Internal evaluation is progress/impact monitoring by the management itself (ongoing/concurrent evaluation); external evaluation is an unbiased, objective, detailed assessment by an outsider.

C) By stages:
- Ongoing: during the implementation of a project.
- Terminal: at the end of, or immediately after, the completion of a project.
- Ex post: after a time lag from the completion of a project.

[Figure: project timeline showing the types of evaluation, from the present situation through a mid-term review, an end-of-project or final evaluation, and an ex-post or impact evaluation, leading to the desired situation of sustained benefits and impact.]

Internal / External Evaluation:

Internal evaluation (enterprise self-audit): Internal evaluation (otherwise known as monitoring or concurrent evaluation) is a continuous process carried out at various points, and in respect of various aspects of the working of an agency, by the agency itself, i.e., its staff, board members, and beneficiaries.

External / outside evaluation (done by outsiders; certified management audit): Grant-giving bodies, in order to find out how the money given is utilized by the agency or how the program is implemented, send experienced and qualified evaluators (inspectors) to assess the work, e.g., the Central Social Welfare Board. Some donors may send consultants to see how far the standards laid down are put into practice.

Inter-agency evaluation: Two agencies mutually agree to evaluate each other's programs, for example through inter-agency tours.

Methods of Evaluation: (Tools / techniques)

Over the years, a variety of methodologies have been evolved by academicians, practitioners, and professionals for evaluating any program or project. Some of the commonly used practices are given below.

First-hand information: One of the simplest and easiest methods of evaluation is getting first-hand information about the progress, performance, problem areas, etc., of a project from the staff, line officers, field personnel, other specialists, and members of the public who are directly associated with the project. Direct observation of, and hearing about, the performance and pitfalls further improve the chances of an effective evaluation.

Formal / informal periodic reports: Evaluation is also carried out through formal and informal reports. Formal reports consist of the following (a sketch of the data behind a status report follows this list):
- Project status report: shows the current status, performance, schedule, cost, hold-ups, and deviations from the original schedule.
- Project schedule chart: indicates the time schedule for implementation of the project; from it one can see any delay, the cost of the delay, and the ultimate loss.
- Project financial status report: through the financial report one can see at a glance whether the project is being implemented within a realistic budget and on time.

Informal reports: Informal reports such as anonymous letters, press reports, complaints by beneficiaries, and petitions sometimes reveal the true nature of the project, even though these reports may be biased and contain malicious information.
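As an illustration of the kind of information a formal report carries, here is a Python sketch of a project status report with a simple schedule and budget deviation check; every field name and figure is invented:

```python
from dataclasses import dataclass

# A hypothetical sketch of the data a project status report might hold.
@dataclass
class ProjectStatusReport:
    name: str
    planned_budget: float    # sanctioned budget
    spent: float             # expenditure to date
    planned_weeks: int       # original schedule
    elapsed_weeks: int       # time used so far
    percent_complete: float  # physical progress, 0-100

    def deviations(self) -> list[str]:
        """Flag deviations from the original schedule and budget."""
        notes = []
        expected = 100 * self.elapsed_weeks / self.planned_weeks
        if self.percent_complete < expected:
            notes.append(f"behind schedule: {self.percent_complete:.0f}% "
                         f"done vs {expected:.0f}% expected")
        if self.spent > self.planned_budget * self.percent_complete / 100:
            notes.append("spending ahead of physical progress")
        return notes

report = ProjectStatusReport("Village water supply", 500_000, 320_000,
                             planned_weeks=40, elapsed_weeks=24,
                             percent_complete=45)
print(report.deviations())
```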

Graphic presentations: Graphic presentations through the display of charts, graphs, pictures, illustrations, etc., in the project office are yet another instrument for close evaluation.

Standing evaluation review committees: Some organizations have set up standing committees, consisting of experts and specialists, who meet regularly at frequent intervals to discuss problems and suggest remedial measures.

Project profiles: Preparation of project profiles by investigating teams, on the basis of standardized guidelines and models developed for the purpose, is another method of evaluation.

Views about evaluation


Evaluation is primarily perceived from three perspectives:
- Evaluation as an analysis: determining the merits or deficiencies of a program, its methods, and its processes.
- Evaluation as an audit: a systematic and continuous enquiry measuring the efficiency of means in reaching particular preconceived ends.
- In the agency context: evaluation of administration means appraisal or judgement of the worth and effectiveness of all the processes (e.g., planning, organizing, staffing) designed to ensure that the agency accomplishes its objectives.

Areas of evaluation: Evaluation may be split into various aspects, so that each area of the work of the agency, or of a particular project, is evaluated. These may be:
1. Purpose
2. Programs
3. Staff
4. Financial administration
5. General

Purpose: To review the objectives of the agency or project and how far they are being fulfilled.

Programs:

Aspects like the number of beneficiaries, the nature of services rendered to them, their reaction to the services, and the effectiveness and adequacy of the services may be evaluated.

Staff: The success of any welfare program or agency depends upon the type of staff it employs. Their attitudes, qualifications, recruitment policy, pay and other benefits, and the organizational environment are the areas that help us understand the effectiveness of the project or agency.

Financial administration: The flow of resources and their consumption is a crucial factor in any project or agency. Whether the project money is rightly spent, whether there is overspending under some headings, and whether funds are properly appropriated or misappropriated are some of the indicators that reveal the reasons for the success or failure of a project.

General: Factors like the public relations strategies employed by the project or agency, the constitution of the agency board or project advisory committee and its contribution, and the future plans of the agency are important for understanding the success or failure of any project.

Evaluation

Evaluation analyses how successful the project has been in transforming the means (i.e., the resources and inputs allocated to the project) through project activities into concrete project results, and provides the stakeholders with information on inputs and costs per unit produced. A small worked sketch of this cost-per-unit calculation follows.
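A minimal Python sketch of the efficiency question, inputs/costs per unit of result produced; all figures are hypothetical:

```python
# Efficiency as cost per unit of concrete project result.
inputs_cost = 240_000.0  # total resources allocated (hypothetical)
units_produced = 1_200   # concrete results, e.g. latrines built

cost_per_unit = inputs_cost / units_produced
print(f"Cost per unit produced: {cost_per_unit:.2f}")  # 200.00

# Comparing against an alternative approach addresses the efficiency
# question below: was the least costly approach adopted?
alternative_cost_per_unit = 235.0  # hypothetical competing approach
print("More efficient than alternative:",
      cost_per_unit < alternative_cost_per_unit)
```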

[Figure: logical-framework diagram linking means + preconditions, through activities, results, and project purpose (each with assumptions), to the overall objectives; allocation, action, utilisation, and change mark the successive steps, with efficiency, effectiveness, and impact as the corresponding evaluation criteria.]

Effectiveness: Analysis of how well the production of project results contributes to the achievement of the project purpose, i.e.: Are there clear indications of changes and improvements that benefit the beneficiaries of the project? Uses baseline information on the pre-project situation as a starting point.

Impact: Analysis of the overall effects of the project and of the contribution of the project purpose to the overall objectives. The focus is on long-term changes in the environment of the project; information is collected and analysed at the levels of communities and society at large, focusing on the final beneficiaries of the project. Unintended impacts (negative and positive) are also analysed.

Criteria for Evaluating Development Assistance

Relevance = The extent to which the aid intervention is suited to the priorities and policies of the target group, partner country, and donor. Possible questions:
- To what extent are the objectives of the program still valid?
- Are the activities and outputs of the program consistent with the overall goal and the attainment of its objectives?
- Are the activities and outputs of the program consistent with the intended impacts and effects?

Efficiency = A measure of the outputs, qualitative and quantitative, in relation to the inputs. The term signifies that the aid uses the least costly resources to achieve the desired results; this generally requires comparing alternative approaches to achieving the same outputs, to see whether the most efficient process has been adopted. Possible questions:
- Were the activities cost-efficient?
- Were objectives achieved on time?
- What were the major factors influencing the achievement of the results?

Effectiveness = A measure of the extent to which an aid intervention attains its objectives. Possible questions:
- To what extent were the objectives achieved, or are they likely to be achieved?
- What were the major factors influencing the achievement or non-achievement of the objectives?

Impact = The positive and negative changes produced by an intervention, directly or indirectly, intended or unintended. Possible questions:
- What has happened as a result of the programme or project?
- What real difference has the activity made to the beneficiaries?
- How many people have been affected?

Sustainability = A measure of whether the benefits of an activity are likely to continue after donor funding has been withdrawn. Possible questions:
- To what extent did the benefits of a programme or project continue after donor funding ceased?
- What were the major factors which influenced the achievement or non-achievement of sustainability of the programme or project?
