INTRODUCTION:
Rating scales are the crudest form of measurement using the scaling technique. Scaling describes the procedures for assigning numbers to various degrees of opinion, attitude and other concepts.

SCALE:
Meaning and definition:
A scale is a form of self-report and a more precise means of measuring phenomena than the questionnaire. Most scales measure psychological variables. However, scaling techniques can also be used to obtain self-reports on physiological variables such as pain, nausea or functional capacity. A scale is defined as a “procedure for the assignment of numbers (or other symbols) to a property of objects in order to impart some of the characteristics of numbers to the properties in question.”

Methods of scaling:
Scaling can be done in two ways:
• Making a judgement about some characteristic of an individual and then placing that individual on a scale.
• Constructing a questionnaire in such a way that the score of an individual's responses assigns him a place on the scale.

RATING SCALE:
Meaning with example:
An observer may be asked to judge the behaviour he observes and classify it into categories. This is essentially the task he performs when completing a schedule, but he can also be asked to give a numerical value or rating to his judgements. Rating is a term applied to the expression of opinion or judgement regarding some situation, object or character. Opinions are usually expressed on a scale of values. “Rating scale refers to a scale with a set of points which describe varying degrees of the dimension of an attribute being observed.”

Example: here we judge an object without reference to other similar objects, e.g.
i. ‘like - dislike’
ii. ‘above average - average - below average’
iii. Other classifications with more categories, such as ‘like very much - like somewhat - neutral - dislike somewhat - dislike very much’

There is no fixed rule about whether to use a two-point scale, a three-point scale or a scale with still more points. In practice, 3- to 7-point scales are generally used, for the simple reason that more points on a scale provide an opportunity for greater sensitivity of measurement.

Uses of rating scales:
Rating scales are used
• in the evaluation of individuals and their reactions
• in the psychological evaluation of stimuli
• to record quantified observations of a social situation
• to describe the behaviour of individuals, the activities of an entire group, the changes in the

According to Guilford there are five broad categories of rating scales:
1. Numerical scales
2. Graphic scales
3. Standard scales
4. Rating by cumulated points
5. Forced choice rating

Numerical scales:
In a typical numerical rating scale, a sequence of defined numbers is supplied to the rater or observer. The rater or observer assigns to each stimulus to be rated an appropriate number in line with these definitions or descriptions. One example of such a scale, used for rating the affective value of colours and odours, is as follows:
10 most pleasant imaginable
9 most pleasant
8 extremely pleasant
7 moderately pleasant
6 mildly pleasant
5 indifferent
4 mildly unpleasant
3 moderately unpleasant
2 extremely unpleasant
1 most unpleasant
0 most unpleasant imaginable

Another example is the numeric rating scale used for pain assessment.

In such scales, zero is sometimes placed at the ‘indifferent’ category, with negative numbers below it. It has been seen that observers or raters usually avoid the terminal categories. If the end categories (0 and 10) were not included, observers or raters would tend to avoid categories 1 and 9, and the range of ratings would be shortened. To avoid this shortcoming, it is suggested that the scale be expanded beyond the categories the researcher wants to include. For example, if a researcher wants an effective scale of 7 points, he may add two extra categories, one at each end, so that the desired dispersion over seven points is achieved.

In some numerical scales, the observer or rater is not given the numbers to use in making judgements. He reports in terms of descriptive ‘cues’ and the researcher then assigns numbers to them. For example, while rating performance in a drama, the cues may be: very good, good, average, poor, very poor. To these cues, the numbers 1 through 5 may be assigned by the researcher.

i. They are the simplest in terms of handling the results.
ii. They may suffer from many biases and errors.
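The assignment of numbers to descriptive cues in a numerical scale can be sketched in Python. This is a minimal illustration rather than a prescribed procedure: the direction of the mapping (very poor = 1 up to very good = 5) and the names CUE_VALUES, score_ratings and mean_rating are assumptions made for this example.

```python
# Hypothetical mapping of descriptive cues to the numbers 1 through 5.
# The direction (higher = better) is an assumption for this sketch.
CUE_VALUES = {
    "very poor": 1,
    "poor": 2,
    "average": 3,
    "good": 4,
    "very good": 5,
}


def score_ratings(cues):
    """Convert a list of descriptive cues into their numeric values."""
    return [CUE_VALUES[cue.strip().lower()] for cue in cues]


def mean_rating(cues):
    """Average numeric rating for a list of cues."""
    values = score_ratings(cues)
    return sum(values) / len(values)


if __name__ == "__main__":
    ratings = ["good", "very good", "average", "good"]
    print(score_ratings(ratings))
    print(mean_rating(ratings))
```

Keeping the cue-to-number table in one place makes it easy for the researcher to change the numbers later without touching the recorded judgements.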

Graphic scales:
The graphic scale is the most popular and most widely used type of rating scale. In this scale, a straight line is shown, vertically or horizontally, with various cues to help the rater. The line is either segmented into units or continuous. If the line is segmented, the number of parts can be varied.

Example (cues placed along the line): Talkative - Talks when necessary - Refrains from talking

• They are simple and easy to administer.
• They are interesting to the rater and need little motivation.
• They provide an opportunity for fine discrimination.
• Scoring requires somewhat greater labour with some formats of the graphic scale.

Standard scales:
In this type, a set of standards is presented to the rater. The standards are usually objects of the same kind as those to be rated, with pre-established scale values. In its best form, this type is like the scales for judging the quality of handwriting. The handwriting scales provide several standard specimens that have previously been spread over a common scale by the method of equal-appearing intervals or paired comparison. With the help of the standard specimens, a new sample of handwriting can be equated to one of the standards or judged as lying between two standards. The man-to-man scale and the portrait-matching scale are two other forms that conform more or less to the principle of the standard scale.

Rating by cumulated points:
The unique and common feature of rating scales by cumulated points is the method of scoring. The rating score for an object or individual is the sum or average of the weighted or unweighted points. The ‘check list method’ and the ‘guess who technique’ belong to this category of rating.

The check list method is applicable in evaluating the performance of personnel in a job. Hartshorne and May used this method for evaluating children with respect to character. A list of 80 trait names describing favourable and unfavourable characteristics, such as cooperative, cruel, thoughtful, humane and greedy, was prepared. Each rater checked every term in the list that he thought applied to a child. Weights of +1 and -1 were assigned to favourable and unfavourable traits respectively, and the child's score was the algebraic sum of the weights.
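The check list scoring rule (+1 for each checked favourable trait, -1 for each checked unfavourable trait, summed algebraically) can be sketched as follows. The trait sets are a small illustrative subset taken from the examples in the text, not the original 80-item list; their classification as favourable or unfavourable and the function name checklist_score are assumptions made for this sketch.

```python
# Illustrative subset of trait names; the favourable/unfavourable
# classification below is an assumption made for this example.
FAVOURABLE = {"cooperative", "thoughtful", "humane"}
UNFAVOURABLE = {"cruel", "greedy"}


def checklist_score(checked_traits):
    """Algebraic sum of weights: +1 per favourable, -1 per unfavourable trait."""
    score = 0
    for trait in checked_traits:
        if trait in FAVOURABLE:
            score += 1
        elif trait in UNFAVOURABLE:
            score -= 1
        else:
            raise ValueError(f"unknown trait: {trait}")
    return score


if __name__ == "__main__":
    # Terms one rater checked for one child.
    print(checklist_score(["cooperative", "thoughtful", "greedy"]))
```

Unknown trait names raise an error here so that a misspelt term is not silently scored as zero; a real scoring sheet might handle this differently.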

It is suggested that check list items may be in multiple-choice form rather than in true-false form. For example, while rating the performance of personnel in their work assignments, items like the following may be used:

Cooperates with others:
• Enthusiastically
• Willingly
• Indifferently
• Grudgingly
• Defiantly

His relations with the public are:
• outstanding
• creditable
• acceptable
• poor
• detrimental

The ‘guess who technique’ of rating was also developed by Hartshorne and May, for use particularly with children as raters. For this purpose, statements in the form of ‘descriptions’, such as ‘here is one who is always doing little things to make others happy’, were constructed, and each child was told to list all the classmates who fitted each description, mentioning the same child as many times as necessary. Each child scored a point for each favourable description applied to him, and the total score was the sum of all such points.

Forced choice rating:
Here the rater is asked not to say whether the ratee has a certain trait, nor how much of a trait the ratee has, but essentially to say whether he has more of one trait than another of a pair. In the construction of a forced choice rating instrument, descriptions are obtained of persons who are recognized as being at the highest and lowest extremes of the performance continuum for the particular group to be rated. The descriptions are analysed into simple behaviour qualities, stated in very short sentences or as trait names, which are called ‘elements’. These elements are used to construct items, and a discrimination value and a performance value are determined for each element. In forming an item, elements are paired. Two statements or terms with about the same high performance value are paired, one of which is valid and the other not. Two statements or terms with about equally low performance value are also paired, one being valid and the other not.

Two pairs of statements, one pair with high performance value and one with low performance value, are combined in a tetrad to form an item. An example: careless, serious-minded, energetic, snobbish.

The rater is asked to react to each tetrad as an item, saying which one of the four best fits the ratee and which one of the four is least appropriate. The tool is tried out on a sample for which there is an outside criterion, for the purpose of validating the responses. The discriminating responses are then determined and differential weights are assigned to each item.

Limitations in constructing and using rating scales:
Constant errors: ratings based on human judgement are subject to many sources of personal bias or error.
• The error of leniency: There is a constant tendency among raters to rate those whom they know well, or in whom they are ego-involved, higher than they should. Such raters are called easy raters. Some raters become aware of the failing of easy rating and consequently rate individuals lower than they should. Such raters are called hard raters. When ratings are too low, the constant error is one of negative leniency; when they are too high, positive leniency occurs. For example:

Physical health: poor - fair - good - very good - excellent

In this example only one unfavourable cue is given, and most of the range is given to degrees of favourable report. The researcher evidently anticipates a mean rating somewhere near the cue ‘good’.
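One rough way to look for the leniency error in collected ratings is to compare each rater's mean with the mean over all raters. The sketch below assumes numeric ratings of the same ratees by every rater. The data, the function name rater_offsets, and the reading of a large positive or negative offset as a hint of positive or negative leniency are all assumptions made for illustration, not a procedure given in these notes.

```python
from statistics import mean


def rater_offsets(ratings_by_rater):
    """Return each rater's mean rating minus the mean over all ratings."""
    all_ratings = [r for ratings in ratings_by_rater.values() for r in ratings]
    overall = mean(all_ratings)
    return {rater: mean(ratings) - overall
            for rater, ratings in ratings_by_rater.items()}


if __name__ == "__main__":
    # Hypothetical ratings on a 1-5 scale; each rater rated the same four ratees.
    data = {
        "rater_a": [5, 5, 4, 5],   # possibly an "easy" rater
        "rater_b": [3, 4, 3, 3],
        "rater_c": [2, 2, 1, 2],   # possibly a "hard" rater
    }
    for rater, offset in rater_offsets(data).items():
        print(rater, round(offset, 2))
```

An offset alone does not prove leniency, since a rater may simply have seen the ratees at their best; it only flags raters whose ratings deserve a closer look.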

• The error of central tendency: Most raters hesitate to rate individuals at the extremes of the scale and tend to rate them near the middle. This error is more common among raters who do not know the individuals they are rating.
• The halo effect: This error obscures the cluster of traits within an individual. The rater forms a general opinion about the person's merit, and his ratings on specific traits are greatly influenced by this general impression. It results in a spurious amount of positive correlation between the traits that are rated.
• The logical error: This is due to the fact that judges are likely to give similar ratings for traits which they feel are logically related to each other.
• The contrast error: This error is due to a tendency for a rater to rate others in the opposite direction from himself on a trait. For example, in one study the raters were asked to rate individuals on the trait “need for orderliness”. It was seen that the raters who were themselves high in orderliness tended to see others as being less orderly than they were.
• The proximity error: Like the logical error and the contrast error, this error gives rise to undue covariance among some traits. It has been seen that adjacent traits on a rating form tend to intercorrelate more highly than remote ones, their degree of actual similarity being approximately equal.

Construction of a rating scale:
i. A trait to be rated should be given a trait name and a definition.
ii. A rating scale should make use of good cues.
iii. There is no hard and fast rule concerning the number of steps or scale divisions to be used in a rating scale. In general, 5- to 7-point scales seem to serve adequately.

General advantages of rating methods:
1. They take little time.
2. They are interesting to the raters, especially if the graphic method is used.
3. They can be used by raters who have a minimum of training.
4. They can be used with large numbers of stimuli.
5. They have a much wider range of application.
6. The best ratings can be obtained by presenting one stimulus to a rater at a time.

RESEARCH REVIEW:
AHRQ (Agency for Healthcare Research and Quality) conducted a study in 1999 on rating the strength of scientific research findings. The researchers reviewed the titles and abstracts of 1,602 publications. From this set, they retained for the report 121 systems comprising rating scales, checklists, other instruments and guidance documents. Specifically, they assessed 20 systems relating to systematic reviews, 49 systems for randomized controlled trials (RCTs), 19 for observational studies, 18 for diagnostic test studies, and 40 systems for grading the strength of a body of evidence. For the purpose of final evaluation, they focused on scales and checklists.

The researchers summarized more than 100 sources of information on systems for assessing study quality and strength of evidence for systematic reviews and technology assessments. Using criteria based on key categories for these systems, they identified 19 study-quality and 7 strength-of-evidence grading systems that people conducting systematic reviews and technology assessments can use as starting points.

AHRQ not only sees this report as meeting the congressional mandate outlined earlier, but also hopes that groups or organizations producing systematic reviews and technology assessments will apply these rating scales and grading schemes in a manner that will benefit groups developing clinical practice guidelines and other health-related policy advice.

CONCLUSION:
A qualitative description of a limited number of aspects of a thing, or of the traits of a person, can easily be made with a rating scale. Rating scale procedures exceed all other psychological measurement methods in popularity and use.

