Scale development for depressive symptoms

There are five stages that a developer must follow to produce a scale that assesses depressive symptoms. First, the test developer should ask some basic preliminary questions: what the scale will measure, what its goal is, whether there is a need for it, who the sample will be, and what content area, mode of administration, format, and possible risks and benefits are involved. In our case, the goal of the developer is to create a scale that measures depressive symptoms in the largest appropriate sample possible. It is important, in this initial stage, for the developer to do the relevant pilot work. During this work, the developer determines the best way to measure depression by creating, revising, and deleting test items and by reviewing the research literature. According to Tay and Jebb (2017), the test developer should state a clear conceptualization of depressive symptoms. It is also important to specify whether depressive symptoms are best understood as a single variable (unidimensional) or as a combination of variables (multidimensional). Once the pilot work is completed, the next step is to construct the scale.

According to Tay and Jebb (2017), there are two approaches to scale development. The first is the deductive approach, in which the definition of the construct is known and is used to generate the items. The second is the inductive approach, which is used when the definition of the construct is uncertain or lacking (Tay & Jebb, 2017). In this case, depression and its symptoms are well known and well examined, so it is better to follow the deductive approach (Tay & Jebb, 2017).

The developer now begins the next step, constructing the scale. In this step, the developer needs to decide how the measuring device will be designed and how different numbers will be assigned to different levels of depressive symptoms. After that, the developer decides which scaling method is best among the Likert scale, paired comparisons, the Guttman scale, and equal-appearing intervals. The decision depends on the nature of the variable (depression), the test takers, the preferences of the developer, and the kind of data (ordinal or interval) the test developer wants. The next step in scale construction is to write the possible items that will be included in the scale. The developer has to determine the range that each item will cover, the types of items, and the number of items. It is essential that the developer writes a large number of items (preferably twice as many as the final scale will include), drawing on personal experience or academic knowledge. It would be beneficial here for the test developer to interview physicians, clinicians, patients, patients’ relatives, and others to gain insight into depressive symptoms. Moreover, according to Boateng et al. (2018), expert judges should evaluate each item to determine whether it adequately represents depressive symptoms. Those experts should be different from the people who are developing the scale (Boateng et al., 2018). Next, the developer should choose how the items will be scored: a cumulative model (the higher the score, the stronger the presence of the symptoms), class or category scoring (whether the test taker exhibits enough symptoms for a diagnosis), or ipsative scoring (comparison of a test taker’s score on one scale with that same test taker’s score on another scale within the same test). In my view, the best options for depressive symptoms are the first two.
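The cumulative and category scoring models described above can be sketched in Python. This is only a minimal illustration under stated assumptions: the five-item response set, the 0 to 3 response range, and the cutoff of 10 are hypothetical values chosen for the example and do not come from any published scale.

```python
def cumulative_score(responses):
    """Cumulative model: sum the item responses; a higher total means more symptoms."""
    return sum(responses)


def category_score(responses, cutoff):
    """Category scoring: place the test taker in a class using a cutoff on the total.

    The cutoff is a hypothetical value; a real scale would derive it from
    validation data.
    """
    if cumulative_score(responses) >= cutoff:
        return "above cutoff"
    return "below cutoff"


# Hypothetical responses to five items, each rated from 0 (never) to 3 (nearly always).
responses = [2, 3, 1, 3, 2]
total = cumulative_score(responses)
label = category_score(responses, cutoff=10)
print(total, label)
```

Under the cumulative model the total itself is interpreted, whereas category scoring only reports on which side of the cutoff the test taker falls.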

After the developer creates the initial pool of items, the scale needs to be tried out. The scale should be tried out under the same conditions in which the final version will be administered and on individuals as similar as possible to the intended sample. After the first administration of the scale to the representative sample, the developer analyzes the responses and scores of the items. In other words, the developer performs an item analysis, which may include indices of each item’s difficulty, reliability, validity, and discrimination. In this case, the two preferred indices are the item-reliability index and the item-validity index. The item-reliability index provides evidence of the internal consistency of the scale. This index is equal to the product of the item-score standard deviation (s) and the correlation (r) between the item score and the total test score. Also, Tay and Jebb (2017) propose measuring internal consistency, which should be above .70 at minimum, although a value of .90 or higher is recommended. The item-validity index provides evidence regarding the degree to which a test measures what it is supposed to measure (the higher, the better). It is equal to the product of the item-score standard deviation (s1) and the correlation between the item score and the criterion score. For a dichotomously scored item, the standard deviation can be obtained from the proportion of test takers who endorse the item (p1) as s1 = √(p1(1 − p1)). Furthermore, the developer should continue gathering validity evidence through group differences in scores and through convergent or divergent evidence obtained by comparing the scale with other related scales (Tay & Jebb, 2017). Generally, according to Boateng et al. (2018), item reduction analysis is carried out to ensure that only functional and internally consistent items are included in the final version of the scale.
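The item indices described above can be illustrated with a short Python sketch. It rests on several assumptions that are not in the sources: the response data and criterion scores are invented, population standard deviations are used throughout, and Cronbach's alpha stands in as one common internal-consistency coefficient, since the text does not name a specific one.

```python
from statistics import mean, pstdev, pvariance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))


def item_reliability_index(item, totals):
    """Item-score standard deviation times the item-total correlation."""
    return pstdev(item) * pearson_r(item, totals)


def item_validity_index(item, criterion):
    """Item-score standard deviation times the item-criterion correlation."""
    return pstdev(item) * pearson_r(item, criterion)


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item."""
    k = len(items)
    totals = [sum(person) for person in zip(*items)]
    item_var = sum(pvariance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var / pvariance(totals))


# Hypothetical data: 3 items answered by 6 respondents on a 0-3 range.
items = [
    [0, 1, 2, 3, 2, 1],
    [1, 1, 2, 3, 3, 0],
    [0, 2, 2, 3, 1, 1],
]
criterion = [5, 9, 14, 20, 15, 6]  # hypothetical scores on an external criterion
totals = [sum(person) for person in zip(*items)]

for number, item in enumerate(items, start=1):
    print(
        number,
        round(item_reliability_index(item, totals), 2),
        round(item_validity_index(item, criterion), 2),
    )
print("alpha:", round(cronbach_alpha(items), 2))
```

Items with low values on either index would be candidates for rewriting or removal during item reduction, and the alpha value can be compared against the .70 and .90 thresholds mentioned above.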

The last step is test revision. In this step, the developer decides which items will be included, eliminated, or rewritten in the depression scale. Then, the developer administers the scale under standardized conditions to a second sample. After that, the developer analyzes the data from the second administration and may decide that the scale is ready. Sometimes the developer will need to revise the scale many times. Problems may arise that need to be solved, such as divergence from the theoretical structure, poor reliability, inadequate construction, poor inter-item or item-total correlations, and malfunctioning items (Tay & Jebb, 2017). Once the developer decides that the scale for depressive symptoms is finished, the scale’s norms are developed and the scale is “standardized”. Standardization, in other words, means that uniformity and objectivity are introduced into the administration, scoring, and interpretation of the test. In addition, attention needs to be given to the sample, which needs to be representative on those variables that might affect depressive symptoms. This concludes the process of developing the scale.
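One way norms can be used to interpret a raw score is sketched below in Python. The normative sample is invented for the example, and the two conversions shown (a standard score and a percentile rank defined as the percentage of the normative sample scoring at or below the raw score) are assumptions about how norms might be expressed, not a procedure taken from the sources.

```python
from statistics import mean, pstdev


def z_score(raw, norm_scores):
    """Standard score of a raw total relative to the normative sample."""
    return (raw - mean(norm_scores)) / pstdev(norm_scores)


def percentile_rank(raw, norm_scores):
    """Percentage of the normative sample scoring at or below the raw total."""
    at_or_below = sum(1 for score in norm_scores if score <= raw)
    return 100 * at_or_below / len(norm_scores)


# Hypothetical total scores from a normative sample of ten people.
norm_scores = [4, 7, 9, 10, 12, 12, 15, 18, 21, 22]
print(round(z_score(15, norm_scores), 2), percentile_rank(15, norm_scores))
```

Conversions like these are only as meaningful as the normative sample is representative, which is why the composition of that sample matters.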

References

Boateng, G. O., Neilands, T. B., Frongillo, E. A., Melgar-Quiñonez, H. R., & Young, S. L. (2018). Best practices for developing and validating scales for health, social, and behavioral research: A primer. Frontiers in Public Health, 6. https://doi.org/10.3389/fpubh.2018.00149

Tay, L., & Jebb, A. (2017). Scale development. In S. Rogelberg (Ed.), The SAGE encyclopedia of industrial and organizational psychology (2nd ed.). Thousand Oaks, CA: Sage.
