Testandassesfinal

Scale development for depressive symptoms
A developer must follow five stages in order to produce a scale that assesses
depressive symptoms. First, the test developer should ask some basic preliminary
questions: what the scale will measure, what its goal is, whether there is a need for it,
and who the sample will be, as well as questions about the content area covered, the
mode of administration, the format, and the possible risks and benefits. In our case,
the goal of the developer is to create a scale that measures depressive symptoms
in the largest appropriate sample. It is important, in this initial stage, for the
developer to carry out the relevant pilot work. During this work, the developer determines
the best way to measure depression by reviewing the research literature and by creating,
revising, and deleting test items. According to Tay and Jebb (2017), the test developer
should state a clear conceptualization of depressive symptoms. It is also important to
specify whether depressive symptoms are best understood as one variable (unidimensional)
or as a combination of variables (multidimensional). Once the pilot work is completed,
the next step is to construct the scale.
According to Tay and Jebb (2017), there are two approaches to scale development.
The first is the deductive approach, in which the definition of the construct is known and
is used to generate the items. The second is the inductive approach, which applies when the
definition of the construct is uncertain or lacking (Tay & Jebb, 2017). In this case,
depression and its symptoms are well known and well studied, so the deductive approach is
the better choice (Tay & Jebb, 2017).
The developer now moves on to constructing the scale. In this step, the developer
needs to decide how the measuring device will be designed and how numbers will be
assigned to different levels of depressive symptoms. After that, the developer chooses
the most suitable scaling method from among the Likert scale, paired comparisons, the
Guttman scale, and equal-appearing intervals. The decision depends on the nature of the
variable (depression), the test takers, the preferences of the developer, and the kind
of data (ordinal or interval) the test developer wants. The next step in scale
construction is to write the candidate items for the scale. The developer
will have to determine the range that each item will cover, the types of items,
and the number of items. It is essential that the developer write a large number of
items (preferably twice as many as the final scale will include), drawing on personal
experience or familiarity with the academic literature. It would also be beneficial for the
test developer to interview physicians, clinicians, patients, patients’ relatives, and
others to gain insight into depressive symptoms. Moreover, according to Boateng et al.
(2018), expert judges should evaluate each item to determine whether it represents
depressive symptoms adequately. These experts should be different from those who are
developing the scale (Boateng et al., 2018). Next, the developer should choose how the items will be scored:
the cumulative model (the higher the score, the greater the presence of the symptoms),
class/category scoring (whether symptoms are exhibited sufficiently for a diagnosis), or
ipsative scoring (comparison of a test taker’s score on one scale within the test with the
same test taker’s score on another scale within that test). In my view, the best options
for depressive symptoms are the first two.
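The first two scoring models can be illustrated with a minimal Python sketch. The function names, the 0 to 3 response format, and both cutoffs (a rating of 2 or higher counts as an endorsed symptom, and five endorsed symptoms place the test taker in the symptomatic category) are assumptions chosen for illustration only. They are not taken from Tay and Jebb (2017), Boateng et al. (2018), or any published depression scale.

```python
from typing import Sequence


def cumulative_score(responses: Sequence[int]) -> int:
    """Cumulative model: sum the item ratings; a higher total means more symptoms."""
    return sum(responses)


def category_score(
    responses: Sequence[int],
    item_threshold: int = 2,  # hypothetical: rating at which an item counts as endorsed
    min_items: int = 5,       # hypothetical: endorsed items needed for the category
) -> bool:
    """Class/category scoring: True when enough symptoms are endorsed."""
    endorsed = sum(1 for rating in responses if rating >= item_threshold)
    return endorsed >= min_items


# Hypothetical answers of one test taker to nine items rated 0 to 3.
responses = [0, 1, 2, 3, 2, 2, 0, 3, 1]
print(cumulative_score(responses))
print(category_score(responses))
```

The cumulative score keeps the full range of severity, whereas category scoring reduces the same answers to a yes/no classification, so a developer may report both.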
After creating the initial pool of items, the developer needs to try out the scale.
The scale should be tried out under the same conditions in which the final version will be
administered and on individuals as similar as possible to the target sample. After the first
administration of the scale to this representative sample, the developer analyzes the
responses and item scores. In other words, the developer conducts an item analysis, which may
include indices of each item’s difficulty, reliability, validity, and discrimination. In this case, two
preferred indices are the item-reliability index and the item-validity index. The
item-reliability index provides evidence of the internal consistency of the scale. This
index is equal to the product of the item-score standard deviation (s) and the correlation (r)
between the item score and the total test score. Tay and Jebb (2017) also propose measuring
the internal consistency of the scale, which should be above .70 at a minimum, although a
value of .90 or higher is recommended. The item
validity index provides evidence regarding the degree to which an item measures what the
test is supposed to measure (the higher, the better). It is the product of the item’s
standard deviation (s1) and the correlation between the item score and the criterion score.
For an item scored dichotomously, s1 can be obtained from the proportion of test takers
endorsing the item (p1) with the formula s1 = √[p1(1 − p1)]. Furthermore, the
developer should continue gathering validity evidence, for example through group differences
in scores and through convergent or divergent evidence obtained by comparing the scale with
other related scales (Tay & Jebb, 2017). In general, according to Boateng et al. (2018), item
reduction analysis is carried out to ensure that only functional and internally consistent
items are included in the final version of the scale.
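As a rough illustration of the indices described above, the following Python sketch computes the item-reliability index (item standard deviation times the item-total correlation) and the item-validity index (item standard deviation times the item-criterion correlation). It also computes Cronbach's alpha, which is assumed here as the internal consistency coefficient; the text above does not name a specific coefficient. All function names and the small data set are hypothetical, and the item-total correlation is left uncorrected (the item is included in the total), as in the description above.

```python
import math
from statistics import mean, pstdev, pvariance


def pearson_r(x, y):
    """Pearson correlation of two equally long score lists (undefined if one is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def item_reliability_index(item, total):
    """Item standard deviation times the item-total correlation."""
    return pstdev(item) * pearson_r(item, total)


def item_validity_index(item, criterion):
    """Item standard deviation times the item-criterion correlation."""
    return pstdev(item) * pearson_r(item, criterion)


def dichotomous_item_sd(p):
    """Standard deviation of a 0/1 item endorsed by proportion p."""
    return math.sqrt(p * (1 - p))


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item, same respondents."""
    k = len(items)
    totals = [sum(person) for person in zip(*items)]
    sum_item_var = sum(pvariance(scores) for scores in items)
    return k / (k - 1) * (1 - sum_item_var / pvariance(totals))


# Hypothetical tryout data: three items, five respondents, ratings from 0 to 3.
items = [
    [0, 1, 2, 3, 3],
    [1, 1, 2, 2, 3],
    [0, 2, 1, 3, 2],
]
criterion = [2, 4, 5, 8, 9]  # hypothetical clinician ratings of the same respondents
totals = [sum(person) for person in zip(*items)]

for number, scores in enumerate(items, start=1):
    print(number, item_reliability_index(scores, totals), item_validity_index(scores, criterion))
print("alpha:", cronbach_alpha(items))
```

Items with a low reliability or validity index relative to the others would be candidates for rewriting or removal during item reduction, and the alpha value can be compared with the thresholds mentioned above.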
The last step is test revision. In this step, the developer decides which items will be
retained, eliminated, or rewritten in the depression scale. The developer then administers
the revised scale under standardized conditions to a second sample, analyzes the data from
this second administration, and may decide that the scale is ready.
Sometimes the developer will need to revise the scale several times.
Problems may arise that need to be solved, such as divergence from the theoretical
structure, poor reliability, inadequate construction, poor item-total correlations, and
malfunctioning items (Tay & Jebb, 2017). Once the developer decides that the scale for
depressive symptoms is finished, the scale’s norms will be developed, and the scale will be
“standardized”. Standardization means that uniformity and objectivity are introduced into
the administration, scoring, and interpretation of the test. In addition, attention needs
to be given to the standardization sample, which must be representative on those variables
that might affect depressive symptoms. This concludes the process of developing the scale.
References
Boateng, G. O., Neilands, T. B., Frongillo, E. A., Melgar-Quiñonez, H. R., & Young, S. L. (2018).
Best Practices for Developing and Validating Scales for Health, Social, and Behavioral
Research: A Primer. Frontiers in Public Health, 6.
https://doi.org/10.3389/fpubh.2018.00149
Tay, L., & Jebb, A. (2017). Scale Development. In S. Rogelberg (Ed.), The SAGE Encyclopedia of
Industrial and Organizational Psychology (2nd ed.). Thousand Oaks, CA: Sage.
