
Psychological Medicine, 1997, 27, 191–197.

Copyright © 1997 Cambridge University Press

The validity of two versions of the GHQ in the WHO study of mental illness in general health care

D. P. GOLDBERG, R. GATER, N. SARTORIUS, T. B. USTUN, M. PICCINELLI, O. GUREJE and C. RUTTER
From the Institute of Psychiatry, London; and the Division of Mental Health of the World Health Organization, Geneva, Switzerland

ABSTRACT
Background. In recent years the 12-item General Health Questionnaire (GHQ-12) has been extensively used as a short screening instrument, producing results that are comparable to longer versions of the GHQ.
Methods. The validity of the GHQ-12 was compared with the GHQ-28 in a World Health Organization study of psychological disorders in general health care. Results are presented for 5438 patients interviewed in 15 centres using the primary care version of the Composite International Diagnostic Interview, or CIDI-PC.
Results. Results were uniformly good, with the average area under the ROC curve 0.88, range from 0.83 to 0.95. Minor variations in the criteria used for defining a case made little difference to the validity of the GHQ, and complex scoring methods offered no advantages over simpler ones. The GHQ was translated into 10 other languages for the purposes of this study, and validity coefficients were almost as high as in the original language. There was no tendency for the GHQ to work less efficiently in developing countries. Finally, gender, age and educational level are shown to have no significant effect on the validity of the GHQ.
Conclusions. If investigators wish to use a screening instrument as a case detector, the shorter GHQ is remarkably robust and works as well as the longer instrument. The latter should only be preferred if there is an interest in the scaled scores provided in addition to the total score.

Address for correspondence: Professor Sir David Goldberg, Institute of Psychiatry, De Crespigny Park, London SE5 8AF.

INTRODUCTION
The World Health Organization's study of psychological disorders in general medical settings provided an opportunity to evaluate the performance of two versions of the GHQ against a research interview that generates diagnoses using either the ICD-10 or the DSM-IV systems in 15 different centres. The fact that there was common training of research workers and that identical methodology was used at all centres increases the value of making comparisons between centres, since an important cause of variation has been greatly reduced.
This study compares the shortest GHQ, or GHQ-12, with the longer scaled GHQ-28. The brevity of the GHQ-12 makes it attractive for use in busy clinical settings, as well as in settings where some patients may be illiterate and require the questionnaire to be read out to them. Early studies disembedded this short GHQ from longer versions: it is contained within the 30-, 36- and 60-item questionnaires, but it shares only six items with the scaled GHQ, or GHQ-28. More recent studies have used the questionnaire on its own.
Since relatively little has been published on the validity of the GHQ-12, all previous studies known to the authors have been summarized in Table 1, even including two studies that did not use standardized assessments. However, results of these studies were within the range of variation of the others reported. It can be seen that validity studies have been reported in nine countries, and that the median values for sensitivity and specificity are high, although the

Table 1. Previous validity studies using the GHQ-12

Authors, date               Country; Setting                      2nd stage interview; N    GHQ-12 threshold    Sensitivity (%)  Specificity (%)

Goldberg, 1972*             England; Primary care                 CIS; N = 200              1/2                 93.5             78.5
Tennant, 1977*              Australia; Community                  PSE; N = 120              not stated          87.0             91.0
Banks, 1983*                England; Community adolescents        PSE; N = 200              2/3                 71.4             79.8
Radanovic & Eric, 1983*     Yugoslavia; Students                  CIS; N = 151              1/2                 91.3             75.8
Mari & Williams, 1985       Brazil; Primary care                  CIS; N = 260              3/4                 85.0             79.0
Shamasundar et al. 1986     India; Primary care                   IPSS; N = 144             1/2                 87.0             93.0
Bellantuono et al. 1987     Italy; Primary care                   CIS; N = 90               2/3                 75.0             71.0
Bandyopadhyay et al. 1988   India; Referrals to psychiatry        Unstandardized; N = 111   2/3                 77.4             89.7
Wilkinson & Markus, 1989*   England; Primary care                 CIS-PROQSY; N = 256       1/2                 75.0             91.0
Gureje & Obikoya, 1990      Nigeria; Primary care                 CIDI; N = 214             0/1                 67.0             74.0
Pan & Goldberg, 1990        England; Chinese primary care         CIS; N = 223              2/3                 67.5             76.6
Araya et al. 1992           Chile; Primary care                   CIS-R; N = 77             4/5                 76.0             73.0
Piccinelli et al. 1993      Italy; Primary care                   CIS; N = 94               3/4                 75.0             74.0
                                                                                            13/14 (Likert)      71.0             76.0
Abiodun, 1993               Nigeria; Primary care                 PSE; N = 272              3/4                 83.7             79.8
Abiodun, 1994               Nigeria; Medical and surgical         PSE; N = 263              2/3                 88.7             83.3
                            Nigeria; Gynaecology                  PSE; N = 233              2/3                 91.4             77.5
                            Nigeria; Antenatal                    PSE; N = 240              2/3                 83.3             81.4
                            Nigeria; Community                    PSE; N = 330              2/3                 84.1             81.2
Politi et al. 1994          Italy; Conscripts, 18-year-old males  Unstandardized; N = 363   8/9 (Likert)        68.0             59.0
van Hemert et al. 1995      Holland; MOPD, disembedded GHQ-12     PSE; N = 112              2/3                 77.0             79.0
                            Holland; MOPD, free-standing GHQ-12   PSE; N = 78               5/6                 72.0             75.0
Summary                     9 countries                           Various; 4031 interviews  Mode: 2/3           Median: 83.7     Median: 79.0

* Disembedded version of GHQ used. CIS = Clinical Interview Schedule; PSE = Present State Examination; CIDI = Composite International Diagnostic Interview; IPSS = Indian Psychiatric Survey Schedule; PROQSY = Computer administered CIS.
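The mean CIS threshold of 3.13 quoted in the Introduction can be reproduced from Table 1 under one reading of the cut-off notation. The sketch below assumes that a cut-off such as 1/2 is counted as its upper value (a score of 2 or more is positive) and that the CIS-PROQSY and CIS-R studies are counted among the eight CIS studies; neither assumption is stated in the paper, and the variable and function names are illustrative only.

```python
# Cut-offs of the eight Table 1 studies that used a CIS-type interview
# (assumed set: Goldberg 1972, Radanovic & Eric 1983, Mari & Williams 1985,
# Bellantuono 1987, Wilkinson & Markus 1989, Pan & Goldberg, Araya 1992,
# Piccinelli 1993).
cis_cutoffs = ["1/2", "1/2", "3/4", "2/3", "1/2", "2/3", "4/5", "3/4"]


def upper_value(cutoff):
    """Return the upper score of an 'a/b' cut-off (assumed reading)."""
    return int(cutoff.split("/")[1])


mean_cis_threshold = sum(upper_value(c) for c in cis_cutoffs) / len(cis_cutoffs)
print(mean_cis_threshold)  # 3.125, reported in the text as 3.13
```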

range is very wide. The various studies used a range of different second stage research interviews, and it was not clear to what extent the wide range of validity coefficients reflected variation in the clinical standard against which the GHQ-12 was being compared. This may have accounted for the range of best thresholds reported in the literature previously, shown in Table 1. However, it can be seen from Table 1 that the best threshold for the Clinical Interview Schedule (CIS: eight studies, mean threshold 3.13) is almost identical to that for the Present State Examination (PSE: six studies, mean threshold 3.17). There has also been doubt about the best method for scoring available: various studies (Gureje & Obikoya, 1990; Pan & Goldberg, 1990; Piccinelli et al. 1993) have not shown that C-GHQ scoring has much to offer, but there is doubt about the value of Likert scoring (0–1–2–3) against the more usual GHQ scoring (0–0–1–1) (Piccinelli et al. 1993; Politi et al. 1994). Cronbach's alpha is reported in several studies, and is between +0.82 and +0.86 (Sriram et al. 1989; Winefield et al. 1989; Politi et al. 1994). van Hemert and his colleagues (1995) have suggested that the threshold score for the disembedded questionnaire is much lower than that for the free-standing questionnaire, and this receives partial confirmation from the table (disembedded, seven studies, mean threshold 2.57; free standing, 12 studies, mean threshold 3.42). However, this cannot be the whole story, since there is considerable overlap between the two distributions.
The validity of the GHQ-28 has also been questioned by van Hemert and his colleagues on

the grounds that the inclusion of items from the somatic subscale, where a positive response can be produced either by physical disease or by psychological disorder, can only reduce the discriminatory ability of the questionnaire. In support of this argument, van Hemert et al. show that the validity coefficients of the somatic subscale are less than those of the other three subscales. However, merely because one scale has coefficients that are less good does not mean that the overall instrument lacks validity; indeed, satisfactory coefficients have been reported by others (Bolognini et al. 1989; Romans-Clarkson et al. 1989; Benjamin et al. 1991; Aderibighe & Gureje, 1992; Cheung & Spears, 1994) since the last review by Goldberg & Williams (1988).
In addition to providing data on the comparative validity of the GHQ in the 15 centres, the present study investigates the performance of the GHQ-12 as a case detector against the longer GHQ-28. Results are reported for various definitions of 'caseness', whether ICD-10 or DSM-IV, with or without current anxiety, and with and without alcohol dependence. The questionnaire was administered in 11 different languages, and we consider whether translation from its original language is associated with lowered validity. Since the questionnaire was designed in the developed world, we compare the performance of the questionnaire in Europe and North America with its performance in the developing world. Finally, we examine the effects of gender, age or educational level on the screening performance of the instrument.

METHOD
The study involved 15 centres round the world, in which a total of 11 languages were spoken (Ustun & Sartorius 1995). Both the GHQ and the primary care version of the Composite International Diagnostic Interview (CIDI-PC) were translated and back-translated in each of these languages. Consecutive patients attending clinics at participating centres were approached providing that they were older than 17, were not too ill to participate, were able to communicate and had a fixed address. The latter requirement was because the study used a longitudinal design.
A pilot study in each centre indicated that the GHQ score distributions were very different across centres. In each centre the scores were divided into three strata, so that the first one contained 60% of the patients, the second one 20%, and the third one the top 20%. To achieve these proportions, the following scores defined medium and high scorers in each centre: Ankara (2,4), Athens (3,5), Bangalore (3,7), Berlin (2,5), Groningen (2,5), Ibadan (2,5), Mainz (2,5), Manchester (2,4), Nagasaki (2,4), Paris (4,7), Rio de Janeiro (3,5), Santiago (5,7), Seattle (3,5), Shanghai (0,1 in one centre, 2,4 in the other) and Verona (4,6). The adoption of these varying sampling fractions meant that it was possible to complete the study in each centre by screening approximately similar numbers of patients. The GHQs were printed with a computer generated code indicating the score which a patient should exceed if they were to be selected for interview: in this way a complex stratified sampling method was used smoothly across the 15 centres.
The CIDI-PC can generate diagnoses using either the International Classification of Diseases, 10th Edition (ICD-10) system (WHO, 1993), or the Diagnostic and Statistical Manual of the American Psychiatric Association, 4th edition (DSM-IV) system (American Psychiatric Association, 1984). For the present purposes, 'lifetime' diagnoses were ignored, and only current mental status was considered. The following diagnoses were included: current depression, dysthymia, agoraphobia, panic disorder, generalized anxiety disorder, somatization disorder, neurasthenia (chronic fatigue) and hypochondriasis. We examined the effects of including current anxiety symptoms (not requiring a 6-month duration) and alcohol dependence, but harmful use of alcohol was not included in the present analysis.
Sample size was determined in order to provide adequate statistical power for comparisons both between and within centres. It was projected that 1500 patients needed to be screened in each centre in order to detect 60 current cases of depression, and to have adequate numbers to allow a centre to compare the course of depression relative to other disorders that were common at that centre. If selected for interview, patients were usually seen within a day or so of the GHQ being completed, although at Verona the time was within 2 weeks. At the time of the interview each subject completed a 34-item version of the GHQ, containing both

the GHQ-12 and the GHQ-28. The present study has used results from this GHQ to compute validity measures.
The research worker administering the CIDI-PC was blind to the results of this questionnaire. Selection of centres was dependent upon the existence of experienced investigators, and ability to raise funds for the study in the developed countries. The detailed methodology is described elsewhere (Ustun & Sartorius 1995).

RESULTS
In the first stage of the study 25 916 patients (96.5% of those approached) completed the GHQ-12. Co-operation with the second stage interview varied between centres, but was found to be 64.9% across centres. A total of 5438 patients completed the second stage interview. Before validity coefficients were computed, results were weighted back to the original populations of patients who had been screened. Receiver Operating Characteristic (ROC) curves were calculated using the ROCFIT programme (Metz et al. 1984). Best thresholds were calculated using the optimal trade-off between sensitivity and specificity: where two thresholds gave similar trade-offs, that with higher sensitivity was preferred.

Validity coefficients for the GHQ-12
These were generally high, with the mean area under the ROC curves being 0.88, with a fairly narrow range: Berlin and Mainz at the lower end with 0.83, and Manchester and Bangalore at the top with 0.95 and 0.93 respectively (see Table 2). The overall sensitivity was 83.4%, and the specificity 76.3%. At each centre, optimal figures were achieved with rather different thresholds: many centres with 1/2, but Manchester high with 3/4, and Bangalore with a surprising 6/7. Most of the patients at the Bangalore centre were illiterate, and had the questionnaire read out to them. The mean prevalence of ICD-10 diagnoses across the 15 centres was 24%, and data from each centre were standardized to this prevalence in order to calculate the positive predictive values (PPVs) and allow comparison between centres.

Validity coefficients for the GHQ-28
These were also found to be high, but no higher than those for the GHQ-12 (see Table 3). The positive predictive values were generally slightly better, and there was less variation in the best thresholds between centres.

Altering the gold standard
Values for DSM-IV diagnoses are comparable to, but slightly lower than, those for ICD-10 diagnoses. We also tried the effects of adding 'current anxiety' to the ICD-10 diagnoses, but found that it made little difference to the validity coefficients: the reduced sensitivity was compensated for by the improved specificity, and the area under the curve stayed the same. Adding alcohol dependence to the list of ICD-10

Table 2. Validity coefficients for the GHQ-12 in 15 centres

Centre            Threshold   Sensitivity (%)   Specificity (%)   PPV at 24% prevalence (%)   GHQ-12 ROC

Ankara            1/2         70.6              82.3              55.7                        0.85
Athens            2/3         80.6              84.7              62.4                        0.93
Bangalore         6/7         86.7              88.9              71.2                        0.94
Berlin            2/3         72.6              75.0              47.8                        0.83
Groningen         2/3         80.3              86.4              65.1                        0.90
Ibadan            1/2         77.8              79.4              54.4                        0.89
Mainz             2/3         73.5              81.2              55.2                        0.83
Manchester        3/4         84.6              89.3              71.4                        0.95
Nagasaki          1/2         76.2              85.9              63.1                        0.86
Paris             1/2         78.2              79.4              54.3                        0.85
Rio de Janeiro    1/2         70.2              77.3              49.4                        0.84
Santiago          2/3         84.8              82.2              60.0                        0.89
Seattle           1/2         82.1              76.5              52.4                        0.86
Shanghai          1/2         86.1              79.9              57.5                        0.90
Verona            1/2         75.8              65.3              40.6                        0.85
All centres       2/3         76.3              83.4              52.9                        0.88
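The text states that positive predictive values were standardized to the mean prevalence of 24%. A minimal sketch of that calculation follows, assuming the standard Bayes formula PPV = sens × prev / (sens × prev + (1 − spec) × (1 − prev)); the function name `standardized_ppv` is illustrative and not taken from the paper. Under this assumption the Ankara and Manchester rows of Table 2 are reproduced.

```python
def standardized_ppv(sensitivity, specificity, prevalence=0.24):
    """Positive predictive value (as a proportion) at a fixed prevalence.

    All arguments are proportions between 0 and 1. The default prevalence
    of 0.24 is the mean ICD-10 prevalence reported across the 15 centres.
    """
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Ankara: sensitivity 70.6%, specificity 82.3%
print(f"{100 * standardized_ppv(0.706, 0.823):.1f}")  # prints 55.7
# Manchester: sensitivity 84.6%, specificity 89.3%
print(f"{100 * standardized_ppv(0.846, 0.893):.1f}")  # prints 71.4
```

Because the prevalence is held fixed, differences in PPV between centres in this calculation reflect only differences in sensitivity and specificity.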

Table 3. Validity coefficients for the GHQ-28 in 15 centres

Centre            Threshold   Sensitivity (%)   Specificity (%)   PPV at 24% prevalence (%)   GHQ-28 ROC

Ankara            3/4         74.6              77.1              50.7                        0.86
Athens            5/6         89.5              82.8              62.2                        0.92
Bangalore         8/9         93.4              85.0              66.4                        0.94
Berlin            5/6         81.9              72.9              48.8                        0.84
Groningen         5/6         84.9              81.9              59.8                        0.93
Ibadan            4/5         80.8              75.6              51.2                        0.87
Mainz             5/6         80.7              72.9              48.5                        0.84
Manchester        6/7         84.4              86.2              65.8                        0.93
Nagasaki          3/4         76.7              77.6              51.9                        0.85
Paris             3/4         79.3              74.9              49.9                        0.82
Rio de Janeiro    3/4         82.0              71.8              47.9                        0.86
Santiago          6/7         89.0              85.8              66.4                        0.94
Seattle           3/4         80.5              74.8              50.2                        0.84
Shanghai          7/8         84.6              85.5              64.8                        0.92
Verona            5/6         70.8              72.9              45.2                        0.84
All centres       5/6         79.7              79.2              54.7                        0.88
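The best thresholds in Tables 2 and 3 were chosen by the optimal trade-off between sensitivity and specificity, preferring the higher sensitivity where two thresholds gave similar trade-offs. The paper does not define the trade-off or what counts as similar. The sketch below assumes Youden's index (sensitivity + specificity − 1) as the trade-off and a tolerance of 0.01 for similarity; both are assumptions, the function name is hypothetical, and the candidate values are invented for illustration rather than taken from the study.

```python
def best_threshold(candidates, tolerance=0.01):
    """Pick a cut-off from (label, sensitivity, specificity) tuples.

    Assumed rule: maximise Youden's index; among candidates whose index is
    within `tolerance` of the maximum, prefer the highest sensitivity.
    """
    def youden(candidate):
        _, sensitivity, specificity = candidate
        return sensitivity + specificity - 1.0

    top = max(youden(c) for c in candidates)
    near_best = [c for c in candidates if top - youden(c) <= tolerance]
    return max(near_best, key=lambda c: c[1])


# Invented example values, not study data.
candidates = [
    ("1/2", 0.90, 0.70),
    ("2/3", 0.80, 0.80),
    ("3/4", 0.65, 0.88),
]
print(best_threshold(candidates)[0])  # prints 1/2
```

In this example the first two cut-offs have the same index, so the tie-break on sensitivity selects 1/2.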

Table 4. Effects of varying the scoring system on the validity coefficients of the GHQ in 15 centres: GHQ scoring assigns weights of 0, 0, 1, 1; Likert weights of 0, 1, 2, 3; and C-GHQ assigns weights of 0, 1, 1, 1 to items indicating illness, and 0, 0, 1, 1 for items indicating health

GHQ       Scoring method   Area under curve   Best threshold   Sensitivity (%)   Specificity (%)

GHQ-12    GHQ-scoring      0.88               1/2              83.5              75.1
          C-GHQ            0.86               4/5              78.9              74.5
          Likert           0.85               11/12            78.9              77.4
GHQ-28    GHQ-scoring      0.87               5/6              79.2              79.6
          C-GHQ            0.85               12/13            77.2              77.2
          Likert           0.87               23/24            79.8              78.5
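The three scoring systems compared in Table 4 differ only in the weights given to the four response options of each item. A minimal sketch follows, assuming each response is coded 0 to 3 in the order printed on the questionnaire. Which items count as illness items is passed in by the caller: the index set used in the example is purely hypothetical and is not the published item classification, and the function name `score_ghq` is illustrative.

```python
GHQ_WEIGHTS = (0, 0, 1, 1)           # usual GHQ scoring
LIKERT_WEIGHTS = (0, 1, 2, 3)        # Likert scoring
CGHQ_ILLNESS_WEIGHTS = (0, 1, 1, 1)  # C-GHQ weights for items indicating illness


def score_ghq(responses, method="ghq", illness_items=frozenset()):
    """Total score for a list of responses, each coded 0..3.

    `illness_items` holds the zero-based indices of items indicating
    illness; it only matters for method="cghq".
    """
    total = 0
    for index, response in enumerate(responses):
        if method == "ghq":
            weights = GHQ_WEIGHTS
        elif method == "likert":
            weights = LIKERT_WEIGHTS
        elif method == "cghq":
            # Health items keep the usual GHQ weights under C-GHQ scoring.
            weights = CGHQ_ILLNESS_WEIGHTS if index in illness_items else GHQ_WEIGHTS
        else:
            raise ValueError(f"unknown scoring method: {method}")
        total += weights[response]
    return total


responses = [0, 1, 2, 3, 1, 1, 0, 0, 2, 1, 0, 1]  # invented 12-item example
hypothetical_illness_items = {1, 4, 5}            # illustration only
print(score_ghq(responses, "ghq"))                               # prints 3
print(score_ghq(responses, "likert"))                            # prints 12
print(score_ghq(responses, "cghq", hypothetical_illness_items))  # prints 6
```

The same response pattern gives totals on different scales under each method, which is why the best thresholds in Table 4 differ so widely between scoring systems.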

Table 5. Effects of varying the diagnostic system on the validity coefficients of the GHQ in 15 centres

Gold standard    GHQ version   Area under curve   Best threshold   Sensitivity (%)   Specificity (%)

ICD-10 system    GHQ-12        0.88               1/2              83.5              75.1
                 GHQ-28        0.88               5/6              79.2              79.6
DSM-IV system    GHQ-12        0.87               1/2              81.6              76.4
                 GHQ-28        0.87               5/6              76.7              81.0

diagnoses reduced the coefficients (see Tables 4 and 5).

Different scoring methods
The questionnaires were scored by the usual method (0–0–1–1), as well as Likert (0–1–2–3) and the C-GHQ method, in which items that measure health are scored as usual, but items reflecting illness are scored 0–1–1–1. For the GHQ-12, the Likert method was worse than the GHQ method, but it was about the same for the GHQ-28. The C-GHQ was slightly less good than the GHQ method for both questionnaires (see Tables 4 and 5).

Effect of translation from English
For this analysis, the pooled results for Manchester and Seattle were compared with results for the 13 other centres. The area under the ROC curve is slightly less for the GHQ-12

given in English, but about the same for the GHQ-28. The differences are not significant.

Developed versus developing country
For this analysis, the nine centres in developed countries were compared with Ibadan and Bangalore as representatives of developing countries. Four other countries formed an intermediate group. There is no tendency for countries in the developing world to have lower coefficients.

Effects of gender, age and educational level
If the GHQ is to be used as a screening instrument in surveys, it is essential that its validity characteristics are not influenced by gender, age of subject, or educational level, as this would bias the results. ROC curves were computed by gender, by age group, and by educational level, and were not found to be significantly different from one another. Where there were more than two groups (i.e. age and educational level) pairwise comparisons were made between each pair: none were significant.

DISCUSSION
There are a number of possible explanations for the differences in validity coefficients between centres besides local cultural differences in the expression of distress. These include differing degrees of defensiveness, effects of translation into a particular language, differing degrees of expertise in administering the second stage interview, and varying degrees of collaboration with the survey. It is not possible to determine which of these was responsible for the differences reported, although the differences are probably best accounted for in terms of differences in the preparedness of patients to report symptoms on a pencil and paper test.
Mari & Williams (1986) found that those with more than four years of education had better ROC curves than those with less education than this: the poorly educated were more likely to give positive replies to the questionnaire that were not confirmed by interview, while Hobbs and his colleagues (1984) showed that the GHQ worked better with women (sensitivity 88%, specificity 83%) than with men (sensitivity 72%, specificity 87%). Neither of these results was confirmed on this relatively large data-set.
However, the difference between the best and the worst centre is not that great. It would appear that if a researcher wishes to use the GHQ as a case detector, the shorter questionnaire, scored in the simplest manner, is as good as longer versions and more involved methods. The Likert scoring method will produce a wider and smoother score distribution if a researcher wishes to assess severity, and the C-GHQ method is more normally distributed than the GHQ scoring method. Finally, the GHQ-28 should only be used if a researcher wishes to have scaled scores in addition to a total score. The similarity between the validity coefficients for the two diagnostic systems probably reflects the fact that they are fast converging upon each other. The GHQ works as well in the developing world as the developed world and loses only a small amount by translation into other languages.

REFERENCES
Abiodun, O. (1993). A study of mental morbidity among primary care patients in Nigeria. Comprehensive Psychiatry 34, 10–13.
Abiodun, O. (1994). A validity study of the Hospital Anxiety and Depression Scale in general hospital units and a community sample in Nigeria. British Journal of Psychiatry 165, 669–672.
Aderibighe, Y. & Gureje, O. (1992). The validity of the 28-item general health questionnaire in a Nigerian ante-natal clinic. Social Psychiatry and Psychiatric Epidemiology 27, 280–283.
American Psychiatric Association (1984). Diagnostic and Statistical Manual of Mental Disorders (DSM-IV). American Psychiatric Association: Washington, DC.
Araya, R., Wynn, R. & Lewis, G. (1992). Comparison of two self-administered psychiatric questionnaires (GHQ-12 and SRQ-20) in Chile. Social Psychiatry and Psychiatric Epidemiology 27, 168–173.
Bandyopadhyay, G., Sinha, S., Sen, B. & Sen, G. (1988). Validity of General Health Questionnaire (GHQ-36/GHQ-12) in the psychiatric OPD of a general hospital – a pilot study. International Journal of Social Psychiatry 34, 130–134.
Banks, M. (1983). Validity of the General Health Questionnaire in a young community sample. Psychological Medicine 13, 349–353.
Bellantuono, C., Fiorio, R., Zanotonelli, R. & Tansella, M. (1987). Psychiatric screening in general practice in Italy: a validity study of the GHQ. Social Psychiatry 22, 113–117.
Benjamin, S., Lennon, S. & Gardner, G. (1991). The validity of the GHQ for 1st stage screening for mental illness in pain clinic patients. Pain 47, 197–202.
Bolognini, M., Bettschart, W., Zehnder-Gubler, M. & Rossier, L. (1989). The validity of the French version of the GHQ-28 and PSYDIS in a community sample of 20 year olds in Switzerland. European Archives of Psychiatry and Neurological Sciences 238, 161–168.
Cheung, P. & Spears, G. (1994). The reliability and validity of the Cambodian version of the 28-item GHQ. Social Psychiatry and Psychiatric Epidemiology 25, 276–280.
Goldberg, D. P. (1972). The Detection of Psychiatric Illness by Questionnaire. Maudsley Monograph No. 21. Oxford University Press: Oxford.
Goldberg, D. & Williams, P. (1988). A User's Guide to the GHQ. NFER-Nelson: Windsor.

Gureje, O. & Obikoya, B. (1990). The GHQ as a screening tool in a primary care setting. Social Psychiatry and Psychiatric Epidemiology 25, 276–280.
Hobbs, P. R., Ballinger, C. B., Greenwood, C., Martin, B. & McClure, A. (1984). Factor analysis and validation of the GHQ in men: a general practice survey. British Journal of Psychiatry 144, 270–275.
Mari, J. & Williams, P. (1985). A comparison of the validity of two psychiatric screening questionnaires (GHQ-12 and SRQ-20) in Brazil, using Relative Operating Characteristic (ROC) analysis. Psychological Medicine 15, 651–659.
Mari, J. & Williams, P. (1986). Misclassification by screening questionnaires. Journal of Chronic Diseases 39, 371–378.
Metz, C. E., Wang, P. L. & Kronman, H. B. (1978). 'ROCFIT'. Dept of Radiology and the Franklin Maclean Memorial Research Institute, University of Chicago: Chicago.
Pan, P. & Goldberg, D. (1990). A comparison of the validity of GHQ-12 and CHQ-12 in Chinese primary care patients in Manchester. Psychological Medicine 20, 931–940.
Piccinelli, M., Bisoffi, G., Bon, M., Cunico, L. & Tansella, M. (1993). Validity and test–retest reliability of the Italian version of the 12-item General Health Questionnaire in general practice: a comparison between three scoring methods. Comprehensive Psychiatry 34, 198–205.
Politi, P., Piccinelli, M. & Wilkinson, G. (1994). Reliability, validity and factor structure of the 12-item General Health Questionnaire among young males in Italy. Acta Psychiatrica Scandinavica 90, 432–437.
Radanovic, Z. & Eric, L. (1983). Validity of the General Health Questionnaire in a Yugoslav student population. Psychological Medicine 13, 205–207.
Romans-Clarkson, S. E., Walton, V., Herbison, G. & Mullen, P. E. (1989). The validity of the GHQ-28 in New Zealand Women. Australia and New Zealand Journal of Psychiatry 23, 187–196.
Shamasundar, C., Murthy, S., Praksh, O., Prabhakar, N. & Krishma, D. K. S. (1986). Psychiatric morbidity in a general practice in an Indian City. British Medical Journal 292, 1713–1715.
Sriram, T., Chandrashekar, C., Isaac, M. & Shanmugham, V. (1989). The General Health Questionnaire: comparison of an English version and a translated Indian version. Social Psychiatry and Psychiatric Epidemiology 24, 317–320.
Tennant, C. (1977). The General Health Questionnaire: a valid index of psychological impairment in Australian populations. Medical Journal of Australia 12, 392–394.
van Hemert, A., Heijer, M., Vorstenbosch, M. & Bolk, J. (1995). Detecting psychiatric disorders in medical practice using the General Health Questionnaire. Why do cut-off scores vary? Psychological Medicine 25, 165–170.
Ustun, T. B. & Sartorius, N. (1995). Mental Illness in General Health Care. John Wiley: Chichester.
Wilkinson, G. & Markus, A. C. (1989). Validation of a computerized assessment (PROQSY) of minor psychological morbidity by ROC analysis using a single GP's assessments as criterion measures. Psychological Medicine 19, 225–232.
Winefield, H., Goldney, R., Winefield, A. & Tiggemann, M. (1989). The General Health Questionnaire: reliability and validity for Australian youth. Australia and New Zealand Journal of Psychiatry 23, 53–58.
World Health Organization (1993). The ICD-10 Classification of Mental and Behavioural Disorders. Diagnostic Criteria for Research. WHO: Geneva.
