
Expanding the Framework for Mental Health Program Evaluation¹


Peter D. Fox, PhD, and John M. Kuldau, MD, Stanford, Calif

Submitted for publication Feb 20, 1968.

From the Department of Psychiatry, Stanford University School of Medicine, and the Stanford University Graduate School of Business, Stanford, Calif (Dr. Fox), and Stanford University School of Medicine and Palo Alto Veterans Administration Hospital (Dr. Kuldau). Dr. Fox is currently at the Stanford Research Institute, Menlo Park, Calif.

Reprint requests to the Department of Psychiatry, Stanford University School of Medicine, Stanford, Calif 94305 (Dr. Kuldau).

WHILE there is an extensive literature on the evaluation of mental health programs, the state of the art in mental health program evaluation is unsettled, given the various competing criteria and methodologies that have been proposed. In spite of the long history of psychiatry, psychiatric programs remain among the most difficult to evaluate in the field of medicine.

A comprehensive view of a psychiatric program must encompass three related activities that are usually performed by groups of workers who are separated by training, tradition, and organizational barriers. These activities are (1) administrative and financial, involving the assignment of resources, eg, mental health workers, to a program; (2) clinical treatment; and (3) assessment of treatment outcome. In spite of the strong interrelationships, the three factors are usually treated as separate by the social system in which they are imbedded.

This paper offers new approaches to evaluating mental health programs by broadening the framework of evaluation through the consideration of all three activities. The first section discusses the need for a more systematic approach to program evaluation. Next, some current assessment practices are presented, followed by a discussion of the relevance of economic criteria to psychiatric program evaluation. The basic concepts of cost-effectiveness analysis and its relevance to program evaluation are then described. Finally, some issues and challenges that frequently arise in evaluating psychiatric programs are discussed. The paper refers primarily to psychiatric programs that receive support from the community or the government, although much of the paper is relevant to psychiatric programs in general, independent of the sources of funding.

Need for a More Systematic Approach

There are at least three reasons to strive for a more systematic and rigorous approach to the evaluation of psychiatric programs. The first is the necessity of choice. Professionals in the mental health field, be they psychiatrists, social workers, psychologists, or other personnel, are faced with the problem of selecting among various programs or modes of treatment designed to assist the patient. This choice is forced because it is not possible to undertake simultaneously all forms of treatment and rehabilitation. Yet there is considerable disagreement regarding the most desirable approaches to treatment. Meaningful program evaluation facilitates good decision making by providing the information required to compare alternative programs.

Second, evaluation is an integral part of good clinical practice. Good clinical practice demands that the practitioner define the problems of each individual patient as accurately as possible, describe the course of action to be pursued, predict the effects of an intervention, and finally, receive information of the effects of this intervention. For

Downloaded From: by a Tulane University User on 01/11/2019


example, a surgeon who performs an appendectomy formally states his diagnosis of appendicitis in the hospital records, operates, and gives required postoperative care. Shortly after the operation he, as well as others in the hospital, receives feedback on the actual condition of the removed appendix from a pathology laboratory. Parallel feedbacks do not occur with most psychiatric programs.

Evaluating the effect of psychiatric intervention is clearly more difficult than evaluating the outcome of routine surgical procedures, for several reasons. The state of the mental patient, both before and after treatment, is not easily describable in measurable terms. The outcome of psychiatric intervention may be unclear until many months after treatment is terminated. Even so, the ability to predict the effects of treatment is as important for the psychiatrist as it is for the surgeon. Ideally, the psychiatrist should give his patient estimates prior to undertaking treatment, perhaps in probabilistic terms, of what the patient can expect of his therapy as well as the associated time and cost.

Third, mental health programs obtain much of their financial support from community sources, typically government agencies, that are increasingly demanding more justification of funding requests. As one measure of the extent of community funding, recent government statistics show that approximately 23,000 psychiatric patients are in private mental hospitals and general hospitals, as compared with 603,000 in state and county hospitals, and another 60,000 in Veterans Administration (VA) hospitals.²,³ Hence the vast majority of psychiatric beds are government-funded. Simply stated, the government agencies involved want to know, and have a right to know, what they are receiving for the money they spend.

The increased pressures for program justification have been stimulated in recent years by the program evaluation methodology that Secretary Robert S. McNamara and his staff have introduced into the US Department of Defense since 1961. In particular, we can cite the success of new accounting systems designed to yield better information on program costs, and the presidential directive that such systems be implemented in most other federal agencies including those dealing with health. These accounting systems, referred to as planning-programing-budgeting (PPB) systems, are designed to yield better information on the costs of individual programs, where these programs may cross agency lines.

Major efforts are also under way to implement similar accounting systems in state and local governments. In addition, the Federal Government has increasingly required that cost-effectiveness, or cost-benefit, studies be conducted to justify expenditures. Central to this approach is the philosophy that the costs and benefits of a program are related and hence cannot be meaningfully analyzed in isolation. While these innovations have met resistance, their success is widely recognized and can be expected to influence patterns of mental health budgeting.

Note on Current Assessment Practices

Assessment of psychiatric programs should include (1) characterization of a population of patients; (2) careful description of a program, including associated costs, designed to change them; and (3) outcome or effectiveness measures consisting of a catalogue of changes that have occurred after a given time has elapsed. The changes that have occurred may then be evaluated as good, bad, or indifferent according to the objectives of the program. The process of determining outcome measures should force the evaluation team to reflect upon and communicate program objectives in operational terms.

Outcome assessment is usually a serious weak point in psychiatric evaluation. In their book on psychiatric rehabilitation Kandel and Williams⁴ summarized outcome criteria used in research as falling into three classes: (1) changes in psychiatric traits and status; (2) hospital adjustment; and (3) community adjustment, as indicated by such criteria as level of social adjustment, living arrangements, community activities, vocational adjustment, supervision needed, readmission rates, and time out of the hospital. Other categorizations of outcome assessment have appeared in the literature. The operational definition of these diverse criteria remains fluid and an active area of research.



We argue that economic factors should be added to the list of relevant criteria. Although economic factors are important in program selection, rarely have they been explicitly evaluated along with clinical considerations. Indeed clinical and economic factors frequently have been treated as independent of each other even though the availability and proper use of economic resources (eg, personnel, physical plant, etc) are determinants of clinical success.

Relevance of Economic Criteria for Evaluating Mental Health Programs

The economic effects of alternative mental health programs need to be analyzed in conjunction with traditional clinical outcome measures if these programs either use or affect economic resources. Three forms of economic effect are identified here.

The first effect is the obvious one of the cost of conducting various programs. Conley et al⁵ estimate that $4 billion is spent annually on treating or preventing mental illness, including both inpatient and outpatient care. Government expenditures account for two thirds of this $4 billion.⁶ Hence the program cost, whether or not it is borne by the patient, should be one criterion for evaluating the program. Indeed if we consider the situation where a fixed amount of money has been allocated to mental health programs, a high per-patient cost implies that fewer persons can undergo treatment.

The second economic effect is that of the change in earning power of the patient. This change is a measure of the increase in the patient's economic productivity. Many psychiatric programs, along with other health programs, should benefit the patient as an earner and this benefit should be estimated. A sick person is rarely as productive as he would be if he were healthy. Consequently, he depletes his own financial resources or depends on support from society at large. In this regard, the annual loss in marketable output is estimated at $14.3 billion.⁵ If one of the objectives of a mental health program is to permit the patient to be an economically independent and self-sustaining member of society, then one should view as a program benefit the increase in patient earnings that results from the program. The increased earnings are taken to represent the patient's increased economic contribution to society, and will in practice generally be shared between the patient and society through decreased welfare payments, increased tax payments, etc.

The third economic effect is the financial loss resulting from antisocial behavior of the mentally ill. Emotionally disturbed persons are more prone than stable persons to engage in illegal or irresponsible activities such as violence, drinking, promiscuity, and other socially disruptive activities. They are also less likely to maintain solid family relations. Such undesirable behavior has both economic and noneconomic adverse effects.

The economic cost of mental illness resulting from antisocial behavior is difficult to estimate, and hence the economic benefits that should be attributed to a particular program because of the reduction of such behavior through psychiatric care must usually be treated as unquantifiable. The difficulties in quantifying these benefits do not, however, make them unreal.

If the concept of the patient as earner is legitimate, then an ethical problem must be raised. It is generally accepted that medical resources are sufficiently scarce that it is not possible to provide all sick people with the best medical care. Assume that two individuals are identical in all respects except their expected life span, eg, one patient is 25, the other is 55 years old. The potential for increasing the earnings of the 25-year-old patient exceeds that for the 55-year-old patient. Do we therefore conclude that the 25-year-old patient should be treated before treating the 55-year-old? The decision of whom to treat first involves a value judgment that must be made either explicitly or by default.

Cost-Effectiveness Model as an Approach to Program Evaluation

In practice, a variety of criteria, both economic and noneconomic, is likely to be relevant for program evaluation. The problem remains to develop a framework for comparing psychiatric programs, given that more than one criterion is considered relevant. A cost-effectiveness model, originally discussed



by Fox⁷ in another context, is presented for this purpose.

Two individuals are referred to: the program evaluator and the decision maker. This clear-cut division does not exist in practice; the roles of the program evaluator and the decision maker invariably overlap. The distinction is useful, however, for purposes of exposition. In government agencies, the decision maker is assumed to be a political figure who seeks to allocate funds wisely.

An essential concept of cost-effectiveness analysis is that of the schedule, which we describe in terms of a graph showing the effectiveness and associated cost of alternative programs. The program evaluator estimates the cost that would be incurred and the effectiveness, or outcome, that would result if each of the alternative programs were undertaken. Again, a simplifying assumption is made for purposes of exposition, ie, that there is one operational outcome measure. In practice, several outcome measures are likely to be relevant and must be examined individually.

The program evaluator normally may discard a program that both costs more and is less clinically effective than another program. On the other hand, if one program costs more but is more effective than another, the program evaluator should present both programs to the decision maker to determine whether the additional effectiveness is worth the additional cost. One task of the program evaluator then is to determine which programs need not be considered by the decision maker because there are alternative programs that either cost no more and are more effective or cost less and are at least as effective. The programs that are not eliminated from consideration form the schedule of efficient alternatives.

For example, suppose that a choice must be made between three alternatives to handle neurotic patients who do not require hospitalization and these alternatives are, in order of increasing cost: (1) to provide no treatment, (2) to provide outpatient care, and (3) to provide hospitalization. Let us assume that the program evaluator judges outpatient care to be more effective than hospitalization in addition to being cheaper. Hence the alternative of outpatient care dominates hospitalization, which can be eliminated from consideration. However, the program evaluator must present the first two alternatives (no treatment vs outpatient treatment), which form the schedule of efficient alternatives, to the decision maker. The decision maker must then make a value judgment of whether the benefits of outpatient treatment outweigh the additional cost that would be incurred.

The Figure illustrates the formation of a schedule under the assumption that seven alternative programs are being considered. The program evaluator has estimated the cost of each of seven programs, labeled A₁ through A₇; he has also estimated the effectiveness of each program. He now seeks to derive the schedule, assuming that the only information he has concerning the preference of the decision maker is that, for programs with equal costs, the most effective program is desired and, for equally effective programs, the lowest cost one is desired.

Programs are compared pair-wise to derive the schedule. Alternative As need not be considered further since alternative A₂ both costs less and is more effective. Thus A₂ is said to dominate As, ie, it is superior to As based on the two criteria that are assumed to be relevant. It can be noted that As is also dominated by At and As. Similarly, Ai need not be considered further since system A₅ is clearly superior. If, after eliminating As and Ai, the remaining combinations of pairs are examined, the program evaluator will be unable to find two alternatives such that one is both more effective and costs less than the other. The remaining five then are members of the schedule of efficient alternatives and should be presented


to the decision maker. These are represented by dots surrounded by circles in the Figure.

Consider the typical situation where more than one measure of effectiveness or outcome is relevant. Suppose, for example, that length of stay out of the hospital and a measure of social adjustment are selected as appropriate measures. Then the graph in the Figure would have three axes, representing the three criteria: cost, length of stay out of hospital, and social adjustment. If cost is considered irrelevant or is identical for all programs being compared, then the cost axis need not be considered, and we are back to the two-dimensional case for our two outcome measures.

In summary, the function of the program evaluator is to analyze programs in terms of the criteria of interest and to make known to the decision maker his analysis of each of the dominant alternatives. The determination of the best program involves value judgments unless one program performs as well as all other programs, based on all relevant criteria, and is superior, based on one or more criteria.

Some Key Issues in Program Evaluation

Program evaluation was described in the introduction as having three major components: (1) administrative and financial, (2) clinical, and (3) treatment outcome. This section discusses some key issues that should be specifically addressed by persons concerned with all three components. These persons need to develop a greater degree of sophistication about the interrelationships involved. The issues addressed are those of: (1) identification of the population that can benefit from a program; (2) setting of priorities on the availability of psychiatric care; (3) consideration of the effect of funding procedures on performance; (4) the effect of bias; and (5) the influence of the administrative structure. Since, in practice, program evaluation involves a strong element of art and judgment, the importance of these issues and the best way of handling them will vary from program to program.

An important question is that of determining the patient population for which a particular program is best suited. Consider two competing programs, labeled A and B. It may be misleading to conclude that one program is better than the other. Program A may be more effective for one patient population, and program B for another patient population. A major issue is deciding how the patient population should be stratified, eg, by demographic variables (sex, socioeconomic level, age, etc) or by diagnostic variables (schizophrenic, alcoholic, etc). One practical problem in empirical research is that there may be a trade-off between a precise and narrowly defined subsample and the statistical significance of the results because of the small size of the subsample. For example, suppose that a program serves both alcoholics and nonalcoholics. Possibly only nonalcoholics are materially aided by the program. However, the subsample of nonalcoholics may be sufficiently small that the results lose statistical significance if only nonalcoholics are considered.

Related to this is the issue of establishing priorities for who should be treated, and at what level of intensity. If we admit the existence of a shortage of adequately trained mental health workers, a mechanism must be established to determine how economic resources (eg, personnel, hospital beds) should be distributed. The concept is not pleasant. The matter of priorities has been handled in our society more by partial solutions than by overall design. In effect, the present method of allocating psychiatric care favors certain categories of persons, in particular, upper socioeconomic classes and special groups, eg, veterans or persons employed by organizations having particularly broad health insurance plans. One can question both the wisdom of the current decision-making process and the equity of the resulting system of priorities.

If we accept that medical resources are limited, the question is not whether a priority system is to be established but how. Establishing the priority system involves value judgments that are likely to be beyond the competence of someone evaluating a particular program. These judgments often require national policy decisions, either implicitly or explicitly. It should, however, be within the competence of the evaluator (or evaluation team) to describe the population that will benefit from a particular program.

The effects of the evaluation criteria and



funding procedures on both individual performance (eg, that of a ward chief in a hospital) and on organizational units need to be considered. Ideally, measures should be formulated that can serve to evaluate mental health workers as well as programs. These measures must be carefully designed, since the effectiveness of a psychiatrist is more difficult to assess than that of a surgeon who performs repetitive and well-defined operations. The effect of these measures on the performance of the psychiatrist must be considered, and the measures must be designed to improve and not degrade patient treatment. If the criteria are clearly stated in operational terms, the danger exists that the person who is being evaluated or who identifies with a program being evaluated will avoid difficult patients, ignore his intuition to the detriment of the patient, or otherwise manipulate the statistics. Nor is it likely that a system can be designed which eliminates a strong subjective element in evaluating personnel.

The procedures for allocating funds to individual VA hospitals demonstrate how the criteria for funding to organizational units can affect treatment. The VA had for years funded its mental hospitals solely on the basis of the size of the hospital census, referred to as average daily patient load (ADPL). This practice tended to discourage these hospitals from discharging patients who no longer needed hospitalization. Not until fiscal year 1968 were changes instituted so that a variety of bases serve to justify funding, including turnover rate, variety and sophistication of programs, medical school affiliation, and amount of education and training conducted. The new allocation practices were designed to encourage individual hospitals to take actions that are more congruent with good patient treatment. However, the criteria for funding still relate only to input rather than outcome. Thus additional analysis is required to determine whether the new funding or allocation procedures do achieve the desired goals of patient treatment outcome.

Bias is an issue in evaluation studies and may take many forms. Two types of bias are described here. The reader is referred to Rosenthal⁸ for a more complete discussion. The first type is selfish or politically motivated bias. This type of bias occurs when the researcher, motivated by personal advancement, reports results such that the evidence in favor of a particular point of view is stressed, and evidence to the contrary is played down or ignored. For an article on how such bias has influenced both research and policy decisions in another field, that of water resources development, see Marshall.⁹

The second type arises from the researcher's bringing to bear a particular point of view arising from his background or training. Thus a narrow view of a program may cause outcome evaluation itself to be atomized. Even within reasonably well-defined disciplines, individuals have their own slant on what is important. The bias can become more apparent when we consider the whole gamut of the social sciences in which one person cannot be fully versed. For example, Freeman and Simmons¹⁰ discuss how their sociological approach results in their underplaying treatment influences in their extensive outcome study of chronic mental patients. On the other hand, others¹¹ more within the medical tradition might reasonably report the effects of drugs on rehospitalization rates without closely scrutinizing sociological influences. Studies attempting to assess the relative importance of variables drawn from diverse traditions are less frequent. One exception is the work of Pasamanick et al,¹² that studies both drug and social context variables for their influence on outcome.

The final issue relates to the administrative structure of most hospitals. Typically there is little communication between the administrators, who formulate the budget, and ward chiefs, who use the hospital's resources. This gap is due to several factors, including organizational structure, differences in language between psychiatrists and administrators, and lack of outcome information. If outcome information were available, the hospital management could use it for purposes of budgeting, and a ward team could use it to evaluate its effectiveness and to attempt improvements. For example, the work at Fort Logan, Colo,¹³,¹⁴ provides a preliminary investigation of the problems of collecting and distributing outcome information to different areas in the organization structure.



Challenge of Program Evaluation

One fruitful area for study is that of designing patient records so that outcome information would be analyzed as a matter of course. This information would be used to evaluate psychiatric programs in hospitals and other settings for community psychiatry. Data on patients would be collected at set intervals, both during treatment and for several years afterwards. Such a scheme is technically facilitated by the development of computers and related equipment. The extent to which evaluation can become a routine part of mental health programs is a vital issue.

Good evaluation of psychiatric programs requires the marshaling of skills that have not been marshaled in the past. Depending on the nature of the program being studied, professionals with a knowledge of such fields as economics, urban affairs, computer technology, and systems analysis are able to contribute to the evaluation. Mental health workers will find value in gaining acquaintance with these disciplines so that they can communicate and interact with these other professionals involved.

Throughout this paper the need for more comprehensive program evaluation has been stressed. It should be emphasized that improved measurement is intended to assist and improve the judgment of those who make decisions and not to supplant it. In the final analysis, program selection and organization involves value judgments.

Summary

A framework is presented for evaluating psychiatric programs under the assumptions that several criteria are relevant. Psychiatric program evaluation is viewed in the context of choice, requiring consideration of three major interrelated activities that influence treatment and its availability. These are: (1) administrative and financial activities, (2) clinical treatment, and (3) assessment of treatment outcome. Certain key issues in program evaluation are discussed.

Both authors are participants in research involving the application of new concepts to community psychiatry funded by National Institute of Mental Health grant No. MH 02332-02.
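
The pair-wise elimination used to derive the schedule of efficient alternatives can be illustrated with a short sketch. It is not taken from the paper: the `Program` record, the function names, and all cost and effectiveness figures are hypothetical, chosen only to mirror the example of three alternatives for neurotic patients under the simplifying assumption of one outcome measure. A program is dropped when some other program either costs no more and is more effective, or costs less and is at least as effective.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    name: str
    cost: float           # estimated program cost, in any consistent unit
    effectiveness: float  # single outcome measure; higher is better


def dominates(a: Program, b: Program) -> bool:
    """True if a costs no more and is more effective than b,
    or costs less and is at least as effective."""
    return (a.cost <= b.cost and a.effectiveness > b.effectiveness) or (
        a.cost < b.cost and a.effectiveness >= b.effectiveness
    )


def efficient_schedule(programs: list[Program]) -> list[Program]:
    """Compare programs pair-wise and keep those that no other program
    dominates, ordered by increasing cost."""
    kept = [p for p in programs if not any(dominates(q, p) for q in programs)]
    return sorted(kept, key=lambda p: p.cost)


# Hypothetical figures (assumptions, not data from the paper).
alternatives = [
    Program("no treatment", cost=0.0, effectiveness=0.2),
    Program("outpatient care", cost=500.0, effectiveness=0.7),
    Program("hospitalization", cost=3000.0, effectiveness=0.5),
]

schedule = efficient_schedule(alternatives)
for p in schedule:
    print(p.name, p.cost, p.effectiveness)
```

With these assumed figures, hospitalization is eliminated because outpatient care is both cheaper and more effective, and the remaining two alternatives form the schedule. The choice between them is left to the decision maker, since it rests on a value judgment that the comparison itself cannot supply.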

References
1. Daniels, D., and Kuldau, J.: Marginal Man, the Tether of Tradition, and Intentional Social System Therapy, Community Ment Health J 3 (1):13-20 (spring) 1967.
2. US Public Health Service, Department of Health, Education, and Welfare: Veterans With Mental Disorders Resident in Veterans Administration Hospitals, October 31, 1962, Washington, DC: Public Health Service Publication No. 1223, 1966.
3. US Public Health Service, Department of Health, Education, and Welfare: Patients in Mental Institutions, Washington, DC: Public Health Service Publication No. 1452, 1966, parts II and III.
4. Kandel, D.B., and Williams, R.: Psychiatric Rehabilitation, New York: Atherton Press, 1964.
5. Conley, R.W.; Conwell, M.; and Arrill, M.B.: An Approach to Measuring the Cost of Mental Illness, Amer J Psychiat 124:755-762 (Dec) 1967.
6. Conley, R.W.; Conwell, M.; and Arrill, M.B.: Estimate of Amount and Distribution of Current Cost of Mental Illness, 1966, mimeographed.
7. Fox, P.: A Theory of Cost-Effectiveness for Military Systems Analysis, Operations Res 13 (2):191-201 (March-April) 1965.
8. Rosenthal, R.: Experimenter Effects in Behavioral Research, New York: Appleton-Century-Crofts, 1966.
9. Marshall, H.: "Politics and Efficiency in Water Development," in Kneese, A., and Smith, S. (eds.): Water Research, Baltimore: Johns Hopkins Press, 1966, pp 291-310.
10. Freeman, H.E., and Simmons, O.: The Mental Patient Comes Home, New York: John Wiley & Sons, Inc., 1963.
11. Engelhardt, D.M., et al: Phenothiazines in Prevention of Psychiatric Hospitalization, Arch Gen Psychiat 16:98-101 (Jan) 1967.
12. Pasamanick, B.; Scarpitti, F.; and Dinitz, S.: Schizophrenics in the Community, New York: Appleton-Century-Crofts, 1967.
13. Polak, P.: Unclean Research and Clinical Change, Milbank Memorial Fund Quarterly XLIV (1), pt 2, pp 337-345 (Jan) 1966.
14. Binner, P.: Development of the Research Department, Milbank Memorial Fund Quarterly XLIV (1), pt 2, pp 313-319 (Jan) 1966.
