JOURNAL OF APPLIED BEHAVIOR ANALYSIS 1978, 11, 203-214 NUMBER 2 (SUMMER 1978)


SOCIAL VALIDITY: THE CASE FOR SUBJECTIVE MEASUREMENT
or
HOW APPLIED BEHAVIOR ANALYSIS IS FINDING ITS HEART1
MONTROSE M. WOLF
UNIVERSITY OF KANSAS

I apologize, but I must begin making my case for subjective measurement by recounting to you my own experiences with it over the past few years. Almost a decade ago, when the field of applied behavior analysis was beginning to expand so rapidly, we were faced with the task of putting together the Journal of Applied Behavior Analysis. For a period of several months Garth Hopkins, who was our managing editor, presented us with a series of unexpected decisions to make; like: What color should the paper be? And did we need a paper that would hold together for two thousand years or were we willing to live with a shelf-life of only a thousand years? And so on.

Just a couple of days before we were scheduled to go to press with our very first issue, Garth called with one more question. "What is the purpose of the Journal of Applied Behavior Analysis?", he asked. He said we needed to put a description of the purpose on the inside front cover, as one finds in other journals. He needed an answer almost immediately.

1This manuscript was presented as an invited address to the Division of the Experimental Analysis of Behavior, American Psychological Association, Washington, D.C., September, 1976. Many valuable suggestions regarding this manuscript were made by Don Baer, Curt Braukmann, Steve Fawcett, Dean Fixsen, Bill Hopkins, Frances Horowitz, Kathi Kirigin, Jack Michael, Keith Miller, Todd Risley, Jim Sherman, and Sandra Wolf. Preparation of the manuscript was partially supported by Grants MH20030, MH13644, and MH13881 from the National Institute of Mental Health (Center for Studies of Crime and Delinquency) to the Department of Human Development and the Bureau of Child Research, University of Kansas. Reprints may be obtained from Montrose M. Wolf, Department of Human Development, University of Kansas, Lawrence, Kansas 66045.

What was the purpose of our journal? It was a question that was clearly more important than the others I had been asked. So I decided to consult the Gods but, as usual, Don Baer, Don Bushell, Barbara Etzel, Vance Hall, Bill Hopkins, Judy LeBlanc, Keith Miller, Todd Risley, and Jim Sherman were not in their offices. However, I did find Don Baer in the hall. So I asked Don, "What is the purpose of JABA?" and Don said in his usual offhand but eloquent way, "It is for the publication of applications of the analysis of behavior to problems of social importance." Well, that sounded so reasonable that it had to be true. So that is what I put in the Journal and it went to press.

There was only one small problem; I wasn't sure what "social importance" meant or, worse still, how to measure it. And, as I am sure you can appreciate, the more I thought about this the more concerned I became.

The dictionary only added to my distress. According to my New Webster's Vest Pocket Dictionary (1962) importance simply meant "having value" and of course, social meant "pertaining to society". Thus, something of social importance would have to be judged by someone as having value to society.

Unfortunately, that sounded slightly subjective to me. And subjective criteria have not been very respectable in our field. We have considered ourselves a natural science, concerned about the objective measurement of natural events such as arithmetic problems worked correctly, litter picked up, sexual responses occurring, and social skills learned. We have considered ourselves to be like the other natural sciences: like physics, chemistry, and biology, which concern
themselves with the objective aspects of nature and profitably abandoned the subjective dimensions of natural events sometime in their primordial past.

We have considered ourselves to be distinctly purer and more objective than most of our sister social sciences. We have looked especially askance at our colleagues in sociology, anthropology, psychiatry, and humanistic psychology because they often mix into their sciences difficult-to-digest portions of subjective measurement.

But psychologists have not always been so suspicious of subjective data. For some time, and until the first decades of this century, introspection was the basic method of psychology. As you no doubt remember from your history of psychology course, introspection is defined as the observation or examination of one's own mental, emotional, or feeling states. The subjects' verbal descriptions about sensations, private events, and feelings such as pleasantness and unpleasantness had been taken to be the primary subject matter of psychology (Boring, 1950). As a reaction against introspection in psychology and in science generally, there arose positivism from Bridgman in physics and from Comte, Mach, and Feigl in philosophy. To quote Edwin Boring (1950) about its impact:

"The movement was positivistic. It was an attempt to get back to basic data and thus to increase agreement and diminish the misunderstandings that came about from unsuspected differences in meaning. Experience [introspection] had proved unsuccessful as the scientific ultimate." (Boring, 1950)

John Watson began page one of his book Behaviorism in the following manner:

"Two opposed points of view are still dominant in American psychological thinking - introspective or subjective psychology, and behaviorism or objective psychology. Until the advent of behaviorism in 1912, introspective psychology completely dominated American university psychological life." (Watson, 1930).

B. F. Skinner, in Science and Human Behavior (1953), also argued forcefully against subjective measures of private events. He began by pointing out the implications of the discriminated operant model of language. He described how a community can reinforce and thus develop reliable verbal reporting of public events because both the community and the individual have access to these events. On the other hand, he pointed out that since the community cannot have access to private events, the use by psychology of introspective or subjective data leads to serious questions about reliability. Skinner continued,

"The layman also finds the lack of a reliable subjective vocabulary inconvenient. Everyone mistrusts verbal responses which describe private events. Variables are often operating which tend to weaken the stimulus control of such descriptions, and the reinforcing community is usually powerless to prevent the resulting distortion. The individual who excuses himself from an unpleasant task by pleading a headache cannot be successfully challenged, even though the existence of the private event is doubtful."

While defining a functional analysis for us, Skinner (1953) urged us to concentrate on the objective behavioral data in our science as in the following quotation:

"The objection to inner states is not that they do not exist, but that they are not relevant in a functional analysis.... In dealing with the directly observable data we need not refer to ... the inner state...."

Having been well trained in these traditions, we all agreed that in our journal, everything would be measured in objective ways. We would avoid subjective measurement; that would be a first priority. Some of the members of the
JABA Board of Editors even wanted to restrict us to using only mechanically recordable behavior in our applied research. They wanted a microswitch under every schoolroom chair and under every bed. They were even suspicious of observer measurement systems that contained reliability checks. Yet I, in a moment of haste, had committed our journal to a goal, to an ultimate criterion, to a reason for being, that was clearly and simply subjective and that we had no good way of measuring.

You can imagine what I expected. I prepared for an onslaught of abuse, invective, and ridicule from our editors and our reading audience. "Social importance? Bah! Humbug!", I thought they would say. To my surprise and relief, what happened was that people seemed pretty much to accept it. Many even seemed to know what it was. For example, JABA editors often referred to it in their reviews and used it as a basis for recommending or not recommending manuscripts for publication. The editors most frequently reported that the particular manuscripts that they had been asked to review didn't have very much of it. On the other hand, they reported that a few manuscripts had a moderate amount of it. And an occasional one or two had a lot of it. This made me feel somewhat better. Although I wasn't sure what it was or how to measure it objectively, it was clear that many of my colleagues had no trouble at all in recognizing it.

I was also fearful of criticism from our reading audience. And we did receive occasional complaints about social importance. But primarily they wanted to know why the research that appeared in JABA was not more socially important. That criticism was easy for me to live with. I just blamed our authors. If the readers had taken me to task for using a fuzzy subjective criterion like "social importance", then I would have had no excuse.

But the issue of subjective measurement continued to make my life complicated. One of the functions of a chief editor is to uphold the standards of the journal. And almost everyone in the field strongly suggests that these be maintained rigorously. Except, of course, in the special case of everyone's own manuscripts which, because of their unusual significance, merit special consideration. In any event, among the standards that I was entrusted to uphold was that of requiring objective, reliable data. Thus, you can appreciate the concern I began to feel when some of our most esteemed colleagues began submitting articles to JABA that included undisguised, blatantly subjective data.

One of the first came from, of all people, Bob Jones and Nate Azrin (1969). They had been conducting an exquisite series of experiments on the effects of rhythm and stimulus duration on stuttering behavior. They had shown, very nicely, that they could almost completely eliminate stuttering by having the stutterers synchronize their speech with a simple, regular beat. They had also developed a portable practical piece of apparatus that would present the beat tactually, and privately, thus avoiding embarrassment to the wearer. Their results indicated that they were on the verge of an important solution to stuttering. There had been one problem, however. The speech, although almost stutter-free, was complained about by listeners as sounding artificial. [The next sentence is to be read with a monotone with a distinct beat.] Apparently, they did not stutter, but they did not talk very naturally, either.

To deal with this problem, Jones and Azrin systematically explored various beat durations. Then, and this was the difficult part, they asked judges to rate the "naturalness" of the speech at various beat durations. The judges reported that the speech sounded most natural to them at between two and three seconds of beat duration.

I wanted to phone Jones and Azrin and say, "Hey you guys, do you realize what you are doing to me and the journal? Do you realize what kind of precedent you will be setting with your 'naturalness'? Why, the people in our field who are not as sophisticated as you and me and who are easily influenced will begin to think that it
is possible to measure how people feel about all kinds of subjective things. I know that 'naturalness' sounds innocent enough, but think about it a moment. If you publish a measure of 'naturalness' today, why tomorrow we will begin seeing manuscripts about happiness, creativity, affection, trust, beauty, concern, satisfaction, fairness, joy, love, freedom, and dignity. Who knows where it will end? Think for just a moment. What is that going to do to us and to the field of applied behavior analysis?"

But I was sure that they would have just said that they would agree that it was going to complicate our science a bit. But if those things described by subjective labels were the things that were most important to people, then those were the things, even though they might be complex, that we should become more concerned with. After all, as an applied science of human behavior, we supposedly were dedicated to helping people become better able to achieve their reinforcers.

Well, it didn't stop with Jones and Azrin. At about the same time I received a lovely manuscript from Jim McMichael and Jeff Corey (1969) in which they reported the exciting finding that college students in a Keller-type PSI (Personalized System of Instruction) course did better on the exam than the students in a traditional lecture course. This was, of course, a very important finding, as it replicated and substantiated Keller's research. The only problem was that they also asked the students in each course how much they liked their course. The students in the PSI course rated their course a great deal higher than the students in the traditional lecture sections.

"Well," I thought to myself, "What in the world am I going to do with this one? They are asking the participants in a behavioral treatment program how much they like it. Why, of course they should like it. After all, we are doing it to them for their own good, aren't we? And even if they say they don't like it, we know what is best for them. Clearly, if the procedure is effective, it's just not important whether anyone says they like it or not. Besides, look at the precedent that it will set. Before long, those who don't appreciate the extreme risks of subjective data will start asking for feedback from the participants in their treatment programs. Who knows where that will end?"

But I felt sure that McMichael and Corey would just say that feedback from participants is not a trivial issue: that if the participants don't like the treatment then they may avoid it, or run away, or complain loudly. And thus, society will be less likely to use our technology, no matter how potentially effective and efficient it might be.

At the same time that I was having to wrestle with the problems of subjective measurement in JABA, my colleagues and I in the Achievement Place Research Project were having some problems with unsolicited subjective feedback on similar issues. Colleagues, editors, and community members were asking us about the behavioral goals that we had chosen for training the teaching-parents and the youths participating in the community-based, family-style, behavioral treatment program at Achievement Place. They would ask us: "How do you know what skills to teach? You talk about appropriate skills this and appropriate skills that. How do you know that these are really appropriate?" We, of course, tried to explain that we were psychologists and thus the most qualified judges of what was best for people. Somehow, they didn't seem convinced by that logic.

In addition, the first time we tried to replicate the Achievement Place program in another community, that community gave us feedback in a most drastic manner. Before we really knew that they had complaints about our program they had "fired" us. Finally, there were those who were challenging the importance of some of the results of the training that we were reporting. "Yes," they would say, "there are changes in the behavior, but how do we know that they are really important changes?"

The message we seemed to be getting was that "social importance" was a subjective value
judgement that only society was qualified to make. If our objective was, as described in JABA, to do something of social importance, then we needed to develop better systems and measures for asking society whether we were accomplishing this objective. The suggestion seemed to be that society would need to validate our work on at least three levels:

1. The social significance of the goals. Are the specific behavioral goals really what society wants?
2. The social appropriateness of the procedures. Do the ends justify the means? That is, do the participants, caregivers and other consumers consider the treatment procedures acceptable?
3. The social importance of the effects. Are consumers satisfied with the results? All the results, including any unpredicted ones?

We have come to refer to these as judgements of social validity. It seems to us that by giving the same status to social validity that we now give to objective measurement and its reliability we will bring the consumer, that is society, into our science, soften our image, and make more sure our pursuit of social relevance.

An example from our own experience in the Achievement Place Research Project is that we were told by many communities that one of the most important characteristics of teaching-parents that they wanted was "warmth". When quizzed about "warmth", the community members indicated that they wanted teaching-parents who "know how to relate to youths". For some time, our response to this request was to disagree with them. We argued, "What you really need is someone who knows how to give and take away points at the right time." But the results of our research (Braukmann, Kirigin, and Wolf, 1976) are tending to support the community's commonsense wisdom about the importance of teaching-parents being able to "relate to youths".

Thus, in order to be responsive to our communities and to our data, one of our challenges became to try to determine the behaviors that teaching-parents need in order to "relate to their youths". "What do some people have that makes kids like them? And how were we going to find out?", we asked ourselves over and over. "Relating" appeared to be such a complex behavioral puzzle of subtle social behaviors that we were not sure how to begin our behavioral analysis. We did have the Jones and Azrin example for measuring "naturalness", and we came upon another method from, of all places, the Rogerian counselling psychologists.

Haase and Tepper published an article in the Journal of Counseling Psychology in 1972 that was a great deal of help to us. Like so many Rogerians, Haase and Tepper were interested in "empathy". They wanted to see if they could find out what nonverbal behaviors of the counsellor were involved in empathy in order to be better able to teach and evaluate counsellors in training. They set up simulated counselling situations that contained various nonverbal components, such as level of eye contact, trunk lean (forward or backward), body orientation (toward the client or rotated away from the client), distance from the client and various levels of "empathic" verbal messages. Videotaped excerpts were then presented to experienced counsellors, who rated the amount of overall empathy presented in each excerpt. It was found that eye contact, trunk lean, distance, and verbal content were all related to the judgements of empathy. One result that really seemed to surprise the authors was that the nonverbal behaviors accounted for more than twice as much of the judgements of empathy as did the verbal behaviors. A counsellor who was saying something only moderately empathic was judged to be highly empathic if he or she were also engaging in eye contact, forward trunk lean, and were positioned close to the client.

Well, it occurred to us that this model could be used to analyze the meaning of all kinds of complex and subjective verbal labels. It also
looked like a way to find out what some of the behaviors were that made some teaching-parents better than others in being able to "relate to youths". Alan Willner, with Curt Braukmann, Kathi Kirigin, Dean Fixsen, Lonnie Phillips, and I (Willner et al., 1977) began to attempt to identify the interaction behaviors of teaching-parents in Achievement Place style group homes that the youths liked and didn't like. Alan Willner had several youths look at videotaped examples of a variety of teaching-parent/youth interactions and list the things that they liked and the things that they disliked. These comments were put into categories and then rated by the youths on an A, B, C, D, and F basis. The youths gave A's to the following teaching-parent behaviors: a calm, pleasant voice tone, offers to help, joking, fairness, explanations, concern, enthusiasm, politeness, and getting to the point. F's were given to the following teaching-parent behaviors: throwing objects, accusing, blaming statements, shouting, no opportunity provided to speak, insulting remarks, unfair point exchanges, and profanity. Willner then took some of the highest rated social behaviors, taught them to teaching-parent trainees, and found that youths rated these trainees much higher after the trainees received instruction in the youth-preferred behaviors.2

2Jack Michael (personal communication, 1976) has pointed out that some behaviors, identified as preferred by this method, may have acquired their reinforcing value by their usually being members of chains of behaviors. An example might be offers to help. It is possible that if offers to help were not often followed by providing help, the offers themselves would lose their reinforcing value. Similarly, behaviors described as showing concern may have the same relationship to a more complex chain of behaviors. Thus, there appears to be an important and not, as yet, well understood "sincerity" dimension that should be brought to the attention of anyone who may want to apply these findings. On the other hand, some of the behaviors identified as preferred may not be dependent on later events for their reinforcing value. Examples might be joking and explanations.

One important sidelight of Alan Willner's study was that he was not able to predict the behaviors of the teaching-parents that were going to be most liked by the youths. As a matter of fact, some of the behaviors that he thought would be most important to the youths were never mentioned by them. He still wasn't convinced. After all, maybe the youths just couldn't verbalize these subtle behaviors, which of course was a real possibility. In this case, however, he cross-validated the original behaviors by giving the youths more structured interviews, in which he included more detailed descriptions of the behaviors that he thought should also be important to them. The youths still rated those behaviors as much less important than the ones that they had earlier pointed out as important. This same outcome was found with youths who were not involved in the first set of interviews.

It has become clear to us that we cannot predict very well what many subjective labels of complex behavioral phenomena are going to mean to our judges. Nevertheless, while the task of unravelling those social behaviors that are involved in knowing how to "relate to youths" is incomplete, Alan Willner has taken us closer to that goal.

Another example of the use of the social validation method to examine the social validity of behavioral goals is a study by Neil Minkin, with Curt Braukmann, Bonnie Minkin, Gary Timbers, Barbara Timbers, Dean Fixsen, Lonnie Phillips, and me (Minkin et al., 1976). Neil Minkin wanted to determine what conversational skills of adolescent girls were relevant. He took videotapes of adolescent girls in conversations with adults and of university girls in conversations with adults. Judges from the community were then asked to rate the effectiveness of each of these girls as conversationalists. As might be expected, the community people judged the university girls to be more effective and ranked them higher. Minkin and others reviewed the videotapes of all the university and junior high-school girls several times, and determined that a composite score of three kinds of behavior correlated at the 0.84 level with
the ratings given by the community representatives. (The three behaviors were: time spent talking, conversational questions, such as "What are you taking in school?", and positive feedback behaviors such as "Uh huh", "Yeah", and "Great!") In this manner it was possible to isolate many of the behaviors that the community representatives clearly were responding to when they rated overall quality of a conversation.

Another example of the social validation of behavioral goals, conducted by the Achievement Place group, was carried out by Jack Werner, with Neil Minkin, Bonnie Minkin, Dean Fixsen, Lonnie Phillips, and me (Werner et al., 1975). Police exercise a great deal of discretion in handling juvenile offenders. Less than one-fourth of those youths who come into contact with police officers and who could be taken into custody actually are taken into custody. According to Piliavin and Briar (1964), the violation per se is usually less influential in determining the choice of disposition than is the demeanor of the youth. It is often estimated that the social behaviors of the youths account for approximately 50% of all decisions regarding prejudicial handling of youths. Jack Werner wanted to identify some of the important behavioral components of youth-police interactions so that he could teach these to youths. Through informal interviews and then formal questionnaires, Werner and his colleagues identified several apparently important behaviors, including expression of cooperation, body orientation so that the youth was facing the officer, and politeness. Werner found that these behaviors could be reliably measured, thus partially solving the behavioral puzzle of what objectively measurable youth behaviors may influence police officers' decisions about custody.

So, rather than deciding by oneself the validity of the behavioral objectives of a treatment program, we can approach the specific consumer or representatives of the relevant community, and through interviews or ratings determine much more precisely what the socially significant problems are. And, based on the example of Jones and Azrin (1969) and the work of Haase and Tepper (1972), we find that we can establish the social importance or validity of complex classes of behavior that have subjective labels. By supplementing our traditional objective measures, we can determine the relationship between the objectively measured behaviors and the subjective labels. This procedure opens opportunities to explore all of the important goals that are described by subjective labels.

To summarize the method for determining goal behaviors, I quote from Minkin and his colleagues (1976):

"For example, 'affection' might be considered a complex social behavior. If the goal of a behavior analyst were to teach a parent to be more affectionate towards his or her child, it would be necessary to specify the important component behaviors of affection. Some of the components might include touching, smiling, and hugging. To validate the social importance of these behaviors, four steps might be used. First, gathering sample parent-child interactions. Second, developing reliable definitions and recording specific behaviors. Third, employing relevant judges, that is, other parents or children, to rate the sample interactions and evaluate each parent as to the amount of affection shown to the child within the interaction. The evaluation instrument might be a bi-polar rating scale with the poles labelled as to the amount of affection shown. Step four would involve correlating the ratings of the judges with a composite score of the objectively measured behaviors of the parents. The subsequent correlation coefficient would indicate the level of relationship of the specified objectively measured components of affection to the common English 'meaning' of affection as rated by the judges. Some of the important behavioral components of creativity, conversation, and affection, as well as other complex classes of social
behaviors, could probably be identified through the use of these social validation procedures."

It is clear that a number of the most important concepts of our culture are subjective, perhaps even the most important. Martin Luther, as the story goes, was severely criticized for setting Protestant hymns to the popular melodies of songs and dances of the time. He replied, "Why should we let the devil have all the best tunes?" Well, why should we let the others have all of the best human goals and social problems?

A second kind of social validity that has impressed its importance on us is the social appropriateness (in terms of ethics, cost, and practicality) of the treatment procedures that we use. Again, behavior analysts are beginning to ask clients and care-givers systematically about the acceptability of their procedures. Foxx and Azrin (1972) found restitution procedures more acceptable to care-givers than timeout or shock punishment. These authors have also reported over-correction to be a re-education procedure that is acceptable to care-givers of the retarded. Janet Porterfield, Emily Herbert-Jackson, and Todd Risley (1976) recently determined that "contingent observation" (that is, having to stop playing and just watch your playmates for several seconds) was not only an effective procedure for reducing the disruptive behavior of young children in a day-care setting, it was also found to be acceptable to the care-givers and to the parents of the children.

Our own data show that ratings by the youths in Achievement Place style homes of the fairness of the program and the concern of the teaching-parents correlate very highly with the number of offenses that the youths commit while they are in treatment (Braukmann, Kirigin, and Wolf, 1976). It may be that not only is it important to determine the acceptability of treatment procedures to participants for ethical reasons, it may also be that the acceptability of the program is related to effectiveness, as well as to the likelihood that the program will be adopted and supported by others.

The third dimension of social validity is the social importance of the effects of behavioral treatment. Are consumers satisfied with the results, all of the results, including those that were unplanned? Behavioral treatment programs are designed to help someone with a problem. Whether or not the program is helpful can be evaluated only by the consumer. Behavior analysts may give their opinions, and these opinions may even be supported with empirical objective behavioral data, but it is the participants and other consumers who want to make the final decision about whether a program helped solve their problems. Many behavior analysts are beginning to validate their objective data with systematic subjective measures of consumer satisfaction.

For example, Ron Kent and Dan O'Leary (1976) found the ratings by teachers and parents of child behavior also improved when their objective data showed increases in appropriate school behavior. Karen Maloney and Bill Hopkins (1973) determined that when they modified the sentence structure of stories written by elementary school children, judges' ratings of creativity also increased. This is to be contrasted with the findings of Tom Brigham, Paul Graubard, and Aileen Stans (1972), who were also attempting to improve quality of composition of school children, and found that some contingencies that increased objective dimensions had little effect on subjective ratings of quality, while other contingencies produced increases in both objective measures and subjective ratings of story quality. Steve Fawcett and Keith Miller (1975) demonstrated that an instructional package designed to enhance public-speaking behavior was effective in producing increases in both the objectively measured public-speaking behaviors and in the audience's ratings of the quality of the performance of the trainees.

We have described the Achievement Place research of Willner, Minkin, and Werner and their colleagues, where judges were used to de-
SOCIAL VALIDITY 211

termine socially valid dimensions of teaching- How well do they represent the quality of
parent/youth interaction behavior, quality of national life? How valid are they as mea-
conversation components, and significant ele- sures of the goodness of life in this coun-
ments in youth-police interaction. In each of try? The history of the last 25 years is not
those studies, the outcomes were also socially reassuring. During this period this country
validated. That is, relevant judges were also used has experienced an unprecedented rise in
to assess the social importance of the changes national affluence, with a spectacular in-
in the objectively measured behaviors. And it crease in average family income and an as-
was found that youths rated the quality of the sociated decline in the number of families
teaching-parents higher, members of the com- below the poverty line. During the same
munity rated the quality of the youths' conver- period we have seen a phenomenal rise
sations higher, and police officers rated the in the incidence of crime, an epidemic of
quality of the demeanor of the youths higher as various forms of public violence, a greatly
the objectively measured behaviors increased in increased use of drugs with associated drug
each case. abuse, a continuing increase in the number
At the treatment program level, Curt Brauk- of fragmented families; a sharp drop in
mann with Dean Fixsen, Kathi Kirigin, Elaine public confidence in elected officials, and
Phillips, Lonnie Phillips-and I (1975) de- what appears to be a substantial rise in so-
scribed how feedback from consumers can be cial and political alienation. [I} . . . find
used to provide ongoing quality control of the it hard to believe that the quality of Amer-
dissemination of the Achievement Place treat- ican life has been greatly enhanced dur-
ment model. The consumers of the program, ing this period."
that is the youths in the program, their parents,
and community members and agencies, evaluate E. F. Schumaker, in his book Small is Beauti-
the teaching-parents by rating their effective- ful: Economics as if People Mattered (1973),
ness, concern, etc. throughout the year of train- raised this same issue. He urged economists to
ing and certification, and each year thereafter. consider what he terms the "primacy of quali-
It has not been possible to demonstrate experi- tative distinctions", rather than being so con-
mentally the effectiveness of this feedback sys- cerned with objective data like the gross na-
tem by using it with some programs and not with tional product.
others because of ethical considerations. But Recently, the Swedish medical sociologists
there is one important bit of data. Since this feed- Levi and Anderson (1975) suggested that ob-
back was put into effect, the Achievement Place jective measures that habitually have been used
program has not been summarily "fired" from a by the United Nations to assess the quality of
community, as in that first attempt at replica- life be supplemented by subjective measures.
tion. Also, these consumer satisfaction ratings They proposed that the traditional objective
are often highly correlated with objective mea- measures of quality of life, such as education,
sures of effectiveness (Braukmann et al., 1976). employment, economy, housing, nutrition, etc.
Concern for the social validity of objective be given equal emphasis with subjective criteria
measures seems to be an issue in other social such as "happiness, satisfaction, and gratifica-
sciences as well. At the American Psychological tion". Thus, applied behavior analysts are not
Association meeting, Angus Campbell (1976) the only applied social scientists who are being
raised this issue about economics: asked to validate their measures by checking
with society.
"None of us doubts that economic data Well, if social validity is such a good thing,
have admirable qualities: the question is, why haven't we been doing more of it all along.
Of course, the answer is that subjective data are risky data. Subjective data may not have any relationship to actual events. A program that is described by its consumers as well-liked or effective may not necessarily be either pleasant or effective. Thus, there is the danger that subjective data will seriously mislead us.

For example, Berleman, Seaberg, and Steinburn (1972) conducted a delinquency prevention experiment with carefully matched experimental and control groups, using intensive one- to two-year treatment by social workers as the intervention procedure. The evaluation of the effectiveness during the treatment period and during the eight months following treatment indicated "no positive impact" on disruptive behavior in school, police contacts, or rate of institutionalization. The untreated control group performed as well as or better than the experimental group. Yet, when asked about their experience in treatment, the youths ". . . believed that their school acting-out had decreased. When asked if they would participate in a similar service again, 89 percent of the parents responded positively, as did 94 percent of the boys".

Behavioral researchers have reported many examples of a lack of correspondence between client-reported data and observer-obtained data. Patterson (personal communication, 1974), for example, described parental reports of improvements in the child's behavior that were not supported by objective data obtained by observers. Conrad and Wincze (1976) reported that clients undergoing orgasmic reconditioning verbally reported favorable results that were not substantiated by the objective data.

Why do these discrepancies exist? One possibility is that the contingencies of the situation create distortion. Verbal behavior, clearly, is a manipulable behavior. And we must be suspicious of it because we know that we will not always understand the contingencies operating on it. When we are asking for a verbal description of a private event, such as satisfaction with our treatment program, we must be very cautious because we have no adequate way of checking the reliability of the verbal report in an independent way. And as Skinner pointed out, verbal descriptions of private events are open to "fictional distortion" (1959).

For example, in order to influence consumer evaluations, it is conceivable that some of those being evaluated might politic their consumers for better ratings. Similarly, it is conceivable that some of those consumers giving ratings might fear that they will not remain anonymous and be afraid that those they are rating might retaliate in some manner. One can conceive of many such possibilities, but let us remember that the reliability of objective measurement systems can also be manipulated, as the excellent series of studies by O'Leary and Kent and their colleagues (O'Leary, Kent, and Kanowitz, 1975; Kent, O'Leary, Diament, and Dietz, 1972) has demonstrated. From these studies, it seems clear that the scoring behavior of observers can be affected by a variety of variables, such as experimenter feedback. We must take these into consideration whenever we design a measurement system that involves observers. Thus, we know that the reliability of objective measurement procedures can be influenced by a number of known and probably unknown variables, but we continue to use these systems because they are the only way to obtain some very important data, they often work, and we feel some confidence that we are gaining a better understanding of the conditions that may distort them.

Similarly, we know that social validity measures can be manipulated and abused, but we cannot allow this to lead us to neglect them. Rather, we must establish that set of conditions under which people can be assumed to be the best evaluators of their own treatment needs, procedural preferences, and posttreatment satisfaction. True, we know little about the proper set of conditions, but we must attempt them anyway. We can expect that they will involve education about options, lack of coercion, anonymity, and so on. We can study the effects of these conditions on subjective data, as O'Leary and Kent and their colleagues have studied their effects on objective observer-dependent measurement systems. And then we will be better able to control for them.

A second possible explanation for subjective-objective discrepancies is that the consumer is responding to changes in some behavior or condition that we are not recording with our particular objective measures. For example, the parent may say that a child has "improved", while our behavioral measure of rate of tantrums does not show a decrease. The discrepancy may be because the child has stopped cursing, which was important to the parent, but not measured by us, perhaps because it does not bother us. If this lack of appropriate measurement is one of the factors in subjective-objective discrepancies, then we must become better at setting up our measurement systems.

A third possibility, and the most serious, is that subjective measurement is impossible because humans cannot judge and report their own situation accurately enough. It may be that they don't know when they are better or worse off. It may be that to expect a human ever to be able to report accurately when something feels good or feels bad is just more than we can hope for from our confused species. But this conclusion is unacceptable if our goal is to design a responsive consumer-oriented applied social science. As Levi and Anderson (1975) argued in making their case for adding subjective measures to objective quality-of-life indicators:

"We believe that each individual can be assumed to be the best judge of his own situation and state of well-being. The alternative is some type of 'big brother' who makes the evaluation for groups and nations. World history provides many examples of such 'expert' or 'elitist' opinions being at variance with what was expected by the man in the street."

Therefore, we may have to try to develop better ways of teaching people to observe their behavior and their conditions and to make more accurate decisions about their improvement. The opinion poll people often seem to be able to make excellent predictions about voting behavior based on verbal report. Surely we can do as well.

Undoubtedly, there will be further important studies that point out to us the shortcomings of certain social validity measures, just as has been done for observer-dependent objective measures. But we can't despair. After all, measurement has been our thing. In our field, we have developed so many ingenious measurement systems. There is no doubt that we could measure the disruptive classroom behavior of a school of fish, if need be. Surely, we will be able to develop measurement systems that will tell us better whether or not our clients are happy with our efforts and our effects.

Earlier in our history, Watson and Skinner argued forcefully against subjective measurement because they were concerned about the inappropriate causal roles that hypothetical internal variables, subjectively reported, were playing in social science. As a result, many of us concluded that all subjective measurement was inappropriate. A new consensus seems to be developing. It seems that if we aspire to social importance, then we must develop systems that allow our consumers to provide us feedback about how our applications relate to their values, to their reinforcers. This is not a rejection of our heritage. Our use of subjective measures does not relate to internal causal variables. Instead, it is an attempt to assess the dimensions of complex reinforcers in socially acceptable and practical ways. It is an evolutionary event that is occurring as a function of the contingencies of the applied research environment; contingencies that our founders would probably say they appreciate, if we had the nerve to ask them for such subjective feedback on our behavior.

REFERENCES

Berleman, W. C., Seaberg, J. R., and Steinburn, T. W. The delinquency prevention experiment of the
Seattle Atlantic Street Center: A final evaluation. Social Science Review, 1972, Sept., 323-346.
Boring, E. G. A history of experimental psychology. New York: Appleton-Century-Crofts, 1950.
Braukmann, C. J., Fixsen, D. L., Kirigin, K. A., Phillips, E. A., Phillips, E. L., and Wolf, M. M. Achievement Place: The training and certification of teaching-parents. In W. S. Wood (Ed.), Issues in evaluating behavior modification. Champaign, Illinois: Research Press, 1975. Pp. 131-152.
Braukmann, C. J., Kirigin, K. A., and Wolf, M. M. Achievement Place: The researchers' perspective. Paper presented at the meeting of the American Psychological Association, Washington, D.C., September, 1976.
Brigham, T. A., Graubard, P. S., and Stans, A. Analysis of the effects of sequential reinforcement contingencies on aspects of composition. Journal of Applied Behavior Analysis, 1972, 5, 421-430.
Campbell, Angus. Subjective measures of well-being. American Psychologist, 1976, 31, 117-124.
Conrad, S. R. and Wincze, J. P. Orgasmic reconditioning. A controlled study of its effects upon the sexual arousal and behavior of adult male homosexuals. Behavior Therapy, 1976, 7, 155-166.
Fawcett, S. B. and Miller, L. K. Training public-speaking behavior: an experimental analysis and social validation. Journal of Applied Behavior Analysis, 1975, 8, 125-136.
Foxx, R. M. and Azrin, N. A. Restitution: A method of eliminating aggressive-disruptive behavior of retarded and brain damaged patients. Behaviour Research and Therapy, 1972, 10, 15-27.
Hasse, R. F. and Tepper, D. T. Nonverbal components of empathetic communication. Journal of Counseling Psychology, 1972, 19, 417-424.
Jones, R. J. and Azrin, N. A. Behavioral engineering: stuttering as a function of stimulus duration during speech synchronization. Journal of Applied Behavior Analysis, 1969, 2, 223-230.
Kent, R. N. and O'Leary, K. D. A controlled evaluation of behavior modification with conduct problem children. Journal of Consulting and Clinical Psychology, 1976, 44, 586-596.
Kent, R. N., O'Leary, K. D., Diament, C., and Dietz, A. Expectation biases in observational evaluation of therapeutic change. Journal of Consulting and Clinical Psychology, 1972, 42, 774-780.
Levi, L. and Anderson, L. Psychosocial stress: Population, environment, and the quality of life. Holliswood, N.Y.: Spectrum Press, 1975.
Maloney, K. B. and Hopkins, B. L. The modification of sentence structure and its relationship to subjective judgements of creativity in writing. Journal of Applied Behavior Analysis, 1973, 6, 425-434.
McMichael, J. S. and Corey, J. R. Contingency management in an introductory psychology course produces better learning. Journal of Applied Behavior Analysis, 1969, 2, 79-84.
Minkin, N., Braukmann, C. J., Minkin, B. L., Timbers, G. D., Timbers, B. J., Fixsen, D. L., Phillips, E. L., and Wolf, M. M. The social validation and training of conversation skills. Journal of Applied Behavior Analysis, 1976, 9, 127-140.
New Webster's Vest Pocket Dictionary. Ottenheimer Publishers, Inc., 1962.
O'Leary, K. D., Kent, R. N., and Kanowitz, J. Shaping data collection congruent with experimental hypotheses. Journal of Applied Behavior Analysis, 1975, 8, 43-51.
Piliavin, I. and Briar, S. Police encounters with juveniles. American Journal of Sociology, 1964, 70, 206-214.
Porterfield, J. K., Herbert-Jackson, E., and Risley, T. R. Contingent observation: an effective and acceptable procedure for reducing disruptive behavior of young children in a group setting. Journal of Applied Behavior Analysis, 1976, 9, 55-64.
Schumacher, E. F. Small is beautiful: economics as if people mattered. New York: Harper & Row, 1973.
Skinner, B. F. Science and human behavior. New York: Macmillan Co., 1953.
Skinner, B. F. Cumulative record. New York: Appleton-Century-Crofts, Inc., 1959.
Watson, John B. Behaviorism. Chicago: The University of Chicago Press, 1930.
Werner, J. S., Minkin, N., Minkin, B. L., Fixsen, D. L., Phillips, E. L., and Wolf, M. M. Intervention package: An analysis to prepare juvenile delinquents for encounters with police officers. Criminal Justice and Behavior, 1975, 2, 55-83.
Willner, A. G., Braukmann, C. J., Kirigin, K. A., Fixsen, D. L., Phillips, E. L., and Wolf, M. M. The training and validation of youth-preferred social behaviors with child-care personnel. Journal of Applied Behavior Analysis, 1977, 10, 219-230.

Received 15 October 1976.
(Final Acceptance 12 August 1977.)
