
Work & Stress, October–December 2007; 21(4): 348–362

Evaluating organizational-level work stress interventions: Beyond traditional methods

TOM COX, MARIA KARANIKA, AMANDA GRIFFITHS, & JONATHAN HOUDMONT

Institute of Work, Health & Organisations, University of Nottingham, UK

Abstract
Since the early 1990s, there has been a growing literature on organizational-level interventions for
work-related stress, and associated calls for such interventions to be evaluated. At the same time,
doubts have been expressed about the adequacy of traditional scientific research methods in applied
psychology (the natural science paradigm) in providing an effective framework for such evaluations.
This paper considers some of the philosophical and methodological issues raised by evaluation
research in relation to organizational-level interventions for work-related stress. Four key issues are
discussed: the concept of a study being "fit for purpose" in relation to research designs and the nature
of acceptable evidence; the issue of control of research conditions in real-world studies; the need to
evaluate process as well as outcome, including the interrelated nature of process and outcome; and the
interpretation of imperfect evidence sets. The starting point of this paper is the reality of
organizational life, which is complex and continually changing. Its main objective is not to offer an
alternative to a scientific approach but to argue for a more broadly conceived and eclectic framework
for evaluation that acknowledges the limitations of the traditional approach. It espouses an approach
that is reflective of the reality of organizational life and in which the methods used for evaluating an
intervention are fit for purpose. The paper concludes by offering an outline framework for this broader
approach to the evaluation of interventions.

Keywords: Work-related stress, risk management, organizational level interventions, evaluation, methodology, process, fit for purpose, control

Introduction
In the early 1990s, an apparent increase in the size of the problem of work-related stress,
combined with new requirements on employers under European health and safety
legislation, forced serious consideration of how this problem might be best managed in
the UK. However, this problem was not the UK’s alone and similar discussions took place
in most other Member States of the European Union and elsewhere in the world. In 1993, a
report to the UK Health & Safety Executive (HSE) recommended that work-related stress
be treated as an occupational health issue and that, in line with common practice in health
and safety, a risk management approach be adopted (Cox, 1993). This recommendation
proved consistent with the approach to the management of occupational health and safety
problems later published by the European Commission in 1996. The focus of the HSE’s
approach in the UK is on the design and management of work, on prevention and on organizational-level interventions (Cox, Griffiths, Barlow, Randall, Thomson, & Rial-González, 2000; Mackay, Cousins, Kelly, Lee, & McCaig, 2004): the organization is seen as the "generator" of the stress-related risk to health. Over the following decade, government
bodies and research institutes across Europe and elsewhere worked to develop their own
approaches to the management of work-related stress. Many different initiatives and
procedures have emerged, under a variety of names. However, most involve evidence-
based problem solving and can be broadly described as forms of risk management (Leka
et al., 2007; Oeij et al., 2006).
Since the early 1990s, there has been a slowly growing scientific literature on
organizational-level interventions for work-related stress (Levi, Sauter, & Shimomitsu,
1999) and associated calls for such interventions to be evaluated (Sauter et al., 2002;
Schaufeli & Kompier, 2001). Doubts have been expressed about the adequacy of the
natural science paradigm (the traditional scientific experimental approach to research),
used on its own, in providing an effective framework for such evaluations (Griffiths, 1999;
Griffiths & Schabracq, 2003; Ovretveit, 1998). This paradigm, usually involving a pre- and
a post-test, an experimental and a control group, and random assignment of research units
to the groups involved, is not the only scientific method, and it is argued that a broader
more eclectic and complementary approach to the evaluation of interventions is required.
The real world of organizations and organizational life is complex and ever evolving. It
may be best described in terms of complex adaptive systems (Dooley, 1997; Schneider &
Somers, 2006). Typically, the elements of such systems are correlated, interact in both
synergistic and inhibitory ways, express non-linearity in their relationships, and show
dependency on both time and context (Karanika, 2006). The natural science paradigm,
with its emphasis on reductionism, simple mechanistic causal relationships, and structural
determinism, is poorly suited to the study of organizations and organizational life and here,
too, the challenge is that of evaluating organizational-level interventions for work-related
stress. The philosophical and methodological issues that characterize the traditional
research approach in applied psychology need to be critically examined. Such a critique
is at the heart of this paper, allowing for the emergence of new approaches in applied
psychology to complement that research paradigm.
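
As a purely hypothetical illustration of the non-linearity point (not drawn from any of the studies cited here), the sketch below simulates an inverted-U relation between a work characteristic and a well-being score and shows how a straight-line analysis, of the kind assumed by the traditional paradigm, can report almost no association; all variable names and parameter values are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical standardized job-demand scores and a well-being outcome that
# follows an inverted-U relation (moderate demand best), plus random noise.
demand = rng.uniform(-2, 2, size=500)
wellbeing = 1.0 - demand**2 + rng.normal(0, 0.5, size=500)

# Straight-line fit, as assumed by a purely linear analysis.
slope, intercept = np.polyfit(demand, wellbeing, deg=1)
linear_resid = wellbeing - (slope * demand + intercept)

# Quadratic fit that allows for the curvilinear relation.
quad = np.polyfit(demand, wellbeing, deg=2)
quad_resid = wellbeing - np.polyval(quad, demand)

print(f"linear slope                  ~ {slope:.2f}")  # near zero
print(f"variance explained, linear    ~ {1 - linear_resid.var() / wellbeing.var():.2f}")
print(f"variance explained, quadratic ~ {1 - quad_resid.var() / wellbeing.var():.2f}")
```
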
The starting point, however, lies in the applied nature of the parent discipline. Applied
psychology is defined not by the methodology that it uses but by the problems that it seeks
to address, and its philosophy and strategy for addressing those problems. It is concerned
with problems that are real and of societal importance. In this vein, Colman (1994) talks of
applied psychology being "the application of psychological theories and research findings to practical problems of everyday life" (pp. xi–xii). It is both empirical and pragmatic. It is
evidence-based and seeks to establish the validity of its theories and models in terms of their
application and the consequences of such application. Applied psychology is practised
through evidence-based and systematic problem solving. Research provides the foundation
to that problem solving, and therefore it should combine theoretical and methodological
rigour with practical relevance (Hodgkinson, Herriot, & Anderson, 2001). From this
perspective, a number of issues are addressed here: that of the acceptability of evidence and
the related issue of study designs being fit-for-purpose, the question of the focus of
evaluation research (outcomes or processes), and, finally, the challenge of aggregating data
(evidence) from studies that are held by some to be less than perfect in terms of the
traditional research approach.

Acceptable evidence: Fit for purpose


The risk management approach argues for the prevention of work-related stress through
organizational-level interventions, focused on prevention by means of the better design and
management of work and of its social and organizational contexts (Cox et al., 2000;
Griffiths, 1998). Unfortunately, organizational level prevention appears to be the least
common type of work stress intervention and, although the literature is growing, there still
appear to be a relatively small number of studies that have examined its effectiveness
(Jordan, Gurr, Tinline, Giga, Faragher, & Cooper, 2003; Loisel, Durand, Baril, Gervais, &
Falardeau, 2005; Murphy, 1996; van der Hek & Plomp, 1997). Furthermore, criticism has
been levelled at the lack of methodological rigour in many of the studies that do exist
(Briner & Reynolds, 1999; Parkes & Sparkes, 1998; Reynolds, 1997; Reynolds & Shapiro,
1991; Rick, Thomson, Briner, O’Regan, & Daniels, 2002). The criticisms include, among
others, lack of a control group, lack of random allocation of participants to conditions,
scarcity of longitudinal studies, unclear testing of theory, inappropriate data analysis
strategies, and omitting to control for contextual factors. For some, therefore, there is a
paucity of acceptable evidence in support of the claimed effectiveness of organizational-level
interventions for work stress. This has led to the conclusion that, in terms of health benefits,
organizational-level interventions may be limited in what they have achieved (or can
achieve).

Defining acceptable evidence


The first key question is "how do we determine what is acceptable evidence?" The question of acceptability, when applied to evidence, appears for some to be determined solely by the research design and, possibly, the way in which the data are analysed. The traditional
research approach in psychology, which was borrowed from the natural sciences and which
espouses experimental designs and randomized controlled trials (RCTs), is treated by many
as the gold standard (Kaptchuk, 2001; Macdonald, Veen, & Tones, 1996; Robson, 2002).
This is despite the fact that they are not easy to implement in applied psychological research
(Sauter et al., 2002). The gold standard is defined by experimental designs, the random
allocation of participants to conditions, the collection of quantitative data, direct tests of
causality, a focus on outcomes, and the assumption of linear relationships.
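
For concreteness, the following sketch shows the skeleton of such a "gold standard" design in analysis terms: simulated pre- and post-test scores for randomly assigned intervention and control groups, compared on their change scores. It is a hypothetical illustration only; the sample sizes, scores, and the five-point effect are invented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 60  # participants per group (invented)

# Random assignment: both groups start from comparable pre-test scores.
pre_intervention = rng.normal(50, 10, n)
pre_control = rng.normal(50, 10, n)

# Post-test: the intervention group improves by ~5 points (invented effect).
post_intervention = pre_intervention + rng.normal(5, 8, n)
post_control = pre_control + rng.normal(0, 8, n)

# Outcome-focused analysis: compare change scores between the two groups.
change_i = post_intervention - pre_intervention
change_c = post_control - pre_control
t, p = stats.ttest_ind(change_i, change_c)

print(f"mean change, intervention group = {change_i.mean():.1f}")
print(f"mean change, control group      = {change_c.mean():.1f}")
print(f"t = {t:.2f}, p = {p:.3f}")
```
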
The obvious question is how much can such research designs tell us about real-world
situations, in which controlled comparisons are not always possible (either practically or
ethically), where relationships are complex and may defy explanation in terms of simple
causal relationships, where they are as likely to be non-linear as they are to be linear, and
where there are often no easily measured outcomes that are meaningful (Black, 1996;
Guastello, 1993; Karanika, 2006; Nielsen, Randall, & Albertsen, 2007). This is the basis of
the argument that the traditional experimental approach in applied psychology may be
inadequate for exploring the complex and changing world of organizations (Griffiths &
Schabracq, 2003; Ovretveit, 1998). It is argued here that the key issue in determining whether any data collected represent acceptable evidence is whether those data are fit for purpose.

Fit for purpose


Fitness for purpose can be defined as the correct approach to obtaining data of appropriate
quality (Thompson & Fearn, 1996) judged against the purpose of obtaining those data.
This concept is not new and has been used in scientific research for many years; for
example, in relation to questions of evaluation in higher education (UK) (Harvey & Green,
1993), the quality of analytical methods in chemical biology (Thompson & Fearn, 1996),
the surveillance of sexually transmitted infections (Catchpole, Harris, Renton, & Hickman,
1999), the quality of social science research (Boaz & Ashby, 2003), and, most recently, the
development of methods for clinical trials (Wagner, Williams, & Webster, 2007).
‘‘One question that is rarely made clear in applied psychology is what is the strategic
purpose of a study?’’ This is largely because there is a strong prevailing assumption that all
studies are designed to extend our basic knowledge of human and systems behaviour by
uncovering and testing generalizable (and publishable) laws of such behaviour. Herriot
(1984), in his book Down from the ivory tower, made a distinction between motivation for
academic publication and career advancement and motivation for real-world problem-
solving and "making a difference." Such a distinction, if it exists, may be helpful in
identifying and contrasting different purposes for conducting research and the way in which
those different purposes map onto research questions and research methodology.
Following from Herriot (1984), the different purposes for research can be contrasted at a
number of levels (such as nature and aetiology of research question, choice of methodology,
and desired outcome) and, not least, in terms of the interaction between purpose and
methodology. In the case of academic purpose, a researcher’s expertise or specialization,
adherence to their discipline, and the framing of research aims by theoretical and
methodological interests are often strong factors shaping the research question and the
way in which it is answered. For real-world problem solving, however, research questions
are driven by their societal importance, and methods are determined by the nature of the
problem and its possible solution. Both need to be open to innovative influences from other
disciplines. This is the level of "making a difference" in research and, often, this leads to a truly interdisciplinary approach. The important point here is that different research
purposes can require different research methods, and that choice of method should be
determined by its being fit for purpose.
In reality, there are a number of different general purposes for research in relation to the
evaluation of organizational level interventions. For example, studies may be carried out to
inform management decision-making for future change, for benchmarking, to make clear to
management why an intervention has worked or failed, or to identify risks to health that
exist at work and suggest ways in which they might be reduced. Examples are provided
through the research that supported the development of the HSE’s Management Standards
initiative (see Griffiths, Cox, Karanika, Khan, & Tomás, 2006; Mackay et al., 2004).
With the possibility of there being a number of different purposes for any planned study
comes the new question: "is the research method chosen fit for purpose?" The question here is which of the two key concepts, the research method or the problem and research question, should be the fixed point. Do we tailor our research questions to fit the traditional research method and reject those questions that do not fit, however important they are for society, or do we develop research methods that allow us to answer the important questions? Is the research
method the master or the servant of evaluation science? It is clear to us that the research
agenda should be determined by the importance of the problems to be addressed and that
the research method chosen is the tool by which such problems can be solved.
Koch (1959) also believed that problems should come before methods, but observed that this was often not the case, such that "man's stipulation that psychology should be adequate to science (has) outweighed his stipulation that it be adequate to man" (p. 783). Griffiths and Schabracq (2003) wrote in a similar
vein:

Partly as a result of the overzealous imitation of the favourite methods of natural science,
in order to please the current psychological establishment, those engaged in organiza-
tional intervention research may have sometimes put the cart before the horse. That is,
methods have been put before problems (p. 184).

What Guion (1998) termed the "Procrustean approach" to research, or only working on problems that fit established theories and methods, should be avoided. The Procrustean approach could lead to what Basch and Gold (1986) termed a Type IV error; that is, the "evaluation of a program that no one cares about and is irrelevant to decision-makers" (p.
301). Schein (1991) has observed:

. . . we have largely adopted a traditional research paradigm that has not worked very well,
a paradigm that has produced very reliable results about very unimportant things, and
sometimes possibly invalid results altogether. In that process I believe we have lost touch
with some of the important phenomena that go on in organizations, or have ignored them
simply because they were too difficult to study by the traditional methods available (p. 2).

Two things have to be involved in advancing this situation: first, consideration of the
societal importance of the problem, here possibly its organizational relevance, and second,
following Hodgkinson et al. (2001), its ethical and theoretical status. Where questions are
acknowledged to be important to society or a particular working population or organization,
and are determined to be legitimate in ethical and theoretical terms, then it is argued here
that we must consider our methods and attempt, if necessary, to develop new ones that are fit for purpose in answering the research question. The applied researcher may not have the luxury of "walking away" from questions that do not neatly fit the traditional research
approach. Being fit for purpose may be determined through a series of five tests: Is the
problem to be addressed important to society? What is the purpose of researching it? What
is the particular research question? Can the research method answer the research question
and meet the purpose of the study? Is it theoretically and ethically legitimate?
Of course, this assumes that one can access the situation and people under study, apply
the research method, be sure of compliance with the study requirements, and actually
capture the data. This cannot always be guaranteed (Black, 1996; Semmer, 2006). Randall
(2002) noted that:

Practical considerations such as the level of disruption, organization, or planning associated with a study design, or limits on sample size and access to participants, often
result in compromises (in study design), even before the interventions are implemented
[. . .] The priorities of the business, its established systems and structures inevitably have
implications for study design. Once interventions are implemented their delivery often
moves further outside the control of the researcher (p. 22).

In the real world, organizations are often reluctant to have their internal initiatives examined
and reported on in the public domain. Although, in theory, many of the problems of
conducting experimental studies in real-world situations might be overcome, the practical
implications for organizations, researchers, and funders mean that this is usually impossible
(Black, 1996).

Control in the real world: Reality and design


A second and closely related issue is that of the management and control of the research
conditions in real world research. Intervention implies change. In organizations, at least,
such change is set against a background of continual evolution and the necessary adaptation
to turbulent socio-economic and political environments. There is little that is static about
organizations and their lives and behaviour (Dooley, 1997; Schneider & Somers, 2006).
Furthermore, organizations do not exist to be "case studies," nor are their staff employed to be "participants" in such studies. As a result, it is not often possible to exert sufficient
control over organizations, their behaviour, or that of their employees and social partners,
to allow a pure experimental design to be the framework for any evaluation. Two exceptions
may exist. First, the necessary level of control required to impose an experimental design
might be possible in particular organizations where there is a strong power structure, for
example, in the army, police or fire services, hospitals, schools, or prisons. Second, there are
also situations that can provide natural experiments. Indeed, many disciplines have used
natural experiments in research to advance understanding; in economics (see, for example,
Meyer, 1995; Rosenzweig & Wolpin, 2000), public health (Pettigrew et al., 2005;
Wilkinson, 1990), nutrition (Susser, 1981), and various areas of applied psychology (social
psychology: Oettingen, Little, Lindenberger, & Baltes, 1994; occupational health psychol-
ogy: Kompier, Aust, van den Berg, & Siegrist, 2000; Parkes, 1982; environmental
psychology: Guagnano, Stern, & Dietz, 1995). Outside these two exceptions, the reality
is that in intervention research true experiments and RCTs are difficult if not impossible to
conduct (Black, 1996; Semmer, 2006). Indeed, it might be argued that, where it has been
possible to exert the necessary control on conditions, the situation or organization might be
judged to be atypical and unrepresentative of the wider population of organizations.
Recognition of this reality has led several researchers to develop the notion of the quasi-
experiment, most notably Campbell and Cook (see, for example, Campbell & Stanley,
1963; Cook & Shadish, 1994). Essentially, quasi-experimentation recognizes the inability to
be able to insist on the conditions that define the ideal experiment, such as the random
allocation of participants to conditions, the design of those conditions, and the availability
of a control group, and it offers strategies to limit the confounding effects of these failures
through innovative but complex designs, adapted statistical analyses, and conservative
interpretation. As with the ideal experiment, much of what is covered by the concept of quasi-experimentation remains focused on evaluating outcomes. There is a case for extending our interest beyond outcomes.
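
One widely used quasi-experimental strategy, where random allocation is impossible, compares change over time in an intervention site with change in a non-equivalent comparison site (a difference-in-differences logic). The sketch below is a minimal, hypothetical illustration of that idea rather than a recipe from the cited authors; the sites, baseline difference, and effect sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two work sites that were NOT randomly assigned: the intervention site starts
# worse off, a baseline difference that a true experiment would have removed.
pre_a = rng.normal(45, 10, 80)          # intervention site, lower baseline
post_a = pre_a + rng.normal(6, 8, 80)   # changes by ~6 (effect + shared trend)
pre_b = rng.normal(55, 10, 80)          # comparison site
post_b = pre_b + rng.normal(1, 8, 80)   # shared secular trend of ~1

# A naive post-only comparison is confounded by the baseline difference.
naive = post_a.mean() - post_b.mean()

# Difference-in-differences: change in the intervention site minus change in the
# comparison site, which nets out stable site differences and shared trends.
did = (post_a - pre_a).mean() - (post_b - pre_b).mean()

print(f"naive post-only difference = {naive:.1f}")   # misleading
print(f"difference-in-differences  = {did:.1f}")     # near the invented ~5 point effect
```
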

Outcomes and processes


The third important consideration focuses on the question of process in relation to that of
outcome. First, we must define what is meant by "process" and "outcome." Definitions of
process in relation to intervention research tend to focus on it being a series of actions,
changes or functions bringing about a result. Process therefore refers to the flow of
activities; essentially who did what, when, why, and to what effect. In systems thinking, it
refers to the things that happen to translate input into output. Process-based evaluation is
concerned with "what and how things happened and with issues of compliance" (p. 52, Cox et al., 2000). By comparison, outcomes are usually defined as "an end result or consequence" and outcome-based evaluation refers to "what the result was; the difference that was made" (p. 52, Cox et al., 2000).

The point for consideration here is that while outcome evaluation can establish the fact of
an effect (or otherwise), alone it does not offer an explanation of how that effect happened
or did not happen or happened only in part (Griffiths, 1999; Landsbergis & Vivona-
Vaughan, 1995; Nielsen, Fredslund, Christensen, & Albertsen, 2006; Nielsen et al., 2007;
Nytrø, Saksvik, Mikkelsen, Bohle, & Quinlan, 2000; Saksvik et al., 2007). Science is
concerned with knowing not only that something happened but also why and how it
happened. More is usually needed in terms of understanding the processes involved:
process is important in its own right and not simply as a proxy for outcomes that are
difficult to measure or access.
Outcomes are often presented as though they were important measurable events
representing a fixed or steady state at some defined and important point in time. This
appearance of stability may be a reflection of the measurement and not of the reality of any
situation. Take, for example, the measurement of satisfaction or blood pressure. Both show
state-like behaviour and naturally change over time. Fixing their measurement in time, as a
decision or measurement procedure, does not confer more permanent qualities on them.
Indeed, outcomes can themselves be processes. In research on the effects of job
displacement (for example, Eliason & Storrie, 2006), many researchers have treated
organizational closure as a discrete event (at a point in time). In reality, the
management of any closure may spread over months if not years. Outcomes are often
embedded in process and simply represent a decided measurement at a point in time. The
argument here is that they do not have importance beyond their context in process.
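
A small, entirely invented simulation can illustrate why a single measurement occasion gives an outcome only an appearance of stability: a state-like score drifts from week to week, and the conclusion drawn can depend heavily on the week in which it happens to be measured.

```python
import numpy as np

rng = np.random.default_rng(3)

# A hypothetical weekly satisfaction score drifting over a year: state-like,
# continually changing, with no single fixed "true" value to capture.
weeks = 52
score = np.empty(weeks)
score[0] = 3.5
for t in range(1, weeks):
    score[t] = 3.5 + 0.8 * (score[t - 1] - 3.5) + rng.normal(0, 0.3)

print(f"mean over the year      = {score.mean():.2f}")
print(f"lowest weekly snapshot  = {score.min():.2f}")
print(f"highest weekly snapshot = {score.max():.2f}")
# Two evaluations measuring in different weeks could reach different conclusions
# about 'the outcome', although the underlying process is exactly the same.
```
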
Failure to consider process may drive a number of errors in the evaluation of
interventions. A common one is Type III error, which is defined as solving the wrong
problem, such as "the conduct of an evaluation on a programme that has not been sufficiently implemented" (p. 300, Basch & Gold, 1986; also see Schwartz & Carpenter,
1999). This might lead to the conclusion that an intervention was ineffective when it was
the delivery of that intervention that was faulty (Dobson & Cook, 1980; Lipsey & Cordray,
2000). Furthermore, an understanding of the processes involved may help explain the
results of outcome evaluation or even allow designs to be adapted to better reveal outcomes
(Randall, Griffiths, & Cox, 2005). In the 1960s, Suchman (1967) stated:

In the course of evaluating the success or failure of a programme, a great deal can be
learned about how and why a programme works or does not work [ . . .] An evaluation
study may limit its data collection and analysis simply to determining whether or not a
programme is successful [. . .] However, an analysis of process can have both adminis-
trative and scientific significance, particularly where the evaluation indicates that a
programme is not working as expected. Locating the cause of the failure may result in
modifying the programme so that it will work, instead of its being discarded as a complete
failure (p. 66).
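
A hypothetical numerical illustration of the Type III error described above: if an intervention only produces benefits where it has actually been delivered, an outcome analysis that ignores implementation can make the intervention look much weaker than it is. All numbers below are invented.

```python
import numpy as np

rng = np.random.default_rng(4)
n_teams = 40

# Implementation 'dose' per team: suppose roughly half the teams carried the
# intervention through (dose near 1) and the rest barely started (dose near 0).
dose = np.where(rng.random(n_teams) < 0.5,
                rng.uniform(0.7, 1.0, n_teams),
                rng.uniform(0.0, 0.2, n_teams))

# Hypothetical change in a strain score: the benefit is proportional to dose.
change = -4.0 * dose + rng.normal(0, 2.0, n_teams)

print(f"overall mean change, implementation ignored = {change.mean():.2f}")
print(f"mean change, well-implemented teams         = {change[dose > 0.5].mean():.2f}")
print(f"mean change, poorly implemented teams       = {change[dose <= 0.5].mean():.2f}")
# Pooling all teams dilutes the apparent effect; looking at implementation
# (process) shows that delivery, not the intervention idea, explains the result.
```
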

Essentially, questions related to process address the "black box" of organizational-level interventions, possibly providing information that can be generalized to shape subsequent efforts and help determine subsequent success (Karachi, Abbott, Catalano, Haggerty, & Fleming, 1999). The argument is not for evaluation of process rather than outcome but for a combined approach. This would sit comfortably with evaluation models in the health sciences, such as that of Donabedian (1972, 1986), framed in terms of
structure, process, and outcome.
Public health is one area in which there has been a noticeable increase in the use of
process evaluation (see, for example, Donabedian, 1972, 1986; Fitzgerald, 1999; Harvey &
Wensing, 2003). Steckler and Linnan (2002) have explained this increase in a way
consistent with the current argument. They have argued that public health interventions
have become increasingly complex in terms of their mix of components and the nature of their focal sites and audiences. This makes it more challenging to examine the effectiveness of each of their components taken in isolation, and requires an understanding of their interplay and context. It has become important to be able to explain
why certain outcomes are achieved and which of the mix of components was the more
effective in that respect. (This is especially important where large budgets are allocated to
multilevel community trials.) Process evaluation, they argue, can increase our under-
standing about the relationships and interactions among the components of a programme,
and support the development of the underlying theory. It may therefore improve the
likelihood of future theory-informed interventions being successful.

What types of process should be considered in the evaluation of organizational-level interventions?


Current research on organizational-level interventions for work-related stress suggests a
number of possible process variables, including the nature of managerial support for those
interventions and those affected by them, employees’ readiness for and acceptance of the
need for change, their motivation and their willingness and ability to participate, their role
in the decision-making process, the resources available to support change, and the quality of
social relations and trust within the organization (Cox, Karanika, Mellor, Lomas,
Houdmont, & Griffiths, 2007; Nielsen et al., 2007; Nytrø, Saksvik, & Torvatn, 1998;
Nytrø et al., 2000; Taris et al., 2003). These variables reflect two different things: first, the
management of the intervention process (or implementation), and second, the organiza-
tional context for that intervention in terms of the organizational and social processes in
which it is embedded. This is an important distinction, with both types of processes being
different from those of the intervention itself.
Such distinctions find resonance in the literature. There is, for example, a growing body
of evidence to suggest that the management of the implementation of any intervention is an
important key to its success (Cox et al., 2007; Kompier, Cooper, & Geurts, 2000;
Mikkelsen, Saksvik, & Landsbergis, 2000; Nytrø et al., 2000; Parker & Wall, 1998; Randall,
Cox, & Griffiths, 2007; Saksvik et al., 2007). Karsh, Moro, and Smith (2001) have argued that "the study of the implementation process is crucial both for our understanding of future research results and for understanding the variance in outcomes" (p. 89). At the
same time, it has also been suggested that variability in macro processes in the wider
organizational, social and socio-economic, and political contexts may explain why some
interventions are successful while others are not (Goldenhar & Schulte, 1994). It has been
suggested that the probability of failure of any organizational intervention is about 50%
(Fullan, Miles, & Taylor, 1981), possibly as a result of the influence of such macro
processes.

Interpreting the imperfect: Aggregation of evidence


The fourth and final issue here is that of interpreting and learning from "non-ideal" studies.
Several reviews of organizational-level interventions have concluded that few evaluations
conform to the ideal requirements of the experimental approach and that the others cannot
offer reliable and valid insight into their effectiveness. Taken individually, study by study, this is probably true, and each study on its own may mean very little.
However, can this imperfect evidence be aggregated to any effect?

Kunz and Oxman (1998) compared randomized and non-randomized clinical trials in
terms of patient outcomes. Randomized trials were treated as the "gold standard" against which the non-randomized trials, which fell short largely in relation to concealed random allocation, were compared. With concealed allocation, those recruiting and allocating patients cannot foresee which condition (treatment or control) a patient will be assigned to. The authors reported that
their evidence suggested that the non-randomized trials, compared to the randomized trials,
caused a distortion of apparent treatment effects leading to both larger and smaller
estimates (by trial). However, Kunz and Oxman also reported that outcomes in randomized
and non-randomized treatment groups, for the same intervention, were "frequently similar" and that it was differences between randomized controls and historical controls
(non-randomized trials) that led to the distortions of effect. Their data when comparing
randomized and non-randomized trials for different interventions were not clear. They also
looked specifically at concealed allocation of participants to conditions compared with non-
concealed allocation and reported large distortions of effect. These appear to drive their
overall conclusions. It is possible that the nature of the illness being treated, of the clinical
interventions used, and of their outcomes might have determined the impact of
concealment. Non-concealment might prove a "fatal flaw" in clinical trials but not
necessarily be fatal in, say, organizational interventions where a different set of dynamics
might operate.
The Kunz and Oxman (1998) study does not detract from the value of RCTs, where
these are possible, but it does begin to highlight the complexity of the situation and leaves
open the possibility that examination of imperfect studies might provide some food for
thought. Methodological inadequacy, judged against the requirements of traditional
scientific method, and uncertainty in measurement, are common in applied research. In
any study, these increase the likelihood of noise in the data and reduce the probability that
the study will reveal a reliable effect. If the study does reveal an effect, then there may be a
risk of an inflated (or deflated) estimation of effect size. If a large number of studies, none of them ideal but differing in their inadequacies, present the same results, then there is an argument that those results might be accepted if they are theoretically and practically
credible. Collectively, the findings of these studies could be challenged on the grounds of
absolute accuracy. However, where such accuracy is not the key issue, proving the
robustness of their findings and their practical and theoretical validities might be sufficient
to allow them to be considered, depending on the purpose of the research. In contrast, the
probability of false positives will increase across such studies where they share systematic
error, as it is more difficult to separate context-specific errors from valid findings.
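
The argument about aggregating imperfect studies can be made concrete with a small, invented simulation: when many noisy studies each carry their own independent bias, their pooled estimate can sit close to the true effect even though no single study is convincing; when they all share the same systematic bias, pooling simply reproduces that bias.

```python
import numpy as np

rng = np.random.default_rng(5)

true_effect = 0.30
n_studies = 20

# Case 1: imperfect studies with large noise and study-specific, independent biases.
independent_bias = rng.normal(0, 0.15, n_studies)
est_independent = true_effect + independent_bias + rng.normal(0, 0.20, n_studies)

# Case 2: the same studies, but all sharing one systematic bias in the same direction.
shared_bias = 0.25
est_shared = true_effect + shared_bias + rng.normal(0, 0.20, n_studies)

print(f"true effect                      = {true_effect:.2f}")
print(f"pooled mean, independent biases  = {est_independent.mean():.2f}")  # near the truth
print(f"pooled mean, shared bias         = {est_shared.mean():.2f}")       # truth + bias
print(f"spread of single-study estimates = {est_independent.std():.2f}")   # why one study says little
```
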
Furthermore, it might be possible to learn from individual studies that are imperfect
when judged against the ideal of the traditional research approach. For example, analyses in
which there are multiple sources of information with different types of data being captured
using different methods, but showing the same trends, might be taken to provide a true
picture. A not dissimilar point has been made on several occasions by Kristensen and his
colleagues (Kompier & Kristensen, 2001; Stole & Kristensen, 1996) and by Semmer
(2003) in relation to small-scale studies in the real world. Thus, there is a valid argument that, rather than being dismissed, studies judged methodologically as not ideal should be
scrutinized collectively as they might together yield useful insights. Semmer (2003) has
emphasized the importance of providing detailed descriptions of projects rather than simply
criticizing (allegedly) poor designs.

Framework for the evaluation of organizational-level interventions


What is already established is often more obvious and more detailed than a new idea.
Furthermore, what is already established has probably developed over many years through
much effort while new ideas are, by definition, just starting out on that journey. In this
paper, we offer more than an extension to existing critiques of the traditional research
approach in relation to the evaluation of organizational-level interventions: we offer a
framework for a broader scientific approach. It is argued here that there are three levels at
which we might think about evaluations of organizational-level interventions, a fixed point
in our thinking as applied psychologists, and some guiding principles for evaluating
organizational-level interventions.

Evaluation model
The different levels for evaluating organizational-level interventions can be arranged
hierarchically to describe an evaluation model. The central core is the intervention process
itself and its outcomes. The next domain is represented in the implementation process. Both
of these processes are embedded in the third domain represented in macro processes: the
social, organizational, socio-economic, and political. Taken together, these three levels fit
comfortably within approaches to evaluation in public health and health services research,
for example, Donabedian's (1972, 1986) conceptualization of "structure, process and outcome" and Goldenhar, LaMontagne, Katz, Heaney, and Landsbergis's (2001) frame-
work of intervention effectiveness, implementation, and development research.
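
As a purely illustrative sketch, and not an instrument taken from the cited frameworks, the three nested levels of this evaluation model could be recorded in an evaluation plan along the following lines; all field names and example values are invented.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class InterventionCore:
    """Innermost level: the intervention itself and its intended outcomes."""
    activities: List[str]
    outcome_measures: List[str]

@dataclass
class ImplementationProcess:
    """Second level: how the intervention was planned, delivered and received."""
    management_support: str
    employee_participation: str
    resources_available: str

@dataclass
class MacroContext:
    """Outer level: organizational, social, socio-economic and political context."""
    organizational_events: List[str] = field(default_factory=list)
    wider_context_notes: List[str] = field(default_factory=list)

@dataclass
class EvaluationPlan:
    core: InterventionCore
    implementation: ImplementationProcess
    context: MacroContext

plan = EvaluationPlan(
    core=InterventionCore(
        activities=["redesign of shift rotas"],
        outcome_measures=["self-reported strain", "sickness absence"],
    ),
    implementation=ImplementationProcess(
        management_support="line managers briefed and involved",
        employee_participation="staff group co-designed the rotas",
        resources_available="time allocated for planning meetings",
    ),
    context=MacroContext(organizational_events=["restructuring in one unit"]),
)
print(plan.implementation.management_support)
```
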

Fixed point
The fixed point is largely context specific. In the context of organizational-level
interventions, it is a commitment, as applied psychologists, to evidence-based problem-
solving (see, for example, Cox et al., 2000; Kompier, 2004; Mackay et al., 2004) developed
through research designed to address important problems but which is both theoretically
and practically valid.

Principles
Probably the most important principle for evaluating organizational-level interventions
relates to use of the concept of designs being fit for their purpose and of data being
acceptable (in the context of the defined purpose) in terms of being good enough, as
discussed earlier. A second principle is the need to be innovative in the development of good
enough designs. Apparent innovation in this area might simply be achieved by under-
standing the science of other disciplines and suitably importing their design ideas and
principles. A good example of this is the use of adaptive designs in the evaluation of
interventions, borrowed from health promotion studies. Using measures of the intervention process, such as organizational penetration, and of its emergent variability may allow the statistical analyses to reveal the effects of the intervention (see Randall et al., 2005), as sketched below. The
third principle is the need to pay attention to processes as well as outcomes, and to be suitably critical of the concept of an "outcome" as a meaningful steady state (defined
through measurement) and as something separate from process. The fourth principle, not
developed here, is a willingness to use qualitative as well as quantitative data and to explore
the possibility of developing new explanatory models as well as testing out existing ones.
The fifth principle is a willingness to use a multiplicity of methods and measurements, and
from a variety of disciplines; in a sense, evaluation science needs to be truly interdisciplinary.
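
A minimal, hypothetical sketch of the second and third principles taken together, in the spirit of the adapted designs cited above (Randall et al., 2005) but not reproducing their method: rather than a simple group contrast, the change in an outcome is related to a measured degree of exposure to the intervention (its organizational penetration). All numbers and variable names are invented.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 120

# Measured exposure of each employee to the intervention (0 = none, 1 = full),
# which varies naturally because delivery was uneven across teams.
exposure = rng.uniform(0, 1, n)

# Hypothetical change in a strain score between baseline and follow-up:
# improvement grows with exposure, plus individual noise.
change = -3.0 * exposure + rng.normal(0, 2.5, n)

# Simple least-squares slope of change on exposure: the intervention 'effect'
# is read from the exposure gradient rather than from a control-group contrast.
slope, intercept = np.polyfit(exposure, change, deg=1)
print(f"estimated change per unit of exposure = {slope:.2f}")
print(f"estimated change at zero exposure     = {intercept:.2f}")
```
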
This methodological framework may differ from that of the traditional scientific
approach, but is an approach that is more likely to give us some understanding of what
works and what does not in organizational-level interventions. Judged against the gold
standard of the ideal experiment it may be found wanting by some (as discussed by, for
example, Boaz & Ashby, 2003), but it may provide information in cases where otherwise
none would be possible. Developments within this framework should offer us gains over
chance and, when organizations' and people's lives and futures are at stake, such gains are worthwhile.

Conclusion
It has been argued in this paper that there is an increasing need for applied psychology to
consider further how organizational-level interventions for dealing with challenges such as
work-related stress should best be designed and conducted. In particular, there is a need to
debate the philosophy that frames such evaluations and the methodology that supports
them. A comprehensive and informed literature needs to be grown, based on a coherent
body of knowledge.
Many believe that a research approach that is based on traditional research methods is
best for the evaluation of organizational-level interventions. This ignores the limitations of
such methods in the real-world context. The main objective of this paper is not to offer an
alternative to a scientific approach but to argue for a more broadly conceived and eclectic
scientific framework for intervention evaluation. It espouses an approach that is reflective of
the reality of organizational life, and offers an outline framework for this broader view of
evaluation based on five principles. The most important principle relates to research designs
being fit for their purpose and data being acceptable in the context of that purpose. The
second principle is the need to be innovative in the development of research designs that are
fit for their purpose. The third principle is the need to pay attention to intervention
processes as well as their outcomes. The fourth principle is a willingness to use qualitative as
well as quantitative data and to explore the possibility of developing new explanatory
models as well as testing out existing ones. The fifth and last principle is a willingness to use
a multiplicity of methods and measurements, and from a variety of disciplines; evaluation
science needs to be inter-disciplinary. The authors hope that this framework can be
expanded through future research and debate and that this paper will promote the effective
evaluation of organizational-level interventions for work-related stress.

Acknowledgements
The authors wish to thank Graeme MacLennan, University of Aberdeen, and four
anonymous reviewers for their comments on this paper. They also wish to acknowledge
the general funding support provided by the Health & Safety Executive to the Institute of
Work, Health & Organisations at the University of Nottingham, for developing the risk
management methodology. The views expressed here are those of the authors and do not
necessarily represent those of any other person or organization.

References
Basch, C. E., & Gold, R. S. (1986). The dubious effects of type V errors in hypothesis testing on health education practice and theory. Health Education Research, 1, 299–305.
Black, N. A. (1996). Why we need observational studies to evaluate the effectiveness of health care. British Medical Journal, 312, 1215–1218.
Boaz, A., & Ashby, D. (2003). Fit for purpose? Assessing research quality for evidence based policy and practice. Working Paper 11, ESRC UK Centre for Evidence Based Policy and Practice, University of London.
Briner, R. B., & Reynolds, S. (1999). The costs, benefits, and limitations of organizational level stress interventions. Journal of Organizational Behavior, 20, 647–664.
Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research. Chicago: Rand McNally.
Catchpole, M. A., Harris, J. R. W., Renton, A., & Hickman, M. (1999). Surveillance of sexually transmitted infections: fit for purpose? International Journal of STD and AIDS, 10, 493–494.
Colman, A. M. (1994). Applications of psychology. London: Longman.
Cook, T. D., & Shadish, W. R. (1994). Social experiments: some developments over the past fifteen years. Annual Review of Psychology, 45, 545–580.
Cox, T. (1993). Stress research and stress management: putting theory to work. Sudbury: HSE Books.
Cox, T., Griffiths, A., Barlow, C., Randall, R., Thomson, T., & Rial-González, E. (2000). Organisational interventions for work stress: a risk management approach. Sudbury: HSE Books.
Cox, T., Karanika, M., Mellor, N., Lomas, L., Houdmont, J., & Griffiths, A. (2007). Implementation of the management standards for work-related stress: process evaluation. Sudbury: HSE Books.
Dobson, L. D., & Cook, T. J. (1980). Avoiding type III error in program evaluation: results from a field experiment. Evaluation and Program Planning, 3, 269–276.
Donabedian, A. (1972). Models for organizing the delivery of health services and criteria for evaluating them. Milbank Quarterly, 50, 103–154.
Donabedian, A. (1986). Criteria and standards for quality assessment and monitoring. Quality Review Bulletin, 12, 99–108.
Dooley, K. J. (1997). A complex adaptive systems model of organizational change. Nonlinear Dynamics, Psychology, and Life Sciences, 1, 69–97.
Eliason, M., & Storrie, D. (2006). Lasting or latent scars? Swedish evidence on the long term effects of job displacement. Journal of Labor Economics, 24, 831–856.
European Commission (1996). Guidance on risk assessment at work. Luxembourg: Office for Official Publications of the European Communities.
Fitzgerald, L. (1999). Case studies as a research tool. Quality and Health Care, 8, 75.
Fullan, M., Miles, M., & Taylor, G. (1981). Organization development in schools: the state of the art. Washington, DC: National Institute of Education.
Goldenhar, L., & Schulte, P. (1994). Intervention research in occupational health and safety. Journal of Occupational Medicine, 36, 10–22.
Goldenhar, L. M., LaMontagne, A. D., Katz, T., Heaney, C., & Landsbergis, P. (2001). The intervention research process in occupational safety and health: an overview from the National Occupational Research Agenda Intervention Effectiveness Research Team. Journal of Occupational and Environmental Medicine, 43, 616–622.
Griffiths, A. (1998). The psychosocial work environment. In R. C. McCaig, & M. J. Harrington (Eds), The changing nature of occupational health. Sudbury: HSE Books.
Griffiths, A. (1999). Organizational interventions: facing the limitations of the natural science paradigm. Scandinavian Journal of Work, Environment and Health, 25, 589–596.
Griffiths, A., Cox, T., Karanika, M., Khan, S., & Tomás, J. (2006). Work design and management in the manufacturing sector: development and validation of the Work Organisation Assessment Questionnaire. Occupational and Environmental Medicine, 63, 669–675.
Griffiths, A. J., & Schabracq, M. (2003). Work and health psychology as a scientific discipline: facing the limits of the natural science paradigm. In M. Schabracq, J. A. M. Winnubst, & C. L. Cooper (Eds), Handbook of work and health psychology (2nd edn). Chichester: Wiley & Sons.
Guagnano, G. A., Stern, P. C., & Dietz, T. (1995). Influences on attitude-behaviour relationships: a natural experiment with curbside recycling. Environment and Behaviour, 27, 699–718.
Guastello, S. (1993). Do we really know how well our occupational accident prevention programmes work? Safety Science, 16, 445–463.
Guion, R. M. (1998). Some virtues of dissatisfaction in the science and practice of personnel selection. Human Resource Management Review, 8, 351–365.
Harvey, G., & Wensing, M. (2003). Methods for evaluation of small scale quality improvement projects. Quality and Safety in Health Care, 12, 210–214.
Harvey, L., & Green, D. (1993). Defining quality. Assessment and Evaluation in Higher Education, 18, 9–34.
Herriot, P. (1984). Down from the ivory tower: graduates and their jobs. Chichester: Wiley.
Hodgkinson, G. P., Herriot, P., & Anderson, N. (2001). Re-aligning the stakeholders in management research: lessons from industrial, work and organisational psychology. British Journal of Management, 12 (special issue), S41–S48.
Jordan, J., Gurr, E., Tinline, G., Giga, S., Faragher, B., & Cooper, C. (2003). Beacons of excellence in stress prevention. Sudbury: HSE Books.
Kaptchuk, T. J. (2001). The double-blind, randomized, placebo-controlled trial: gold standard or golden calf? Journal of Clinical Epidemiology, 54, 541–549.
Karachi, T. W., Abbott, R. D., Catalano, R. F., Haggerty, K. P., & Fleming, C. B. (1999). Opening the Black Box: using process evaluation measures to assess implementation and theory-building. American Journal of Community Psychology, 27, 711–713.
Karanika, M. (2006). An appeal to reality: modelling non-linear work-health relationships in the context of risk management. Unpublished doctoral thesis, University of Nottingham.
Karsh, B. T., Moro, F. B. P., & Smith, M. J. (2001). The efficacy of workplace ergonomic intervention to control musculoskeletal disorders: a critical analysis of the peer reviewed literature. Theoretical Issues in Ergonomics Science, 2, 269–276.
Koch, S. (1959). Psychology: a study of science. New York: McGraw-Hill.
Kompier, M. (2004). Does the "Management Standards" approach meet the standard? Work & Stress, 18, 137–139.
Kompier, M., & Kristensen, T. S. (2001). Organizational work stress interventions in a theoretical, methodological and practical context. In J. Dunham (Ed.), Stress in the workplace: past, present and future. London: Whurr Publishers.
Kompier, M. A., Aust, B., van den Berg, A. M., & Siegrist, J. (2000). Stress prevention in bus drivers: evaluation of 13 natural experiments. Journal of Occupational Health Psychology, 5, 11–31.
Kompier, M. A. J., Cooper, C. L., & Geurts, S. A. E. (2000). A multiple case study approach to work stress prevention in Europe. European Journal of Work and Organisational Psychology, 9, 371–400.
Kunz, R., & Oxman, A. D. (1998). The unpredictability paradox: review of empirical comparisons of randomised and non-randomised clinical trials. British Medical Journal, 317, 1185–1190.
Landsbergis, P. A., & Vivona-Vaughan, E. (1995). Evaluation of an occupational stress intervention in a public agency. Journal of Organisational Behavior, 16, 29–48.
Leka, S., Cox, T., Makrinov, N., Ertel, M., Hallsten, L., Iavicoli, S., et al. (2007). Towards the development of a psychosocial risk management toolkit. Stockholm: SALTSA.
Levi, L., Sauter, S., & Shimomitsu, T. (1999). Work-related stress – it's time to act. Journal of Occupational Health Psychology, 4, 394–395.
Lipsey, M. W., & Cordray, D. S. (2000). Evaluation methods for social intervention. Annual Review of Psychology, 51, 345–375.
Loisel, P., Durand, M. J., Baril, R., Gervais, J., & Falardeau, M. (2005). Interorganizational collaboration in occupational rehabilitation: perceptions of an interdisciplinary rehabilitation team. Journal of Occupational Rehabilitation, 15, 581–590.
Macdonald, G., Veen, C., & Tones, K. (1996). Evidence for success in health promotion: suggestions for improvement. Health Education Research, 11, 367–376.
Mackay, C. J., Cousins, R., Kelly, P. J., Lee, S., & McCaig, R. H. (2004). "Management Standards" and work-related stress in the UK: policy background and science. Work & Stress, 18, 91–112.
Meyer, B. (1995). Natural and quasi-experiments in economics. Journal of Business & Economic Statistics, 13, 151–161.
Mikkelsen, A., Saksvik, P. Ø., & Landsbergis, P. A. (2000). The impact of a participatory organisational intervention on job stress in community health care institutions. Work & Stress, 14, 156–170.
Murphy, L. R. (1996). Stress management in working settings: a critical review of the health effects. American Journal of Health Promotion, 11, 112–135.
Nielsen, K., Fredslund, H., Christensen, K. B., & Albertsen, K. (2006). Success or failure? Interpreting and understanding the impact of interventions in four similar worksites. Work & Stress, 20, 272–287.
Nielsen, K., Randall, R., & Albertsen, K. (2007). Participants' appraisals of process issues and the effects of stress management interventions. Journal of Organizational Behavior, 28, 1–18.
Nytrø, K., Saksvik, P. Ø., Mikkelsen, A., Bohle, P., & Quinlan, M. (2000). An appraisal of key factors in the implementation of occupational stress interventions. Work & Stress, 14, 213–225.
Nytrø, K., Saksvik, P. Ø., & Torvatn, H. (1998). Organizational prerequisites for the implementation of systematic health, environment and safety work in enterprises. Safety Science, 30, 297–307.
Oeij, P., Wiezer, N., Elo, A., Nielsen, K., Vega, S., Wetzstein, A., et al. (2006). Combating psychosocial risks in work organisations: some European practices. In S. McIntyre, & J. Houdmont (Eds), Occupational health psychology: European perspectives on research, education and practice (Vol. 1). Maia, Portugal: ISMAI Publishers.
Oettingen, G., Little, T. D., Lindenberger, U., & Baltes, P. B. (1994). Causality, agency and control beliefs in East versus West Berlin children: a natural experiment on the role of context. Journal of Personality and Social Psychology, 66, 579–595.
Ovretveit, J. (1998). Evaluating health interventions. Buckingham: Open University Press.
Parker, S., & Wall, T. (1998). Job and work design: organizing work to promote well-being and effectiveness. Thousand Oaks, CA: Sage.
Parkes, K. (1982). Occupational stress among student nurses: a natural experiment. Journal of Applied Psychology, 67, 784–796.
Parkes, K. R., & Sparkes, T. J. (1998). Organizational interventions to reduce work stress: are they effective? Sudbury: HSE Books.
Pettigrew, M., Cummins, S., Ferrell, C., Findlay, A., Higgins, C., Hoy, C., et al. (2005). Natural experiments: an underused tool for public health? Public Health, 119, 751–757.
Randall, R. (2002). Organisational interventions to manage work-related stress: using organisational reality to permit and enhance evaluation. Unpublished doctoral thesis, University of Nottingham.
Randall, R., Cox, T., & Griffiths, A. (2007). Participants' accounts of a stress management intervention. Human Relations, 60, 1181–1209.
Randall, R., Griffiths, A., & Cox, T. (2005). Evaluating organizational stress management interventions using adapted study designs. European Journal of Work and Organizational Psychology, 14, 23–41.
Reynolds, S. (1997). Psychological well-being at work: is prevention better than cure? Journal of Psychosomatic Research, 43, 93–102.
Reynolds, S., & Shapiro, D. (1991). Stress reduction in transition: conceptual problems in the design, implementation, and evaluation of worksite stress management interventions. Human Relations, 44, 717–733.
Rick, J., Thomson, L., Briner, R., O'Regan, S., & Daniels, K. (2002). Review of existing supporting scientific knowledge to underpin standards of good practice for key work-related stressors – Phase 1. Sudbury: HSE Books.
Robson, C. (2002). Real world research: a resource for social scientists and practitioner-researchers. Oxford: Blackwell Publishing.
Rosenzweig, M. R., & Wolpin, K. I. (2000). Natural "natural experiments" in economics. Journal of Economic Literature, 38, 827–874.
Saksvik, P. Ø., Tvedt, S. D., Nytrø, K., Andersen, R. B., Andersen, T. K., Buvik, M. P., et al. (2007). Developing criteria for healthy organizational change. Work & Stress, 21, 243–263.
Sauter, S. L., Brightwell, W. S., Colligan, M. J., Hurrell, J. J., Jr., Katz, T. M., LeGrande, D. E., et al. (2002). The changing organisation of work and the safety and health of working people. Cincinnati: NIOSH.
Schaufeli, W., & Kompier, M. (2001). Managing job stress in the Netherlands. International Journal of Stress Management, 8, 15–34.
Schein, E. (1991). Legitimating clinical research in the study of organizational culture. MIT Working Paper No. 3288-91-BPS.
Schneider, M., & Somers, M. (2006). Organizations as complex adaptive systems: implications of complexity theory for leadership research. The Leadership Quarterly, 17, 351–365.
Schwartz, S., & Carpenter, K. M. (1999). The right answer for the wrong question: consequences of type III error for public health research. American Journal of Public Health, 89, 1151–1153.
Semmer, N. K. (2003). Job stress interventions and organisation at work. In J. C. Quick, & L. E. Tetrick (Eds), Handbook of occupational health psychology. Washington: APA.
Semmer, N. K. (2006). Job stress interventions and the organisation of work. Scandinavian Journal of Work, Environment and Health, 32, 515–527.
Steckler, A., & Linnan, L. (2002). Process evaluation for public health interventions and research. San Francisco, CA: Jossey-Bass.
Suchman, E. A. (1967). Evaluative research: principles and practice in public services and social action programs. New York: Russell Sage Foundation.
Susser, M. (1981). Prenatal nutrition, birth weight and psychological development: an overview of experiments, quasi-experiments and natural experiments in the past decade. American Journal of Clinical Nutrition, 34, 784–803.
Taris, T. W., Kompier, M. A. J., Geurts, S. A. E., Schreurs, P. J. G., Schaufeli, W. B., de Boer, E., et al. (2003). Stress management interventions in the Dutch Domiciliary Care Sector: findings from 81 organizations. International Journal of Stress Management, 10, 297–325.
Thompson, M., & Fearn, T. (1996). What exactly is fitness for purpose in analytical measurement? Analyst, 121, 275–278.
van der Hek, H., & Plomp, H. (1997). Occupational stress management programmes: a practical overview of published effect studies. Occupational Medicine, 47, 133–141.
Wagner, J. A., Williams, S. A., & Webster, C. J. (2007). Biomarkers and surrogate end points for fit-for-purpose development and regulatory evaluation of new drugs. Clinical Pharmacology and Therapeutics, 81, 104–107.
Wilkinson, R. G. (1990). Income distribution and mortality: a "natural" experiment. Sociology of Health and Illness, 12, 391–412.
