
Epidemiologia e Psichiatria Sociale

http://journals.cambridge.org/EPS


Event-based categorical sequential analyses of the medical interview: a review

Maria Angela Mazzi, Lidia Del Piccolo and Christa Zimmermann

Epidemiologia e Psichiatria Sociale / Volume 12 / Special Issue 02 / June 2003, pp 81 - 85


DOI: 10.1017/S1121189X00006126, Published online: 11 October 2011

Link to this article: http://journals.cambridge.org/abstract_S1121189X00006126

How to cite this article:


Maria Angela Mazzi, Lidia Del Piccolo and Christa Zimmermann (2003). Event-based categorical sequential analyses of
the medical interview: a review. Epidemiologia e Psichiatria Sociale, 12, pp 81-85 doi:10.1017/S1121189X00006126




Event-based categorical sequential analyses
of the medical interview: a review
MARIA ANGELA MAZZI, LIDIA DEL PICCOLO, CHRISTA ZIMMERMANN
Department of Medicine and Public Health, Service of Medical Psychology, University of Verona, Verona, Italy

SUMMARY. When the doctor-patient interaction is viewed as a series of utterances, the temporal position of utterances becomes central information in understanding the nature of the interaction. Important concepts are interdependence and serial dependence, which account for the fact that two partners influence each other in their talk and that each partner influences him/herself. Lag sequential analysis studies the associations between doctor and patient utterances in a two-way contingency table (lag one sequences) and is used for exploratory purposes. Log-linear modelling, based on multi-way contingency tables, is used as an extension of lag-sequential analysis to study longer sequences.
Markov chains test sequences in terms of processes, with the aim of finding predictive models, and require a theory-driven approach. Pattern recognition aims to discover regularities in the temporal evolution of the utterance sequences. Theory-driven applications analyse manifest patterns in terms of their conditional probability distribution, while empirically driven applications are used to detect "hidden" patterns. These different approaches to sequential data can be regarded as complementary tools to describe doctor-patient consultations at various levels of complexity.

Declaration of Interest: none.

KEY WORDS: doctor patient conversation, sequential dyadic interaction, lag-sequential analysis, categorical data series.

Address for correspondence: Dr. M.A. Mazzi, Department of Medicine and Public Health, Section of Psychiatry, Service of Medical Psychology, University of Verona, Policlinico G.B. Rossi, Piazzale L.A. Scuro 10, 37134 Verona (Italy).
Fax: +39-045-585871
E-mail: mariangela.mazzi@univr.it

INTRODUCTION

Sequential methods make it possible to study the doctor-patient consultation in terms of interaction between two individuals.

After the doctor-patient consultation has been translated into a series of codes, interaction may be studied from many different points of view. The focus of research may be on detecting specific recurring sequential patterns, on studying the time lag in response to some given stimulus (to show the relationship between what the doctor and what the patient says), or on testing differences between groups of interviews in terms of antecedent-consequent events (to distinguish, for example, interviewing styles before and after training, or to detect gender differences in conducting the consultation).

Bakeman & Gottman (1986) described the relationship between the behaviours of two persons as two separate series of events in interaction. When the dyadic interaction is between a series of codes/events, their temporal position becomes central information in understanding the relationship between occurrences of events.

The theory of social exchange (Dumas, 1986) claims that in a dyadic interaction process each partner's behaviour follows from his/her previous behaviour and is only partially influenced by the behaviour of the other. According to Stiles et al. (1998), instead, participants in a conversation continually adjust their actions to achieve desirable outcomes; the authors called this the "responsiveness of human interaction". Interaction in terms of "responsiveness" seems the more convincing description of what happens during a medical consultation.

Expressed in statistical terms, this assumes that the serial dependence between physician and patient speech is high, but must be quantified and distinguished from the dependence "within speakers" stressed by Dumas (1986), when we want to understand the complex structure of the verbal interaction.

We will therefore focus our attention on the statistical methods which take into account these serial dependencies between and within subjects (here patients and physicians).

DEFINITIONS AND CONCEPTS

Generally sequential analyses pursue two aims (Bakeman & Gottman, 1986). The first one is to discover


probabilistic patterns in the stream of code events; in other words, the interest is centred on the order and the prevailing sequences that characterise the data. The second aim is to assess the effect of contextual or explanatory variables on the sequential structure of interactions (here the doctor-patient dialogue).

There is no unique definition of sequential analysis. In statistics the term refers to the experimental theory in which probability tests are posed in sequence, as is done when applying hierarchical, or nested, methods. These techniques are of no pertinence to conversational analyses, at least as discussed here. In the biological and medical context "sequential analysis" is a proxy term for linkage analysis. It is used in the field of human genetics, which studies the patterns hidden in the DNA "alphabet". The application of some of these sophisticated statistical techniques might also be considered for the exploration of patterns in conversational analyses.

In psychology sequence analysis is mainly understood in terms of lag sequential analysis, by which sequentially recorded behavioural data are analysed in observational studies of social systems (Sackett, 1979; Gottman & Roy, 1990; Bakeman & Quera, 1995b). This approach investigates the position of events of interest in the temporal distribution of their occurrences and seems appropriate when the intent is exploratory.

Serial dependence and interdependence of event codes

The literature on behavioural processes distinguishes two kinds of dependencies in the dyadic conversation (Dumas, 1986; Van Beek et al., 1992): serial dependence and interdependence.

Serial dependence is also called autodependence or autocontingency (Altham, 1979; Tavare & Altham, 1983), or "dependence within subject", and accounts for the fact that a subject may influence him/herself. When the patient (or doctor) says something, it may be influenced by what s/he has said before. The dependence within subject is an important element in the basic statistical assumptions underlying sequential analysis, because autodependence is a non-random aspect of a behavioural sequence.

Interdependence refers to the "dependence between subjects" which occurs when two subjects influence each other. It is also called cross-contingency or joint dependence, which derives from the interaction of two distinct sources (either patient or doctor).

The choice of the statistical technique has to take into account this complexity of the units of analysis (code categories of speech).

LAG-SEQUENTIAL ANALYSIS

This approach was introduced by Sackett (1979) in order to analyse complex interactions between mother and infant. It was adopted by Gottman & Bakeman (1979), Bakeman & Gottman (1986; 1997) and Bakeman & Quera (1992; 1995a, b), who applied lag-sequential analysis to study the association among categories in a two-way contingency table. The aim was to use relatively simple summary statistics on rather complex data. This approach did not require strong a priori hypotheses and the applications were essentially exploratory.

To make the temporal position of events easier to follow, they adopted the notation of transitional matrices from Markov chain theory (Lehoczky, 1998). Markov chain theory actually implies an inferential, confirmatory approach to sequential processes and requires strong assumptions in terms of distribution, temporal homogeneity and dependency structure (Faraone & Dorfman, 1987). In contrast, the exploratory character of lag sequential analysis makes it possible to ignore these restrictions, but this advantage reduces the power of the statistical tests.

The information related to the sequence of codes is structured by using a transitional contingency table, which is the basis for any sequence analysis. This table usually represents coded events in terms of their displacement in time (lag). Rows represent the first or given event (for example, a physician utterance) and columns the following or target event (what the patient says subsequently, that is, at lag one or greater than one). The lag is the distance, in terms of subsequent code units or events, between the given and the target event considered. Each cell represents the frequency of chains that begin with the given event and end with the target event. The same chain may be expressed in terms of transition probabilities, by dividing the frequency of the cell by the frequency of the row.

The advantage of transition probabilities is that "they 'correct' for differences in base rates for the 'given' behavioural states ... [and] easily show the most likely ways of 'moving' from one state to another" (Bakeman & Gottman, 1997, p. 104).

However, the disadvantages, according to the authors, are the great variability among subjects in transition probabilities and their questionable use for comparing groups. An example of this application can be seen in Eide et al. (2003).

The sequential techniques presented below are divided into those which restrict the analyses to immediately following (lag 1) behaviours and those which

refer to a multivariate context, considering more than lag 1 sequences. The assumption underlying lag 1 sequence analysis, rarely declared, is the so-called "first-order Markov chain". It implicitly assumes that there is only a short-term memory effect in the sequences, namely that each behaviour is influenced by the immediately preceding behaviour. It would therefore be more correct to call lag 1 sequential analysis "contiguity analysis".

Lag 1 sequential analyses

The basic conditions for contingency tables require that observations are probabilistically independent, show the same distribution and that their number is large (Wickens, 1989). To satisfy the last condition the total sample should be at least four or five times the number of cells (or more if the distribution of the marginal category frequencies is skewed), with the exception of a few cells with low frequencies (in a large table no more than 20% of total cells may have frequencies less than 5).

The classical methods of contingency analysis (for example, chi-square test, phi index, kappa index) should not be applied to transitional tables that describe sequences of behaviour, because these requirements are unlikely to be met by this type of data.

The scientific debate about the appropriateness of the statistical procedures regarding sequence analysis was already lively twenty years ago (Allison & Liker, 1982; Tavare & Altham, 1983; Dumas, 1986; Faraone & Dorfman, 1987). In particular it was pointed out that the simple chi-square index is inadequate to test the presence of interdependence between two "interactants", and that the autodependence needs to be controlled and quantified. Tavare & Altham (1983) proposed a gamma correction factor, by which the chi-square is multiplied, in order to infer cross contingency (interdependency) between event categories of two partners. The gamma index is a function of the two autocontingencies, estimated by the phi coefficients of each separate within-partner sequence (summarised in a distinct transition contingency table for either patient or doctor). Because the gamma value lies between 0 and 1, it decreases the value of the Pearson chi-square by removing the component due to autocontingencies within partners.

Several solutions were also proposed to resolve the question of the distribution requirements of statistical tests. Faraone & Dorfman (1987) used two robust techniques to estimate the correct variance of a cross-dependence index and its confidence intervals, based on jackknife and data-split approaches. Bakeman et al. (1996a) introduced the permutation test approach for some association or transition indices (Yule's Q, odds ratio, Wampold's K, z-score) to improve the estimation criteria when the parametric assumptions are not confirmed or when the sample size is small. They then compared the transition indices between groups, to test different sequence characteristics.

LOG-LINEAR MODELLING

An extension beyond lag 1 needs multivariate statistical techniques, which take into account more than one subsequent dependency. This approach makes it possible to analyse for how long (at lag 2, 3, 4, etc.) the reciprocal influence of physician and patient speech is operating (sequence memory). Log-linear models are often applied to lag analysis, because they test the association among categories in a multi-way contingency table (Bakeman & Quera, 1995; Van-Beek et al., 1992).

The log-linear model is a regression function for frequency data, in which the logarithm of the expected value of a count variable is modelled as a linear function of parameters (explanatory variables). The parameters can reveal the main effects (the common part shared by all interviews), the marginal effects (the information provided by marginal rows and columns, which represent the specific style of each partner) and the interaction effects (the association among variables indicates the autodependence within a subject, when even lag distances are considered, and the interdependence between subjects, when odd distances are explored). The best model (the one that fits the observed data in the simplest way) is usually selected from hierarchical models by a stepwise criterion which extracts the relevant interaction parameters. In this way it is possible to quantify the length of the memory effect.

Once the best model has been obtained, the analysis of residuals (Agresti, 1996) is used to compare observed and fitted counts and to indicate those cells that contain sequences occurring with a significantly higher or lower frequency than expected. The residual frequency is thus a measure of the lack of fit (distance of observed from expected counts), where the expected frequencies reflect the hypothesis of independence.

Another interesting technique to study multi-way contingency tables is logit-linear modelling (Budescu, 1984), by which a more traditional distinction between the response variables and the explanatory variables (factors), as in ANOVA, is maintained. It makes it possible to investigate not only the presence of interdependence, but also whether one partner's dependence prevails over the other's (for example, patient behaviour is more dependent on

physician behaviour than vice versa). This is called "lagged dominance".

Markov Chain models

Markov chains are defined as a collection of random variables with a specific dependency structure, called the Markov property. This theory assumes that to make predictions about "future" behaviours it is sufficient to consider only the present state of the process and not its past history. Faraone & Dorfman (1987) discuss the applicability of this technique to the study of behavioural sequences. They discuss the underlying assumptions and their validity. The assumptions are stationarity, ergodicity and order of dependency. The first postulates that the probabilities of the occurrences of events and their transition probabilities do not change over time. Stationarity is also called time homogeneity. In terms of doctor-patient consultations this means, for example, that the probability of a specific sequence during the dialogue remains invariant. A chain is ergodic when all components (sequences of two code events) have a likelihood of occurrence (transition probability from one code to the other) greater than zero. The order of dependency is the duration of the memory effect under study (at what lag a code event stops affecting subsequent code events) and has to be defined a priori. Van-Beek et al. (1992) applied Markov chains to mother-infant interactions, but they limited their analysis to the second order dependence (lag 2).

The analysis of the complex structure of the doctor-patient consultation would require a higher order Markov technique or the more sophisticated Hidden Markov Chain (MacDonald & Zucchini, 1997; Pentland & Liu, 1999) or interconnected Markov Chain (Avery & Henderson, 1999; Avery, 2002), which so far have found application only in the field of genetics.

Recognition of the occurrence of defined sequences (pattern recognition): Conditional probability theory applied to specific sequences of events

When the aim is to discover regularities in the temporal evolution of the conversation, the appropriate methodology to investigate the behavioural interaction is pattern recognition. Pattern research can be both hypothesis driven (an earlier part of the pattern may be seen as a likely cause of the later part of the same pattern) and empirically driven (the search for associations between parts of sequences without specific hypotheses).

Since the structure of human behaviour is very complex, it is convenient to distinguish between manifest behavioural patterns (visible and recurrent patterns in behavioural streams) and hidden behavioural patterns (particular relations between groups or pairs of events in a time series). In the latter case the aim is to find some nested relations among a complex sequence of occurrences.

In the first approach the statistical techniques are easy and are based on conditional probability theory, but they require theory-driven hypotheses to be tested. Prior knowledge, theory based or based on previous studies, plays an important role in determining the choice of the sequence pattern under study. The aim is to test whether the expected pattern occurs among the observed patterns more often than by chance in terms of its conditional probability distribution. The conditional probability of a sequence is estimated by dividing the number of times it occurred by the number of times it could have occurred in the observational records (the number of occurrences plus the number of non-occurrences).

An example of this application can be seen in Zimmermann et al. (2003).

This approach to describing behavioural sequences, however, is not exhaustive, because patterns easily become invisible to the naked eye when other behaviours occur in between. The "hidden pattern" approach helps to distinguish which pattern is relevant (signal) and which is irrelevant (noise), by reducing the extraneous sources of variability (e.g. by increasing the signal-to-noise ratio). The tools for the study of hidden patterns refer to a specific branch of statistics (also called classification theory) which explores structure and patterns in large data sets, without recourse to the classical assumptions of the confirmatory approach, often too rigid for practical use (Lange, 1998). Magnusson (2000) defined hidden patterns as "patterns of patterns of patterns" (T-patterns). He proposed for the study of human interaction the "T-pattern detection algorithm" to discover hierarchical patterns (patterns of simpler sequence patterns). Magnusson (2002) has suggested this approach to study DNA sequences. This technique also seems promising for the study of doctor-patient interactions at a high level of complexity.

CONCLUSION

The different approaches illustrated above can be regarded as complementary tools to treat the complex structure of doctor-patient consultations. None of these techniques seems better than the others, but each of them can show specific characteristics of the sequences involved. As Moran et al. (1992) suggest, to better understand the relationships among behavioural events "efforts must

be focused to paint different pictures of the same scene, each accurate in its own way" (p. 89). However, we should not forget Wickens' (1993) warnings, which help to avoid the pitfalls connected with the wrong use of statistical techniques. He admonishes us not to ignore the basic assumptions of each technique, which have to be satisfied in order to have unbiased results.

REFERENCES

Agresti A. (1996). An Introduction to Categorical Data Analysis. John Wiley & Sons: New York.
Allison P.D. & Liker J.K. (1982). Analyzing sequential categorical data on dyadic interaction: A comment on Gottman. Psychological Bulletin 91, 393-403.
Altham P.M. (1979). Detecting relationships between categorical variables observed over time: a problem of deflating a Chi-squared statistic. Applied Statistics 28, 115-125.
Avery P.J. (2002). Fitting interconnected Markov chain models: DNA sequences and test cricket matches. Statistician 51, 267-278.
Avery P.J. & Henderson D.A. (1999). Fitting Markov chain models to discrete state series such as DNA sequences. Applied Statistics 48, 153-161.
Bakeman R. & Gottman J.M. (1986). Observing Interaction: an Introduction to Sequential Analysis. Cambridge University Press: Cambridge.
Bakeman R. & Gottman J.M. (1997). Observing Interaction: an Introduction to Sequential Analysis, 2nd ed. Cambridge University Press: Cambridge.
Bakeman R. & Quera V. (1992). SDIS: A sequential data interchange standard. Behavioral Research Methods, Instruments and Computers 24, 554-559.
Bakeman R. & Quera V. (1995a). Log-linear approaches to lag-sequential analysis when consecutive codes may and cannot repeat. Psychological Bulletin 118, 272-284.
Bakeman R. & Quera V. (1995b). Analysing Interaction. Sequential Analysis with SDIS and GSEQ. Cambridge University Press: Cambridge.
Bakeman R., McArthur D. & Quera V. (1996a). Detecting group differences in sequential association using sampled permutations: Log odds, kappa, and phi compared. Behavioral Research Methods, Instruments and Computers 28, 446-457.
Bakeman R., Robinson B.F. & Quera V. (1996b). Testing sequential association: Estimating exact p values using sampled permutations. Psychological Methods 1, 14-15.
Budescu D.V. (1984). Tests of lagged dominance in sequential dyadic interaction. Psychological Bulletin 96, 402-414.
Dumas J.E. (1986). Controlling for autocorrelation in social interaction analysis. Psychological Bulletin 100, 125-127.
Eide H., Quera V. & Finset A. (2003). Exploring rare patient behaviour with sequential analysis: an illustration. Epidemiologia e Psichiatria Sociale 12, 109-114.
Faraone S.V. & Dorfman D.D. (1987). Lag sequential analysis: Robust statistical methods. Psychological Bulletin 101, 312-323.
Gottman J.M. & Bakeman R. (1979). The sequential analysis of observational data. In Social Interaction Analysis: Methodological Issues (ed. M.E. Lamb, S.J. Suomi and G.R. Stephenson). University of Wisconsin Press: Madison.
Gottman J.M. & Roy A.K. (1990). Sequential Analysis: a Guide for Behavioral Researchers. Cambridge University Press: Cambridge.
Lange N. (1998). Pattern Recognition. In Encyclopedia of Biostatistics (ed. P. Armitage and T. Colton), pp. 3298-3304. John Wiley & Sons: West Sussex.
Lehoczky J. (1998). Markov Chains. In Encyclopedia of Biostatistics (ed. P. Armitage and T. Colton), pp. 2423-2428. John Wiley & Sons: West Sussex.
MacDonald I.L. & Zucchini W. (1997). Hidden Markov and Other Models for Discrete-valued Time Series. Chapman & Hall: London.
Magnusson M.S. (2000). Discovering hidden time patterns in behavior: T-patterns and their detection. Behavioral Research Methods, Instruments and Computers 32, 93-110.
Magnusson M.S. (2002). T-patterns in behavior and DNA: detection and analysis with Theme and GeneTheme. In Measuring Behavior 2002, 4th International Conference on Methods and Techniques in Behavioral Research, 27-30 August 2002, Amsterdam, The Netherlands.
Moran G., Dumas J.E. & Symons D.K. (1992). Approaches to sequential analysis and the description of contingency in behavioral interaction. Behavioral Assessment 14, 65-92.
Pentland A. & Liu A. (1999). Modelling and prediction of human behavior. Neural Computation 11, 229-242.
Sackett G.P. (1979). The lag sequential analysis of contingency and cyclicity in behavioral interaction research. In Handbook of Infant Development (ed. J.D. Osofsky). Wiley: New York.
Stiles W.B., Honos-Webb L. & Surko M. (1998). Responsiveness in psychotherapy. Clinical Psychology: Science and Practice 5, 439-458.
Tavare S. & Altham P.M. (1983). Serial dependence of observations leading to contingency tables, and corrections to chi-squared statistics. Biometrika 70, 139-144.
Van-Beek Y., de-Roos B., Hoeksma J.B. & Hopkins B. (1992). Sequential analysis of nominal data in mother-infant communication: Quantifying dominance and bidirectionality. Behaviour 122, 306-328.
Wickens T.D. (1989). Multiway Contingency Tables Analysis for the Social Science. Lawrence Erlbaum Associates Inc.: Hillsdale.
Wickens T.D. (1993). Analysis of contingency tables with between-subjects variability. Psychological Bulletin 113, 191-204.
Zimmermann C., Del Piccolo L. & Mazzi M.A. (2003). Patient cues and medical interviewing in general practice: examples of the application of sequential analysis. Epidemiologia e Psichiatria Sociale 12, 115-123.
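
Illustrative sketch: transitional contingency table and transition probabilities

The section on lag-sequential analysis describes a transitional contingency table (rows are given events, columns are target events at a chosen lag) and obtains transition probabilities by dividing each cell frequency by its row frequency. The following minimal Python sketch illustrates that calculation only. The code labels (`D_question`, `P_cue`, and so on), the function names and the example sequence are hypothetical and are not taken from the article or from any coding system it cites.

```python
from collections import Counter


def transition_table(codes, lag=1):
    """Count (given, target) pairs of codes that lie `lag` positions apart."""
    counts = Counter()
    for i in range(len(codes) - lag):
        counts[(codes[i], codes[i + lag])] += 1
    return counts


def transition_probabilities(counts):
    """Divide each cell frequency by the frequency of its row (the given event)."""
    row_totals = Counter()
    for (given, _target), n in counts.items():
        row_totals[given] += n
    return {pair: n / row_totals[pair[0]] for pair, n in counts.items()}


# Hypothetical coded consultation: "D_" marks doctor codes, "P_" patient codes.
sequence = [
    "D_question", "P_info", "D_question", "P_cue",
    "D_facilitation", "P_info", "D_question", "P_info",
]

counts = transition_table(sequence, lag=1)
probs = transition_probabilities(counts)

for (given, target), p in sorted(probs.items()):
    print(f"{given} -> {target}: {p:.2f}")
```

Setting `lag=2` or higher gives the corresponding tables for longer displacements. As the text notes, such probabilities vary greatly between subjects, so a sketch like this is descriptive only.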

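Illustrative sketch: log-linear model for a lag 1 table

The section on log-linear modelling describes the logarithm of the expected cell count as a linear function of parameters, and the analysis of residuals against the fit expected under independence. In standard textbook notation, which is an assumption of this sketch and not the authors' own, the saturated model for a two-way table of given event $i$ and target event $j$, its independence sub-model and the adjusted residual can be written as:

```latex
% m_{ij}: expected count; n_{ij}: observed count; n_{i+}, n_{+j}: row and column totals; n: grand total
\begin{align}
  \log m_{ij} &= \lambda + \lambda_i^{G} + \lambda_j^{T} + \lambda_{ij}^{GT}
      && \text{(saturated model)} \\
  \log m_{ij} &= \lambda + \lambda_i^{G} + \lambda_j^{T},
      \qquad \hat m_{ij} = \frac{n_{i+}\, n_{+j}}{n}
      && \text{(independence)} \\
  r_{ij} &= \frac{n_{ij} - \hat m_{ij}}
      {\sqrt{\hat m_{ij}\,\bigl(1 - n_{i+}/n\bigr)\bigl(1 - n_{+j}/n\bigr)}}
      && \text{(adjusted residual)}
\end{align}
```

Here $\lambda_i^{G}$ and $\lambda_j^{T}$ play the role of the marginal effects and $\lambda_{ij}^{GT}$ that of the interaction effect. Tables covering more than one lag add one index and the corresponding interaction terms per additional lag.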

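Illustrative sketch: odds ratio and Yule's Q for one given-target pair

The section on lag 1 sequential analyses names the odds ratio and Yule's Q among the association indices used with permutation tests. The sketch below only shows how these two indices can be computed once a transition table is collapsed to a 2x2 table for one given-target pair; it does not implement the permutation procedure of Bakeman et al. (1996a). The example sequence, code labels and function names are hypothetical.

```python
def two_by_two(codes, given, target, lag=1):
    """Collapse a coded sequence into a 2x2 table for one given/target pair.

    a: given followed by target        b: given followed by another code
    c: another code followed by target d: neither
    """
    a = b = c = d = 0
    for first, second in zip(codes, codes[lag:]):
        if first == given:
            if second == target:
                a += 1
            else:
                b += 1
        elif second == target:
            c += 1
        else:
            d += 1
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Odds ratio ad / bc; None when it is undefined (b * c == 0)."""
    return (a * d) / (b * c) if b * c else None


def yules_q(a, b, c, d):
    """Yule's Q = (ad - bc) / (ad + bc), ranging from -1 to +1."""
    denominator = a * d + b * c
    return (a * d - b * c) / denominator if denominator else None


# Hypothetical coded consultation (same labelling convention as a toy example).
sequence = [
    "D_question", "P_info", "D_question", "P_cue",
    "D_facilitation", "P_info", "D_question", "P_info",
]

table = two_by_two(sequence, given="D_question", target="P_info")
print(table, odds_ratio(*table), yules_q(*table))
```

With so few events the indices are purely illustrative; the sample-size conditions discussed in the text still apply to real data.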