Professional Documents
Culture Documents
Sight and Sound Out of Synch
Sight and Sound Out of Synch
ScienceDirect
Journal homepage: www.elsevier.com/locate/cortex
Research report
Article history: The sight and sound of a person speaking or a ball bouncing may seem simultaneous, but
Received 19 November 2012 their corresponding neural signals are spread out over time as they arrive at different
Reviewed 21 December 2012 multisensory brain sites. How subjective timing relates to such neural timing remains a
Revised 10 February 2013 fundamental neuroscientific and philosophical puzzle. A dominant assumption is that
Accepted 19 March 2013 temporal coherence is achieved by sensory resynchronisation or recalibration across
Action editor Robert McIntosh asynchronous brain events. This assumption is easily confirmed by estimating subjective
Published online 1 April 2013 audiovisual timing for groups of subjects, which is on average similar across different
measures and stimuli, and approximately veridical. But few studies have examined normal
and pathological individual differences in such measures.
Keywords:
Case PH, with lesions in pons and basal ganglia, hears people speak before seeing their
Audiovisual integration
lips move. Temporal order judgements (TOJs) confirmed this: voices had to lag lip-
Psychophysics
movements (by w200 msec) to seem synchronous to PH. Curiously, voices had to lead lips
Individual differences
(also by w200 msec) to maximise the McGurk illusion (a measure of audiovisual speech
Illusions
integration). On average across these measures, PHs timing was therefore still veridical.
Sensory timing Age-matched control participants showed similar discrepancies. Indeed, normal individual
differences in TOJ and McGurk timing correlated negatively: subjects needing an auditory lag
for subjective simultaneity needed an auditory lead for maximal McGurk, and vice versa.
This generalised to the StreameBounce illusion. Such surprising antagonism seems opposed
to good sensory resynchronisation, yet average timing across tasks was still near-veridical.
Our findings reveal remarkable disunity of audiovisual timing within and between
subjects. To explain this we propose that the timing of audiovisual signals within different
brain mechanisms is perceived relative to the average timing across mechanisms. Such
renormalisation fully explains the curious antagonistic relationship between disparate
timing estimates in PH and healthy participants, and how they can still perceive the timing
of external events correctly, on average.
2014 The Authors. Published by Elsevier Ltd. Open access under CC BY license.
* Corresponding author. City University, Northampton Square, London EC1V 0HB, United Kingdom.
E-mail address: elliot.freeman@city.ac.uk (E.D. Freeman).
0010-9452 2014 The Authors. Published by Elsevier Ltd. Open access under CC BY license.
http://dx.doi.org/10.1016/j.cortex.2013.03.006
2876 c o r t e x 4 9 ( 2 0 1 3 ) 2 8 7 5 e2 8 8 7
synchronous. These processes are not logically the same, and measures might just reflect different criteria for deciding
evidence from functional brain imaging suggests they are whether two asynchronous events from different modalities
supported by distinct brain mechanisms (Bertini et al., 2010; should be integrated or segmented, compared to when deciding
Miller and DEsposito, 2005; Stevenson et al., 2010). Given such whether the two events are synchronous or asynchronous. The
separation, and the multiple-clocks problem which that en- mismatch between measures was also small, though note that
tails, any evidence of dependence of integration on synchro- these measures were averaged across observers, which might
nisation could be indicative of synchronising mechanisms conceal the true extent to which optimal timing may differ
operating between distinct cognitive processes, not just be- between mechanisms within individuals.
tween modalities as a whole. Neuropsychological studies might contribute to this debate
It seems intuitive that such unity of timing across processes if cases could be found where brain lesions result in selective
should be achieved. Such an intuition might be based on the impairment of either synchronisation or integration, or joint
assumption that single physical events should be associated impairment of both together. A case of the latter kind seems to
with a unitary percept (Welch and Warren, 1980). It might indeed be reported by Hamilton et al. (2006), where the temporal
be surprising if we consciously perceived different aspects of the mismatch experienced by patient AWF coincided with an
same event as occurring at different times (though in some eliminated McGurk effect for veridically synchronous stimuli.
cases it seems we do; Arnold et al., 2001; Moutoussis and Zeki, However Hamilton et al. did not test McGurk under different
1997). Evidence suggests that the brain does actively strive to conditions of audiovisual asynchrony. Thus the critical evi-
maintain synchrony across processes. For example in the unity dence for true interdependence of synchronisation and inte-
effect, stimuli which are readily integrated (such as meaningful gration functions was lacking, which would have been
speech sounds and lip-movements) tend to be judged as syn- provided if the McGurk effect had been reinstated in AWF, for
chronous even if they are actually not (Vatakis and Spence, subjectively simultaneous stimuli.
2007). Conversely, integration may depend on a prior decision
about the temporal correspondence of auditory and visual 1.3. The present study
streams. For example, in the classic McGurk illusion (McGurk
and MacDonald, 1976), the combination of a voice saying /ba/ From the above review it may be concluded that the question of
and a face mouthing [ga] often results in hearing the syllable /da/ how, or indeed whether, the brain can minimise discrepancies
, while auditory /da/ with visual [ba] can sound like /ba/, but such in timing between modalities and between cognitive processes,
visual interference declines (on average) with increasing asyn- has not yet been satisfactorily resolved. Critical insights may be
chrony between voice and lips (Munhall et al., 1996; Soto-Faraco gained by studying individual differences between measures
and Alsius, 2007, 2009; van Wassenhove et al., 2007). Similarly probing synchronisation and integration, and comparing nat-
for non-speech stimuli, we are more likely (on average) to ural variations in these measures with those acquired following
perceive two balls as bouncing off each other when their colli- brain injury.
sion is accompanied simultaneously by a sound, compared to In particular, we can examine (1) whether PH is an example
when these auditory and visual events are asynchronous of a categorical breakdown of putative unifying mechanisms,
(Sekuler et al., 1997). Though such findings demonstrate or whether his lesions have merely shifted him along a con-
dependence of integration on synchrony, on average across tinuum of disunity, where we may also find ourselves. We
participants, its critical dependence on individuals own sub- therefore ask, how unusual is PH (Experiment 1)? If highly
jective synchrony has not been examined to date. abnormal, he could be the exception that proves the rule,
The above positive evidence suggests that the brain actively that unity and synchrony are normally achieved in individuals
benefits from, and actively strives for subjective unity across its (albeit with inaccuracies). But exceptions can also prove rules
different process. But however desirable, a unitary percept may wrong. Our evidence, of large discrepancies between our two
not always be achieved. Some observations appear to challenge measures in PH and surprisingly also in normal subjects,
the intuitive dependence of multisensory integration on au- suggests that asynchrony and disunity may rule instead.
diovisual synchronisation (Spence and Ngo, 2012). For example We can also ask (2) whether PHs acquired subjective
in the McGurk effect, Soto-Faraco and Alsius (2007, 2009) used a asynchrony is specific to perception of audiovisual temporal
dual-task paradigm to measure McGurk interference and sub- order or whether this affects the temporal tuning of audiovi-
jective synchrony as a function of audiovisual asynchrony. sual integration, and also how closely measures of integration
They found that illusory McGurk percepts were often reported correspond with measures of synchrony, within normal in-
even for audiovisual stimuli that could be reliably identified as dividuals. We assess integration in the McGurk effect (Exper-
asynchronous (on average across participants). Such dual iment 2) and the StreameBounce illusion (Experiment 3). If
perception of good lip-voice integration despite a detectable multimodal integration and synchronisation of speech stimuli
audiovisual asynchrony hints that unity can be violated, and are based on dependent mechanisms, it seems straightfor-
thus in principle measures of subjective integration and syn- ward to predict that individual differences for our two mea-
chronisation might be based on their own stimulus inputs and sures will correlate positively. Alternatively, a null correlation
correspondence mechanisms rather than shared or synchron- seems intuitively likely if these mechanisms were fully inde-
ised mechanisms. Some doubt remains about this however, pendent. We find neither. Our counterintuitive results call for
because integration and synchronisation judgements still cen- a revised understanding of how the brain solves the multiple-
tred on similar near-veridical asynchronies (on average), and clocks problem, which we propose.
thus could still be subject to common synchronising mecha- Our study simply replicated the dual-task paradigm of
nisms. Furthermore, any apparent differences between the Soto-Faraco and Alsius (2007) with PH and normal controls. On
2878 c o r t e x 4 9 ( 2 0 1 3 ) 2 8 7 5 e2 8 8 7
mean 22), and included the results from the older age- effect were /ba/ [ga] da and /da/ [ba] ba or bda. The
matched controls. Data from four further participants were other two congruent pairings /ba/ [ba] and /da/ [da] tend to
excluded, due to poor performance, resulting in implausible be heard correctly. Background was set to the average red green
estimates of subjective timing >300 msec asynchrony, outside blue (RGB) value across all pixels and frames. For the Streame
the typical range for multisensory integration (Vatakis et al., Bounce experiment, visual stimuli were two yellow circular at
2008; Vatakis and Spence, 2007) and indicative of poor qual- maximum contrast on a black background. Each moved from
ity data and unreliable function fits. Experiment 3 (testing the positions left and right above fixation, via the central fixation
StreameBounce illusion) had 24 participants aged 18e24, point, to opposite positions below fixation (see Fig. 2 for di-
excluding two others who reported no bounce illusion. All mensions, and Supplementary Video). Animations were
participants were nave to the specific aims of this study. accompanied by a 400 Hz tone of 100 msec duration, with the
Participants received a monetary reward. Procedures were same manipulation of asynchrony as for the McGurk stimuli.
approved by the local Psychology ethics committee. Movies were followed by 9 pt white text prompting responses,
displayed centrally.
3.2. Apparatus and stimuli Supplementary data related to this article can be found
online at http://dx.doi.org/10.1016/j.cortex.2013.03.006.
Laboratory apparatus comprised an Apple Mac Mini, with We also tested PH with various biological and/or non-
Labtec speakers positioned either side of a 17" Sony HMD- speech stimuli. Finger-click movies, of 3000 msec duration,
A420 cathode ray tube (CRT) display, viewed in darkness showed a hand with the middle finger clicking against the
from 70 cm. Mobile apparatus for older participants and PH thumb. Sequences began with either the hand open (to
comprised a Sony Vaio SZ1XP PC with built-in speakers and provide predictive information) or closed. For scrambled-
13.3" liquid crystal display (LCD) display, viewed from speech stimuli, the soundtrack from the original McGurk
approximately 57 cm. In both cases video mode was stimuli was passed through a three-channel noise vocoder
1200 800 with a 60 Hz refresh rate. Subjects responded using using Praat software (version 5.1.21, http://www.praat.org),
the cursor keys on the standard keyboard. rendering the speech unintelligible but without affecting the
McGurk stimuli were based on Soto-Faraco and Alsius (2007), spectral composition of the sound or the temporal sequence
which were kindly provided by the authors (see Fig. 2 for di- of amplitude modulations. The video sequence remained the
mensions, and Supplementary Video). Auditory /ba/ and /da/ same. Non-biological stimuli comprised a white square
phonemes (with white noise at 15% of maximum amplitude) (1.67 on each side) on a black background, presented for
were combined with visual lip-movements for [ba], [da] and 200 msec and paired with a white-noise burst of 200 msec
[ga]. The two incongruent pairings for eliciting the McGurk duration.
2880 c o r t e x 4 9 ( 2 0 1 3 ) 2 8 7 5 e2 8 8 7
Fig. 2 e Trial sequence and stimuli for McGurk (top row) and StreameBounce illusions (bottom).
from responding at chance to responding voice second 75% curiously, the optimal asynchrony for maximum McGurk
of the time. The resulting value quantifies the lag that can (tMcG) showed almost exactly the opposite asynchrony
produce a Just Noticeable Difference (JND) between subjec- (240 msec auditory lead was required for optimum McGurk).
tively synchronous and asynchronous stimuli. Thus voices effectively lagged lip-movements for the pur-
For the phoneme discrimination task we obtained the poses of audiovisual speech integration (Fig. 3b).
proportion of trials in which the reported phoneme was To investigate the generality of PHs auditory lead we
consistent with the lip-movements, averaged across incon- tested him on a variety of biological and artificial non-speech
gruous conditions only. For example, a ba response to /da/ stimuli, using single-task TOJs. In separate single-task tests,
[ba] and a da response to /ba/ [ga] were scored as we found PSS was closer to veridical for artificial stimuli
consistent. This was plotted as a psychometric function of (flash/noise-burst pairs: 2 99 msec auditory lag, 95% confi-
auditory lag. The data from each of the two incongruent dence interval), finger-clicks (with no significant difference
conditions, plus the average across them, were fit using an between movies beginning from open or closed-hand posi-
asymmetric double sigmoid function (ADS, following van tions: 64 85 msec), and for unintelligible noise-vocoded
Wassenhove et al., 2007), which results in a bell-shaped curve speech (52 98 msec). In contrast, a similar single-task
with adjustable height, width and asymmetry, using the paradigm with the original speech stimuli showed a similar
following equation: PSS shift as in the dual-task situation (210 90 msec). It may
therefore be concluded that PHs PSS shift was specific to
1 x c1 x c2
y tanh tanh speech, and not dependent on the number of concurrent
2 w1 w2
tasks.
with constraints w1 > 0 and w2 > 0
The optimal auditory lag for maximum McGurk interfer- 4.1.2. Comparison with controls
ence (tMcG) from vision was read off at the peak of each of How unusual is PH? Using a modified t test for comparing an
these interpolated functions and averaged, with 95% confi- individuals test score with a small normative sample (Crawford
dence intervals derived from fits of 1000 bootstrapped sam- and Howell, 1998), we found PHs tMcG was significantly greater
ples. For streamebounce judgements, ADS functions were than for 10 healthy age-matched participants [Crawford
fitted to the proportion of bounce responses. t(9) 2.23, p .05]. The discrepancy between PHs PSS and tMcG
Across subjects, mean (and SD) of R2 values for goodness of measures was also significantly greater than for the control
fit of functions to the psychometric data were .89 (.13) for the sample [Crawford t(9) 2.46, p .04]. On these measures PH
TOJ task, and .75 (.18) for the phoneme discrimination task. therefore does seem abnormal. However his PSS was not
significantly deviant from controls [t(9) 1.50, p .17 ] (Table 2).
Fig. 3 illustrates these results graphically as psychometric
4. Results functions for PH compared with the group average function.
We repeated the analysis after collecting data from a further
All inferential statistics reported in the following are based on sample of 27 young participants (see Expt. 2) with similar re-
parametric statistics, as data did not deviate significantly from sults (Table 2). Relative to the tMcG measure, PH was again
normality (KolmogoroveSmirnov p > .05). significantly deviant from young participants [t(25) 2.64,
p .01], and from the whole combined-age sample [t(35) 2.55,
p .02]. The discrepancy between PSS and tMcG measures was
4.1. Experiment 1
also significant for the young [t(25) 2.14, p .04] and
combined-age sample [t(35) 2.25, p .03]. However, he was
4.1.1. PH
not deviant relative to the PSS for young [t(25) 1.28, p .21]
PHs TOJs corroborated his subjective report of voice leading
and the combined-age sample [t(35) 1.37, p .18].
lips. His PSS was shifted away from veridical to 210 msec
It is surprising to note that on the measure that reflects
auditory lag. This means that subjective synchrony could only
PHs subjective report of voice leading lips, some healthy
be restored for PH by artificially lagging voices relative to lip-
participants showed PSS values of comparable magnitude to
movements (by 210 msec, see Table 2), at which point tem-
PH (Fig. 4a). Given that some normal participants seemed to
poral order became indistinguishable (Fig. 3a). Also very
show a similar magnitude of PSS shift, is PH is the only one
aware of asynchrony? 10/37 participants consistently reported
Table 2 e Mean results from McGurk experiment.
a visual or auditory lead on more than 75% of synchronous
N PSSa CIb JNDc CI tMcGd CI trials. Thus for these participants, the difference between
PH 1 210 40 93 35 240 56 veridically synchronous stimuli and their personal PSS was
Older 10 19 94 176 64 21 78 actually greater than their JND for perceiving asynchrony. In
Younger 27 31 57 272 81 62 42 other words, these subjects seemed to reliably perceive
a
PSS, in milliseconds; positive values for auditory lag, negative for physically synchronous stimuli as asynchronous, at least
auditory lead. under laboratory conditions.
b
CI: 95% confidence interval (msec); estimated for PH by boot-
strapping and for controls from SEMean. 4.2. Experiment 2: McGurk with normative sample
c
JND from subjective simultaneity (msec).
d
tMcG: asynchrony for maximum McGurk effect (msec). Positive
PHs two lesions in pons and STN seem well placed to disrupt
values for auditory lag.
audition and/or timing (Halverson and Freeman, 2010;
2882 c o r t e x 4 9 ( 2 0 1 3 ) 2 8 7 5 e2 8 8 7
Fig. 3 e Psychometric data for PH (with black data points and interpolation using a broken line), and for healthy young (black
continuous function) and older (grey) groups. a) TOJ: proportion of voice second reports (y-axis) for different auditory lags
(negative x-values for auditory lead), interpolated with a logistic function. Horizontals indicate the PSS with 95% confidence
intervals based on bootstrapped estimates for PH and on SEMean for controls. b) Phoneme discrimination task: proportion
of responses following lip-movement, averaged across incongruous conditions only, interpolated using ADS functions.
Auditory lag for tMcG was read off at the maximum.
Kolomiets et al., 2001; Teki et al., 2011), and might explain the within each mechanism might have been renormalised rela-
auditory lagging observed in tMcG. But how could the same tive to the average asynchrony across mechanisms.
lesions also produce an opposite shift in PSS, and PHs corre- Such renormalisation might explain how veridical
sponding experience of auditory leading? perception is maintained on average following pathological
It may be instructive to note that in PH our two measures of disruption of timing in selected mechanisms, but for neuro-
sensory timing are distributed roughly symmetrically around logically healthy people the prediction is highly counterintu-
zero auditory lag. Thus despite the temporal disparity be- itive: individual differences (Stone et al., 2001) which bias one
tween mechanisms it seems that on average across measures measure of subjective timing in one direction (e.g., auditory
he can still achieve near-veridical timing (see General lead for PSS) might be associated with the opposite bias in
discussion for further elaboration). It is suggested that in other measures (e.g., auditory lag for tMcG, or vice versa). This
order to maintain veridical performance, and thus continue to prediction of a negative correlation contrasts with the positive
live in the present moment, pathological auditory slowing correlation predicted if synchronising mechanisms brought
within impaired mechanisms is balanced by perceiving individual differences in PSS (Stone et al., 2001) and tMcG into
auditory timing in preserved mechanisms as slightly earlier agreement (Fujisaki et al., 2004; Harris et al., 2008; Spence and
than veridical. In other words the asynchronies obtained Squire, 2003; Vroomen and Keetels, 2010).
Fig. 4 e PSS (x-axis) plotted against asynchrony for maximum a) McGurk, i.e., tMcG (open circle: PH; grey: older; black: young)
and b) bounce illusions (tBounce) with line of best fit, and marginal histograms.
c o r t e x 4 9 ( 2 0 1 3 ) 2 8 7 5 e2 8 8 7 2883
To test this we measured the correlation between PSS and report, of patient AWF (Hamilton et al., 2006). However the
tMcG, across the whole sample of young and older partici- similarity with PH is difficult to assess, as the direction of
pants (total N 37). As predicted by the compensation hy- AWFs acquired temporal mismatch was not specified, and he
pothesis above, the correlation was significantly negative was only tested with synchronous stimuli. AWF showed no
(N 38, Pearsons r .47, p .003, Fig. 4a). Yet on average McGurk effect while PH did when tested with asynchronous
performance on both measures remained near-veridical (auditory leading) stimuli. AWFs lesions are also in a quite
(Fig. 3). different location, in right parietal cortex, while PHs lesions are
in mid-brain and brainstem. We can at least claim that the
4.3. Experiment 3: StreameBounce present case is the first to be reported of an acquired subjective
auditory lead, which is speech-specific and accompanied by an
Is this apparent repulsion of timing measures just a speech- auditory lag for optimal McGurk integration.
specific phenomenon? We tested this with the Streame Surprisingly, some healthy participants also showed large
Bounce illusion (Sekuler et al., 1997, Fig. 1), in which two deviations of PSS; indeed for some, synchronous stimuli were
approaching balls may appear to bounce off each other when just-noticeably asynchronous. Thus it seems PH is not so
their collision coincides with a sound, rather than streaming unusual in terms of experiencing a mismatch in audiovisual
past each other. As before, there were two questions after timing. Such ubiquitous sensory asynchrony further un-
each trial. The first probed the temporal order of the sound dermines support for the existence of specialised synchroni-
relative to the visual collision. The second required partici- sation mechanisms. It also raises the obvious question of why
pants to judge whether they saw the balls bouncing off each only PH is aware of his asynchrony in his everyday life. It is
other or streaming through each other, from which we esti- possible that our TOJ results from normal participants are
mated the asynchrony for maximum bounce (tBounce). We specific to our laboratory conditions. In the outside world we
again found a negative correlation between PSS and tBounce learn to expect that when auditory and visual events originate
(Pearsons r .54, p .001, for 24 new young participants, from the same source, they are also very likely to occur
Fig. 4b). Note that in contrast to the McGurk illusion for speech simultaneously, regardless of their sensory timing. Under this
where vision influences hearing, in this non-speech illusion, unity assumption (Vatakis and Spence, 2007; Welch and
hearing influences vision. Thus we may infer that this nega- Warren, 1980) our perception might tend to rely more on
tive correlation pattern, replicated for speech and non-speech, this expectation than any sensory evidence of asynchrony.
and in both directions of audiovisual influence, reflects a Our paradigm, by contrast, presented a randomised range of
general (rather than a stimulus-specific or task-specific) asynchronous stimuli with no feedback about which was
characteristic of perception. actually synchronous. In this situation the unity assumption
cannot be confidently applied, and perceptions may rely more
on asynchronous sensory inputs than prior expectations.
5. General discussion Under such conditions even neurologically healthy subjects
might notice an asynchrony given actually synchronous
Our psychophysical investigation of PH confirmed his sub- stimuli. As for PH, his subjective asynchrony (which changed
jective report of lips lagging voices, when measured using unexpectedly later in life) might just be too great for him to
TOJ, but revealed the opposite bias for McGurk integration, reconcile with the assumption of unity, even outside the lab
with temporal tuning favouring auditory lagging. Thus PH (Vatakis and Spence, 2007; Welch and Warren, 1980).
suffers not only from an acquired disruption of synchronisa- While PHs auditory lead for PSS is not statistically
tion, but also a violation of perceptual unity of timing across abnormal, his auditory lag for optimal McGurk (tMcG) is. This
different aspects of the same pairing of auditory and visual might be explained if the principle impairment caused by his
stimuli. Neurologically normal individuals also showed a lesions is actually a slowing of auditory processing, consistent
comparable opponency between our two measures (in speech with the location of his lesion on a tract connecting with the
and non-speech and in both directions of audiovisual influ- inferior colliculus, part of the early auditory system (see
ence): thus if one subject showed auditory lagging for TOJ, the Supplementary Materials for an analysis of tractography).
McGurk measure tended to show auditory leading (or vice The dissociation between PHs temporal tuning of subjec-
versa). Altogether, these counterintuitive findings suggest tive simultaneity for TOJ, versus for phoneme discrimination,
that perception of synchrony and integration depend on suggests that each different task may probe different mech-
distinct rather than common synchronising mechanisms, and anisms, each subject to their own neural asynchronies
reveal one strategy by which the brain might achieve near- (Aschersleben and Prinz, 1995). For example, one mechanism
veridical perception of the timing of multisensory events, at might be involved in speech integration and the other in
least on average, despite the evident temporal disunity of judging sensory synchrony (Calvert, 2001; Miller and
sensory processing. DEsposito, 2005; Vroomen and Stekelenburg, 2011). The
further dissociation between PSS for speech versus non-
5.1. How unusual is PH? speech would be consistent with the existence of special
mechanisms for these different stimulus types (Vatakis et al.,
If specialised mechanisms existed to synchronise senses in 2008). Alternatively the same mechanisms might have
normal brains, one would expect to find more cases of acquired different temporal tunings depending on the low-level char-
sensory desynchronisation when such mechanisms are acteristics of the specific stimulus presented (Vroomen and
lesioned (Wiener et al., 2011). There has only been one previous Stekelenburg, 2011). From these dissociations it seems, at
2884 c o r t e x 4 9 ( 2 0 1 3 ) 2 8 7 5 e2 8 8 7
least for PH, that there are indeed multiple clocks (see timing (Fig. 4). Our results suggest that timing discrepancies
Introduction), whose discrepant timings cannot be reconciled. between mechanisms serving performance of our synchroni-
sation and integration tasks cannot be fully reconciled. How-
5.2. How closely do measures of integration normally ever, as we explain below (and in Fig. 5), our evidence is still
correspond with measures of synchrony?
consistent with the mainstream assumption that the brain naturally in a similar (or opposite) direction to PH: in different
adjusts for differences in neural timing between distinct mo- people the greater the deviation in the auditory lead (lag) di-
dalities. Our account just makes explicit the assumption that rection for some mechanisms, the more auditory leading
this adjustment is made based on average differences in timing: (lagging) will be reported for other mechanisms, relative to the
either between modalities (Harris et al., 2008), or in principle mean asynchrony, thus resulting in an apparent antagonism
more generally between cognitive processes or any arbitrary between mechanisms. Given that the mean neural asyn-
groupings of temporally discrepant mechanisms. chrony most reliably relates to external synchrony (under
the unity assumption), renormalisation explains how
5.4. Temporal renormalisation near-veridical performance is maintained on average, across
mechanisms and also across subjects.
Given the present evidence that disparities in timing for
different tasks cannot be fully minimised, there appears to be 5.5. Simulation
no escape from the multiple-clocks problem: with one clock you
always know the time; with two you are never sure. But of course, Computations based on statistical distributions are routinely
Segals maxim is misleading. Given a room full of clocks, each proposed in Bayesian theories of perception (Miyazaki et al.,
independently subject to inaccuracies, our best guess at the 2006; Yamamoto et al., 2012), while functions similar to
correct time comes from the average across all clocks. Thus averaging over such distributions have been considered in
statistically, the more clocks we have the better for accurately theories of population coding (Roach et al., 2011). Assuming
estimating this average. Such averaging may be how the brain similar mechanisms in principle, we performed a simple
solves the multiple-clocks problem. This problem is that simulation, in which we plotted values sampled from two
different auditory and visual stimuli are processed at different random variables (clocks), after subtracting each from the
speeds, and arrive at different mechanisms (e.g., contributing average across a population of clocks. We found that this
to synchrony and integration judgements respectively) at simple renormalisation model could accurately simulate the
different times, resulting in a distribution of neural timings negative correlation observed (see Supplementary Methods S2
measured across the different mechanisms. From the point of and Supplementary Figure 2 for further details). This serves to
view of an individual mechanism contributing to this distri- demonstrate how the observed negative correlation phe-
bution, it is uncertain to what extent the timing of its inputs nomenon might emerge simply as a consequence of
reflects the true external timing of events or just internal renormalisation, and not due to any explicit antagonism be-
processing delays (Scharnowski et al., 2013). But the average tween mechanisms.
over the distribution provides a purer estimate of the
neural timing that relates most reliably to the true timing of
5.6. Conclusions
external events (see Fig. 5 for a schematic illustration, and
Supplementary Discussion of how this could apply before
Neuroscientists and philosophers have long pondered the
and/or after unimodal signals). We propose that discrepancies
relationship between subjective and neural timing (Dennett
in timing between mechanisms are not minimised but perceived
and Kinsbourne, 1995; Harris et al., 2008; Spence and Squire,
relative to their average timing.
2003; Zeki and Bartels, 1998). Our observations with PH and
In contrast to the other theoretical alternatives, this tem-
with neurologically healthy participants confirm that
poral renormalisation theory provides a fuller and more
perception is characterised fundamentally by asynchrony and
explicit account of all of our paradoxical findings: why a lesion
disunity: different aspects of the same pair of multisensory
produces opposite lags in different measures; why in normal
stimuli may be perceived with different asynchronies, and
participants different measures of subjective timing appear
these discrepancies cannot be fully minimised. But an
mutually repulsive, and how despite such disunity perception
apparent antagonism between complementary measures of
remains near-veridical on average across measures. To see how
subjective timing reveals a superordinate principle, by which
these phenomena emerge, note that in the multiple-clocks
discrepant timings in the brain may nevertheless be renor-
analogy, if one clock is particularly slow then this will bias the
malised to their average neural timing. By relating subjective
average, relative to which even the correct clocks will seem to
timing to average neural timing, temporal renormalisation
be fast. In the brain, the mean neural delay of each sensory
explains (1) why after a lesion PH experiences auditory leading
modality could also be attracted to particularly slow (or fast)
in one task but the opposite auditory lead in another, (2) why
neural events such that even events with relatively normal
different timing measures are negatively correlated across
timing may be perceived as slightly fast (or slow). In PH, the
normal individuals, and (3) how the brain might tell the time
integrative mechanisms probed by the McGurk task may have
from multiple clocks, with near-veridical accuracy, without
an unusually delayed auditory input, due to a selective brain
needing resynchronising mechanisms.
lesion. The central tendency of the distribution will shift to-
wards auditory lags, and relative to this, auditory signals from
other unaffected mechanisms, such as those performing TOJ,
will now be perceived to be leading. Yet on average across Acknowledgements
these measures, and despite pathological disruptions of
timing, performance remains near-veridical. Renormalisation We thank P.H. for participating, and S. Khan, A. Alsius,
also explains the negative correlation we observed in healthy R. Kanai and T. Schofield for technical assistance; and
individuals, for whom auditory and visual timing may vary M. Cappelletti, D. Bueti, S. Gaigg, C. Haenschel, G. Rees, and
2886 c o r t e x 4 9 ( 2 0 1 3 ) 2 8 7 5 e2 8 8 7
C. Price, for critical discussions. J.D. was funded by a Royal judgement of simultaneity. Quarterly Journal of Experimental
Society Leverhulme Trust Senior Research Fellowship. Imag- Psychology, 16(1): 37e41, 1964.
ing at the Wellcome Trust Centre for Neuroimaging, UCL, and Halverson HE and Freeman JH. Medial auditory thalamic input to
the lateral pontine nuclei is necessary for auditory eyeblink
open access publication, were supported by Wellcome Centre
conditioning. Neurobiology of Learning and Memory, 93(1): 92e98,
grant 091593/Z/10/Z. 2010.
Hamilton RH, Shenton JT, and Coslett HB. An acquired deficit of
audiovisual speech processing. Brain and Language, 98(1):
Supplementary data 66e73, 2006.
Hanson JVM, Heron J, and Whitaker D. Recalibration of perceived
Supplementary data related to this article can be found online time across sensory modalities. Experimental Brain Research,
185(2): 347e352, 2008.
at http://dx.doi.org/10.1016/j.cortex.2013.03.006.
Harris LR, Harrar V, Jaekl P, and Kopinska A. Mechanisms of
simultaneity constancy. In Nijhawan R (Ed), Issues of Space and
Time in Perception and Action. Cambridge University Press, 2008:
references 232e253.
Heron J, McGraw PV, and Horoshenkov KV. Adaptation minimizes
distance-related audiovisual delays. Journal of Vision, 7(13):
Alais D and Carlile S. Synchronizing to real events: Subjective 1e8, 2007.
audiovisual alignment scales with perceived auditory depth Howard IP and Templeton WB. Human Spatial Orientation. Oxford:
and speed of sound. Proceedings of the National Academy of John Wiley & Sons, 1966.
Sciences of the United States of America, 102(6): 2244e2247, 2005. Ivry RB and Spencer RMC. The neural representation of time.
Arnold DH, Clifford CW, and Wenderoth P. Asynchronous Current Opinion in Neurobiology, 14(2): 225e232, 2004.
processing in vision: Color leads motion. Current Biology, 11(8): Keesey JC. Does myasthenia gravis affect the brain? Journal of the
596e600, 2001. Neurological Sciences, 170(2): 77e89, 1999.
Arnold DH, Johnston A, and Nishida S. Timing sight and sound. Keetels M and Vroomen J. Perception of synchrony between the
Vision Research, 45(10): 1275e1284, 2005. senses. In Murray MM and Wallace MT (Eds), The Neural Bases
Aschersleben G and Prinz W. Synchronizing actions with events: of Multisensory Processes. Boca Raton (FL): CRC Press, 2012:
The role of sensory information. Perception & Psychophysics, 1e27.
57(3): 305e317, 1995. King AJ. Multisensory integration: Strategies for synchronization.
Bertelson P and Radeau M. Cross-modal bias and perceptual Current Biology, 15(9): R336eR339, 2005.
fusion with auditoryevisual spatial discordance. Perception & Kolomiets BP, Deniau JM, Mailly P, Menetrey A, Glowinski J, and
Psychophysics, 29(6): 578e584, 1981. Thierry AM. Segregation and convergence of information flow
Bertini C, Leo F, Avenanti A, and La`davas E. Independent through the cortico-subthalamic pathways. The Journal of
mechanisms for ventriloquism and multisensory integration Neuroscience, 21(15): 5764e5772, 2001.
as revealed by theta-burst stimulation. The European Journal of Kopinska A and Harris LR. Simultaneity constancy. Perception,
Neuroscience, 31(10): 1791e1799, 2010. 33(9): 1049e1060, 2004.
Bushara KO, Grafman J, and Hallett M. Neural correlates of Lambert C, Zrinzo L, Nagy Z, Lutti A, Hariz M, Foltynie T, et al.
auditoryevisual stimulus onset asynchrony detection. The Confirmation of functional zones within the human
Journal of Neuroscience, 21(1): 300e304, 2001. subthalamic nucleus: Patterns of connectivity and sub-
Cappe C, Rouiller EM, and Barone P. Multisensory anatomical parcellation using diffusion weighted imaging. NeuroImage,
pathways. Hearing Research, 258(1e2): 28e36, 2009. 60(1): 83e94, 2012.
Calvert GA. Crossmodal processing in the human brain: Insights Macaluso E and Driver J. Multisensory spatial interactions: A
from functional neuroimaging studies. Cerebral Cortex, 11(12): window onto functional integration in the human brain.
1110e1123, 2001. Trends in Neurosciences, 28(5): 264e271, 2005.
Crawford JR and Howell DC. Comparing an individuals test score Martin B, Giersch A, Huron C, and Van Wassenhove V. Temporal
against norms derived from small samples. The Clinical event structure and timing in schizophrenia: Preserved binding
Neuropsychologist, 12(4): 482e486, 1998. in a longer now. Neuropsychologia, 51(2): 358e371, 2013.
Dennett D and Kinsbourne M. Time and the observer: The where McGurk H and MacDonald J. Hearing lips and seeing voices.
and when of consciousness in the brain. Behavioral and Brain Nature, 264(5588): 746e748, 1976.
Sciences, 15(1992): 1e35, 1995. Meredith MA, Nemitz JW, and Stein BE. Determinants of multisensory
Driver J and Noesselt T. Multisensory interplay reveals integration in superior colliculus neurons. I. Temporal factors. The
crossmodal influences on sensory-specific brain regions, Journal of Neuroscience, 7(10): 3215e3229, 1987.
neural responses, and judgments. Neuron, 57(1): 11e23, 2008. Miller LM and DEsposito M. Perceptual fusion and stimulus
Freeman E and Driver J. Direction of visual apparent motion coincidence in the cross-modal integration of speech. The
driven solely by timing of a static sound. Current Biology, 18(16): Journal of Neuroscience, 25(25): 5884e5893, 2005.
1262e1266, 2008. Miyazaki M, Yamamoto S, Uchida S, and Kitazawa S. Bayesian
Fujisaki W, Shimojo S, Kashino M, and Nishida S. Recalibration of calibration of simultaneity in tactile temporal order judgment.
audiovisual simultaneity. Nature Neuroscience, 7(7): 773e778, 2004. Nature Neuroscience, 9(7): 875e877, 2006.
Garca-Perez MA and Alcala-Quintana R. On the discrepant Morein-Zamir S, Soto-Faraco S, and Kingstone A. Auditory
results in synchrony judgment and temporal-order judgment capture of vision: Examining temporal ventriloquism.
tasks: A quantitative model. Psychonomic Bulletin & Review, Cognitive Brain Research, 17(1): 154e163, 2003.
19(5): 820e846, 2012. Moutoussis K and Zeki S. A direct demonstration of perceptual
Ghazanfar AA and Schroeder CE. Is neocortex essentially asynchrony in vision. Proceedings of the Royal Society B: Biological
multisensory? Trends in Cognitive Sciences, 10(6): 278e285, Sciences, 264(1380): 393e399, 1997.
2006. Munhall KG, Gribble P, Sacco L, and Ward M. Temporal
Halliday A and Mingay R. On the resolution of small time constraints on the McGurk effect. Perception & Psychophysics,
intervals and the effect of conduction delays on the 58(3): 351e362, 1996.
c o r t e x 4 9 ( 2 0 1 3 ) 2 8 7 5 e2 8 8 7 2887
Roach NW, Heron J, Whitaker D, and McGraw PV. Asynchrony Teki S, Grube M, and Griffiths TD. A unified model of time
adaptation reveals neural population code for audiovisual perception accounts for duration-based and beat-based
timing. Proceedings of the Royal Society B: Biological Sciences, timing mechanisms. Frontiers in Integrative Neuroscience,
278(1710): 1314e1322, 2011. 5(January): 90, 2011.
Robertson IH. The Test of Everyday Attention. London: Harcourt The Psychological Corporation Wechsler Abbreviated Scale of
Assessment, 2006. Intelligence. San Antonio: Harcourt Brace & Co., 1999.
Scharnowski F, Rees G, and Walsh V. Time and the brain: Treisman M. Temporal discrimination and the indifference
Neurorelativity. Trends in Cognitive Sciences, 17(2): 51e52, 2013. interval. Implications for a model of the internal clock.
Schneider KA and Bavelier D. Components of visual prior entry. Psychological Monographs, 77(13): 1e31, 1963.
Cognitive Psychology, 47(4): 333e366, 2003. van Eijk RLJ, Kohlrausch A, Juola JF, and van De Par S. Audiovisual
Sekuler R, Sekuler A, and Lau R. Sound alters visual motion synchrony and temporal order judgments: Effects of
perception. Nature, 385(23rd January): 308, 1997. experimental method and stimulus type. Perception &
Sherwin I and Efron R. Temporal ordering deficits following Psychophysics, 70(6): 955e968, 2008.
anterior temporal lobectomy. Brain and Language, 11(1): van Wassenhove V, Grant KW, and Poeppel D. Temporal window
195e203, 1980. of integration in auditoryevisual speech perception.
Soto-Faraco S and Alsius A. Conscious access to the unisensory Neuropsychologia, 45(3): 598e607, 2007.
components of a cross-modal illusion. NeuroReport, 18(4): Vatakis A, Ghazanfar AA, and Spence C. Facilitation of
347e350, 2007. multisensory integration by the unity effect reveals that
Soto-Faraco S and Alsius A. Deconstructing the speech is special. Journal of Vision, 8(9): 1e11, 2008.
McGurkeMacDonald illusion. Journal of Experimental Vatakis A and Spence C. Crossmodal binding: Evaluating the
Psychology: Human Perception and Performance, 35(2): 580e587, unity assumption using audiovisual speech stimuli.
2009. Perception & Psychophysics, 69(5): 744e756, 2007.
Spence C and Ngo MK. Does attention or multisensory integration Vroomen J and De Gelder B. Temporal ventriloquism: Sound
explain the crossmodal facilitation of masked visual target modulates the flash-lag effect. Journal of Experimental
identification? In Stein BE (Ed), The New Handbook of Psychology: Human Perception and Performance, 30(3): 513e518,
Multisensory Processing. Cambridge, Mass.: MIT Press, 2012: 2004.
345e358. Vroomen J and Keetels M. Perception of intersensory synchrony:
Spence C and Parise C. Prior-entry: A review. Consciousness and A tutorial review. Attention, Perception, & Psychophysics, 72(4):
Cognition, 19(1): 364e379, 2010. 871, 2010.
Spence C, Shore DI, and Klein RM. Multisensory prior entry. Vroomen J and Stekelenburg JJ. Perception of intersensory
Journal of Experimental Psychology: General, 130(4): 799e832, synchrony in audiovisual speech: Not that special. Cognition,
2001. 118(1): 75e83, 2011.
Spence C and Squire S. Multisensory integration: Maintaining the Warrington EK and James M. The Visual Object and Space Perception
perception of synchrony. Current Biology, 13(13): R519eR521, Battery. Bury St Edmunds: Thames Valley Test Company, 1991.
2003. Welch RB and Warren DH. Immediate perceptual response to
Spreen O and Benton AL. Neurosensory Centre Comprehensive intersensory discrepancy. Psychological Bulletin, 88(3): 638e667,
Examination for Aphasia. Victoria, British Columbia: 1980.
Neuropsychology Laboratory, University of Victoria, 1969. Wichmann FA and Hill NJ. The psychometric function: I. Fitting,
Sternberg S and Knoll RL. The perception of temporal order: sampling, and goodness of fit. Perception & Psychophysics, 63(8):
Fundamental issues and a general model. In Kornblum S (Ed), 1293e1313, 2001.
Attention and Performance IV: 629e685, 1973. Wiener M, Matell MS, and Coslett HB. Multiple mechanisms for
Stevenson RA, Altieri NA, Kim S, Pisoni DB, and James TW. Neural temporal processing. Frontiers in Integrative Neuroscience,
processing of asynchronous audiovisual speech perception. 5(July): 1e3, 2011.
NeuroImage, 49(4): 3308e3318, 2010. Witkin HA, Wapner S, and Leventhal T. Sound localization with
Stone JV, Hunkin NM, Porrill J, Wood R, Keeler V, Beanland M, conflicting visual and auditory cues. Journal of Experimental
et al. When is now? Perception of simultaneity. Proceedings of Psychology, 43(1): 58e67, 1952.
the Royal Society B: Biological Sciences, 268(1462): 31e38, 2001. Yamamoto S, Miyazaki M, Iwano T, and Kitazawa S. Bayesian
Sugita Y and Suzuki Y. Implicit estimation of sound-arrival time. calibration of simultaneity in audiovisual temporal order
Nature, 421(6926): 911, 2003. judgments. PloS ONE, 7(7): e40379, 2012.
Sumby WH and Pollack I. Visual contribution to speech Zeki S and Bartels A. The asynchrony of consciousness.
intelligibility in noise. Journal of the Acoustical Society of America, Proceedings of the Royal Society B: Biological Sciences, 265(1405):
26(2): 212e215, 1954. 1583e1585, 1998.