11 Embodied Intersubjectivity: Jordan Zlatev

11.1 Introduction
Is language primarily grounded in the body or in sociality? The easy answer
would be ‘in both,’ but if forced to make a choice, most cognitive linguists
would probably opt for the first. This would be unsurprising given the
paradigmatic status of the notion of embodiment in the tradition (Lakoff and
Johnson 1999, Gibbs 2005a, Rohrer 2007), reflected in terms like embodied
realism (Johnson and Lakoff 2002) and embodied meaning (Tyler and Evans
2003, Feldman and Narayanan 2004). But we all have our individual bodies,
while language is fundamentally social and intersubjective (Wittgenstein
1953). Criticism from interactional linguistics (Linell 2009b), functional linguis-
tics (Harder 2010), and the philosophy of linguistics (Itkonen 2003) has zeroed
in on this lacuna, arguing that until it is resolved, cognitive linguistics will
be committed to an individualist, solipsist model of meaning. There have,
of course, been attempts to combine embodiment and intersubjectivity in
a unified account of the foundations of language (Zlatev 1997, Verhagen
2005, Zlatev et al. 2008), but it is fair to say that the tension remains.
One way to resolve this tension is with the help of phenomenology, the
philosophical tradition inaugurated by Edmund Husserl over a century
ago, which analyzes the nature of conscious experience, in its multitude of
aspects – subjective and intersubjective, perceptual and representational,
etc. – and describes how meaning emerges from it. As part of the project of
rethinking the ontological and epistemological foundations of cognitive
linguistics (Zlatev 2010), I here show how phenomenology seamlessly
fuses embodiment and intersubjectivity as complementary aspects of the
same phenomenon, reflected in the expression intercorporéité or embodied
intersubjectivity (Merleau-Ponty 1962). In section 11.2, I explicate why
Merleau-Ponty and other phenomenologists attributed a central role to
the sentient and active human body for relating to others and for the
creation of joint meaning. However, as pointed out by Merleau-Ponty,
Downloaded from https:/www.cambridge.org/core. University of Florida, on 09 Jun 2017 at 00:32:18, subject to the Cambridge Core terms of use, available at
https:/www.cambridge.org/core/terms. https://doi.org/10.1017/9781316339732.012
following Husserl (1936 [1970]), while this may ‘ground’ the social and
normative meanings of language and science, there remains a gap
between these layers, and it can only be bridged through several stages
of historicity. In section 11.3, we will see that such a model ties in well with
research from developmental psychology, thus framing the discussion in
a more empirical manner.
Subsequently, we turn to areas that have been of explicit concern for
cognitive linguistics, and show how the adopted perspective helps reframe
some key concepts with respect to image/mimetic schemas (section 11.4),
conceptual metaphors (section 11.5) and construal operations, with focus
on non-actual motion (section 11.6). In all these cases, we will see struc-
tures of bodily intersubjectivity serving to help establish and motivate
linguistic structures and meanings, but, crucially, without determining
them, as “it is experience that proposes, but convention that disposes”
(Blomberg and Zlatev 2014: 412). Finally, we summarize the main points of
the argument, highlighting the benefits of a phenomenological cognitive
linguistics.
11.2 Embodied Intersubjectivity in Phenomenology
Phenomenology is becoming increasingly influential in cognitive studies
(Gallagher and Zahavi 2008, Zlatev 2010, Koch et al. 2012, Blomberg and
Zlatev 2014), along with an appreciation of its sensitive analyses of the lived
body (Leib for Husserl; corps vécu for Merleau-Ponty) and their potential for
resolving problems of mind/body dualism, ‘anything goes’ relativism, the
existence of ‘other minds’ and other trappings of (post)modernity. Most
relevant for our current concerns is, however, that embodied intersubjec-
tivity can help ground the intersubjective nature of language as “phenom-
enologists have often endeavoured to unearth pre- or extra-linguistic
forms of intersubjectivity, be it in simple perception, tool-use, emotions,
drives, or bodily awareness” (Zahavi 2001: 166). The functions of these
‘pre- and extra-linguistic forms’ for motivating linguistic meaning will
become more explicit in subsequent sections, and I ask the reader here
simply to reflect on the phenomena described.
While there are multiple aspects of embodied intersubjectivity, the
most fundamental is the double aspect nature of the body: “My body is
given to me as an interiority, as a volitional structure and a dimension
of sensing, but it is also given as a visually and tactually appearing exteriority”
(Zahavi 2001: 168). The compound term Leibkörper utilizes the fact that
German has two different terms that profile the ‘internal’ (Leib) and the
‘external’ (Körper) aspects, respectively. As profiling does not imply
different (ontological) entities but different perspectives, my Leib and
my Körper ultimately coincide. This is clearly shown in the experience
of double sensation:
when I touch my right hand with my left, my right hand, as an object, has
the strange property of being able to feel too . . . the two hands are never
simultaneously in the relationship of touched and touching to each other.
When I press my two hands together, it is not a matter of two sensations
felt together as one perceives two objects placed side by side, but of an
ambiguous set-up in which both hands can alternate the rôles of “touch-
ing” and being “touched.” (Merleau-Ponty 1962: 106)
In fact, I could touch your right hand instead of my own, and the sensation
would not be completely different (though not completely the same
either). As Merleau-Ponty (1962: 410) expresses it: “The other can be evi-
dent to me because I am not transparent for myself, and because my
subjectivity draws its body in its wake.” And Zahavi (2003: 104) states
this even more forcibly: “I am experiencing myself in a manner that
anticipates both the way in which an Other would experience me and
the way in which I would experience an Other . . . The possibility of
sociality presupposes a certain intersubjectivity of the body.”
It should be noted that this analysis does not collapse self and other into
an amorphous anonymity as there remains an asymmetry: “The grief and
anger of another have never quite the same significance for him as they
have for me. For him these situations are lived through, for me they are
displayed” (Merleau-Ponty 1962: 415). Still, as lived-through and displayed
are connected in a reversible relation that resembles double sensation, we
can pass from one to the other, without any need for inference or simula-
tion: “I perceive the grief or the anger of the other in his conduct, in his
face or his hands . . . because grief and anger are variations of belonging to
the world, undivided between the body and consciousness” (415).
Using the concept of bodily resonance, Fuchs (2012) analyzes emotions as
seamless blends of an internal ‘affective’ component and an outward-
directed ‘emotive’ component. In a social context, the felt affect of the
Leib is displayed in its Körper’s emotive expression, which then results in an
affect in the Leib of another embodied subject, giving rise to emotive
expression through its Körper and so on. In such an ‘inter-bodily’ loop,
one literally perceives (rather than ‘infers’ or ‘simulates’) the other’s emo-
tion, through a kind of meaning and action-oriented process currently
known as enactive perception (Gallagher and Zahavi 2008).
Building on Husserl’s notion of operative intentionality (fungierende
Intentionalität), referring to a basic pre-conceptual, but meaningful direct-
edness toward the world, Merleau-Ponty proposed the influential notion of
a body schema (schéma corporel) recently explicated as “a system of sensorimo-
tor capacities that function without awareness or the necessity of percep-
tual monitoring” (Gallagher 2005: 24). It is important to distinguish this
from the notion of body image, which “consists of a system of perceptions,
attitudes and beliefs pertaining to one’s own body” (24). The latter takes the
body into focal consciousness, making it an intentional object of perception
or conception, while the body schema constitutes the pre-personal
embodied subject him- or herself, and is imbued (at most) with marginal
consciousness. As both Merleau-Ponty (1962) and Gallagher (2005) argue,
the body schema is preserved in some clinical cases where the body image is
compromised, and should thus be seen as distinct empirically as well.1
The body schema involves learning and memory, referred to in the
phenomenological tradition as body memory, which “does not represent
the past, but re-enacts it through the body’s present performance” (Fuchs
2012: 11). At least two of the forms of body memory distinguished by Fuchs
(2012) are inherently intersubjective. The first, deeper and least open to
reflection, is called intercorporeal, emerging from infancy and resulting in
“implicit relational know-how – bodily knowing of how to interact with
others, how to have fun together, how to elicit attention, how to avoid
rejection, etc.” (15). A second form of body memory, more reliant on
conscious attention, is called incorporative; it presupposes full self-other
differentiation, and the more or less intentional adoption of postures,
gestures, and styles from others, based on imitation and identification.
Given the asymmetrical, power-based relationship between learner and
model involved, it is unsurprising that Fuchs relates the emergence of
incorporative memory to the well-known sociological concept of “the
habitus – embodied history, internalized as second nature and so forgotten
as history” (Bourdieu 1990, Fuchs 2012: 56).
It is thanks to such intersubjective body memory that ‘body language’ is
so often transparent: “The communication or comprehension of gestures
comes about through the reciprocity of my intentions and the gestures of
others, of my gestures and the intentions discernible in the conduct of
other people. It is as if the other person’s intentions inhabited my body and
mine his” (Merleau-Ponty 1962: 215). However, as the distinction between
the two forms of body memory outlined above indicates, we should not
assume that such transparency is universal: habitus and communicative
gestures are, to a considerable extent, culture- and group-specific.
The forms of embodied intersubjectivity reviewed so far have profiled
dyadic, subject–subject relations. However, actions and gestures of the
kind mentioned above may involve objects as well (Andrén 2010). For
Husserl, the transcendent (i.e. real, ‘objective’) nature of an object like
the coffee mug in front of me was of central concern (for else the world-
directedness of intentionality would be in question), and he eventually
concluded that even simple object-directed intentionality presupposes
intersubjectivity. The argument, in brief, is that while I may only see the
mentioned coffee mug from one perspective (at one moment), I co-
perceive its other sides, including its bottom and container-shape, and
I synthesize these into an identity: the coffee mug itself. The foremost
reason that this can be done is that I am implicitly aware that the other
perspectives are available for other embodied subjects. “My perceptual
objects are not exhausted in their appearance for me; rather, each object
always possesses a horizon of co-extending profiles which . . . could very
well be perceived by other subjects, and is for that very reason intrinsically
intersubjective” (Zahavi 2001: 155).
1 Still, we should not dichotomize this distinction too much, since, analogously to that between Leib and Körper, typical
human experience allows flexible re-location of consciousness between intentional object (noema) and process
(noesis), related to the cognitive linguistic notion of objective/subjective construal (Zlatev 2010).
Finally, this analysis would hold even for natural objects; for example, if
I had a stone on my working table rather than the proverbial coffee mug.
For cultural artifacts or tools, there is an extra layer of embodied intersubjectivity,
as their affordances – the possibilities for action that they
invite, inscribed on structures of body memory – “contain references to
other persons” (271): those who have made them, or could use them in
manners similar to mine.
In sum, we have here briefly sketched a number of phenomenological
structures of embodied intersubjectivity: Leibkörper duality, bodily reso-
nance, body memory, intersubjective object perception and the sociality
of artifacts (see Zlatev and Blomberg [2016] for more discussion). Their
relevance for language, and in particular for cognitive semantic analysis,
will be shown in what follows. What is important to emphasize for the
time being is that (a) embodiment and intersubjectivity are complementary
aspects of experience at the most fundamental levels of consciousness;
(b) the meaningfulness of such structures is not private but interpersonal,
not ‘in the head’ but ‘in the world’; and (c) their meaning (e.g. of
incorporated memory schemas) can be expected to underlie (‘ground’)
linguistic meaning, but is not identical with it. This principle clearly
applies to ontogenetic development, as shown in the next section.
11.3 From Embodied Intersubjectivity to Language in Development
A phenomenology-inspired approach to intersubjectivity, and its foundational
role for language, has been influential in developmental psychology
(Trevarthen 1979, Stern 2000, Gallagher 2005, Bråten 2006). One of its
features is that it provides a coherent alternative to the more common
‘theory of mind’ approach to social cognition. In contrast to this, the shared
mind approach (Zlatev et al. 2008) emphasizes that people are attuned to
each other’s subjectivity from birth (Trevarthen 2011), that key cognitive
capacities are first social and only later understood in private or represen-
tational terms (Vygotsky 1978), that experiences are shared not only on
a cognitive level, but also on the level of affect, perceptual processes and
conative engagements (Hobson 2004), and that intersubjectivity involves
different levels and stages in development and evolution.
One such model is the Mimesis Hierarchy (Zlatev 2008, 2013), which
distinguishes between five stages/layers, with subsequent ones
superimposing upon rather than superseding earlier ones. The first three
layers are pre-linguistic, and provide the foundation for the highest two
levels, where language emerges, with its representational, normative and
systematic character (Zlatev 2007b). Metaphorically, each layer can be seen
as ‘grounding’ those following it, as shown in Table 11.1. Let us summarize
each briefly, with reference to some of the phenomenological notions
discussed in the previous section.

Table 11.1 The mimetic hierarchy of semiotic development (adapted from Zlatev 2013)

Stage             Novel capacity             Examples of cognitive-semiotic skills   Approximate age
5 Language        Language-mediated folk     - complex sentences                     30 m -
                  psychology                 - discourse
                                             - onset of narrative
4 Protolanguage   Symbols: communicative,    - vocabulary spurt                      20–30 m
                  conventional               - reorganization of gestures
                  representations            - gradual increase in
                                               utterance complexity
3 Triadic mimesis Communicative intent       - declarative pointing                  14–20 m
                                             - reciprocal (joint) attention
                                             - associative schemas
2 Dyadic mimesis  Volitional control and     - generalized/deferred imitation        9–14 m
                  imitation                  - coordinated (joint) attention
1 Proto-mimesis   Empathetic perception      - neonatal imitation                    0–9 m
                                             - emotional contagion
                                             - ‘proto-conversations’
                                             - synchronous (joint) attention

Proto-mimesis. The child has a minimal embodied self from birth, largely
based on the sense of proprioception. For example, neonates react differently
depending on whether their own hand or someone else’s touches their cheek
(Rochat 2011). At the same time, such newer evidence does not contradict
the position that during the first months of life the embodied subject does
not completely differentiate their own body schema from the perceived
body of the other, paradigmatically, the mother (Piaget 1962, Werner
and Kaplan 1963). This is one way to interpret the well-known (though
still controversial) phenomenon of “neonatal mirroring” (Gallagher 2005).
By this analysis, such mirroring is distinct from full imitation (of novel
actions), which requires volitional control of the body, carefully matched
to that of the other. At this early stage, interpersonal relations are heavily
based on processes of bodily resonance, reflected in ‘emotional contagion’
(spontaneously picking up the feelings of others) and mutual attention
(prolonged bouts of looking into the eyes of their caregivers). Body
memory is of the intercorporeal kind, realized in turn-taking ‘proto-conversations’
(Trevarthen 1979) and interactional ‘formats’ like peekaboo
(Bruner 1983).
Dyadic mimesis. By nine months of age, infants have considerably
expanded body schemas, with increased motility, volitional control, and
a distinct sense of self, as well as the beginning of a body image. It is
characteristic that around this age the first pointing gestures and true
imitation emerge, driven largely by the need to maintain intersubjective
closeness (Werner and Kaplan 1963, McCune 2008). Imitation progresses
from sensory-motor imitation, through deferred imitation, to representa-
tional imitation where “the interior image precedes the exterior gesture,
which is thus a copy of an ‘internal model’ that guarantees the connection
between the real, but absent model, and the imitative reproduction of it”
(Piaget 1962: 279). While the first two correspond to incorporative memory,
the achievement of the third form during the second year of life marks the
onset of the first concepts (Piaget 1962), or focally conscious mental re-
presentations, which Zlatev (2005) calls mimetic schemas: “dynamic, concrete
and preverbal representations, involving the body image, accessible to
consciousness and pre-reflectively shared in a community” (334), including
schemas such as Kiss, Kick, Eat, and Jump. As reflected in these quotations,
the idea is that bodily mimesis develops from re-enactive, non-
representational memory to representational thought and a form of body-
based semantic memory. These are clearly structures of embodied
intersubjectivity: “The posited mimetic schemas . . . as the name suggests
are schematizations over multiple mimetic acts . . . internalized from overt
imitation of the actions of the other” (Möttönen 2016: 164).
Triadic mimesis. The earlier stage is named ‘dyadic’ despite the possible
involvement of objects in communicative interaction (e.g. desired ones in
so-called imperative pointing) or imitated actions with objects (e.g. in
feeding), since, only starting from about fourteen months, child, other, and
object become fully integrated in a referential triangle. This is reflected in
‘declarative pointing’ gestures (look at that!) and gaze oscillations between
object and addressee that mark full joint attention, implying an intersub-
jective form of ‘third level mentality’ (I see that you see that I see) (Zlatev
2008). This can be seen as an empirical manifestation of Husserl’s trans-
cendental intersubjectivity, according to which, external objects gain their
fully transcendent status only when they can be the potential focus of joint
attention. Furthermore, it is only with triadic mimesis that Gricean inten-
tional communication can take place, where one is intending both to
communicate something to an audience, and for the audience to recognize
this intention. This implies an embodied second-order intention, enacted
through bodily actions like gaze, smile, raised eyebrows, or holding of the
gesture’s reply (Zlatev et al. 2013).
(Proto)language. While the first words have likely appeared much earlier,
these are typically restricted to specific contexts and function more like
indexical schemas, associated with, rather than as symbols ‘standing
for,’ emotional states and external events (Bates et al. 1979). However,
based on the ‘infrastructure’ of body memories, mimetic schemas, and
triadic mimetic gestures, around their second birthday most children dis-
play a marked increase in the number and variety of their words. This is
often described in the developmental literature as a ‘vocabulary spurt.’
In line with phenomenology, the model relates this to the emergence of
reflective consciousness (Zelazo 2004), which is instrumental for the symbolic
insight that ‘things have names’ and that these names are common, that is
conventional (Zlatev 2013). From this point, language co-develops and
interacts with the use of other semiotic resources such as gestures
(McNeill 2005) and pictures (DeLoache 2004), gradually increasing in struc-
tural complexity (Tomasello 1999). This makes the distinction between
‘proto-language’ and language-proper a gradual one, and the main reason
that it is worth making it in ontogeny is that a certain degree of linguistic
proficiency in representing both events, and their interrelations, is
required for the production of narratives, which for its part has an impor-
tant effect on the construction of both self- and other representations
(Hutto 2008). Unlike some language-centered philosophers (e.g. Dennett
1991), however, the present model and embodied phenomenology at large
hold that narratives cannot ‘make up’ the self, as there would be (a) no
story to tell, (b) no one to tell it, (c) no one to tell it to, and (d) no medium to
tell it in – if the self, others, and the world were not first cyclically
co-constituted, and language built up upon this fundament.
In the following three sections, we will see how this general scenario
plays out in the analysis of central cognitive linguistic phenomena.
11.4 Image Schemas and Mimetic Schemas
One of the most influential, and at the same time most ambiguous, con-
cepts in cognitive linguistics has been that of image schemas (cf. Zlatev
2005). Usually these have been characterized as highly abstract, non-
representational structures such as Path, Container, and Verticality,
underlying the meaning of closed-class, spatial terms (e.g. to, from, in, out,
up, down). A commonly cited definition of an image schema is that of “a
recurring dynamic pattern of our perceptual interactions and motor pro-
grams that gives coherence to our experience” (Johnson 1987: xiv). The use
of the first-person plural possessive pronoun in this definition, as well as
characterizations of image schemas as structures of “public shared meaning”
(190), however, remain without clear justification. Lakoff and Johnson (1999)
profess to do so in terms of “shared” human anatomy and physiology,
claiming that this would explain why “we all have pretty much the same
embodied basic-level and spatial-semantic concepts” (107). But how can
biological bodies ‘on their own’ (in both senses: without their lived
counterparts, and without others) give rise to shared intersubjective
experience (Zlatev 2010)? Furthermore, ‘perceptual interactions’ differ
with environments and cultural practices (habitus) so the implicit univers-
alism of the notion is also questionable.
One of the motivations behind the concept of mimetic schemas was to deal
with this problem. As shown in section 11.3, they are assumed to emerge
from being with others, schematizing multiple acts of overt or covert
bodily mimesis. Their harmonious relation with phenomenological analy-
sis has also been acknowledged: “With respect to the phenomenology of
body memory . . . this approach is particularly promising as it allows one to
highlight the role not only of habitual, but also of intercorporeal and
incorporative body memory in the process of meaning formation”
(Summa 2012: 38).
The semantics of the majority of the ‘first verbs’ acquired by (English-
speaking) children (McCune 2008), including intransitive verbs like eat and
transitives like hit, can be seen as rooted in such schemas (Zlatev 2005).
Furthermore, the analysis of the gestures of three Thai and three Swedish
children between eighteen and twenty-six months has provided evidence
that mimetic schemas also give rise to children’s first iconic gestures,
helping to account for the close semantic and temporal integration
between speech and gesture (Zlatev 2014). In particular, 85 percent of
the iconic gestures were found to fall into ‘types’ on the level of specific
actions like kicking, kissing, and applying lotion. Only two such types/
schemas were found to be common to both cultural groups (Kiss, Feed),
while the rest were culture-specific (or at least culture-typical). Nearly all
corresponded to mimetic enactments (with the body representing itself, so
to speak), and those few that did not involved the use of toys: Doll-walk,
Car-drive, etc.
Image schemas of the Path and Container kind have been used in gesture
analysis, but characteristically the considerably more abstract gestures
that correspond to them are first found in older children and adults,
suggesting a possible ‘division of labour’ with mimetic schemas (Cienki
2013). One possibility is that image schemas come about as generalizations
from the more concrete structures of embodied intersubjectivity analyzed
so far in this chapter, and may thus inherit their interpersonal aspects this
way. But it is also possible that image schemas are not ‘pre-verbal’ as originally
claimed, but are at least co-determined by the semantics of the closed-class
items that they are supposed to ‘ground’ (Zlatev 2005). As such
expressions are typically acquired later in life, this would also explain why
image-schema-like gestures are absent in toddlers.
Another proposal is that of a more co-temporal, synchronic develop-
ment and interaction between mimetic schemas and image schemas. For
example, McCune (2008) follows Piaget (1962) in distinguishing between
‘figurative’ and ‘operative’ structures of sensorimotor cognition. While
mimetic schemas correspond to the figurative (concrete, specific) kind,
children’s physical and social interactions may result in pre-linguistic
schemas of Appearance-Disappearance, In-Out (which resemble image
schemas) and these may ground the meaning of dynamic event words
like allgone, in, and out. Piagetian embodiment has not been widely
known for its intersubjectivity, but given the pivotal role of mimesis in
imitation and symbolic play in this theory, it would not be implausible to
view these as structures of embodied intersubjectivity as well, especially if
McCune’s analysis for pre-verbal image schemas holds.
11.5 Metaphors and Their Experiential Roots
Two interrelated debates characterize the recent literature on metaphors
in cognitive linguistics (cf. Johnson 2010). The first concerns the fundamental
level on which metaphors operate: is it that of so-called conceptual
metaphors, defined as “a cross-domain mapping of structure from a source
domain to a target domain, where the two domains are regarded as differ-
ent in kind” (Johnson 2010: 407), commonly illustrated with the relation
between the domains of Time and Space? Or is it the level of recurrent
discourse metaphors such as our European home, which are intermediary in
their conventionality between novel metaphors relying on analogical rea-
soning and literalized expressions (Zinken 2007)? Or is it perhaps the most
specific, contextual level, so that “rather than conceiving of metaphors as
discrete units they should be regarded as a process of meaning construal in
which new metaphoric expressions dynamically emerge, are elaborated,
and are selectively activated over the course of a conversation” (Kolter
et al. 2012: 221)? It should be emphasized that it is what is ‘most basic’ that
is contested, as there is a degree of consensus that the levels of (universal)
bodily experiences and cognitive processes, (conventionalized) cultural
practices, and situated language use are not exclusive but rather comple-
mentary (Zlatev 2011).
The second debate concerns the balance between universality and
language/culture specificity of metaphors. As may be expected, the
stance taken in this debate correlates with that taken in the previous
one: if metaphors, and especially so-called primary metaphors (Grady
1997a) such as AFFECTION IS WARMTH “are acquired unconsciously through
our bodily engagement with our environment” (Johnson 2010: 410),
a considerable degree of universality may be expected. If, on the other
hand, metaphors are essentially discursive constructs, cultural specificity
follows almost by definition. Thus, while proponents of the universalist
stance have been happy to accept a degree of cultural variation
(Kövecses 2000), granting for example that “social constructions are given bodily
basis and bodily motivation is given social-cultural substance” (14), the
controversy will persist as long as the deeper, definitional disagreement
does.
Without pretensions to fully resolve these issues here, we can see in
these debates another reflection of the familiar ‘body versus culture’
dichotomy, to which embodied intersubjectivity could provide a remedy.
We may begin by observing that, while this is seldom acknowledged,
suggested ‘primary metaphors’ are almost always intercorporeal. This is
hardly surprising as they are hypothesized to be “acquired by children
simply because of the nature of the bodily experience (in perception and
bodily movement) for the kinds of the structured environments they
inhabit” (Johnson 2010: 410) and as in the paradigmatic example
AFFECTION IS WARMTH, these ‘environments’ are crucially interpersonal.
As with image schemas, we could see Johnson (2010) using first-person
pronouns (‘our’) in his description of primary metaphors. In the present
case this is justified, as such ‘correlations’ (affection-warmth, etc.) corre-
spond to felt, reciprocal qualities in the Leibkörper of infant and mother,
that is, to intercorporeal body memory.
However, is it really appropriate to analyze such structures, emerging
through bodily interaction and involving (in most cases) emotion, as
‘cross-domain mappings’? As shown in section 11.2, unlike the way they
are usually represented in cognitive semantics (Kövecses 2000), emotions
hardly constitute an ‘abstract target domain,’ and even less so one that is
‘different in kind’ (see Johnson’s definition above) from their bodily
expressions. Rather, following Fuchs’s analysis, the internal and external
sides of emotion are intimately connected through bodily resonance, allowing
them to be perceived directly, without the need for inference or simula-
tion. Of course, emotions can be conceptualized in many different ways, in
different cultures and languages (e.g. as ‘forces’ or ‘fluids’) but these are
secondary, language-mediated, and indisputably metaphorical construc-
tions. If it is the pre-conceptual, and (largely) universal bodily roots of
metaphors that concern us, then an analysis in terms of ‘cross-domain
mappings’ would seem like placing the cart before the horse.
Studying the ‘mapping’ between the ‘domains’ of motion and emotion
in more or less related languages seems to confirm this theoretical con-
clusion. On the basis of native-speaker intuitions and corpus-analysis,
Zlatev, Blomberg, and Magnusson (2012) performed an extensive search
for conventionalized expression-types in English, Swedish, Bulgarian, and
Thai where a motion verb is used to denote a state of affect/emotion,
without any perceived motion, as in my heart dropped. One hundred and
fifteen such ‘motion-emotion metaphors’ (on the intermediary level of
conventional metaphor types) were identified and compared.
As expected, the findings showed considerable cross-language differences,
especially between Thai and the three European languages, and also sig-
nificant similarities. For example, in all four languages upward motion
was correlated with positive affect and downward motion with negative.
At the same time, this was expressed differently, with, for example, Thai
requiring compounds in which caj, ‘heart-mind,’ marks the phrase as an
emotion rather than as a motion expression. In line with what was empha-
sized in the previous sections, this showed the need to distinguish
between the levels of (i) conventional linguistic expressions and (ii) pre-
linguistic motivations.
What may the pre-linguistic motivations be in this case? As pointed out
as early as Lakoff and Johnson (1980a), one likely motivation for the
EMOTION IS MOTION ‘primary metaphor’ is that positive/negative emotion
corresponds to higher/lower bodily posture. In such cases, this would be
yet another expression of Leibkörper duality, with felt qualities being, so to
speak, worn on the sleeve of the living body, rather than thinking of Leib
and Körper as two distinct domains. In other cases, especially when the
motion verb expresses motion through a liquid (one typically sjunka ner,
‘sinks down,’ into a depression in Swedish), a cross-domain mapping
(analogy) may be a more appropriate analysis.
The four languages (and others studied since then) also featured similar
conventional metaphors corresponding to the mimetic schemas Stir,
Shake, and Shatter. In the case of Shake there is obvious motivation in
Leibkörper duality; for example, an electric shock affects both body and
soul. For Stir and Shatter, the motivation is more likely in the analogy of
the felt (inner) sensation and the observed transformations of external
objects: with brews being stirred, and fragile things shattered (Zlatev,
Blomberg, and Magnusson 2012). In sum, there is nothing to guarantee
that the partial overlap of metaphors across unrelated languages is to be
explained by the same mechanism, rather than by different intercorporeal
motivations. To repeat, the latter should not be confused with the
metaphors themselves, which require expression, linguistic or otherwise.
A very different kind of study also supports this view of interaction
between – but non-identity of – bodily motivations and expression.
An interdisciplinary team of cognitive linguists, therapists, and phenom-
enologists (Kolter et al. 2012) studied how patients expressed aspects of
their lives first only through body movement, and then by simultaneous
use of body movement and speech. One patient first enacted her feelings
with two body patterns: one swinging movement from side to side, and
a “spiral movement, executed with the left hand from the highest peak
above the head down to the waist” (207). Later she stated that she felt that
her life was like a wave, which sometimes goes up and sometimes down.
At some point, however, she noticed the mismatch between the down-
going spiral of her gesture, and the verbal description, and adapted the
movement to be bidirectional. The authors interpreted the (initial) body
movements of the patient as “expressions of learned behavioural patterns
and attitudes, which are sedimented in her body memory” (210), and the
transition from non-matching to matching relation between movement
and language as the ‘waking’ of a ‘sleeping metaphor,’ that is, the emer-
gence of that particular metaphorical construal of her life into conscious
awareness.
11.6 Non-actual Motion and Construal
Sentences like (1) and (2) have been offered as arguments for a
non-denotational semantics: neither the mountain range nor the path is in
motion, but both are construed as such, apparently due to some underlying
cognitive process. Recently, it has been especially popular to see this
process as some kind of neural/mental ‘simulation’ (Matlock 2010).
(1) The mountain range goes all the way from Canada to Mexico (Talmy
2000a).
(2) The path rises toward the summit (Langacker 2006).
Such an analysis leaves a number of problems, analogous to those
discussed in the previous section. First, there are a number of different
possible experiential motivations that could explain the occurrence of
such expressions of non-actual motion (NAM) – a term that is neutral with
respect to hypothetical cognitive processes, unlike notions of ‘fictive,’
‘subjective,’ or ‘apparent’ motion. Linking ideas expressed by Talmy,
Langacker, and Matlock to concepts in phenomenology, Blomberg and
Zlatev (2014) distinguish between the following three motivations:
i. the enactive, action-oriented nature of perception;
ii. the correlational (act-object) nature of intentionality;
iii. imagination of counter-factual states, closest to a truly metaphorical,
or ‘fictive’ reading of sentences like (1) and (2).
While some NAM-expressions could be linked predominantly with one
or another of (i–iii), there is nothing to restrict multiple motivations.
Furthermore, as was the case with metaphor, pre-linguistic experiential moti-
vations do not determine or constitute linguistic meanings, which conform to
culture- and language-specific conventions. For example, certain languages
like Yucatec Maya severely restrict the possibility of using NAM-sentences
(Bohnemeyer 2010).
Finally, while many cognitive linguistic theories locate meaning in the
(individual) head, neither (i) nor (ii) are, strictly speaking, either represen-
tational or individual. First, they are primarily perceptual, in the way that
affordances (Gibson 1979) are: paths afford walking, as chairs afford sitting
on, etc. In the case of figures that do not afford self-motion like mountain
ranges, the process of ‘scanning’ their length is still mostly perceptual, or
quasi-perceptual (as reflected in the terminology used by Langacker).
Moreover, in line with the structures of embodied intersubjectivity dis-
cussed in section 11.2, we can affirm their intersubjective rather than
private character. With respect to (i) the affordances of paths and highways
are the affordances of cultural artifacts, which are not just ‘mine’ but
inherently point to other embodied subjects:
From a phenomenological point of view, affordances are publically
distributed, that is, taught and learned, patterns of interaction that are
accompanied not only by perspectival experiences but also by the con-
sciousness of the public nature and sharedness of these experiences.
(Möttönen 2016: 160)
With respect to (ii), as perceiving any three-dimensional object, and not
just cultural artifacts, implies the implicit awareness of other perspectives
and possible co-perceivers (cf. section 11.2), we may say that in ‘scanning’
an elongated figure like a fence we are implicitly aware that others could
do likewise, or perhaps differently. Scanning is thus something like
a perceptual affordance.
Such argumentation may appear too theoretical for some tastes, but
Blomberg (2015) has made it fully concrete in an elicitation-based study.
Native-language speakers of Swedish, French, and Thai were asked to
describe pictures such as those shown in Figure 11.1 in single sentences.
Half of the pictures represented figures that afford self-motion (a, b), and
half that do not (c, d). Crossed with this, the same objects were displayed
either from a first-person perspective, 1pp (b, d) or third-person perspec-
tive, 3pp (a, c). The reasoning was that if only ‘mental simulation’ of actual
motion determined the use of NAM-sentences, then perspective would not
matter. On the other hand, if mental scanning were the most determinative
factor, then most NAM-sentences would be produced in the 3pp condition.
The results showed that all categories of pictures elicited NAM-sentences
on the scale of 40 percent for all three languages, significantly
more than control pictures showing non-extended figures. All categories
of target pictures were thus often described by NAM-sentences, but
interestingly, for all three language groups, pictures like that in (b) – which
most closely resemble a situation that affords self-motion – elicited more
NAM-sentences than the others.

Figure 11.1 Pictures eliciting non-actual motion descriptions, according to the two
parameters Affordance and Perspective: (a) [+afford, 3pp], (b) [+afford, 1pp], (c) [-afford,
3pp], (d) [-afford, 1pp]

Finally, the elicited NAM-sentences consistently relied on language-
specific conventions for expressing actual motion, but making this more
‘bleached’ by avoiding the use of Manner-verbs like run and crawl.
In Swedish, this tendency was reflected in the common use of the two
generic motion verbs gå, ‘go’ and leda, ‘lead’. The French speakers used
a wider range of motion verbs, including Path-verbs such as sortir, ‘exit’;
Thai speakers used serial-verb constructions, but in most cases omitted the
Manner-verb in the series, using only a Path-verb together with a deictic
verb.
In sum, the study confirmed the distinction between experiential motiva-
tions and conventional meanings, and the intersubjective-perceptual, rather
than individual-representational character of the first. Möttönen (2016)
extends a similar analysis to the key concept of construal as such, arguing
that it should not be seen as an individual mental operation, but as an
intersubjective one, on at least three levels: (i) on the level of perception –
perspective presumes the co-existence of other perspectives and the identity
of the referential object; (ii) on the pragmatic level of alignment in conversa-
tion – individual meaning-intending (referential) acts like that creature there
and the dog differ in specificity, but remain co-referential; and (iii) on the level
of conventionalized construals – sedimented through multiple instances of
(i) and (ii) over individual and historical time. Even from this cursory sum-
mary, it can be seen how higher levels presuppose lower ones, with level (i)
being essentially the level of pre-linguistic embodied intersubjectivity.
11.7 Conclusions
This chapter shows the relevance of phenomenology in general, and of
embodied intersubjectivity in particular, for cognitive linguistics. Some
scholars from interaction studies have viewed the field as lacking in this
respect: “from ‘embodied cognition’ to cognitive linguistics to micro-
ethnology: the paradigmatic importance of intercorporeality . . . has not
even begun to be recognized” (Streeck 2009: 210). The analyses of the
experiential grounding of mimetic and image schemas, metaphors, and
dynamic construals presented here clearly show that this claim is an
overstatement.
On a more general theoretical level, our discussion has a number of
important implications for the foundations of cognitive linguistics. First,
from its outset (Lakoff and Johnson 1980a), the school has emphasized that
language is ‘grounded in experience,’ but has not fully resolved the
question of whether physical or social experience is most essential. Armed with
the concept of embodied intersubjectivity, we have argued that body and
sociality are interlocked from the start, freeing us from the need to choose
one or the other (see Zlatev 2016).
Second, we have seen that some central experiential structures that
underlie linguistic meaning should not be understood as ‘private’ pro-
cesses (like neural simulation) that cannot be shared by definition, but as
bodily intersubjective ones. To summarize: mimetic schemas involve
intercorporeal and incorporative body memory; EMOTION IS MOTION and
other ‘primary’ metaphors are to be explained not so much with the help
of invisible cross-domain mappings but through bodily resonance and
Leibkörper duality; non-actual motion expressions are rooted not so much
in individual representational cognition, but in perceptual intersubjectivity,
involving affordances and enactive perception.
Third, as pointed out repeatedly, a clear distinction must be maintained
between the pre-linguistic, intercorporeal processes and structures here
described and the qualitatively different kind of meaning and intersubjec-
tivity of language. The first are pre-predicative motivations, accounting for
a degree of cross-linguistic overlap by functioning as attractors, without
the need to postulate ‘linguistic universals’ (Evans and Levinson 2009a),
while linguistic meanings are normative, systematic, and predicative
(Blomberg and Zlatev 2014, Zlatev 2013, Zlatev and Blomberg 2016).
Many cognitive linguistic analyses have not made this crucial
distinction, conflating linguistic meaning with conceptualization. Critics
of this have argued that the distance between the subjective Vorstellung
and the conventional Sinn (to use the Fregean terms) is so great that
conscious experience becomes almost irrelevant for semantics (Itkonen
2003).
Perhaps the greatest advantage of the phenomenology-inspired
approach to language endorsed in the chapter is that it opens the door to
a third way: on the one hand, pre-linguistic experience is not a matter of
private Vorstellungen, but to a considerable degree of embodied intersub-
jectivity; on the other hand, linguistic meanings are not neutral or arbi-
trary conventions, since they emerge from and come to embody shared
intersubjective perspectives, or construals of worldly objects and events.
