CURRENT ISSUES IN DISCOURSE ANALYSIS
What is discourse?
◦ First introduced by Z. Harris in 1952: a formalist view of discourse as the level above the sentence in the hierarchy of morpheme, clause and sentence
◦ In the framework of the functional approach (G. Leech (1983) and D. Schiffrin (1994)), discourse is defined as language use: “utterances”, i.e. “units of linguistic production (whether spoken or written) which are inherently contextualized” (Schiffrin, 1994).
◦ Discourse can be understood as stretches of language perceived to be
meaningful, united and purposive (Cook, 1989).
◦ Discourse features:
◦ Stretch of language longer than a sentence
◦ Meaningful and coherent
◦ Communicative and purposive
◦ Written or spoken

◦ Discourse is “the only real linguistic object of language” /Kibrik 2009/
◦ “People talk to each other in discourses, rather than sentences, much less morphemes or phonemes ... Therefore, the natural evolution of linguistics as a science should start with discourse studies, and only on this basis should it explore the smaller units, obtained as the results of analytical procedures” /Kibrik 2009/
Approaches to discourse analysis
What is Critical Discourse Analysis (CDA)?

◦ Emerged in the late 1980s
◦ A problem-oriented interdisciplinary research programme which critically analyses the relation between language and society
◦ Critical Discourse Analysis (CDA) is a type of discourse analytical research that primarily studies the way social power abuse, dominance and inequality are enacted, reproduced and resisted by text and talk in the social and political context
/Teun A. van Dijk (1998), “Critical Discourse Analysis”, in D. Tannen, D. Schiffrin & H. Hamilton (Eds.), Handbook of Discourse Analysis/
CDA
◦ Discourse – language use in speech and writing – is a form of
‘social practice’

◦ Discursive practices may have major ideological effects – that is, they can help produce and reproduce unequal power relations between (for instance) social classes, women and men, and ethnic/cultural majorities and minorities through the ways in which they represent things and position people.
/Fairclough & Wodak, 1997/


The main tenets of CDA
◦ CDA addresses social problems
◦ Power relations are discursive
◦ Discourse constitutes society and culture
◦ Discourse does ideological work
◦ Discourse is historical
◦ The link between text and society is mediated
◦ Discourse analysis is interpretative and explanatory
◦ Discourse is a form of social action.
/Fairclough & Wodak, 1997/

◦ Typical CDA vocabulary: ‘power’, ‘dominance’, ‘hegemony’, ‘ideology’, ‘class’, ‘gender’, ‘race’, ‘discrimination’, ‘manipulation’, ‘interests’, ‘reproduction’, ‘institutions’, ‘social structure’ or ‘social order’
Basic questions for CDA research:
◦ How do (more) powerful groups control public discourse?
◦ How does such discourse control the minds and actions of (less) powerful groups, and what are the social consequences of such control, such as social inequality?
◦ How do dominated groups discursively challenge or resist such power?
Control of the context and the structures of discourse

◦ All levels and structures of context, text and talk can in principle be more or less controlled by powerful speakers, and such power may be abused at the expense of other participants
◦ overall strategy of Positive Self-Presentation of
the dominant ingroup, and Negative Other-
Presentation of the dominated outgroups (Van
Dijk, 1993a, 1998b)
Controlling people's minds
◦ discursive mind control is a form of power and dominance if such
control is in the interest of the powerful and if the recipients have
'no alternatives', i.e., no other sources (speakers, writers), no other
discourses, no other option but to listen or read, and no relevant
other beliefs to evaluate such discourses
◦ discursive mind control may now be defined as the control of the
mental models and/or social representations of other people.
The discourse strategies of mind control
◦ Headlines
  ◦ An invasion of aliens (immigration)
◦ Implications and presuppositions
◦ Generalizations
  ◦ ‘This always happens like that’
  ◦ ‘They are all the same’
◦ Metaphors
  ◦ waves of immigrants
◦ Lexical expression
  ◦ illegal, undocumented immigrants
◦ Passive sentence structures and nominalizations
Research in Critical Discourse Analysis
◦ Gender inequality (Cameron, 1990, 1992; Seidel, 1988; Wodak, 1997; Tannen, 1994)
  ◦ Power differences in everyday conversational interaction
  ◦ Verbal sexual harassment
  ◦ Gender inequalities in bureaucratic and professional text and talk
  ◦ Stereotypical and sexist representations of women
◦ Ethnocentrism, antisemitism, nationalism and racism (UNESCO, 1977; Wilson & Gutierrez, 1985; Hartmann & Husband, 1974; Van Dijk, 1991; Wodak & Kirsch, 1995)
◦ Media discourse
◦ Political discourse (Chilton, 1988; Geis, 1987; Wodak, 1989, 1994, 1996)
Corpus approach to discourse analysis
◦ What is a corpus?
◦ A collection of texts in electronic form.
◦ From Latin - ‘body’ (plural corpora), a corpus is a body of language representative
of a particular variety of language or genre which is collected and stored in
electronic form for analysis using concordance software.
◦ A collection of pieces of language that are selected and ordered according to
explicit linguistic criteria in order to be used as a sample of the language (Sinclair,
1996: 4)
◦ A finite collection of machine-readable texts, sampled to be maximally
representative of a language or variety. (McEnery & Wilson, 2001: 197)

◦ The advantages of a corpus approach for the study of discourse, lexis, and
grammatical variation include the emphasis on the representativeness of
the text sample, and the computational tools for investigating distributional
patterns across discourse contexts.
Corpus approach to discourse analysis
◦ Large-scale general corpora are effective and reliable in
providing insightful information about the preferred use of specific
lexico-grammatical patterns in everyday language use.
◦ The most important aspect of this approach is that it makes it
possible for linguists and discourse analysts to go beyond the
analysis of sentences and short texts to the analysis of huge
amounts of text.
Why use corpora?

◦ A very small number of texts vs. a very large number of texts
◦ Intuition / introspection vs. empirical (corpus) evidence
◦ Possibilities of use vs. patterns of actual use

◦ Corpus-based studies look at patterns of:
  ◦ words and word groups
  ◦ grammatical units
  ◦ meanings
  ◦ attitudes
  ◦ frequencies and co-occurrence in context
USING CORPORA IN DISCOURSE ANALYSIS:
TECHNIQUES AND TOOLS
MARIANNA DILAI
PHD, ASSOC. PROF.
APPLIED LINGUISTICS DEPARTMENT
LVIV POLYTECHNIC NATIONAL UNIVERSITY
LVIV, UKRAINE
OUTLINE

◦ Corpus Linguistics and Discourse Analysis
◦ Advantages and limitations of the corpus-based approach to discourse analysis
◦ Corpus-based textual, critical and contextual approaches to discourse analysis
◦ Keyness analysis of political and media texts / #corpusdata.org, #LancsBox, Voyant
◦ Computer-assisted methods in the study of literary texts / CLiC
◦ Sentiment analysis and corpus-assisted discourse study of affect / LIWC, UAM

15.02.2023
DISCOURSE ANALYSIS AND CORPUS LINGUISTICS

Discourse analysis (DA)
◦ DA is ‘the study of real language use, by real speakers in real situations’ (T.A. van Dijk, 1997)
◦ Discourse
  ◦ ‘language-in-use’ (Brown and Yule 1983)
  ◦ ‘language-in-action’, i.e. ‘meaningful symbolic behaviour’ (Blommaert, 2005)
  ◦ ‘the totality of linguistic practices that pertain to a particular domain or that create a particular object’ (Jucker et al., 2009)

Corpus linguistics (CL)
◦ CL is ‘the study of language based on examples of real life language use’ (McEnery & Wilson, 1996)
◦ A corpus is a collection of sampled texts, written or spoken, in machine-readable form which may be annotated with various forms of linguistic information (McEnery, 2006)
◦ Corpus-driven linguistics
  ◦ a theory with corpus enquiries revealing hitherto unknown aspects of language, thus challenging the ‘underlying assumptions behind many well established theoretical positions’ (Tognini Bonelli, 2001)
◦ Corpus-based linguistics
  ◦ a methodology for validating existing theories and descriptions (McEnery, 2006; Biber et al., 1998; Conrad, 2002)

DISCOURSE ANALYSIS AND CORPUS LINGUISTICS

◦ DA and CL both make use of naturally occurring attested data
◦ But there is a ‘cultural divide’ between the two (Leech, 2000):
  ◦ DA emphasizes the integrity of the text :: CL tends to use representative samples
  ◦ DA is primarily qualitative :: CL is essentially quantitative
  ◦ DA focuses on the contents expressed by language :: CL is interested in language per se
◦ Corpus analyses treat the text as a product rather than as an unfolding discourse, i.e. as process and social action:
  ◦ ‘the computer can only cope with the material products of what people do when they use language. It can only analyse the textual traces of the processes whereby meaning is achieved’ (Widdowson, 2000).
  ◦ ‘Corpus-based methods cannot account for the complex interplay of linguistic and contextual factors whereby discourse is enacted’ (Widdowson, 2000).
◦ Advantages of the corpus-based approach to discourse analysis (Baker, 2006):
  ◦ Reducing researcher bias
  ◦ The incremental effect of discourse
  ◦ Resistant and changing discourses
  ◦ Triangulation (Newby 1977), or using multiple methods of analysis (or forms of data)
APPROACHES TO CORPUS-BASED DISCOURSE ANALYSIS

◦ Textual: approaches that focus on language choices, meanings and patterns in texts. They mostly start from a lexico-grammatical, bottom-up perspective but can also take a rhetorical, top-down perspective (e.g. the Problem-Solution (P-S) pattern (Hoey, 2001)). Various phraseological elements operating at the level of discourse are considered (theory of lexical priming).
◦ Critical: an approach that ‘brings an attitude of criticality’, such as critical discourse analysis (CDA), but also draws on other methods, e.g. systemic functional linguistics (SFL).
◦ Contextual: analyses that adopt a more sociolinguistic approach to the corpus data, where situational factors are also taken into account.
(Flowerdew, 2012; Hyland, 2009)
COUNTING – COMPARING – VISUALISING
INAUGURAL ADDRESS BY PRESIDENT JOSEPH R. BIDEN, JR., 2021
[Figure panels: Word list; Keywords]
KEYWORDS

Cultural keywords
◦ For R. Williams (1983), ‘key’ in ‘keyword’ indicates that a particular concept is salient across a culture. For example, democracy and revolution are keywords.
◦ A. Wierzbicka (1997) claims that every language has "key concepts," expressed in "key words," which reflect the core values of a given culture. Anglo English keywords include privacy, personal autonomy, fairness, mind, reasonable, sense, evidence, experience. Polish keywords include przykro, rodzina, wolność, etc.
◦ M. Stubbs’ (1996, 2001) investigation of cultural keywords is done synchronically and is informed by corpus-based methods. A corpus provides objective quantitative support for the extent to which cultural keywords are being used, and the lexical company they keep. It thus provides a measure of what meanings are being culturally reproduced.

Corpus-comparative statistical keywords
◦ Words which are statistically more salient in the text or set of texts than in a large reference corpus.
◦ A keyword is ‘found to be outstanding in its frequency in the text’ (Scott 1999) by comparison to another corpus.
◦ Keyness is generally understood as a measure of the frequency with which a word occurs disproportionately in a particular text type and is normally assessed by comparing the relative frequency of a word in a focus corpus to a reference corpus using a statistical test.
◦ Exploratory vs focused keyness analysis (Gabrielatos 2018)
◦ Different perspectives on keywords / key words and keyness (Bondi and M. Scott)
◦ Concordance programs: AntConc, WordSmith, SketchEngine, UAM, Voyant, LancsBox.
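
The comparison of relative frequencies in a focus corpus and a reference corpus can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any of the tools listed above: the slides only say "a statistical test", so the choice of the log-likelihood statistic is an assumption, and the function names and toy token lists are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score (assumed statistic).

    a = frequency of the word in the focus corpus
    b = frequency of the word in the reference corpus
    c = size of the focus corpus in tokens
    d = size of the reference corpus in tokens
    """
    expected_focus = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_focus)
    if b > 0:
        score += b * math.log(b / expected_ref)
    return 2 * score


def keywords(focus_tokens, ref_tokens, top=10):
    """Rank words that are relatively more frequent in the focus corpus."""
    focus, ref = Counter(focus_tokens), Counter(ref_tokens)
    n_focus, n_ref = len(focus_tokens), len(ref_tokens)
    scored = []
    for word, a in focus.items():
        b = ref.get(word, 0)
        # keep only positive keywords: higher relative frequency in focus
        if a / n_focus > b / n_ref:
            scored.append((word, log_likelihood(a, b, n_focus, n_ref)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top]


# Toy token lists, invented purely to exercise the functions.
focus_demo = "unity unity unity democracy the the".split()
reference_demo = "the the the the market market".split()
for word, score in keywords(focus_demo, reference_demo):
    print(f"{word}\t{score:.2f}")
```

In real work the reference corpus would be far larger than the focus text, and an effect-size measure is often reported alongside the significance score.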
CL IN THE STUDY OF MEDIA DISCOURSE (CDA)
Concordance lines for ‘east*’ from The Sun 26,000-word corpus

◦ O’Halloran’s corpus investigation of how Eastern European migrants are realised linguistically in The Sun (2010)
◦ WordSmith Tools
  ◦ https://www.lexically.net/wordsmith/
◦ Collocationally, a dominant pattern is numbers of Eastern European migrants: ‘coachloads’, ‘many’, ‘millions’
◦ A collocation pattern for migrants relates to poverty: ‘high-unemployment’, ‘impoverished’, ‘poor’
◦ Negative meanings around migrants: ‘underqualified doctors and nurses from Eastern Europe’; ‘vice-girls’; ‘new rules are dodged or challenged by Eastern European migrants’; ‘bogus applications from eastern European’; ‘The men and women – all from eastern Europe – were arrested’; ‘false passports’; ‘suspected visa scam’; ‘criminal scam’
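
A concordance search with a wildcard such as ‘east*’ can be approximated with a regular expression. The sketch below is an assumption about how such a keyword-in-context (KWIC) view might be produced, not a description of how WordSmith Tools works; the function name `kwic` and the `width` parameter are hypothetical, and the sample string simply joins two phrases quoted on the slide.

```python
import re


def kwic(text, pattern, width=30):
    """Return (left context, hit, right context) for each match.

    `pattern` uses the concordancer-style wildcard '*' for
    "any run of word characters", e.g. 'east*'.
    """
    # Escape the pattern, then turn the escaped '*' into a regex wildcard.
    body = re.escape(pattern).replace(r"\*", r"\w*")
    regex = re.compile(r"\b" + body + r"\b", re.IGNORECASE)
    lines = []
    for match in regex.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append((left, match.group(0), right))
    return lines


# Two phrases quoted on the slide, joined into one sample string.
sample = ("underqualified doctors and nurses from Eastern Europe. "
          "bogus applications from eastern European")
for left, hit, right in kwic(sample, "east*"):
    print(f"{left:>30} [{hit}] {right}")
```

Aligning the hits in a central column, as the print loop does, is what makes recurring collocates to the left and right easy to spot by eye.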

FULL-TEXT CORPUS DATA
https://www.corpusdata.org/

CASE STUDY. UKRAINE IN THE ENGLISH ONLINE NEWS OF 2010-2021
The NOW corpus (News on the Web)
https://www.english-corpora.org/now/

COLLOCATIONS

COLLOCATIONS IN CONTEXT

VIRTUAL CORPORA
Ukrain*_USnews_2014
Ukrain*_USnews_2020

KEYNESS ANALYSIS
Ukrain*_USnews_2020 vs. Ukrain*_USnews_2014

#LANCSBOX AND VOYANT TOOLS
http://corpora.lancs.ac.uk/lancsbox/index.php
https://voyant-tools.org/

CLiC
https://clic.bham.ac.uk/

SENTIMENT ANALYSIS

◦ Sentiment analysis (SA), also called opinion mining, is an NLP and text-mining problem that deals with the computational study of opinions, sentiments and emotions expressed in text. SA studies the subjectivity (neutral vs emotionally loaded) and polarity (positive vs negative) of a text (Bo Pang and Lillian Lee)
  ◦ lexicon-based approaches rely on a sentiment lexicon (e.g., General Inquirer, WordNet Affect, QWordNet or SentiWordNet); text corpora are commonly used in domain adaptation, which involves converting a domain-independent sentiment lexicon into a domain-specific lexicon;
  ◦ supervised machine learning methods (e.g., Naive Bayes, MaxEnt, Support Vector Machine).

◦ SA can be applied at several levels:
  ◦ the discourse (document) level, which presupposes that each document/text expresses opinions on a single entity;
  ◦ the sentence level, which determines whether a sentence implies a positive or negative opinion;
  ◦ the object level, which reveals sentiment towards a specific entity mentioned in the text;
  ◦ the aspect level, which focuses on opinions relative to specific properties (or aspects) of an entity.
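
The lexicon-based approach at the document and sentence levels can be illustrated with a small sketch. Everything here is an assumption made for illustration: the six-word lexicon is invented and is not drawn from General Inquirer, SentiWordNet or any other real resource, and the function names are hypothetical. Real systems also handle negation, intensifiers and domain adaptation, which this sketch ignores.

```python
import re

# Toy lexicon, invented for illustration only (+1 positive, -1 negative).
TOY_LEXICON = {"good": 1, "hope": 1, "love": 1,
               "bad": -1, "fear": -1, "hate": -1}


def polarity(text, lexicon=TOY_LEXICON):
    """Return (label, score, subjectivity) for a stretch of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[t] for t in tokens if t in lexicon]
    score = sum(hits)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    # Subjectivity here: share of tokens that carry any sentiment value.
    subjectivity = len(hits) / len(tokens) if tokens else 0.0
    return label, score, subjectivity


def sentence_polarities(text):
    """Sentence-level analysis: label each sentence separately."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    return [(s, polarity(s)[0]) for s in sentences]


print(polarity("We hope for good news"))
print(sentence_polarities("I love it. I hate it."))
```

Scoring the whole text once corresponds to the document level; splitting first and scoring each sentence corresponds to the sentence level, where a mixed document can show both polarities.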

LIWC: LINGUISTIC INQUIRY AND WORD COUNT
◦ http://www.liwc.net/index.php
◦ designed by James W. Pennebaker, Roger J. Booth, and Martha E. Francis;
◦ the LIWC2015 master dictionary is composed of almost 6,400 words, word stems, and selected emoticons;
◦ analyzes over 70 dimensions of language:
  ◦ 4 general descriptor categories (total word count, words per sentence, percentage of words captured by the dictionary, and percentage of words longer than six letters)
  ◦ 22 standard linguistic dimensions (e.g., percentage of words in the text that are pronouns, articles, auxiliary verbs, etc.)
  ◦ 32 word categories tapping psychological constructs (e.g., affect, cognition, biological processes)
  ◦ 7 personal concern categories (e.g., work, home, leisure activities)
  ◦ 3 paralinguistic dimensions (assents, fillers, nonfluencies)
  ◦ 12 punctuation categories (periods, commas, etc.)
◦ the text analysis module was created in the Java programming language;
◦ analyzes .txt and .doc(x) files;
◦ output is given in .txt form but can easily be transferred to an Excel file.
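
The idea of reporting the percentage of words that fall into dictionary categories can be sketched as follows. This is not LIWC's implementation: the mini-dictionary, its category names and its entries are invented for illustration, and treating a trailing `*` as a word-stem marker is an assumption about the dictionary notation.

```python
import re

# Hypothetical mini-dictionary; NOT the content of the LIWC2015 dictionary.
TOY_DICTIONARY = {
    "posemo": ["love", "joy*", "hope*"],
    "negemo": ["fear*", "sin", "sorrow*"],
}


def entry_matches(token, entry):
    """A trailing '*' marks a word stem; otherwise match the whole word."""
    if entry.endswith("*"):
        return token.startswith(entry[:-1])
    return token == entry


def category_percentages(text, dictionary=TOY_DICTIONARY):
    """Percentage of tokens in the text captured by each category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    result = {}
    for category, entries in dictionary.items():
        count = sum(
            1 for token in tokens
            if any(entry_matches(token, entry) for entry in entries)
        )
        result[category] = 100.0 * count / total if total else 0.0
    return result


print(category_percentages("Hope and joyful love, no fear"))
```

Because every score is a share of the total word count, texts of different lengths can be compared directly, which is the point of percentage-based output.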
LIWC2015 OUTPUT
AFFECT IN ANGLICAN SERMONS (581 WORDS, 7%)

[Figure panels: Polarity; Negative lexicon]

UAM CORPUS TOOL
http://corpustool.com/index.html

Appraisal framework, designed to explore, describe and explain evaluative uses of language, including the ways language is used to adopt stances, to construct textual personas and to manage interpersonal positioning and relationships (Martin & White, 2005).

MANUAL ANNOTATION USING BUILT-IN ATTITUDE SYSTEM

RECENT DEVELOPMENTS AND NEW CHALLENGES

◦ https://corpus-analysis.com/
◦ multimodal corpora (see Allwood 2008, Adolphs and Carter 2013)
  ◦ ‘Corpus investigations focusing exclusively on the verbal component are at risk of overlooking the importance of the other semiotic codes to the meaning-making process in audiovisual products. By combining multimodal theory and the insights provided by corpora interrogation, recent scholarship has opened up fresh inquiry into the multi-semiotic nature of audiovisual texts’ (Baños et al. 2013).
◦ a ‘new modal order’ emerging in the era of digital literacies and computer-mediated communication
