Da Cunha Montane 2019 A Corpus Based Analysis of Textual Genres in The Administration Domain

Article
Discourse Studies
2020, Vol. 22(1) 3–31
© The Author(s) 2019
DOI: 10.1177/1461445619887538
https://doi.org/10.1177/1461445619887538

A corpus-based analysis of textual genres in the administration domain
Iria da Cunha
Universidad Nacional de Educación a Distancia (UNED), Spain
M Amor Montané
Universitat Pompeu Fabra (UPF), Spain
Abstract
Laymen occasionally have to write texts in specialized domains, such as when interacting with the
public administration. However, in Spanish, most studies related to administrative texts have focused
on texts written by the administration for laymen rather than texts that average citizens write to the
administration. Against this backdrop, this article aims to carry out the following: (a) a linguistic analysis
on a corpus of Spanish-language texts from five textual genres written by laymen and addressed to
the public administration (allegation, cover letter, letter of complaint, claim, and application) on the
textual, lexical and discourse levels; (b) a contrastive analysis that applies statistical techniques to
quantitative data to identify significant differences among these textual genres. The results show a
detailed and comprehensive understanding of the characteristics of each genre. Also, the statistical
results indicate that there are significant differences among some of the analyzed genres.
Keywords
Contrastive analysis, corpus linguistics, discourse analysis, genre analysis, public administration,
statistics
Corresponding author:
Iria da Cunha, Departamento de Filologías Extranjeras y sus Lingüísticas, Facultad de Filología, Universidad
Nacional de Educación a Distancia (UNED), Senda del Rey 7, Despacho 02, 28040 Madrid, Spain.
Email: iriad@flog.uned.es
Introduction
Writing specialized texts can be challenging, since they must respect concrete characteristics
of a specific textual genre (Cabré, 1999; Gotti, 2008) that depend on the
target domain and textual genre (Bhatia, 1993; Swales, 1990; van Dijk, 1989).
Authors of specialized texts are subject matter experts in their domains (Cabré,
1999), such as doctors in the medical domain, lawyers in the legal domain or econo-
mists in the financial domain. Recipients of these texts could be other specialists,
students and laymen. Textual genres addressed to them will be different. For exam-
ple, specialists tend to read scientific articles, students read textbooks, and laymen
read informative articles.
Nonetheless, average citizens rather than specialists have to write texts in certain
specialized domains, such as when interacting with the public administration. In this
domain, public servants regularly write texts for public consumption in specific textual
genres (such as certificates, notices, and subpoenas), yet average citizens must also cor-
respond with the administration in writing, for example, to present a claim, application
or letter of complaint. However, laymen tend not to be familiar with administrative
textual genres.
Some authors have tried to catalog and draw up a linguistic characterization of tex-
tual genres typically used in the public administration in Spain (Ayala et al., 2000;
Castellón, 2001; Sánchez Alonso, 2014). Based on this work, da Cunha and Montané
(2019) considered the textual genres most often written by Spanish laymen, and empiri-
cally analyzed the most frequently occurring textual genres posing the greatest writing
difficulties. Their conclusions highlighted five genres – allegation, cover letter, letter of
complaint, claim and application – and found that the greatest writing difficulties were
related to textual structure, selection of contents, lexical choices, text cohesion, and the
degree of formality. Nonetheless, to our knowledge, a linguistic characterization of
these textual genres is lacking.
Against this backdrop, this article aims to carry out the following:
(a) a linguistic analysis of the five aforementioned textual genres from the public
administration domain on the textual, lexical and discourse levels to gain a
comprehensive understanding of the characteristics of these types of texts; and
(b) a contrastive analysis that applies statistical techniques to quantitative data to
identify significant differences among these textual genres.
To achieve these objectives, this article draws on a corpus comprised of Spanish-language
texts from the five aforementioned textual genres. It analyzes different linguistic
categories from both a quantitative and qualitative point of view in order to gain a
systematic understanding of the representative characteristics of each genre. Finally, a
contrastive analysis using statistical techniques sheds light on significant differences
between the analyzed genres.
The ‘Literature review’ section of this article reviews the literature on administration
texts. The ‘Theoretical framework’ section presents the theoretical frameworks used to
carry out a linguistic analysis on the corpus. The ‘Methodology’ section describes the
different phases of the study. The ‘Linguistic characterization results’ section offers a
linguistic characterization of the analyzed genres, while the ‘Contrastive analysis results’
section presents the results of the contrastive analysis. Finally, the ‘Conclusions and
future work’ section lays out conclusions and directions for future work.
Literature review
Textual genres written in the administration domain allow citizens and public servants
to communicate with one another. These genres can be classified based on the partici-
pants in an administrative act, the pragmatic aim of documents (e.g. communicating
information, sharing a ruling or sentence, etc.) or the phases of the administration (initial,
processing or concluding phase) (García de Toro, 2009). Our study adopts the first
criterion. It considers five textual genres that laymen use when writing to public
servants which are as follows: an allegation, in which citizens involved in an adminis-
trative procedure defend their rights and legitimate interests; a cover letter, which serves
to introduce an individual, explain why they are a good fit for a job, and accompany a
curriculum vitae (CV); a letter of complaint, which expresses an individual’s dissatis-
faction with the administration or a public service; a claim, which calls for damages to
redress irregularities or unfair treatment, and which tends to be administrative or finan-
cial in nature in this domain, challenging taxes or fines levied by the administration; and
an application, which allows an individual to officially submit a petition to the admin-
istration about a given matter.1
The Spanish-language literature on administrative genres tends to classify genres
(generally based on one or more of the aforementioned criteria), describe their characteristics,
and provide drafting templates containing real or, more frequently, ad hoc examples
(cf. Ayala et al., 2000; Castellón, 2001; Ministerio de las Administraciones Públicas,
2003; Montolío and Tascón, 2017; Sánchez Alonso, 2014). Textual genres in the admin-
istration domain can be described and characterized in varying degrees of detail. Often,
these studies describe lexical features (e.g. archaisms and paraphrasing), morphological
features (e.g. the use of specific verb forms), or syntactic features (e.g. the length and
structure of phrases). They also tend to present an outline of the typical text structure as
well as some frequently occurring phrases. Moreover, recent guidelines have called for
administrative documents to use clear, simple language, especially when documents are
for laymen (Montolío and Tascón, 2017).
Some studies have approached administrative texts from the perspective of translation
studies (cf. Abaitua et al.’s (1997) corpus-based study about government gazettes,
or Eurrutia’s (2016) study about administrative immigration texts). Research frequently
uses real texts as case studies, which makes it qualitative rather than quantitative in
nature, and focuses on the drafting, several linguistic features, and the style of certain
administrative texts. For example, Castellón (2009) analyses clarity, legibility and sexist
language in textual genres addressed to laymen (edicts, administrative letters and reso-
lutions, among others); and Torres (2016) analyses some specific lexical, morphosyn-
tactic and stylistic features in a small diachronic corpus including texts extracted from
government gazettes.
Also, there are authors who study the textual genres in the context of the Spanish
electronic administration. For example, Ferrando Martínez (2013) makes proposals for
some genres (e.g. applications, administrative resolutions and notifications) related to
structure, metadata and confidentiality, among other aspects, from the point of view of
the documentation domain.
With regard to the corpus of our research, like some of the aforementioned studies,
real texts from the administration domain are gathered. Specifically, it is a balanced
corpus including five textual genres which are as follows: allegation, cover letter, let-
ter of complaint, claim and application. To our knowledge, no corpus-based analysis
of these genres in Spanish has been carried out. Unlike other studies, our corpus is
exclusively comprised of texts written by laymen. Concerning the methodology used
in our research, a systematic analysis of textual, lexical and discourse features is per-
formed, using both qualitative and quantitative approaches. The systematic approach
adopted herein also allows for comparison among the analyzed genres, establishing a
methodology for carrying out a contrastive analysis of textual genres from a linguistic
point of view.
Theoretical framework
This section summarizes and presents key features of the complementary theoretical
frameworks that were adopted for this research, related to three linguistic levels:
textual, lexical and discourse levels. In the first place, regarding the textual level, we
start from two approaches. On the one hand, the approach proposed by van Dijk
(1977) is used, which indicates that textual genres usually follow a clearly codified
and widely accepted pattern. For example, the research article usually consists of
Introduction, Problem, Solution and Conclusions. Van Dijk (1989) defines the super-
structure as the organizational structure of a text, which varies depending on the type
of text. In general, the superstructure is shown by means of different sections, some
of them including titles and subtitles. On the other hand, the moves proposed by
Swales (1990) are used to characterize the textual structure of the genres of our cor-
pus, along the same lines as the corpus-based analysis proposed by Biber et al.
(2007). A move ‘represents a stretch of text serving a particular communicative (that
is, semantic) function’ (Upton and Cohen, 2009: 589). Decisions related to the clas-
sification of the moves, as López-Ferrero and Bach (2016: 291) mention, ‘are made
on the basis of linguistic evidence (lexical cues) and comprehension of the text’.
In the second place, concerning the lexical level, we follow the Communicative
Theory of Terminology (CTT) by Cabré (1999), which is a terminology and specialized
discourse theory that highlights the importance of the communicative dimension of spe-
cialized communication. According to CTT, textual genres produced in specialized
domains present some global characteristics, such as precision, concision, systematicity,
impersonality and objectivity. These characteristics are evidenced linguistically in texts
through different features. For example, concision can be shown through the use of
initialisms, impersonality can be expressed using the passive voice, and objectivity can
be achieved by avoiding subjectivity markers, among other strategies (cf. Cabré et al.,
2010; da Cunha et al., 2011).
In the third place, with regard to the discourse level, we use Mann and Thompson’s
(1988) Rhetorical Structure Theory (RST), which describes how texts are organized in
terms of discourse relations among a text’s discourse segments. As a rule, one of the segments
is more essential to the speaker’s purpose (nucleus), while the other (satellite)
provides some rhetorical information about it. Although relations are not always marked,
they can be indicated explicitly using discourse connectors. Researchers have found that
20%–43% of relations are marked in English, Spanish and Basque, depending on the
corpus (da Cunha et al., 2012a; Iruskieta and da Cunha, 2010; Taboada, 2006).
In our research, in order to carry out the linguistic analysis of the genres included in
the corpus, we adopt the main proposals of the aforementioned complementary frameworks:
first, on the textual level, superstructure and moves; second, on the lexical level,
the global characteristics of texts produced in specialized domains; and, third, on the
discourse level, discourse segments, relations and connectors.
Methodology
Building the corpus
As noted earlier, this work considers five administration-related textual genres which are
as follows: allegation, cover letter, letter of complaint, claim and application. The corpus
comprises 20 texts per textual genre for a total of 100 texts from the administration
domain. Corpus statistics are included in Table 1.
Texts were selected from a wide variety of sources, including Spanish universities,
civic associations, lawyers’ associations, document repositories, and job search sites,
among others. For example, the corpus includes a letter of complaint presented to the
Spanish Ombudsman,2 and an allegation3 and a claim4 presented to different Spanish
local governments.
Table 1. Corpus statistics.
Textual genre | Number of texts | Number of words
Allegation | 20 | 19,405
Cover letter | 20 | 3550
Letter of complaint | 20 | 10,952
Claim | 20 | 11,498
Application | 20 | 3283
Total | 100 | 48,688
Figure 1. Database structure.
Categories analyzed
The data were selected and analyzed on the textual, lexical and discourse levels follow-
ing the frameworks mentioned in section ‘Theoretical framework’. Figure 1 presents an
overview of the categories included in the database and analyzed herein.5
Textual level. On the textual level, the following three categories were considered:
• Sections used to structure the text, that is, each of the sections that could be clearly
identified, either because section breaks were included or because their topic was
distinct. For example, in an allegation, sections included ‘Heading’, ‘Personal
details’, ‘Statement of facts’, ‘Presentation of allegations’ and ‘Conclusion’.
• Titles included at the beginning of each section. For example, the ‘Statement of
facts’ section of an application includes the title ‘STATES:’.
• Moves each section includes, that is, semantic and functional textual units with a
specific communicative purpose. For example, the ‘Statement of facts’ section of
a claim includes two moves: ‘Reason(s) for writing the claim’ and ‘Reference to
attachments’.6
Table 2. Subjectivity markers.
Type of subjectivity marker | Subjectivity markers
Exclamatory sentences | Use of ¡ or !
Superlatives | Lexical units ending in -ísimo/-ísima/-ísimos/-ísimas
Nouns conveying opinion | pena (shame), esperanza (hope), deseo (desire)
Adverbs conveying opinion | desgraciadamente (unfortunately), ojalá (hopefully), seguramente (surely), acertadamente (certainly), desesperadamente (desperately), maravillosamente (marvelously), perfectamente (perfectly), falsamente (falsely), afortunadamente (fortunately), probablemente (probably), naturalmente (naturally), evidentemente (evidently), inevitablemente (inevitably), indiscutiblemente (indisputably), indudablemente (undoubtedly), forzosamente (unavoidably), curiosamente (strangely), paradójicamente (paradoxically)
Variable phrases conveying opinion | a mi parecer (in my view), en mi opinión (in my opinion), desde mi punto de vista (from my point of view)
Set phrases conveying opinion | sin duda (no doubt), sin ninguna duda (without a doubt), sin duda alguna (without any doubt whatsoever), de seguro que (certainly), quizá (perhaps), quizás (perhaps), con buen criterio (sensibly), es posible que (it is possible that), por supuesto (of course), tal vez (maybe), desde luego (of course), de ninguna manera (no way), en absoluto (at all), en modo alguno (in the slightest), por suerte (luckily)
Adjectives conveying opinion | bueno (good), malo (bad), peor (worse), mejor (better), magnífico (magnificent), perfecto (perfect)
Lexical level. On the lexical level, the following categories were considered:
• Initialisms: the definition of siglas propias (‘proper initialisms’) by Giraldo
(2008) is adopted, that is, initialisms exclusively constituted by the initials of the
lexical units included in a syntagmatic structure (e.g. WHO for World Health
Organization).
• Definitions, as per three definitional patterns: concebir como (‘conceive of as’),
definir como (‘define as’), and denominar (‘refer to as’) (Alarcón, 2009).
• Morphosyntactic features: use of the active and passive voices, and use of the first
person singular and plural.
• Subjectivity markers: based on a clearly defined group of subjective units (Otaola,
1988), such as superlatives and nouns, adjectives and adverbs conveying opinion.
The complete list of subjectivity markers used in this work is included in Table 2.
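As a rough illustration of how some of the marker types in Table 2 could be counted automatically, the following Python sketch matches superlative suffixes, exclamation marks and a few opinion phrases with regular expressions. The function name `count_subjectivity_markers`, the reduced phrase list and the sample sentence are assumptions made for this example only; the sketch does not reproduce the tooling used in the study.

```python
import re

# Reduced, illustrative subset of the markers in Table 2 (assumption: the
# full list and the matching rules used in the study are richer than this).
SUPERLATIVE = re.compile(r"\b\w+ísim[oa]s?\b", re.IGNORECASE)
EXCLAMATION = re.compile(r"[¡!]")
OPINION_PHRASES = ["sin duda", "por supuesto", "tal vez", "en mi opinión"]


def count_subjectivity_markers(text: str) -> dict:
    """Count a few subjectivity marker types in a Spanish text."""
    lowered = text.lower()
    phrase_hits = sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
        for phrase in OPINION_PHRASES
    )
    return {
        "superlatives": len(SUPERLATIVE.findall(text)),
        "exclamations": len(EXCLAMATION.findall(text)),
        "phrases": phrase_hits,
    }


# Hypothetical sentence, not taken from the corpus.
sample = "¡Es un servicio malísimo! Sin duda, en mi opinión, debe mejorar."
counts = count_subjectivity_markers(sample)
```

Pattern matching of this kind treats every surface match as a marker, so the counts it yields would still need manual checking.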
Discourse level. On the discourse level, words, sentences, discourse segments and dis-
course connectors were analyzed for each text. This article adopts the definition of
discourse segment put forward by Tofiloski et al. (2009: 77): ‘Discourse segmentation is
the process of decomposing discourse into elementary discourse units (EDUs), which
may be simple sentences or clauses in a complex sentence, and from which discourse
trees are constructed’. Specifically, we use the criteria for discourse segmentation most
used in Spanish described in da Cunha et al. (2012b), da Cunha and Iruskieta (2010) and
Iruskieta et al. (2015). See, for instance, examples 1 and 2. These examples include sen-
tences in Spanish extracted from two different cover letters of our corpus (with their
corresponding translation into English). In these sentences, the different discourse seg-
ments are indicated in square brackets.
1. [Me pongo en contacto con ustedes para hacerles llegar mi currículum vítae,]
[ya que estoy buscando nuevas oportunidades profesionales.]
[I am writing to share my CV,] [as I am seeking new professional opportunities.]
2. [Tengo el gusto de remitirles mi currículum vítae] [con el objetivo de participar
en el proceso de selección.]
[I am pleased to submit my CV,] [with the objective of expressing my interest in
participating in the recruitment process.]
Our study identifies and classifies connectors related to eight RST discourse relations:
Antithesis, Cause, Concession, Condition, Contrast, Purpose, Restatement and Summary.7
These eight relations were selected because they occur most frequently in the RST Spanish
Treebank (da Cunha et al., 2011) and are also regularly indicated in this corpus using con-
nectors. The discourse connectors associated with the aforementioned relations and con-
sidered for the sake of this study have been extracted from da Cunha et al. (2011, 2012a).
This list of connectors includes both discourse connectors as they are traditionally viewed
(such as porque (‘because’) and si (‘if’)) and more complex types of connectors, such as
those including verbs (e.g. la razón es que (‘the reason is that’)). The complete list of
discourse connectors used in this work is included in Table 3.
Thus, the connectors indicated in Table 3 are searched for in the corpus. For instance, in
reference to the examples set out earlier: in example 1, the connector ya que (‘as’), which
expresses Cause, is detected between the two discourse segments included in the sen-
tence, and, in example 2, the connector con el objetivo de (‘with the objective of’), which
expresses Purpose, is found linking the two different discourse segments of the sentence.
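The connector search described above can be sketched in Python as a dictionary lookup combined with whole-phrase matching. The dictionary below holds only a small subset of Table 3, and the function name `count_connectors` is a hypothetical choice for this illustration; the sketch is not the extraction pipeline used in the study.

```python
import re
from collections import Counter

# Illustrative subset of Table 3; the full inventory is longer.
CONNECTORS = {
    "Cause": ["ya que", "porque", "debido a", "puesto que"],
    "Purpose": ["para que", "con el fin de", "con el objetivo de"],
    "Concession": ["aunque", "a pesar de"],
}


def count_connectors(text: str) -> Counter:
    """Count connector occurrences per discourse relation (case-insensitive)."""
    counts = Counter()
    for relation, connectors in CONNECTORS.items():
        for connector in connectors:
            pattern = r"\b" + re.escape(connector) + r"\b"
            counts[relation] += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Examples 1 and 2 from the text, without the segment brackets.
examples = (
    "Me pongo en contacto con ustedes para hacerles llegar mi currículum vítae, "
    "ya que estoy buscando nuevas oportunidades profesionales. "
    "Tengo el gusto de remitirles mi currículum vítae con el objetivo de "
    "participar en el proceso de selección."
)
result = count_connectors(examples)
```

Applied to examples 1 and 2, the sketch finds one Cause connector (ya que) and one Purpose connector (con el objetivo de). String matching alone cannot tell whether a form is actually functioning as a connector in context, so short or ambiguous forms would require additional checks.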
Semi-automatic corpus analysis

The linguistic analysis of the aforementioned levels of the corpus took place in two
phases. First, data were manually extracted for sections, titles and moves (from the tex-
tual level). Second, the remaining data were extracted automatically (from the lexical
and discourse levels), by using the following automatic Natural Language Processing
(NLP) tools: a morphosyntactic analyzer (Freeling; Atserias et al., 2006), and a discourse
segmentation system (DiSeg; da Cunha et al., 2012b).
Linguistic characterization of the analyzed genres

Table 3. List of discourse connectors by discourse relation.
Relation | Discourse connectors
Antithesis | pero (but), sin embargo (however), no obstante (nevertheless), de todos modos (in any case), de todas maneras (in any event), de todas formas (anyway), con todo (in spite of)
Cause | ya que (as), porque (because), debido a (due to), puesto que (since), pues (as), esto se debe a (this is due to), se debe a (owing to), este hecho está causado por (this is caused by), por eso (that’s why), por ello (therefore), debido a eso (due to this), debido a ello (due to that), por ese motivo (on account of this), por esa razón (this is why), la causa es que (the cause is), el motivo es que (the motive is), la razón es que (the reason is)
Concession | aunque (although), aun cuando (even when), si bien (though), de cualquier forma (in any case), a pesar de (despite), a pesar de eso (despite this), a pesar de ello (despite that), aun así (nonetheless)
Condition | si (if), en caso de (in case), en caso de que eso ocurra (in case this occurs), en caso de que eso suceda (in case this happens), en caso de que sea así (in case it is so), si eso ocurre (if this occurs), si eso sucede (if this happens), si es así (if this is the case), siempre que eso ocurra (provided this occurs), siempre que eso suceda (provided this happens)
Contrast | en lugar de (instead of), a diferencia de (as opposed to), en cambio (whereas), por el contrario (on the contrary), en lugar de eso (instead of this)
Purpose | para que (so that), con el fin de (with the aim of), a fin de (aiming to), con la finalidad de (with the purpose of), con el objetivo de (with the objective of), con el objeto de (with the objective of), su propósito es (its aim is), su objetivo es (its goal is), tiene como propósito (the goal is to), tiene como objetivo (the objective is to), lo que pretende es (its aim is to), con esa finalidad (with this purpose), con ese objetivo (with this objective), con ese propósito (with this aim), con ese fin (with this goal)
Restatement | es decir (that is to say), o sea (that is), esto es (that means), dicho de otro modo (to put it another way), en otras palabras (in other words)
Summary | en resumen (in summary), para resumir (to summarize), a modo de resumen (by way of summary), en conclusión (in conclusion), para concluir (to conclude), a modo de conclusión (by way of conclusion), en definitiva (in short)
In order to characterize textual genres from a textual point of view, similarities in the
sections, titles and moves of each of the five analyzed genres were measured. A
frequency threshold of 50% was established to determine relevance. Consequently, if a
section, title or move appeared in at least 50% of texts corresponding to a textual genre,
it was deemed to be a relevant feature of that genre. This information on textual structure
was utilized to design a model structure including the sections, titles and moves that tend
to appear most frequently in texts for each genre.
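The 50% relevance threshold can be illustrated with a short Python sketch that keeps only the features found in at least half of the texts of a genre. The function name `relevant_features` and the annotations below are hypothetical; the actual data in the study were extracted manually.

```python
from collections import Counter


def relevant_features(texts, threshold=0.5):
    """Return the features found in at least `threshold` of the texts of a genre."""
    doc_freq = Counter()
    for features in texts:
        doc_freq.update(set(features))  # count each feature once per text
    return {f for f, n in doc_freq.items() if n / len(texts) >= threshold}


# Hypothetical annotations: sections identified in four texts of one genre.
annotated = [
    ["Heading", "Personal details", "Request"],
    ["Heading", "Request", "Conclusion"],
    ["Heading", "Personal details"],
    ["Heading", "Request", "Postscript"],
]
model_sections = relevant_features(annotated)
```

In this toy input, ‘Personal details’ appears in exactly two of four texts and is therefore retained, whereas ‘Conclusion’ and ‘Postscript’ fall below the threshold.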
The study also analyzed the lexical and discourse aspects mentioned in sections
‘Lexical level’ and ‘Discourse level’. To characterize the genres, the averages of the different
analyzed linguistic aspects were used. As the texts included in the corpus had different
sizes (see Table 1), the data were normalized by using the frequency per million
words (fpmw, Biber et al., 1998).8 The fpmw is calculated by dividing the absolute frequency
of the analyzed linguistic characteristic by the total number of words in the corpus;
the result is multiplied by 1 million. The equation can be represented in this way:
fpmw = (absolute frequency / number of words in the corpus) × 1,000,000
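As a worked illustration of this normalization, the following Python sketch implements the equation directly. The function name `fpmw` mirrors the abbreviation used above, and the count of 12 occurrences is a hypothetical figure; only the subcorpus size of 3550 words is taken from Table 1.

```python
def fpmw(absolute_frequency: int, total_words: int) -> float:
    """Frequency per million words: (absolute frequency / total words) x 1,000,000."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return absolute_frequency / total_words * 1_000_000


# Hypothetical count: 12 occurrences of a feature in the 3550-word
# cover letter subcorpus (size from Table 1).
value = fpmw(12, 3550)
```

With these inputs the normalized value is roughly 3380.3 occurrences per million words, which makes counts from subcorpora of different sizes directly comparable.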

Contrastive analysis of the analyzed genres

Finally, quantitative data concerning the textual, lexical and discourse categories for each
textual genre were analyzed contrastively in order to determine which genres were most
similar. Several statistical tests were conducted to compare the normalized data across the
three linguistic levels (following Barón and Téllez, 2004). Levene’s test was used to assess
the homogeneity of variances. Later, several different tests were run to compare averages. An
analysis of variance (ANOVA) test was conducted to determine if it was possible to dif-
ferentiate among textual genres for parametric variables, while the Kruskal-Wallis test for
independent samples was utilized for non-parametric variables. Subsequently, in order to
determine if there were statistically significant differences among specific textual genres,
post hoc multiple comparison tests were conducted. Based on the results of Levene’s test,
either Tukey’s HSD test (for variables with equal within-group variance) or Dunnett’s T3
test (for variables with unequal variances) was utilized. Finally, a discriminant analysis
was conducted to describe significant differences among the five textual genres and deter-
mine which variables allowed for discriminating between them.
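To make the comparison of averages more concrete, the following Python sketch computes the F statistic of a one-way ANOVA from first principles, as the ratio of between-group to within-group mean squares. The function name `anova_f` and the sample values are hypothetical, and the sketch covers only this one step; p-values, the Kruskal-Wallis test and the post hoc tests would in practice come from a statistics package, which the description above does not name.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of numeric samples (one per genre)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical normalized values of one variable in three genres.
samples = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value = anova_f(samples)
```

For this toy input the between-group sum of squares is 26 and the within-group sum of squares is 6, which gives F = (26/2)/(6/6) = 13. A large F value suggests that at least one group mean differs from the others.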
Linguistic characterization results

This section presents the results of the linguistic analysis of the five textual genres from
the administration domain on the textual, lexical and discourse levels, providing a com-
prehensive understanding of the most outstanding characteristics of each genre.
Allegation
As explained in section ‘Linguistic characterization of the analyzed genres’, sections,
titles and moves were selected taking a frequency threshold into account, in order to
design a model structure including those that tend to appear most frequently in this genre.
Table 4 presents this model structure, which includes six sections without titles, and a
different number of moves depending on the section, between one and five.9
As shown in Table 5, on the textual level, titles rarely headed a section in allegations,
and each section tended to include more than one move. On the lexical level, a limited
number of initialisms were found, no definitions were identified, and few subjective
units were detected. In this genre, the active voice was primarily used rather than the
passive voice. Moreover, verbs in the first person singular were used more frequently
than verbs in the first person plural. On the discourse level, sentences often comprised
various discourse segments. The most frequent connectors expressed Cause and Purpose.
Table 6 includes the different connectors found in the corpus of allegations for the
eight discourse relations analyzed in this research. The Cause and Purpose connectors
present more variation than the other connectors, since they include five and four differ-
ent connectors, respectively.
Cover letter
Table 7 presents the model structure for the cover letter genre. This structure includes
four sections without titles, and a very different number of moves depending on the sec-
tion, between one and ten.
Table 4. Model structure for the allegation textual genre.
Sections | Titles | Moves
Heading | No title provided | Addressee’s job title; Institution to which the allegation is sent
Personal details | No title provided | Name; Age; National identification number; Mailing address; Brief description of the author
Statement of facts | No title provided | Reason(s) for writing the allegation
Presentation of allegations | No title provided | List of allegations presented by the author; Reference to attachments
Request | No title provided | Description of request(s) for the institution receiving the allegation
Conclusion | No title provided | Author’s location; Date; Author’s signature
Table 5. Normalized averages related to analyzed data for the allegation textual genre.
Level | Variable | Average
Textual | # of sections | 16,923.9
Textual | # of titles | 2426.5
Textual | # of moves | 33,724.9
Lexical | # of initialisms | 4548.4
Lexical | # of definitions | 0.0
Lexical | # of subjective units | 3649.4
Lexical | # of verbs in the active voice | 95,194.7
Lexical | # of verbs in the passive voice | 2252.2
Lexical | # of verbs in the first person singular | 9077.1
Lexical | # of verbs in the first person plural | [missing]
Discourse | # of sentences | 45,398.0
Discourse | # of discourse segments | 72,647.4
Discourse | total # of connectors | 4505.9
Discourse | # of connectors expressing Antithesis | 33.8
Discourse | # of connectors expressing Cause | 1708.0
Discourse | # of connectors expressing Concession | 429.0
Discourse | # of connectors expressing Condition | [missing]
Discourse | # of connectors expressing Contrast | 145.7
Discourse | # of connectors expressing Purpose | [missing]
Discourse | # of connectors expressing Restatement | [missing]
Discourse | # of connectors expressing Summary | 8.4
Table 6. Connectors found in the corpus of allegations.
Relation | Discourse connectors
Antithesis | pero (but), sin embargo (however)
Cause | ya que (as), porque (because), debido a (due to), puesto que (since), pues (as), por ello (therefore)
Concession | aunque (although), si bien (though)
Condition | si (if), en caso de (in case)
Contrast | en lugar de (instead of), por el contrario (on the contrary)
Purpose | para que (so that), con el fin de (with the aim of), con el objetivo de (with the objective of), con el objeto de (with the objective of)
Restatement | es decir (that is to say)
Summary | para concluir (to conclude)
Table 7. Model structure for the cover letter textual genre.
Sections | Titles | Moves
Heading | No title provided | Author’s name; Author’s mailing address; Author’s telephone number; Institution to which the cover letter is sent; Institution’s mailing address; Author’s location; Date
Salutation | No title provided | Salutation
Body | No title provided | Reference to where the job was posted (if responding to a job advertisement); Reference to the date the job was posted (if responding to a job advertisement); Reference to the type of job posted (if responding to a job advertisement); Information about the author’s professional experience; Information about the author’s academic qualifications; Reference to attached CV; Explanation of author’s skills; Explanation of author’s interest in the institution; Request for an interview; Reference to author’s availability
Closing | No title provided | Closing
Conclusion | No title provided | Author’s signature; Author’s name
As indicated in Table 8, on the textual level, in cover letters none of the sections
included a title and each section tended to include various moves. On the lexical level,
this genre contained some initialisms, did not include any definitions, and presented a
high frequency of subjective units. Furthermore, this genre predominantly used the
active voice rather than the passive voice. Finally, analysis revealed that first person
singular verbs appeared more frequently than first person plural verbs. On the discourse
level, we highlight that sentences rarely comprised various discourse segments.
The most frequent connectors expressed Purpose and Cause. There were no connectors
expressing other discourse relations.
Table 8. Normalized averages related to analyzed data for the cover letter textual genre.
Level | Variable | Average
Textual | # of titles | 0.0
Textual | # of moves | 109,431.8
Lexical | # of definitions | 0.0
Lexical | # of subjective units | 9725.3
Lexical | # of verbs in the first person singular | [missing]
Lexical | # of verbs in the first person plural | [missing]
Discourse | # of discourse segments | 110,922.8
Discourse | total # of connectors | 5099.3
Discourse | # of connectors expressing Concession | [missing]
Discourse | # of connectors expressing Condition | [missing]
Discourse | # of connectors expressing Contrast | 0.0
Discourse | # of connectors expressing Purpose | [missing]
Discourse | # of connectors expressing Restatement | 0.0
Discourse | # of connectors expressing Summary | 0.0
Table 9. Connectors found in the corpus of cover letters.
Relation | Discourse connectors
Cause | porque (because), por eso (that’s why), por ello (therefore)
Purpose | para que (so that), con el fin de (with the aim of), a fin de (aiming to), con el objetivo de (with the objective of), con el objeto de (with the objective of)
Table 9 shows the connectors detected in the corpus of cover letters. Both Cause and
Purpose connectors present variation, although Purpose connectors could be highlighted,
since they present five different variants.
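The averages reported in tables such as Table 8 are normalized frequencies. As a minimal sketch,
assuming the normalization is the usual frequency per million words (absolute frequency divided
by the number of words in the corpus, multiplied by 1,000,000), the calculation could look as
follows. The function name and the counts in the usage line are hypothetical and are not taken
from the corpus.

```python
def frequency_per_million_words(absolute_frequency: int, corpus_words: int) -> float:
    """Normalize an absolute count to a frequency per million words (fpmw)."""
    if corpus_words <= 0:
        raise ValueError("corpus_words must be positive")
    return absolute_frequency / corpus_words * 1_000_000


# Hypothetical counts: 12 occurrences of a feature in a 25,000-word subcorpus.
example_fpmw = frequency_per_million_words(12, 25_000)
print(round(example_fpmw, 1))
```

Normalizing in this way makes counts comparable across genres whose subcorpora differ in size.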
Letter of complaint
Table 10 presents the model structure for the letter of complaint genre. This structure
includes six sections without titles, and one or two moves in each section.
Table 10. Model structure for the letter of complaint textual genre.

Sections            Titles              Moves
Heading             No title provided   Institution to which the letter of complaint is sent
                                        Date
Personal details    No title provided   Author’s name
                                        Author’s mailing address
Statement of facts  No title provided   Reason(s) for writing the letter of complaint
Request             No title provided   Description of request(s) for the institution
                                        receiving the letter of complaint
Closing             No title provided   Closing
Conclusion          No title provided   Author’s signature
                                        Author’s name
Table 11. Normalized averages related to analyzed data for the letter of complaint textual genre.

Level       Variable                   Average
Textual     # of titles                2747.7
Lexical     # of definitions           0.0
Discourse   # of discourse segments    87,616.1
On the textual level, as can be observed in Table 11, most sections in letters of complaint
did not include titles. Furthermore, each section tended to include more than one move. On the
lexical level, the number of initialisms used was very low and definitions were not used. By
contrast, we highlight the use of subjective units. Moreover, in this genre very few verbs in
the passive voice were used. Finally, it should be pointed out that first person plural verbs
appeared more frequently than first person singular verbs. On the discourse level, sentences
rarely comprised more than one discourse segment. The most frequent connectors expressed Cause.
All connector types except those expressing Condition were found in the genre.
Table 12. Connectors found in the corpus of letters of complaint.

Relation      Discourse connectors
Antithesis    sin embargo (however)
Cause         porque (because), debido a (due to), pues (as), por ese motivo (on account of this),
              por esa razón (this is why)
Concession    aunque (although), si bien (though), a pesar de (despite)
Contrast      por el contrario (on the contrary)
Purpose       para que (so that), con el fin de (with the aim of), a fin de (aiming to)
Restatement   es decir (that is to say), esto es (that means)
Summary       en definitiva (in short)
Table 13. Model structure for the claim textual genre.

Heading No title provided Institution to which the claim is sent
Age
Mailing address
Brief description of the author
Lodging the claim No title provided Expressions for lodging the claim
Statement of facts No title provided Reason(s) for writing the claim
Reference to attachments
Request No title provided Description of request(s) for the institution receiving the claim
Date
Table 12 presents the connectors found in the corpus of letters of complaint. Cause connectors
show the most variation, with five different connectors detected.
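The variation described here is simply the number of distinct connectors listed per discourse
relation. The following sketch, which is an illustration and not part of the original analysis,
encodes the connectors listed in Table 12 and identifies the relation with the most variants.
The variable names are ours.

```python
# Connectors listed in Table 12 (letters of complaint), grouped by discourse relation.
connectors_by_relation = {
    "Antithesis": ["sin embargo"],
    "Cause": ["porque", "debido a", "pues", "por ese motivo", "por esa razón"],
    "Concession": ["aunque", "si bien", "a pesar de"],
    "Contrast": ["por el contrario"],
    "Purpose": ["para que", "con el fin de", "a fin de"],
    "Restatement": ["es decir", "esto es"],
    "Summary": ["en definitiva"],
}

# Number of distinct connector variants per relation.
variant_counts = {
    relation: len(set(forms)) for relation, forms in connectors_by_relation.items()
}

# Relation with the largest number of variants.
most_varied = max(variant_counts, key=variant_counts.get)
print(most_varied, variant_counts[most_varied])
```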
Claim
Table 13 presents the model structure for the claim genre. This structure includes five
sections without titles, and between one and five moves in each section.
On the textual level, claims did not always include titles in their sections, as shown in
Table 14. Each section tended to include two moves. On the lexical level, this genre contained
some initialisms and subjective units, but did not include any definitions. Moreover, this
genre primarily used the active voice and first person singular verbs, rather than the passive
voice and first person plural verbs, which appeared infrequently. On the discourse level,
sentences in claims often comprised more than one discourse segment. Furthermore, all analyzed
types of connectors were found in this genre, although the most frequent connectors expressed
Cause.
Table 14. Normalized averages related to analyzed data for the claim textual genre.

Level       Variable                   Average
Textual     # of titles                4944.5
Lexical     # of definitions           0.0
Discourse   # of discourse segments    65,206.2
Table 15. Connectors found in the corpus of claims.

Relation      Discourse connectors
Antithesis    sin embargo (however), no obstante (nevertheless)
Cause         ya que (as), porque (because), debido a (due to), puesto que (since), pues (as),
              por ello (therefore)
Concession    aunque (although), si bien (though)
Condition     en caso de (in case)
Contrast      en lugar de (instead of)
Purpose       para que (so that), con el fin de (with the aim of), a fin de (aiming to)
Restatement   es decir (that is to say), esto es (that means)
Summary       en definitiva (in short)
Table 15 includes the connectors found in the corpus of claims. Again, Cause connectors show
the most variation, in this case with six different connectors detected.
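To illustrate how connectors such as those in Table 15 might be counted in a text, the sketch
below matches each connector as a whole phrase and tallies the hits by discourse relation. It
is a simplified illustration and not the procedure used in this study: the function name and
the sample sentence are hypothetical, and plain string matching counts every occurrence of a
form, including uses that do not function as connectors, so the output would still need manual
review.

```python
import re
from collections import Counter

# Connectors listed in Table 15 (claims), grouped by discourse relation.
CLAIM_CONNECTORS = {
    "Antithesis": ["sin embargo", "no obstante"],
    "Cause": ["ya que", "porque", "debido a", "puesto que", "pues", "por ello"],
    "Concession": ["aunque", "si bien"],
    "Condition": ["en caso de"],
    "Contrast": ["en lugar de"],
    "Purpose": ["para que", "con el fin de", "a fin de"],
    "Restatement": ["es decir", "esto es"],
    "Summary": ["en definitiva"],
}


def count_connectors(text: str) -> Counter:
    """Count whole-phrase matches of each connector, grouped by relation."""
    counts = Counter()
    lowered = text.lower()
    for relation, forms in CLAIM_CONNECTORS.items():
        for form in forms:
            # Word boundaries keep 'pues' from matching inside 'puesto' or 'respuesta'.
            pattern = r"\b" + re.escape(form) + r"\b"
            counts[relation] += len(re.findall(pattern, lowered))
    return counts


# Hypothetical sample sentence, not taken from the corpus.
SAMPLE_TEXT = (
    "Presento esta reclamación porque la tasa se cobró dos veces. "
    "Sin embargo, no he recibido respuesta, aunque aporté el justificante "
    "para que se revisara el cobro."
)

for relation, n in count_connectors(SAMPLE_TEXT).items():
    if n:
        print(relation, n)
```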
Table 16. Model structure for the application textual genre.

Mailing address
Telephone number
Statement of facts STATES: Reason(s) for writing the application
Reference to attachments
Request REQUESTS: Description of request(s) for the
institution receiving the application
Date
Author’s name
Addressee’s job title
Institution to which the application is sent
Application
Table 16 presents the model structure for the application genre. This structure includes
four sections, two of which (‘Statement of facts’ and ‘Request’) are headed by a title, which
usually appears in capital letters followed by a colon. The number of moves per section varies
between one and five.
On the textual level, not all sections in applications included a title, but titles were more
frequent than in the other analyzed genres, as can be observed in Table 17. Also, sections
tended to have several moves. On the lexical level, this genre included a high number of
initialisms, but no definitions and few subjective units were identified. Moreover, this genre
predominantly used the active voice rather than the passive voice, although the passive voice
was used more often than in the other analyzed genres. Finally, verbs in the first person
singular were used more frequently than verbs in the first person plural. On the discourse
level, few sentences contained more than one discourse segment. The application was clearly the
genre in which connectors were used least frequently. The only connectors found expressed
Purpose and Cause.
Table 18 shows the connectors found in the corpus of applications. In this case, only one
connector each was detected for Cause and for Purpose.
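The observation that few sentences contained more than one discourse segment can be checked
roughly against the normalized averages in Table 17. The sketch below assumes that these
averages are frequencies per million words; because they are averages, the ratios are only
approximate estimates. The variable names are ours.

```python
# Normalized averages reported in Table 17 (application genre).
SENTENCES_FPMW = 74_803.8
DISCOURSE_SEGMENTS_FPMW = 91_482.1

# Assumption: both values are frequencies per million words.
words_per_sentence = 1_000_000 / SENTENCES_FPMW
segments_per_sentence = DISCOURSE_SEGMENTS_FPMW / SENTENCES_FPMW

print(f"{words_per_sentence:.1f} words per sentence")            # 13.4
print(f"{segments_per_sentence:.2f} discourse segments per sentence")  # 1.22
```

A ratio close to one segment per sentence is consistent with sentences that are mostly made up
of a single discourse segment.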
Contrastive analysis results

This section presents the results of the statistical analysis of quantitative features in the
textual genres characterized earlier. It compares and contrasts the genres, highlighting
significant differences.
Concerning the textual level, the statistical tests revealed statistically significant
differences among genres for the three textual variables analyzed. For the ‘# of sections’
variable, the allegation and the claim differed the most from the other genres. The reason is
that, when genre pairs were compared, the allegation and the claim each presented significant
differences with three out of four genres. By contrast, the cover letter, the letter of
complaint and the application presented differences with only two out of four genres. These
differences were detected from the quantitative results of the statistical tests conducted for
the textual level, which are included in Appendix 1. For the ‘# of moves’ variable, most of the
genres differed significantly from one another. The genre that presented the fewest differences
with respect to the others was the application. For the ‘# of titles’ variable, the application
was the most different genre, whereas the claim was the genre presenting the fewest differences
with respect to the others.
Table 17. Normalized averages related to analyzed data for the application textual genre.

Level       Variable                               Average
Textual     # of sections                          43,173.1
Textual     # of titles                            12,820.9
Lexical     # of initialisms                       12,553.2
Lexical     # of definitions                       0.0
Lexical     # of verbs in the [...]                70,298.6
Discourse   # of sentences                         74,803.8
Discourse   # of discourse segments                91,482.1
Discourse   # of connectors expressing Contrast    0.0
Discourse   # of connectors expressing Summary     0.0
Table 18. Connectors found in the corpus of applications.

Relation    Discourse connectors
Cause       debido a (due to)
Purpose     a fin de (aiming to)
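The pairwise counts mentioned for the ‘# of sections’ variable can be reproduced from Appendix 1.
The following sketch, which is an illustration and not part of the original analysis, takes
the p-values of Tukey’s HSD test reported there and counts, for each genre, how many of the
other four genres it differs from at p < 0.05. The variable names are ours.

```python
from collections import Counter

ALPHA = 0.05  # significance threshold stated in Appendix 1

# Tukey's HSD p-values for the '# of sections' variable, as reported in Appendix 1.
P_VALUES = {
    ("allegation", "cover letter"): 0.000,
    ("allegation", "letter of complaint"): 0.000,
    ("allegation", "claim"): 0.994,
    ("allegation", "application"): 0.000,
    ("cover letter", "letter of complaint"): 0.997,
    ("cover letter", "claim"): 0.000,
    ("cover letter", "application"): 0.999,
    ("letter of complaint", "claim"): 0.001,
    ("letter of complaint", "application"): 1.000,
    ("claim", "application"): 0.000,
}

# For each genre, count the genres it differs from significantly.
significant_pairs = Counter()
for (genre_a, genre_b), p_value in P_VALUES.items():
    if p_value < ALPHA:
        significant_pairs[genre_a] += 1
        significant_pairs[genre_b] += 1

for genre, n in significant_pairs.most_common():
    print(f"{genre}: differs significantly from {n} of 4 genres")
```

The allegation and the claim each differ from three genres, while the remaining genres differ
from two, which matches the description above.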
Regarding the lexical level, statistical tests identified statistically significant differences
among the analyzed textual genres for most of the variables, except for the ‘# of definitions’
variable, since the texts usually do not contain definitions. For the ‘# of initialisms’ and
‘# of verbs in the active voice’ variables, the application was the genre that presented the
most significant differences with respect to the other genres. For the ‘# of subjective units’
variable, the most different genre was the cover letter. For the ‘# of verbs in the passive
voice’ variable, the genre that differed the most from the other genres was the letter of
complaint. For the ‘# of verbs in the first person singular’ variable, the cover letter was the
only genre that differed from the others, while for the ‘# of verbs in the first person plural’
variable, the letter of complaint was the only one presenting significant differences.
Appendix 2 includes detailed results of the statistical tests regarding the lexical level.
Figure 2. Results of the discriminant analysis in graphic format.
With respect to the discourse level, statistically significant differences were found for most
of the variables, with the exception of four types of connectors (Antithesis, Contrast,
Restatement and Summary). Again, the reason seems to be the low frequency of connectors
expressing these discourse relations. For the ‘# of discourse segments’ variable, the cover
letter and the claim differed the most from the other genres. Regarding the ‘# of sentences’
variable, the allegation and the claim were the genres that presented the most differences from
the others. Among the variables for individual connector types, the only one that revealed
significant differences between genres was ‘# of connectors expressing Purpose’, which differed
between the cover letter and the claim. Regarding the ‘total # of connectors’ variable, the
application was the most different genre. The statistical results regarding the discourse level
are shown in Appendix 3.
Finally, the results of a discriminant analysis revealed statistically significant differences
between the five analyzed genres. The classification of textual genres proved to be rather
suitable, since 78.0% of texts were assigned to the correct genre. Cross-validation results
were also acceptable, since 69.0% of texts were correctly classified.
Results are depicted graphically in Figure 2. Centroids (that is, the mean discriminant score
for each group) were clearly distinct for the cover letter and application genres, which fell
far from one another and from the remaining genres. Centroids for the other three genres fell
closer to one another, although the distance was greatest for the letter of complaint genre.
Centroids for the allegation and claim genres nearly overlapped, implying that these genres
share more similar textual, lexical and discourse features.
Texts from different genres overlapped in nearly all cases. Nevertheless, the least dispersion
was found for the allegation and claim, implying that texts in these genres most resemble each
other. Conversely, the application, cover letter and letter of complaint presented the greatest
degree of dispersion, indicating that the characteristics of these texts tended to differ.
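The notions of centroid and dispersion used here can be illustrated with a short sketch. The
discriminant scores below are invented for illustration only, since the individual scores
behind Figure 2 are not reported in this article; the function names are also ours. A centroid
is computed as the mean score of a group, and dispersion is approximated as the mean distance
of each text from its group centroid.

```python
from math import dist
from statistics import fmean

# Hypothetical discriminant scores (function 1, function 2) for three texts per genre.
scores = {
    "allegation": [(-1.0, 0.2), (-1.2, 0.0), (-0.8, 0.1)],
    "claim": [(-0.9, 0.3), (-1.1, 0.1), (-1.0, -0.1)],
    "cover letter": [(3.0, 2.0), (2.0, 3.5), (4.0, 2.0)],
}


def centroid(points):
    """Mean discriminant score of a group of texts."""
    xs, ys = zip(*points)
    return (fmean(xs), fmean(ys))


def dispersion(points):
    """Mean Euclidean distance of each text from its group centroid."""
    center = centroid(points)
    return fmean(dist(point, center) for point in points)


centroids = {genre: centroid(points) for genre, points in scores.items()}

for genre, points in scores.items():
    print(f"{genre}: centroid={centroids[genre]}, dispersion={dispersion(points):.2f}")

# Distance between two centroids: a small value means the genres nearly overlap.
print(round(dist(centroids["allegation"], centroids["claim"]), 2))
```

In this toy data set, the allegation and claim centroids almost coincide and their texts
cluster tightly, whereas the cover letter lies far away and is more dispersed.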
Conclusions and future work

To accomplish the objectives of the research, a three-stage methodology was adopted:
(a) building a corpus comprised of Spanish-language texts from the five studied tex-
tual genres, (b) analyzing different linguistic categories in order to gain a systematic
understanding of the representative characteristics of each genre, and (c) contras-
tively analyzing these genres using statistical techniques to shed light on their signifi-
cant differences.
This study makes three major contributions. First, it presents the most frequent model
structures utilized in five textual genres (allegation, cover letter, letter of complaint,
claim and application), including textual information (sections, titles and moves) that
appear regularly in each of them.
Second, it presents relevant results regarding statistically significant linguistic differences
between genres, by using normalized data and the statistical tests mentioned in section
‘Contrastive analysis of the analyzed genres’. On the textual level, statistically significant
differences between genres appear for all the variables, as follows. Regarding the variable
related to sections, the allegation and the claim are the genres with the fewest sections,
which is why they differ significantly from the others. In terms of moves, on the one hand, the
allegation and the claim are similar, as they include the fewest moves. On the other hand, the
cover letter differs significantly from both of them, because it is the genre that contains the
most moves. These differences concerning sections and moves may be due to the fact that an
allegation or a claim generally focuses on a single topic (the reason for lodging an
objection), while a cover letter discusses various topics related to the author. In addition,
some genres require more titles than others. For example, the application differs significantly
from the other genres because it usually includes titles. By contrast, in our corpus, the cover
letter does not include any titles. These significant differences are due to each genre’s
prototypical textual structure.
On the lexical level, statistical tests revealed significant differences between some genres
for six out of the seven analyzed variables, as discussed in detail below. Initialisms are
generally scarce in the texts of our corpus. Nevertheless, the application differs
significantly from the other genres because, although these texts are short, they usually
contain initialisms. Some of these initialisms are used often in this genre, such as DNI
(‘ID’). As for subjective units, the cover letter is the genre that includes the most. This
seems logical, since in this genre the authors tend to present their professional background
through units expressing subjectivity; for example, verbs such as merecer (‘deserve’) or
expressions such as en mi opinión (‘in my opinion’). Regarding verbs in the active voice, a
particularly noteworthy case is the application, which has fewer such verbal forms, since a
more personal tone tends to be used in this genre. As regards the passive voice, it is rarely
used in the letter of complaint, perhaps because the authors try to offer more direct
information. The main difference regarding verbs in the first person singular is found when
comparing the cover letter with the other genres, since, as mentioned, in this genre authors
explain their professional background; to do so, the first person singular is the most suitable
form. Finally, in the case of first person plural verbs, significant differences are found
between the letter of complaint and the other genres. This result is surprising, since single
individuals usually lodge complaints. One possible explanation could be that the letters of
complaint included in our corpus are sometimes sent on behalf of several individuals, such as
civic associations.
On the discourse level, statistically significant differences are found between certain genres,
as explained below. Regarding the number of sentences, the allegation and the claim stand out
because they are the genres with the fewest sentences. These two genres contain the highest
number of words, which means that their sentences are very long. The case of the claim is
noteworthy because, additionally, the number of discourse segments is also low, which means
that its sentences are particularly complex. By contrast, the cover letter differs
significantly from the other genres regarding discourse segments, since it includes a higher
number. Concerning connectors, the only significant differences are detected in the total
number of connectors and in the use of Purpose connectors. With respect to the total number of
connectors, the most different genre is the application, especially with respect to the
allegation and the cover letter, since the application is the genre containing the fewest
connectors. The reason is that it uses a more direct and simplified discourse, expressing the
different reasons for submitting a petition as a list of items. By contrast, the allegation and
the cover letter present a more elaborate discourse, using connectors to relate the different
ideas in the text. Regarding the use of Purpose connectors, the only significant difference is
found between the cover letter and the claim. This is likely due to the fact that Purpose
connectors are frequently used in cover letters to explain the reasoning for submitting the CV
attached to this type of letter. By contrast, claims tend to state their aims directly rather
than explaining their reasoning.
This study’s third contribution is the discriminant analysis, which offers a global overview of
the statistically significant linguistic differences among the five analyzed genres. As seen in
the ‘Contrastive analysis results’ section (Figure 2), texts from different genres overlapped,
but the centroids for the cover letter and application were especially distant, both from one
another and from the other three textual genres. As a result, they can be considered the most
different of the textual genres analyzed.
Currently, results from this corpus-based research are being used in a research project related
to the use of technology in writing. Its aim is to design and roll out a tool to assist in
automatically drafting administrative texts in Spanish (da Cunha et al., 2017).10 The tool
includes recommendations based on the features identified for each of these five genres. The
results of our article could also lay the groundwork for allowing public servants to draft and
improve resources and materials (such as online templates) to assist laymen who need to write
these types of administrative texts. Subsequently, an English corpus including the same textual
genres will be used to replicate this research and carry out a contrastive study of textual
genres in the administrative domain in English and Spanish.
Acknowledgements
This research has been developed in the framework of the ACTUALing and IULATERM research
groups. The authors would like to thank Josh Goldsmith for the translation of the text, Sheila
Queralt for the statistical advice, and Mikel Iruskieta for his insightful and valuable comments
about this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/
or publication of this article: This article is part of the ‘Automatic system to help in writing special-
ized texts in domains relevant to Spanish society’ (‘Un sistema automático de ayuda a la redacción
de textos especializados de ámbitos relevantes en la sociedad española actual’) research project,
which received a ‘2015 BBVA Foundation Grants for Researchers and Cultural Creators’
(‘Convocatoria 2015 de Ayudas Fundación BBVA a Investigadores y Creadores Culturales’) grant.
This work was also supported by a Ramón y Cajal contract (RYC-2014-16935), associated with
the Departamento de Filologías Extranjeras y sus Lingüísticas at the Universidad Nacional de
Educación a Distancia (UNED).
Notes
1. These definitions were adapted from the Diccionari de dret administratiu (Departament de
Justícia and TERMCAT Centre de Terminologia, 2014).
2. http://es.slideshare.net/almelini/queja-a-defensor-del-pueblo-presentation (accessed May
2019).
3. https://es.slideshare.net/AuditoriaVLC/234861990-alplenodelayuntamientodevaldemoro
(accessed May 2019).
4. https://es.slideshare.net/chazaragoza/modelo-reclamacin-contra-el-presupuesto (accessed
May 2019).
5. In this article, the symbol # is used to indicate ‘number’.
6. All the moves were validated by the authors of this article.
7. To see the definitions of these relations, we recommend consulting the RST website:
http://www.sfu.ca/rst/index.html (accessed May 2019).
8. This method is frequently used in corpus linguistics, for example, in the British National
Corpus, the Corpus of Historical American English, and the Corpus of Contemporary
American English, among others (Molina and Sierra, 2015).
9. Literal English translations of the Spanish model structures are included in this article. We are
aware that the structures (sections, titles and moves) for the same textual genres may differ
in English, and that a textual analysis of an English language corpus would be necessary to
linguistically characterize texts written in this language.
10. This tool, named arText system, is available at: http://sistema-artext.com/ (accessed May 2019).
References
Abaitua JK, Casillas A and Martínez R (1997) Tratamiento de textos administrativos bilingües: el
proyecto Legebidun. Philologia hispalensis 11(2): 115–130.
Alarcón R (2009) Descripción y evaluación de un sistema basado en reglas para la extracción
automática de contextos definitorios. Barcelona: Institut Universitari de Lingüística Aplicada.
Atserias J, Casas B, Comelles E, et al. (2006) FreeLing 1.3. Syntactic and semantic services in an
open-source NLP library. In: LREC 2006 proceedings. 5th Edition of the international confer-
ence on language resources and evaluation (eds Schuurman I and Vandeghinste V), Genoa,
22–28 May, pp. 48–55. Paris: European Language Resources Association (ELRA).
Ayala P, Domínguez E, Martel F, et al. (2000) Manual de normalización de documentos adminis-
trativos. Las Palmas: Universidad de Las Palmas de Gran Canaria.
Barón FJ and Téllez F (2004) Apuntes de Bioestadística. Málaga: Universidad de Málaga.
Bhatia VK (1993) Analyzing Genre: Language Use in Professional Settings. London: Longman.
Biber D, Connor U and Upton TA (2007) Discourse on the Move: Using Corpus Analysis to
Describe Discourse Structure. Amsterdam: John Benjamins.
Biber D, Conrad S and Reppen R (1998) Corpus Linguistics: Investigating Language Structure
and Use. Cambridge: Cambridge University Press.
Cabré MT (1999) La Terminología. Representación y comunicación. Barcelona: Institut
Universitari de Lingüística Aplicada.
Cabré MT, Bach C, da Cunha I, et al. (2010) Comparación de algunas características lingüís-
ticas del discurso especializado frente al discurso general: el caso del discurso economic.
In: Caballero R and Pinar MJ (eds) Modos y formas de la comunicación humana, Ways and
Modes of Human Communication. Ciudad Real: Universidad de Castilla-La Mancha, pp.
453–460.
Castellón H (2001) El lenguaje administrativo. Formas y uso. Granada: Editorial La Vela.
Castellón H (2009) Hacia la claridad en los textos administrativos. Revista de Llengua i Dret 52:
85–115.
da Cunha I and Iruskieta M (2010) Comparing rhetorical structures in different languages: The
influence of translation strategies. Discourse Studies 12(5): 563–598.
da Cunha I and Montané MA (2019) Textual genres and writing difficulties in specialized domains.
Revista Signos 52(99): 4–30.
da Cunha I, Montané MA and Hysa L (2017) The arText prototype: An automatic system for
writing specialized texts. In: Proceedings of the 15th conference of the European Chapter of
the Association for Computational Linguistics EACL 2017. Software demonstrations 152017
(eds Peñas A and Martins A), Valencia, 3–7 April, pp. 57–60. Valencia: Association for
Computational Linguistics.
da Cunha I, SanJuan E, Torres-Moreno JM, et al. (2012a) A symbolic approach for automatic
detection of nuclearity and rhetorical relations among intra-sentence discourse segments in
Spanish. Lecture Notes in Computer Science 7181: 462–474.
da Cunha I, SanJuan E, Torres-Moreno JM, et al. (2012b) DiSeg 1.0: The first system for Spanish
discourse segmentation. Expert Systems With Applications 39(2): 1671–1678.
da Cunha I, Torres-Moreno JM and Sierra G (2011) On the development of the RST Spanish
treebank. In: Proceedings of the 5th linguistic annotation workshop. The annual meeting of
the ACL49, Portland, OR, 19–24 June, pp. 1–10. Portland, OR: Association for Computational
Linguistics.
Departament de Justícia and TERMCAT Centre de Terminologia (2014) Diccionari de dret
administratiu. Barcelona: TERMCAT Centre de Terminologia.
Eurrutia M (ed.) (2016) El lenguaje jurídico y administrativo en el ámbito de la extranjería.
Estudio multilingüe e implicaciones culturales. Berna: Peter Lang.
Ferrando Martínez R (2013) El documento administrativo, su contexto electrónico, tecnológico y
normativo: una propuesta de cambio de paradigma. PhD Thesis, Universidad de Murcia, Murcia.
García de Toro C (2009) La traducción entre lenguas en contacto. Catalán y Español. Berna:
Peter Lang.
Giraldo JJ (2008) Análisis y descripción de las siglas en el discurso especializado de genoma
humano y medio ambiente. Barcelona: Institut Universitari de Lingüística Aplicada.
Gotti M (2008) Investigating Specialized Discourse. Frankfurt: Peter Lang.

Iruskieta M and da Cunha I (2010) Marcadores y relaciones discursivas en el ámbito médico: un
estudio en español y euskera. In: Bueno JL, et al. (eds) Analizar datos > Describir variación,
Analysing data > Describing variation. Vigo: Universidade de Vigo, Servizo de Publicacións,
pp. 146–159.
Iruskieta M, da Cunha I and Taboada M (2015) A qualitative comparison method for rhetori-
cal structures: Identifying different discourse structures in multilingual corpora. Language
Resources and Evaluation 49: 263–309.
López-Ferrero C and Bach C (2016) Discourse analysis of statements of purpose: Connecting
academic and professional genres. Discourse Studies 18(3): 286–310.
Mann WC and Thompson SA (1988) Rhetorical structure theory: Toward a functional theory of
text organization. Text 8(3): 243–281.
Ministerio de las Administraciones Públicas (2003) Manual de documentos administrativos.
Madrid: Tecnos.
Molina C and Sierra GE (2015) Hacia una normalización de la frecuencia de los corpus CREA y
CORDE. Revista Signos 48(89): 307–331.
Montolío E and Tascón M (2017) Comunicación Clara. Guía práctica. Madrid: Ayuntamiento de
Madrid.
Otaola C (1988) La modalidad (con especial referencia a la lengua española). Revista de Filología
Española 68(1): 97–117.
Sánchez Alonso F (2014) Lenguaje y Estilo Administrativo. Redacción de documentos. Murcia:
Escuela de Formación e Innovación de la Administración Pública.
Swales JM (1990) Genre Analysis: English in Academic and Research Settings. Cambridge:
Cambridge University Press.
Taboada M (2006) Discourse markers as signals (or not) of rhetorical relations. Journal of
Pragmatics 38(4): 567–592.
Tofiloski M, Brooke J and Taboada M (2009) A syntactic and lexical-based discourse segmenter.
In: Proceedings of the 47th annual meeting of ACL, Singapore, 2–7 August, pp. 77–80.
Singapore: Association for Computational Linguistics.
Torres C (2016) La evolución del lenguaje administrativo a lo largo del siglo xx. Un análisis
a través de los textos del Boletín Oficial del Estado. Final Degree Project, Universidad de
Zaragoza, Zaragoza.
Upton TA and Cohen MA (2009) An approach to corpus-based discourse analysis: The move
analysis as example. Discourse Studies 11(5): 585–605.
van Dijk TA (1977) Text and Context: Explorations in the Semantics and Pragmatics of Discourse.
London: Longman.
van Dijk TA (1989) La ciencia del texto. Barcelona: Paidós.
Author biographies
Iria da Cunha holds a PhD in Applied Linguistics (Universitat Pompeu Fabra, 2008). She is a
Ramón y Cajal researcher at the Foreign Languages Department of the Universidad Nacional de
Educación a Distancia (UNED) in Spain. Her main fields of research are Specialized Discourse,
Textual Genres, Academic Writing, Terminology and Natural Language Processing (NLP). She is
a member of the ACTUALing and IULATERM research groups.
M Amor Montané holds a PhD in Applied Linguistics (Universitat Pompeu Fabra, 2012). She is a mem-
ber of the IULATERM research group at Institut de Lingüística Aplicada (IULA-CER) of Universitat
Pompeu Fabra, and a researcher at the Institut d’Estudis Catalans (IEC). She is also a lecturer at
Universitat de Barcelona (UB) and at Universitat Oberta de Catalunya (UOC). Her main fields of
research are Specialized Discourse, Academic Writing, Corpus Linguistics, Neology and Terminology.
Appendix 1
Results of the statistical tests regarding variables of the textual level
Results of Tukey’s HSD test for variables presenting significant differences in the ANOVA test.
Variable        Genres                                  p-value

# of sections   allegation ↔ cover letter               0.000
                allegation ↔ letter of complaint        0.000
                allegation ↔ claim                      0.994
                allegation ↔ application                0.000
                cover letter ↔ letter of complaint      0.997
                cover letter ↔ claim                    0.000
                cover letter ↔ application              0.999
                letter of complaint ↔ claim             0.001
                letter of complaint ↔ application       1.000
                claim ↔ application                     0.000
# of moves      allegation ↔ cover letter               0.000
HSD: honestly significant difference; ANOVA: analysis of variance.

Statistically significant values (p < 0.05) are indicated in bold in all tables presented herein.
Results of Dunnett’s T3 test for variables presenting significant differences in the Kruskal-Wallis
test.

# of titles allegation ↔ cover letter 0.000
Appendix 2
Results of the statistical tests regarding variables of the lexical level

# of initialisms allegation ↔ cover letter 0.767
# of verbs in the active voice allegation ↔ cover letter 0.159

test.

# of subjective units allegation ↔ cover letter 0.047
# of verbs in the passive voice allegation ↔ cover letter 0.995

# of verbs in the first person singular allegation ↔ cover letter 0.000
# of verbs in the first person plural allegation ↔ cover letter 0.992
Appendix 3
Results of the statistical tests regarding variables of the discourse level

# of discourse segments allegation ↔ cover letter 0.000


test.a

# of sentences allegation ↔ cover letter 0.000
# of connectors expressing Cause allegation ↔ cover letter 1.000
# of connectors expressing Concession allegation ↔ cover letter 0.609
cover letter ↔ application –
# of connectors expressing Condition allegation ↔ cover letter 0.164

cover letter ↔ letter of complaint –
cover letter ↔ application –
letter of complaint ↔ application –
# of connectors expressing Purpose allegation ↔ cover letter 0.747
total # of Connectors allegation ↔ cover letter 1.000
a When there are no occurrences of a variable in any of the texts corresponding to the two
genres compared, a dash is included in the table.


Variable Genres p-value

Variable Genres p-value

HSD: honestly significant difference; ANOVA: analysis of variance.

Variable Genres p-value

Variable Genres p-value

Variable Genres p-value

Variable Genres p-value

HSD: honestly significant difference; ANOVA: analysis of variance.

Variable Genres p-value

Variable Genres p-value

dash is included in the table.