Da Cunha Montane 2019 A Corpus Based Analysis of Textual Genres in The Administration Domain
Article
A corpus-based analysis of textual genres in the administration domain
Discourse Studies
2020, Vol. 22(1) 3–31
© The Author(s) 2019
Article reuse guidelines: sagepub.com/journals-permissions
DOI: 10.1177/1461445619887538
https://doi.org/10.1177/1461445619887538
journals.sagepub.com/home/dis
Iria da Cunha
Universidad Nacional de Educación a Distancia (UNED), Spain
M Amor Montané
Universitat Pompeu Fabra (UPF), Spain
Abstract
Laymen occasionally have to write texts in specialized domains, such as when interacting with the
public administration. However, in Spanish, most studies related to administrative texts have focused
on texts written by the administration for laymen rather than texts that average citizens write to the
administration. Against this backdrop, this article aims to carry out the following: (a) a linguistic analysis
on a corpus of Spanish-language texts from five textual genres written by laymen and addressed to
the public administration (allegation, cover letter, letter of complaint, claim, and application) on the
textual, lexical and discourse levels; (b) a contrastive analysis that applies statistical techniques to
quantitative data to identify significant differences among these textual genres. The results provide a
detailed and comprehensive understanding of the characteristics of each genre. Also, the statistical
results indicate that there are significant differences among some of the analyzed genres.
Keywords
Contrastive analysis, corpus linguistics, discourse analysis, genre analysis, public administration,
statistics
Corresponding author:
Iria da Cunha, Departamento de Filologías Extranjeras y sus Lingüísticas, Facultad de Filología, Universidad
Nacional de Educación a Distancia (UNED), Senda del Rey 7, Despacho 02, 28040 Madrid, Spain.
Email: iriad@flog.uned.es
Introduction
Writing specialized texts can be challenging, since they must respect concrete
characteristics of a specific textual genre (Cabré, 1999; Gotti, 2008) that depend on
target domain and textual genre (Bhatia, 1993; Swales, 1990; van Dijk, 1989).
Authors of specialized texts are subject matter experts in their domains (Cabré,
1999), such as doctors in the medical domain, lawyers in the legal domain or econo-
mists in the financial domain. Recipients of these texts could be other specialists,
students and laymen. Textual genres addressed to them will be different. For exam-
ple, specialists tend to read scientific articles, students read textbooks, and laymen
read informative articles.
Nonetheless, average citizens rather than specialists have to write texts in certain
specialized domains, such as when interacting with the public administration. In this
domain, public servants regularly write texts for public consumption in specific textual
genres (such as certificates, notices, and subpoenas), yet average citizens must also cor-
respond with the administration in writing, for example, to present a claim, application
or letter of complaint. However, laymen tend not to be familiar with administrative
textual genres.
Some authors have tried to catalog and draw up a linguistic characterization of tex-
tual genres typically used in the public administration in Spain (Ayala et al., 2000;
Castellón, 2001; Sánchez Alonso, 2014). Based on this work, da Cunha and Montané
(2019) considered the textual genres most often written by Spanish laymen, and empiri-
cally analyzed the most frequently occurring textual genres posing the greatest writing
difficulties. Their conclusions highlighted five genres – allegation, cover letter, letter of
complaint, claim and application – and found that the greatest writing difficulties were
related to textual structure, selection of contents, lexical choices, text cohesion, and the
degree of formality. Nonetheless, to our knowledge, a linguistic characterization of
these textual genres is lacking.
Against this backdrop, this article aims to carry out the following:
(a) a linguistic analysis of the five aforementioned textual genres from the public
administration domain on the textual, lexical and discourse levels to gain a
comprehensive understanding of the characteristics of these types of texts; and
(b) a contrastive analysis that applies statistical techniques to quantitative data to
identify significant differences among these textual genres.
The ‘Contrastive analysis results’ section presents the results of the contrastive analysis. Finally, the ‘Conclusions and
future work’ section lays out conclusions and directions for future work.
Literature review
Textual genres written in the administration domain allow citizens and public servants
to communicate with one another. These genres can be classified based on the partici-
pants in an administrative act, the pragmatic aim of documents (e.g. communicating
information, sharing a ruling or sentence, etc.) or the phases of the administration
(initial, processing or concluding phase) (García de Toro, 2009). Our study adopts the
first criterion. It considers five textual genres that laymen use when writing to public
servants: an allegation, in which citizens involved in an administrative
procedure defend their rights and legitimate interests; a cover letter, which serves
to introduce an individual, explain why they are a good fit for a job, and accompany a
curriculum vitae (CV); a letter of complaint, which expresses an individual’s dissatis-
faction with the administration or a public service; a claim, which calls for damages to
redress irregularities or unfair treatment, and which tends to be administrative or finan-
cial in nature in this domain, challenging taxes or fines levied by the administration; and
an application, which allows an individual to officially submit a petition to the admin-
istration about a given matter.1
The Spanish-language literature on administrative genres tends to classify genres –
generally based on one or more of the aforementioned criteria – describe their
characteristics, and provide drafting templates containing real or, more frequently, ad hoc examples
(cf. Ayala et al., 2000; Castellón, 2001; Ministerio de las Administraciones Públicas,
2003; Montolío and Tascón, 2017; Sánchez Alonso, 2014). Textual genres in the admin-
istration domain can be described and characterized in varying degrees of detail. Often,
these studies describe lexical features (e.g. archaisms and paraphrasing), morphological
features (e.g. the use of specific verb forms), or syntactic features (e.g. the length and
structure of phrases). They also tend to present an outline of the typical text structure as
well as some frequently occurring phrases. Moreover, recent guidelines have called for
administrative documents to use clear, simple language, especially when documents are
for laymen (Montolío and Tascón, 2017).
Some studies have approached administrative texts from the perspective of translation
studies (cf. Abaitua et al.’s (1997) corpus-based study about government gazettes,
or Eurrutia’s (2016) study about administrative immigration texts). Research frequently
uses real texts as case studies, making it qualitative rather than quantitative in
nature, and focuses on drafting, several linguistic features, and the style of certain
administrative texts. For example, Castellón (2009) analyses clarity, legibility and sexist
language in textual genres addressed to laymen (edicts, administrative letters and reso-
lutions, among others); and Torres (2016) analyses some specific lexical, morphosyn-
tactic and stylistic features in a small diachronic corpus including texts extracted from
government gazettes.
Also, there are authors who study the textual genres in the context of the Spanish
electronic administration. For example, Ferrando Martínez (2013) makes proposals for
some genres (e.g. applications, administrative resolutions and notifications) related to
structure, metadata and confidentiality, among other aspects, from the point of view of
the documentation domain.
With regard to the corpus of our research, like some of the aforementioned studies,
real texts from the administration domain are gathered. Specifically, it is a balanced
corpus including five textual genres which are as follows: allegation, cover letter, let-
ter of complaint, claim and application. To our knowledge, no corpus-based analysis
of these genres in Spanish has been carried out. Unlike other studies, our corpus is
exclusively comprised of texts written by laymen. Concerning the methodology used
in our research, a systematic analysis of textual, lexical and discourse features is per-
formed, using both qualitative and quantitative approaches. The systematic approach
adopted herein also allows for comparison among the analyzed genres, establishing a
methodology for carrying out a contrastive analysis of textual genres from a linguistic
point of view.
Theoretical framework
This section summarizes and presents key features of the complementary theoretical
frameworks that were adopted for this research, related to three linguistic levels:
textual, lexical and discourse levels. In the first place, regarding the textual level, we
start from two approaches. On the one hand, the approach proposed by van Dijk
(1977) is used, which indicates that textual genres usually follow a clearly codified
and widely accepted pattern. For example, the research article usually consists of
Introduction, Problem, Solution and Conclusions. Van Dijk (1989) defines the super-
structure as the organizational structure of a text, which varies depending on the type
of text. In general, the superstructure is shown by means of different sections, some
of them including titles and subtitles. On the other hand, the moves proposed by
Swales (1990) are used to characterize the textual structure of the genres of our cor-
pus, along the same lines as the corpus-based analysis proposed by Biber et al.
(2007). A move ‘represents a stretch of text serving a particular communicative (that
is, semantic) function’ (Upton and Cohen, 2009: 589). Decisions related to the clas-
sification of the moves, as López-Ferrero and Bach (2016: 291) mention, ‘are made
on the basis of linguistic evidence (lexical cues) and comprehension of the text’.
In the second place, concerning the lexical level, we follow the Communicative
Theory of Terminology (CTT) by Cabré (1999), which is a terminology and specialized
discourse theory that highlights the importance of the communicative dimension of spe-
cialized communication. According to CTT, textual genres produced in specialized
domains present some global characteristics, such as precision, concision, systematicity,
impersonality and objectivity. These characteristics are evidenced linguistically in texts
through different features. For example, concision can be shown through the use of
initialisms, impersonality can be expressed using the passive voice, and objectivity can
be achieved by avoiding subjectivity markers, among other strategies (cf. Cabré et al.,
2010; da Cunha et al., 2011).
In the third place, with regard to the discourse level, we use Mann and Thompson’s
(1988) Rhetorical Structure Theory (RST), which describes how texts are organized in
terms of discourse relations among a text’s discourse segments. As a rule, one of the
segments is more essential to the speaker’s purpose (nucleus), while the other (satellite)
provides some rhetorical information about it. Although relations are not always marked,
they can be indicated explicitly using discourse connectors. Researchers have found that
20%–43% of relations are marked in English, Spanish and Basque, depending on the
corpus (da Cunha et al., 2012a; Iruskieta and da Cunha, 2010; Taboada, 2006).
In our research, in order to carry out the linguistic analysis of the genres included in
the corpus, we adopt the main proposals of the aforementioned complementary
frameworks: first, on the textual level, superstructure and moves; second, on the lexical level,
the global characteristics of texts produced in specialized domains; and, third, on the
discourse level, discourse segments, relations and connectors.
Methodology
Building the corpus
As noted earlier, this work considers five administration-related textual genres which are
as follows: allegation, cover letter, letter of complaint, claim and application. The corpus
comprises 20 texts per textual genre for a total of 100 texts from the administration
domain. Corpus statistics are included in Table 1.
Texts were selected from a wide variety of sources, including Spanish universities,
civic associations, lawyers’ associations, document repositories, and job search sites,
among others. For example, the corpus includes a letter of complaint presented to the
Spanish Ombudsman,2 and an allegation3 and a claim4 presented to different Spanish
local governments.
Categories analyzed
The data were selected and analyzed on the textual, lexical and discourse levels follow-
ing the frameworks mentioned in section ‘Theoretical framework’. Figure 1 presents an
overview of the categories included in the database and analyzed herein.5
Textual level. On the textual level, the following three categories were considered:
• Sections used to structure the text, that is, each of the sections that could be clearly
identified, either because section breaks were included or because their topic was
distinct. For example, in an allegation, sections included ‘Heading’, ‘Personal
details’, ‘Statement of facts’, ‘Presentation of allegations’ and ‘Conclusion’.
• Titles included at the beginning of each section. For example, the ‘Statement of
facts’ section of an application includes the title ‘STATES:’.
• Moves each section includes, that is, semantic and functional textual units with a
specific communicative purpose. For example, the ‘Statement of facts’ section of
a claim includes two moves: ‘Reason(s) for writing the claim’ and ‘Reference to
attachments’.6
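Lexical level. On the lexical level, the following categories were considered: initialisms,
definitions, subjective units, verbs in the active voice, verbs in the passive voice, verbs in
the first person singular and verbs in the first person plural.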
Discourse level. On the discourse level, words, sentences, discourse segments and dis-
course connectors were analyzed for each text. This article adopts the definition of
discourse segment put forward by Tofiloski et al. (2009: 77): ‘Discourse segmentation is
the process of decomposing discourse into elementary discourse units (EDUs), which
may be simple sentences or clauses in a complex sentence, and from which discourse
trees are constructed’. Specifically, we use the discourse segmentation criteria most
widely used for Spanish, as described in da Cunha et al. (2012b), da Cunha and Iruskieta (2010) and
Iruskieta et al. (2015). See, for instance, examples 1 and 2. These examples include sen-
tences in Spanish extracted from two different cover letters of our corpus (with their
corresponding translation into English). In these sentences, the different discourse seg-
ments are indicated in square brackets.
1. [Me pongo en contacto con ustedes para hacerles llegar mi currículum vítae,]
[ya que estoy buscando nuevas oportunidades profesionales.]
[I am writing to share my CV,] [as I am seeking new professional opportunities.]
2. [Tengo el gusto de remitirles mi currículum vítae] [con el objetivo de participar
en el proceso de selección.]
[I am pleased to submit my CV,] [with the objective of expressing my interest in
participating in the recruitment process.]
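The bracket notation used in examples 1 and 2 lends itself to simple automatic counting. The following Python sketch is only an illustration and not the segmentation procedure used in this study: it assumes that the segments have already been marked manually with square brackets, and the function name split_segments is hypothetical.

```python
import re


def split_segments(annotated):
    """Return the bracketed discourse segments of a sentence, in order."""
    # Each segment is the text between one "[" and the next "]".
    return [s.strip() for s in re.findall(r"\[([^\[\]]*)\]", annotated)]


# Example 1 from the cover letter corpus, copied with its bracket annotation.
example_1 = (
    "[Me pongo en contacto con ustedes para hacerles llegar mi currículum vítae,] "
    "[ya que estoy buscando nuevas oportunidades profesionales.]"
)

segments = split_segments(example_1)
number_of_segments = len(segments)
```

Counting the items returned for each sentence would give a value comparable to the ‘# of discourse segments’ variable, provided the manual annotation is reliable.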
Our study identifies and classifies connectors related to eight RST discourse relations:
Antithesis, Cause, Concession, Condition, Contrast, Purpose, Restatement and Summary.7
These eight relations were selected because they occur most frequently in the RST Spanish
Treebank (da Cunha et al., 2011) and are also regularly indicated in this corpus using con-
nectors. The discourse connectors associated with the aforementioned relations and con-
sidered for the sake of this study have been extracted from da Cunha et al. (2011, 2012a).
This list of connectors includes both discourse connectors as they are traditionally viewed
(such as porque (‘because’) and si (‘if’)) and more complex types of connectors, such as
those including verbs (e.g. la razón es que (‘the reason is that’)). The complete list of
discourse connectors used in this work is included in Table 3.
Thus, the connectors indicated in Table 3 are searched for in the corpus. For instance, in
reference to the examples set out earlier: in example 1, the connector ya que (‘as’), which
expresses Cause, is detected between the two discourse segments included in the sen-
tence, and, in example 2, the connector con el objetivo de (‘with the objective of’), which
expresses Purpose, is found linking the two different discourse segments of the sentence.
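As an illustration of this search step, the sketch below looks for a handful of connectors by plain string matching. It is a minimal example under stated assumptions, not the procedure used in this study: the dictionary contains only the connectors quoted in this section rather than the full list in Table 3, the relation labels given for porque and si are assumed here, and the names CONNECTORS and find_connectors are hypothetical.

```python
import re

# Hypothetical subset of connectors; the complete list is in Table 3.
# The labels for "porque" and "si" are assumptions made for this sketch.
CONNECTORS = {
    "ya que": "Cause",
    "porque": "Cause",
    "si": "Condition",
    "con el objetivo de": "Purpose",
}


def find_connectors(text):
    """Return (connector, relation) pairs found in text, in order of appearance."""
    lowered = text.lower()
    hits = []
    for connector, relation in CONNECTORS.items():
        # Require non-word characters on both sides so that "si" does not
        # match inside longer words such as "profesionales".
        pattern = r"(?<!\w)" + re.escape(connector) + r"(?!\w)"
        for match in re.finditer(pattern, lowered):
            hits.append((match.start(), connector, relation))
    return [(connector, relation) for _, connector, relation in sorted(hits)]


example_1 = (
    "Me pongo en contacto con ustedes para hacerles llegar mi currículum vítae, "
    "ya que estoy buscando nuevas oportunidades profesionales."
)
found = find_connectors(example_1)
```

String matching of this kind cannot tell whether a form such as si or porque is actually functioning as a connector in context, so any matches would still need manual validation.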
Allegation
As explained in section ‘Linguistic characterization of the analyzed genres’, sections,
titles and moves were selected taking a frequency threshold into account, in order to
design a model structure including those that tend to appear most frequently in this genre.
Table 4 presents this model structure, which includes six sections without titles, and a
different number of moves depending on the section, between one and five.9
As shown in Table 5, on the textual level, titles rarely headed a section in allegations,
and each section tended to include more than one move. On the lexical level, a limited
number of initialisms were found, no definitions were identified, and few subjective
units were detected. In this genre, the active voice was primarily used rather than the
passive voice. Moreover, verbs in the first person singular were used more frequently
than verbs in the first person plural. On the discourse level, sentences often comprised
various discourse segments. The most frequent connectors expressed Cause and Purpose.
Table 6 includes the different connectors found in the corpus of allegations for the
eight discourse relations analyzed in this research. The Cause and Purpose connectors
present more variation than the other connectors, since they include five and four differ-
ent connectors, respectively.
Cover letter
Table 7 presents the model structure for the cover letter genre. This structure includes
four sections without titles, and a very different number of moves depending on the sec-
tion, between one and ten.
Table 5. Normalized averages related to analyzed data for the allegation textual genre.
Table 8. Normalized averages related to analyzed data for the cover letter textual genre.
As indicated in Table 8, on the textual level, in cover letters none of the sections
included a title and each section tended to include various moves. On the lexical level,
this genre contained some initialisms, did not include any definitions, and presented a
high frequency of subjective units. Furthermore, this genre predominantly used the
active voice rather than the passive voice. Finally, analysis revealed that first person
singular verbs appeared more frequently than first person plural verbs. On the dis-
course level, we highlight that sentences rarely comprised various discourse segments.
The most frequent connectors expressed Purpose and Cause. There were no connectors
expressing other discourse relations.
Table 9 shows the connectors detected in the corpus of cover letters. Both Cause and
Purpose connectors present variation, although Purpose connectors could be highlighted,
since they present five different variants.
Letter of complaint
Table 10 presents the model structure for the letter of complaint genre. This structure
includes six sections without titles, and one or two moves in each section.
Table 10. Model structure for the letter of complaint textual genre.
Table 11. Normalized averages related to analyzed data for the letter of complaint textual
genre.
On the textual level, as can be observed in Table 11, in letters of complaint, most
sections did not include titles. Furthermore, each section tended to include more than
one move. On the lexical level, the number of initialisms used was very low and
definitions were not used. By contrast, we highlight the use of subjective units. Moreover, in
this genre very few verbs in the passive voice were used. Finally, it should be pointed
out that first person plural verbs appeared more frequently than first person singular
verbs. On the discourse level, sentences rarely comprised more than one discourse seg-
ment. The most frequent connectors expressed Cause. All connectors except connectors
expressing Condition were found in the genre.
Table 12 presents the connectors found in the corpus of letters of complaint. The
connectors of Cause show the most variation, with five different connectors detected.
Claim
Table 13 presents the model structure for the claim genre. This structure includes five
sections without titles, and between one and five moves in each section.
Table 14. Normalized averages related to analyzed data for the claim textual genre.
On the textual level, claims did not always include titles in the sections, as shown in
Table 14. Each section tended to include two moves. On the lexical level, this genre
contained some initialisms and subjective units, but did not include any definitions.
Moreover, this genre primarily used the active voice and first person singular verbs,
rather than the passive voice and first person plural verbs, which appeared infrequently.
On the discourse level, in claims, sentences often comprised more than one discourse
segment. Furthermore, all analyzed types of connectors were found in this genre,
although the most frequent connectors expressed Cause.
Table 15 includes the connectors found in the corpus of claims. Again, the connectors of
Cause present the most variation, in this case with six different connectors detected.
Application
Table 16 presents the model structure for the application genre. This structure includes
four sections, and two of them (‘Statement of facts’ and ‘Request’) are headed by a title,
which usually appears in capital letters followed by a colon. These four sections
include a different number of moves, between one and five.
On the textual level, in applications not all sections included a title, but titles were
more frequent than in the other analyzed genres, as can be observed in Table 17. Also,
sections tended to have various moves. On the lexical level, this genre included a high
number of initialisms, but no definitions and few subjective units were identified.
Moreover, this genre predominantly used the active voice rather than the passive voice,
although we highlight that the passive voice was used more often than in the other
analyzed genres. Finally, verbs in the first person singular were used more frequently than
verbs in the first person plural. On the discourse level, few sentences contained more
than one discourse segment. The application was clearly the genre in which connectors
were used least frequently. The only connectors found expressed Purpose and Cause.
Table 18 shows the connectors found in the corpus of applications. In this case, only
one type of connector for Cause and for Purpose is detected.
Table 17. Normalized averages related to analyzed data for the application textual genre.
is that, when genre pairs were compared, the allegation and the claim presented
significant differences with three out of four genres. By contrast, the cover letter, the letter
of complaint and the application presented differences only with two out of four genres.
These differences were detected from the quantitative results of the conducted statistical
tests regarding the textual level, which are included in Appendix 1. For the ‘# of moves’
variable, in general, most of the genres differed significantly from one another. The genre that
presented the fewest differences with respect to the others was the application. For the
variable ‘# of titles’, the application was the most different genre, whereas the claim was the
genre presenting the fewest differences with respect to the others.
Regarding the lexical level, statistical tests identified statistically significant
differences for most of the variables among the analyzed textual genres, except for the ‘# of
definitions’ variable, since these genres usually do not contain definitions. For the ‘# of
initialisms’ and ‘# of verbs in the active voice’ variables, the application was the genre that
presented the most significant differences with respect to the other genres. For the
variable ‘# of subjective units’, the most different genre was the cover letter. For the ‘#
of verbs in the passive voice’ variable, the genre that differed the most from the other
genres was the letter of complaint. For the ‘# of verbs in the first person singular’ varia-
ble, the cover letter was the only genre that differed from the others, while for the ‘# of
verbs in the first person plural’ variable, the letter of complaint was the only one present-
ing significant differences. Appendix 2 includes detailed results of the statistical tests
regarding the lexical level.
With respect to the discourse level, statistically significant differences were found for most of the
variables, with the exception of four types of connectors (Antithesis, Contrast, Restatement
and Summary). Again, the reason seems to be the low frequency of the use of connectors
expressing these discourse relations. For the ‘# of discourse segments’ variable, the cover
letter and the claim differed the most from the other genres. Regarding the ‘# of sentences’
variable, the allegation and the claim were the genres that presented the most differences from
the others. In the case of the variables related to connectors, no significant differences
between genres were found. The only exception was the variable ‘# of connectors express-
ing Purpose’, which revealed differences between the cover letter and the claim. Regarding
the variable ‘total # of connectors’, the application was the most different genre. The statis-
tical results regarding the discourse level are shown in Appendix 3.
Finally, the results of a discriminant analysis revealed statistically significant differ-
ences between the five analyzed genres. The classification of textual genres proved to be
rather suitable, since 78.0% of texts were assigned to the correct genre. Cross validation
results were also acceptable, since 69.0% of texts were correctly classified.
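The gap between the two percentages reflects the difference between classifying the same texts that were used to fit the model and classifying held-out texts. The sketch below illustrates that difference only; it is not the discriminant analysis reported here. It swaps in a much simpler nearest-centroid classifier, assumes leave-one-out cross-validation (the scheme is not specified in this section), and uses made-up feature values and hypothetical function names.

```python
from statistics import mean


def fit(samples):
    """Compute one centroid (mean feature vector) per genre."""
    by_genre = {}
    for features, genre in samples:
        by_genre.setdefault(genre, []).append(features)
    return {g: [mean(col) for col in zip(*rows)] for g, rows in by_genre.items()}


def nearest_genre(features, centroids):
    """Return the genre whose centroid is closest (squared Euclidean distance)."""
    def distance(genre):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[genre]))
    return min(centroids, key=distance)


def resubstitution_accuracy(samples):
    """Share of texts classified correctly by a model fitted on all texts."""
    centroids = fit(samples)
    hits = sum(nearest_genre(f, centroids) == g for f, g in samples)
    return hits / len(samples)


def leave_one_out_accuracy(samples):
    """Share of texts classified correctly when each text is held out in turn."""
    hits = 0
    for i, (features, genre) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        hits += nearest_genre(features, fit(training)) == genre
    return hits / len(samples)


# Made-up values for two variables (# of titles, total # of connectors);
# these are NOT figures from the corpus.
toy = [
    ([0.0, 5.0], "cover letter"),
    ([0.2, 4.5], "cover letter"),
    ([0.1, 5.5], "cover letter"),
    ([2.0, 0.5], "application"),
    ([2.2, 0.0], "application"),
    ([1.8, 1.0], "application"),
]

fitted_on_all = resubstitution_accuracy(toy)
held_out = leave_one_out_accuracy(toy)
```

With well-separated toy data both figures coincide; with overlapping genres, as in the corpus, the held-out figure is usually the lower of the two.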
Results are depicted graphically in Figure 2. Centroids (that is, the mean discriminant
score for each group) were clearly distinct for the cover letter and application genres,
which fell far from one another and the remaining genres. Centroids for the other three
genres fell closer to one another, although the distance was furthest for the letter of com-
plaint genre. Centroids for the allegation and claim genres nearly overlapped, implying
that the genres have more similar textual, lexical and discourse features.
Texts from different genres overlapped in nearly all cases. Nevertheless, the least
dispersion was found for the allegation and claim, implying that texts in these genres
most resemble each other. Conversely, the application, cover letter and letter of com-
plaint presented the greatest degree of dispersion, indicating that the characteristics of
these texts tended to differ.
since in this genre the authors tend to present their professional background through
units expressing subjectivity; for example, verbs such as merecer (‘deserve’) or expres-
sions such as en mi opinión (‘in my opinion’). Regarding verbs in the active voice, a
particularly noteworthy case is the application, which has fewer such verbal forms, since
a more impersonal tone tends to be used in this genre. As regards the passive voice, it is
rarely used in the letter of complaint, perhaps because the authors try to convey
information more directly. The main difference regarding verbs in the first person singular is found
when comparing the cover letter with the other genres, since, as mentioned, in this genre,
authors explain their professional background; to do so, the first person singular is the
most suitable form. Finally, in the case of first person plural verbs, significant differences
are found between the letter of complaint and the other genres. This result is surprising,
since single individuals usually lodge complaints. One possible explanation could be that
the letters of complaint included in our corpus are sometimes sent on behalf of various
individuals, such as civic associations.
On the discourse level, statistically significant differences are found between certain
genres, as explained below. Regarding the number of sentences, the allegation and the
claim stand out because they are the genres with the fewest sentences. These two genres
contain the highest number of words, which means that their sentences are very long. The
case of the claim is noteworthy because, additionally, the number of discourse segments
is also low. That means that the sentences are really complex. By contrast, the cover
letter presents significant differences with the other genres regarding discourse segments,
since it includes a higher number. Concerning connectors, the only significant differences
are detected in the total number of connectors and in the use of Purpose connectors. With
respect to the total number of connectors, the most different genre is the application, espe-
cially with respect to the allegation and the cover letter, since the application is the genre
containing the fewest connectors. The reason is that it includes a more direct and simplified
discourse, expressing the different reasons for submitting a petition as a list of items. By
contrast, the allegation and the cover letter present a more elaborate discourse, using
connectors to relate the different ideas in the text. Regarding the use of Purpose connec-
tors, the only significant difference is found between cover letter and claim. This is likely
due to the fact that Purpose connectors are frequently used in cover letters to explain the
reasoning for submitting the CV attached to this type of letter. By contrast, claims
tend to state their aims directly rather than explaining their reasoning.
This study’s third contribution is the discriminant analysis, which offers a global
overview of the statistically significant linguistic differences among the five analyzed
genres. As seen in the ‘Contrastive analysis results’ section (Figure 2), texts from different
genres overlapped, but the centroids for the cover letter and application were especially
distant, both from one another and from the other three textual genres. As a result, they
can be considered the most different textual genres that were analyzed.
Currently, results from this corpus-based research are being used in a research project
related to the use of technology in writing. Its aim is to design and roll out a tool to assist
in automatically drafting administrative texts in Spanish (da Cunha et al., 2017).10 The tool
includes recommendations based on the features identified for each of these five genres. The
results of our article could also lay the groundwork for allowing public servants to improve
and draft resources and materials (such as online templates) to assist laymen who need to
write these types of administrative texts. Subsequently, an English corpus including the
same textual genres will be used to replicate this research and carry out a contrastive study
of textual genres in the administrative domain in English and Spanish.
Acknowledgements
This research has been developed in the framework of the ACTUALing and IULATERM research
groups. The authors would like to thank Josh Goldsmith for the translation of the text, Sheila
Queralt for the statistical advice, and Mikel Iruskieta for his insightful and valuable comments
about this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/
or publication of this article: This article is part of the ‘Automatic system to help in writing special-
ized texts in domains relevant to Spanish society’ (‘Un sistema automático de ayuda a la redacción
de textos especializados de ámbitos relevantes en la sociedad española actual’) research project,
which received a ‘2015 BBVA Foundation Grants for Researchers and Cultural Creators’
(‘Convocatoria 2015 de Ayudas Fundación BBVA a Investigadores y Creadores Culturales’) grant.
This work was also supported by a Ramón y Cajal contract (RYC-2014-16935), associated with
the Departamento de Filologías Extranjeras y sus Lingüísticas at the Universidad Nacional de
Educación a Distancia (UNED).
Notes
1. These definitions were adapted from the Diccionari de dret administratiu (Departament de
Justícia and TERMCAT Centre de Terminologia, 2014).
2. http://es.slideshare.net/almelini/queja-a-defensor-del-pueblo-presentation (accessed May
2019).
3. https://es.slideshare.net/AuditoriaVLC/234861990-alplenodelayuntamientodevaldemoro
(accessed May 2019).
4. https://es.slideshare.net/chazaragoza/modelo-reclamacin-contra-el-presupuesto (accessed
May 2019).
5. In this article, the symbol # is used to indicate ‘number’.
6. All the moves were validated by the authors of this article.
7. For the definitions of these relations, we recommend consulting the RST website: http://
www.sfu.ca/rst/index.html (accessed May 2019).
8. This method is frequently used in corpus linguistics, for example, in the British National
Corpus, the Corpus of Historical American English, and the Corpus of Contemporary
American English, among others (Molina and Sierra, 2015).
9. Literal English translations of the Spanish model structures are included in this article. We are
aware that the structures (sections, titles and moves) for the same textual genres may differ
in English, and that a textual analysis of an English language corpus would be necessary to
linguistically characterize texts written in this language.
10. This tool, the arText system, is available at: http://sistema-artext.com/ (accessed May 2019).
References
Abaitua JK, Casillas A and Martínez R (1997) Tratamiento de textos administrativos bilingües: el
proyecto Legebidun. Philologia hispalensis 11(2): 115–130.
Alarcón R (2009) Descripción y evaluación de un sistema basado en reglas para la extracción
automática de contextos definitorios. Barcelona: Institut Universitari de Lingüística Aplicada.
Atserias J, Casas B, Comelles E, et al. (2006) FreeLing 1.3. Syntactic and semantic services in an
open-source NLP library. In: LREC 2006 proceedings. 5th Edition of the international confer-
ence on language resources and evaluation (eds Schuurman I and Vandeghinste V), Genoa,
22–28 May, pp. 48–55. Paris: European Language Resources Association (ELRA).
Ayala P, Domínguez E, Martel F, et al. (2000) Manual de normalización de documentos adminis-
trativos. Las Palmas: Universidad de Las Palmas de Gran Canaria.
Barón FJ and Téllez F (2004) Apuntes de Bioestadística. Málaga: Universidad de Málaga.
Bhatia VK (1993) Analyzing Genre: Language Use in Professional Settings. London: Longman.
Biber D, Connor U and Upton TA (2007) Discourse on the Move: Using Corpus Analysis to
Describe Discourse Structure. Amsterdam: John Benjamins.
Biber D, Conrad S and Reppen R (1998) Corpus Linguistics: Investigating Language Structure
and Use. Cambridge: Cambridge University Press.
Cabré MT (1999) La Terminología. Representación y comunicación. Barcelona: Institut
Universitari de Lingüística Aplicada.
Cabré MT, Bach C, da Cunha I, et al. (2010) Comparación de algunas características
lingüísticas del discurso especializado frente al discurso general: el caso del discurso económico.
In: Caballero R and Pinar MJ (eds) Modos y formas de la comunicación humana, Ways and
Modes of Human Communication. Ciudad Real: Universidad de Castilla-La Mancha, pp.
453–460.
Castellón H (2001) El lenguaje administrativo. Formas y uso. Granada: Editorial La Vela.
Castellón H (2009) Hacia la claridad en los textos administrativos. Revista de Llengua i Dret 52:
85–115.
da Cunha I and Iruskieta M (2010) Comparing rhetorical structures in different languages: The
influence of translation strategies. Discourse Studies 12(5): 563–598.
da Cunha I and Montané MA (2019) Textual genres and writing difficulties in specialized domains.
Revista Signos 52(99): 4–30.
da Cunha I, Montané MA and Hysa L (2017) The arText prototype: An automatic system for
writing specialized texts. In: Proceedings of the 15th conference of the European Chapter of
the Association for Computational Linguistics EACL 2017. Software demonstrations
(eds Peñas A and Martins A), Valencia, 3–7 April, pp. 57–60. Valencia: Association for
Computational Linguistics.
da Cunha I, SanJuan E, Torres-Moreno JM, et al. (2012a) A symbolic approach for automatic
detection of nuclearity and rhetorical relations among intra-sentence discourse segments in
Spanish. Lecture Notes in Computer Science 7181: 462–474.
da Cunha I, SanJuan E, Torres-Moreno JM, et al. (2012b) DiSeg 1.0: The first system for Spanish
discourse segmentation. Expert Systems With Applications 39(2): 1671–1678.
da Cunha I, Torres-Moreno JM and Sierra G (2011) On the development of the RST Spanish
treebank. In: Proceedings of the 5th linguistic annotation workshop. The annual meeting of
the ACL49, Portland, OR, 19–24 June, pp. 1–10. Portland, OR: Association for Computational
Linguistics.
Departament de Justícia and TERMCAT Centre de Terminologia (2014) Diccionari de dret
administratiu. Barcelona: TERMCAT Centre de Terminologia.
Eurrutia M (ed.) (2016) El lenguaje jurídico y administrativo en el ámbito de la extranjería.
Estudio multilingüe e implicaciones culturales. Berna: Peter Lang.
Ferrando Martínez R (2013) El documento administrativo, su contexto electrónico, tecnológico y
normativo: una propuesta de cambio de paradigma. PhD Thesis, Universidad de Murcia, Murcia.
García de Toro C (2009) La traducción entre lenguas en contacto. Catalán y Español. Berna:
Peter Lang.
Giraldo JJ (2008) Análisis y descripción de las siglas en el discurso especializado de genoma
humano y medio ambiente. Barcelona: Institut Universitari de Lingüística Aplicada.
Author biographies
Iria da Cunha holds a PhD in Applied Linguistics (Universitat Pompeu Fabra, 2008). She is a
Ramón y Cajal researcher at the Foreign Languages Department of the Universidad Nacional de
Educación a Distancia (UNED) in Spain. Her main fields of research are Specialized Discourse,
Textual Genres, Academic Writing, Terminology and Natural Language Processing (NLP). She is
a member of the ACTUALing and IULATERM research groups.
M Amor Montané holds a PhD in Applied Linguistics (Universitat Pompeu Fabra, 2012). She is a mem-
ber of the IULATERM research group at Institut de Lingüística Aplicada (IULA-CER) of Universitat
Pompeu Fabra, and a researcher at the Institut d’Estudis Catalans (IEC). She is also a lecturer at
Universitat de Barcelona (UB) and at Universitat Oberta de Catalunya (UOC). Her main fields of
research are Specialized Discourse, Academic Writing, Corpus Linguistics, Neology and Terminology.
Appendix 1
Results of the statistical tests regarding variables of the textual level
Results of Tukey’s HSD test for variables presenting significant differences in the ANOVA test.
Results of Dunnett’s T3 test for variables presenting significant differences in the Kruskal-Wallis
test.
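As a rough illustration of the Kruskal-Wallis test named above, the following sketch computes the tie-corrected H statistic in plain Python. The connector counts are invented for illustration and do not reproduce the values reported in this appendix; the function name kruskal_wallis_h is hypothetical, and the post hoc comparison (Dunnett's T3) is not implemented here.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Assign each distinct value the mean of the rank positions it occupies.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    # H compares the rank sum of each group with what equal distributions imply.
    h = 12 / (n_total * (n_total + 1)) * sum(
        sum(rank_of[value] for value in group) ** 2 / len(group)
        for group in groups
    ) - 3 * (n_total + 1)

    # Standard correction for tied observations.
    tie_sizes = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    return h / correction


# Hypothetical connector counts per text in three genres (invented data).
samples = {
    "allegation": [12, 15, 11, 14],
    "cover letter": [13, 16, 10, 17],
    "application": [3, 5, 2, 4],
}
h_statistic = kruskal_wallis_h(list(samples.values()))
```

The resulting H is usually compared against a chi-square distribution with k - 1 degrees of freedom, where k is the number of genres; with samples as small as these, that approximation is only indicative.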
Appendix 2
Results of the statistical tests regarding variables of the lexical level
Results of Tukey’s HSD test for variables presenting significant differences in the ANOVA test.
Results of Dunnett’s T3 test for variables presenting significant differences in the Kruskal-Wallis
test.
Appendix 3
Results of the statistical tests regarding variables of the discourse level
Results of Tukey’s HSD test for variables presenting significant differences in the ANOVA test.